WorldWideScience

Sample records for efficient subtorus processor

  1. Green Secure Processors: Towards Power-Efficient Secure Processor Design

    Science.gov (United States)

    Chhabra, Siddhartha; Solihin, Yan

    With the increasing wealth of digital information stored on computer systems today, security issues have become increasingly important. In addition to attacks targeting the software stack of a system, hardware attacks have become equally likely. Researchers have proposed Secure Processor Architectures which utilize hardware mechanisms for memory encryption and integrity verification to protect the confidentiality and integrity of data and computation, even from sophisticated hardware attacks. While there have been many works addressing performance and other system level issues in secure processor design, power issues have largely been ignored. In this paper, we first analyze the sources of power (energy) increase in different secure processor architectures. We then present a power analysis of various secure processor architectures in terms of their increase in power consumption over a base system with no protection and then provide recommendations for designs that offer the best balance between performance and power without compromising security. We extend our study to the embedded domain as well. We also outline the design of a novel hybrid cryptographic engine that can be used to minimize the power consumption for a secure processor. We believe that if secure processors are to be adopted in future systems (general purpose or embedded), it is critically important that power issues are considered in addition to performance and other system level issues. To the best of our knowledge, this is the first work to examine the power implications of providing hardware mechanisms for security.

  2. Efficient quantum walk on a quantum processor

    Science.gov (United States)

    Qiang, Xiaogang; Loke, Thomas; Montanaro, Ashley; Aungskunsiri, Kanin; Zhou, Xiaoqi; O'Brien, Jeremy L.; Wang, Jingbo B.; Matthews, Jonathan C. F.

    2016-01-01

    The random walk formalism is used across a wide range of applications, from modelling share prices to predicting population genetics. Likewise, quantum walks have shown much potential as a framework for developing new quantum algorithms. Here we present explicit efficient quantum circuits for implementing continuous-time quantum walks on the circulant class of graphs. These circuits allow us to sample from the output probability distributions of quantum walks on circulant graphs efficiently. We also show that solving the same sampling problem for arbitrary circulant quantum circuits is intractable for a classical computer, assuming conjectures from computational complexity theory. This is a new link between continuous-time quantum walks and computational complexity theory and it indicates a family of tasks that could ultimately demonstrate quantum supremacy over classical computers. As a proof of principle, we experimentally implement the proposed quantum circuit on an example circulant graph using a two-qubit photonics quantum processor. PMID:27146471

  3. Efficient Multicriteria Protein Structure Comparison on Modern Processor Architectures

    Science.gov (United States)

    Manolakos, Elias S.

    2015-01-01

    Fast increasing computational demand for all-to-all protein structures comparison (PSC) is a result of three confounding factors: rapidly expanding structural proteomics databases, high computational complexity of pairwise protein comparison algorithms, and the trend in the domain towards using multiple criteria for protein structures comparison (MCPSC) and combining results. We have developed a software framework that exploits many-core and multicore CPUs to implement efficient parallel MCPSC in modern processors based on three popular PSC methods, namely, TMalign, CE, and USM. We evaluate and compare the performance and efficiency of the two parallel MCPSC implementations using Intel's experimental many-core Single-Chip Cloud Computer (SCC) as well as Intel's Core i7 multicore processor. We show that the 48-core SCC is more efficient than the latest generation Core i7, achieving a speedup factor of 42 (efficiency of 0.9), making many-core processors an exciting emerging technology for large-scale structural proteomics. We compare and contrast the performance of the two processors on several datasets and also show that MCPSC outperforms its component methods in grouping related domains, achieving a high F-measure of 0.91 on the benchmark CK34 dataset. The software implementation for protein structure comparison using the three methods and combined MCPSC, along with the developed underlying rckskel algorithmic skeletons library, is available via GitHub. PMID:26605332

  4. Efficient Multicriteria Protein Structure Comparison on Modern Processor Architectures.

    Science.gov (United States)

    Sharma, Anuj; Manolakos, Elias S

    2015-01-01

    Fast increasing computational demand for all-to-all protein structures comparison (PSC) is a result of three confounding factors: rapidly expanding structural proteomics databases, high computational complexity of pairwise protein comparison algorithms, and the trend in the domain towards using multiple criteria for protein structures comparison (MCPSC) and combining results. We have developed a software framework that exploits many-core and multicore CPUs to implement efficient parallel MCPSC in modern processors based on three popular PSC methods, namely, TMalign, CE, and USM. We evaluate and compare the performance and efficiency of the two parallel MCPSC implementations using Intel's experimental many-core Single-Chip Cloud Computer (SCC) as well as Intel's Core i7 multicore processor. We show that the 48-core SCC is more efficient than the latest generation Core i7, achieving a speedup factor of 42 (efficiency of 0.9), making many-core processors an exciting emerging technology for large-scale structural proteomics. We compare and contrast the performance of the two processors on several datasets and also show that MCPSC outperforms its component methods in grouping related domains, achieving a high F-measure of 0.91 on the benchmark CK34 dataset. The software implementation for protein structure comparison using the three methods and combined MCPSC, along with the developed underlying rckskel algorithmic skeletons library, is available via GitHub.

  5. Graphics processor efficiency for realization of rapid tabular computations

    International Nuclear Information System (INIS)

    Dudnik, V.A.; Kudryavtsev, V.I.; Us, S.A.; Shestakov, M.V.

    2016-01-01

    Capabilities of graphics processing units (GPU) and central processing units (CPU) have been investigated for realization of fast-calculation algorithms with the use of tabulated functions. The realization of tabulated functions is exemplified by the GPU/CPU architecture-based processors. Comparison is made between the operating efficiencies of GPU and CPU, employed for tabular calculations at different conditions of use. Recommendations are formulated for the use of graphical and central processors to speed up scientific and engineering computations through the use of tabulated functions

  6. Thermal Dissipation Efficiency in a Micro-Processor Using Carbon Nanotubes Based Composite

    Science.gov (United States)

    Thang, Bui Hung; Van Quang, Cao; Nghia, Van Trong; Hong, Phan Ngoc; Van Chuc, Nguyen; Tam, Ngo Thi Thanh; Quang, Le Dinh; Khang, Dao Duc; Khoi, Phan Hong; Minh, Phan Ngoc

    2009-09-01

    Modern electronic and optoelectronic devices such as μ-processor, light emitting diode, semiconductor laser issued a challenge in the thermal dissipation problem. Finding an effective way for thermal dissipation therefore becomes a very important issue. It is known that carbon nanotubes (CNTs) is one of the most valuable materials with high thermal conductivity (2000 W/m.K compared to thermal conductivity of Ag 419 W/m.K). This suggested an approach in applying the CNTs as an essential component for thermal dissipation media to improve the performance of computer processor and other high power electronic devices. In this work multi walled carbon nanotubes (MWCNTs) based composites were utilized as the thermal dissipation media in a micro processor of a personal computer. The MWCNTs of different concentrations were added into polyaniline, commercial silicon thermal paste and commercial silver thermal paste by mechanical methods. A personal computer with configuration: Intel Pentium IV 3.066 GHz, 512 MB of RAM and Windows XP Service Pack 2 Operating System was employed. The thermal dissipation efficiency of the system was evaluated by directly measure the temperature of the μ-processor during the operation of the computer in different CPU speeds. The measured results showed that the CNTs based composite could reduce the temperature of the u-processor more than 5° C, and the time for increasing the temperature of the μ-processor was three times longer than that when using commercial thermal paste.

  7. Energy-Efficiency of the Montium Reconfigurable Tile Processor

    NARCIS (Netherlands)

    Heysters, P.M.; Smit, Gerardus Johannes Maria; Molenkamp, Egbert

    Primary requirements of wireless multimedia handheld computers are high-performance, flexibility, energy-efficiency and low costs. A compromise for these contradicting requirements can be found in a heterogeneous SoC. Besides conventional architectures such a SoC contains domain specific coarse

  8. Efficient Execution of Video Applications on Heterogeneous Multi- and Many-Core Processors

    NARCIS (Netherlands)

    Pereira de Azevedo Filho, A.

    2011-01-01

    In this dissertation we present methodologies and evaluations aiming at increasing the efficiency of video coding applications for heterogeneous many-core processors composed of SIMD-only, scratchpad memory based cores. Our contributions are spread in three different fronts: thread-level parallelism

  9. Climbing Mont Blanc - A Training Site for Energy Efficient Programming on Heterogeneous Multicore Processors

    OpenAIRE

    Natvig, Lasse; Follan, Torbjørn; Støa, Simen; Magnussen, Sindre; Guirado, Antonio Garcia

    2015-01-01

    Climbing Mont Blanc (CMB) is an open online judge used for training in energy efficient programming of state-of-the-art heterogeneous multicores. It uses an Odroid-XU3 board from Hardkernel with an Exynos Octa processor and integrated power sensors. This processor is three-way heterogeneous containing 14 different cores of three different types. The board currently accepts C and C++ programs, with support for OpenCL v1.1, OpenMP 4.0 and Pthreads. Programs submitted using the graphical user in...

  10. Software and DVFS Tuning for Performance and Energy-Efficiency on Intel KNL Processors

    Directory of Open Access Journals (Sweden)

    Enrico Calore

    2018-06-01

    Full Text Available Energy consumption of processors and memories is quickly becoming a limiting factor in the deployment of large computing systems. For this reason, it is important to understand the energy performance of these processors and to study strategies allowing their use in the most efficient way. In this work, we focus on the computing and energy performance of the Knights Landing Xeon Phi, the latest Intel many-core architecture processor for HPC applications. We consider the 64-core Xeon Phi 7230 and profile its performance and energy efficiency using both its on-chip MCDRAM and the off-chip DDR4 memory as the main storage for application data. As a benchmark application, we use a lattice Boltzmann code heavily optimized for this architecture and implemented using several different arrangements of the application data in memory (data-layouts, in short. We also assess the dependence of energy consumption on data-layouts, memory configurations (DDR4 or MCDRAM and the number of threads per core. We finally consider possible trade-offs between computing performance and energy efficiency, tuning the clock frequency of the processor using the Dynamic Voltage and Frequency Scaling (DVFS technique.

  11. An Efficient Power Estimation Methodology for Complex RISC Processor-based Platforms

    OpenAIRE

    Rethinagiri , Santhosh Kumar; Ben Atitallah , Rabie; Dekeyser , Jean-Luc; Niar , Smail; Senn , Eric

    2012-01-01

    International audience; In this contribution, we propose an efficient power estima- tion methodology for complex RISC processor-based plat- forms. In this methodology, the Functional Level Power Analysis (FLPA) is used to set up generic power models for the different parts of the system. Then, a simulation framework based on virtual platform is developed to evalu- ate accurately the activities used in the related power mod- els. The combination of the two parts above leads to a het- erogeneou...

  12. Efficient gate set tomography on a multi-qubit superconducting processor

    Science.gov (United States)

    Nielsen, Erik; Rudinger, Kenneth; Blume-Kohout, Robin; Bestwick, Andrew; Bloom, Benjamin; Block, Maxwell; Caldwell, Shane; Curtis, Michael; Hudson, Alex; Orgiazzi, Jean-Luc; Papageorge, Alexander; Polloreno, Anthony; Reagor, Matt; Rubin, Nicholas; Scheer, Michael; Selvanayagam, Michael; Sete, Eyob; Sinclair, Rodney; Smith, Robert; Vahidpour, Mehrnoosh; Villiers, Marius; Zeng, William; Rigetti, Chad

    Quantum information processors with five or more qubits are becoming common. Complete, predictive characterization of such devices e.g. via any form of tomography, including gate set tomography appears impossible because the parameter space is intractably large. Randomized benchmarking scales well, but cannot predict device behavior or diagnose failure modes. We introduce a new type of gate set tomography that uses an efficient ansatz to model physically plausible errors, but scales polynomially with the number of qubits. We will describe the theory behind this multi-qubit tomography and present experimental results from using it to characterize a multi-qubit processor made by Rigetti Quantum Computing. Sandia National Laboratories is a multi-mission laboratory managed and operated by Sandia Corporation, a wholly owned subsidary of Lockheed Martin Corporation, for the US Department of Energy's NNSA under contract DE-AC04-94AL85000.

  13. ad-heap: an Efficient Heap Data Structure for Asymmetric Multicore Processors

    DEFF Research Database (Denmark)

    Liu, Weifeng; Vinter, Brian

    2014-01-01

    and its child nodes must be executed sequentially, and (2) heaps, even d-heaps (d-ary heaps or d-way heaps), cannot supply enough wide data parallelism to these processors. Recent research proposed more versatile asymmetric multicore processors (AMPs) that consist of two types of cores (latency......-oriented cores with high single-thread performance and throughput-oriented cores with wide vector processing capability), unified memory address space and faster synchronization mechanism among cores with different ISAs. To leverage the AMPs for the heap data structure, in this paper we propose ad......-heap, an efficient heap data structure that introduces an implicit bridge structure and properly apportions workloads to the two types of cores. We implement a batch k-selection algorithm and conduct experiments on simulated AMP environments composed of real CPUs and GPUs. In our experiments on two representative...

  14. Efficient Programming for Multicore Processor Heterogeneity: OpenMP versus OmpSs

    OpenAIRE

    Butko , Anastasiia; Bruguier , Florent; Gamatié , Abdoulaye; Sassatelli , Gilles

    2017-01-01

    International audience; ARM single-ISA heterogeneous multicore processors combine high-performance big cores with power-efficient small cores. They aim at achieving a suitable balance between performance and energy. How- ever, a main challenge is to program such architectures so as to efficiently exploit their features. In this paper, we study the impact on performance and energy trade-offs of single-ISA architecture according to OpenMP 3.0 and the OmpSs programming models. We consider differ...

  15. Examining the volume efficiency of the cortical architecture in a multi-processor network model.

    Science.gov (United States)

    Ruppin, E; Schwartz, E L; Yeshurun, Y

    1993-01-01

    The convoluted form of the sheet-like mammalian cortex naturally raises the question whether there is a simple geometrical reason for the prevalence of cortical architecture in the brains of higher vertebrates. Addressing this question, we present a formal analysis of the volume occupied by a massively connected network or processors (neurons) and then consider the pertaining cortical data. Three gross macroscopic features of cortical organization are examined: the segregation of white and gray matter, the circumferential organization of the gray matter around the white matter, and the folded cortical structure. Our results testify to the efficiency of cortical architecture.

  16. Efficient Backprojection-Based Synthetic Aperture Radar Computation with Many-Core Processors

    Directory of Open Access Journals (Sweden)

    Jongsoo Park

    2013-01-01

    Full Text Available Tackling computationally challenging problems with high efficiency often requires the combination of algorithmic innovation, advanced architecture, and thorough exploitation of parallelism. We demonstrate this synergy through synthetic aperture radar (SAR via backprojection, an image reconstruction method that can require hundreds of TFLOPS. Computation cost is significantly reduced by our new algorithm of approximate strength reduction; data movement cost is economized by software locality optimizations facilitated by advanced architecture support; parallelism is fully harnessed in various patterns and granularities. We deliver over 35 billion backprojections per second throughput per compute node on an Intel® Xeon® processor E5-2670-based cluster, equipped with Intel® Xeon Phi™ coprocessors. This corresponds to processing a 3K×3K image within a second using a single node. Our study can be extended to other settings: backprojection is applicable elsewhere including medical imaging, approximate strength reduction is a general code transformation technique, and many-core processors are emerging as a solution to energy-efficient computing.

  17. An energy-efficient high-performance processor with reconfigurable data-paths using RSFQ circuits

    International Nuclear Information System (INIS)

    Takagi, Naofumi

    2013-01-01

    Highlights: ► An idea of a high-performance computer using RSFQ circuits is shown. ► An outline of processor with reconfigurable data-paths (RDPs) is shown. ► Architectural details of an SFQ-RDP are described. -- Abstract: We show recent progress in our research on an energy-efficient high-performance processor with reconfigurable data-paths (RDPs) using rapid single-flux-quantum (RSFQ) circuits. We mainly describe the architectural details of an RDP implemented using RSFQ circuits. An RDP consists of a lot of floating-point units (FPUs) and operand routing networks (ORNs) which connect the FPUs. We reconfigure the RDP to fit a computation, i.e., a group of floating-point operations, appearing in a ‘for’ loop of programs for numerical computations by setting the route in ORNs before the execution of the loop. In the RDP, a lot of FPUs work in parallel with pipelined fashion, and hence, very high-performance computation is achieved

  18. Recovery Act: Integrated DC-DC Conversion for Energy-Efficient Multicore Processors

    Energy Technology Data Exchange (ETDEWEB)

    Shepard, Kenneth L

    2013-03-31

    devices. These new approaches to scaled voltage regulation for computing devices also promise significant impact on electricity consumption in the United States and abroad by improving the efficiency of all computational platforms. In 2006, servers and datacenters in the United States consumed an estimated 61 billion kWh or about 1.5% of the nation's total energy consumption. Federal Government servers and data centers alone accounted for about 10 billion kWh, for a total annual energy cost of about $450 million. Based upon market growth and efficiency trends, estimates place current server and datacenter power consumption at nearly 85 billion kWh in the US and at almost 280 billion kWh worldwide. Similar estimates place national desktop, mobile and portable computing at 80 billion kWh combined. While national electricity utilization for computation amounts to only 4% of current usage, it is growing at a rate of about 10% a year with volume servers representing one of the largest growth segments due to the increasing utilization of cloud-based services. The percentage of power that is consumed by the processor in a server varies but can be as much as 30% of the total power utilization, with an additional 50% associated with heat removal. The approaches considered here should allow energy efficiency gains as high as 30% in processors for all computing platforms, from high-end servers to smart phones, resulting in a direct annual energy savings of almost 15 billion kWh nationally, and 50 billion kWh globally. The work developed here is being commercialized by the start-up venture, Ferric Semiconductor, which has already secured two Phase I SBIR grants to bring these technologies to the marketplace.

  19. An efficient ASIC implementation of 16-channel on-line recursive ICA processor for real-time EEG system.

    Science.gov (United States)

    Fang, Wai-Chi; Huang, Kuan-Ju; Chou, Chia-Ching; Chang, Jui-Chung; Cauwenberghs, Gert; Jung, Tzyy-Ping

    2014-01-01

    This is a proposal for an efficient very-large-scale integration (VLSI) design, 16-channel on-line recursive independent component analysis (ORICA) processor ASIC for real-time EEG system, implemented with TSMC 40 nm CMOS technology. ORICA is appropriate to be used in real-time EEG system to separate artifacts because of its highly efficient and real-time process features. The proposed ORICA processor is composed of an ORICA processing unit and a singular value decomposition (SVD) processing unit. Compared with previous work [1], this proposed ORICA processor has enhanced effectiveness and reduced hardware complexity by utilizing a deeper pipeline architecture, shared arithmetic processing unit, and shared registers. The 16-channel random signals which contain 8-channel super-Gaussian and 8-channel sub-Gaussian components are used to analyze the dependence of the source components, and the average correlation coefficient is 0.95452 between the original source signals and extracted ORICA signals. Finally, the proposed ORICA processor ASIC is implemented with TSMC 40 nm CMOS technology, and it consumes 15.72 mW at 100 MHz operating frequency.

  20. Safe and Efficient Support for Embeded Multi-Processors in ADA

    Science.gov (United States)

    Ruiz, Jose F.

    2010-08-01

    New software demands increasing processing power, and multi-processor platforms are spreading as the answer to achieve the required performance. Embedded real-time systems are also subject to this trend, but in the case of real-time mission-critical systems, the properties of reliability, predictability and analyzability are also paramount. The Ada 2005 language defined a subset of its tasking model, the Ravenscar profile, that provides the basis for the implementation of deterministic and time analyzable applications on top of a streamlined run-time system. This Ravenscar tasking profile, originally designed for single processors, has proven remarkably useful for modelling verifiable real-time single-processor systems. This paper proposes a simple extension to the Ravenscar profile to support multi-processor systems using a fully partitioned approach. The implementation of this scheme is simple, and it can be used to develop applications amenable to schedulability analysis.

  1. Composable processor virtualization for embedded systems

    NARCIS (Netherlands)

    Molnos, A.M.; Milutinovic, A.; She, D.; Goossens, K.G.W.

    2010-01-01

    Processor virtualization divides a physical processor's time among a set of virual machines, enabling efficient hardware utilization, application security and allowing co-existence of different operating systems on the same processor. Through initially intended for the server domain, virtualization

  2. Increasing the energy efficiency of microcontroller platforms with low-design margin co-processors

    NARCIS (Netherlands)

    Gomez, A.; Bartolini, A.; Rossi, D.; Can Kara, B.; Fatemi, S.H.; Pineda de Gyvez, J.; Benini, L.

    2017-01-01

    Reducing the energy consumption in low cost, performance-constrained microcontroller units (MCU’s) cannot be achieved with complex energy minimization techniques (i.e. fine-grained DVFS, Thermal Management, etc), due to their high overheads. To this end, we propose an energy-efficient, multi-core

  3. Efficiency of a Woody 60 processor attached to a Mounty 4100 tower yarder when processing coniferous timber from thinning operations

    Directory of Open Access Journals (Sweden)

    S.A. Borz

    2014-12-01

    Full Text Available Processor tower yarders (PTY represent the current state of yarding technology being extensively used in mountainous conditions such as those from Central Europe where they were also developed and used for the first time. In proper technical conditions which are mostly related to forest road infrastructure such equipment may be introduced by technology transfer in other countries such as Romania where they could replace actual less-efficient forest equipment used in steep terrains. The aim of this study was to evaluate the efficiency of such equipment in conditions of thinning operations by adapting a time study to the general concepts and by using data collection techniques to suit the operational conditions imposed by such equipment. In conditions of a mean tree volume of 0.21 m3 × tree-1, the results of our study indicate net production rates as high as 12.72 m3 × h-1 when processing trees on landing, which could be also improved up to 17.52 m3 × h-1 if the PTY have been be adequately installed on the forest road. Another key aspect which could improve the efficiency of such equipment performing landing operations is the number of planned and realized wood assortments since the time expenditure was affected by their number. Given the reduced impact on forest soils as well as the increased efficiency of tower yarders, our study concludes that there would be a lot of potential in actually using them in the Romanian forests located in steep terrain, if proper transportation infrastructure would exist.

  4. Functional Basis for Efficient Physical Layer Classical Control in Quantum Processors

    Science.gov (United States)

    Ball, Harrison; Nguyen, Trung; Leong, Philip H. W.; Biercuk, Michael J.

    2016-12-01

    The rapid progress seen in the development of quantum-coherent devices for information processing has motivated serious consideration of quantum computer architecture and organization. One topic which remains open for investigation and optimization relates to the design of the classical-quantum interface, where control operations on individual qubits are applied according to higher-level algorithms; accommodating competing demands on performance and scalability remains a major outstanding challenge. In this work, we present a resource-efficient, scalable framework for the implementation of embedded physical layer classical controllers for quantum-information systems. Design drivers and key functionalities are introduced, leading to the selection of Walsh functions as an effective functional basis for both programing and controller hardware implementation. This approach leverages the simplicity of real-time Walsh-function generation in classical digital hardware, and the fact that a wide variety of physical layer controls, such as dynamic error suppression, are known to fall within the Walsh family. We experimentally implement a real-time field-programmable-gate-array-based Walsh controller producing Walsh timing signals and Walsh-synthesized analog waveforms appropriate for critical tasks in error-resistant quantum control and noise characterization. These demonstrations represent the first step towards a unified framework for the realization of physical layer controls compatible with large-scale quantum-information processing.

  5. A 410-nW Efficient QRS Processor for Mobile ECG Monitoring in 0.18-μm CMOS.

    Science.gov (United States)

    Li, Peng; Zhang, Xu; Liu, Ming; Hu, Xiaohui; Pang, Bo; Yao, Zhaolin; Jiang, Hanjun; Chen, Hongda

    2017-12-01

    This paper proposes a low power and efficient QRS processor for real-time and continuous mobile ECG monitoring. The QRS detector contains the wavelet transform (WT), the modulus maxima pair identification (MMPI), and the R position modification (RPM). In order to reduce power consumption, we choose the Haar function as the mother wavelet of WT. It is implemented by an optimized FIR filter structure where none of the multiplier is used. The MMPI processes the wavelet coefficients at scale 2 4 and provides candidate R peak positions for the RPM. To improve the accuracy and robust performance, a number of modules have been designed in MMPI, including the preprocessing unit, the automatic threshold updating, and the decision state machine. The RPM is designed to eliminate digital time delay in wavelet transform and locate the R peak position precisely. Raw ECG signals and QRS detection results are output simultaneously. Fabricated in 0.18-μm N-well CMOS 1P6M technology, the power consumption of this chip is only about 410 nW in 1 V voltage supply. Validated by all 48 sets of data in the MIT-BIH arrhythmia database, the sensitive and the positive prediction are 99.60% and 99.77% respectively.

  6. Integrated fuel processor development

    International Nuclear Information System (INIS)

    Ahmed, S.; Pereira, C.; Lee, S. H. D.; Krumpelt, M.

    2001-01-01

    The Department of Energy's Office of Advanced Automotive Technologies has been supporting the development of fuel-flexible fuel processors at Argonne National Laboratory. These fuel processors will enable fuel cell vehicles to operate on fuels available through the existing infrastructure. The constraints of on-board space and weight require that these fuel processors be designed to be compact and lightweight, while meeting the performance targets for efficiency and gas quality needed for the fuel cell. This paper discusses the performance of a prototype fuel processor that has been designed and fabricated to operate with liquid fuels, such as gasoline, ethanol, methanol, etc. Rated for a capacity of 10 kWe (one-fifth of that needed for a car), the prototype fuel processor integrates the unit operations (vaporization, heat exchange, etc.) and processes (reforming, water-gas shift, preferential oxidation reactions, etc.) necessary to produce the hydrogen-rich gas (reformate) that will fuel the polymer electrolyte fuel cell stacks. The fuel processor work is being complemented by analytical and fundamental research. With the ultimate objective of meeting on-board fuel processor goals, these studies include: modeling fuel cell systems to identify design and operating features; evaluating alternative fuel processing options; and developing appropriate catalysts and materials. Issues and outstanding challenges that need to be overcome in order to develop practical, on-board devices are discussed

  7. RTEMS SMP and MTAPI for Efficient Multi-Core Space Applications on LEON3/LEON4 Processors

    Science.gov (United States)

    Cederman, Daniel; Hellstrom, Daniel; Sherrill, Joel; Bloom, Gedare; Patte, Mathieu; Zulianello, Marco

    2015-09-01

    This paper presents the final result of an European Space Agency (ESA) activity aimed at improving the software support for LEON processors used in SMP configurations. One of the benefits of using a multicore system in a SMP configuration is that in many instances it is possible to better utilize the available processing resources by load balancing between cores. This however comes with the cost of having to synchronize operations between cores, leading to increased complexity. While in an AMP system one can use multiple instances of operating systems that are only uni-processor capable, a SMP system requires the operating system to be written to support multicore systems. In this activity we have improved and extended the SMP support of the RTEMS real-time operating system and ensured that it fully supports the multicore capable LEON processors. The targeted hardware in the activity has been the GR712RC, a dual-core core LEON3FT processor, and the functional prototype of ESA's Next Generation Multiprocessor (NGMP), a quad core LEON4 processor. The final version of the NGMP is now available as a product under the name GR740. An implementation of the Multicore Task Management API (MTAPI) has been developed as part of this activity to aid in the parallelization of applications for RTEMS SMP. It allows for simplified development of parallel applications using the task-based programming model. An existing space application, the Gaia Video Processing Unit, has been ported to RTEMS SMP using the MTAPI implementation to demonstrate the feasibility and usefulness of multicore processors for space payload software. The activity is funded by ESA under contract 4000108560/13/NL/JK. Gedare Bloom is supported in part by NSF CNS-0934725.

  8. Computationally efficient implementation of sarse-tap FIR adaptive filters with tap-position control on intel IA-32 processors

    OpenAIRE

    Hirano, Akihiro; Nakayama, Kenji

    2008-01-01

    This paper presents an computationally ef cient implementation of sparse-tap FIR adaptive lters with tapposition control on Intel IA-32 processors with single-instruction multiple-data (SIMD) capability. In order to overcome randomorder memory access which prevents a ectorization, a blockbased processing and a re-ordering buffer are introduced. A dynamic register allocation and the use of memory-to-register operations help the maximization of the loop-unrolling level. Up to 66percent speedup ...

  9. Evaluation of coat uniformity and taste-masking efficiency of irregular-shaped drug particles coated in a modified tangential spray fluidized bed processor.

    Science.gov (United States)

    Xu, Min; Heng, Paul Wan Sia; Liew, Celine Valeria

    2015-01-01

    To explore the feasibility of coating irregular-shaped drug particles in a modified tangential spray fluidized bed processor (FS processor) and evaluate the coated particles for their coat uniformity and taste-masking efficiency. Paracetamol particles were coated to 20%, w/w weight gain using a taste-masking polymer insoluble in neutral and basic pH but soluble in acidic pH. In-process samples (5, 10 and 15%, w/w coat) and the resultant coated particles (20%, w/w coat) were collected to monitor the changes in their physicochemical attributes. After coating to 20%, w/w coat weight gain, the usable yield was 81% with minimal agglomeration (coat compared with the uncoated particles. A 15%, w/w coat was optimal for inhibiting drug release in salivary pH with subsequent fast dissolution in simulated gastric pH. The FS processor shows promise for direct coating of irregular-shaped drug particles with wide size distribution. The coated particles with 15% coat were sufficiently taste masked and could be useful for further application in orally disintegrating tablet platforms.

  10. A device for checking the speeds and quality of X-ray films and efficiency of film processors

    International Nuclear Information System (INIS)

    Crooks, H.E.

    1980-01-01

    In order to carry out quality assurance tests of X-ray films, screens and processors, a standard exposure source of utmost reliability is desirable. The performance of apparatus containing yellow-green light emitting radioactive tritium activated phosphor, known as Betalight, is described. Over a period of 2 1/2 years, variations in film speeds up to +- 20% from nominally the same types of film have been observed. Additionally, substandard developer has been easily identified with the use of this device. The device is cheap and easy to manufacture, accurate and simple to use. (U.K.)

  11. Deterministic chaos in the processor load

    International Nuclear Information System (INIS)

    Halbiniak, Zbigniew; Jozwiak, Ireneusz J.

    2007-01-01

    In this article we present the results of research whose purpose was to identify the phenomenon of deterministic chaos in the processor load. We analysed the time series of the processor load during efficiency tests of database software. Our research was done on a Sparc Alpha processor working on the UNIX Sun Solaris 5.7 operating system. The conducted analyses proved the presence of the deterministic chaos phenomenon in the processor load in this particular case

  12. An Energy-Efficient and Scalable Deep Learning/Inference Processor With Tetra-Parallel MIMD Architecture for Big Data Applications.

    Science.gov (United States)

    Park, Seong-Wook; Park, Junyoung; Bong, Kyeongryeol; Shin, Dongjoo; Lee, Jinmook; Choi, Sungpill; Yoo, Hoi-Jun

    2015-12-01

    Deep Learning algorithm is widely used for various pattern recognition applications such as text recognition, object recognition and action recognition because of its best-in-class recognition accuracy compared to hand-crafted algorithm and shallow learning based algorithms. Long learning time caused by its complex structure, however, limits its usage only in high-cost servers or many-core GPU platforms so far. On the other hand, the demand on customized pattern recognition within personal devices will grow gradually as more deep learning applications will be developed. This paper presents a SoC implementation to enable deep learning applications to run with low cost platforms such as mobile or portable devices. Different from conventional works which have adopted massively-parallel architecture, this work adopts task-flexible architecture and exploits multiple parallelism to cover complex functions of convolutional deep belief network which is one of popular deep learning/inference algorithms. In this paper, we implement the most energy-efficient deep learning and inference processor for wearable system. The implemented 2.5 mm × 4.0 mm deep learning/inference processor is fabricated using 65 nm 8-metal CMOS technology for a battery-powered platform with real-time deep inference and deep learning operation. It consumes 185 mW average power, and 213.1 mW peak power at 200 MHz operating frequency and 1.2 V supply voltage. It achieves 411.3 GOPS peak performance and 1.93 TOPS/W energy efficiency, which is 2.07× higher than the state-of-the-art.

  13. The LASS hardware processor

    International Nuclear Information System (INIS)

    Kunz, P.F.

    1976-01-01

    The problems of data analysis with hardware processors are reviewed and a description is given of a programmable processor. This processor, the 168/E, has been designed for use in the LASS multi-processor system; it has an execution speed comparable to the IBM 370/168 and uses the subset of IBM 370 instructions appropriate to the LASS analysis task. (Auth.)

  14. Java Processor Optimized for RTSJ

    Directory of Open Access Journals (Sweden)

    Tu Shiliang

    2007-01-01

    Full Text Available Due to the preeminent work of the real-time specification for Java (RTSJ, Java is increasingly expected to become the leading programming language in real-time systems. To provide a Java platform suitable for real-time applications, a Java processor which can execute Java bytecode is directly proposed in this paper. It provides efficient support in hardware for some mechanisms specified in the RTSJ and offers a simpler programming model through ameliorating the scoped memory of the RTSJ. The worst case execution time (WCET of the bytecodes implemented in this processor is predictable by employing the optimization method proposed in our previous work, in which all the processing interfering predictability is handled before bytecode execution. Further advantage of this method is to make the implementation of the processor simpler and suited to a low-cost FPGA chip.

  15. Probabilistic programmable quantum processors

    International Nuclear Information System (INIS)

    Buzek, V.; Ziman, M.; Hillery, M.

    2004-01-01

    We analyze how to improve performance of probabilistic programmable quantum processors. We show how the probability of success of the probabilistic processor can be enhanced by using the processor in loops. In addition, we show that an arbitrary SU(2) transformations of qubits can be encoded in program state of a universal programmable probabilistic quantum processor. The probability of success of this processor can be enhanced by a systematic correction of errors via conditional loops. Finally, we show that all our results can be generalized also for qudits. (Abstract Copyright [2004], Wiley Periodicals, Inc.)

  16. Producing chopped firewood with firewood processors

    International Nuclear Information System (INIS)

    Kaerhae, K.; Jouhiaho, A.

    2009-01-01

    The TTS Institute's research and development project studied both the productivity of new, chopped firewood processors (cross-cutting and splitting machines) suitable for professional and independent small-scale production, and the costs of the chopped firewood produced. Seven chopped firewood processors were tested in the research, six of which were sawing processors and one shearing processor. The chopping work was carried out using wood feeding racks and a wood lifter. The work was also carried out without any feeding appliances. Altogether 132.5 solid m 3 of wood were chopped in the time studies. The firewood processor used had the most significant impact on chopping work productivity. In addition to the firewood processor, the stem mid-diameter, the length of the raw material, and of the firewood were also found to affect productivity. The wood feeding systems also affected productivity. If there is a feeding rack and hydraulic grapple loader available for use in chopping firewood, then it is worth using the wood feeding rack. A wood lifter is only worth using with the largest stems (over 20 cm mid-diameter) if a feeding rack cannot be used. When producing chopped firewood from small-diameter wood, i.e. with a mid-diameter less than 10 cm, the costs of chopping work were over 10 EUR solid m -3 with sawing firewood processors. The shearing firewood processor with a guillotine blade achieved a cost level of 5 EUR solid m -3 when the mid-diameter of the chopped stem was 10 cm. In addition to the raw material, the cost-efficient chopping work also requires several hundred annual operating hours with a firewood processor, which is difficult for individual firewood entrepreneurs to achieve. The operating hours of firewood processors can be increased to the required level by the joint use of the processors by a number of firewood entrepreneurs. (author)

  17. Embedded Processor Laboratory

    Data.gov (United States)

    Federal Laboratory Consortium — The Embedded Processor Laboratory provides the means to design, develop, fabricate, and test embedded computers for missile guidance electronics systems in support...

  18. Multithreading in vector processors

    Science.gov (United States)

    Evangelinos, Constantinos; Kim, Changhoan; Nair, Ravi

    2018-01-16

    In one embodiment, a system includes a processor having a vector processing mode and a multithreading mode. The processor is configured to operate on one thread per cycle in the multithreading mode. The processor includes a program counter register having a plurality of program counters, and the program counter register is vectorized. Each program counter in the program counter register represents a distinct corresponding thread of a plurality of threads. The processor is configured to execute the plurality of threads by activating the plurality of program counters in a round robin cycle.

  19. Invasive tightly coupled processor arrays

    CERN Document Server

    LARI, VAHID

    2016-01-01

    This book introduces new massively parallel computer (MPSoC) architectures called invasive tightly coupled processor arrays. It proposes strategies, architecture designs, and programming interfaces for invasive TCPAs that allow invading and subsequently executing loop programs with strict requirements or guarantees of non-functional execution qualities such as performance, power consumption, and reliability. For the first time, such a configurable processor array architecture consisting of locally interconnected VLIW processing elements can be claimed by programs, either in full or in part, using the principle of invasive computing. Invasive TCPAs provide unprecedented energy efficiency for the parallel execution of nested loop programs by avoiding any global memory access such as GPUs and may even support loops with complex dependencies such as loop-carried dependencies that are not amenable to parallel execution on GPUs. For this purpose, the book proposes different invasion strategies for claiming a desire...

  20. Logistic Fuel Processor Development

    National Research Council Canada - National Science Library

    Salavani, Reza

    2004-01-01

    ... to light gases then steam reform the light gases into hydrogen rich stream. This report documents the efforts in developing a fuel processor capable of providing hydrogen to a 3kW fuel cell stack...

  1. 3081/E processor

    International Nuclear Information System (INIS)

    Kunz, P.F.; Gravina, M.; Oxoby, G.

    1984-04-01

    The 3081/E project was formed to prepare a much improved IBM mainframe emulator for the future. Its design is based on a large amount of experience in using the 168/E processor to increase available CPU power in both online and offline environments. The processor will be at least equal to the execution speed of a 370/168 and up to 1.5 times faster for heavy floating point code. A single processor will thus be at least four times more powerful than the VAX 11/780, and five processors on a system would equal at least the performance of the IBM 3081K. With its large memory space and simple but flexible high speed interface, the 3081/E is well suited for the online and offline needs of high energy physics in the future

  2. Logistic Fuel Processor Development

    National Research Council Canada - National Science Library

    Salavani, Reza

    2004-01-01

    The Air Base Technologies Division of the Air Force Research Laboratory has developed a logistic fuel processor that removes the sulfur content of the fuel and in the process converts logistic fuel...

  3. Adaptive signal processor

    Energy Technology Data Exchange (ETDEWEB)

    Walz, H.V.

    1980-07-01

    An experimental, general purpose adaptive signal processor system has been developed, utilizing a quantized (clipped) version of the Widrow-Hoff least-mean-square adaptive algorithm developed by Moschner. The system accommodates 64 adaptive weight channels with 8-bit resolution for each weight. Internal weight update arithmetic is performed with 16-bit resolution, and the system error signal is measured with 12-bit resolution. An adapt cycle of adjusting all 64 weight channels is accomplished in 8 ..mu..sec. Hardware of the signal processor utilizes primarily Schottky-TTL type integrated circuits. A prototype system with 24 weight channels has been constructed and tested. This report presents details of the system design and describes basic experiments performed with the prototype signal processor. Finally some system configurations and applications for this adaptive signal processor are discussed.

  4. Adaptive signal processor

    International Nuclear Information System (INIS)

    Walz, H.V.

    1980-07-01

    An experimental, general purpose adaptive signal processor system has been developed, utilizing a quantized (clipped) version of the Widrow-Hoff least-mean-square adaptive algorithm developed by Moschner. The system accommodates 64 adaptive weight channels with 8-bit resolution for each weight. Internal weight update arithmetic is performed with 16-bit resolution, and the system error signal is measured with 12-bit resolution. An adapt cycle of adjusting all 64 weight channels is accomplished in 8 μsec. Hardware of the signal processor utilizes primarily Schottky-TTL type integrated circuits. A prototype system with 24 weight channels has been constructed and tested. This report presents details of the system design and describes basic experiments performed with the prototype signal processor. Finally some system configurations and applications for this adaptive signal processor are discussed

  5. Array processor architecture

    Science.gov (United States)

    Barnes, George H. (Inventor); Lundstrom, Stephen F. (Inventor); Shafer, Philip E. (Inventor)

    1983-01-01

    A high speed parallel array data processing architecture fashioned under a computational envelope approach includes a data base memory for secondary storage of programs and data, and a plurality of memory modules interconnected to a plurality of processing modules by a connection network of the Omega gender. Programs and data are fed from the data base memory to the plurality of memory modules and from hence the programs are fed through the connection network to the array of processors (one copy of each program for each processor). Execution of the programs occur with the processors operating normally quite independently of each other in a multiprocessing fashion. For data dependent operations and other suitable operations, all processors are instructed to finish one given task or program branch before all are instructed to proceed in parallel processing fashion on the next instruction. Even when functioning in the parallel processing mode however, the processors are not locked-step but execute their own copy of the program individually unless or until another overall processor array synchronization instruction is issued.

  6. Functional unit for a processor

    NARCIS (Netherlands)

    Rohani, A.; Kerkhoff, Hans G.

    2013-01-01

    The invention relates to a functional unit for a processor, such as a Very Large Instruction Word Processor. The invention further relates to a processor comprising at least one such functional unit. The invention further relates to a functional unit and processor capable of mitigating the effect of

  7. The European project Merlin on multi-gigabit, energy-efficient, ruggedized lightwave engines for advanced on-board digital processors

    Science.gov (United States)

    Stampoulidis, L.; Kehayas, E.; Karppinen, M.; Tanskanen, A.; Heikkinen, V.; Westbergh, P.; Gustavsson, J.; Larsson, A.; Grüner-Nielsen, L.; Sotom, M.; Venet, N.; Ko, M.; Micusik, D.; Kissinger, D.; Ulusoy, A. C.; King, R.; Safaisini, R.

    2017-11-01

    Modern broadband communication networks rely on satellites to complement the terrestrial telecommunication infrastructure. Satellites accommodate global reach and enable world-wide direct broadcasting by facilitating wide access to the backbone network from remote sites or areas where the installation of ground segment infrastructure is not economically viable. At the same time the new broadband applications increase the bandwidth demands in every part of the network - and satellites are no exception. Modern telecom satellites incorporate On-Board Processors (OBP) having analogue-to-digital (ADC) and digital-to-analogue converters (DAC) at their inputs/outputs and making use of digital processing to handle hundreds of signals; as the amount of information exchanged increases, so do the physical size, mass and power consumption of the interconnects required to transfer massive amounts of data through bulk electric wires.

  8. 3081//sub E/ processor

    International Nuclear Information System (INIS)

    Kunz, P.F.; Gravina, M.; Oxoby, G.; Trang, Q.; Fucci, A.; Jacobs, D.; Martin, B.; Storr, K.

    1983-03-01

    Since the introduction of the 168//sub E/, emulating processors have been successful over an amazingly wide range of applications. This paper will describe a second generation processor, the 3081//sub E/. This new processor, which is being developed as a collaboration between SLAC and CERN, goes beyond just fixing the obvious faults of the 168//sub E/. Not only will the 3081//sub E/ have much more memory space, incorporate many more IBM instructions, and have much more memory space, incorporate many more IBM instructions, and have full double precision floating point arithmetic, but it will also have faster execution times and be much simpler to build, debug, and maintain. The simple interface and reasonable cost of the 168//sub E/ will be maintained for the 3081//sub E/

  9. Recursive Matrix Inverse Update On An Optical Processor

    Science.gov (United States)

    Casasent, David P.; Baranoski, Edward J.

    1988-02-01

    A high accuracy optical linear algebraic processor (OLAP) using the digital multiplication by analog convolution (DMAC) algorithm is described for use in an efficient matrix inverse update algorithm with speed and accuracy advantages. The solution of the parameters in the algorithm are addressed and the advantages of optical over digital linear algebraic processors are advanced.

  10. The Central Trigger Processor (CTP)

    CERN Multimedia

    Franchini, Matteo

    2016-01-01

    The Central Trigger Processor (CTP) receives trigger information from the calorimeter and muon trigger processors, as well as from other sources of trigger. It makes the Level-1 decision (L1A) based on a trigger menu.

  11. Very Long Instruction Word Processors

    Indian Academy of Sciences (India)

    Pentium Processor have modified the processor architecture to exploit parallelism in a program. .... The type of operation itself is encoded using 14 bits. .... text of designing simple architectures with low power consump- tion and execute x86 ...

  12. The Molen Polymorphic Media Processor

    NARCIS (Netherlands)

    Kuzmanov, G.K.

    2004-01-01

    In this dissertation, we address high performance media processing based on a tightly coupled co-processor architectural paradigm. More specifically, we introduce a reconfigurable media augmentation of a general purpose processor and implement it into a fully operational processor prototype. The

  13. Dual-core Itanium Processor

    CERN Multimedia

    2006-01-01

    Intel’s first dual-core Itanium processor, code-named "Montecito" is a major release of Intel's Itanium 2 Processor Family, which implements the Intel Itanium architecture on a dual-core processor with two cores per die (integrated circuit). Itanium 2 is much more powerful than its predecessor. It has lower power consumption and thermal dissipation.

  14. Multimode power processor

    Science.gov (United States)

    O'Sullivan, George A.; O'Sullivan, Joseph A.

    1999-01-01

    In one embodiment, a power processor which operates in three modes: an inverter mode wherein power is delivered from a battery to an AC power grid or load; a battery charger mode wherein the battery is charged by a generator; and a parallel mode wherein the generator supplies power to the AC power grid or load in parallel with the battery. In the parallel mode, the system adapts to arbitrary non-linear loads. The power processor may operate on a per-phase basis wherein the load may be synthetically transferred from one phase to another by way of a bumpless transfer which causes no interruption of power to the load when transferring energy sources. Voltage transients and frequency transients delivered to the load when switching between the generator and battery sources are minimized, thereby providing an uninterruptible power supply. The power processor may be used as part of a hybrid electrical power source system which may contain, in one embodiment, a photovoltaic array, diesel engine, and battery power sources.

  15. APRON: A Cellular Processor Array Simulation and Hardware Design Tool

    Science.gov (United States)

    Barr, David R. W.; Dudek, Piotr

    2009-12-01

    We present a software environment for the efficient simulation of cellular processor arrays (CPAs). This software (APRON) is used to explore algorithms that are designed for massively parallel fine-grained processor arrays, topographic multilayer neural networks, vision chips with SIMD processor arrays, and related architectures. The software uses a highly optimised core combined with a flexible compiler to provide the user with tools for the design of new processor array hardware architectures and the emulation of existing devices. We present performance benchmarks for the software processor array implemented on standard commodity microprocessors. APRON can be configured to use additional processing hardware if necessary and can be used as a complete graphical user interface and development environment for new or existing CPA systems, allowing more users to develop algorithms for CPA systems.

  16. New development for low energy electron beam processor

    International Nuclear Information System (INIS)

    Takei, Taro; Goto, Hitoshi; Oizumi, Matsutoshi; Hirakawa, Tetsuya; Ochi, Masafumi

    2003-01-01

    Newly developed low-energy electron beam (EB) processors that have unique designs and configurations compared to conventional ones enable electron-beam treatment of small three-dimensional objects, such as grain-like agricultural products and small plastic parts. As the EB processor can irradiate the products from the whole angles, the uniform EB treatment can be achieved at one time regardless the complex shapes of the product. Here presented are two new EB processors: the first system has cylindrical process zone, which allows three-dimensional objects to be irradiated with one-pass treatment. The second is a tube-type small EB processor, achieving not only its compactor design, but also higher beam extraction efficiency and flexible installation of the irradiation heads. The basic design of each processor and potential applications with them will be presented in this paper. (author)

  17. APRON: A Cellular Processor Array Simulation and Hardware Design Tool

    Directory of Open Access Journals (Sweden)

    David R. W. Barr

    2009-01-01

    Full Text Available We present a software environment for the efficient simulation of cellular processor arrays (CPAs. This software (APRON is used to explore algorithms that are designed for massively parallel fine-grained processor arrays, topographic multilayer neural networks, vision chips with SIMD processor arrays, and related architectures. The software uses a highly optimised core combined with a flexible compiler to provide the user with tools for the design of new processor array hardware architectures and the emulation of existing devices. We present performance benchmarks for the software processor array implemented on standard commodity microprocessors. APRON can be configured to use additional processing hardware if necessary and can be used as a complete graphical user interface and development environment for new or existing CPA systems, allowing more users to develop algorithms for CPA systems.

  18. Video frame processor

    International Nuclear Information System (INIS)

    Joshi, V.M.; Agashe, Alok; Bairi, B.R.

    1993-01-01

    This report provides technical description regarding the Video Frame Processor (VFP) developed at Bhabha Atomic Research Centre. The instrument provides capture of video images available in CCIR format. Two memory planes each with a capacity of 512 x 512 x 8 bit data enable storage of two video image frames. The stored image can be processed on-line and on-line image subtraction can also be carried out for image comparisons. The VFP is a PC Add-on board and is I/O mapped within the host IBM PC/AT compatible computer. (author). 9 refs., 4 figs., 19 photographs

  19. Trigger and decision processors

    International Nuclear Information System (INIS)

    Franke, G.

    1980-11-01

    In recent years there have been many attempts in high energy physics to make trigger and decision processes faster and more sophisticated. This became necessary due to a permanent increase of the number of sensitive detector elements in wire chambers and calorimeters, and in fact it was possible because of the fast developments in integrated circuits technique. In this paper the present situation will be reviewed. The discussion will be mainly focussed upon event filtering by pure software methods and - rather hardware related - microprogrammable processors as well as random access memory triggers. (orig.)

  20. Optical Finite Element Processor

    Science.gov (United States)

    Casasent, David; Taylor, Bradley K.

    1986-01-01

    A new high-accuracy optical linear algebra processor (OLAP) with many advantageous features is described. It achieves floating point accuracy, handles bipolar data by sign-magnitude representation, performs LU decomposition using only one channel, easily partitions and considers data flow. A new application (finite element (FE) structural analysis) for OLAPs is introduced and the results of a case study presented. Error sources in encoded OLAPs are addressed for the first time. Their modeling and simulation are discussed and quantitative data are presented. Dominant error sources and the effects of composite error sources are analyzed.

  1. AMD's 64-bit Opteron processor

    CERN Multimedia

    CERN. Geneva

    2003-01-01

    This talk concentrates on issues that relate to obtaining peak performance from the Opteron processor. Compiler options, memory layout, MPI issues in multi-processor configurations and the use of a NUMA kernel will be covered. A discussion of recent benchmarking projects and results will also be included.BiographiesDavid RichDavid directs AMD's efforts in high performance computing and also in the use of Opteron processors...

  2. Distributed processor systems

    International Nuclear Information System (INIS)

    Zacharov, B.

    1976-01-01

    In recent years, there has been a growing tendency in high-energy physics and in other fields to solve computational problems by distributing tasks among the resources of inter-coupled processing devices and associated system elements. This trend has gained further momentum more recently with the increased availability of low-cost processors and with the development of the means of data distribution. In two lectures, the broad question of distributed computing systems is examined and the historical development of such systems reviewed. An attempt is made to examine the reasons for the existence of these systems and to discern the main trends for the future. The components of distributed systems are discussed in some detail and particular emphasis is placed on the importance of standards and conventions in certain key system components. The ideas and principles of distributed systems are discussed in general terms, but these are illustrated by a number of concrete examples drawn from the context of the high-energy physics environment. (Auth.)

  3. Processors and systems (picture processing)

    Energy Technology Data Exchange (ETDEWEB)

    Gemmar, P

    1983-01-01

    Automatic picture processing requires high performance computers and high transmission capacities in the processor units. The author examines the possibilities of operating processors in parallel in order to accelerate the processing of pictures. He therefore discusses a number of available processors and systems for picture processing and illustrates their capacities for special types of picture processing. He stresses the fact that the amount of storage required for picture processing is exceptionally high. The author concludes that it is as yet difficult to decide whether very large groups of simple processors or highly complex multiprocessor systems will provide the best solution. Both methods will be aided by the development of VLSI. New solutions have already been offered (systolic arrays and 3-d processing structures) but they also are subject to losses caused by inherently parallel algorithms. Greater efforts must be made to produce suitable software for multiprocessor systems. Some possibilities for future picture processing systems are discussed. 33 references.

  4. Seismometer array station processors

    International Nuclear Information System (INIS)

    Key, F.A.; Lea, T.G.; Douglas, A.

    1977-01-01

    A description is given of the design, construction and initial testing of two types of Seismometer Array Station Processor (SASP), one to work with data stored on magnetic tape in analogue form, the other with data in digital form. The purpose of a SASP is to detect the short period P waves recorded by a UK-type array of 20 seismometers and to edit these on to a a digital library tape or disc. The edited data are then processed to obtain a rough location for the source and to produce seismograms (after optimum processing) for analysis by a seismologist. SASPs are an important component in the scheme for monitoring underground explosions advocated by the UK in the Conference of the Committee on Disarmament. With digital input a SASP can operate at 30 times real time using a linear detection process and at 20 times real time using the log detector of Weichert. Although the log detector is slower, it has the advantage over the linear detector that signals with lower signal-to-noise ratio can be detected and spurious large amplitudes are less likely to produce a detection. It is recommended, therefore, that where possible array data should be recorded in digital form for input to a SASP and that the log detector of Weichert be used. Trial runs show that a SASP is capable of detecting signals down to signal-to-noise ratios of about two with very few false detections, and at mid-continental array sites it should be capable of detecting most, if not all, the signals with magnitude above msub(b) 4.5; the UK argues that, given a suitable network, it is realistic to hope that sources of this magnitude and above can be detected and identified by seismological means alone. (author)

  5. Techniques for optimizing inerting in electron processors

    International Nuclear Information System (INIS)

    Rangwalla, I.J.; Korn, D.J.; Nablo, S.V.

    1993-01-01

    The design of an ''inert gas'' distribution system in an electron processor must satisfy a number of requirements. The first of these is the elimination or control of beam produced ozone and NO x which can be transported from the process zone by the product into the work area. Since the tolerable levels for O 3 in occupied areas around the processor are 3 in the beam heated process zone, or exhausting and dilution of the gas at the processor exit. The second requirement of the inerting system is to provide a suitable environment for completing efficient, free radical initiated addition polymerization. The competition between radical loss through de-excitation and that from O 2 quenching must be understood. This group has used gas chromatographic analysis of electron cured coatings to study the trade-offs of delivered dose, dose rate and O 2 concentrations in the process zone to determine the tolerable ranges of parameter excursions for production quality control purposes. These techniques are described for an ink coating system on paperboard, where a broad range of process parameters have been studied (D, D radical, O 2 ). It is then shown how the technique is used to optimize the use of higher purity (10-100 ppm O 2 ) nitrogen gas for inerting, in combination with lower purity (2-20,000 ppm O 2 ) non-cryogenically produced gas, as from a membrane or pressure swing adsorption generators. (author)

  6. Dual-scale topology optoelectronic processor.

    Science.gov (United States)

    Marsden, G C; Krishnamoorthy, A V; Esener, S C; Lee, S H

    1991-12-15

    The dual-scale topology optoelectronic processor (D-STOP) is a parallel optoelectronic architecture for matrix algebraic processing. The architecture can be used for matrix-vector multiplication and two types of vector outer product. The computations are performed electronically, which allows multiplication and summation concepts in linear algebra to be generalized to various nonlinear or symbolic operations. This generalization permits the application of D-STOP to many computational problems. The architecture uses a minimum number of optical transmitters, which thereby reduces fabrication requirements while maintaining area-efficient electronics. The necessary optical interconnections are space invariant, minimizing space-bandwidth requirements.

  7. XL-100S microprogrammable processor

    International Nuclear Information System (INIS)

    Gorbunov, N.V.; Guzik, Z.; Sutulin, V.A.; Forytski, A.

    1983-01-01

    The XL-100S microprogrammable processor providing the multiprocessor operation mode in the XL system crate is described. The processor meets the EUR 6500 CAMAC standards, address up to 4 Mbyte memory, and interacts with 7 CAMAC branchas. Eight external requests initiate operations preset by a sequence of microcommands in a memory of the capacity up to 64 kwords of 32-Git. The microprocessor architecture allows one to emulate commands of the majority of mini- or micro-computers, including floating point operations. The XL-100S processor may be used in various branches of experimental physics: for physical experiment apparatus control, fast selection of useful physical events, organization of the of input/output operations, organization of direct assess to memory included, etc. The Am2900 microprocessor set is used as an elementary base. The device is made in the form of a single width CAMAC module

  8. Making CSB + -Trees Processor Conscious

    DEFF Research Database (Denmark)

    Samuel, Michael; Pedersen, Anders Uhl; Bonnet, Philippe

    2005-01-01

    of the CSB+-tree. We argue that it is necessary to consider a larger group of parameters in order to adapt CSB+-tree to processor architectures as different as Pentium and Itanium. We identify this group of parameters and study how it impacts the performance of CSB+-tree on Itanium 2. Finally, we propose......Cache-conscious indexes, such as CSB+-tree, are sensitive to the underlying processor architecture. In this paper, we focus on how to adapt the CSB+-tree so that it performs well on a range of different processor architectures. Previous work has focused on the impact of node size on the performance...... a systematic method for adapting CSB+-tree to new platforms. This work is a first step towards integrating CSB+-tree in MySQL’s heap storage manager....

  9. Optical Array Processor: Laboratory Results

    Science.gov (United States)

    Casasent, David; Jackson, James; Vaerewyck, Gerard

    1987-01-01

    A Space Integrating (SI) Optical Linear Algebra Processor (OLAP) is described and laboratory results on its performance in several practical engineering problems are presented. The applications include its use in the solution of a nonlinear matrix equation for optimal control and a parabolic Partial Differential Equation (PDE), the transient diffusion equation with two spatial variables. Frequency-multiplexed, analog and high accuracy non-base-two data encoding are used and discussed. A multi-processor OLAP architecture is described and partitioning and data flow issues are addressed.

  10. Fast processor for dilepton triggers

    International Nuclear Information System (INIS)

    Katsanevas, S.; Kostarakis, P.; Baltrusaitis, R.

    1983-01-01

    We describe a fast trigger processor, developed for and used in Fermilab experiment E-537, for selecting high-mass dimuon events produced by negative pions and anti-protons. The processor finds candidate tracks by matching hit information received from drift chambers and scintillation counters, and determines their momenta. Invariant masses are calculated for all possible pairs of tracks and an event is accepted if any invariant mass is greater than some preselectable minimum mass. The whole process, accomplished within 5 to 10 microseconds, achieves up to a ten-fold reduction in trigger rate

  11. Extended performance electric propulsion power processor design study. Volume 2: Technical summary

    Science.gov (United States)

    Biess, J. J.; Inouye, L. Y.; Schoenfeld, A. D.

    1977-01-01

    Electric propulsion power processor technology has processed during the past decade to the point that it is considered ready for application. Several power processor design concepts were evaluated and compared. Emphasis was placed on a 30 cm ion thruster power processor with a beam power rating supply of 2.2KW to 10KW for the main propulsion power stage. Extension in power processor performance were defined and were designed in sufficient detail to determine efficiency, component weight, part count, reliability and thermal control. A detail design was performed on a microprocessor as the thyristor power processor controller. A reliability analysis was performed to evaluate the effect of the control electronics redesign. Preliminary electrical design, mechanical design and thermal analysis were performed on a 6KW power transformer for the beam supply. Bi-Mod mechanical, structural and thermal control configurations were evaluated for the power processor and preliminary estimates of mechanical weight were determined.

  12. SAPIENS: Spreading Activation Processor for Information Encoded in Network Structures. Technical Report No. 296.

    Science.gov (United States)

    Ortony, Andrew; Radin, Dean I.

    The product of researchers' efforts to develop a computer processor which distinguishes between relevant and irrelevant information in the database, Spreading Activation Processor for Information Encoded in Network Structures (SAPIENS) exhibits (1) context sensitivity, (2) efficiency, (3) decreasing activation over time, (4) summation of…

  13. Very Long Instruction Word Processors

    Indian Academy of Sciences (India)

    Explicitly Parallel Instruction Computing (EPIC) is an instruction processing paradigm that has been in the spot- light due to its adoption by the next generation of Intel. Processors starting with the IA-64. The EPIC processing paradigm is an evolution of the Very Long Instruction. Word (VLIW) paradigm. This article gives an ...

  14. An implementation of the SANE Virtual Processor using POSIX threads

    NARCIS (Netherlands)

    van Tol, M.W.; Jesshope, C.R.; Lankamp, M.; Polstra, S.

    2009-01-01

    The SANE Virtual Processor (SVP) is an abstract concurrent programming model that is both deadlock free and supports efficient implementation. It is captured by the μTC programming language. The work presented in this paper covers a portable implementation of this model as a C++ library on top of

  15. Optimization of Particle-in-Cell Codes on RISC Processors

    Science.gov (United States)

    Decyk, Viktor K.; Karmesin, Steve Roy; Boer, Aeint de; Liewer, Paulette C.

    1996-01-01

    General strategies are developed to optimize particle-cell-codes written in Fortran for RISC processors which are commonly used on massively parallel computers. These strategies include data reorganization to improve cache utilization and code reorganization to improve efficiency of arithmetic pipelines.

  16. Matrix preconditioning: a robust operation for optical linear algebra processors.

    Science.gov (United States)

    Ghosh, A; Paparao, P

    1987-07-15

    Analog electrooptical processors are best suited for applications demanding high computational throughput with tolerance for inaccuracies. Matrix preconditioning is one such application. Matrix preconditioning is a preprocessing step for reducing the condition number of a matrix and is used extensively with gradient algorithms for increasing the rate of convergence and improving the accuracy of the solution. In this paper, we describe a simple parallel algorithm for matrix preconditioning, which can be implemented efficiently on a pipelined optical linear algebra processor. From the results of our numerical experiments we show that the efficacy of the preconditioning algorithm is affected very little by the errors of the optical system.

  17. Ring-array processor distribution topology for optical interconnects

    Science.gov (United States)

    Li, Yao; Ha, Berlin; Wang, Ting; Wang, Sunyu; Katz, A.; Lu, X. J.; Kanterakis, E.

    1992-01-01

    The existing linear and rectangular processor distribution topologies for optical interconnects, although promising in many respects, cannot solve problems such as clock skews, the lack of supporting elements for efficient optical implementation, etc. The use of a ring-array processor distribution topology, however, can overcome these problems. Here, a study of the ring-array topology is conducted with an aim of implementing various fast clock rate, high-performance, compact optical networks for digital electronic multiprocessor computers. Practical design issues are addressed. Some proof-of-principle experimental results are included.

  18. VON WISPR Family Processors: Volume 1

    National Research Council Canada - National Science Library

    Wagstaff, Ronald

    1997-01-01

    ...) and the background noise they are embedded in. Processors utilizing those fluctuations such as the von WISPR Family Processors discussed herein, are methods or algorithms that preferentially attenuate the fluctuating signals and noise...

  19. Design Principles for Synthesizable Processor Cores

    DEFF Research Database (Denmark)

    Schleuniger, Pascal; McKee, Sally A.; Karlsson, Sven

    2012-01-01

    As FPGAs get more competitive, synthesizable processor cores become an attractive choice for embedded computing. Currently popular commercial processor cores do not fully exploit current FPGA architectures. In this paper, we propose general design principles to increase instruction throughput...

  20. Token-Aware Completion Functions for Elastic Processor Verification

    Directory of Open Access Journals (Sweden)

    Sudarshan K. Srinivasan

    2009-01-01

    Full Text Available We develop a formal verification procedure to check that elastic pipelined processor designs correctly implement their instruction set architecture (ISA specifications. The notion of correctness we use is based on refinement. Refinement proofs are based on refinement maps, which—in the context of this problem—are functions that map elastic processor states to states of the ISA specification model. Data flow in elastic architectures is complicated by the insertion of any number of buffers in any place in the design, making it hard to construct refinement maps for elastic systems in a systematic manner. We introduce token-aware completion functions, which incorporate a mechanism to track the flow of data in elastic pipelines, as a highly automated and systematic approach to construct refinement maps. We demonstrate the efficiency of the overall verification procedure based on token-aware completion functions using six elastic pipelined processor models based on the DLX architecture.

  1. A programmable systolic trigger processor for FERA bus data

    International Nuclear Information System (INIS)

    Appelquist, G.; Hovander, B.; Sellden, B.; Bohm, C.

    1992-09-01

    A generic CAMAC based trigger processor module for fast processing of large amounts of ADC data, has been designed. This module has been realised using complex programmable gate arrays (LCAs from XILINX). The gate arrays have been connected to memories and multipliers in such a way that different gate array configurations can cover a wide range of module applications. Using this module, it is possible to construct complex trigger processors. The module uses both the fast ECL FERA bus and the CAMAC bus for inputs and outputs. The latter, however, is primarily used for set-up and control but may also be used for data output. Large numbers of ADCs can be served by a hierarchical arrangement of trigger processor modules, processing ADC data with pipe-line arithmetics producing the final result at the apex of the pyramid. The trigger decision will be transmitted to the data acquisition system via a logic signal while numeric results may be extracted by the CAMAC controller. The trigger processor was originally developed for the proposed neutral particle search experiment at CERN, NUMASS. There it was designed to serve as a second level trigger processor. It was required to correct all ADC raw data for efficiency and pedestal, calculate the total calorimeter energy, obtain the optimal time of flight data and calculate the particle mass. A suitable mass cut would then deliver the trigger decision. More complex triggers were also considered. (au)

  2. JPP: A Java Pre-Processor

    OpenAIRE

    Kiniry, Joseph R.; Cheong, Elaine

    1998-01-01

    The Java Pre-Processor, or JPP for short, is a parsing pre-processor for the Java programming language. Unlike its namesake (the C/C++ Pre-Processor, cpp), JPP provides functionality above and beyond simple textual substitution. JPP's capabilities include code beautification, code standard conformance checking, class and interface specification and testing, and documentation generation.

  3. Online Fastbus processor for LEP

    International Nuclear Information System (INIS)

    Mueller, H.

    1986-01-01

    The author describes the online computing aspects of Fastbus systems using a processor module which has been developed at CERN and is now available commercially. These General Purpose Master/Slaves (GPMS) are based on 68000/10 (or optionally 68020/68881) processors. Applications include use as event-filters (DELPHI), supervisory controllers, Fastbus stand-alone diagnostic tools, and multiprocessor array components. The direct mapping of single, 32-bit assembly instructions to execute Fastbus protocols makes the use of a GPM both simple and flexible. Loosely coupled processing in Fastbus networks is possible between GPM's as they support access semaphores and use a two port memory as I/O buffer for Fastbus. Both master and slave-ports support block transfers up to 20 Mbytes/s. The CERN standard Fastbus software and the MoniCa symbolic debugging monitor are available on the GPM with real time, multiprocessing support. (Auth.)

  4. Automotive Fuel Processor Development and Demonstration with Fuel Cell Systems

    Energy Technology Data Exchange (ETDEWEB)

    Nuvera Fuel Cells

    2005-04-15

    The potential for fuel cell systems to improve energy efficiency and reduce emissions over conventional power systems has generated significant interest in fuel cell technologies. While fuel cells are being investigated for use in many applications such as stationary power generation and small portable devices, transportation applications present some unique challenges for fuel cell technology. Due to their lower operating temperature and non-brittle materials, most transportation work is focusing on fuel cells using proton exchange membrane (PEM) technology. Since PEM fuel cells are fueled by hydrogen, major obstacles to their widespread use are the lack of an available hydrogen fueling infrastructure and hydrogen's relatively low energy storage density, which leads to a much lower driving range than conventional vehicles. One potential solution to the hydrogen infrastructure and storage density issues is to convert a conventional fuel such as gasoline into hydrogen onboard the vehicle using a fuel processor. Figure 2 shows that gasoline stores roughly 7 times more energy per volume than pressurized hydrogen gas at 700 bar and 4 times more than liquid hydrogen. If integrated properly, the fuel processor/fuel cell system would also be more efficient than traditional engines and would give a fuel economy benefit while hydrogen storage and distribution issues are being investigated. Widespread implementation of fuel processor/fuel cell systems requires improvements in several aspects of the technology, including size, startup time, transient response time, and cost. In addition, the ability to operate on a number of hydrocarbon fuels that are available through the existing infrastructure is a key enabler for commercializing these systems. In this program, Nuvera Fuel Cells collaborated with the Department of Energy (DOE) to develop efficient, low-emission, multi-fuel processors for transportation applications. Nuvera's focus was on (1) developing fuel

  5. The ALICE Central Trigger Processor (CTP) upgrade

    International Nuclear Information System (INIS)

    Krivda, M.; Alexandre, D.; Barnby, L.S.; Evans, D.; Jones, P.G.; Jusko, A.; Lietava, R.; Baillie, O. Villalobos; Pospíšil, J.

    2016-01-01

    The ALICE Central Trigger Processor (CTP) at the CERN LHC has been upgraded for LHC Run 2, to improve the Transition Radiation Detector (TRD) data-taking efficiency and to improve the physics performance of ALICE. There is a new additional CTP interaction record sent using a new second Detector Data Link (DDL), a 2 GB DDR3 memory and an extension of functionality for classes. The CTP switch has been incorporated directly onto the new LM0 board. A design proposal for an ALICE CTP upgrade for LHC Run 3 is also presented. Part of the development is a low latency high bandwidth interface whose purpose is to minimize an overall trigger latency

  6. Rapid prototyping and evaluation of programmable SIMD SDR processors in LISA

    Science.gov (United States)

    Chen, Ting; Liu, Hengzhu; Zhang, Botao; Liu, Dongpei

    2013-03-01

    With the development of international wireless communication standards, there is an increase in computational requirement for baseband signal processors. Time-to-market pressure makes it impossible to completely redesign new processors for the evolving standards. Due to its high flexibility and low power, software defined radio (SDR) digital signal processors have been proposed as promising technology to replace traditional ASIC and FPGA fashions. In addition, there are large numbers of parallel data processed in computation-intensive functions, which fosters the development of single instruction multiple data (SIMD) architecture in SDR platform. So a new way must be found to prototype the SDR processors efficiently. In this paper we present a bit-and-cycle accurate model of programmable SIMD SDR processors in a machine description language LISA. LISA is a language for instruction set architecture which can gain rapid model at architectural level. In order to evaluate the availability of our proposed processor, three common baseband functions, FFT, FIR digital filter and matrix multiplication have been mapped on the SDR platform. Analytical results showed that the SDR processor achieved the maximum of 47.1% performance boost relative to the opponent processor.

  7. Accuracies Of Optical Processors For Adaptive Optics

    Science.gov (United States)

    Downie, John D.; Goodman, Joseph W.

    1992-01-01

    Paper presents analysis of accuracies and requirements concerning accuracies of optical linear-algebra processors (OLAP's) in adaptive-optics imaging systems. Much faster than digital electronic processor and eliminate some residual distortion. Question whether errors introduced by analog processing of OLAP overcome advantage of greater speed. Paper addresses issue by presenting estimate of accuracy required in general OLAP that yields smaller average residual aberration of wave front than digital electronic processor computing at given speed.

  8. Functional Verification of Enhanced RISC Processor

    OpenAIRE

    SHANKER NILANGI; SOWMYA L

    2013-01-01

    This paper presents design and verification of a 32-bit enhanced RISC processor core having floating point computations integrated within the core, has been designed to reduce the cost and complexity. The designed 3 stage pipelined 32-bit RISC processor is based on the ARM7 processor architecture with single precision floating point multiplier, floating point adder/subtractor for floating point operations and 32 x 32 booths multiplier added to the integer core of ARM7. The binary representati...

  9. Alternative Water Processor Test Development

    Science.gov (United States)

    Pickering, Karen D.; Mitchell, Julie; Vega, Leticia; Adam, Niklas; Flynn, Michael; Wjee (er. Rau); Lunn, Griffin; Jackson, Andrew

    2012-01-01

    The Next Generation Life Support Project is developing an Alternative Water Processor (AWP) as a candidate water recovery system for long duration exploration missions. The AWP consists of biological water processor (BWP) integrated with a forward osmosis secondary treatment system (FOST). The basis of the BWP is a membrane aerated biological reactor (MABR), developed in concert with Texas Tech University. Bacteria located within the MABR metabolize organic material in wastewater, converting approximately 90% of the total organic carbon to carbon dioxide. In addition, bacteria convert a portion of the ammonia-nitrogen present in the wastewater to nitrogen gas, through a combination of nitrogen and denitrification. The effluent from the BWP system is low in organic contaminants, but high in total dissolved solids. The FOST system, integrated downstream of the BWP, removes dissolved solids through a combination of concentration-driven forward osmosis and pressure driven reverse osmosis. The integrated system is expected to produce water with a total organic carbon less than 50 mg/l and dissolved solids that meet potable water requirements for spaceflight. This paper describes the test definition, the design of the BWP and FOST subsystems, and plans for integrated testing.

  10. The UA1 trigger processor

    International Nuclear Information System (INIS)

    Grayer, G.H.

    1981-01-01

    Experiment UA1 is a large multi-purpose spectrometer at the CERN proton-antiproton collider, scheduled for late 1981. The principal trigger is formed on the basis of the energy deposition in calorimeters. A trigger decision taken in under 2.4 microseconds can avoid dead time losses due to the bunched nature of the beam. To achieve this we have built fast 8-bit charge to digital converters followed by two identical digital processors tailored to the experiment. The outputs of groups of the 2440 photomultipliers in the calorimeters are summed to form a total of 288 input channels to the ADCs. A look-up table in RAM is used to convert the digitised photomultiplier signals to energy in one processor, combinations of input channels, and also counts the number of clusters with electromagnetic or hadronic energy above pre-determined levels. Up to twelve combinations of these conditions, together with external information, may be combined in coincidence or in veto to form the final trigger. Provision has been made for testing using simulated data in an off-line mode, and sampling real data when on-line. (orig.)

  11. Data register and processor for multiwire chambers

    International Nuclear Information System (INIS)

    Karpukhin, V.V.

    1985-01-01

    A data register and a processor for data receiving and processing from drift chambers of a device for investigating relativistic positroniums are described. The data are delivered to the register input in the form of the Grey 8 bit code, memorized and transformed to a position code. The register information is delivered to the KAMAK trunk and to the front panel plug. The processor selects particle tracks in a horizontal plane of the facility. ΔY maximum coordinate divergence and minimum point quantity on the track are set from the processor front panel. Processor solution time is 16 μs maximum quantity of simultaneously analyzed coordinates is 16

  12. Many - body simulations using an array processor

    International Nuclear Information System (INIS)

    Rapaport, D.C.

    1985-01-01

    Simulations of microscopic models of water and polypeptides using molecular dynamics and Monte Carlo techniques have been carried out with the aid of an FPS array processor. The computational techniques are discussed, with emphasis on the development and optimization of the software to take account of the special features of the processor. The computing requirements of these simulations exceed what could be reasonably carried out on a normal 'scientific' computer. While the FPS processor is highly suited to the kinds of models described, several other computationally intensive problems in statistical mechanics are outlined for which alternative processor architectures are more appropriate

  13. Sensitometric control of roentgen film processors

    International Nuclear Information System (INIS)

    Forsberg, H.; Karolinska Sjukhuset, Stockholm

    1987-01-01

    Monitoring of film processors performance is essential since image quality, patient dose and costs are influenced by the performance. A system for sensitometric constancy control of film processors and their associated components is described. Experience with the system for 3 years is given when implemented on 17 film processors. Modern high quality film processors have a stability that makes a test frequency of once a week sufficient to maintain adequate image quality. The test system is so sensitive that corrective actions almost invariably have been taken before any technical problem degraded the image quality to a visible degree. (orig.)

  14. Interactive high-resolution isosurface ray casting on multicore processors.

    Science.gov (United States)

    Wang, Qin; JaJa, Joseph

    2008-01-01

    We present a new method for the interactive rendering of isosurfaces using ray casting on multi-core processors. This method consists of a combination of an object-order traversal that coarsely identifies possible candidate 3D data blocks for each small set of contiguous pixels, and an isosurface ray casting strategy tailored for the resulting limited-size lists of candidate 3D data blocks. While static screen partitioning is widely used in the literature, our scheme performs dynamic allocation of groups of ray casting tasks to ensure almost equal loads among the different threads running on multi-cores while maintaining spatial locality. We also make careful use of memory management environment commonly present in multi-core processors. We test our system on a two-processor Clovertown platform, each consisting of a Quad-Core 1.86-GHz Intel Xeon Processor, for a number of widely different benchmarks. The detailed experimental results show that our system is efficient and scalable, and achieves high cache performance and excellent load balancing, resulting in an overall performance that is superior to any of the previous algorithms. In fact, we achieve an interactive isosurface rendering on a 1024(2) screen for all the datasets tested up to the maximum size of the main memory of our platform.

  15. A Modular Pipelined Processor for High Resolution Gamma-Ray Spectroscopy

    Science.gov (United States)

    Veiga, Alejandro; Grunfeld, Christian

    2016-02-01

    The design of a digital signal processor for gamma-ray applications is presented in which a single ADC input can simultaneously provide temporal and energy characterization of gamma radiation for a wide range of applications. Applying pipelining techniques, the processor is able to manage and synchronize very large volumes of streamed real-time data. Its modular user interface provides a flexible environment for experimental design. The processor can fit in a medium-sized FPGA device operating at ADC sampling frequency, providing an efficient solution for multi-channel applications. Two experiments are presented in order to characterize its temporal and energy resolution.

  16. 'Iconic' tracking algorithms for high energy physics using the TRAX-I massively parallel processor

    International Nuclear Information System (INIS)

    Vesztergombi, G.

    1989-01-01

    TRAX-I, a cost-effective parallel microcomputer, applying associative string processor (ASP) architecture with 16 K parallel processing elements, is being built by Aspex Microsystems Ltd. (UK). When applied to the tracking problem of very complex events with several hundred tracks, the large number of processors allows one to dedicate one or more processors to each wire (in MWPC), each pixel (in digitized images from streamer chambers or other visual detectors), or each pad (in TPC) to perform very efficient pattern recognition. Some linear tracking algorithms based on this ''ionic'' representation are presented. (orig.)

  17. 'Iconic' tracking algorithms for high energy physics using the TRAX-I massively parallel processor

    International Nuclear Information System (INIS)

    Vestergombi, G.

    1989-11-01

    TRAX-I, a cost-effective parallel microcomputer, applying Associative String Processor (ASP) architecture with 16 K parallel processing elements, is being built by Aspex Microsystems Ltd. (UK). When applied to the tracking problem of very complex events with several hundred tracks, the large number of processors allows one to dedicate one or more processors to each wire (in MWPC), each pixel (in digitized images from streamer chambers or other visual detectors), or each pad (in TPC) to perform very efficient pattern recognition. Some linear tracking algorithms based on this 'iconic' representation are presented. (orig.)

  18. Monte Carlo photon transport on shared memory and distributed memory parallel processors

    International Nuclear Information System (INIS)

    Martin, W.R.; Wan, T.C.; Abdel-Rahman, T.S.; Mudge, T.N.; Miura, K.

    1987-01-01

    Parallelized Monte Carlo algorithms for analyzing photon transport in an inertially confined fusion (ICF) plasma are considered. Algorithms were developed for shared memory (vector and scalar) and distributed memory (scalar) parallel processors. The shared memory algorithm was implemented on the IBM 3090/400, and timing results are presented for dedicated runs with two, three, and four processors. Two alternative distributed memory algorithms (replication and dispatching) were implemented on a hypercube parallel processor (1 through 64 nodes). The replication algorithm yields essentially full efficiency for all cube sizes; with the 64-node configuration, the absolute performance is nearly the same as with the CRAY X-MP. The dispatching algorithm also yields efficiencies above 80% in a large simulation for the 64-processor configuration

  19. A light hydrocarbon fuel processor producing high-purity hydrogen

    Science.gov (United States)

    Löffler, Daniel G.; Taylor, Kyle; Mason, Dylan

    This paper discusses the design process and presents performance data for a dual fuel (natural gas and LPG) fuel processor for PEM fuel cells delivering between 2 and 8 kW electric power in stationary applications. The fuel processor resulted from a series of design compromises made to address different design constraints. First, the product quality was selected; then, the unit operations needed to achieve that product quality were chosen from the pool of available technologies. Next, the specific equipment needed for each unit operation was selected. Finally, the unit operations were thermally integrated to achieve high thermal efficiency. Early in the design process, it was decided that the fuel processor would deliver high-purity hydrogen. Hydrogen can be separated from other gases by pressure-driven processes based on either selective adsorption or permeation. The pressure requirement made steam reforming (SR) the preferred reforming technology because it does not require compression of combustion air; therefore, steam reforming is more efficient in a high-pressure fuel processor than alternative technologies like autothermal reforming (ATR) or partial oxidation (POX), where the combustion occurs at the pressure of the process stream. A low-temperature pre-reformer reactor is needed upstream of a steam reformer to suppress coke formation; yet, low temperatures facilitate the formation of metal sulfides that deactivate the catalyst. For this reason, a desulfurization unit is needed upstream of the pre-reformer. Hydrogen separation was implemented using a palladium alloy membrane. Packed beds were chosen for the pre-reformer and reformer reactors primarily because of their low cost, relatively simple operation and low maintenance. Commercial, off-the-shelf balance of plant (BOP) components (pumps, valves, and heat exchangers) were used to integrate the unit operations. The fuel processor delivers up to 100 slm hydrogen >99.9% pure with <1 ppm CO, <3 ppm CO 2. The

  20. Micro processors for plant protection

    International Nuclear Information System (INIS)

    McAffer, N.T.C.

    1976-01-01

    Micro computers can be used satisfactorily in general protection duties with economic advantages over hardwired systems. The reliability of such protection functions can be enhanced by keeping the task performed by each protection micro processor simple and by avoiding such a task being dependent on others in any substantial way. This implies that vital work done for any task is kept within it and that any communications from it to outside or to it from outside are restricted to those for controlling data transfer. Also that the amount of this data should be the minimum consistent with satisfactory task execution. Technology is changing rapidly and devices may become obsolete and be supplanted by new ones before their theoretical reliability can be confirmed or otherwise by field service. This emphasises the need for users to pool device performance data so that effective reliability judgements can be made within the lifetime of the devices. (orig.) [de

  1. Scientific Computing Kernels on the Cell Processor

    Energy Technology Data Exchange (ETDEWEB)

    Williams, Samuel W.; Shalf, John; Oliker, Leonid; Kamil, Shoaib; Husbands, Parry; Yelick, Katherine

    2007-04-04

    The slowing pace of commodity microprocessor performance improvements combined with ever-increasing chip power demands has become of utmost concern to computational scientists. As a result, the high performance computing community is examining alternative architectures that address the limitations of modern cache-based designs. In this work, we examine the potential of using the recently-released STI Cell processor as a building block for future high-end computing systems. Our work contains several novel contributions. First, we introduce a performance model for Cell and apply it to several key scientific computing kernels: dense matrix multiply, sparse matrix vector multiply, stencil computations, and 1D/2D FFTs. The difficulty of programming Cell, which requires assembly level intrinsics for the best performance, makes this model useful as an initial step in algorithm design and evaluation. Next, we validate the accuracy of our model by comparing results against published hardware results, as well as our own implementations on a 3.2GHz Cell blade. Additionally, we compare Cell performance to benchmarks run on leading superscalar (AMD Opteron), VLIW (Intel Itanium2), and vector (Cray X1E) architectures. Our work also explores several different mappings of the kernels and demonstrates a simple and effective programming model for Cell's unique architecture. Finally, we propose modest microarchitectural modifications that could significantly increase the efficiency of double-precision calculations. Overall results demonstrate the tremendous potential of the Cell architecture for scientific computations in terms of both raw performance and power efficiency.

  2. Towards a Process Algebra for Shared Processors

    DEFF Research Database (Denmark)

    Buchholtz, Mikael; Andersen, Jacob; Løvengreen, Hans Henrik

    2002-01-01

    We present initial work on a timed process algebra that models sharing of processor resources allowing preemption at arbitrary points in time. This enables us to model both the functional and the timely behaviour of concurrent processes executed on a single processor. We give a refinement relation...

  3. Vector and parallel processors in computational science

    International Nuclear Information System (INIS)

    Duff, I.S.; Reid, J.K.

    1985-01-01

    These proceedings contain the articles presented at the named conference. These concern hardware and software for vector and parallel processors, numerical methods and algorithms for the computation on such processors, as well as applications of such methods to different fields of physics and related sciences. See hints under the relevant topics. (HSI)

  4. The communication processor of TUMULT-64

    NARCIS (Netherlands)

    Smit, Gerardus Johannes Maria; Jansen, P.G.

    1988-01-01

    Tumult (Twente University MULTi-processor system) is a modular extendible multi-processor system designed and implemented at the Twente University of Technology in co-operation with Oce Nederland B.V. and the Dr. Neher Laboratories (Dutch PTT). Characteristics of the hardware are: MIMD type,

  5. An interactive parallel processor for data analysis

    International Nuclear Information System (INIS)

    Mong, J.; Logan, D.; Maples, C.; Rathbun, W.; Weaver, D.

    1984-01-01

    A parallel array of eight minicomputers has been assembled in an attempt to deal with kiloparameter data events. By exporting computer system functions to a separate processor, the authors have been able to achieve computer amplification linearly proportional to the number of executing processors

  6. Comparison of Processor Performance of SPECint2006 Benchmarks of some Intel Xeon Processors

    OpenAIRE

    Abdul Kareem PARCHUR; Ram Asaray SINGH

    2012-01-01

    High performance is a critical requirement to all microprocessors manufacturers. The present paper describes the comparison of performance in two main Intel Xeon series processors (Type A: Intel Xeon X5260, X5460, E5450 and L5320 and Type B: Intel Xeon X5140, 5130, 5120 and E5310). The microarchitecture of these processors is implemented using the basis of a new family of processors from Intel starting with the Pentium 4 processor. These processors can provide a performance boost for many ke...

  7. Neurovision processor for designing intelligent sensors

    Science.gov (United States)

    Gupta, Madan M.; Knopf, George K.

    1992-03-01

    A programmable multi-task neuro-vision processor, called the Positive-Negative (PN) neural processor, is proposed as a plausible hardware mechanism for constructing robust multi-task vision sensors. The computational operations performed by the PN neural processor are loosely based on the neural activity fields exhibited by certain nervous tissue layers situated in the brain. The neuro-vision processor can be programmed to generate diverse dynamic behavior that may be used for spatio-temporal stabilization (STS), short-term visual memory (STVM), spatio-temporal filtering (STF) and pulse frequency modulation (PFM). A multi- functional vision sensor that performs a variety of information processing operations on time- varying two-dimensional sensory images can be constructed from a parallel and hierarchical structure of numerous individually programmed PN neural processors.

  8. Development of a highly reliable CRT processor

    International Nuclear Information System (INIS)

    Shimizu, Tomoya; Saiki, Akira; Hirai, Kenji; Jota, Masayoshi; Fujii, Mikiya

    1996-01-01

    Although CRT processors have been employed by the main control board to reduce the operator's workload during monitoring, the control systems are still operated by hardware switches. For further advancement, direct controller operation through a display device is expected. A CRT processor providing direct controller operation must be as reliable as the hardware switches are. The authors are developing a new type of highly reliable CRT processor that enables direct controller operations. In this paper, we discuss the design principles behind a highly reliable CRT processor. The principles are defined by studies of software reliability and of the functional reliability of the monitoring and operation systems. The functional configuration of an advanced CRT processor is also addressed. (author)

  9. Online track processor for the CDF upgrade

    International Nuclear Information System (INIS)

    Thomson, E. J.

    2002-01-01

    A trigger track processor, called the eXtremely Fast Tracker (XFT), has been designed for the CDF upgrade. This processor identifies high transverse momentum (> 1.5 GeV/c) charged particles in the new central outer tracking chamber for CDF II. The XFT design is highly parallel to handle the input rate of 183 Gbits/s and output rate of 44 Gbits/s. The processor is pipelined and reports the result for a new event every 132 ns. The processor uses three stages: hit classification, segment finding, and segment linking. The pattern recognition algorithms for the three stages are implemented in programmable logic devices (PLDs) which allow in-situ modification of the algorithm at any time. The PLDs reside on three different types of modules. The complete system has been installed and commissioned at CDF II. An overview of the track processor and performance in CDF Run II are presented

  10. Computer Generated Inputs for NMIS Processor Verification

    International Nuclear Information System (INIS)

    J. A. Mullens; J. E. Breeding; J. A. McEvers; R. W. Wysor; L. G. Chiang; J. R. Lenarduzzi; J. T. Mihalczo; J. K. Mattingly

    2001-01-01

    Proper operation of the Nuclear Identification Materials System (NMIS) processor can be verified using computer-generated inputs [BIST (Built-In-Self-Test)] at the digital inputs. Preselected sequences of input pulses to all channels with known correlation functions are compared to the output of the processor. These types of verifications have been utilized in NMIS type correlation processors at the Oak Ridge National Laboratory since 1984. The use of this test confirmed a malfunction in a NMIS processor at the All-Russian Scientific Research Institute of Experimental Physics (VNIIEF) in 1998. The NMIS processor boards were returned to the U.S. for repair and subsequently used in NMIS passive and active measurements with Pu at VNIIEF in 1999

  11. Performance evaluation of integrated fuel processor for residential PEMFCs application

    International Nuclear Information System (INIS)

    Yu Taek Seo; Dong Joo Seo; Young-Seog Seo; Hyun-Seog Roh; Wang Lai Yoon; Jin Hyeok Jeong

    2006-01-01

    KIER has been developing the natural gas fuel processor to produce hydrogen rich gas for residential PEMFCs system. To realize a compact and high efficiency, the unit processes of steam reforming, water gas shift, and preferential oxidation are chemically and physically integrated in a package. Current fuel processor designed for 1 kW class PEMFCs shows thermal efficiency of 78% as a HHV basis with methane conversion of 90% at rated load operation. CO concentration below 10 ppm in the produced gas is achieved with preferential oxidation unit using Pt and Ru based catalyst under the condition of [O 2 ]/[CO]=2.0. The partial load operation have been carried out to test the performance of fuel processor from 40% to 80% load, showing stable methane conversion and CO concentration below 10 ppm. The durability test for the daily start-stop and 8 hr operation procedure is under investigation and shows no deterioration of its performance after 40 start-stop cycles. (authors)

  12. Analytical Bounds on the Threads in IXP1200 Network Processor

    OpenAIRE

    Ramakrishna, STGS; Jamadagni, HS

    2003-01-01

    Increasing link speeds have placed enormous burden on the processing requirements and the processors are expected to carry out a variety of tasks. Network Processors (NP) [1] [2] is the blanket name given to the processors, which are traded for flexibility and performance. Network Processors are offered by a number of vendors; to take the main burden of processing requirement of network related operations from the conventional processors. The Network Processors cover a spectrum of design trad...

  13. Programming massively parallel processors a hands-on approach

    CERN Document Server

    Kirk, David B

    2010-01-01

    Programming Massively Parallel Processors discusses basic concepts about parallel programming and GPU architecture. ""Massively parallel"" refers to the use of a large number of processors to perform a set of computations in a coordinated parallel way. The book details various techniques for constructing parallel programs. It also discusses the development process, performance level, floating-point format, parallel patterns, and dynamic parallelism. The book serves as a teaching guide where parallel programming is the main topic of the course. It builds on the basics of C programming for CUDA, a parallel programming environment that is supported on NVI- DIA GPUs. Composed of 12 chapters, the book begins with basic information about the GPU as a parallel computer source. It also explains the main concepts of CUDA, data parallelism, and the importance of memory access efficiency using CUDA. The target audience of the book is graduate and undergraduate students from all science and engineering disciplines who ...

  14. Initial explorations of ARM processors for scientific computing

    International Nuclear Information System (INIS)

    Abdurachmanov, David; Elmer, Peter; Eulisse, Giulio; Muzaffar, Shahzad

    2014-01-01

    Power efficiency is becoming an ever more important metric for both high performance and high throughput computing. Over the course of next decade it is expected that flops/watt will be a major driver for the evolution of computer architecture. Servers with large numbers of ARM processors, already ubiquitous in mobile computing, are a promising alternative to traditional x86-64 computing. We present the results of our initial investigations into the use of ARM processors for scientific computing applications. In particular we report the results from our work with a current generation ARMv7 development board to explore ARM-specific issues regarding the software development environment, operating system, performance benchmarks and issues for porting High Energy Physics software

  15. SPP: A data base processor data communications protocol

    Science.gov (United States)

    Fishwick, P. A.

    1983-01-01

    The design and implementation of a data communications protocol for the Intel Data Base Processor (DBP) is defined. The protocol is termed SPP (Service Port Protocol) since it enables data transfer between the host computer and the DBP service port. The protocol implementation is extensible in that it is explicitly layered and the protocol functionality is hierarchically organized. Extensive trace and performance capabilities have been supplied with the protocol software to permit optional efficient monitoring of the data transfer between the host and the Intel data base processor. Machine independence was considered to be an important attribute during the design and implementation of SPP. The protocol source is fully commented and is included in Appendix A of this report.

  16. Effect of processor temperature on film dosimetry

    International Nuclear Information System (INIS)

    Srivastava, Shiv P.; Das, Indra J.

    2012-01-01

    Optical density (OD) of a radiographic film plays an important role in radiation dosimetry, which depends on various parameters, including beam energy, depth, field size, film batch, dose, dose rate, air film interface, postexposure processing time, and temperature of the processor. Most of these parameters have been studied for Kodak XV and extended dose range (EDR) films used in radiation oncology. There is very limited information on processor temperature, which is investigated in this study. Multiple XV and EDR films were exposed in the reference condition (d max. , 10 × 10 cm 2 , 100 cm) to a given dose. An automatic film processor (X-Omat 5000) was used for processing films. The temperature of the processor was adjusted manually with increasing temperature. At each temperature, a set of films was processed to evaluate OD at a given dose. For both films, OD is a linear function of processor temperature in the range of 29.4–40.6°C (85–105°F) for various dose ranges. The changes in processor temperature are directly related to the dose by a quadratic function. A simple linear equation is provided for the changes in OD vs. processor temperature, which could be used for correcting dose in radiation dosimetry when film is used.

  17. Optical Associative Processors For Visual Perception"

    Science.gov (United States)

    Casasent, David; Telfer, Brian

    1988-05-01

    We consider various associative processor modifications required to allow these systems to be used for visual perception, scene analysis, and object recognition. For these applications, decisions on the class of the objects present in the input image are required and thus heteroassociative memories are necessary (rather than the autoassociative memories that have been given most attention). We analyze the performance of both associative processors and note that there is considerable difference between heteroassociative and autoassociative memories. We describe associative processors suitable for realizing functions such as: distortion invariance (using linear discriminant function memory synthesis techniques), noise and image processing performance (using autoassociative memories in cascade with with a heteroassociative processor and with a finite number of autoassociative memory iterations employed), shift invariance (achieved through the use of associative processors operating on feature space data), and the analysis of multiple objects in high noise (which is achieved using associative processing of the output from symbolic correlators). We detail and provide initial demonstrations of the use of associative processors operating on iconic, feature space and symbolic data, as well as adaptive associative processors.

  18. Development of Innovative Design Processor

    International Nuclear Information System (INIS)

    Park, Y.S.; Park, C.O.

    2004-01-01

    The nuclear design analysis requires time-consuming and erroneous model-input preparation, code run, output analysis and quality assurance process. To reduce human effort and improve design quality and productivity, Innovative Design Processor (IDP) is being developed. Two basic principles of IDP are the document-oriented design and the web-based design. The document-oriented design is that, if the designer writes a design document called active document and feeds it to a special program, the final document with complete analysis, table and plots is made automatically. The active documents can be written with ordinary HTML editors or created automatically on the web, which is another framework of IDP. Using the proper mix-up of server side and client side programming under the LAMP (Linux/Apache/MySQL/PHP) environment, the design process on the web is modeled as a design wizard style so that even a novice designer makes the design document easily. This automation using the IDP is now being implemented for all the reload design of Korea Standard Nuclear Power Plant (KSNP) type PWRs. The introduction of this process will allow large reduction in all reload design efforts of KSNP and provide a platform for design and R and D tasks of KNFC. (authors)

  19. Onboard spectral imager data processor

    Science.gov (United States)

    Otten, Leonard J.; Meigs, Andrew D.; Franklin, Abraham J.; Sears, Robert D.; Robison, Mark W.; Rafert, J. Bruce; Fronterhouse, Donald C.; Grotbeck, Ronald L.

    1999-10-01

    Previous papers have described the concept behind the MightySat II.1 program, the satellite's Fourier Transform imaging spectrometer's optical design, the design for the spectral imaging payload, and its initial qualification testing. This paper discusses the on board data processing designed to reduce the amount of downloaded data by an order of magnitude and provide a demonstration of a smart spaceborne spectral imaging sensor. Two custom components, a spectral imager interface 6U VME card that moves data at over 30 MByte/sec, and four TI C-40 processors mounted to a second 6U VME and daughter card, are used to adapt the sensor to the spacecraft and provide the necessary high speed processing. A system architecture that offers both on board real time image processing and high-speed post data collection analysis of the spectral data has been developed. In addition to the on board processing of the raw data into a usable spectral data volume, one feature extraction technique has been incorporated. This algorithm operates on the basic interferometric data. The algorithm is integrated within the data compression process to search for uploadable feature descriptions.

  20. A data base processor semantics specification package

    Science.gov (United States)

    Fishwick, P. A.

    1983-01-01

    A Semantics Specification Package (DBPSSP) for the Intel Data Base Processor (DBP) is defined. DBPSSP serves as a collection of cross assembly tools that allow the analyst to assemble request blocks on the host computer for passage to the DBP. The assembly tools discussed in this report may be effectively used in conjunction with a DBP compatible data communications protocol to form a query processor, precompiler, or file management system for the database processor. The source modules representing the components of DBPSSP are fully commented and included.

  1. Hardware trigger processor for the MDT system

    CERN Document Server

    AUTHOR|(SzGeCERN)757787; The ATLAS collaboration; Hazen, Eric; Butler, John; Black, Kevin; Gastler, Daniel Edward; Ntekas, Konstantinos; Taffard, Anyes; Martinez Outschoorn, Verena; Ishino, Masaya; Okumura, Yasuyuki

    2017-01-01

    We are developing a low-latency hardware trigger processor for the Monitored Drift Tube system in the Muon spectrometer. The processor will fit candidate Muon tracks in the drift tubes in real time, improving significantly the momentum resolution provided by the dedicated trigger chambers. We present a novel pure-FPGA implementation of a Legendre transform segment finder, an associative-memory alternative implementation, an ARM (Zynq) processor-based track fitter, and compact ATCA carrier board architecture. The ATCA architecture is designed to allow a modular, staged approach to deployment of the system and exploration of alternative technologies.

  2. Use of a track and vertex processor in a fixed-target charm experiment

    International Nuclear Information System (INIS)

    Schub, M.H.; Carey, T.A.; Hsiung, Y.B.; Kaplan, D.M.; Lee, C.; Miller, G.; Sa, J.; Teng, P.K.

    1996-01-01

    We have constructed and operated a high-speed parallel-pipelined track and vertex processor and used it to trigger data acquisition in a high-rate charm and beauty experiment at Fermilab. The processor uses information from hodoscopes and wire chambers to reconstruct tracks in the bend view of a magnetic spectrometer, and uses these tracks to find the corresponding tracks in a set of silicon-strip detectors. The processor then forms vertices and triggers the experiment if at least one vertex is downstream of the target. Under typical charm running conditions, with an interaction rate of ∼5 MHz, the processor rejects 80-90% of lower-level triggers while maintaining efficiency of ∼70% for two-prong D-meson decays. (orig.)

  3. Photonics and Fiber Optics Processor Lab

    Data.gov (United States)

    Federal Laboratory Consortium — The Photonics and Fiber Optics Processor Lab develops, tests and evaluates high speed fiber optic network components as well as network protocols. In addition, this...

  4. Optimizing the performance of streaming numerical kernels on the IBM Blue Gene/P PowerPC 450 processor

    KAUST Repository

    Malas, Tareq Majed Yasin; Ahmadia, Aron; Brown, Jed; Gunnels, John A.; Keyes, David E.

    2012-01-01

    Several emerging petascale architectures use energy-efficient processors with vectorized computational units and in-order thread processing. On these architectures the sustained performance of streaming numerical kernels, ubiquitous in the solution

  5. Dataflow formalisation of real-time streaming applications on a composable and predictable multi-processor SOC

    NARCIS (Netherlands)

    Nelson, A.T.; Goossens, K.G.W.; Akesson, K.B.

    2015-01-01

    Embedded systems often contain multiple applications, some of which have real-time requirements and whose performance must be guaranteed. To efficiently execute applications, modern embedded systems contain Globally Asynchronous Locally Synchronous (GALS) processors, network on chip, DRAM and SRAM

  6. Keystone Business Models for Network Security Processors

    OpenAIRE

    Arthur Low; Steven Muegge

    2013-01-01

    Network security processors are critical components of high-performance systems built for cybersecurity. Development of a network security processor requires multi-domain experience in semiconductors and complex software security applications, and multiple iterations of both software and hardware implementations. Limited by the business models in use today, such an arduous task can be undertaken only by large incumbent companies and government organizations. Neither the “fabless semiconductor...

  7. Real time monitoring of electron processors

    International Nuclear Information System (INIS)

    Nablo, S.V.; Kneeland, D.R.; McLaughlin, W.L.

    1995-01-01

    A real time radiation monitor (RTRM) has been developed for monitoring the dose rate (current density) of electron beam processors. The system provides continuous monitoring of processor output, electron beam uniformity, and an independent measure of operating voltage or electron energy. In view of the device's ability to replace labor-intensive dosimetry in verification of machine performance on a real-time basis, its application to providing archival performance data for in-line processing is discussed. (author)

  8. Design and implementation of a high performance network security processor

    Science.gov (United States)

    Wang, Haixin; Bai, Guoqiang; Chen, Hongyi

    2010-03-01

    The last few years have seen many significant progresses in the field of application-specific processors. One example is network security processors (NSPs) that perform various cryptographic operations specified by network security protocols and help to offload the computation intensive burdens from network processors (NPs). This article presents a high performance NSP system architecture implementation intended for both internet protocol security (IPSec) and secure socket layer (SSL) protocol acceleration, which are widely employed in virtual private network (VPN) and e-commerce applications. The efficient dual one-way pipelined data transfer skeleton and optimised integration scheme of the heterogenous parallel crypto engine arrays lead to a Gbps rate NSP, which is programmable with domain specific descriptor-based instructions. The descriptor-based control flow fragments large data packets and distributes them to the crypto engine arrays, which fully utilises the parallel computation resources and improves the overall system data throughput. A prototyping platform for this NSP design is implemented with a Xilinx XC3S5000 based FPGA chip set. Results show that the design gives a peak throughput for the IPSec ESP tunnel mode of 2.85 Gbps with over 2100 full SSL handshakes per second at a clock rate of 95 MHz.

  9. Accuracy Limitations in Optical Linear Algebra Processors

    Science.gov (United States)

    Batsell, Stephen Gordon

    1990-01-01

    One of the limiting factors in applying optical linear algebra processors (OLAPs) to real-world problems has been the poor achievable accuracy of these processors. Little previous research has been done on determining noise sources from a systems perspective which would include noise generated in the multiplication and addition operations, noise from spatial variations across arrays, and from crosstalk. In this dissertation, we propose a second-order statistical model for an OLAP which incorporates all these system noise sources. We now apply this knowledge to determining upper and lower bounds on the achievable accuracy. This is accomplished by first translating the standard definition of accuracy used in electronic digital processors to analog optical processors. We then employ our second-order statistical model. Having determined a general accuracy equation, we consider limiting cases such as for ideal and noisy components. From the ideal case, we find the fundamental limitations on improving analog processor accuracy. From the noisy case, we determine the practical limitations based on both device and system noise sources. These bounds allow system trade-offs to be made both in the choice of architecture and in individual components in such a way as to maximize the accuracy of the processor. Finally, by determining the fundamental limitations, we show the system engineer when the accuracy desired can be achieved from hardware or architecture improvements and when it must come from signal pre-processing and/or post-processing techniques.

  10. A lock circuit for a multi-core processor

    DEFF Research Database (Denmark)

    2015-01-01

    An integrated circuit comprising a multiple processor cores and a lock circuit that comprises a queue register with respective bits set or reset via respective, connections dedicated to respective processor cores, whereby the queue register identifies those among the multiple processor cores...... that are enqueued in the queue register. Furthermore, the integrated circuit comprises a current register and a selector circuit configured to select a processor core and identify that processor core by a value in the current register. A selected processor core is a prioritized processor core among the cores...... configured with an integrated circuit; and a silicon die configured with an integrated circuit....

  11. Network Coding on Heterogeneous Multi-Core Processors for Wireless Sensor Networks

    Science.gov (United States)

    Kim, Deokho; Park, Karam; Ro, Won W.

    2011-01-01

    While network coding is well known for its efficiency and usefulness in wireless sensor networks, the excessive costs associated with decoding computation and complexity still hinder its adoption into practical use. On the other hand, high-performance microprocessors with heterogeneous multi-cores would be used as processing nodes of the wireless sensor networks in the near future. To this end, this paper introduces an efficient network coding algorithm developed for the heterogenous multi-core processors. The proposed idea is fully tested on one of the currently available heterogeneous multi-core processors referred to as the Cell Broadband Engine. PMID:22164053

  12. Architectural design and analysis of a programmable image processor

    International Nuclear Information System (INIS)

    Siyal, M.Y.; Chowdhry, B.S.; Rajput, A.Q.K.

    2003-01-01

    In this paper we present an architectural design and analysis of a programmable image processor, nicknamed Snake. The processor was designed with a high degree of parallelism to speed up a range of image processing operations. Data parallelism found in array processors has been included into the architecture of the proposed processor. The implementation of commonly used image processing algorithms and their performance evaluation are also discussed. The performance of Snake is also compared with other types of processor architectures. (author)

  13. Negative base encoding in optical linear algebra processors

    Science.gov (United States)

    Perlee, C.; Casasent, D.

    1986-01-01

    In the digital multiplication by analog convolution algorithm, the bits of two encoded numbers are convolved to form the product of the two numbers in mixed binary representation; this output can be easily converted to binary. Attention is presently given to negative base encoding, treating base -2 initially, and then showing that the negative base system can be readily extended to any radix. In general, negative base encoding in optical linear algebra processors represents a more efficient technique than either sign magnitude or 2's complement encoding, when the additions of digitally encoded products are performed in parallel.

  14. A high-accuracy optical linear algebra processor for finite element applications

    Science.gov (United States)

    Casasent, D.; Taylor, B. K.

    1984-01-01

    Optical linear processors are computationally efficient computers for solving matrix-matrix and matrix-vector oriented problems. Optical system errors limit their dynamic range to 30-40 dB, which limits their accuray to 9-12 bits. Large problems, such as the finite element problem in structural mechanics (with tens or hundreds of thousands of variables) which can exploit the speed of optical processors, require the 32 bit accuracy obtainable from digital machines. To obtain this required 32 bit accuracy with an optical processor, the data can be digitally encoded, thereby reducing the dynamic range requirements of the optical system (i.e., decreasing the effect of optical errors on the data) while providing increased accuracy. This report describes a new digitally encoded optical linear algebra processor architecture for solving finite element and banded matrix-vector problems. A linear static plate bending case study is described which quantities the processor requirements. Multiplication by digital convolution is explained, and the digitally encoded optical processor architecture is advanced.

  15. Making the black box signal processor transparent explains the contradictions in x-ray spectroscopy

    International Nuclear Information System (INIS)

    Papp, T.; Maxwell, J.A.; Papp, A.T.

    2008-01-01

    Full text: There are significant differences in the experimental data needed in the analysis of x-ray spectra, and many of the results contradict basic conservation laws and simple arithmetic. We have identified that the main source of the unexplainable results is rooted in the signal processing electronics. We have developed a line of fully digital signal processors that have yielded improved resolution, line shape, tailing and pile up recognition. The signal processor is a time variant, non-paralyzable signal processor. The signal processor accounts for and registers all events, sorting them into two spectra, one spectrum for the desirable or accepted events, and one spectrum for the rejected events. Although the information on the rejected events is always necessary, we recently realized its additional benefits in high rate, (10 5 -10 6 cps) analytical measurements. Having all information available we were surprised to see how different conclusions and level of understandings are possible in detector characterization, detector efficiency, spectrum evaluation methodology, and that it explains many of the contradictions. We will demonstrate how the Coster-Kronig transition measurements often do not even comply with arithmetic, and why is it difficult to interpret the spectra with other processors. It will be presented that for different spectra in origin, like radioisotope measurements, x-ray fluorescence, and particle induced x-ray emission, the primary signal from the preamplifier is so different, that the signal processor is facing very different challenges, and different metrological approaches are necessary in data processing. This data processing methodology cannot be established on the partial and fractional information offered by other approaches. However, the maximum information utilization approach offered by our processor's rejected spectrum supplements the accepted spectrum to allow the development of straight forward and accurate metrology. All the

  16. Compact gasoline fuel processor for passenger vehicle APU

    Science.gov (United States)

    Severin, Christopher; Pischinger, Stefan; Ogrzewalla, Jürgen

    Due to the increasing demand for electrical power in today's passenger vehicles, and with the requirements regarding fuel consumption and environmental sustainability tightening, a fuel cell-based auxiliary power unit (APU) becomes a promising alternative to the conventional generation of electrical energy via internal combustion engine, generator and battery. It is obvious that the on-board stored fuel has to be used for the fuel cell system, thus, gasoline or diesel has to be reformed on board. This makes the auxiliary power unit a complex integrated system of stack, air supply, fuel processor, electrics as well as heat and water management. Aside from proving the technical feasibility of such a system, the development has to address three major barriers:start-up time, costs, and size/weight of the systems. In this paper a packaging concept for an auxiliary power unit is presented. The main emphasis is placed on the fuel processor, as good packaging of this large subsystem has the strongest impact on overall size. The fuel processor system consists of an autothermal reformer in combination with water-gas shift and selective oxidation stages, based on adiabatic reactors with inter-cooling. The configuration was realized in a laboratory set-up and experimentally investigated. The results gained from this confirm a general suitability for mobile applications. A start-up time of 30 min was measured, while a potential reduction to 10 min seems feasible. An overall fuel processor efficiency of about 77% was measured. On the basis of the know-how gained by the experimental investigation of the laboratory set-up a packaging concept was developed. Using state-of-the-art catalyst and heat exchanger technology, the volumes of these components are fixed. However, the overall volume is higher mainly due to mixing zones and flow ducts, which do not contribute to the chemical or thermal function of the system. Thus, the concept developed mainly focuses on minimization of those

  17. Control structures for high speed processors

    Science.gov (United States)

    Maki, G. K.; Mankin, R.; Owsley, P. A.; Kim, G. M.

    1982-01-01

    A special processor was designed to function as a Reed Solomon decoder with throughput data rate in the Mhz range. This data rate is significantly greater than is possible with conventional digital architectures. To achieve this rate, the processor design includes sequential, pipelined, distributed, and parallel processing. The processor was designed using a high level language register transfer language. The RTL can be used to describe how the different processes are implemented by the hardware. One problem of special interest was the development of dependent processes which are analogous to software subroutines. For greater flexibility, the RTL control structure was implemented in ROM. The special purpose hardware required approximately 1000 SSI and MSI components. The data rate throughput is 2.5 megabits/second. This data rate is achieved through the use of pipelined and distributed processing. This data rate can be compared with 800 kilobits/second in a recently proposed very large scale integration design of a Reed Solomon encoder.

  18. Real time processor for array speckle interferometry

    Science.gov (United States)

    Chin, Gordon; Florez, Jose; Borelli, Renan; Fong, Wai; Miko, Joseph; Trujillo, Carlos

    1989-02-01

    The authors are constructing a real-time processor to acquire image frames, perform array flat-fielding, execute a 64 x 64 element two-dimensional complex FFT (fast Fourier transform) and average the power spectrum, all within the 25 ms coherence time for speckles at near-IR (infrared) wavelength. The processor will be a compact unit controlled by a PC with real-time display and data storage capability. This will provide the ability to optimize observations and obtain results on the telescope rather than waiting several weeks before the data can be analyzed and viewed with offline methods. The image acquisition and processing, design criteria, and processor architecture are described.

  19. The UA1 upgrade calorimeter trigger processor

    International Nuclear Information System (INIS)

    Bains, M.; Charleton, D.; Ellis, N.; Garvey, J.; Gregory, J.; Jimack, M.P.; Jovanovic, P.; Kenyon, I.R.; Baird, S.A.; Campbell, D.; Cawthraw, M.; Coughlan, J.; Flynn, P.; Galagedera, S.; Grayer, G.; Halsall, R.; Shah, T.P.; Stephens, R.; Biddulph, P.; Eisenhandler, E.; Fensome, I.F.; Landon, M.; Robinson, D.; Oliver, J.; Sumorok, K.

    1990-01-01

    The increased luminosity of the improved CERN Collider and the more subtle signals of second-generation collider physics demand increasingly sophisticated triggering. We have built a new first-level trigger processor designed to use the excellent granularity of the UA1 upgrade calorimeter. This device is entirely digital and handles events in 1.5 μs, thus introducing no dead time. Its most novel feature is fast two-dimensional electromagnetic cluster-finding with the possibility of demanding an isolated shower of limited penetration. The processor allows multiple combinations of triggers on electromagnetic showers, hadronic jets and energy sums, including a total-energy veto of multiple interactions and a full vector sum of missing transverse energy. This hard-wired processor is about five times more powerful than its predecessor, and makes extensive use of pipelining techniques. It was used extensively in the 1988 and 1989 runs of the CERN Collider. (orig.)

  20. Embedded processor extensions for image processing

    Science.gov (United States)

    Thevenin, Mathieu; Paindavoine, Michel; Letellier, Laurent; Heyrman, Barthélémy

    2008-04-01

    The advent of camera phones marks a new phase in embedded camera sales. By late 2009, the total number of camera phones will exceed that of both conventional and digital cameras shipped since the invention of photography. Use in mobile phones of applications like visiophony, matrix code readers and biometrics requires a high degree of component flexibility that image processors (IPs) have not, to date, been able to provide. For all these reasons, programmable processor solutions have become essential. This paper presents several techniques geared to speeding up image processors. It demonstrates that a gain of twice is possible for the complete image acquisition chain and the enhancement pipeline downstream of the video sensor. Such results confirm the potential of these computing systems for supporting future applications.

  1. The UA1 upgrade calorimeter trigger processor

    International Nuclear Information System (INIS)

    Bains, N.; Baird, S.A.; Biddulph, P.

    1990-01-01

    The increased luminosity of the improved CERN Collider and the more subtle signals of second-generation collider physics demand increasingly sophisticated triggering. We have built a new first-level trigger processor designed to use the excellent granularity of the UA1 upgrade calorimeter. This device is entirely digital and handles events in 1.5 μs, thus introducing no deadtime. Its most novel feature is fast two-dimensional electromagnetic cluster-finding with the possibility of demanding an isolated shower of limited penetration. The processor allows multiple combinations of triggers on electromagnetic showers, hadronic jets and energy sums, including a total-energy veto of multiple interactions and a full vector sum of missing transverse energy. This hard-wired processor is about five times more powerful than its predecessor, and makes extensive use of pipelining techniques. It was used extensively in the 1988 and 1989 runs of the CERN Collider. (author)

  2. Development methods for VLSI-processors

    International Nuclear Information System (INIS)

    Horninger, K.; Sandweg, G.

    1982-01-01

    The aim of this project, which was originally planed for 3 years, was the development of modern system and circuit concepts, for VLSI-processors having a 32 bit wide data path. The result of this first years work is the concept of a general purpose processor. This processor is not only logically but also physically (on the chip) divided into four functional units: a microprogrammable instruction unit, an execution unit in slice technique, a fully associative cache memory and an I/O unit. For the ALU of the execution unit circuits in PLA and slice techniques have been realized. On the basis of regularity, area consumption and achievable performance the slice technique has been prefered. The designs utilize selftesting circuitry. (orig.) [de

  3. Software-defined reconfigurable microwave photonics processor.

    Science.gov (United States)

    Pérez, Daniel; Gasulla, Ivana; Capmany, José

    2015-06-01

    We propose, for the first time to our knowledge, a software-defined reconfigurable microwave photonics signal processor architecture that can be integrated on a chip and is capable of performing all the main functionalities by suitable programming of its control signals. The basic configuration is presented and a thorough end-to-end design model derived that accounts for the performance of the overall processor taking into consideration the impact and interdependencies of both its photonic and RF parts. We demonstrate the model versatility by applying it to several relevant application examples.

  4. Parallel processor for fast event analysis

    International Nuclear Information System (INIS)

    Hensley, D.C.

    1983-01-01

    Current maximum data rates from the Spin Spectrometer of approx. 5000 events/s (up to 1.3 MBytes/s) and minimum analysis requiring at least 3000 operations/event require a CPU cycle time near 70 ns. In order to achieve an effective cycle time of 70 ns, a parallel processing device is proposed where up to 4 independent processors will be implemented in parallel. The individual processors are designed around the Am2910 Microsequencer, the AM29116 μP, and the Am29517 Multiplier. Satellite histogramming in a mass memory system will be managed by a commercial 16-bit μP system

  5. Time Manager Software for a Flight Processor

    Science.gov (United States)

    Zoerne, Roger

    2012-01-01

    Data analysis is a process of inspecting, cleaning, transforming, and modeling data to highlight useful information and suggest conclusions. Accurate timestamps and a timeline of vehicle events are needed to analyze flight data. By moving the timekeeping to the flight processor, there is no longer a need for a redundant time source. If each flight processor is initially synchronized to GPS, they can freewheel and maintain a fairly accurate time throughout the flight with no additional GPS time messages received. How ever, additional GPS time messages will ensure an even greater accuracy. When a timestamp is required, a gettime function is called that immediately reads the time-base register.

  6. Comparison of Processor Performance of SPECint2006 Benchmarks of some Intel Xeon Processors

    Directory of Open Access Journals (Sweden)

    Abdul Kareem PARCHUR

    2012-08-01

    Full Text Available High performance is a critical requirement to all microprocessors manufacturers. The present paper describes the comparison of performance in two main Intel Xeon series processors (Type A: Intel Xeon X5260, X5460, E5450 and L5320 and Type B: Intel Xeon X5140, 5130, 5120 and E5310. The microarchitecture of these processors is implemented using the basis of a new family of processors from Intel starting with the Pentium 4 processor. These processors can provide a performance boost for many key application areas in modern generation. The scaling of performance in two major series of Intel Xeon processors (Type A: Intel Xeon X5260, X5460, E5450 and L5320 and Type B: Intel Xeon X5140, 5130, 5120 and E5310 has been analyzed using the performance numbers of 12 CPU2006 integer benchmarks, performance numbers that exhibit significant differences in performance. The results and analysis can be used by performance engineers, scientists and developers to better understand the performance scaling in modern generation processors.

  7. Simulation of a parallel processor on a serial processor: The neutron diffusion equation

    International Nuclear Information System (INIS)

    Honeck, H.C.

    1981-01-01

    Parallel processors could provide the nuclear industry with very high computing power at a very moderate cost. Will we be able to make effective use of this power. This paper explores the use of a very simple parallel processor for solving the neutron diffusion equation to predict power distributions in a nuclear reactor. We first describe a simple parallel processor and estimate its theoretical performance based on the current hardware technology. Next, we show how the parallel processor could be used to solve the neutron diffusion equation. We then present the results of some simulations of a parallel processor run on a serial processor and measure some of the expected inefficiencies. Finally we extrapolate the results to estimate how actual design codes would perform. We find that the standard numerical methods for solving the neutron diffusion equation are still applicable when used on a parallel processor. However, some simple modifications to these methods will be necessary if we are to achieve the full power of these new computers. (orig.) [de

  8. Special purpose processors for high energy physics applications

    International Nuclear Information System (INIS)

    Verkerk, C.

    1978-01-01

    The review on the subject of hardware processors from very fast decision logic for the split field magnet facility at CERN, to a point-finding processor used to relieve the data-acquisition minicomputer from the task of monitoring the SPS experiment is given. Block diagrams of decision making processor, point-finding processor, complanarity and opening angle processor and programmable track selector module are presented and discussed. The applications of fully programmable but slower processor on the one hand, and very fast and programmable decision logic on the other hand are given in this review

  9. Noise limitations in optical linear algebra processors.

    Science.gov (United States)

    Batsell, S G; Jong, T L; Walkup, J F; Krile, T F

    1990-05-10

    A general statistical noise model is presented for optical linear algebra processors. A statistical analysis which includes device noise, the multiplication process, and the addition operation is undertaken. We focus on those processes which are architecturally independent. Finally, experimental results which verify the analytical predictions are also presented.

  10. Cassava processors' awareness of occupational and environmental ...

    African Journals Online (AJOL)

    A larger percentage (74.5%) of the respondents indicated that the Agricultural Development Programme (ADP) is their source of information. The result also showed that processor's awareness of occupational hazards associated with the different stages of cassava processing vary because their involvement in these stages

  11. A high-speed analog neural processor

    NARCIS (Netherlands)

    Masa, P.; Masa, Peter; Hoen, Klaas; Hoen, Klaas; Wallinga, Hans

    1994-01-01

    Targeted at high-energy physics research applications, our special-purpose analog neural processor can classify up to 70 dimensional vectors within 50 nanoseconds. The decision-making process of the implemented feedforward neural network enables this type of computation to tolerate weight

  12. Beeldverwerking met de Micron Automatic Processor

    OpenAIRE

    Goyens, Frank

    2017-01-01

    Deze thesis is een onderzoek naar toepassingen binnen beeldverwerking op de Micron Automata Processor hardware. De hardware wordt vergeleken met populaire hedendaagse hardware. Ook bevat dit onderzoek nuttige informatie en strategieën voor het ontwikkelen van nieuwe toepassingen. Bevindingen in dit onderzoek omvatten proof of concept algoritmes en een praktische toepassing.

  13. 7 CFR 1215.14 - Processor.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 10 2010-01-01 2010-01-01 false Processor. 1215.14 Section 1215.14 Agriculture Regulations of the Department of Agriculture (Continued) AGRICULTURAL MARKETING SERVICE (MARKETING AGREEMENTS... CONSUMER INFORMATION Popcorn Promotion, Research, and Consumer Information Order Definitions § 1215.14...

  14. Simplifying cochlear implant speech processor fitting

    NARCIS (Netherlands)

    Willeboer, C.

    2008-01-01

    Conventional fittings of the speech processor of a cochlear implant (CI) rely to a large extent on the implant recipient's subjective responses. For each of the 22 intracochlear electrodes the recipient has to indicate the threshold level (T-level) and comfortable loudness level (C-level) while

  15. Vector and parallel processors in computational science

    International Nuclear Information System (INIS)

    Duff, I.S.; Reid, J.K.

    1985-01-01

    This book presents the papers given at a conference which reviewed the new developments in parallel and vector processing. Topics considered at the conference included hardware (array processors, supercomputers), programming languages, software aids, numerical methods (e.g., Monte Carlo algorithms, iterative methods, finite elements, optimization), and applications (e.g., neutron transport theory, meteorology, image processing)

  16. Space Station Water Processor Process Pump

    Science.gov (United States)

    Parker, David

    1995-01-01

    This report presents the results of the development program conducted under contract NAS8-38250-12 related to the International Space Station (ISS) Water Processor (WP) Process Pump. The results of the Process Pumps evaluation conducted on this program indicates that further development is required in order to achieve the performance and life requirements for the ISSWP.

  17. Interleaved Subtask Scheduling on Multi Processor SOC

    NARCIS (Netherlands)

    Zhe, M.

    2006-01-01

    The ever-progressing semiconductor processing technique has integrated more and more embedded processors on a single system-on-achip (SoC). With such powerful SoC platforms, and also due to the stringent time-to-market deadlines, many functionalities which used to be implemented in ASICs are

  18. User manual Dieka PreProcessor

    NARCIS (Netherlands)

    Valkering, Kasper

    2000-01-01

    This is the user manual belonging to the Dieka-PreProcessor. This application was written by Wenhua Cao and revised and expanded by Kasper Valkering. The aim of this preproccesor is to be able to draw and mesh extrusion dies in ProEngineer, and do the FE-calculation in Dieka. The preprocessor makes

  19. Globe hosts launch of new processor

    CERN Multimedia

    2006-01-01

    Launch of the quadecore processor chip at the Globe. On 14 November, in a series of major media events around the world, the chip-maker Intel launched its new 'quadcore' processor. For the regions of Europe, the Middle East and Africa, the day-long launch event took place in CERN's Globe of Science and Innovation, with over 30 journalists in attendance, coming from as far away as Johannesburg and Dubai. CERN was a significant choice for the event: the first tests of this new generation of processor in Europe had been made at CERN over the preceding months, as part of CERN openlab, a research partnership with leading IT companies such as Intel, HP and Oracle. The event also provided the opportunity for the journalists to visit ATLAS and the CERN Computer Centre. The strategy of putting multiple processor cores on the same chip, which has been pursued by Intel and other chip-makers in the last few years, represents an important departure from the more traditional improvements in the sheer speed of such chips. ...

  20. Event analysis using a massively parallel processor

    International Nuclear Information System (INIS)

    Bale, A.; Gerelle, E.; Messersmith, J.; Warren, R.; Hoek, J.

    1990-01-01

    This paper describes a system for performing histogramming of n-tuple data at interactive rates using a commercial SIMD processor array connected to a work-station running the well-known Physics Analysis Workstation software (PAW). Results indicate that an order of magnitude performance improvement over current RISC technology is easily achievable

  1. Auditory-like filterbank: An optimal speech processor for efficient ...

    Indian Academy of Sciences (India)

    The transmitter and the receiver in a communication system have to be designed optimally with respect to one another to ensure reliable and efficient communication. Following this principle, we derive an optimal filterbank for processing speech signal in the listener's auditory system (receiver), so that maximum information ...

  2. Decimal Engine for Energy-Efficient Multicore Processors

    DEFF Research Database (Denmark)

    Nannarelli, Alberto

    2014-01-01

    propose a hybrid BFP/DFP engine to perform BFP division and DFP addition, multiplication and division. The main purpose of this engine is to offload the binary floating-point units for this type of operations and reduce the latency for decimal operations, and power and temperature for the whole die....

  3. GMB: An Efficient Query Processor for Biological Data

    Directory of Open Access Journals (Sweden)

    Taha Kamal

    2011-06-01

    Full Text Available Bioinformatics applications manage complex biological data stored into distributed and often heterogeneous databases and require large computing power. These databases are too big and complicated to be rapidly queried every time a user submits a query, due to the overhead involved in decomposing the queries, sending the decomposed queries to remote databases, and composing the results. There is also considerable communication costs involved. This study addresses the mentioned problems in Grid-based environment for bioinformatics. We propose a Grid middleware called GMB that alleviates these problems by caching the results of Frequently Used Queries (FUQ. Queries are classified based on their types and frequencies. FUQ are answered from the middleware, which improves their response time. GMB acts as a gateway to TeraGrid Grid: it resides between users’ applications and TeraGrid Grid. We evaluate GMB experimentally.

  4. Efficient probabilistic model checking on general purpose graphic processors

    NARCIS (Netherlands)

    Bosnacki, D.; Edelkamp, S.; Sulewski, D.; Pasareanu, C.S.

    2009-01-01

    We present algorithms for parallel probabilistic model checking on general purpose graphic processing units (GPGPUs). For this purpose we exploit the fact that some of the basic algorithms for probabilistic model checking rely on matrix vector multiplication. Since this kind of linear algebraic

  5. Advanced Avionics and Processor Systems for a Flexible Space Exploration Architecture

    Science.gov (United States)

    Keys, Andrew S.; Adams, James H.; Smith, Leigh M.; Johnson, Michael A.; Cressler, John D.

    2010-01-01

    The Advanced Avionics and Processor Systems (AAPS) project, formerly known as the Radiation Hardened Electronics for Space Environments (RHESE) project, endeavors to develop advanced avionic and processor technologies anticipated to be used by NASA s currently evolving space exploration architectures. The AAPS project is a part of the Exploration Technology Development Program, which funds an entire suite of technologies that are aimed at enabling NASA s ability to explore beyond low earth orbit. NASA s Marshall Space Flight Center (MSFC) manages the AAPS project. AAPS uses a broad-scoped approach to developing avionic and processor systems. Investment areas include advanced electronic designs and technologies capable of providing environmental hardness, reconfigurable computing techniques, software tools for radiation effects assessment, and radiation environment modeling tools. Near-term emphasis within the multiple AAPS tasks focuses on developing prototype components using semiconductor processes and materials (such as Silicon-Germanium (SiGe)) to enhance a device s tolerance to radiation events and low temperature environments. As the SiGe technology will culminate in a delivered prototype this fiscal year, the project emphasis shifts its focus to developing low-power, high efficiency total processor hardening techniques. In addition to processor development, the project endeavors to demonstrate techniques applicable to reconfigurable computing and partially reconfigurable Field Programmable Gate Arrays (FPGAs). This capability enables avionic architectures the ability to develop FPGA-based, radiation tolerant processor boards that can serve in multiple physical locations throughout the spacecraft and perform multiple functions during the course of the mission. The individual tasks that comprise AAPS are diverse, yet united in the common endeavor to develop electronics capable of operating within the harsh environment of space. Specifically, the AAPS tasks for

  6. Performance evaluation of throughput computing workloads using multi-core processors and graphics processors

    Science.gov (United States)

    Dave, Gaurav P.; Sureshkumar, N.; Blessy Trencia Lincy, S. S.

    2017-11-01

    Current trend in processor manufacturing focuses on multi-core architectures rather than increasing the clock speed for performance improvement. Graphic processors have become as commodity hardware for providing fast co-processing in computer systems. Developments in IoT, social networking web applications, big data created huge demand for data processing activities and such kind of throughput intensive applications inherently contains data level parallelism which is more suited for SIMD architecture based GPU. This paper reviews the architectural aspects of multi/many core processors and graphics processors. Different case studies are taken to compare performance of throughput computing applications using shared memory programming in OpenMP and CUDA API based programming.

  7. Data collection from FASTBUS to a DEC UNIBUS processor through the UNIBUS-Processor Interface

    International Nuclear Information System (INIS)

    Larwill, M.; Barsotti, E.; Lesny, D.; Pordes, R.

    1983-01-01

    This paper describes the use of the UNIBUS Processor Interface, an interface between FASTBUS and the Digital Equipment Corporation UNIBUS. The UPI was developed by Fermilab and the University of Illinois. Details of the use of this interface in a high energy physics experiment at Fermilab are given. The paper includes a discussion of the operation of the UPI on the UNIBUS of a VAX-11, and plans for using the UPI to perform data acquisition from FASTBUS to a VAX-11 Processor

  8. Array processors based on Gaussian fraction-free method

    Energy Technology Data Exchange (ETDEWEB)

    Peng, S; Sedukhin, S [Aizu Univ., Aizuwakamatsu, Fukushima (Japan); Sedukhin, I

    1998-03-01

    The design of algorithmic array processors for solving linear systems of equations using fraction-free Gaussian elimination method is presented. The design is based on a formal approach which constructs a family of planar array processors systematically. These array processors are synthesized and analyzed. It is shown that some array processors are optimal in the framework of linear allocation of computations and in terms of number of processing elements and computing time. (author)

  9. An enhanced Ada run-time system for real-time embedded processors

    Science.gov (United States)

    Sims, J. T.

    1991-01-01

    An enhanced Ada run-time system has been developed to support real-time embedded processor applications. The primary focus of this development effort has been on the tasking system and the memory management facilities of the run-time system. The tasking system has been extended to support efficient and precise periodic task execution as required for control applications. Event-driven task execution providing a means of task-asynchronous control and communication among Ada tasks is supported in this system. Inter-task control is even provided among tasks distributed on separate physical processors. The memory management system has been enhanced to provide object allocation and protected access support for memory shared between disjoint processors, each of which is executing a distinct Ada program.

  10. Computations on the massively parallel processor at the Goddard Space Flight Center

    Science.gov (United States)

    Strong, James P.

    1991-01-01

    Described are four significant algorithms implemented on the massively parallel processor (MPP) at the Goddard Space Flight Center. Two are in the area of image analysis. Of the other two, one is a mathematical simulation experiment and the other deals with the efficient transfer of data between distantly separated processors in the MPP array. The first algorithm presented is the automatic determination of elevations from stereo pairs. The second algorithm solves mathematical logistic equations capable of producing both ordered and chaotic (or random) solutions. This work can potentially lead to the simulation of artificial life processes. The third algorithm is the automatic segmentation of images into reasonable regions based on some similarity criterion, while the fourth is an implementation of a bitonic sort of data which significantly overcomes the nearest neighbor interconnection constraints on the MPP for transferring data between distant processors.

  11. Simulation of Particulate Flows Multi-Processor Machines with Distributed Memory

    Energy Technology Data Exchange (ETDEWEB)

    Uhlmann, M.

    2004-07-01

    We presented a method for the parallelization of an immersed boundary algorithm for particulate flows using the MPI standard of communication. The treatment of the fluid phase used the domain decomposition technique over a Cartesian processor grid. The solution of the Helmholtz problem is approximately factorized an relies upon apparel tri-diagonal solver the Poisson problem is solved by means of a parallel multi-grid technique similar to MUDPACK. for the solid phase we employ a master-slaves technique where one processor handles all the particles contained in its Eulerian fluid sub-domain and zero or more neighbor processors collaborate in the computation of particle-related quantities whenever a particle position over laps the boundary of a sub-domain. the parallel efficiency for some preliminary computations is presented. (Author) 9 refs.

  12. Discussion paper for a highly parallel array processor-based machine

    International Nuclear Information System (INIS)

    Hagstrom, R.; Bolotin, G.; Dawson, J.

    1984-01-01

    The architectural plant for a quickly realizable implementation of a highly parallel special-purpose computer system with peak performance in the range of 6 billion floating point operations per second is discussed. The architecture is suitable to Lattice Gauge theoretical computations of fundamental physics interest and may be applicable to a range of other problems which deal with numerically intensive computational problems. The plan is quickly realizable because it employs a maximum of commercially available hardware subsystems and because the architecture is software-transparent to the individual processors, allowing straightforward re-use of whatever commercially available operating-systems and support software that is suitable to run on the commercially-produced processors. A tiny prototype instrument, designed along this architecture has already operated. A few elementary examples of programs which can run efficiently are presented. The large machine which the authors would propose to build would be based upon a highly competent array-processor, the ST-100 Array Processor, and specific design possibilities are discussed. The first step toward realizing this plan practically is to install a single ST-100 to allow algorithm development to proceed while a demonstration unit is built using two of the ST-100 Array Processors

  13. PERFORMANCE EVALUATION OF OR1200 PROCESSOR WITH EVOLUTIONARY PARALLEL HPRC USING GEP

    Directory of Open Access Journals (Sweden)

    R. Maheswari

    2012-04-01

    Full Text Available In this fast computing era, most of the embedded system requires more computing power to complete the complex function/ task at the lesser amount of time. One way to achieve this is by boosting up the processor performance which allows processor core to run faster. This paper presents a novel technique of increasing the performance by parallel HPRC (High Performance Reconfigurable Computing in the CPU/DSP (Digital Signal Processor unit of OR1200 (Open Reduced Instruction Set Computer (RISC 1200 using Gene Expression Programming (GEP an evolutionary programming model. OR1200 is a soft-core RISC processor of the Intellectual Property cores that can efficiently run any modern operating system. In the manufacturing process of OR1200 a parallel HPRC is placed internally in the Integer Execution Pipeline unit of the CPU/DSP core to increase the performance. The GEP Parallel HPRC is activated /deactivated by triggering the signals i HPRC_Gene_Start ii HPRC_Gene_End. A Verilog HDL(Hardware Description language functional code for Gene Expression Programming parallel HPRC is developed and synthesised using XILINX ISE in the former part of the work and a CoreMark processor core benchmark is used to test the performance of the OR1200 soft core in the later part of the work. The result of the implementation ensures the overall speed-up increased to 20.59% by GEP based parallel HPRC in the execution unit of OR1200.

  14. Lipsi: Probably the Smallest Processor in the World

    DEFF Research Database (Denmark)

    Schoeberl, Martin

    2018-01-01

    While research on high-performance processors is important, it is also interesting to explore processor architectures at the other end of the spectrum: tiny processor cores for auxiliary functions. While it is common to implement small circuits for such functions, such as a serial port, in dedica...... at a minimal cost....

  15. Reconfigurable Secure Video Codec Based on DWT and AES Processor

    Directory of Open Access Journals (Sweden)

    Rached Tourki

    2010-01-01

    Full Text Available In this paper, we proposed a secure video codec based on the discrete wavelet transformation (DWT and the Advanced Encryption Standard (AES processor. Either, use of video coding with DWT or encryption using AES is well known. However, linking these two designs to achieve secure video coding is leading. The contributions of our work are as follows. First, a new method for image and video compression is proposed. This codec is a synthesis of JPEG and JPEG2000,which is implemented using Huffman coding to the JPEG and DWT to the JPEG2000. Furthermore, an improved motion estimation algorithm is proposed. Second, the encryptiondecryption effects are achieved by the AES processor. AES is aim to encrypt group of LL bands. The prominent feature of this method is an encryption of LL bands by AES-128 (128-bit keys, or AES-192 (192-bit keys, or AES-256 (256-bit keys.Third, we focus on a method that implements partial encryption of LL bands. Our approach provides considerable levels of security (key size, partial encryption, mode encryption, and has very limited adverse impact on the compression efficiency. The proposed codec can provide up to 9 cipher schemes within a reasonable software cost. Latency, correlation, PSNR and compression rate results are analyzed and shown.

  16. Bulk-memory processor for data acquisition

    International Nuclear Information System (INIS)

    Nelson, R.O.; McMillan, D.E.; Sunier, J.W.; Meier, M.; Poore, R.V.

    1981-01-01

    To meet the diverse needs and data rate requirements at the Van de Graaff and Weapons Neutron Research (WNR) facilities, a bulk memory system has been implemented which includes a fast and flexible processor. This bulk memory processor (BMP) utilizes bit slice and microcode techniques and features a 24 bit wide internal architecture allowing direct addressing of up to 16 megawords of memory and histogramming up to 16 million counts per channel without overflow. The BMP is interfaced to the MOSTEK MK 8000 bulk memory system and to the standard MODCOMP computer I/O bus. Coding for the BMP both at the microcode level and with macro instructions is supported. The generalized data acquisition system has been extended to support the BMP in a manner transparent to the user

  17. Design of Processors with Reconfigurable Microarchitecture

    Directory of Open Access Journals (Sweden)

    Andrey Mokhov

    2014-01-01

    Full Text Available Energy becomes a dominating factor for a wide spectrum of computations: from intensive data processing in “big data” companies resulting in large electricity bills, to infrastructure monitoring with wireless sensors relying on energy harvesting. In this context it is essential for a computation system to be adaptable to the power supply and the service demand, which often vary dramatically during runtime. In this paper we present an approach to building processors with reconfigurable microarchitecture capable of changing the way they fetch and execute instructions depending on energy availability and application requirements. We show how to use Conditional Partial Order Graphs to formally specify the microarchitecture of such a processor, explore the design possibilities for its instruction set, and synthesise the instruction decoder using correct-by-construction techniques. The paper is focused on the design methodology, which is evaluated by implementing a power-proportional version of Intel 8051 microprocessor.

  18. Real time processor for array speckle interferometry

    International Nuclear Information System (INIS)

    Chin, G.; Florez, J.; Borelli, R.; Fong, W.; Miko, J.; Trujillo, C.

    1989-01-01

    With the construction of several new large aperture telescopes and the development of large format array detectors in the near IR, the ability to obtain diffraction limited seeing via IR array speckle interferometry offers a powerful tool. We are constructing a real-time processor to acquire image frames, perform array flat-fielding, execute a 64 x 64 element 2D complex FFT, and to average the power spectrum all within the 25 msec coherence time for speckles at near IR wavelength. The processor is a compact unit controlled by a PC with real time display and data storage capability. It provides the ability to optimize observations and obtain results on the telescope rather than waiting several weeks before the data can be analyzed and viewed with off-line methods

  19. Parallel processor programs in the Federal Government

    Science.gov (United States)

    Schneck, P. B.; Austin, D.; Squires, S. L.; Lehmann, J.; Mizell, D.; Wallgren, K.

    1985-01-01

    In 1982, a report dealing with the nation's research needs in high-speed computing called for increased access to supercomputing resources for the research community, research in computational mathematics, and increased research in the technology base needed for the next generation of supercomputers. Since that time a number of programs addressing future generations of computers, particularly parallel processors, have been started by U.S. government agencies. The present paper provides a description of the largest government programs in parallel processing. Established in fiscal year 1985 by the Institute for Defense Analyses for the National Security Agency, the Supercomputing Research Center will pursue research to advance the state of the art in supercomputing. Attention is also given to the DOE applied mathematical sciences research program, the NYU Ultracomputer project, the DARPA multiprocessor system architectures program, NSF research on multiprocessor systems, ONR activities in parallel computing, and NASA parallel processor projects.

  20. RISC Processors and High Performance Computing

    Science.gov (United States)

    Bailey, David H.; Saini, Subhash; Craw, James M. (Technical Monitor)

    1995-01-01

    This tutorial will discuss the top five RISC microprocessors and the parallel systems in which they are used. It will provide a unique cross-machine comparison not available elsewhere. The effective performance of these processors will be compared by citing standard benchmarks in the context of real applications. The latest NAS Parallel Benchmarks, both absolute performance and performance per dollar, will be listed. The next generation of the NPB will be described. The tutorial will conclude with a discussion of future directions in the field. Technology Transfer Considerations: All of these computer systems are commercially available internationally. Information about these processors is available in the public domain, mostly from the vendors themselves. The NAS Parallel Benchmarks and their results have been previously approved numerous times for public release, beginning back in 1991.

  1. Multi-Core Processor Memory Contention Benchmark Analysis Case Study

    Science.gov (United States)

    Simon, Tyler; McGalliard, James

    2009-01-01

    Multi-core processors dominate current mainframe, server, and high performance computing (HPC) systems. This paper provides synthetic kernel and natural benchmark results from an HPC system at the NASA Goddard Space Flight Center that illustrate the performance impacts of multi-core (dual- and quad-core) vs. single core processor systems. Analysis of processor design, application source code, and synthetic and natural test results all indicate that multi-core processors can suffer from significant memory subsystem contention compared to similar single-core processors.

  2. VIRTUS: a multi-processor system in FASTBUS

    International Nuclear Information System (INIS)

    Ellett, J.; Jackson, R.; Ritter, R.; Schlein, P.; Yaeger, D.; Zweizig, J.

    1986-01-01

    VIRTUS is a system of parallel MC68000-based processors interconnected by FASTBUS that is used either on-line as an intelligent trigger component or off-line for full event processing. Each processor receives the complete set of data from one event. The host computer, a VAX 11/780, down-line loads all software to the processors, controls and monitors the functioning of all processors, and writes processed data to tape. Instructions, programs, and data are transferred among the processors and the host in the form of fixed format, variable length data blocks. (Auth.)

  3. Low-Latency Embedded Vision Processor (LLEVS)

    Science.gov (United States)

    2016-03-01

    algorithms, low-latency video processing, embedded image processor, wearable electronics, helmet-mounted systems, alternative night / day imaging...external subsystems and data sources with the device. The establishment of data interfaces in terms of data transfer rates, formats and types are...video signals from Near-visible Infrared (NVIR) sensor, Shortwave IR (SWIR) and Longwave IR (LWIR) is the main processing for Night Vision (NI) system

  4. Keystone Business Models for Network Security Processors

    Directory of Open Access Journals (Sweden)

    Arthur Low

    2013-07-01

    Full Text Available Network security processors are critical components of high-performance systems built for cybersecurity. Development of a network security processor requires multi-domain experience in semiconductors and complex software security applications, and multiple iterations of both software and hardware implementations. Limited by the business models in use today, such an arduous task can be undertaken only by large incumbent companies and government organizations. Neither the “fabless semiconductor” models nor the silicon intellectual-property licensing (“IP-licensing” models allow small technology companies to successfully compete. This article describes an alternative approach that produces an ongoing stream of novel network security processors for niche markets through continuous innovation by both large and small companies. This approach, referred to here as the "business ecosystem model for network security processors", includes a flexible and reconfigurable technology platform, a “keystone” business model for the company that maintains the platform architecture, and an extended ecosystem of companies that both contribute and share in the value created by innovation. New opportunities for business model innovation by participating companies are made possible by the ecosystem model. This ecosystem model builds on: i the lessons learned from the experience of the first author as a senior integrated circuit architect for providers of public-key cryptography solutions and as the owner of a semiconductor startup, and ii the latest scholarly research on technology entrepreneurship, business models, platforms, and business ecosystems. This article will be of interest to all technology entrepreneurs, but it will be of particular interest to owners of small companies that provide security solutions and to specialized security professionals seeking to launch their own companies.

  5. Silicon Processors Using Organically Reconfigurable Techniques (SPORT)

    Science.gov (United States)

    2014-05-19

    AFRL-OSR-VA-TR-2014-0132 SILICON PROCESSORS USING ORGANICALLY RECONFIGURABLE TECHNIQUES ( SPORT ) Dennis Prather UNIVERSITY OF DELAWARE Final Report 05...5a. CONTRACT NUMBER Silicon Processes for Organically Reconfigurable Techniques ( SPORT ) 5b. GRANT NUMBER FA9550-10-1-0363 5c...Contract: Silicon Processes for Organically Reconfigurable Techniques ( SPORT ) Contract #: FA9550-10-1-0363 Reporting Period: 1 July 2010 – 31 December

  6. Quantum chemistry on a superconducting quantum processor

    Energy Technology Data Exchange (ETDEWEB)

    Kaicher, Michael P.; Wilhelm, Frank K. [Theoretical Physics, Saarland University, 66123 Saarbruecken (Germany); Love, Peter J. [Department of Physics and Astronomy, Tufts University, Medford, MA 02155 (United States)

    2016-07-01

    Quantum chemistry is the most promising civilian application for quantum processors to date. We study its adaptation to superconducting (sc) quantum systems, computing the ground state energy of LiH through a variational hybrid quantum classical algorithm. We demonstrate how interactions native to sc qubits further reduce the amount of quantum resources needed, pushing sc architectures as a near-term candidate for simulations of more complex atoms/molecules.

  7. Debugging in a multi-processor environment

    International Nuclear Information System (INIS)

    Spann, J.M.

    1981-01-01

    The Supervisory Control and Diagnostic System (SCDS) for the Mirror Fusion Test Facility (MFTF) consists of nine 32-bit minicomputers arranged in a tightly coupled distributed computer system utilizing a share memory as the data exchange medium. Debugging of more than one program in the multi-processor environment is a difficult process. This paper describes what new tools were developed and how the testing of software is performed in the SCDS for the MFTF project

  8. Intelligent trigger processor for the crystal box

    International Nuclear Information System (INIS)

    Sanders, G.H.; Butler, H.S.; Cooper, M.D.

    1981-01-01

    A large solid angle modular NaI(Tl) detector with 432 phototubes and 88 trigger scintillators is being used to search simultaneously for three lepton flavor changing decays of muon. A beam of up to 10 6 muons stopping per second with a 6% duty factor would yield up to 1000 triggers per second from random triple coincidences. A reduction of the trigger rate to 10 Hz is required from a hardwired primary trigger processor described in this paper. Further reduction to < 1 Hz is achieved by a microprocessor based secondary trigger processor. The primary trigger hardware imposes voter coincidence logic, stringent timing requirements, and a non-adjacency requirement in the trigger scintillators defined by hardwired circuits. Sophisticated geometric requirements are imposed by a PROM-based matrix logic, and energy and vector-momentum cuts are imposed by a hardwired processor using LSI flash ADC's and digital arithmetic loci. The secondary trigger employs four satellite microprocessors to do a sparse data scan, multiplex the data acquisition channels and apply additional event filtering

  9. Multibus-based parallel processor for simulation

    Science.gov (United States)

    Ogrady, E. P.; Wang, C.-H.

    1983-01-01

    A Multibus-based parallel processor simulation system is described. The system is intended to serve as a vehicle for gaining hands-on experience, testing system and application software, and evaluating parallel processor performance during development of a larger system based on the horizontal/vertical-bus interprocessor communication mechanism. The prototype system consists of up to seven Intel iSBC 86/12A single-board computers which serve as processing elements, a multiple transmission controller (MTC) designed to support system operation, and an Intel Model 225 Microcomputer Development System which serves as the user interface and input/output processor. All components are interconnected by a Multibus/IEEE 796 bus. An important characteristic of the system is that it provides a mechanism for a processing element to broadcast data to other selected processing elements. This parallel transfer capability is provided through the design of the MTC and a minor modification to the iSBC 86/12A board. The operation of the MTC, the basic hardware-level operation of the system, and pertinent details about the iSBC 86/12A and the Multibus are described.

  10. Code compression for VLIW embedded processors

    Science.gov (United States)

    Piccinelli, Emiliano; Sannino, Roberto

    2004-04-01

    The implementation of processors for embedded systems implies various issues: main constraints are cost, power dissipation and die area. On the other side, new terminals perform functions that require more computational flexibility and effort. Long code streams must be loaded into memories, which are expensive and power consuming, to run on DSPs or CPUs. To overcome this issue, the "SlimCode" proprietary algorithm presented in this paper (patent pending technology) can reduce the dimensions of the program memory. It can run offline and work directly on the binary code the compiler generates, by compressing it and creating a new binary file, about 40% smaller than the original one, to be loaded into the program memory of the processor. The decompression unit will be a small ASIC, placed between the Memory Controller and the System bus of the processor, keeping unchanged the internal CPU architecture: this implies that the methodology is completely transparent to the core. We present comparisons versus the state-of-the-art IBM Codepack algorithm, along with its architectural implementation into the ST200 VLIW family core.

  11. Treecode with a Special-Purpose Processor

    Science.gov (United States)

    Makino, Junichiro

    1991-08-01

    We describe an implementation of the modified Barnes-Hut tree algorithm for a gravitational N-body calculation on a GRAPE (GRAvity PipE) backend processor. GRAPE is a special-purpose computer for N-body calculations. It receives the positions and masses of particles from a host computer and then calculates the gravitational force at each coordinate specified by the host. To use this GRAPE processor with the hierarchical tree algorithm, the host computer must maintain a list of all nodes that exert force on a particle. If we create this list for each particle of the system at each timestep, the number of floating-point operations on the host and that on GRAPE would become comparable, and the increased speed obtained by using GRAPE would be small. In our modified algorithm, we create a list of nodes for many particles. Thus, the amount of the work required of the host is significantly reduced. This algorithm was originally developed by Barnes in order to vectorize the force calculation on a Cyber 205. With this algorithm, the computing time of the force calculation becomes comparable to that of the tree construction, if the GRAPE backend processor is sufficiently fast. The obtained speed-up factor is 30 to 50 for a RISC-based host computer and GRAPE-1A with a peak speed of 240 Mflops.

  12. Multi-processor network implementations in Multibus II and VME

    International Nuclear Information System (INIS)

    Briegel, C.

    1992-01-01

    ACNET (Fermilab Accelerator Controls Network), a proprietary network protocol, is implemented in a multi-processor configuration for both Multibus II and VME. The implementations are contrasted by the bus protocol and software design goals. The Multibus II implementation provides for multiple processors running a duplicate set of tasks on each processor. For a network connected task, messages are distributed by a network round-robin scheduler. Further, messages can be stopped, continued, or re-routed for each task by user-callable commands. The VME implementation provides for multiple processors running one task across all processors. The process can either be fixed to a particular processor or dynamically allocated to an available processor depending on the scheduling algorithm of the multi-processing operating system. (author)

  13. Merged ozone profiles from four MIPAS processors

    Science.gov (United States)

    Laeng, Alexandra; von Clarmann, Thomas; Stiller, Gabriele; Dinelli, Bianca Maria; Dudhia, Anu; Raspollini, Piera; Glatthor, Norbert; Grabowski, Udo; Sofieva, Viktoria; Froidevaux, Lucien; Walker, Kaley A.; Zehner, Claus

    2017-04-01

    The Michelson Interferometer for Passive Atmospheric Sounding (MIPAS) was an infrared (IR) limb emission spectrometer on the Envisat platform. Currently, there are four MIPAS ozone data products, including the operational Level-2 ozone product processed at ESA, with the scientific prototype processor being operated at IFAC Florence, and three independent research products developed by the Istituto di Fisica Applicata Nello Carrara (ISAC-CNR)/University of Bologna, Oxford University, and the Karlsruhe Institute of Technology-Institute of Meteorology and Climate Research/Instituto de Astrofísica de Andalucía (KIT-IMK/IAA). Here we present a dataset of ozone vertical profiles obtained by merging ozone retrievals from four independent Level-2 MIPAS processors. We also discuss the advantages and the shortcomings of this merged product. As the four processors retrieve ozone in different parts of the spectra (microwindows), the source measurements can be considered as nearly independent with respect to measurement noise. Hence, the information content of the merged product is greater and the precision is better than those of any parent (source) dataset. The merging is performed on a profile per profile basis. Parent ozone profiles are weighted based on the corresponding error covariance matrices; the error correlations between different profile levels are taken into account. The intercorrelations between the processors' errors are evaluated statistically and are used in the merging. The height range of the merged product is 20-55 km, and error covariance matrices are provided as diagnostics. Validation of the merged dataset is performed by comparison with ozone profiles from ACE-FTS (Atmospheric Chemistry Experiment-Fourier Transform Spectrometer) and MLS (Microwave Limb Sounder). Even though the merging is not supposed to remove the biases of the parent datasets, around the ozone volume mixing ratio peak the merged product is found to have a smaller (up to 0.1 ppmv

  14. Applications of selfshielded electron processors in industry 1987

    International Nuclear Information System (INIS)

    Nablo, S.V.

    1988-01-01

    Many new applications of the selfshielded electron processor have emerged over the past 5 years which continue to demonstrate the advantages and practical problems associated with these efficient, compact energy sources. Many of these, such as for the curing of: binders for abrasive products, binders for magnetic media, binders for textile flocking, silicone coatings for release papers, etc., offer good prospects for growth but are thusfar in limited use. With this ever broadening base of process technology, the prospects remain 'bullish' for the expanded industrial use of these new energy sources. With their unrivalled process off-licences and high throughputs, improved appreciation of their advantageous economics assures expanded application in an increasingly energy/environmentally concerned world economy(author)

  15. Directions in parallel processor architecture, and GPUs too

    CERN Multimedia

    CERN. Geneva

    2014-01-01

    Modern computing is power-limited in every domain of computing. Performance increments extracted from instruction-level parallelism (ILP) are no longer power-efficient; they haven't been for some time. Thread-level parallelism (TLP) is a more easily exploited form of parallelism, at the expense of programmer effort to expose it in the program. In this talk, I will introduce you to disparate topics in parallel processor architecture that will impact programming models (and you) in both the near and far future. About the speaker Olivier is a senior GPU (SM) architect at NVIDIA and an active participant in the concurrency working group of the ISO C++ committee. He has also worked on very large diesel engines as a mechanical engineer, and taught at McGill University (Canada) as a faculty instructor.

  16. Renegotiating ethane contracts from producer/processor perspectives

    International Nuclear Information System (INIS)

    Hawkins, D. J.

    1997-01-01

    An overview of commercial practices relating to ethane sup ply and other collateral issues, including the variety of technologies used in the recovery of ethane and the manufacture of ethylene was provided. Ethane supply and demand balances, the Alberta ethane supply, the justifications for ethane recovery, the need for renegotiating ethane contracts in view of the changing Alberta market, the major price components , producers and processors' objectives in ethane sales, the nature of ethane contracts, and ethane pricing mechanisms were reviewed. The 'Alberta Advantage' based on gas price, large-scale ethane recovery, proximity to the largest market in the world, efficient transportation and fractionation facilities, further enhanced by deregulation and reduced regulatory barriers, was described. It was suggested that the enhanced competition and increase in market diversity demands a transparency in pricing that may well be realized as additional players enter the petrochemical business, and as competing transportation, processing and fractionation options become available to ethane suppliers and purchasers.1 tab

  17. Scalable Motion Estimation Processor Core for Multimedia System-on-Chip Applications

    Science.gov (United States)

    Lai, Yeong-Kang; Hsieh, Tian-En; Chen, Lien-Fei

    2007-04-01

    In this paper, we describe a high-throughput and scalable motion estimation processor architecture for multimedia system-on-chip applications. The number of processing elements (PEs) is scalable according to the variable algorithm parameters and the performance required for different applications. Using the PE rings efficiently and an intelligent memory-interleaving organization, the efficiency of the architecture can be increased. Moreover, using efficient on-chip memories and a data management technique can effectively decrease the power consumption and memory bandwidth. Techniques for reducing the number of interconnections and external memory accesses are also presented. Our results demonstrate that the proposed scalable PE-ringed architecture is a flexible and high-performance processor core in multimedia system-on-chip applications.

  18. Modcomp MAX IV System Processors reference guide

    Energy Technology Data Exchange (ETDEWEB)

    Cummings, J.

    1990-10-01

    A user almost always faces a big problem when having to learn to use a new computer system. The information necessary to use the system is often scattered throughout many different manuals. The user also faces the problem of extracting the information really needed from each manual. Very few computer vendors supply a single Users Guide or even a manual to help the new user locate the necessary manuals. Modcomp is no exception to this, Modcomp MAX IV requires that the user be familiar with the system file usage which adds to the problem. At General Atomics there is an ever increasing need for new users to learn how to use the Modcomp computers. This paper was written to provide a condensed Users Reference Guide'' for Modcomp computer users. This manual should be of value not only to new users but any users that are not Modcomp computer systems experts. This Users Reference Guide'' is intended to provided the basic information for the use of the various Modcomp System Processors necessary to, create, compile, link-edit, and catalog a program. Only the information necessary to provide the user with a basic understanding of the Systems Processors is included. This document provides enough information for the majority of programmers to use the Modcomp computers without having to refer to any other manuals. A lot of emphasis has been placed on the file description and usage for each of the System Processors. This allows the user to understand how Modcomp MAX IV does things rather than just learning the system commands.

  19. Optical linear algebra processors - Architectures and algorithms

    Science.gov (United States)

    Casasent, David

    1986-01-01

    Attention is given to the component design and optical configuration features of a generic optical linear algebra processor (OLAP) architecture, as well as the large number of OLAP architectures, number representations, algorithms and applications encountered in current literature. Number-representation issues associated with bipolar and complex-valued data representations, high-accuracy (including floating point) performance, and the base or radix to be employed, are discussed, together with case studies on a space-integrating frequency-multiplexed architecture and a hybrid space-integrating and time-integrating multichannel architecture.

  20. The design of a graphics processor

    International Nuclear Information System (INIS)

    Holmes, M.; Thorne, A.R.

    1975-12-01

    The design of a graphics processor is described which takes into account known and anticipated user requirements, the availability of cheap minicomputers, the state of integrated circuit technology, and the overall need to minimise cost for a given performance. The main user needs are the ability to display large high resolution pictures, and to dynamically change the user's view in real time by means of fast coordinate processing hardware. The transformations that can be applied to 2D or 3D coordinates either singly or in combination are: translation, scaling, mirror imaging, rotation, and the ability to map the transformation origin on to any point on the screen. (author)

  1. Nuclear interactive evaluations on distributed processors

    International Nuclear Information System (INIS)

    Dix, G.E.; Congdon, S.P.

    1988-01-01

    BWR [boiling water reactor] nuclear design is a complicated process, involving trade-offs among a variety of conflicting objectives. Complex computer calculations and usually required for each design iteration. GE Nuclear Energy has implemented a system where the evaluations are performed interactively on a large number of small microcomputers. This approach minimizes the time it takes to carry out design iterations even through the processor speeds are low compared with modern super computers. All of the desktop microcomputers are linked to a common data base via an ethernet communications system so that design data can be shared and data quality can be maintained

  2. Integral Fast Reactor fuel pin processor

    International Nuclear Information System (INIS)

    Levinskas, D.

    1993-01-01

    This report discusses the pin processor which receives metal alloy pins cast from recycled Integral Fast Reactor (IFR) fuel and prepares them for assembly into new IFR fuel elements. Either full length as-cast or precut pins are fed to the machine from a magazine, cut if necessary, and measured for length, weight, diameter and deviation from straightness. Accepted pins are loaded into cladding jackets located in a magazine, while rejects and cutting scraps are separated into trays. The magazines, trays, and the individual modules that perform the different machine functions are assembled and removed using remote manipulators and master-slaves

  3. Lattice gauge theory using parallel processors

    International Nuclear Information System (INIS)

    Lee, T.D.; Chou, K.C.; Zichichi, A.

    1987-01-01

    The book's contents include: Lattice Gauge Theory Lectures: Introduction and Current Fermion Simulations; Monte Carlo Algorithms for Lattice Gauge Theory; Specialized Computers for Lattice Gauge Theory; Lattice Gauge Theory at Finite Temperature: A Monte Carlo Study; Computational Method - An Elementary Introduction to the Langevin Equation, Present Status of Numerical Quantum Chromodynamics; Random Lattice Field Theory; The GF11 Processor and Compiler; and The APE Computer and First Physics Results; Columbia Supercomputer Project: Parallel Supercomputer for Lattice QCD; Statistical and Systematic Errors in Numerical Simulations; Monte Carlo Simulation for LGT and Programming Techniques on the Columbia Supercomputer; Food for Thought: Five Lectures on Lattice Gauge Theory

  4. Introduction to programming multiple-processor computers

    International Nuclear Information System (INIS)

    Hicks, H.R.; Lynch, V.E.

    1985-04-01

    FORTRAN applications programs can be executed on multiprocessor computers in either a unitasking (traditional) or multitasking form. The latter allows a single job to use more than one processor simultaneously, with a consequent reduction in wall-clock time and, perhaps, the cost of the calculation. An introduction to programming in this environment is presented. The concepts of synchronization and data sharing using EVENTS and LOCKS are illustrated with examples. The strategy of strong synchronization and the use of synchronization templates are proposed. We emphasize that incorrect multitasking programs can produce irreproducible results, which makes debugging more difficult

  5. Recommending the heterogeneous cluster type multi-processor system computing

    International Nuclear Information System (INIS)

    Iijima, Nobukazu

    2010-01-01

    Real-time reactor simulator had been developed by reusing the equipment of the Musashi reactor and its performance improvement became indispensable for research tools to increase sampling rate with introduction of arithmetic units using multi-Digital Signal Processor(DSP) system (cluster). In order to realize the heterogeneous cluster type multi-processor system computing, combination of two kinds of Control Processor (CP) s, Cluster Control Processor (CCP) and System Control Processor (SCP), were proposed with Large System Control Processor (LSCP) for hierarchical cluster if needed. Faster computing performance of this system was well evaluated by simulation results for simultaneous execution of plural jobs and also pipeline processing between clusters, which showed the system led to effective use of existing system and enhancement of the cost performance. (T. Tanaka)

  6. SSC 254 Screen-Based Word Processors: Production Tests. The Lanier Word Processor.

    Science.gov (United States)

    Moyer, Ruth A.

    Designed for use in Trident Technical College's Secretarial Lab, this series of 12 production tests focuses on the use of the Lanier Word Processor for a variety of tasks. In tests 1 and 2, students are required to type and print out letters. Tests 3 through 8 require students to reformat a text; make corrections on a letter; divide and combine…

  7. Multiprocessor Real-Time Scheduling with Hierarchical Processor Affinities

    OpenAIRE

    Bonifaci , Vincenzo; Brandenburg , Björn; D'Angelo , Gianlorenzo; Marchetti-Spaccamela , Alberto

    2016-01-01

    International audience; Many multiprocessor real-time operating systems offer the possibility to restrict the migrations of any task to a specified subset of processors by setting affinity masks. A notion of " strong arbitrary processor affinity scheduling " (strong APA scheduling) has been proposed; this notion avoids schedulability losses due to overly simple implementations of processor affinities. Due to potential overheads, strong APA has not been implemented so far in a real-time operat...

  8. Coordinated Energy Management in Heterogeneous Processors

    Directory of Open Access Journals (Sweden)

    Indrani Paul

    2014-01-01

    Full Text Available This paper examines energy management in a heterogeneous processor consisting of an integrated CPU–GPU for high-performance computing (HPC applications. Energy management for HPC applications is challenged by their uncompromising performance requirements and complicated by the need for coordinating energy management across distinct core types – a new and less understood problem. We examine the intra-node CPU–GPU frequency sensitivity of HPC applications on tightly coupled CPU–GPU architectures as the first step in understanding power and performance optimization for a heterogeneous multi-node HPC system. The insights from this analysis form the basis of a coordinated energy management scheme, called DynaCo, for integrated CPU–GPU architectures. We implement DynaCo on a modern heterogeneous processor and compare its performance to a state-of-the-art power- and performance-management algorithm. DynaCo improves measured average energy-delay squared (ED2 product by up to 30% with less than 2% average performance loss across several exascale and other HPC workloads.

  9. Expert System Constant False Alarm Rate (CFAR) Processor

    National Research Council Canada - National Science Library

    Wicks, Michael C

    2006-01-01

    An artificial intelligence system improves radar signal processor performance by increasing target probability of detection and reducing probability of false alarm in a severe radar clutter environment...

  10. Fast track trigger processor for the OPAL detector at LEP

    Energy Technology Data Exchange (ETDEWEB)

    Carter, A A; Carter, J R; Ward, D R; Heuer, R D; Jaroslawski, S; Wagner, A

    1986-09-20

    A fast hardware track trigger processor being built for the OPAL experiment is described. The processor will analyse data from the central drift chambers of OPAL to determine whether any tracks come from the interaction region, and thereby eliminate background events. The processor will find tracks over a large angular range, vertical strokecos thetavertical stroke < or approx. 0.95. The design of the processor is described, together with a brief account of its hardware implementation for OPAL. The results of feasibility studies are also presented.

  11. Special processor for in-core control systems

    International Nuclear Information System (INIS)

    Golovanov, M.N.; Duma, V.R.; Levin, G.L.; Mel'nikov, A.V.; Polikanin, A.V.; Filatov, V.P.

    1978-01-01

    The BUTs-20 special processor is discussed, designed to control the units of the in-core control equipment which are incorporated into the VECTOR communication channel, and to provide preliminary data processing prior to computer calculations. A set of instructions and flowsheet of the processor, organization of its communication with memories and other units of the system are given. The processor components: a control unit and an arithmetic logical unit are discussed. It is noted that the special processor permits more effective utilization of the computer time

  12. Development of level 2 processor for the readout of TMC

    International Nuclear Information System (INIS)

    Arai, Y.; Ikeno, M.; Murata, T.; Sudo, F.; Emura, T.

    1995-01-01

    We have developed a prototype 8-bit processor for the level 2 data processing for the Time Memory Cell (TMC). The first prototype processor successfully runs with 18 MHz clock. The operation of same clock frequency as TMC (30 MHz) will be easily achieved with simple modifications. Although the processor is very primitive one but shows its powerful performance and flexibility. To realize the compact TMC/L2P (Level 2 Processor) system, it is better to include the microcode memory within the chip. Encoding logic of the microcode must be included to reduce the microcode memory in this case. (J.P.N.)

  13. Design Approach and Implementation of Application Specific Instruction Set Processor for SHA-3 BLAKE Algorithm

    Science.gov (United States)

    Zhang, Yuli; Han, Jun; Weng, Xinqian; He, Zhongzhu; Zeng, Xiaoyang

    This paper presents an Application Specific Instruction-set Processor (ASIP) for the SHA-3 BLAKE algorithm family by instruction set extensions (ISE) from an RISC (reduced instruction set computer) processor. With a design space exploration for this ASIP to increase the performance and reduce the area cost, we accomplish an efficient hardware and software implementation of BLAKE algorithm. The special instructions and their well-matched hardware function unit improve the calculation of the key section of the algorithm, namely G-functions. Also, relaxing the time constraint of the special function unit can decrease its hardware cost, while keeping the high data throughput of the processor. Evaluation results reveal the ASIP achieves 335Mbps and 176Mbps for BLAKE-256 and BLAKE-512. The extra area cost is only 8.06k equivalent gates. The proposed ASIP outperforms several software approaches on various platforms in cycle per byte. In fact, both high throughput and low hardware cost achieved by this programmable processor are comparable to that of ASIC implementations.

  14. Blocked All-Pairs Shortest Paths Algorithm on Intel Xeon Phi KNL Processor: A Case Study

    OpenAIRE

    Rucci, Enzo; De Giusti, Armando Eduardo; Naiouf, Marcelo

    2017-01-01

    Manycores are consolidating in HPC community as a way of improving performance while keeping power efficiency. Knights Landing is the recently released second generation of Intel Xeon Phi architec- ture.While optimizing applications on CPUs, GPUs and first Xeon Phi’s has been largely studied in the last years, the new features in Knights Landing processors require the revision of programming and optimization techniques for these devices. In this work, we selected the Floyd-Warshall algorithm ...

  15. Scaling and optimizing the Gysela code on a cluster of many-core processors

    OpenAIRE

    Latu , Guillaume; ASAHI , Yuuichi; Bigot , Julien; Fehér , Tamás; Grandgirard , Virginie

    2018-01-01

    The current generation of the Xeon Phi Knights Landing (KNL) processor provides a highly multi-threaded environment on which regular programming models such as MPI/OpenMP can be used. This specific hardware offers both large memory bandwidth and large computing resources and is currently available on computing facilities. Many factors impact the performance achieved by applications, one of the key points is the efficient exploitation of SIMD vector units, another one is the memory access patt...

  16. Aspects of computation on asynchronous parallel processors

    International Nuclear Information System (INIS)

    Wright, M.

    1989-01-01

    The increasing availability of asynchronous parallel processors has provided opportunities for original and useful work in scientific computing. However, the field of parallel computing is still in a highly volatile state, and researchers display a wide range of opinion about many fundamental questions such as models of parallelism, approaches for detecting and analyzing parallelism of algorithms, and tools that allow software developers and users to make effective use of diverse forms of complex hardware. This volume collects the work of researchers specializing in different aspects of parallel computing, who met to discuss the framework and the mechanics of numerical computing. The far-reaching impact of high-performance asynchronous systems is reflected in the wide variety of topics, which include scientific applications (e.g. linear algebra, lattice gauge simulation, ordinary and partial differential equations), models of parallelism, parallel language features, task scheduling, automatic parallelization techniques, tools for algorithm development in parallel environments, and system design issues

  17. Processor-in-memory-and-storage architecture

    Science.gov (United States)

    DeBenedictis, Erik

    2018-01-02

    A method and apparatus for performing reliable general-purpose computing. Each sub-core of a plurality of sub-cores of a processor core processes a same instruction at a same time. A code analyzer receives a plurality of residues that represents a code word corresponding to the same instruction and an indication of whether the code word is a memory address code or a data code from the plurality of sub-cores. The code analyzer determines whether the plurality of residues are consistent or inconsistent. The code analyzer and the plurality of sub-cores perform a set of operations based on whether the code word is a memory address code or a data code and a determination of whether the plurality of residues are consistent or inconsistent.

  18. An accurate projection algorithm for array processor based SPECT systems

    International Nuclear Information System (INIS)

    King, M.A.; Schwinger, R.B.; Cool, S.L.

    1985-01-01

    A data re-projection algorithm has been developed for use in single photon emission computed tomography (SPECT) on an array processor based computer system. The algorithm makes use of an accurate representation of pixel activity (uniform square pixel model of intensity distribution), and is rapidly performed due to the efficient handling of an array based algorithm and the Fast Fourier Transform (FFT) on parallel processing hardware. The algorithm consists of using a pixel driven nearest neighbour projection operation to an array of subdivided projection bins. This result is then convolved with the projected uniform square pixel distribution before being compressed to original bin size. This distribution varies with projection angle and is explicitly calculated. The FFT combined with a frequency space multiplication is used instead of a spatial convolution for more rapid execution. The new algorithm was tested against other commonly used projection algorithms by comparing the accuracy of projections of a simulated transverse section of the abdomen against analytically determined projections of that transverse section. The new algorithm was found to yield comparable or better standard error and yet result in easier and more efficient implementation on parallel hardware. Applications of the algorithm include iterative reconstruction and attenuation correction schemes and evaluation of regions of interest in dynamic and gated SPECT

  19. The Potential of the Cell Processor for Scientific Computing

    Energy Technology Data Exchange (ETDEWEB)

    Williams, Samuel; Shalf, John; Oliker, Leonid; Husbands, Parry; Kamil, Shoaib; Yelick, Katherine

    2005-10-14

    The slowing pace of commodity microprocessor performance improvements combined with ever-increasing chip power demands has become of utmost concern to computational scientists. As a result, the high performance computing community is examining alternative architectures that address the limitations of modern cache-based designs. In this work, we examine the potential of the using the forth coming STI Cell processor as a building block for future high-end computing systems. Our work contains several novel contributions. We are the first to present quantitative Cell performance data on scientific kernels and show direct comparisons against leading superscalar (AMD Opteron), VLIW (IntelItanium2), and vector (Cray X1) architectures. Since neither Cell hardware nor cycle-accurate simulators are currently publicly available, we develop both analytical models and simulators to predict kernel performance. Our work also explores the complexity of mapping several important scientific algorithms onto the Cells unique architecture. Additionally, we propose modest microarchitectural modifications that could significantly increase the efficiency of double-precision calculations. Overall results demonstrate the tremendous potential of the Cell architecture for scientific computations in terms of both raw performance and power efficiency.

  20. Parallelization of applications for networks with homogeneous and heterogeneous processors

    International Nuclear Information System (INIS)

    Colombet, L.

    1994-01-01

    The aim of this thesis is to study and develop efficient methods for parallelization of scientific applications on parallel computers with distributed memory. The first part presents two libraries of PVM (Parallel Virtual Machine) and MPI (Message Passing Interface) communication tools. They allow implementation of programs on most parallel machines, but also on heterogeneous computer networks. This chapter illustrates the problems faced when trying to evaluate performances of networks with heterogeneous processors. To evaluate such performances, the concepts of speed-up and efficiency have been modified and adapted to account for heterogeneity. The second part deals with a study of parallel application libraries such as ScaLAPACK and with the development of communication masking techniques. The general concept is based on communication anticipation, in particular by pipelining message sending operations. Experimental results on Cray T3D and IBM SP1 machines validates the theoretical studies performed on basic algorithms of the libraries discussed above. Two examples of scientific applications are given: the first is a model of young stars for astrophysics and the other is a model of photon trajectories in the Compton effect. (J.S.). 83 refs., 65 figs., 24 tabs

  1. PRODUCTIVITY AND COSTS OF PROCESSOR WORKING IN STANDS OF Eucalyptus grandis Hill ex Maiden

    Directory of Open Access Journals (Sweden)

    Bernardo Carlos Tarnowski

    2009-09-01

    Full Text Available In the present work a time study was conducted with the objective of adjusting equations to estimate the time of activities, productivity, operational costs and the production of the processor used in a harvest operation of stands of Eucalyptus grandis in plain topography in the state of Bahia, Brazil. The operational cycle of the processor consisted of the time spent to process a tree, and was divided in to stages, which were assessed using the methotodology of single activity times. The sampling unit was the operational cycle of the machine. The statistical analysis was based on regression analysis considering the selection procedure “stepwise”. With the adjusted equations it was possible to estimate the productivity of the machine taking into account the of tree diameter. Considering an operational efficiency of 70 % under the circumstances of the study, the productivity of the processor was 25,8 m3 cc/h, the operational costs 47,90 US$/h and the production costs 1,86 US$/m3 cc. On the basis of the obtained results it can be concluded that the time of tree processing has varied directly according to the diameter increase diameter; the preparation time, contrary to the processing time, only shows a weak correlation with tree diameter; productivity of the processor is directly proportional to tree diameter, when expressed in volume and inversely proportional when expressed in tree number; the costs per cubic meter of wood processed varies inversely with of increased diameter; from the operational costs, fixed costs had the highest proportion followed by the variable costs, administrative costs and costs for manpower; the production costs of the processor decreased exponentially with increasing tree diameter.

  2. Optimal processor for malfunction detection in operating nuclear reactor

    International Nuclear Information System (INIS)

    Ciftcioglu, O.

    1990-01-01

    An optimal processor for diagnosing operational transients in a nuclear reactor is described. Basic design of the processor involves real-time processing of noise signal obtained from a particular in core sensor and the optimality is based on minimum alarm failure in contrast to minimum false alarm criterion from the safe and reliable plant operation viewpoint

  3. Sojourn time tails in processor-sharing systems

    NARCIS (Netherlands)

    Egorova, R.R.

    2009-01-01

    The processor-sharing discipline was originally introduced as a modeling abstraction for the design and performance analysis of the processing unit of a computer system. Under the processor-sharing discipline, all active tasks are assumed to be processed simultaneously, receiving an equal share of

  4. ACP/R3000 processors in data acquisition systems

    International Nuclear Information System (INIS)

    Deppe, J.; Areti, H.; Atac, R.

    1989-02-01

    We describe ACP/R3000 processor based data acquisition systems for high energy physics. This VME bus compatible processor board, with a computational power equivalent to 15 VAX 11/780s or better, contains 8 Mb of memory for event buffering and has a high speed secondary bus that allows data gathering from front end electronics. 2 refs., 3 figs

  5. On the effective parallel programming of multi-core processors

    NARCIS (Netherlands)

    Varbanescu, A.L.

    2010-01-01

    Multi-core processors are considered now the only feasible alternative to the large single-core processors which have become limited by technological aspects such as power consumption and heat dissipation. However, due to their inherent parallel structure and their diversity, multi-cores are

  6. Bank switched memory interface for an image processor

    International Nuclear Information System (INIS)

    Barron, M.; Downward, J.

    1980-09-01

    A commercially available image processor is interfaced to a PDP-11/45 through an 8K window of memory addresses. When the image processor was not in use it was desired to be able to use the 8K address space as real memory. The standard method of accomplishing this would have been to use UNIBUS switches to switch in either the physical 8K bank of memory or the image processor memory. This method has the disadvantage of being rather expensive. As a simple alternative, a device was built to selectively enable or disable either an 8K bank of memory or the image processor memory. To enable the image processor under program control, GEN is contracted in size, the memory is disabled, a device partition for the image processor is created above GEN, and the image processor memory is enabled. The process is reversed to restore memory to GEN. The hardware to enable/disable the image and computer memories is controlled using spare bits from a DR-11K output register. The image processor and physical memory can be switched in or out on line with no adverse affects on the system's operation

  7. Digital image processing software system using an array processor

    International Nuclear Information System (INIS)

    Sherwood, R.J.; Portnoff, M.R.; Journeay, C.H.; Twogood, R.E.

    1981-01-01

    A versatile array processor-based system for general-purpose image processing was developed. At the heart of this system is an extensive, flexible software package that incorporates the array processor for effective interactive image processing. The software system is described in detail, and its application to a diverse set of applications at LLNL is briefly discussed. 4 figures, 1 table

  8. Designing a dataflow processor using CλaSH

    NARCIS (Netherlands)

    Niedermeier, A.; Wester, Rinse; Wester, Rinse; Rovers, K.C.; Baaij, C.P.R.; Kuper, Jan; Smit, Gerardus Johannes Maria

    2010-01-01

    In this paper we show how a simple dataflow processor can be fully implemented using CλaSH, a high level HDL based on the functional programming language Haskell. The processor was described using Haskell, the CλaSH compiler was then used to translate the design into a fully synthesisable VHDL code.

  9. Biomass is beginning to threaten the wood-processors

    International Nuclear Information System (INIS)

    Beer, G.; Sobinkovic, B.

    2004-01-01

    In this issue an exploitation of biomass in Slovak Republic is analysed. Some new projects of constructing of the stoke-holds for biomass processing are published. The grants for biomass are ascending the prices of wood raw material, which is thus becoming less accessible for the wood-processors. An excessive wood export threatens the domestic processors

  10. Digital Signal Processor System for AC Power Drivers

    Directory of Open Access Journals (Sweden)

    Ovidiu Neamtu

    2009-10-01

    Full Text Available DSP (Digital Signal Processor is the bestsolution for motor control systems to make possible thedevelopment of advanced motor drive systems. The motorcontrol processor calculates the required motor windingvoltage magnitude and frequency to operate the motor atthe desired speed. A PWM (Pulse Width Modulationcircuit controls the on and off duty cycle of the powerinverter switches to vary the magnitude of the motorvoltages.

  11. Evaluation of the Intel Sandy Bridge-EP server processor

    CERN Document Server

    Jarp, S; Leduc, J; Nowak, A; CERN. Geneva. IT Department

    2012-01-01

    In this paper we report on a set of benchmark results recently obtained by CERN openlab when comparing an 8-core “Sandy Bridge-EP” processor with Intel’s previous microarchitecture, the “Westmere-EP”. The Intel marketing names for these processors are “Xeon E5-2600 processor series” and “Xeon 5600 processor series”, respectively. Both processors are produced in a 32nm process, and both platforms are dual-socket servers. Multiple benchmarks were used to get a good understanding of the performance of the new processor. We used both industry-standard benchmarks, such as SPEC2006, and specific High Energy Physics benchmarks, representing both simulation of physics detectors and data analysis of physics events. Before summarizing the results we must stress the fact that benchmarking of modern processors is a very complex affair. One has to control (at least) the following features: processor frequency, overclocking via Turbo mode, the number of physical cores in use, the use of logical cores ...

  12. Acoustooptic linear algebra processors - Architectures, algorithms, and applications

    Science.gov (United States)

    Casasent, D.

    1984-01-01

    Architectures, algorithms, and applications for systolic processors are described with attention to the realization of parallel algorithms on various optical systolic array processors. Systolic processors for matrices with special structure and matrices of general structure, and the realization of matrix-vector, matrix-matrix, and triple-matrix products and such architectures are described. Parallel algorithms for direct and indirect solutions to systems of linear algebraic equations and their implementation on optical systolic processors are detailed with attention to the pipelining and flow of data and operations. Parallel algorithms and their optical realization for LU and QR matrix decomposition are specifically detailed. These represent the fundamental operations necessary in the implementation of least squares, eigenvalue, and SVD solutions. Specific applications (e.g., the solution of partial differential equations, adaptive noise cancellation, and optimal control) are described to typify the use of matrix processors in modern advanced signal processing.

  13. Multiple Embedded Processors for Fault-Tolerant Computing

    Science.gov (United States)

    Bolotin, Gary; Watson, Robert; Katanyoutanant, Sunant; Burke, Gary; Wang, Mandy

    2005-01-01

    A fault-tolerant computer architecture has been conceived in an effort to reduce vulnerability to single-event upsets (spurious bit flips caused by impingement of energetic ionizing particles or photons). As in some prior fault-tolerant architectures, the redundancy needed for fault tolerance is obtained by use of multiple processors in one computer. Unlike prior architectures, the multiple processors are embedded in a single field-programmable gate array (FPGA). What makes this new approach practical is the recent commercial availability of FPGAs that are capable of having multiple embedded processors. A working prototype (see figure) consists of two embedded IBM PowerPC 405 processor cores and a comparator built on a Xilinx Virtex-II Pro FPGA. This relatively simple instantiation of the architecture implements an error-detection scheme. A planned future version, incorporating four processors and two comparators, would correct some errors in addition to detecting them.

  14. Experimental testing of the noise-canceling processor.

    Science.gov (United States)

    Collins, Michael D; Baer, Ralph N; Simpson, Harry J

    2011-09-01

    Signal-processing techniques for localizing an acoustic source buried in noise are tested in a tank experiment. Noise is generated using a discrete source, a bubble generator, and a sprinkler. The experiment has essential elements of a realistic scenario in matched-field processing, including complex source and noise time series in a waveguide with water, sediment, and multipath propagation. The noise-canceling processor is found to outperform the Bartlett processor and provide the correct source range for signal-to-noise ratios below -10 dB. The multivalued Bartlett processor is found to outperform the Bartlett processor but not the noise-canceling processor. © 2011 Acoustical Society of America

  15. Simulation of a processor switching circuit with APLSV

    International Nuclear Information System (INIS)

    Dilcher, H.

    1979-01-01

    The report describes the simulation of a processor switching circuit with APL. Furthermore an APL function is represented to simulate a processor in an assembly like language. Both together serve as a tool for studying processor properties. By means of the programming function it is also possible to program other simulated processors. The processor is to be used in the processing of data in real time analysis that occur in high energy physics experiments. The data are already offered to the computer in digitalized form. A typical data rate is at 10 KB/ sec. The data are structured in blocks. The particular blocks are 1 KB wide and are independent from each other. Aprocessor has to decide, whether the block data belong to an event that is part of the backround noise and can therefore be forgotten, or whether the data should be saved for a later evaluation. (orig./WB) [de

  16. MPC Related Computational Capabilities of ARMv7A Processors

    DEFF Research Database (Denmark)

    Frison, Gianluca; Jørgensen, John Bagterp

    2015-01-01

    In recent years, the mass market of mobile devices has pushed the demand for increasingly fast but cheap processors. ARM, the world leader in this sector, has developed the Cortex-A series of processors with focus on computationally intensive applications. If properly programmed, these processors...... are powerful enough to solve the complex optimization problems arising in MPC in real-time, while keeping the traditional low-cost and low-power consumption. This makes these processors ideal candidates for use in embedded MPC. In this paper, we investigate the floating-point capabilities of Cortex A7, A9...... and A15 and show how to exploit the unique features of each processor to obtain the best performance, in the context of a novel implementation method for the linear-algebra routines used in MPC solvers. This method adapts high-performance computing techniques to the needs of embedded MPC. In particular...

  17. Efficacy of Code Optimization on Cache-based Processors

    Science.gov (United States)

    VanderWijngaart, Rob F.; Chancellor, Marisa K. (Technical Monitor)

    1997-01-01

    The current common wisdom in the U.S. is that the powerful, cost-effective supercomputers of tomorrow will be based on commodity (RISC) micro-processors with cache memories. Already, most distributed systems in the world use such hardware as building blocks. This shift away from vector supercomputers and towards cache-based systems has brought about a change in programming paradigm, even when ignoring issues of parallelism. Vector machines require inner-loop independence and regular, non-pathological memory strides (usually this means: non-power-of-two strides) to allow efficient vectorization of array operations. Cache-based systems require spatial and temporal locality of data, so that data once read from main memory and stored in high-speed cache memory is used optimally before being written back to main memory. This means that the most cache-friendly array operations are those that feature zero or unit stride, so that each unit of data read from main memory (a cache line) contains information for the next iteration in the loop. Moreover, loops ought to be 'fat', meaning that as many operations as possible are performed on cache data-provided instruction caches do not overflow and enough registers are available. If unit stride is not possible, for example because of some data dependency, then care must be taken to avoid pathological strides, just ads on vector computers. For cache-based systems the issues are more complex, due to the effects of associativity and of non-unit block (cache line) size. But there is more to the story. Most modern micro-processors are superscalar, which means that they can issue several (arithmetic) instructions per clock cycle, provided that there are enough independent instructions in the loop body. This is another argument for providing fat loop bodies. With these restrictions, it appears fairly straightforward to produce code that will run efficiently on any cache-based system. It can be argued that although some of the important

  18. Launching applications on compute and service processors running under different operating systems in scalable network of processor boards with routers

    Science.gov (United States)

    Tomkins, James L [Albuquerque, NM; Camp, William J [Albuquerque, NM

    2009-03-17

    A multiple processor computing apparatus includes a physical interconnect structure that is flexibly configurable to support selective segregation of classified and unclassified users. The physical interconnect structure also permits easy physical scalability of the computing apparatus. The computing apparatus can include an emulator which permits applications from the same job to be launched on processors that use different operating systems.

  19. Low Cost Design of an Advanced Encryption Standard (AES) Processor Using a New Common-Subexpression-Elimination Algorithm

    Science.gov (United States)

    Chen, Ming-Chih; Hsiao, Shen-Fu

    In this paper, we propose an area-efficient design of Advanced Encryption Standard (AES) processor by applying a new common-expression-elimination (CSE) method to the sub-functions of various transformations required in AES. The proposed method reduces the area cost of realizing the sub-functions by extracting the common factors in the bit-level XOR/AND-based sum-of-product expressions of these sub-functions using a new CSE algorithm. Cell-based implementation results show that the AES processor with our proposed CSE method has significant area improvement compared with previous designs.

  20. Balanced Bipartite Graph Based Register Allocation for Network Processors in Mobile and Wireless Networks

    Directory of Open Access Journals (Sweden)

    Feilong Tang

    2010-01-01

    Full Text Available Mobile and wireless networks are the integrant infrastructure of mobile and pervasive computing that aims at providing transparent and preferred information and services for people anytime anywhere. In such environments, end-to-end network bandwidth is crucial to improve user's transparent experience when providing on-demand services such as mobile video playing. As a result, powerful computing power is required for networked nodes, especially for routers. General-purpose processors cannot meet such requirements due to their limited processing ability, and poor programmability and scalability. Intel's network processor IXP is specially designed for fast packet processing to achieve a broad bandwidth. IXP provides a large number of registers to reduce the number of memory accesses. Registers in an IXP are physically partitioned as two banks so that two source operands in an instruction have to come from the two banks respectively, which makes the IXP register allocation tricky and different from conventional ones. In this paper, we investigate an approach for efficiently generating balanced bipartite graph and register allocation algorithms for the dual-bank register allocation in IXPs. The paper presents a graph uniform 2-way partition algorithm (FPT, which provides an optimal solution to the graph partition, and a heuristic algorithm for generating balanced bipartite graph. Finally, we design a framework for IXP register allocation. Experimental results demonstrate the framework and the algorithms are efficient in register allocation for IXP network processors.

  1. Methanol fuel processor and PEM fuel cell modeling for mobile application

    Energy Technology Data Exchange (ETDEWEB)

    Chrenko, Daniela [ISAT, University of Burgundy, Rue Mlle Bourgoise, 58000 Nevers (France); Gao, Fei; Blunier, Benjamin; Bouquain, David; Miraoui, Abdellatif [Transport and Systems Laboratory (SeT) - EA 3317/UTBM, Fuel cell Laboratory (FCLAB), University of Technology of Belfort-Montbeliard, Rue Thierry Mieg 90010, Belfort Cedex (France)

    2010-07-15

    The use of hydrocarbon fed fuel cell systems including a fuel processor can be an entry market for this emerging technology avoiding the problem of hydrogen infrastructure. This article presents a 1 kW low temperature PEM fuel cell system with fuel processor, the system is fueled by a mixture of methanol and water that is converted into hydrogen rich gas using a steam reformer. A complete system model including a fluidic fuel processor model containing evaporation, steam reformer, hydrogen filter, combustion, as well as a multi-domain fuel cell model is introduced. Experiments are performed with an IDATECH FCS1200 trademark fuel cell system. The results of modeling and experimentation show good results, namely with regard to fuel cell current and voltage as well as hydrogen production and pressure. The system is auto sufficient and shows an efficiency of 25.12%. The presented work is a step towards a complete system model, needed to develop a well adapted system control assuring optimized system efficiency. (author)

  2. A Hybrid Scheme Based on Pipelining and Multitasking in Mobile Application Processors for Advanced Video Coding

    Directory of Open Access Journals (Sweden)

    Muhammad Asif

    2015-01-01

    Full Text Available One of the key requirements for mobile devices is to provide high-performance computing at lower power consumption. The processors used in these devices provide specific hardware resources to handle computationally intensive video processing and interactive graphical applications. Moreover, processors designed for low-power applications may introduce limitations on the availability and usage of resources, which present additional challenges to the system designers. Owing to the specific design of the JZ47x series of mobile application processors, a hybrid software-hardware implementation scheme for H.264/AVC encoder is proposed in this work. The proposed scheme distributes the encoding tasks among hardware and software modules. A series of optimization techniques are developed to speed up the memory access and data transferring among memories. Moreover, an efficient data reusage design is proposed for the deblock filter video processing unit to reduce the memory accesses. Furthermore, fine grained macroblock (MB level parallelism is effectively exploited and a pipelined approach is proposed for efficient utilization of hardware processing cores. Finally, based on parallelism in the proposed design, encoding tasks are distributed between two processing cores. Experiments show that the hybrid encoder is 12 times faster than a highly optimized sequential encoder due to proposed techniques.

  3. Computation of Molecular Spectra on a Quantum Processor with an Error-Resilient Algorithm

    Directory of Open Access Journals (Sweden)

    J. I. Colless

    2018-02-01

    Full Text Available Harnessing the full power of nascent quantum processors requires the efficient management of a limited number of quantum bits with finite coherent lifetimes. Hybrid algorithms, such as the variational quantum eigensolver (VQE, leverage classical resources to reduce the required number of quantum gates. Experimental demonstrations of VQE have resulted in calculation of Hamiltonian ground states, and a new theoretical approach based on a quantum subspace expansion (QSE has outlined a procedure for determining excited states that are central to dynamical processes. We use a superconducting-qubit-based processor to apply the QSE approach to the H_{2} molecule, extracting both ground and excited states without the need for auxiliary qubits or additional minimization. Further, we show that this extended protocol can mitigate the effects of incoherent errors, potentially enabling larger-scale quantum simulations without the need for complex error-correction techniques.

  4. Simulation of Particulate Flows on Multi-Processor Machines with Distributed Memory

    International Nuclear Information System (INIS)

    Uhlmann, M.

    2004-01-01

    We present a method for the parallelization of an immersed boundary algorithm for particulate flows using the MPI standard of communication. The treatment of the fluid phase uses the domain decomposition technique over a Cartesian processor grid. The solution of the Hehnholtz problem is approximately factorized an relies upon apparel tri-diagonal solver; the Poisson problem is solved by means of a parallel multi-grid technique simulator MUDPACK. For the solid phase we employ a master-slaves technique where one process or handles all the particles contained in its Eulerian fluid sub-domain and zero or more neighbor processors collaborate in the computation of particle-related quantities whenever a particle position overlaps the boundary of a sub- do mam.The parallel efficiency for some preliminary computations is presented. (Author) 9 refs

  5. A natural-gas fuel processor for a residential fuel cell system

    Science.gov (United States)

    Adachi, H.; Ahmed, S.; Lee, S. H. D.; Papadias, D.; Ahluwalia, R. K.; Bendert, J. C.; Kanner, S. A.; Yamazaki, Y.

    A system model was used to develop an autothermal reforming fuel processor to meet the targets of 80% efficiency (higher heating value) and start-up energy consumption of less than 500 kJ when operated as part of a 1-kWe natural-gas fueled fuel cell system for cogeneration of heat and power. The key catalytic reactors of the fuel processor - namely the autothermal reformer, a two-stage water gas shift reactor and a preferential oxidation reactor - were configured and tested in a breadboard apparatus. Experimental results demonstrated a reformate containing ∼48% hydrogen (on a dry basis and with pure methane as fuel) and less than 5 ppm CO. The effects of steam-to-carbon and part load operations were explored.

  6. Computation of Molecular Spectra on a Quantum Processor with an Error-Resilient Algorithm

    Science.gov (United States)

    Colless, J. I.; Ramasesh, V. V.; Dahlen, D.; Blok, M. S.; Kimchi-Schwartz, M. E.; McClean, J. R.; Carter, J.; de Jong, W. A.; Siddiqi, I.

    2018-02-01

    Harnessing the full power of nascent quantum processors requires the efficient management of a limited number of quantum bits with finite coherent lifetimes. Hybrid algorithms, such as the variational quantum eigensolver (VQE), leverage classical resources to reduce the required number of quantum gates. Experimental demonstrations of VQE have resulted in calculation of Hamiltonian ground states, and a new theoretical approach based on a quantum subspace expansion (QSE) has outlined a procedure for determining excited states that are central to dynamical processes. We use a superconducting-qubit-based processor to apply the QSE approach to the H2 molecule, extracting both ground and excited states without the need for auxiliary qubits or additional minimization. Further, we show that this extended protocol can mitigate the effects of incoherent errors, potentially enabling larger-scale quantum simulations without the need for complex error-correction techniques.

  7. Track recognition in 4 μs by a systolic trigger processor using a parallel Hough transform

    International Nuclear Information System (INIS)

    Klefenz, F.; Noffz, K.H.; Conen, W.; Zoz, R.; Kugel, A.; Maenner, R.; Univ. Heidelberg

    1993-01-01

    A parallel Hough transform processor has been developed that identifies circular particle tracks in a 2D projection of the OPAL jet chamber. The high-speed requirements imposed by the 8 bunch crossing mode of LEP could be fulfilled by computing the starting angle and the radius of curvature for each well defined track in less than 4 μs. The system consists of a Hough transform processor that determines well defined tracks, and a Euler processor that counts their number by applying the Euler relation to the thresholded result of the Hough transform. A prototype of a systolic processor has been built that handles one sector of the jet chamber. It consists of 35 x 32 processing elements that were loaded into 21 programmable gate arrays (XILINX). This processor runs at a clock rate of 40 MHz. It has been tested offline with about 1,000 original OPAL events. No deviations from the off-line simulation have been found. A trigger efficiency of 93% has been obtained. The prototype together with the associated drift time measurement unit has been installed at the OPAL detector at LEP and 100k events have been sampled to evaluate the system under detector conditions

  8. First level trigger processor for the ZEUS calorimeter

    International Nuclear Information System (INIS)

    Dawson, J.W.; Talaga, R.L.; Burr, G.W.; Laird, R.J.; Smith, W.; Lackey, J.

    1990-01-01

    This paper discusses the design of the first level trigger processor for the ZEUS calorimeter. This processor accepts data from the 13,000 photomultipliers of the calorimeter which is topologically divided into 16 regions, and after regional preprocessing, performs logical and numerical operations which cross regional boundaries. Because the crossing period at the HERA collider is 96 ns, it is necessary that first-level trigger decisions be made in pipelined hardware. One microsecond is allowed for the processor to perform the required logical and numerical operations, during which time the data from ten crossings would be resident in the processor while being clocked through the pipelined hardware. The circuitry is implemented in 100K ECL, Advanced CMOS discrete devices, and programmable gate arrays, and operates in a VME environment. All tables and registers are written/read from VME, and all diagnostic codes are executed from VME. Preprocessed data flows into the processor at a rate of 5.2GB/s, and processed data flows from the processor to the Global First-Level Trigger at a rate of 700MB/s. The system allows for subsets of the logic to be configured by software and for various important variables to be histogrammed as they flow through the processor. 2 refs., 3 figs

  9. A dedicated line-processor as used at the SHF

    International Nuclear Information System (INIS)

    Bevan, A.V.; Hatley, R.W.; Price, D.R.; Rankin, P.

    1985-01-01

    A hardwired trigger processor was used at the SLAC Hybrid Facility to find evidence for charged tracks originating from the fiducial volume of a 40'' rapidcycling bubble chamber. Straight-line projections of these tracks in the plane perpendicular to the applied magnetic field were searched for using data from three sets of proportional wire chambers (PWC). This information was made directly available to the processor by means of a special digitizing card. The results memory of the processor simulated read-only memory in a 168/E processor and was accessible by it. The 168/E controlled the issuing of a trigger command to the bubble chamber flash tubes. The same design of digitizer card used by the line processor was incorporated into the 168/E, again as read only memory, which allowed it access to the raw data for continual monitoring of trigger integrity. The design logic of the trigger processor was verified by running real PWC data through a FORTRAN simulation of the hardware. This enabled the debugging to become highly automated since a step by step, computer controlled comparison of processor registers to simulation predictions could be made

  10. First-level trigger processor for the ZEUS calorimeter

    International Nuclear Information System (INIS)

    Dawson, J.W.; Talaga, R.L.; Burr, G.W.; Laird, R.J.; Smith, W.; Lackey, J.

    1990-01-01

    The design of the first-level trigger processor for the Zeus calorimeter is discussed. This processor accepts data from the 13,000 photomultipliers of the calorimeter, which is topologically divided into 16 regions, and after regional preprocessing performs logical and numerical operations that cross regional boundaries. Because the crossing period at the HERA collider is 96 ns, it is necessary that first-level trigger decisions be made in pipelined hardware. One microsecond is allowed for the processor to perform the required logical and numerical operations, during which time the data from ten crossings would be resident in the processor while being clocked through the pipelined hardware. The circuitry is implemented in 100K emitter-coupled logic (ECL), advanced CMOS discrete devices and programmable gate arrays, and operates in a VME environment. All tables and registers are written/read from VME, and all diagnostic codes are executed from VME. Preprocessed data flows into the processor at a rate of 5.2 Gbyte/s, and processed data flows from the processor to the global first-level trigger at a rate of 70 Mbyte/s. The system allows for subsets of the logic to be configured by software and for various important variables to be histogrammed as they flow through the processor

  11. Novel memory architecture for video signal processor

    Science.gov (United States)

    Hung, Jen-Sheng; Lin, Chia-Hsing; Jen, Chein-Wei

    1993-11-01

    An on-chip memory architecture for video signal processor (VSP) is proposed. This memory structure is a two-level design for the different data locality in video applications. The upper level--Memory A provides enough storage capacity to reduce the impact on the limitation of chip I/O bandwidth, and the lower level--Memory B provides enough data parallelism and flexibility to meet the requirements of multiple reconfigurable pipeline function units in a single VSP chip. The needed memory size is decided by the memory usage analysis for video algorithms and the number of function units. Both levels of memory adopted a dual-port memory scheme to sustain the simultaneous read and write operations. Especially, Memory B uses multiple one-read-one-write memory banks to emulate the real multiport memory. Therefore, one can change the configuration of Memory B to several sets of memories with variable read/write ports by adjusting the bus switches. Then the numbers of read ports and write ports in proposed memory can meet requirement of data flow patterns in different video coding algorithms. We have finished the design of a prototype memory design using 1.2- micrometers SPDM SRAM technology and will fabricated it through TSMC, in Taiwan.

  12. A CNN-Specific Integrated Processor

    Directory of Open Access Journals (Sweden)

    Suleyman Malki

    2009-01-01

    Full Text Available Integrated Processors (IP are algorithm-specific cores that either by programming or by configuration can be re-used within many microelectronic systems. This paper looks at Cellular Neural Networks (CNN to become realized as IP. First current digital implementations are reviewed, and the memoryprocessor bandwidth issues are analyzed. Then a generic view is taken on the structure of the network, and a new intra-communication protocol based on rotating wheels is proposed. It is shown that this provides for guaranteed high-performance with a minimal network interface. The resulting node is small and supports multi-level CNN designs, giving the system a 30-fold increase in capacity compared to classical designs. As it facilitates multiple operations on a single image, and single operations on multiple images, with minimal access to the external image memory, balancing the internal and external data transfer requirements optimizes the system operation. In conventional digital CNN designs, the treatment of boundary nodes requires additional logic to handle the CNN value propagation scheme. In the new architecture, only a slight modification of the existing cells is necessary to model the boundary effect. A typical prototype for visual pattern recognition will house 4096 CNN cells with a 2% overhead for making it an IP.

  13. The ATLAS fast tracker processor design

    CERN Document Server

    Volpi, Guido; Albicocco, Pietro; Alison, John; Ancu, Lucian Stefan; Anderson, James; Andari, Nansi; Andreani, Alessandro; Andreazza, Attilio; Annovi, Alberto; Antonelli, Mario; Asbah, Needa; Atkinson, Markus; Baines, J; Barberio, Elisabetta; Beccherle, Roberto; Beretta, Matteo; Biesuz, Nicolo Vladi; Blair, R E; Bogdan, Mircea; Boveia, Antonio; Britzger, Daniel; Bryant, Partick; Burghgrave, Blake; Calderini, Giovanni; Camplani, Alessandra; Cavaliere, Viviana; Cavasinni, Vincenzo; Chakraborty, Dhiman; Chang, Philip; Cheng, Yangyang; Citraro, Saverio; Citterio, Mauro; Crescioli, Francesco; Dawe, Noel; Dell'Orso, Mauro; Donati, Simone; Dondero, Paolo; Drake, G; Gadomski, Szymon; Gatta, Mauro; Gentsos, Christos; Giannetti, Paola; Gkaitatzis, Stamatios; Gramling, Johanna; Howarth, James William; Iizawa, Tomoya; Ilic, Nikolina; Jiang, Zihao; Kaji, Toshiaki; Kasten, Michael; Kawaguchi, Yoshimasa; Kim, Young Kee; Kimura, Naoki; Klimkovich, Tatsiana; Kolb, Mathis; Kordas, K; Krizka, Karol; Kubota, T; Lanza, Agostino; Li, Ho Ling; Liberali, Valentino; Lisovyi, Mykhailo; Liu, Lulu; Love, Jeremy; Luciano, Pierluigi; Luongo, Carmela; Magalotti, Daniel; Maznas, Ioannis; Meroni, Chiara; Mitani, Takashi; Nasimi, Hikmat; Negri, Andrea; Neroutsos, Panos; Neubauer, Mark; Nikolaidis, Spiridon; Okumura, Y; Pandini, Carlo; Petridou, Chariclia; Piendibene, Marco; Proudfoot, James; Rados, Petar Kevin; Roda, Chiara; Rossi, Enrico; Sakurai, Yuki; Sampsonidis, Dimitrios; Saxon, James; Schmitt, Stefan; Schoening, Andre; Shochet, Mel; Shoijaii, Jafar; Soltveit, Hans Kristian; Sotiropoulou, Calliope-Louisa; Stabile, Alberto; Swiatlowski, Maximilian J; Tang, Fukun; Taylor, Pierre Thor Elliot; Testa, Marianna; Tompkins, Lauren; Vercesi, V; Wang, Rui; Watari, Ryutaro; Zhang, Jianhong; Zeng, Jian Cong; Zou, Rui; Bertolucci, Federico

    2015-01-01

    The extended use of tracking information at the trigger level in the LHC is crucial for the trigger and data acquisition (TDAQ) system to fulfill its task. Precise and fast tracking is important to identify specific decay products of the Higgs boson or new phenomena, as well as to distinguish the contributions coming from the many collisions that occur at every bunch crossing. However, track reconstruction is among the most demanding tasks performed by the TDAQ computing farm; in fact, complete reconstruction at full Level-1 trigger accept rate (100 kHz) is not possible. In order to overcome this limitation, the ATLAS experiment is planning the installation of a dedicated processor, the Fast Tracker (FTK), which is aimed at achieving this goal. The FTK is a pipeline of high performance electronics, based on custom and commercial devices, which is expected to reconstruct, with high resolution, the trajectories of charged-particle tracks with a transverse momentum above 1 GeV, using the ATLAS inner tracker info...

  14. Preventing Precipitation in the ISS Urine Processor

    Science.gov (United States)

    Muirhead, Dean; Carter, Layne; Williamson, Jill; Chambers, Antja

    2017-01-01

    The ISS Urine Processor Assembly (UPA) was initially designed to achieve 85% recovery of water from pretreated urine on ISS. Pretreated urine is comprised of crew urine treated with flush water, an oxidant (chromium trioxide), and an inorganic acid (sulfuric acid) to control microbial growth and inhibit precipitation. Unfortunately, initial operation of the UPA on ISS resulted in the precipitation of calcium sulfate at 85% recovery. This occurred because the calcium concentration in the crew urine was elevated in microgravity due to bone loss. The higher calcium concentration precipitated with sulfate from the pretreatment acid, resulting in a failure of the UPA due to the accumulation of solids in the Distillation Assembly. Since this failure, the UPA has been limited to a reduced recovery of water from urine to prevent calcium sulfate from reaching the solubility limit. NASA personnel have worked to identify a solution that would allow the UPA to return to a nominal recovery rate of 85%. This effort has culminated with the development of a pretreatment based on phosphoric acid instead of sulfuric acid. By eliminating the sulfate associated with the pretreatment, the brine can be concentrated to a much higher concentration before calcium sulfate reach the solubility limit. This paper summarizes the development of this pretreatment and the testing performed to verify its implementation on ISS.

  15. Multipurpose silicon photonics signal processor core.

    Science.gov (United States)

    Pérez, Daniel; Gasulla, Ivana; Crudgington, Lee; Thomson, David J; Khokhar, Ali Z; Li, Ke; Cao, Wei; Mashanovich, Goran Z; Capmany, José

    2017-09-21

    Integrated photonics changes the scaling laws of information and communication systems offering architectural choices that combine photonics with electronics to optimize performance, power, footprint, and cost. Application-specific photonic integrated circuits, where particular circuits/chips are designed to optimally perform particular functionalities, require a considerable number of design and fabrication iterations leading to long development times. A different approach inspired by electronic Field Programmable Gate Arrays is the programmable photonic processor, where a common hardware implemented by a two-dimensional photonic waveguide mesh realizes different functionalities through programming. Here, we report the demonstration of such reconfigurable waveguide mesh in silicon. We demonstrate over 20 different functionalities with a simple seven hexagonal cell structure, which can be applied to different fields including communications, chemical and biomedical sensing, signal processing, multiprocessor networks, and quantum information systems. Our work is an important step toward this paradigm.Integrated optical circuits today are typically designed for a few special functionalities and require complex design and development procedures. Here, the authors demonstrate a reconfigurable but simple silicon waveguide mesh with different functionalities.

  16. Element Load Data Processor (ELDAP) Users Manual

    Science.gov (United States)

    Ramsey, John K., Jr.; Ramsey, John K., Sr.

    2015-01-01

    Often, the shear and tensile forces and moments are extracted from finite element analyses to be used in off-line calculations for evaluating the integrity of structural connections involving bolts, rivets, and welds. Usually the maximum forces and moments are desired for use in the calculations. In situations where there are numerous structural connections of interest for numerous load cases, the effort in finding the true maximum force and/or moment combinations among all fasteners and welds and load cases becomes difficult. The Element Load Data Processor (ELDAP) software described herein makes this effort manageable. This software eliminates the possibility of overlooking the worst-case forces and moments that could result in erroneous positive margins of safety and/or selecting inconsistent combinations of forces and moments resulting in false negative margins of safety. In addition to forces and moments, any scalar quantity output in a PATRAN report file may be evaluated with this software. This software was originally written to fill an urgent need during the structural analysis of the Ares I-X Interstage segment. As such, this software was coded in a straightforward manner with no effort made to optimize or minimize code or to develop a graphical user interface.

  17. Virtual Modular Redundancy of Processor Module in the PLC

    International Nuclear Information System (INIS)

    Lee, Kwang-Il; Hwang, SungJae; Yoon, DongHwa

    2016-01-01

    Dual Modular Redundancy (DMR) is mainly used to implement these safety control systems. DMR is conveyed when components of a system are duplicated, providing another component in case one should fault or fail. This feature has a high availability and large fault tolerant. It provides zero downtime that is required for nuclear power plants. So nuclear power plant has been commercialized by multiple redundant systems. The following paper, we propose a Virtual Modular Redundancy (VMR) rather than physical triple of the Programmable Logic Controller (PLC) processor module to ensure the reliability of the nuclear power plant control system. VMR implementation minimizes design changes to continue to use the commercially available redundant system. Also, the purpose of the VMR is to improve the efficiency and reliability in many ways, such as fault tolerant and fail-safe and cost. VMR guarantees a wide range of reliable fault recovery, fault tolerance, etc. It is prevented before it causes great damages due to the continuous failure of the two modules. The reliable communication speed is slow and also it has a small bandwidth. It is a great loss in the safety control system. However, VMR aims to avoid nuclear power plants that were suspended due to fail-safe. It is not for the purpose of commonly used. Application of VMR is actually expected to require a lot of research and trial and error until they adapt to the nuclear regulatory and standards

  18. Algorithms for computational fluid dynamics n parallel processors

    International Nuclear Information System (INIS)

    Van de Velde, E.F.

    1986-01-01

    A study of parallel algorithms for the numerical solution of partial differential equations arising in computational fluid dynamics is presented. The actual implementation on parallel processors of shared and nonshared memory design is discussed. The performance of these algorithms is analyzed in terms of machine efficiency, communication time, bottlenecks and software development costs. For elliptic equations, a parallel preconditioned conjugate gradient method is described, which has been used to solve pressure equations discretized with high order finite elements on irregular grids. A parallel full multigrid method and a parallel fast Poisson solver are also presented. Hyperbolic conservation laws were discretized with parallel versions of finite difference methods like the Lax-Wendroff scheme and with the Random Choice method. Techniques are developed for comparing the behavior of an algorithm on different architectures as a function of problem size and local computational effort. Effective use of these advanced architecture machines requires the use of machine dependent programming. It is shown that the portability problems can be minimized by introducing high level operations on vectors and matrices structured into program libraries

  19. Virtual Modular Redundancy of Processor Module in the PLC

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Kwang-Il; Hwang, SungJae; Yoon, DongHwa [SOOSAN ENS Co., Seoul (Korea, Republic of)

    2016-10-15

    Dual Modular Redundancy (DMR) is mainly used to implement these safety control systems. DMR is conveyed when components of a system are duplicated, providing another component in case one should fault or fail. This feature has a high availability and large fault tolerant. It provides zero downtime that is required for nuclear power plants. So nuclear power plant has been commercialized by multiple redundant systems. The following paper, we propose a Virtual Modular Redundancy (VMR) rather than physical triple of the Programmable Logic Controller (PLC) processor module to ensure the reliability of the nuclear power plant control system. VMR implementation minimizes design changes to continue to use the commercially available redundant system. Also, the purpose of the VMR is to improve the efficiency and reliability in many ways, such as fault tolerant and fail-safe and cost. VMR guarantees a wide range of reliable fault recovery, fault tolerance, etc. It is prevented before it causes great damages due to the continuous failure of the two modules. The reliable communication speed is slow and also it has a small bandwidth. It is a great loss in the safety control system. However, VMR aims to avoid nuclear power plants that were suspended due to fail-safe. It is not for the purpose of commonly used. Application of VMR is actually expected to require a lot of research and trial and error until they adapt to the nuclear regulatory and standards.

  20. Nonlinear Wave Simulation on the Xeon Phi Knights Landing Processor

    Science.gov (United States)

    Hristov, Ivan; Goranov, Goran; Hristova, Radoslava

    2018-02-01

    We consider an interesting from computational point of view standing wave simulation by solving coupled 2D perturbed Sine-Gordon equations. We make an OpenMP realization which explores both thread and SIMD levels of parallelism. We test the OpenMP program on two different energy equivalent Intel architectures: 2× Xeon E5-2695 v2 processors, (code-named "Ivy Bridge-EP") in the Hybrilit cluster, and Xeon Phi 7250 processor (code-named "Knights Landing" (KNL). The results show 2 times better performance on KNL processor.

  1. Nonlinear Wave Simulation on the Xeon Phi Knights Landing Processor

    Directory of Open Access Journals (Sweden)

    Hristov Ivan

    2018-01-01

    Full Text Available We consider an interesting from computational point of view standing wave simulation by solving coupled 2D perturbed Sine-Gordon equations. We make an OpenMP realization which explores both thread and SIMD levels of parallelism. We test the OpenMP program on two different energy equivalent Intel architectures: 2× Xeon E5-2695 v2 processors, (code-named “Ivy Bridge-EP” in the Hybrilit cluster, and Xeon Phi 7250 processor (code-named “Knights Landing” (KNL. The results show 2 times better performance on KNL processor.

  2. Optimization and parallelization of B-spline based orbital evaluations in QMC on multi/many-core shared memory processors

    OpenAIRE

    Mathuriya, Amrita; Luo, Ye; Benali, Anouar; Shulenburger, Luke; Kim, Jeongnim

    2016-01-01

    B-spline based orbital representations are widely used in Quantum Monte Carlo (QMC) simulations of solids, historically taking as much as 50% of the total run time. Random accesses to a large four-dimensional array make it challenging to efficiently utilize caches and wide vector units of modern CPUs. We present node-level optimizations of B-spline evaluations on multi/many-core shared memory processors. To increase SIMD efficiency and bandwidth utilization, we first apply data layout transfo...

  3. Median and Morphological Specialized Processors for a Real-Time Image Data Processing

    Directory of Open Access Journals (Sweden)

    Kazimierz Wiatr

    2002-01-01

    Full Text Available This paper presents the considerations on selecting a multiprocessor MISD architecture for fast implementation of the vision image processing. Using the author′s earlier experience with real-time systems, implementing of specialized hardware processors based on the programmable FPGA systems has been proposed in the pipeline architecture. In particular, the following processors are presented: median filter and morphological processor. The structure of a universal reconfigurable processor developed has been proposed as well. Experimental results are presented as delays on LCA level implementation for median filter, morphological processor, convolution processor, look-up-table processor, logic processor and histogram processor. These times compare with delays in general purpose processor and DSP processor.

  4. Reconfigurable VLIW Processor for Software Defined Radio, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — We will design and formally verify a VLIW processor that is radiation-hardened, and where the VLIW instructions consist of predicated RISC instructions from the...

  5. Detailed algorithmic description of a processor: a recipe for ...

    African Journals Online (AJOL)

    International Journal of Natural and Applied Sciences ... a simple developed compiler could generate the code of a simple programming language. ... It should be noted that such code generation must be done on a particular processor- for ...

  6. Analysis of Intel IA-64 Processor Support for Secure Systems

    National Research Council Canada - National Science Library

    Unalmis, Bugra

    2001-01-01

    .... Systems could be constructed for which serious security threats would be eliminated. This thesis explores the Intel IA-64 processor's hardware support and its relationship to software for building a secure system...

  7. Fast parallel computation of polynomials using few processors

    DEFF Research Database (Denmark)

    Valiant, Leslie; Skyum, Sven

    1981-01-01

    It is shown that any multivariate polynomial that can be computed sequentially in C steps and has degree d can be computed in parallel in 0((log d) (log C + log d)) steps using only (Cd)0(1) processors....

  8. Optical backplane interconnect switch for data processors and computers

    Science.gov (United States)

    Hendricks, Herbert D.; Benz, Harry F.; Hammer, Jacob M.

    1989-01-01

    An optoelectronic integrated device design is reported which can be used to implement an all-optical backplane interconnect switch. The switch is sized to accommodate an array of processors and memories suitable for direct replacement into the basic avionic multiprocessor backplane. The optical backplane interconnect switch is also suitable for direct replacement of the PI bus traffic switch and at the same time, suitable for supporting pipelining of the processor and memory. The 32 bidirectional switchable interconnects are configured with broadcast capability for controls, reconfiguration, and messages. The approach described here can handle a serial interconnection of data processors or a line-to-link interconnection of data processors. An optical fiber demonstration of this approach is presented.

  9. High-speed packet filtering utilizing stream processors

    Science.gov (United States)

    Hummel, Richard J.; Fulp, Errin W.

    2009-04-01

    Parallel firewalls offer a scalable architecture for the next generation of high-speed networks. While these parallel systems can be implemented using multiple firewalls, the latest generation of stream processors can provide similar benefits with a significantly reduced latency due to locality. This paper describes how the Cell Broadband Engine (CBE), a popular stream processor, can be used as a high-speed packet filter. Results show the CBE can potentially process packets arriving at a rate of 1 Gbps with a latency less than 82 μ-seconds. Performance depends on how well the packet filtering process is translated to the unique stream processor architecture. For example the method used for transmitting data and control messages among the pseudo-independent processor cores has a significant impact on performance. Experimental results will also show the current limitations of a CBE operating system when used to process packets. Possible solutions to these issues will be discussed.

  10. 2009 Survey of Gulf of Mexico Dockside Seafood Processors

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — This survey gathered and analyze economic data from seafood processors throughout the states in the Gulf region. The survey sought to collect financial variables...

  11. Huffman-based code compression techniques for embedded processors

    KAUST Repository

    Bonny, Mohamed Talal; Henkel, Jö rg

    2010-01-01

    % for ARM and MIPS, respectively. In our compression technique, we have conducted evaluations using a representative set of applications and we have applied each technique to two major embedded processor architectures, namely ARM and MIPS. © 2010 ACM.

  12. High-Performance Linear Algebra Processor using FPGA

    National Research Council Canada - National Science Library

    Johnson, J

    2004-01-01

    With recent advances in FPGA (Field Programmable Gate Array) technology it is now feasible to use these devices to build special purpose processors for floating point intensive applications that arise in scientific computing...

  13. Particle simulation on a distributed memory highly parallel processor

    International Nuclear Information System (INIS)

    Sato, Hiroyuki; Ikesaka, Morio

    1990-01-01

    This paper describes parallel molecular dynamics simulation of atoms governed by local force interaction. The space in the model is divided into cubic subspaces and mapped to the processor array of the CAP-256, a distributed memory, highly parallel processor developed at Fujitsu Labs. We developed a new technique to avoid redundant calculation of forces between atoms in different processors. Experiments showed the communication overhead was less than 5%, and the idle time due to load imbalance was less than 11% for two model problems which contain 11,532 and 46,128 argon atoms. From the software simulation, the CAP-II which is under development is estimated to be about 45 times faster than CAP-256 and will be able to run the same problem about 40 times faster than Fujitsu's M-380 mainframe when 256 processors are used. (author)

  14. Radiation Tolerant Software Defined Video Processor, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — MaXentric's is proposing a radiation tolerant Software Define Video Processor, codenamed SDVP, for the problem of advanced motion imaging in the space environment....

  15. Assembly processor program converts symbolic programming language to machine language

    Science.gov (United States)

    Pelto, E. V.

    1967-01-01

    Assembly processor program converts symbolic programming language to machine language. This program translates symbolic codes into computer understandable instructions, assigns locations in storage for successive instructions, and computer locations from symbolic addresses.

  16. Suboptimal processor for anomaly detection for system surveillance and diagnosis

    Energy Technology Data Exchange (ETDEWEB)

    Ciftcioglu, Oe.; Hoogenboom, J.E.; Dam, H. van

    1989-06-01

    Anomaly detection for nuclear reactor surveillance and diagnosis is described. The residual noise obtained as a result of autoregressive (AR) modelling is essential to obtain high sensitivity for anomaly detection. By means of the method of hypothesis testing a suboptimal anomaly detection processor is devised for system surveillance and diagnosis. Experiments are carried out to investigate the performance of the processor, which is in particular of interest for on-line and real-time applications.

  17. Reducing Competitive Cache Misses in Modern Processor Architectures

    OpenAIRE

    Prisagjanec, Milcho; Mitrevski, Pece

    2017-01-01

    The increasing number of threads inside the cores of a multicore processor, and competitive access to the shared cache memory, become the main reasons for an increased number of competitive cache misses and performance decline. Inevitably, the development of modern processor architectures leads to an increased number of cache misses. In this paper, we make an attempt to implement a technique for decreasing the number of competitive cache misses in the first level of cache memory. This tec...

  18. UA1 upgrade first-level calorimeter trigger processor

    International Nuclear Information System (INIS)

    Bains, N.; Charlton, D.; Ellis, N.; Garvey, J.; Gregory, J.; Jimack, M.P.; Jovanovic, P.; Kenyon, I.R.; Baird, S.A.; Campbell, D.; Cawthraw, M.; Coughlan, J.; Flynn, P.; Galagedera, S.; Grayer, G.; Halsall, R.; Shah, T.P.; Stephens, R.; Eisenhandler, E.; Fensome, I.; Landon, M.

    1989-01-01

    A new first-level trigger processor has been built for the UA1 experiment on the Cern SppS Collider. The processor exploits the fine granularity of the new UA1 uranium-TMP calorimeter to improve the selectivity of the trigger. The new electron trigger has improved hadron jet rejection, achieved by requiring low energy deposition around the electromagnetic cluster. A missing transverse energy trigger and a total energy trigger have also been implemented. (orig.)

  19. GA103: A microprogrammable processor for online filtering

    International Nuclear Information System (INIS)

    Calzas, A.; Danon, G.; Bouquet, B.

    1981-01-01

    GA 103 is a 16 bit microprogrammable processor which emulates the PDP 11 instruction set. It is based on the Am 2900 slices. It allows user-implemented microinstructions and addition of hardwired processors. It will perform on-line filtering tasks in the NA 14 experiment at CERN, based on the reconstruction of transverse momentum of photons detected in a lead glass calorimeter. (orig.)

  20. 16-Bit RISC Processor Design for Convolution Application

    OpenAIRE

    Anand Nandakumar Shardul

    2013-01-01

    In this project, we propose a 16-bit non-pipelined RISC processor, which is used for signal processing applications. The processor consists of the blocks, namely, program counter, clock control unit, ALU, IDU and registers. Advantageous architectural modifications have been made in the incremented circuit used in program counter and carry select adder unit of the ALU in the RISC CPU core. Furthermore, a high speed and low power modified modifies multiplier has been designed and introduced in ...

  1. The Serial Link Processor for the Fast TracKer (FTK) processor at ATLAS

    CERN Document Server

    Biesuz, Nicolo Vladi; The ATLAS collaboration; Luciano, Pierluigi; Magalotti, Daniel; Rossi, Enrico

    2015-01-01

    The Associative Memory (AM) system of the Fast Tracker (FTK) processor has been designed to perform pattern matching using the hit information of the ATLAS experiment silicon tracker. The AM is the heart of FTK and is mainly based on the use of ASICs (AM chips) designed on purpose to execute pattern matching with a high degree of parallelism. It finds track candidates at low resolution that are seeds for a full resolution track fitting. To solve the very challenging data traffic problems inside FTK, multiple board and chip designs have been performed. The currently proposed solution is named the “Serial Link Processor” and is based on an extremely powerful network of 2 Gb/s serial links. This paper reports on the design of the Serial Link Processor consisting of two types of boards, the Local Associative Memory Board (LAMB), a mezzanine where the AM chips are mounted, and the Associative Memory Board (AMB), a 9U VME board which holds and exercises four LAMBs. We report on the performance of the intermedia...

  2. The Serial Link Processor for the Fast TracKer (FTK) processor at ATLAS

    CERN Document Server

    Andreani, A; The ATLAS collaboration; Beccherle, R; Beretta, M; Cipriani, R; Citraro, S; Citterio, M; Colombo, A; Crescioli, F; Dimas, D; Donati, S; Giannetti, P; Kordas, K; Lanza, A; Liberali, V; Luciano, P; Magalotti, D; Neroutsos, P; Nikolaidis, S; Piendibene, M; Sakellariou, A; Shojaii, S; Sotiropoulou, C-L; Stabile, A

    2014-01-01

    The Associative Memory (AM) system of the FTK processor has been designed to perform pattern matching using the hit information of the ATLAS silicon tracker. The AM is the heart of the FTK and it finds track candidates at low resolution that are seeds for a full resolution track fitting. To solve the very challenging data traffic problems inside the FTK, multiple designs and tests have been performed. The currently proposed solution is named the “Serial Link Processor” and is based on an extremely powerful network of 2 Gb/s serial links. This paper reports on the design of the Serial Link Processor consisting of the AM chip, an ASIC designed and optimized to perform pattern matching, and two types of boards, the Local Associative Memory Board (LAMB), a mezzanine where the AM chips are mounted, and the Associative Memory Board (AMB), a 9U VME board which holds and exercises four LAMBs. Special relevance will be given to the AMchip design that includes two custom cells optimized for low consumption. We repo...

  3. The Serial Link Processor for the Fast TracKer (FTK) processor at ATLAS

    CERN Document Server

    Biesuz, Nicolo Vladi; The ATLAS collaboration; Luciano, Pierluigi; Magalotti, Daniel; Rossi, Enrico

    2015-01-01

    The Associative Memory (AM) system of the Fast Tracker (FTK) processor has been designed to perform pattern matching using the hit information of the ATLAS experiment silicon tracker. The AM is the heart of FTK and is mainly based on the use of ASICs (AM chips) designed to execute pattern matching with a high degree of parallelism. The AM system finds track candidates at low resolution that are seeds for a full resolution track fitting. To solve the very challenging data traffic problems inside FTK, multiple board and chip designs have been performed. The currently proposed solution is named the “Serial Link Processor” and is based on an extremely powerful network of 828 2 Gbit/s serial links for a total in/out bandwidth of 56 Gb/s. This paper reports on the design of the Serial Link Processor consisting of two types of boards, the Local Associative Memory Board (LAMB), a mezzanine where the AM chips are mounted, and the Associative Memory Board (AMB), a 9U VME board which holds and exercises four LAMBs. ...

  4. Unified and Modular Modeling and Functional Verification Framework of Real-Time Image Signal Processors

    Directory of Open Access Journals (Sweden)

    Abhishek Jain

    2016-01-01

    Full Text Available In VLSI industry, image signal processing algorithms are developed and evaluated using software models before implementation of RTL and firmware. After the finalization of the algorithm, software models are used as a golden reference model for the image signal processor (ISP RTL and firmware development. In this paper, we are describing the unified and modular modeling framework of image signal processing algorithms used for different applications such as ISP algorithms development, reference for hardware (HW implementation, reference for firmware (FW implementation, and bit-true certification. The universal verification methodology- (UVM- based functional verification framework of image signal processors using software reference models is described. Further, IP-XACT based tools for automatic generation of functional verification environment files and model map files are described. The proposed framework is developed both with host interface and with core using virtual register interface (VRI approach. This modeling and functional verification framework is used in real-time image signal processing applications including cellphone, smart cameras, and image compression. The main motivation behind this work is to propose the best efficient, reusable, and automated framework for modeling and verification of image signal processor (ISP designs. The proposed framework shows better results and significant improvement is observed in product verification time, verification cost, and quality of the designs.

  5. Design and development of a diesel and JP-8 logistic fuel processor

    Science.gov (United States)

    Roychoudhury, Subir; Lyubovsky, Maxim; Walsh, D.; Chu, Deryn; Kallio, Erik

    The paper describes the design and performance of a breadboard prototype for a 5 kW fuel-processor for powering a solid oxide fuel cell (SOFC) stack. The system was based on a small, modular catalytic Microlith auto-thermal (ATR) reactor with the versatility of operating on diesel, Jet-A or JP-8 fuels. The reforming reactor utilized Microlith substrates and catalyst technology (patented and trademarked). These reactors have demonstrated the capability of efficiently reforming liquid and gaseous hydrocarbon fuels at exceptionally high power densities. The performance characteristics of the auto-thermal reactor (ATR) have been presented along with durability data. The fuel processor integrates fuel preparation, steam generation, sulfur removal, pumps, blowers and controls. The system design was developed via ASPEN ® Engineering Suite process simulation software and was analyzed with reference to system balance requirements. Since the fuel processor has not been integrated with a fuel cell, aspects of thermal integration with the stack have not been specifically addressed.

  6. Reconfigurable signal processor designs for advanced digital array radar systems

    Science.gov (United States)

    Suarez, Hernan; Zhang, Yan (Rockee); Yu, Xining

    2017-05-01

    The new challenges originated from Digital Array Radar (DAR) demands a new generation of reconfigurable backend processor in the system. The new FPGA devices can support much higher speed, more bandwidth and processing capabilities for the need of digital Line Replaceable Unit (LRU). This study focuses on using the latest Altera and Xilinx devices in an adaptive beamforming processor. The field reprogrammable RF devices from Analog Devices are used as analog front end transceivers. Different from other existing Software-Defined Radio transceivers on the market, this processor is designed for distributed adaptive beamforming in a networked environment. The following aspects of the novel radar processor will be presented: (1) A new system-on-chip architecture based on Altera's devices and adaptive processing module, especially for the adaptive beamforming and pulse compression, will be introduced, (2) Successful implementation of generation 2 serial RapidIO data links on FPGA, which supports VITA-49 radio packet format for large distributed DAR processing. (3) Demonstration of the feasibility and capabilities of the processor in a Micro-TCA based, SRIO switching backplane to support multichannel beamforming in real-time. (4) Application of this processor in ongoing radar system development projects, including OU's dual-polarized digital array radar, the planned new cylindrical array radars, and future airborne radars.

  7. PixonVision real-time video processor

    Science.gov (United States)

    Puetter, R. C.; Hier, R. G.

    2007-09-01

    PixonImaging LLC and DigiVision, Inc. have developed a real-time video processor, the PixonVision PV-200, based on the patented Pixon method for image deblurring and denoising, and DigiVision's spatially adaptive contrast enhancement processor, the DV1000. The PV-200 can process NTSC and PAL video in real time with a latency of 1 field (1/60 th of a second), remove the effects of aerosol scattering from haze, mist, smoke, and dust, improve spatial resolution by up to 2x, decrease noise by up to 6x, and increase local contrast by up to 8x. A newer version of the processor, the PV-300, is now in prototype form and can handle high definition video. Both the PV-200 and PV-300 are FPGA-based processors, which could be spun into ASICs if desired. Obvious applications of these processors include applications in the DOD (tanks, aircraft, and ships), homeland security, intelligence, surveillance, and law enforcement. If developed into an ASIC, these processors will be suitable for a variety of portable applications, including gun sights, night vision goggles, binoculars, and guided munitions. This paper presents a variety of examples of PV-200 processing, including examples appropriate to border security, battlefield applications, port security, and surveillance from unmanned aerial vehicles.

  8. Review of trigger and on-line processors at SLAC

    International Nuclear Information System (INIS)

    Lankford, A.J.

    1984-07-01

    The role of trigger and on-line processors in reducing data rates to manageable proportions in e + e - physics experiments is defined not by high physics or background rates, but by the large event sizes of the general-purpose detectors employed. The rate of e + e - annihilation is low, and backgrounds are not high; yet the number of physics processes which can be studied is vast and varied. This paper begins by briefly describing the role of trigger processors in the e + e - context. The usual flow of the trigger decision process is illustrated with selected examples of SLAC trigger processing. The features are mentioned of triggering at the SLC and the trigger processing plans of the two SLC detectors: The Mark II and the SLD. The most common on-line processors at SLAC, the BADC, the SLAC Scanner Processor, the SLAC FASTBUS Controller, and the VAX CAMAC Channel, are discussed. Uses of the 168/E, 3081/E, and FASTBUS VAX processors are mentioned. The manner in which these processors are interfaced and the function they serve on line is described. Finally, the accelerator control system for the SLC is outlined. This paper is a survey in nature, and hence, relies heavily upon references to previous publications for detailed description of work mentioned here. 27 references, 9 figures, 1 table

  9. High-Speed General Purpose Genetic Algorithm Processor.

    Science.gov (United States)

    Hoseini Alinodehi, Seyed Pourya; Moshfe, Sajjad; Saber Zaeimian, Masoumeh; Khoei, Abdollah; Hadidi, Khairollah

    2016-07-01

    In this paper, an ultrafast steady-state genetic algorithm processor (GAP) is presented. Due to the heavy computational load of genetic algorithms (GAs), they usually take a long time to find optimum solutions. Hardware implementation is a significant approach to overcome the problem by speeding up the GAs procedure. Hence, we designed a digital CMOS implementation of GA in [Formula: see text] process. The proposed processor is not bounded to a specific application. Indeed, it is a general-purpose processor, which is capable of performing optimization in any possible application. Utilizing speed-boosting techniques, such as pipeline scheme, parallel coarse-grained processing, parallel fitness computation, parallel selection of parents, dual-population scheme, and support for pipelined fitness computation, the proposed processor significantly reduces the processing time. Furthermore, by relying on a built-in discard operator the proposed hardware may be used in constrained problems that are very common in control applications. In the proposed design, a large search space is achievable through the bit string length extension of individuals in the genetic population by connecting the 32-bit GAPs. In addition, the proposed processor supports parallel processing, in which the GAs procedure can be run on several connected processors simultaneously.

  10. A UNIX-based prototype biomedical virtual image processor

    International Nuclear Information System (INIS)

    Fahy, J.B.; Kim, Y.

    1987-01-01

    The authors have developed a multiprocess virtual image processor for the IBM PC/AT, in order to maximize image processing software portability for biomedical applications. An interprocess communication scheme, based on two-way metacode exchange, has been developed and verified for this purpose. Application programs call a device-independent image processing library, which transfers commands over a shared data bridge to one or more Autonomous Virtual Image Processors (AVIP). Each AVIP runs as a separate process in the UNIX operating system, and implements the device-independent functions on the image processor to which it corresponds. Application programs can control multiple image processors at a time, change the image processor configuration used at any time, and are completely portable among image processors for which an AVIP has been implemented. Run-time speeds have been found to be acceptable for higher level functions, although rather slow for lower level functions, owing to the overhead associated with sending commands and data over the shared data bridge

  11. A digital retina-like low-level vision processor.

    Science.gov (United States)

    Mertoguno, S; Bourbakis, N G

    2003-01-01

    This correspondence presents the basic design and the simulation of a low level multilayer vision processor that emulates to some degree the functional behavior of a human retina. This retina-like multilayer processor is the lower part of an autonomous self-organized vision system, called Kydon, that could be used on visually impaired people with a damaged visual cerebral cortex. The Kydon vision system, however, is not presented in this paper. The retina-like processor consists of four major layers, where each of them is an array processor based on hexagonal, autonomous processing elements that perform a certain set of low level vision tasks, such as smoothing and light adaptation, edge detection, segmentation, line recognition and region-graph generation. At each layer, the array processor is a 2D array of k/spl times/m hexagonal identical autonomous cells that simultaneously execute certain low level vision tasks. Thus, the hardware design and the simulation at the transistor level of the processing elements (PEs) of the retina-like processor and its simulated functionality with illustrative examples are provided in this paper.

  12. Air-Lubricated Thermal Processor For Dry Silver Film

    Science.gov (United States)

    Siryj, B. W.

    1980-09-01

    Since dry silver film is processed by heat, it may be viewed on a light table only seconds after exposure. On the other hand, wet films require both bulky chemicals and substantial time before an image can be analyzed. Processing of dry silver film, although simple in concept, is not so simple when reduced to practice. The main concern is the effect of film temperature gradients on uniformity of optical film density. RCA has developed two thermal processors, different in implementation but based on the same philosophy. Pressurized air is directed to both sides of the film to support the film and to conduct the heat to the film. Porous graphite is used as the medium through which heat and air are introduced. The initial thermal processor was designed to process 9.5-inch-wide film moving at speeds ranging from 0.0034 to 0.008 inch per second. The processor configuration was curved to match the plane generated by the laser recording beam. The second thermal processor was configured to process 5-inch-wide film moving at a continuously variable rate ranging from 0.15 to 3.5 inches per second. Due to field flattening optics used in this laser recorder, the required film processing area was plane. In addition, this processor was sectioned in the direction of film motion, giving the processor the capability of varying both temperature and effective processing area.

  13. QERx- A Faster than Real-Time Emulator for Space Processors

    Science.gov (United States)

    Carvalho, B.; Pidgeon, A.; Robinson, P.

    2012-08-01

    Developing software for space systems is challenging. Especially because, in order to be sure it can cope with the harshness of the environment and the imperative requirements and constrains imposed by the platform were it will run, it needs to be tested exhaustively. Software Validation Facilities (SVF) are known to the industry and developers, and provide the means to run the On-Board Software (OBSW) in a realistic environment, allowing the development team to debug and test the software.But the challenge is to be able to keep up with the performance of the new processors (LEON2 and LEON3), which need to be emulated within the SVF. Such processor emulators are also used in Operational Simulators, used to support mission preparation and train mission operators. These simulators mimic the satellite and its behaviour, as realistically as possible. For test/operational efficiency reasons and because they will need to interact with external systems, both these uses cases require the processor emulators to provide real-time, or faster, performance.It is known to the industry that the performance of previously available emulators is not enough to cope with the performance of the new processors available in the market. SciSys approached this problem with dynamic translation technology trying to keep costs down by avoiding a hardware solution and keeping the integration flexibility of full software emulation.SciSys presented “QERx: A High Performance Emulator for Software Validation and Simulations” [1], in a previous DASIA event. Since then that idea has evolved and QERx has been successfully validated. SciSys is now presenting QERx as a product that can be tailored to fit different emulation needs. This paper will present QERx latest developments and current status.

  14. Experimental demonstration of selective quantum process tomography on an NMR quantum information processor

    Science.gov (United States)

    Gaikwad, Akshay; Rehal, Diksha; Singh, Amandeep; Arvind, Dorai, Kavita

    2018-02-01

    We present the NMR implementation of a scheme for selective and efficient quantum process tomography without ancilla. We generalize this scheme such that it can be implemented efficiently using only a set of measurements involving product operators. The method allows us to estimate any element of the quantum process matrix to a desired precision, provided a set of quantum states can be prepared efficiently. Our modified technique requires fewer experimental resources as compared to the standard implementation of selective and efficient quantum process tomography, as it exploits the special nature of NMR measurements to allow us to compute specific elements of the process matrix by a restrictive set of subsystem measurements. To demonstrate the efficacy of our scheme, we experimentally tomograph the processes corresponding to "no operation," a controlled-NOT (CNOT), and a controlled-Hadamard gate on a two-qubit NMR quantum information processor, with high fidelities.

  15. High-speed special-purpose processor for event selection by number of direct tracks

    International Nuclear Information System (INIS)

    Kalinnikov, V.A.; Krastev, V.R.; Chudakov, E.A.

    1986-01-01

    A processor which uses data on events from five detector planes is described. To increase economy and speed in parallel processing, the processor converts the input data to superposition code and recognizes tracks by a generated search mask. The resolving time of the processor is ≤300 nsec. The processor is CAMAC-compatible and uses ECL integrated circuits

  16. Resilient Afforable Cubesat Processor, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — Advanced Materials Applications, LLC (AMA) proposes the Resilient Affordable Computing Platform (RACP), a power-efficient high-performance space computer design for...

  17. THOR Fields and Wave Processor - FWP

    Science.gov (United States)

    Soucek, Jan; Rothkaehl, Hanna; Ahlen, Lennart; Balikhin, Michael; Carr, Christopher; Dekkali, Moustapha; Khotyaintsev, Yuri; Lan, Radek; Magnes, Werner; Morawski, Marek; Nakamura, Rumi; Uhlir, Ludek; Yearby, Keith; Winkler, Marek; Zaslavsky, Arnaud

    2017-04-01

    If selected, Turbulence Heating ObserveR (THOR) will become the first spacecraft mission dedicated to the study of plasma turbulence. The Fields and Waves Processor (FWP) is an integrated electronics unit for all electromagnetic field measurements performed by THOR. FWP will interface with all THOR fields sensors: electric field antennas of the EFI instrument, the MAG fluxgate magnetometer, and search-coil magnetometer (SCM), and perform signal digitization and on-board data processing. FWP box will house multiple data acquisition sub-units and signal analyzers all sharing a common power supply and data processing unit and thus a single data and power interface to the spacecraft. Integrating all the electromagnetic field measurements in a single unit will improve the consistency of field measurement and accuracy of time synchronization. The scientific value of highly sensitive electric and magnetic field measurements in space has been demonstrated by Cluster (among other spacecraft) and THOR instrumentation will further improve on this heritage. Large dynamic range of the instruments will be complemented by a thorough electromagnetic cleanliness program, which will prevent perturbation of field measurements by interference from payload and platform subsystems. Taking advantage of the capabilities of modern electronics and the large telemetry bandwidth of THOR, FWP will provide multi-component electromagnetic field waveforms and spectral data products at a high time resolution. Fully synchronized sampling of many signals will allow to resolve wave phase information and estimate wavelength via interferometric correlations between EFI probes. FWP will also implement a plasma resonance sounder and a digital plasma quasi-thermal noise analyzer designed to provide high cadence measurements of plasma density and temperature complementary to data from particle instruments. FWP will rapidly transmit information about magnetic field vector and spacecraft potential to the

  18. Performance of Artificial Intelligence Workloads on the Intel Core 2 Duo Series Desktop Processors

    OpenAIRE

    Abdul Kareem PARCHUR; Kuppangari Krishna RAO; Fazal NOORBASHA; Ram Asaray SINGH

    2010-01-01

    As the processor architecture becomes more advanced, Intel introduced its Intel Core 2 Duo series processors. Performance impact on Intel Core 2 Duo processors are analyzed using SPEC CPU INT 2006 performance numbers. This paper studied the behavior of Artificial Intelligence (AI) benchmarks on Intel Core 2 Duo series processors. Moreover, we estimated the task completion time (TCT) @1 GHz, @2 GHz and @3 GHz Intel Core 2 Duo series processors frequency. Our results show the performance scalab...

  19. Digital signal processor for silicon audio playback devices; Silicon audio saisei kikiyo digital signal processor

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2000-03-01

    The digital audio signal processor (DSP) TC9446F series has been developed silicon audio playback devices with a memory medium of, e.g., flash memory, DVD players, and AV devices, e.g., TV sets. It corresponds to AAC (advanced audio coding) (2ch) and MP3 (MPEG1 Layer3), as the audio compressing techniques being used for transmitting music through an internet. It also corresponds to compressed types, e.g., Dolby Digital, DTS (digital theater system) and MPEG2 audio, being adopted for, e.g., DVDs. It can carry a built-in audio signal processing program, e.g., Dolby ProLogic, equalizer, sound field controlling, and 3D sound. TC9446XB has been lined up anew. It adopts an FBGA (fine pitch ball grid array) package for portable audio devices. (translated by NEDO)

  20. Self-powered information measuring wireless networks using the distribution of tasks within multicore processors

    Science.gov (United States)

    Zhuravska, Iryna M.; Koretska, Oleksandra O.; Musiyenko, Maksym P.; Surtel, Wojciech; Assembay, Azat; Kovalev, Vladimir; Tleshova, Akmaral

    2017-08-01

    The article contains basic approaches to develop the self-powered information measuring wireless networks (SPIM-WN) using the distribution of tasks within multicore processors critical applying based on the interaction of movable components - as in the direction of data transmission as wireless transfer of energy coming from polymetric sensors. Base mathematic model of scheduling tasks within multiprocessor systems was modernized to schedule and allocate tasks between cores of one-crystal computer (SoC) to increase energy efficiency SPIM-WN objects.

  1. PP: A graphics post-processor for the EQ6 reaction path code

    International Nuclear Information System (INIS)

    Stockman, H.W.

    1994-09-01

    The PP code is a graphics post-processor and plotting program for EQ6, a popular reaction-path code. PP runs on personal computers, allocates memory dynamically, and can handle very large reaction path runs. Plots of simple variable groups, such as fluid and solid phase composition, can be obtained with as few as two keystrokes. Navigation through the list of reaction path variables is simple and efficient. Graphics files can be exported for inclusion in word processing documents and spreadsheets, and experimental data may be imported and superposed on the reaction path runs. The EQ6 thermodynamic database can be searched from within PP, to simplify interpretation of complex plots

  2. Hardware Realization of an FPGA Processor – Operating System Call Offload and Experiences

    DEFF Research Database (Denmark)

    Hindborg, Andreas Erik; Schleuniger, Pascal; Jensen, Nicklas Bo

    2014-01-01

    Field-programmable gate arrays, FPGAs, are attractive implementation platforms for low-volume signal and image processing applications. The structure of FPGAs allows for an efficient implementation of parallel algorithms. Sequential algorithms, on the other hand, often perform better...... core that can be integrated in many signal and data processing platforms on FPGAs. We also show how we allow the processor to use operating system services. For a set of SPLASH-2 and SPEC CPU2006 benchmarks we show a speedup of up to 64% over a similar Xilinx MicroBlaze implementation while using 27...

  3. Huffman-based code compression techniques for embedded processors

    KAUST Repository

    Bonny, Mohamed Talal

    2010-09-01

    The size of embedded software is increasing at a rapid pace. It is often challenging and time consuming to fit an amount of required software functionality within a given hardware resource budget. Code compression is a means to alleviate the problem by providing substantial savings in terms of code size. In this article we introduce a novel and efficient hardware-supported compression technique that is based on Huffman Coding. Our technique reduces the size of the generated decoding table, which takes a large portion of the memory. It combines our previous techniques, Instruction Splitting Technique and Instruction Re-encoding Technique into new one called Combined Compression Technique to improve the final compression ratio by taking advantage of both previous techniques. The instruction Splitting Technique is instruction set architecture (ISA)-independent. It splits the instructions into portions of varying size (called patterns) before Huffman coding is applied. This technique improves the final compression ratio by more than 20% compared to other known schemes based on Huffman Coding. The average compression ratios achieved using this technique are 48% and 50% for ARM and MIPS, respectively. The Instruction Re-encoding Technique is ISA-dependent. It investigates the benefits of reencoding unused bits (we call them reencodable bits) in the instruction format for a specific application to improve the compression ratio. Reencoding those bits can reduce the size of decoding tables by up to 40%. Using this technique, we improve the final compression ratios in comparison to the first technique to 46% and 45% for ARM and MIPS, respectively (including all overhead that incurs). The Combined Compression Technique improves the compression ratio to 45% and 42% for ARM and MIPS, respectively. In our compression technique, we have conducted evaluations using a representative set of applications and we have applied each technique to two major embedded processor architectures

  4. Architecture and VHDL behavioural validation of a parallel processor dedicated to computer vision

    International Nuclear Information System (INIS)

    Collette, Thierry

    1992-01-01

    Speeding up image processing is mainly obtained using parallel computers; SIMD processors (single instruction stream, multiple data stream) have been developed, and have proven highly efficient regarding low-level image processing operations. Nevertheless, their performances drop for most intermediate of high level operations, mainly when random data reorganisations in processor memories are involved. The aim of this thesis was to extend the SIMD computer capabilities to allow it to perform more efficiently at the image processing intermediate level. The study of some representative algorithms of this class, points out the limits of this computer. Nevertheless, these limits can be erased by architectural modifications. This leads us to propose SYMPATIX, a new SIMD parallel computer. To valid its new concept, a behavioural model written in VHDL - Hardware Description Language - has been elaborated. With this model, the new computer performances have been estimated running image processing algorithm simulations. VHDL modeling approach allows to perform the system top down electronic design giving an easy coupling between system architectural modifications and their electronic cost. The obtained results show SYMPATIX to be an efficient computer for low and intermediate level image processing. It can be connected to a high level computer, opening up the development of new computer vision applications. This thesis also presents, a top down design method, based on the VHDL, intended for electronic system architects. (author) [fr

  5. [Improving speech comprehension using a new cochlear implant speech processor].

    Science.gov (United States)

    Müller-Deile, J; Kortmann, T; Hoppe, U; Hessel, H; Morsnowski, A

    2009-06-01

    The aim of this multicenter clinical field study was to assess the benefits of the new Freedom 24 sound processor for cochlear implant (CI) users implanted with the Nucleus 24 cochlear implant system. The study included 48 postlingually profoundly deaf experienced CI users who demonstrated speech comprehension performance with their current speech processor on the Oldenburg sentence test (OLSA) in quiet conditions of at least 80% correct scores and who were able to perform adaptive speech threshold testing using the OLSA in noisy conditions. Following baseline measures of speech comprehension performance with their current speech processor, subjects were upgraded to the Freedom 24 speech processor. After a take-home trial period of at least 2 weeks, subject performance was evaluated by measuring the speech reception threshold with the Freiburg multisyllabic word test and speech intelligibility with the Freiburg monosyllabic word test at 50 dB and 70 dB in the sound field. The results demonstrated highly significant benefits for speech comprehension with the new speech processor. Significant benefits for speech comprehension were also demonstrated with the new speech processor when tested in competing background noise.In contrast, use of the Abbreviated Profile of Hearing Aid Benefit (APHAB) did not prove to be a suitably sensitive assessment tool for comparative subjective self-assessment of hearing benefits with each processor. Use of the preprocessing algorithm known as adaptive dynamic range optimization (ADRO) in the Freedom 24 led to additional improvements over the standard upgrade map for speech comprehension in quiet and showed equivalent performance in noise. Through use of the preprocessing beam-forming algorithm BEAM, subjects demonstrated a highly significant improved signal-to-noise ratio for speech comprehension thresholds (i.e., signal-to-noise ratio for 50% speech comprehension scores) when tested with an adaptive procedure using the Oldenburg

  6. Low voltage 80 KV to 125 KV electron processors

    International Nuclear Information System (INIS)

    Lauppi, U.V.

    1999-01-01

    The classic electron beam technology made use of accelerating energies in the voltage range of 300 to 800 kV. The first EB processors - built for the curing of coatings - operated at 300 kV. The products to be treated were thicker than a simple layer of coating with thicknesses up to 100g and more. It was only in the beginning of the 1970's that industrial EB processors with accelerating voltages below 300 kV appeared on the market. Our company developed the first commercial electron accelerator without a beam scanner. The new EB machine featured a linear cathode, emitting a shower or 'curtain' of electrons over the full width of the product. These units were much smaller than anv previous EB processors and dedicated to the curing of coatings and other thin layers. ESI's first EB units operated with accelerating voltages between 150 and 200 kV. In 1993 ESI announced the introduction of a new generation of Electrocure. EB processors operating at 120 kV, and in 1998, at the RadTech North America '98 Conference in Chicago, the introduction of an 80 kV electron beam processor under the designation Microbeam LV

  7. Design of RISC Processor Using VHDL and Cadence

    Science.gov (United States)

    Moslehpour, Saeid; Puliroju, Chandrasekhar; Abu-Aisheh, Akram

    The project deals about development of a basic RISC processor. The processor is designed with basic architecture consisting of internal modules like clock generator, memory, program counter, instruction register, accumulator, arithmetic and logic unit and decoder. This processor is mainly used for simple general purpose like arithmetic operations and which can be further developed for general purpose processor by increasing the size of the instruction register. The processor is designed in VHDL by using Xilinx 8.1i version. The present project also serves as an application of the knowledge gained from past studies of the PSPICE program. The study will show how PSPICE can be used to simplify massive complex circuits designed in VHDL Synthesis. The purpose of the project is to explore the designed RISC model piece by piece, examine and understand the Input/ Output pins, and to show how the VHDL synthesis code can be converted to a simplified PSPICE model. The project will also serve as a collection of various research materials about the pieces of the circuit.

  8. Heat dissipation for the Intel Core i5 processor using multiwalled carbon-nanotube-based ethylene glycol

    International Nuclear Information System (INIS)

    Thang, Bui Hung; Trinh, Pham Van; Quang, Le Dinh; Khoi, Phan Hong; Minh, Phan Ngoc; Huong, Nguyen Thi

    2014-01-01

    Carbon nanotubes (CNTs) are some of the most valuable materials with high thermal conductivity. The thermal conductivity of individual multiwalled carbon nanotubes (MWCNTs) grown by using chemical vapor deposition is 600 ± 100 Wm -1 K -1 compared with the thermal conductivity 419 Wm -1 K -1 of Ag. Carbon-nanotube-based liquids - a new class of nanomaterials, have shown many interesting properties and distinctive features offering potential in heat dissipation applications for electronic devices, such as computer microprocessor, high power LED, etc. In this work, a multiwalled carbon-nanotube-based liquid was made of well-dispersed hydroxyl-functional multiwalled carbon nanotubes (MWCNT-OH) in ethylene glycol (EG)/distilled water (DW) solutions by using Tween-80 surfactant and an ultrasonication method. The concentration of MWCNT-OH in EG/DW solutions ranged from 0.1 to 1.2 gram/liter. The dispersion of the MWCNT-OH-based EG/DW solutions was evaluated by using a Zeta-Sizer analyzer. The MWCNT-OH-based EG/DW solutions were used as coolants in the liquid cooling system for the Intel Core i5 processor. The thermal dissipation efficiency and the thermal response of the system were evaluated by directly measuring the temperature of the micro-processor using the Core Temp software and the temperature sensors built inside the micro-processor. The results confirmed the advantages of CNTs in thermal dissipation systems for computer processors and other high-power electronic devices.

  9. An intercomparison of Canadian external dosimetry processors for radiation protection

    International Nuclear Information System (INIS)

    1989-10-01

    The five Canadian external dosimetry processors have participated in a two-stage intercomparison. The first stage involved dosimeters to known radiation fields under controlled laboratory conditions. The second stage involved exposing dosimeters to radiation fields in power reactor working environments. The results for each stage indicated the dose reported by each processor relative to an independently determined dose and relative to the others. The results of the intercomparisons confirm the original supposition: namely that the average differences in reported dose among five processors are much less than the uncertainty limits recommended by the ICRP. This report provides a description of the experimental methods as well as a discussion of the results for each stage. The report also includes a set of recommendations

  10. First Results of an “Artificial Retina” Processor Prototype

    International Nuclear Information System (INIS)

    Cenci, Riccardo; Bedeschi, Franco; Marino, Pietro; Morello, Michael J.; Ninci, Daniele; Piucci, Alessio; Punzi, Giovanni; Ristori, Luciano; Spinella, Franco; Stracka, Simone; Tonelli, Diego; Walsh, John

    2016-01-01

    We report on the performance of a specialized processor capable of reconstructing charged particle tracks in a realistic LHC silicon tracker detector, at the same speed of the readout and with sub-microsecond latency. The processor is based on an innovative pattern-recognition algorithm, called “artificial retina algorithm”, inspired from the vision system of mammals. A prototype of the processor has been designed, simulated, and implemented on Tel62 boards equipped with high-bandwidth Altera Stratix III FPGA devices. The prototype is the first step towards a real-time track reconstruction device aimed at processing complex events of high-luminosity LHC experiments at 40 MHz crossing rate

  11. Modal Processor Effects Inspired by Hammond Tonewheel Organs

    Directory of Open Access Journals (Sweden)

    Kurt James Werner

    2016-06-01

    Full Text Available In this design study, we introduce a novel class of digital audio effects that extend the recently introduced modal processor approach to artificial reverberation and effects processing. These pitch and distortion processing effects mimic the design and sonics of a classic additive-synthesis-based electromechanical musical instrument, the Hammond tonewheel organ. As a reverb effect, the modal processor simulates a room response as the sum of resonant filter responses. This architecture provides precise, interactive control over the frequency, damping, and complex amplitude of each mode. Into this framework, we introduce two types of processing effects: pitch effects inspired by the Hammond organ’s equal tempered “tonewheels”, “drawbar” tone controls, vibrato/chorus circuit, and distortion effects inspired by the pseudo-sinusoidal shape of its tonewheels and electromagnetic pickup distortion. The result is an effects processor that imprints the Hammond organ’s sonics onto any audio input.

  12. Safety-critical Java on a time-predictable processor

    DEFF Research Database (Denmark)

    Korsholm, Stephan E.; Schoeberl, Martin; Puffitsch, Wolfgang

    2015-01-01

    For real-time systems the whole execution stack needs to be time-predictable and analyzable for the worst-case execution time (WCET). This paper presents a time-predictable platform for safety-critical Java. The platform consists of (1) the Patmos processor, which is a time-predictable processor......; (2) a C compiler for Patmos with support for WCET analysis; (3) the HVM, which is a Java-to-C compiler; (4) the HVM-SCJ implementation which supports SCJ Level 0, 1, and 2 (for both single and multicore platforms); and (5) a WCET analysis tool. We show that real-time Java programs translated to C...... and compiled to a Patmos binary can be analyzed by the AbsInt aiT WCET analysis tool. To the best of our knowledge the presented system is the second WCET analyzable real-time Java system; and the first one on top of a RISC processor....

  13. A Bayesian sequential processor approach to spectroscopic portal system decisions

    Energy Technology Data Exchange (ETDEWEB)

    Sale, K; Candy, J; Breitfeller, E; Guidry, B; Manatt, D; Gosnell, T; Chambers, D

    2007-07-31

    The development of faster more reliable techniques to detect radioactive contraband in a portal type scenario is an extremely important problem especially in this era of constant terrorist threats. Towards this goal the development of a model-based, Bayesian sequential data processor for the detection problem is discussed. In the sequential processor each datum (detector energy deposit and pulse arrival time) is used to update the posterior probability distribution over the space of model parameters. The nature of the sequential processor approach is that a detection is produced as soon as it is statistically justified by the data rather than waiting for a fixed counting interval before any analysis is performed. In this paper the Bayesian model-based approach, physics and signal processing models and decision functions are discussed along with the first results of our research.

  14. Processor farming in two-level analysis of historical bridge

    Science.gov (United States)

    Krejčí, T.; Kruis, J.; Koudelka, T.; Šejnoha, M.

    2017-11-01

    This contribution presents a processor farming method in connection with a multi-scale analysis. In this method, each macro-scopic integration point or each finite element is connected with a certain meso-scopic problem represented by an appropriate representative volume element (RVE). The solution of a meso-scale problem provides then effective parameters needed on the macro-scale. Such an analysis is suitable for parallel computing because the meso-scale problems can be distributed among many processors. The application of the processor farming method to a real world masonry structure is illustrated by an analysis of Charles bridge in Prague. The three-dimensional numerical model simulates the coupled heat and moisture transfer of one half of arch No. 3. and it is a part of a complex hygro-thermo-mechanical analysis which has been developed to determine the influence of climatic loading on the current state of the bridge.

  15. A single chip pulse processor for nuclear spectroscopy

    International Nuclear Information System (INIS)

    Hilsenrath, F.; Bakke, J.C.; Voss, H.D.

    1985-01-01

    A high performance digital pulse processor, integrated into a single gate array microcircuit, has been developed for spaceflight applications. The new approach takes advantage of the latest CMOS high speed A/D flash converters and low-power gated logic arrays. The pulse processor measures pulse height, pulse area and the required timing information (e.g. multi detector coincidence and pulse pile-up detection). The pulse processor features high throughput rate (e.g. 0.5 Mhz for 2 usec gausssian pulses) and improved differential linearity (e.g. + or - 0.2 LSB for a + or - 1 LSB A/D). Because of the parallel digital architecture of the device, the interface is microprocessor bus compatible. A satellite flight application of this module is presented for use in the X-ray imager and high energy particle spectrometers of the PEM experiment on the Upper Atmospheric Research Satellite

  16. The ATLAS Level-1 Central Trigger Processor (CTP)

    CERN Document Server

    Spiwoks, Ralf; Ellis, Nick; Farthouat, P; Gällnö, P; Haller, J; Krasznahorkay, A; Maeno, T; Pauly, T; Pessoa-Lima, H; Resurreccion-Arcas, I; Schuler, G; De Seixas, J M; Torga-Teixeira, R; Wengler, T

    2005-01-01

    The ATLAS Level-1 Central Trigger Processor (CTP) combines information from calorimeter and muon trigger processors and makes the final Level-1 Accept (L1A) decision on the basis of lists of selection criteria (trigger menus). In addition to the event-selection decision, the CTP also provides trigger summary information to the Level-2 trigger and the data acquisition system. It further provides accumulated and bunch-by-bunch scaler data for monitoring of the trigger, detector and beam conditions. The CTP is presented and results are shown from tests with the calorimeter adn muon trigger processors connected to detectors in a particle beam, as well as from stand-alone full-system tests in the laboratory which were used to validate the CTP.

  17. Stepping motor control processor reference manual. Volume I

    International Nuclear Information System (INIS)

    Holloway, F.W.; VanArsdall, P.J.; Suski, G.J.; Gant, R.G.; Rash, M.

    1980-01-01

    This manual is intended to serve several purposes. The first goal is to describe the capabilities and operation of the SMC processor package from an operator or user point of view. Secondly, the manual will describe in some detail the basic hardware elements and how they can be used effectively to implement a step motor control system. Practical information on the use, installation and checkout of the hardware set is presented in the following sections along with programming suggestions. Available related system software is described in this manual for reference and as an aid in understanding the system architecture. Section two presents an overview and operations manual of the SMC processor describing its composition and functional capabilities. Section three contains hardware descriptions in some detail for the LLL-designed hardware used in the SMC processor. Basic theory of operation and important features are explained

  18. Embedded Processor Based Automatic Temperature Control of VLSI Chips

    Directory of Open Access Journals (Sweden)

    Narasimha Murthy Yayavaram

    2009-01-01

    Full Text Available This paper presents embedded processor based automatic temperature control of VLSI chips, using temperature sensor LM35 and ARM processor LPC2378. Due to the very high packing density, VLSI chips get heated very soon and if not cooled properly, the performance is very much affected. In the present work, the sensor which is kept very near proximity to the IC will sense the temperature and the speed of the fan arranged near to the IC is controlled based on the PWM signal generated by the ARM processor. A buzzer is also provided with the hardware, to indicate either the failure of the fan or overheating of the IC. The entire process is achieved by developing a suitable embedded C program.

  19. A Processor-Sharing Scheduling Strategy for NFV Nodes

    Directory of Open Access Journals (Sweden)

    Giuseppe Faraci

    2016-01-01

    Full Text Available The introduction of the two paradigms SDN and NFV to “softwarize” the current Internet is making management and resource allocation two key challenges in the evolution towards the Future Internet. In this context, this paper proposes Network-Aware Round Robin (NARR, a processor-sharing strategy, to reduce delays in traversing SDN/NFV nodes. The application of NARR alleviates the job of the Orchestrator by automatically working at the intranode level, dynamically assigning the processor slices to the virtual network functions (VNFs according to the state of the queues associated with the output links of the network interface cards (NICs. An extensive simulation set is presented to show the improvements achieved with respect to two more processor-sharing strategies chosen as reference.

  20. Satellite on-board real-time SAR processor prototype

    Science.gov (United States)

    Bergeron, Alain; Doucet, Michel; Harnisch, Bernd; Suess, Martin; Marchese, Linda; Bourqui, Pascal; Desnoyers, Nicholas; Legros, Mathieu; Guillot, Ludovic; Mercier, Luc; Châteauneuf, François

    2017-11-01

    A Compact Real-Time Optronic SAR Processor has been successfully developed and tested up to a Technology Readiness Level of 4 (TRL4), the breadboard validation in a laboratory environment. SAR, or Synthetic Aperture Radar, is an active system allowing day and night imaging independent of the cloud coverage of the planet. The SAR raw data is a set of complex data for range and azimuth, which cannot be compressed. Specifically, for planetary missions and unmanned aerial vehicle (UAV) systems with limited communication data rates this is a clear disadvantage. SAR images are typically processed electronically applying dedicated Fourier transformations. This, however, can also be performed optically in real-time. Originally the first SAR images were optically processed. The optical Fourier processor architecture provides inherent parallel computing capabilities allowing real-time SAR data processing and thus the ability for compression and strongly reduced communication bandwidth requirements for the satellite. SAR signal return data are in general complex data. Both amplitude and phase must be combined optically in the SAR processor for each range and azimuth pixel. Amplitude and phase are generated by dedicated spatial light modulators and superimposed by an optical relay set-up. The spatial light modulators display the full complex raw data information over a two-dimensional format, one for the azimuth and one for the range. Since the entire signal history is displayed at once, the processor operates in parallel yielding real-time performances, i.e. without resulting bottleneck. Processing of both azimuth and range information is performed in a single pass. This paper focuses on the onboard capabilities of the compact optical SAR processor prototype that allows in-orbit processing of SAR images. Examples of processed ENVISAT ASAR images are presented. Various SAR processor parameters such as processing capabilities, image quality (point target analysis), weight and

  1. Benchmarking NWP Kernels on Multi- and Many-core Processors

    Science.gov (United States)

    Michalakes, J.; Vachharajani, M.

    2008-12-01

    Increased computing power for weather, climate, and atmospheric science has provided direct benefits for defense, agriculture, the economy, the environment, and public welfare and convenience. Today, very large clusters with many thousands of processors are allowing scientists to move forward with simulations of unprecedented size. But time-critical applications such as real-time forecasting or climate prediction need strong scaling: faster nodes and processors, not more of them. Moreover, the need for good cost- performance has never been greater, both in terms of performance per watt and per dollar. For these reasons, the new generations of multi- and many-core processors being mass produced for commercial IT and "graphical computing" (video games) are being scrutinized for their ability to exploit the abundant fine- grain parallelism in atmospheric models. We present results of our work to date identifying key computational kernels within the dynamics and physics of a large community NWP model, the Weather Research and Forecast (WRF) model. We benchmark and optimize these kernels on several different multi- and many-core processors. The goals are to (1) characterize and model performance of the kernels in terms of computational intensity, data parallelism, memory bandwidth pressure, memory footprint, etc. (2) enumerate and classify effective strategies for coding and optimizing for these new processors, (3) assess difficulties and opportunities for tool or higher-level language support, and (4) establish a continuing set of kernel benchmarks that can be used to measure and compare effectiveness of current and future designs of multi- and many-core processors for weather and climate applications.

  2. Programmable Quantum Photonic Processor Using Silicon Photonics

    Science.gov (United States)

    2017-04-01

    8 Figure 6: (a) Proposed on-demand single photon source based on dynamic cavity storage . (b) Example of a gate implementation...electronic architectures tuned to implement artificial neural networks that improve upon both computational speed and energy efficiency. 3.6 All...states are in the dual- rail logic representation. Approved for Public Release; Distribution Unlimited. 6 Figure 3: Schematic of two-photon

  3. Nonlinear Wave Simulation on the Xeon Phi Knights Landing Processor

    OpenAIRE

    Hristov Ivan; Goranov Goran; Hristova Radoslava

    2018-01-01

    We consider an interesting from computational point of view standing wave simulation by solving coupled 2D perturbed Sine-Gordon equations. We make an OpenMP realization which explores both thread and SIMD levels of parallelism. We test the OpenMP program on two different energy equivalent Intel architectures: 2× Xeon E5-2695 v2 processors, (code-named “Ivy Bridge-EP”) in the Hybrilit cluster, and Xeon Phi 7250 processor (code-named “Knights Landing” (KNL). The results show 2 times better per...

  4. The Danish real-time SAR processor: first results

    DEFF Research Database (Denmark)

    Dall, Jørgen; Jørgensen, Jørn Hjelm; Netterstrøm, Anders

    1993-01-01

    A real-time processor (RTP) for the Danish airborne Synthetic Aperture Radar (SAR) has been designed and constructed at the Electromagnetics Institute. The implementation was completed in mid 1992, and since then the RTP has been operated successfully on several test and demonstration flights....... The processor is capable of focusing the entire swath of the raw SAR data into full resolution, and depending on the choice made by the on-board operator, either a high resolution one-look zoom image or a spatially multilooked overview image is displayed. After a brief design review, the paper addresses various...

  5. UNIBUS processor interface for a FASTBUS data acquisition system

    International Nuclear Information System (INIS)

    Larwill, M.; Lagerlund, T.D.; Barsotti, E.; Taff, L.M.; Franzen, J.

    1981-01-01

    Current work on a FASTBUS data acquisition system at Fermilab is described. The system will consist of three pieces of FASTBUS hardware: a UNIBUS processor interface (UPI), a dual-ported bulk memory, and a FASTBUS ''event builder'' (i.e., data acquisition processor). Primary efforts have been on specifying and constructing a UPI. The present specification includes capability for all basic FASTBUS operations, including list processing of consecutive FASTBUS operations. Some possible FASTBUS data acquisition system architectures employing the UPI are discussed along with some detailed specifications of the UPI itself

  6. Global synchronization of parallel processors using clock pulse width modulation

    Science.gov (United States)

    Chen, Dong; Ellavsky, Matthew R.; Franke, Ross L.; Gara, Alan; Gooding, Thomas M.; Haring, Rudolf A.; Jeanson, Mark J.; Kopcsay, Gerard V.; Liebsch, Thomas A.; Littrell, Daniel; Ohmacht, Martin; Reed, Don D.; Schenck, Brandon E.; Swetz, Richard A.

    2013-04-02

    A circuit generates a global clock signal with a pulse width modification to synchronize processors in a parallel computing system. The circuit may include a hardware module and a clock splitter. The hardware module may generate a clock signal and performs a pulse width modification on the clock signal. The pulse width modification changes a pulse width within a clock period in the clock signal. The clock splitter may distribute the pulse width modified clock signal to a plurality of processors in the parallel computing system.

  7. Post-silicon and runtime verification for modern processors

    CERN Document Server

    Wagner, Ilya

    2010-01-01

    The purpose of this book is to survey the state of the art and evolving directions in post-silicon and runtime verification. The authors start by giving an overview of the state of the art in verification, particularly current post-silicon methodologies in use in the industry, both for the domain of processor pipeline design and for memory subsystems. They then dive into the presentation of several new post-silicon verification solutions aimed at boosting the verification coverage of modern processors, dedicating several chapters to this topic. The presentation of runtime verification solution

  8. A VAX-FPS Loosely-Coupled Array of Processors

    International Nuclear Information System (INIS)

    Grosdidier, G.

    1987-03-01

    The main features of a VAX-FPS Loosely-Coupled Array of Processors (LCAP) set-up and the implementation of a High Energy Physics tracking program for off-line purposes will be described. This LCAP consists of a VAX 11/750 host and two FPS 64 bit attached processors. Before analyzing the performances of this LCAP, its characteristics will be outlined, especially from a user's point of vue, and will be briefly compared to those of the IBM-FPS LCAP

  9. Parallel Processor for 3D Recovery from Optical Flow

    Directory of Open Access Journals (Sweden)

    Jose Hugo Barron-Zambrano

    2009-01-01

    Full Text Available 3D recovery from motion has received a major effort in computer vision systems in the recent years. The main problem lies in the number of operations and memory accesses to be performed by the majority of the existing techniques when translated to hardware or software implementations. This paper proposes a parallel processor for 3D recovery from optical flow. Its main feature is the maximum reuse of data and the low number of clock cycles to calculate the optical flow, along with the precision with which 3D recovery is achieved. The results of the proposed architecture as well as those from processor synthesis are presented.

  10. FASTBUS Standard Routines implementation for Fermilab embedded processor boards

    International Nuclear Information System (INIS)

    Pangburn, J.; Patrick, J.; Kent, S.; Oleynik, G.; Pordes, R.; Votava, M.; Heyes, G.; Watson, W.A. III

    1992-10-01

    In collaboration with CEBAF, Fermilab's Online Support Department and the CDF experiment have produced a new implementation of the IEEE FASTBUS Standard Routines for two embedded processor FASTBUS boards: the Fermilab Smart Crate Controller (FSCC) and the FASTBUS Readout Controller (FRC). Features of this implementation include: portability (to other embedded processor boards), remote source-level debugging, high speed, optional generation of very high-speed code for readout applications, and built-in Sun RPC support for execution of FASTBUS transactions and lists over the network

  11. The associative memory system for the FTK processor at ATLAS

    CERN Document Server

    Magalotti, D; The ATLAS collaboration; Donati, S; Luciano, P; Piendibene, M; Giannetti, P; Lanza, A; Verzellesi, G; Sakellariou, Andreas; Billereau, W; Combe, J M

    2014-01-01

    In high energy physics experiments, the most interesting processes are very rare and hidden in an extremely large level of background. As the experiment complexity, accelerator backgrounds, and instantaneous luminosity increase, more effective and accurate data selection techniques are needed. The Fast TracKer processor (FTK) is a real time tracking processor designed for the ATLAS trigger upgrade. The FTK core is the Associative Memory system. It provides massive computing power to minimize the processing time of complex tracking algorithms executed online. This paper reports on the results and performance of a new prototype of Associative Memory system.

  12. Wavelength-encoded OCDMA system using opto-VLSI processors.

    Science.gov (United States)

    Aljada, Muhsen; Alameh, Kamal

    2007-07-01

    We propose and experimentally demonstrate a 2.5 Gbits/sper user wavelength-encoded optical code-division multiple-access encoder-decoder structure based on opto-VLSI processing. Each encoder and decoder is constructed using a single 1D opto-very-large-scale-integrated (VLSI) processor in conjunction with a fiber Bragg grating (FBG) array of different Bragg wavelengths. The FBG array spectrally and temporally slices the broadband input pulse into several components and the opto-VLSI processor generates codewords using digital phase holograms. System performance is measured in terms of the autocorrelation and cross-correlation functions as well as the eye diagram.

  13. A fast processor for di-lepton triggers

    CERN Document Server

    Kostarakis, P; Barsotti, E; Conetti, S; Cox, B; Enagonio, J; Haldeman, M; Haynes, W; Katsanevas, S; Kerns, C; Lebrun, P; Smith, H; Soszyniski, T; Stoffel, J; Treptow, K; Turkot, F; Wagner, R

    1981-01-01

    As a new application of the Fermilab ECL-CAMAC logic modules a fast trigger processor was developed for Fermilab experiment E-537, aiming to measure the higher mass di-muon production by antiprotons. The processor matches the hit information received from drift chambers and scintillation counters, to find candidate muon tracks and determine their directions and momenta. The tracks are then paired to compute an invariant mass: when the computed mass falls within the desired range, the event is accepted. The process is accomplished in times of 5 to 10 microseconds, while achieving a trigger rate reduction of up to a factor of ten. (5 refs).

  14. ARM Processor Based Embedded System for Remote Data Acquisition

    OpenAIRE

    Raj Kumar Tiwari; Santosh Kumar Agrahari

    2014-01-01

    The embedded systems are widely used for the data acquisition. The data acquired may be used for monitoring various activity of the system or it can be used to control the parts of the system. Accessing various signals with remote location has greater advantage for multisite operation or unmanned systems. The remote data acquisition used in this paper is based on ARM processor. The Cortex M3 processor used in this system has in-built Ethernet controller which facilitate to acquire the remote ...

  15. Digital control card based on digital signal processor

    International Nuclear Information System (INIS)

    Hou Shigang; Yin Zhiguo; Xia Le

    2008-01-01

    A digital control card based on digital signal processor was developed. Two Freescale DSP-56303 processors were utilized to achieve 3 channels proportional- integral-differential regulations. The card offers high flexibility for 100 MeV cyclotron RF system development. It was used as feedback controller in low level radio frequency control prototype, with the feedback gain parameters continuously adjustable. By using high precision analog to digital converter with 500 kHz sampling rate, a regulation bandwidth of 20 kHz was achieved. (authors)

  16. OLYMPUS system and development of its pre-processor

    International Nuclear Information System (INIS)

    Okamoto, Masao; Takeda, Tatsuoki; Tanaka, Masatoshi; Asai, Kiyoshi; Nakano, Koh.

    1977-08-01

    The OLYMPUS SYSTEM developed by K. V. Roverts et al. was converted and introduced in computer system FACOM 230/75 of the JAERI Computing Center. A pre-processor was also developed for the OLYMPUS SYSTEM. The OLYMPUS SYSTEM is very useful for development, standardization and exchange of programs in thermonuclear fusion research and plasma physics. The pre-processor developed by the present authors is not only essential for the JAERI OLYMPUS SYSTEM, but also useful in manipulation, creation and correction of program files. (auth.)

  17. Wavelength-encoded OCDMA system using opto-VLSI processors

    Science.gov (United States)

    Aljada, Muhsen; Alameh, Kamal

    2007-07-01

    We propose and experimentally demonstrate a 2.5 Gbits/sper user wavelength-encoded optical code-division multiple-access encoder-decoder structure based on opto-VLSI processing. Each encoder and decoder is constructed using a single 1D opto-very-large-scale-integrated (VLSI) processor in conjunction with a fiber Bragg grating (FBG) array of different Bragg wavelengths. The FBG array spectrally and temporally slices the broadband input pulse into several components and the opto-VLSI processor generates codewords using digital phase holograms. System performance is measured in terms of the autocorrelation and cross-correlation functions as well as the eye diagram.

  18. Reconfigurable lattice mesh designs for programmable photonic processors.

    Science.gov (United States)

    Pérez, Daniel; Gasulla, Ivana; Capmany, José; Soref, Richard A

    2016-05-30

    We propose and analyse two novel mesh design geometries for the implementation of tunable optical cores in programmable photonic processors. These geometries are the hexagonal and the triangular lattice. They are compared here to a previously proposed square mesh topology in terms of a series of figures of merit that account for metrics that are relevant to on-chip integration of the mesh. We find that that the hexagonal mesh is the most suitable option of the three considered for the implementation of the reconfigurable optical core in the programmable processor.

  19. Hardware Synchronization for Embedded Multi-Core Processors

    DEFF Research Database (Denmark)

    Stoif, Christian; Schoeberl, Martin; Liccardi, Benito

    2011-01-01

    Multi-core processors are about to conquer embedded systems — it is not the question of whether they are coming but how the architectures of the microcontrollers should look with respect to the strict requirements in the field. We present the step from one to multiple cores in this paper, establi......Multi-core processors are about to conquer embedded systems — it is not the question of whether they are coming but how the architectures of the microcontrollers should look with respect to the strict requirements in the field. We present the step from one to multiple cores in this paper...

  20. Accuracy requirements of optical linear algebra processors in adaptive optics imaging systems

    Science.gov (United States)

    Downie, John D.; Goodman, Joseph W.

    1989-10-01

    The accuracy requirements of optical processors in adaptive optics systems are determined by estimating the required accuracy in a general optical linear algebra processor (OLAP) that results in a smaller average residual aberration than that achieved with a conventional electronic digital processor with some specific computation speed. Special attention is given to an error analysis of a general OLAP with regard to the residual aberration that is created in an adaptive mirror system by the inaccuracies of the processor, and to the effect of computational speed of an electronic processor on the correction. Results are presented on the ability of an OLAP to compete with a digital processor in various situations.

  1. The Trigger Processor and Trigger Processor Algorithms for the ATLAS New Small Wheel Upgrade

    CERN Document Server

    Lazovich, Tomo; The ATLAS collaboration

    2015-01-01

    The ATLAS New Small Wheel (NSW) is an upgrade to the ATLAS muon endcap detectors that will be installed during the next long shutdown of the LHC. Comprising both MicroMegas (MMs) and small-strip Thin Gap Chambers (sTGCs), this system will drastically improve the performance of the muon system in a high cavern background environment. The NSW trigger, in particular, will significantly reduce the rate of fake triggers coming from track segments in the endcap not originating from the interaction point. We will present an overview of the trigger, the proposed sTGC and MM trigger algorithms, and the hardware implementation of the trigger. In particular, we will discuss both the heart of the trigger, an ATCA system with FPGA-based trigger processors (using the same hardware platform for both MM and sTGC triggers), as well as the full trigger electronics chain, including dedicated cards for transmission of data via GBT optical links. Finally, we will detail the challenges of ensuring that the trigger electronics can ...

  2. GNAQPMS v1.1: accelerating the Global Nested Air Quality Prediction Modeling System (GNAQPMS) on Intel Xeon Phi processors

    Science.gov (United States)

    Wang, Hui; Chen, Huansheng; Wu, Qizhong; Lin, Junmin; Chen, Xueshun; Xie, Xinwei; Wang, Rongrong; Tang, Xiao; Wang, Zifa

    2017-08-01

    The Global Nested Air Quality Prediction Modeling System (GNAQPMS) is the global version of the Nested Air Quality Prediction Modeling System (NAQPMS), which is a multi-scale chemical transport model used for air quality forecast and atmospheric environmental research. In this study, we present the porting and optimisation of GNAQPMS on a second-generation Intel Xeon Phi processor, codenamed Knights Landing (KNL). Compared with the first-generation Xeon Phi coprocessor (codenamed Knights Corner, KNC), KNL has many new hardware features such as a bootable processor, high-performance in-package memory and ISA compatibility with Intel Xeon processors. In particular, we describe the five optimisations we applied to the key modules of GNAQPMS, including the CBM-Z gas-phase chemistry, advection, convection and wet deposition modules. These optimisations work well on both the KNL 7250 processor and the Intel Xeon E5-2697 V4 processor. They include (1) updating the pure Message Passing Interface (MPI) parallel mode to the hybrid parallel mode with MPI and OpenMP in the emission, advection, convection and gas-phase chemistry modules; (2) fully employing the 512 bit wide vector processing units (VPUs) on the KNL platform; (3) reducing unnecessary memory access to improve cache efficiency; (4) reducing the thread local storage (TLS) in the CBM-Z gas-phase chemistry module to improve its OpenMP performance; and (5) changing the global communication from writing/reading interface files to MPI functions to improve the performance and the parallel scalability. These optimisations greatly improved the GNAQPMS performance. The same optimisations also work well for the Intel Xeon Broadwell processor, specifically E5-2697 v4. Compared with the baseline version of GNAQPMS, the optimised version was 3.51 × faster on KNL and 2.77 × faster on the CPU. Moreover, the optimised version ran at 26 % lower average power on KNL than on the CPU. With the combined performance and energy

  3. GNAQPMS v1.1: accelerating the Global Nested Air Quality Prediction Modeling System (GNAQPMS on Intel Xeon Phi processors

    Directory of Open Access Journals (Sweden)

    H. Wang

    2017-08-01

    Full Text Available The Global Nested Air Quality Prediction Modeling System (GNAQPMS is the global version of the Nested Air Quality Prediction Modeling System (NAQPMS, which is a multi-scale chemical transport model used for air quality forecast and atmospheric environmental research. In this study, we present the porting and optimisation of GNAQPMS on a second-generation Intel Xeon Phi processor, codenamed Knights Landing (KNL. Compared with the first-generation Xeon Phi coprocessor (codenamed Knights Corner, KNC, KNL has many new hardware features such as a bootable processor, high-performance in-package memory and ISA compatibility with Intel Xeon processors. In particular, we describe the five optimisations we applied to the key modules of GNAQPMS, including the CBM-Z gas-phase chemistry, advection, convection and wet deposition modules. These optimisations work well on both the KNL 7250 processor and the Intel Xeon E5-2697 V4 processor. They include (1 updating the pure Message Passing Interface (MPI parallel mode to the hybrid parallel mode with MPI and OpenMP in the emission, advection, convection and gas-phase chemistry modules; (2 fully employing the 512 bit wide vector processing units (VPUs on the KNL platform; (3 reducing unnecessary memory access to improve cache efficiency; (4 reducing the thread local storage (TLS in the CBM-Z gas-phase chemistry module to improve its OpenMP performance; and (5 changing the global communication from writing/reading interface files to MPI functions to improve the performance and the parallel scalability. These optimisations greatly improved the GNAQPMS performance. The same optimisations also work well for the Intel Xeon Broadwell processor, specifically E5-2697 v4. Compared with the baseline version of GNAQPMS, the optimised version was 3.51 × faster on KNL and 2.77 × faster on the CPU. Moreover, the optimised version ran at 26 % lower average power on KNL than on the CPU. With the combined

  4. Metal membrane-type 25-kW methanol fuel processor for fuel-cell hybrid vehicle

    Science.gov (United States)

    Han, Jaesung; Lee, Seok-Min; Chang, Hyuksang

    A 25-kW on-board methanol fuel processor has been developed. It consists of a methanol steam reformer, which converts methanol to hydrogen-rich gas mixture, and two metal membrane modules, which clean-up the gas mixture to high-purity hydrogen. It produces hydrogen at rates up to 25 N m 3/h and the purity of the product hydrogen is over 99.9995% with a CO content of less than 1 ppm. In this fuel processor, the operating condition of the reformer and the metal membrane modules is nearly the same, so that operation is simple and the overall system construction is compact by eliminating the extensive temperature control of the intermediate gas streams. The recovery of hydrogen in the metal membrane units is maintained at 70-75% by the control of the pressure in the system, and the remaining 25-30% hydrogen is recycled to a catalytic combustion zone to supply heat for the methanol steam-reforming reaction. The thermal efficiency of the fuel processor is about 75% and the inlet air pressure is as low as 4 psi. The fuel processor is currently being integrated with 25-kW polymer electrolyte membrane fuel-cell (PEMFC) stack developed by the Hyundai Motor Company. The stack exhibits the same performance as those with pure hydrogen, which proves that the maximum power output as well as the minimum stack degradation is possible with this fuel processor. This fuel-cell 'engine' is to be installed in a hybrid passenger vehicle for road testing.

  5. A Software Implementation of a Satellite Interface Message Processor.

    Science.gov (United States)

    Eastwood, Margaret A.; Eastwood, Lester F., Jr.

    A design for network control software for a computer network is described in which some nodes are linked by a communications satellite channel. It is assumed that the network has an ARPANET-like configuration; that is, that specialized processors at each node are responsible for message switching and network control. The purpose of the control…

  6. Optical linear algebra processors - Noise and error-source modeling

    Science.gov (United States)

    Casasent, D.; Ghosh, A.

    1985-01-01

    The modeling of system and component noise and error sources in optical linear algebra processors (OLAPs) are considered, with attention to the frequency-multiplexed OLAP. General expressions are obtained for the output produced as a function of various component errors and noise. A digital simulator for this model is discussed.

  7. Optical linear algebra processors: noise and error-source modeling.

    Science.gov (United States)

    Casasent, D; Ghosh, A

    1985-06-01

    The modeling of system and component noise and error sources in optical linear algebra processors (OLAP's) are considered, with attention to the frequency-multiplexed OLAP. General expressions are obtained for the output produced as a function of various component errors and noise. A digital simulator for this model is discussed.

  8. A post-processor for the PEST code

    International Nuclear Information System (INIS)

    Priesche, S.; Manickam, J.; Johnson, J.L.

    1992-01-01

    A new post-processor has been developed for use with output from the PEST tokamak stability code. It allows us to use quantities calculated by PEST and take better advantage of the physical picture of the plasma instability which they can provide. This will improve comparison with experimentally measured quantities as well as facilitate understanding of theoretical studies

  9. The Operational Semantics of a Java Secure Processor

    NARCIS (Netherlands)

    Hartel, Pieter H.; Butler, M.J.; Levy, M.; Alves-Foss, J.

    1999-01-01

    A formal specification of a Java Secure Processor is presented, which is mechanically checked for type consistency, well formedness and operational conservativity. The specification is executable and it is used to animate and study the behaviour of sample Java programs. The purpose of the semantics

  10. Analytic processor model for fast design-space exploration

    NARCIS (Netherlands)

    Jongerius, R.; Mariani, G.; Anghel, A.; Dittmann, G.; Vermij, E.; Corporaal, H.

    2015-01-01

    In this paper, we propose an analytic model that takes as inputs a) a parametric microarchitecture-independent characterization of the target workload, and b) a hardware configuration of the core and the memory hierarchy, and returns as output an estimation of processor-core performance. To validate

  11. Real-time trajectory optimization on parallel processors

    Science.gov (United States)

    Psiaki, Mark L.

    1993-01-01

    A parallel algorithm has been developed for rapidly solving trajectory optimization problems. The goal of the work has been to develop an algorithm that is suitable to do real-time, on-line optimal guidance through repeated solution of a trajectory optimization problem. The algorithm has been developed on an INTEL iPSC/860 message passing parallel processor. It uses a zero-order-hold discretization of a continuous-time problem and solves the resulting nonlinear programming problem using a custom-designed augmented Lagrangian nonlinear programming algorithm. The algorithm achieves parallelism of function, derivative, and search direction calculations through the principle of domain decomposition applied along the time axis. It has been encoded and tested on 3 example problems, the Goddard problem, the acceleration-limited, planar minimum-time to the origin problem, and a National Aerospace Plane minimum-fuel ascent guidance problem. Execution times as fast as 118 sec of wall clock time have been achieved for a 128-stage Goddard problem solved on 32 processors. A 32-stage minimum-time problem has been solved in 151 sec on 32 processors. A 32-stage National Aerospace Plane problem required 2 hours when solved on 32 processors. A speed-up factor of 7.2 has been achieved by using 32-nodes instead of 1-node to solve a 64-stage Goddard problem.

  12. Dynamic overset grid communication on distributed memory parallel processors

    Science.gov (United States)

    Barszcz, Eric; Weeratunga, Sisira K.; Meakin, Robert L.

    1993-01-01

    A parallel distributed memory implementation of intergrid communication for dynamic overset grids is presented. Included are discussions of various options considered during development. Results are presented comparing an Intel iPSC/860 to a single processor Cray Y-MP. Results for grids in relative motion show the iPSC/860 implementation to be faster than the Cray implementation.

  13. Low-power analogue processor for Bonner sphere spectrometers

    International Nuclear Information System (INIS)

    Ciobanu, M.I.; Alevra, A.V.

    1998-01-01

    The electronic system proposed is compact, small-size (the dimensions of the prototype are 107 x 105 x 58 mm) and battery-powered. The whole detection system is portable and independent of the mains supply and is well shielded against external disturbances. Technical details of the analog processor are given. (M.D.)

  14. Fast Parallel Computation of Polynomials Using Few Processors

    DEFF Research Database (Denmark)

    Valiant, Leslie G.; Skyum, Sven; Berkowitz, S.

    1983-01-01

    It is shown that any multivariate polynomial of degree $d$ that can be computed sequentially in $C$ steps can be computed in parallel in $O((\\log d)(\\log C + \\log d))$ steps using only $(Cd)^{O(1)} $ processors....

  15. 50 CFR 648.6 - Dealer/processor permits.

    Science.gov (United States)

    2010-10-01

    ... of incorporation if the business is a corporation, and a copy of the partnership agreement and the names and addresses of all partners, if the business is a partnership, name of at-sea processor vessel... the fishing year to an applicant, unless the applicant fails to submit a completed application. An...

  16. The study of image processing of parallel digital signal processor

    International Nuclear Information System (INIS)

    Liu Jie

    2000-01-01

    The author analyzes the basic characteristic of parallel DSP (digital signal processor) TMS320C80 and proposes related optimized image algorithm and the parallel processing method based on parallel DSP. The realtime for many image processing can be achieved in this way

  17. A design of a computer complex including vector processors

    International Nuclear Information System (INIS)

    Asai, Kiyoshi

    1982-12-01

    We, members of the Computing Center, Japan Atomic Energy Research Institute have been engaged for these six years in the research of adaptability of vector processing to large-scale nuclear codes. The research has been done in collaboration with researchers and engineers of JAERI and a computer manufacturer. In this research, forty large-scale nuclear codes were investigated from the viewpoint of vectorization. Among them, twenty-six codes were actually vectorized and executed. As the results of the investigation, it is now estimated that about seventy percents of nuclear codes and seventy percents of our total amount of CPU time of JAERI are highly vectorizable. Based on the data obtained by the investigation, (1)currently vectorizable CPU time, (2)necessary number of vector processors, (3)necessary manpower for vectorization of nuclear codes, (4)computing speed, memory size, number of parallel 1/0 paths, size and speed of 1/0 buffer of vector processor suitable for our applications, (5)necessary software and operational policy for use of vector processors are discussed, and finally (6)a computer complex including vector processors is presented in this report. (author)

  18. Sojourn time asymptotics in processor-sharing queues

    NARCIS (Netherlands)

    Borst, S.C.; Núñez Queija, R.; Zwart, B.

    2006-01-01

    Over the past few decades, the Processor-Sharing (PS) discipline has attracted a great deal of attention in the queueing literature. While the PS paradigm emerged in the sixties as an idealization of round-robin scheduling in time-shared computer systems, it has recently captured renewed interest as

  19. The impact of reneging in processor sharing queues

    NARCIS (Netherlands)

    Gromoll, H.C.; Robert, Ph.; Zwart, B.; Bakker, R.F.

    2006-01-01

    We investigate an overloaded processor sharing queue with renewal arrivals and generally distributed service times. Impatient customers may abandon the queue, or renege, before completing service. The random time representing a customer’s patience has a general distribution and may be dependent on

  20. Sojourn times in finite-capacity processor-sharing queues

    NARCIS (Netherlands)

    Borst, S.C.; Boxma, O.J.; Hegde, N.

    2005-01-01

    Motivated by the need to develop simple parsimonious models for evaluating the performance of wireless data systems, we consider finite-capacity processor-sharing systems. For such systems, we analyze the sojourn time distribution, which presents a useful measure for the transfer delay of documents

  1. Scientific programming on massively parallel processor CP-PACS

    International Nuclear Information System (INIS)

    Boku, Taisuke

    1998-01-01

    The massively parallel processor CP-PACS takes various problems of calculation physics as the object, and it has been designed so that its architecture has been devised to do various numerical processings. In this report, the outline of the CP-PACS and the example of programming in the Kernel CG benchmark in NAS Parallel Benchmarks, version 1, are shown, and the pseudo vector processing mechanism and the parallel processing tuning of scientific and technical computation utilizing the three-dimensional hyper crossbar net, which are two great features of the architecture of the CP-PACS are described. As for the CP-PACS, the PUs based on RISC processor and added with pseudo vector processor are used. Pseudo vector processing is realized as the loop processing by scalar command. The features of the connection net of PUs are explained. The algorithm of the NPB version 1 Kernel CG is shown. The part that takes the time for processing most in the main loop is the product of matrix and vector (matvec), and the parallel processing of the matvec is explained. The time for the computation by the CPU is determined. As the evaluation of the performance, the evaluation of the time for execution, the short vector processing of pseudo vector processor based on slide window, and the comparison with other parallel computers are reported. (K.I.)

  2. Word Processors: A Look at Four Popular Programs.

    Science.gov (United States)

    Press, Larry

    1980-01-01

    Described are types of programs used for processing text (editors, print formatters, and word processors), followed by the comparison of four word-processing packages: Auto Scribe, Electric Pencil, Magic Want and Word Star. With the exception of Auto Scribe, all programs reviewed are CP/M versions. (KC)

  3. The hardware track finder processor in CMS at CERN

    CERN Document Server

    Kluge, A

    1997-01-01

    The work covers the design of the Track Finder Processor in the high energy experiment CMS (Compact Muon Solenoid, planned for 2005) at CERN/Geneva. The task of this processor is to identify muons and measure their transverse momentum. The track finder processor makes it possible to determine the physical relevance of each high energetic collision and to forward only interesting data to the data an alysis units. Data of more than two hundred thousand detector cells are used to determine the location of muons and measure their transverse momentum. Each 25 ns a new data set is generated. Measurem ent of location and transverse momentum of the muons can be terminated within 350 ns by using an ASIC (Application Specific Integrated Circuit). A pipeline architecture processes new data sets with th e required data rate of 40 MHz to ensure dead time free operation. In the framework of this study specifications and the overall concept of the track finder processor were worked out in detail. Simul ations were performed...

  4. 21 CFR 864.3875 - Automated tissue processor.

    Science.gov (United States)

    2010-04-01

    ... 21 Food and Drugs 8 2010-04-01 2010-04-01 false Automated tissue processor. 864.3875 Section 864.3875 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) MEDICAL DEVICES HEMATOLOGY AND PATHOLOGY DEVICES Pathology Instrumentation and Accessories § 864.3875...

  5. High performance graphics processors for medical imaging applications

    International Nuclear Information System (INIS)

    Goldwasser, S.M.; Reynolds, R.A.; Talton, D.A.; Walsh, E.S.

    1989-01-01

    This paper describes a family of high- performance graphics processors with special hardware for interactive visualization of 3D human anatomy. The basic architecture expands to multiple parallel processors, each processor using pipelined arithmetic and logical units for high-speed rendering of Computed Tomography (CT), Magnetic Resonance (MR) and Positron Emission Tomography (PET) data. User-selectable display alternatives include multiple 2D axial slices, reformatted images in sagittal or coronal planes and shaded 3D views. Special facilities support applications requiring color-coded display of multiple datasets (such as radiation therapy planning), or dynamic replay of time- varying volumetric data (such as cine-CT or gated MR studies of the beating heart). The current implementation is a single processor system which generates reformatted images in true real time (30 frames per second), and shaded 3D views in a few seconds per frame. It accepts full scale medical datasets in their native formats, so that minimal preprocessing delay exists between data acquisition and display

  6. Single particle irradiation effect of digital signal processor

    International Nuclear Information System (INIS)

    Fan Si'an; Chen Kenan

    2010-01-01

    The single particle irradiation effect of high energy neutron on digital signal processor TMS320P25 in dynamic working condition has been studied. The influence of the single particle on the device has been explored through the acquired waveform and working current of TMS320P25. Analysis results, test data and test methods have also been presented. (authors)

  7. Evaluation of the Intel Westmere-EP server processor

    CERN Document Server

    Jarp, S; Leduc, J; Nowak, A; CERN. Geneva. IT Department

    2010-01-01

    In this paper we report on a set of benchmark results recently obtained by CERN openlab when comparing the 6-core “Westmere-EP” processor with Intel’s previous generation of the same microarchitecture, the “Nehalem-EP”. The former is produced in a new 32nm process, the latter in 45nm. Both platforms are dual-socket servers. Multiple benchmarks were used to get a good understanding of the performance of the new processor. We used both industry-standard benchmarks, such as SPEC2006, and specific High Energy Physics benchmarks, representing both simulation of physics detectors and data analysis of physics events. Before summarizing the results we must stress the fact that benchmarking of modern processors is a very complex affair. One has to control (at least) the following features: processor frequency, overclocking via Turbo mode, the number of physical cores in use, the use of logical cores via Simultaneous Multi-Threading (SMT), the cache sizes available, the memory configuration installed, as well...

  8. Evaluation of the Intel Nehalem-EX server processor

    CERN Document Server

    Jarp, S; Leduc, J; Nowak, A; CERN. Geneva. IT Department

    2010-01-01

    In this paper we report on a set of benchmark results recently obtained by the CERN openlab by comparing the 4-socket, 32-core Intel Xeon X7560 server with the previous generation 4-socket server, based on the Xeon X7460 processor. The Xeon X7560 processor represents a major change in many respects, especially the memory sub-system, so it was important to make multiple comparisons. In most benchmarks the two 4-socket servers were compared. It should be underlined that both servers represent the “top of the line” in terms of frequency. However, in some cases, it was important to compare systems that integrated the latest processor features, such as QPI links, Symmetric multithreading and over-clocking via Turbo mode, and in such situations the X7560 server was compared to a dual socket L5520 based system with an identical frequency of 2.26 GHz. Before summarizing the results we must stress the fact that benchmarking of modern processors is a very complex affair. One has to control (at least) the following ...

  9. Elementary function calculation programs for the central processor-6

    International Nuclear Information System (INIS)

    Dobrolyubov, L.V.; Ovcharenko, G.A.; Potapova, V.A.

    1976-01-01

    Subprograms of elementary functions calculations are given for the central processor (CP AS-6). A procedure is described to obtain calculated formulae which represent the elementary functions as a polynomial. Standard programs for random numbers are considered. All the programs described are based upon the algorithms of respective programs for BESM computer

  10. Digital signal array processor for NSLS booster power supply upgrade

    International Nuclear Information System (INIS)

    Olsen, R.; Dabrowski, J.; Murray, J.

    1993-01-01

    The booster at the NSLS is being upgraded from 0.75 to 2 pulses per second. To accomplish this, new power supplied for the dipole, quadrupole, and sextupole have been installed. This paper will outline the design and function of the digital signal processor used as the primary control element in the power supply control system

  11. C-HEAP : a heterogeneous multi-processor architecture template and scalable and flexible protocol for the design of embedded signal processing systems

    NARCIS (Netherlands)

    Nieuwland, A.K.; Kang, J.; Gangwal, O.P.; Sethuraman, R.; Busá, N.G.; Goossens, K.G.W.; Peset Llopis, R.; Lippens, P.E.R.

    2002-01-01

    The key issue in the design of Systems-on-a-Chip (SoC) is to trade-off efficiency against flexibility, and time to market versus cost. Current deep submicron processing technologies enable integration of multiple software programmable processors (e.g., CPUs, DSPs) and dedicated hardware components

  12. Multichannel Baseband Processor for Wideband CDMA

    Science.gov (United States)

    Jalloul, Louay M. A.; Lin, Jim

    2005-12-01

    The system architecture of the cellular base station modem engine (CBME) is described. The CBME is a single-chip multichannel transceiver capable of processing and demodulating signals from multiple users simultaneously. It is optimized to process different classes of code-division multiple-access (CDMA) signals. The paper will show that through key functional system partitioning, tightly coupled small digital signal processing cores, and time-sliced reuse architecture, CBME is able to achieve a high degree of algorithmic flexibility while maintaining efficiency. The paper will also highlight the implementation and verification aspects of the CBME chip design. In this paper, wideband CDMA is used as an example to demonstrate the architecture concept.

  13. Multichannel Baseband Processor for Wideband CDMA

    Directory of Open Access Journals (Sweden)

    Jim Lin

    2005-07-01

    Full Text Available The system architecture of the cellular base station modem engine (CBME is described. The CBME is a single-chip multichannel transceiver capable of processing and demodulating signals from multiple users simultaneously. It is optimized to process different classes of code-division multiple-access (CDMA signals. The paper will show that through key functional system partitioning, tightly coupled small digital signal processing cores, and time-sliced reuse architecture, CBME is able to achieve a high degree of algorithmic flexibility while maintaining efficiency. The paper will also highlight the implementation and verification aspects of the CBME chip design. In this paper, wideband CDMA is used as an example to demonstrate the architecture concept.

  14. Optimal neural computations require analog processors

    Energy Technology Data Exchange (ETDEWEB)

    Beiu, V.

    1998-12-31

    This paper discusses some of the limitations of hardware implementations of neural networks. The authors start by presenting neural structures and their biological inspirations, while mentioning the simplifications leading to artificial neural networks. Further, the focus will be on hardware imposed constraints. They will present recent results for three different alternatives of parallel implementations of neural networks: digital circuits, threshold gate circuits, and analog circuits. The area and the delay will be related to the neurons` fan-in and to the precision of their synaptic weights. The main conclusion is that hardware-efficient solutions require analog computations, and suggests the following two alternatives: (i) cope with the limitations imposed by silicon, by speeding up the computation of the elementary silicon neurons; (2) investigate solutions which would allow the use of the third dimension (e.g. using optical interconnections).

  15. HTGR core seismic analysis using an array processor

    International Nuclear Information System (INIS)

    Shatoff, H.; Charman, C.M.

    1983-01-01

    A Floating Point Systems array processor performs nonlinear dynamic analysis of the high-temperature gas-cooled reactor (HTGR) core with significant time and cost savings. The graphite HTGR core consists of approximately 8000 blocks of various shapes which are subject to motion and impact during a seismic event. Two-dimensional computer programs (CRUNCH2D, MCOCO) can perform explicit step-by-step dynamic analyses of up to 600 blocks for time-history motions. However, use of two-dimensional codes was limited by the large cost and run times required. Three-dimensional analysis of the entire core, or even a large part of it, had been considered totally impractical. Because of the needs of the HTGR core seismic program, a Floating Point Systems array processor was used to enhance computer performance of the two-dimensional core seismic computer programs, MCOCO and CRUNCH2D. This effort began by converting the computational algorithms used in the codes to a form which takes maximum advantage of the parallel and pipeline processors offered by the architecture of the Floating Point Systems array processor. The subsequent conversion of the vectorized FORTRAN coding to the array processor required a significant programming effort to make the system work on the General Atomic (GA) UNIVAC 1100/82 host. These efforts were quite rewarding, however, since the cost of running the codes has been reduced approximately 50-fold and the time threefold. The core seismic analysis with large two-dimensional models has now become routine and extension to three-dimensional analysis is feasible. These codes simulate the one-fifth-scale full-array HTGR core model. This paper compares the analysis with the test results for sine-sweep motion

  16. Simulation of a 250 kW diesel fuel processor/PEM fuel cell system

    Science.gov (United States)

    Amphlett, J. C.; Mann, R. F.; Peppley, B. A.; Roberge, P. R.; Rodrigues, A.; Salvador, J. P.

    Polymer-electrolyte membrane (PEM) fuel cell systems offer a potential power source for utility and mobile applications. Practical fuel cell systems use fuel processors for the production of hydrogen-rich gas. Liquid fuels, such as diesel or other related fuels, are attractive options as feeds to a fuel processor. The generation of hydrogen gas for fuel cells, in most cases, becomes the crucial design issue with respect to weight and volume in these applications. Furthermore, these systems will require a gas clean-up system to insure that the fuel quality meets the demands of the cell anode. The endothermic nature of the reformer will have a significant affect on the overall system efficiency. The gas clean-up system may also significantly effect the overall heat balance. To optimize the performance of this integrated system, therefore, waste heat must be used effectively. Previously, we have concentrated on catalytic methanol-steam reforming. A model of a methanol steam reformer has been previously developed and has been used as the basis for a new, higher temperature model for liquid hydrocarbon fuels. Similarly, our fuel cell evaluation program previously led to the development of a steady-state electrochemical fuel cell model (SSEM). The hydrocarbon fuel processor model and the SSEM have now been incorporated in the development of a process simulation of a 250 kW diesel-fueled reformer/fuel cell system using a process simulator. The performance of this system has been investigated for a variety of operating conditions and a preliminary assessment of thermal integration issues has been carried out. This study demonstrates the application of a process simulation model as a design analysis tool for the development of a 250 kW fuel cell system.

  17. Soft-core dataflow processor architecture optimised for radar signal processing: Article

    CSIR Research Space (South Africa)

    Broich, R

    2014-10-01

    Full Text Available Current radar signal processors lack either performance or flexibility. Custom soft-core processors exhibit potential in high-performance signal processing applications, yet remain relatively unexplored in research literature. In this paper, we use...

  18. The Fast Tracker Real Time Processor

    CERN Document Server

    Annovi, A; The ATLAS collaboration

    2011-01-01

    As the LHC luminosity is ramped up to the SLHC Phase I level and beyond, the high rates, multiplicities, and energies of particles seen by the detectors will pose a unique challenge. Only a tiny fraction of the produced collisions can be stored on tape and immense real-time data reduction is needed. An effective trigger system must maintain high trigger efficiencies for the physics we are most interested in, and at the same time suppress the enormous QCD backgrounds. This requires massive computing power to minimize the online execution time of complex algorithms. A multi-level trigger is an effective solution for an otherwise impossible problem. The Fast Tracker (FTK)[1], is a proposed upgrade to the current ATLAS trigger system that will operate at full Level-1 output rates and provide high quality tracks reconstructed over the entire detector by the start of processing in Level-2. FTK solves the combinatorial challenge inherent to tracking by exploiting massive parallelism of associative memories [2] that ...

  19. An updated program-controlled analog processor, model AP-006, for semiconductor detector spectrometers

    International Nuclear Information System (INIS)

    Shkola, N.F.; Shevchenko, Yu.A.

    1989-01-01

    An analog processor, model AP-006, is reported. The processor is a development of a series of spectrometric units based on a shaper of the type 'DL dif +TVS+gated ideal integrator'. Structural and circuits design features are described. The results of testing the processor in a setup with a Si(Li) detecting unit over an input count-rate range of up to 5x10 5 cps are presented. Processor applications are illustrated. (orig.)

  20. A prediction method for job runtimes on shared processors: Survey, statistical analysis and new avenues

    NARCIS (Netherlands)

    Dobber, A.M.; van der Mei, R.D.; Koole, G.M.

    2007-01-01

    Grid computing is an emerging technology by which huge numbers of processors over the world create a global source of processing power. Their collaboration makes it possible to perform computations that are too extensive to perform on a single processor. On a grid, processors may connect and

  1. 77 FR 124 - Biological Processors of Alabama; Decatur, Morgan County, AL; Notice of Settlement

    Science.gov (United States)

    2012-01-03

    ... ENVIRONMENTAL PROTECTION AGENCY [FRL-9612-9] Biological Processors of Alabama; Decatur, Morgan... reimbursement of past response costs concerning the Biological Processors of Alabama Superfund Site located in... Ms. Paula V. Painter. Submit your comments by Site name Biological Processors of Alabama Superfund...

  2. M7--a high speed digital processor for second level trigger selections

    International Nuclear Information System (INIS)

    Droege, T.F.; Gaines, I.; Turner, K.J.

    1978-01-01

    A digital processor is described which reconstructs mass and momentum as a second-level trigger selection. The processor is a five-address, microprogramed, pipelined, ECL machine with simultaneous memory access to four operands which load two parallel multipliers and an ALU. Source data modules are extensions of the processor

  3. The microelectronic and photonic test bed RISC processor and DRAM memory stack experiments

    International Nuclear Information System (INIS)

    Clark, K.A.; Meehan, T.J.

    1999-01-01

    This paper reports on the on-orbit data obtained from the MPTB RISC Processor Experiment, containing three Integrated Device Technologies R3081 processors. During operations, nine SEUs were observed in the processors, and four SEUs were observed in the memory and/or support circuitry. (authors)

  4. Upgrade of the ATLAS Level‐1 trigger with an FPGA based Topological Processor

    CERN Document Server

    Caputo, R; The ATLAS collaboration; Buescher, V; Degele, R; Kiese, P; Maldaner, S; Reiss, A; Schaefer, U; Simioni, E; Tapprogge, S; Urejola, P

    2013-01-01

    The ATLAS experiment is located at the European Centre for Nuclear Research (CERN) in Switzerland. It is designed to measure decay properties of high energetic particles produced in the protons collisions at the Large Hadron Collider (LHC). LHC proton collision at a frequency of 40 MHz, requires a trigger system to efficiently select events down to a manageable event storage rate of about 400 Hz. Event triggering is therefore one of the extraordinary challenges faced by the ATLAS detector. The Level-1 Trigger is the first rate-reducing step in the ATLAS Trigger, with an output rate of 75kHz and decision latency of less than 2.5$\\mu$s. It is primarily composed of the Calorimeter Trigger, Muon Trigger, the Central Trigger Processor (CTP). Due to the increase in the LHC instantaneous luminosity up to 3$\\times$10$^{34}$cm$^{−2}$s$^{−1}$ in 2015, a new element will be included in the Level-1 Trigger scheme: the Topological Processor (L1Topo). The L1Topo receive data in a dedicated format from the calorimeters ...

  5. The Topological Processor for the future ATLAS Level-1 Trigger: from design to commissioning

    CERN Document Server

    Simioni, E; The ATLAS collaboration

    2014-01-01

    The ATLAS experiment is located at the European Centre for Nuclear Research (CERN) in Switzerland. It is designed to measure decay properties of highly energetic particles produced in the protons collisions at the Large Hadron Collider (LHC). The LHC has a beam collision frequency of 40 MHz, and thus requires a trigger system to efficiently select events, thereby reducing the storage rate to a manageable level of about 400 Hz. Event triggering is therefore one of the extraordinary challenges faced by the ATLAS detector. The Level-1 Trigger is the first rate-reducing step in the ATLAS Trigger, with an output rate of 75kHz and decision latency of less than 2.5 s. It is primarily composed of the Calorimeter Trigger, Muon Trigger, the Central Trigger Processor (CTP). Due to the increase in the LHC instantaneous luminosity up 3 x 10^34/cm2 s from 2015 onwards, a new element will be included in the Level-1 Trigger scheme: the Topological Processor (L1Topo). The L1Topo receives data in a specialized format from the ...

  6. Upgrade of the ATLAS Level-1 trigger with an FPGA based Topological Processor

    CERN Document Server

    Caputo, R; The ATLAS collaboration; Buescher, V; Degele, R; Kiese, P; Maldaner, S; Reiss, A; Schaefer, U; Simioni, E; Tapprogge, S; Urrejola, P

    2013-01-01

    The ATLAS experiment is located at the European Centre for Nuclear Research (CERN) in Switzerland. It is designed to measure decay properties of high energetic particles produced in the protons collisions at the Large Hadron Collider (LHC). The LHC has a proton collision at a frequency of 40 MHz, and thus requires a trigger system to efficiently select events down to a manageable event storage rate of about 400Hz. Event triggering is therefore one of the extraordinary challenges faced by the ATLAS detector. The Level-1 Trigger is the first rate-reducing step in the ATLAS Trigger, with an output rate of 75kHz and decision latency of less than 2.5$\\mu$s. It is primarily composed of the Calorimeter Trigger, Muon Trigger, the Central Trigger Processor (CTP). Due to the increase in the LHC instantaneous luminosity up to 3$\\times$10$^{34}$ cm$^{−2}$ s$^{−1}$ from 2015 onwards, a new element will be included in the Level-1 Trigger scheme: the Topological Processor (L1Topo). The L1Topo receives data in a dedicate...

  7. Catalyst development and systems analysis of methanol partial oxidation for the fuel processor - fuel cell integration

    Energy Technology Data Exchange (ETDEWEB)

    Newson, E; Mizsey, P; Hottinger, P; Truong, T B; Roth, F von; Schucan, Th H [Paul Scherrer Inst. (PSI), Villigen (Switzerland)

    1999-08-01

    Methanol partial oxidation (pox) to produce hydrogen for mobile fuel cell applications has proved initially more successful than hydrocarbon pox. Recent results of catalyst screening and kinetic studies with methanol show that hydrogen production rates have reached 7000 litres/hour/(litre reactor volume) for the dry pox route and 12,000 litres/hour/(litre reactor volume) for wet pox. These rates are equivalent to 21 and 35 kW{sub th}/(litre reactor volume) respectively. The reaction engineering problems remain to be solved for dry pox due to the significant exotherm of the reaction (hot spots of 100-200{sup o}C), but wet pox is essentially isothermal in operation. Analyses of the integrated fuel processor - fuel cell systems show that two routes are available to satisfy the sensitivity of the fuel cell catalysts to carbon monoxide, i.e. a preferential oxidation reactor or a membrane separator. Targets for individual system components are evaluated for the base and best case systems for both routes to reach the combined 40% efficiency required for the integrated fuel processor - fuel cell system. (author) 2 figs., 1 tab., 3 refs.

  8. Design of an Elliptic Curve Cryptography processor for RFID tag chips.

    Science.gov (United States)

    Liu, Zilong; Liu, Dongsheng; Zou, Xuecheng; Lin, Hui; Cheng, Jian

    2014-09-26

    Radio Frequency Identification (RFID) is an important technique for wireless sensor networks and the Internet of Things. Recently, considerable research has been performed in the combination of public key cryptography and RFID. In this paper, an efficient architecture of Elliptic Curve Cryptography (ECC) Processor for RFID tag chip is presented. We adopt a new inversion algorithm which requires fewer registers to store variables than the traditional schemes. A new method for coordinate swapping is proposed, which can reduce the complexity of the controller and shorten the time of iterative calculation effectively. A modified circular shift register architecture is presented in this paper, which is an effective way to reduce the area of register files. Clock gating and asynchronous counter are exploited to reduce the power consumption. The simulation and synthesis results show that the time needed for one elliptic curve scalar point multiplication over GF(2163) is 176.7 K clock cycles and the gate area is 13.8 K with UMC 0.13 μm Complementary Metal Oxide Semiconductor (CMOS) technology. Moreover, the low power and low cost consumption make the Elliptic Curve Cryptography Processor (ECP) a prospective candidate for application in the RFID tag chip.

  9. Redesign of an in-market food processor for manufacturing cost reduction using DFMA methodology

    Directory of Open Access Journals (Sweden)

    Akshay Harlalka

    2016-01-01

    Full Text Available Reducing the time and cost involved in the product development is very important to stay competitive in the market. Design for Manufacturing and Assembly (DFMA is a cost-reduction framework for designers to evaluate manufacturing aspects of a product design. A case study on an in-market food processor, this paper aims to demonstrate the significance of DFMA implementation on an Indian consumer product. Even though many DFMA case studies have been published till date, very few have dealt with the Indian consumer durables category. In this paper, various cost-reduction opportunities are identified in the design of a food processor manufactured by a reputed company in India. Using the DFMA study, design ideas are developed with an objective of reducing the overall manufacturing cost of the product. As a result of DFMA implementation, significant improvement in terms of the product’s architecture, assembly time and design efficiency is identified. Overall cost reduction of 0.25 USD (United States Dollar was achieved and an improvement in the Design for Assembly (DFA index from 15.99 to 19.93 is reported. This procedure utilized in this paper can be adopted for any consumer durable products of similar type and design improvements and cost reduction can be achieved.

  10. Firmware implementation of algorithms for the new topological processor in the ATLAS first level trigger

    Energy Technology Data Exchange (ETDEWEB)

    Maldaner, Stephan; Caputo, Regina; Schaefer, Ulrich; Tapprogge, Stefan [Universitaet Mainz, Staudingerweg 7, 55128 Mainz (Germany)

    2013-07-01

    After the upgrade of the Large Hadron Collider in 2013/2014 proton-proton collisions will be provided at a center-of-mass energy of up to 14 TeV with an instantaneous luminosity of at least 1 . 10{sup 34} cm{sup -2}s{sup -1}. During this upgrade a new FPGA based electronics system (Topological Processor) will be included in the ATLAS trigger chain to keep up with the increased rate of events. To reduce rates while maintaining high signal efficiency of the trigger the processor will make its decisions based upon topological criteria like angular cuts and mass calculations. As a hardware based trigger, it will have to fit into the tight first level trigger latency budget of 2.5 μs and thus provides the challenge of making decisions within very short time. Beside the latency, the main constraints on the algorithms are the required amount of logic resources of the FPGA which will be implemented as firmware. Therefore to be able to use as much information as possible, each module will be equipped with 2 state-of-the-art Xilinx Virtex 7 FPGAs to process the incoming data. This talk will present some of the topological algorithms and discuss properties of their implementation in firmware.

  11. The performance of an LSI-11/23 with a SKYMNK-Q array processor as a high speed front end processor

    International Nuclear Information System (INIS)

    Clark, D.L.

    1983-01-01

    The NSRL has recently installed a VAX-11/750 based data acquisition system which is networked to two LSI-11/23 satellite processors. Each of the LSI's are connected to CAMAC branch drivers. The LSI's have small array processors installed for use in preprocessing data. The objective is to provide an easy to use high speed processor that will relieve the VAX of some of the real-time data analysis tasks. The basic operation of the array processor and some of the results of performance tests are described

  12. LASIP-III, a generalized processor for standard interface files

    International Nuclear Information System (INIS)

    Bosler, G.E.; O'Dell, R.D.; Resnik, W.M.

    1976-03-01

    The LASIP-III code was developed for processing Version III standard interface data files which have been specified by the Committee on Computer Code Coordination. This processor performs two distinct tasks, namely, transforming free-field format, BCD data into well-defined binary files and providing for printing and punching data in the binary files. While LASIP-III is exported as a complete free-standing code package, techniques are described for easily separating the processor into two modules, viz., one for creating the binary files and one for printing the files. The two modules can be separated into free-standing codes or they can be incorporated into other codes. Also, the LASIP-III code can be easily expanded for processing additional files, and procedures are described for such an expansion. 2 figures, 8 tables

  13. In-Network Adaptation of Video Streams Using Network Processors

    Directory of Open Access Journals (Sweden)

    Mohammad Shorfuzzaman

    2009-01-01

    problem can be addressed, near the network edge, by applying dynamic, in-network adaptation (e.g., transcoding of video streams to meet available connection bandwidth, machine characteristics, and client preferences. In this paper, we extrapolate from earlier work of Shorfuzzaman et al. 2006 in which we implemented and assessed an MPEG-1 transcoding system on the Intel IXP1200 network processor to consider the feasibility of in-network transcoding for other video formats and network processor architectures. The use of “on-the-fly” video adaptation near the edge of the network offers the promise of simpler support for a wide range of end devices with different display, and so forth, characteristics that can be used in different types of environments.

  14. JIST: Just-In-Time Scheduling Translation for Parallel Processors

    Directory of Open Access Journals (Sweden)

    Giovanni Agosta

    2005-01-01

    Full Text Available The application fields of bytecode virtual machines and VLIW processors overlap in the area of embedded and mobile systems, where the two technologies offer different benefits, namely high code portability, low power consumption and reduced hardware cost. Dynamic compilation makes it possible to bridge the gap between the two technologies, but special attention must be paid to software instruction scheduling, a must for the VLIW architectures. We have implemented JIST, a Virtual Machine and JIT compiler for Java Bytecode targeted to a VLIW processor. We show the impact of various optimizations on the performance of code compiled with JIST through the experimental study on a set of benchmark programs. We report significant speedups, and increments in the number of instructions issued per cycle up to 50% with respect to the non-scheduling version of the JITcompiler. Further optimizations are discussed.

  15. NMRFx Processor: a cross-platform NMR data processing program

    International Nuclear Information System (INIS)

    Norris, Michael; Fetler, Bayard; Marchant, Jan; Johnson, Bruce A.

    2016-01-01

    NMRFx Processor is a new program for the processing of NMR data. Written in the Java programming language, NMRFx Processor is a cross-platform application and runs on Linux, Mac OS X and Windows operating systems. The application can be run in both a graphical user interface (GUI) mode and from the command line. Processing scripts are written in the Python programming language and executed so that the low-level Java commands are automatically run in parallel on computers with multiple cores or CPUs. Processing scripts can be generated automatically from the parameters of NMR experiments or interactively constructed in the GUI. A wide variety of processing operations are provided, including methods for processing of non-uniformly sampled datasets using iterative soft thresholding. The interactive GUI also enables the use of the program as an educational tool for teaching basic and advanced techniques in NMR data analysis.

  16. Efficacy of Code Optimization on Cache-Based Processors

    Science.gov (United States)

    VanderWijngaart, Rob F.; Saphir, William C.; Chancellor, Marisa K. (Technical Monitor)

    1997-01-01

    In this paper a number of techniques for improving the cache performance of a representative piece of numerical software is presented. Target machines are popular processors from several vendors: MIPS R5000 (SGI Indy), MIPS R8000 (SGI PowerChallenge), MIPS R10000 (SGI Origin), DEC Alpha EV4 + EV5 (Cray T3D & T3E), IBM RS6000 (SP Wide-node), Intel PentiumPro (Ames' Whitney), Sun UltraSparc (NERSC's NOW). The optimizations all attempt to increase the locality of memory accesses. But they meet with rather varied and often counterintuitive success on the different computing platforms. We conclude that it may be genuinely impossible to obtain portable performance on the current generation of cache-based machines. At the least, it appears that the performance of modern commodity processors cannot be described with parameters defining the cache alone.

  17. The fast tracker processor for hadron collider triggers

    CERN Document Server

    Annovi, A; Bardi, A; Carosi, R; Dell'Orso, Mauro; D'Onofrio, M; Giannetti, P; Iannaccone, G; Morsani, E; Pietri, M; Varotto, G

    2001-01-01

    Perspectives for precise and fast track reconstruction in future hadron collider experiments are addressed. We discuss the feasibility of a pipelined highly parallel processor dedicated to the implementation of a very fast tracking algorithm. The algorithm is based on the use of a large bank of pre-stored combinations of trajectory points, called patterns, for extremely complex tracking systems. The CMS experiment at LHC is used as a benchmark. Tracking data from the events selected by the level-1 trigger are sorted and filtered by the Fast Tracker processor at an input rate of 100 kHz. This data organization allows the level-2 trigger logic to reconstruct full resolution tracks with transverse momentum above a few GeV and search for secondary vertices within typical level-2 times. (15 refs).

  18. The fast tracker processor for hadronic collider triggers

    CERN Document Server

    Annovi, A; Bardi, A; Carosi, R; Dell'Orso, Mauro; D'Onofrio, M; Giannetti, P; Iannaccone, G; Morsani, F; Pietri, M; Varotto, G

    2000-01-01

    Perspective for precise and fast track reconstruction in future hadronic collider experiments are addressed. We discuss the feasibility of a pipelined highly parallelized processor dedicated to the implementation of a very fast algorithm. The algorithm is based on the use of a large bank of pre-stored combinations of trajectory points (patterns) for extremely complex tracking systems. The CMS experiment at LHC is used as a benchmark. Tracking data from the events selected by the level-1 trigger are sorted and filtered by the Fast Tracker processor at a rate of 100 kHz. This data organization allows the level-2 trigger logic to reconstruct full resolution traces with transverse momentum above few GeV and search secondary vertexes within typical level-2 times. 15 Refs.

  19. The ATLAS Trigger Algorithms for General Purpose Graphics Processor Units

    CERN Document Server

    Tavares Delgado, Ademar; The ATLAS collaboration

    2016-01-01

    The ATLAS Trigger Algorithms for General Purpose Graphics Processor Units Type: Talk Abstract: We present the ATLAS Trigger algorithms developed to exploit General­ Purpose Graphics Processor Units. ATLAS is a particle physics experiment located on the LHC collider at CERN. The ATLAS Trigger system has two levels, hardware-­based Level 1 and the High Level Trigger implemented in software running on a farm of commodity CPU. Performing the trigger event selection within the available farm resources presents a significant challenge that will increase future LHC upgrades. are being evaluated as a potential solution for trigger algorithms acceleration. Key factors determining the potential benefit of this new technology are the relative execution speedup, the number of GPUs required and the relative financial cost of the selected GPU. We have developed a trigger demonstrator which includes algorithms for reconstructing tracks in the Inner Detector and Muon Spectrometer and clusters of energy deposited in the Cal...

  20. NMRFx Processor: a cross-platform NMR data processing program

    Energy Technology Data Exchange (ETDEWEB)

    Norris, Michael; Fetler, Bayard [One Moon Scientific, Inc. (United States); Marchant, Jan [University of Maryland Baltimore County, Howard Hughes Medical Institute (United States); Johnson, Bruce A., E-mail: bruce.johnson@asrc.cuny.edu [One Moon Scientific, Inc. (United States)

    2016-08-15

    NMRFx Processor is a new program for the processing of NMR data. Written in the Java programming language, NMRFx Processor is a cross-platform application and runs on Linux, Mac OS X and Windows operating systems. The application can be run in both a graphical user interface (GUI) mode and from the command line. Processing scripts are written in the Python programming language and executed so that the low-level Java commands are automatically run in parallel on computers with multiple cores or CPUs. Processing scripts can be generated automatically from the parameters of NMR experiments or interactively constructed in the GUI. A wide variety of processing operations are provided, including methods for processing of non-uniformly sampled datasets using iterative soft thresholding. The interactive GUI also enables the use of the program as an educational tool for teaching basic and advanced techniques in NMR data analysis.

  1. Nested dissection on a mesh-connected processor array

    International Nuclear Information System (INIS)

    Worley, P.H.; Schreiber, R.

    1986-01-01

    The authors present a parallel implementation of Gaussian elimination without pivoting using the nested dissection ordering for solving Ax=b where A is an N x N symmetric positive definite matrix. If the graph of A is a √N x √N finite element mesh then a parallel complexity of O(√N) can be achieved for Gaussian elimination with the nested dissection ordering. The authors' implementation achieves this parallel complexity on a two dimensional MIMD processor array with N processors and nearest neighbors interconnections. Thus nested dissection is a near optimal algorithm for this problem on this interconnection topology. The parallel implementation on this architecture requires 158√N + O(log/sub 2/(√N)) parallel floating point multiplications. It is faster than a Kung-Leiserson systolic array for banded matrices for N≥961, and faster than a serial implementation for N as small as 9

  2. Optical chirp z-transform processor with a simplified architecture.

    Science.gov (United States)

    Ngo, Nam Quoc

    2014-12-29

    Using a simplified chirp z-transform (CZT) algorithm based on the discrete-time convolution method, this paper presents the synthesis of a simplified architecture of a reconfigurable optical chirp z-transform (OCZT) processor based on the silica-based planar lightwave circuit (PLC) technology. In the simplified architecture of the reconfigurable OCZT, the required number of optical components is small and there are no waveguide crossings which make fabrication easy. The design of a novel type of optical discrete Fourier transform (ODFT) processor as a special case of the synthesized OCZT is then presented to demonstrate its effectiveness. The designed ODFT can be potentially used as an optical demultiplexer at the receiver of an optical fiber orthogonal frequency division multiplexing (OFDM) transmission system.

  3. A Time-Composable Operating System for the Patmos Processor

    DEFF Research Database (Denmark)

    Ziccardi, Marco; Schoeberl, Martin; Vardanega, Tullio

    2015-01-01

    -composable operating system, on top of a time-composable processor, facilitates incremental development, which is highly desirable for industry. This paper makes a twofold contribution. First, we present enhancements to the Patmos processor to allow achieving time composability at the operating system level. Second......, we extend an existing time-composable operating system, TiCOS, to make best use of advanced Patmos hardware features in the pursuit of time composability.......In the last couple of decades we have witnessed a steady growth in the complexity and widespread of real-time systems. In order to master the rising complexity in the timing behaviour of those systems, rightful attention has been given to the development of time-predictable computer architectures...

  4. Interference and protection of electromagnetic pulse to digital signal processor

    International Nuclear Information System (INIS)

    Wang Yan; Jiao Hongling; He Shanhong; Pan Chao; Feng Deren; Che Wenquan; Xiong Ying

    2013-01-01

    The effective electromagnetic pulse protection is studied in this paper, first the interference of electromagnetic pulse simulator path is analyzed, including the digital signal processor (DSP) and the discharge circuit of coupling interference and net electricity coupling interference. Using the structure optimization design, the hardware block reinforcement measurement and the setting of open software trap, and the watchdog anti-jamming measures, the interference test is completed such as the central processor core voltage of DSP, input/output (I/O) ports of DSP and the display screen. The experimental results show that the combination of hardware and software protection reinforcement technology is effective, and the interference pulse amplitude of DSP board I/O port and the kernel work voltage are reduced, and the interference duration is reduced from 2 μs to 400 ns. The interference pulse is effectively restrained. (authors)

  5. A VLSI image processor via pseudo-mersenne transforms

    International Nuclear Information System (INIS)

    Sei, W.J.; Jagadeesh, J.M.

    1986-01-01

    The computational burden on image processing in medical fields where a large amount of information must be processed quickly and accurately has led to consideration of special-purpose image processor chip design for some time. The very large scale integration (VLSI) resolution has made it cost-effective and feasible to consider the design of special purpose chips for medical imaging fields. This paper describes a VLSI CMOS chip suitable for parallel implementation of image processing algorithms and cyclic convolutions by using Pseudo-Mersenne Number Transform (PMNT). The main advantages of the PMNT over the Fast Fourier Transform (FFT) are: (1) no multiplications are required; (2) integer arithmetic is used. The design and development of this processor, which operates on 32-point convolution or 5 x 5 window image, are described

  6. Color sensor and neural processor on one chip

    Science.gov (United States)

    Fiesler, Emile; Campbell, Shannon R.; Kempem, Lother; Duong, Tuan A.

    1998-10-01

    Low-cost, compact, and robust color sensor that can operate in real-time under various environmental conditions can benefit many applications, including quality control, chemical sensing, food production, medical diagnostics, energy conservation, monitoring of hazardous waste, and recycling. Unfortunately, existing color sensor are either bulky and expensive or do not provide the required speed and accuracy. In this publication we describe the design of an accurate real-time color classification sensor, together with preprocessing and a subsequent neural network processor integrated on a single complementary metal oxide semiconductor (CMOS) integrated circuit. This one-chip sensor and information processor will be low in cost, robust, and mass-producible using standard commercial CMOS processes. The performance of the chip and the feasibility of its manufacturing is proven through computer simulations based on CMOS hardware parameters. Comparisons with competing methodologies show a significantly higher performance for our device.

  7. Behavioral Simulation and Performance Evaluation of Multi-Processor Architectures

    Directory of Open Access Journals (Sweden)

    Ausif Mahmood

    1996-01-01

    Full Text Available The development of multi-processor architectures requires extensive behavioral simulations to verify the correctness of design and to evaluate its performance. A high level language can provide maximum flexibility in this respect if the constructs for handling concurrent processes and a time mapping mechanism are added. This paper describes a novel technique for emulating hardware processes involved in a parallel architecture such that an object-oriented description of the design is maintained. The communication and synchronization between hardware processes is handled by splitting the processes into their equivalent subprograms at the entry points. The proper scheduling of these subprograms is coordinated by a timing wheel which provides a time mapping mechanism. Finally, a high level language pre-processor is proposed so that the timing wheel and the process emulation details can be made transparent to the user.

  8. Digital implementation of the preloaded filter pulse processor

    International Nuclear Information System (INIS)

    Westphal, G.P.; Cadek, G.R.; Keroe, N.; Sauter, TH.; Thorwartl, P.C.

    1995-01-01

    Adapting it's processing time to the respective pulse intervals, the Preloaded Filter (PLF) pulse processor offers optimum resolution together with highest possible throughput rates. The PLF algorithm could be formulated in a recursive manner which made possible it's implementation by means of a large field-programmable gate array, as a fast, pipe-lined digital processor with 10 MHz maximum throughput rate. While pre-filter digitization by an ADC with 12 bit resolution and 10M Hz sampling rate resulted in a poorer resolution than that of an analog filter, a digital PLF based on an ADC with 14 bit resolution and 10 MHz sampling rate, surpassed high-quality analog filters in resolution, throughput rate and long-term stability. (author) 6 refs.; 7 figs

  9. Biological Water Processor and Forward Osmosis Secondary Treatment

    Science.gov (United States)

    Shull, Sarah; Meyer, Caitlin

    2014-01-01

    The goal of the Biological Water Processor (BWP) is to remove 90% organic carbon and 75% ammonium from an exploration-based wastewater stream for four crew members. The innovative design saves on space, power and consumables as compared to the ISS Urine Processor Assembly (UPA) by utilizing microbes in a biofilm. The attached-growth system utilizes simultaneous nitrification and denitrification to mineralize organic carbon and ammonium to carbon dioxide and nitrogen gas, which can be scrubbed in a cabin air revitalization system. The BWP uses a four-crew wastewater comprised of urine and humidity condensate, as on the ISS, but also includes hygiene (shower, shave, hand washing and oral hygiene) and laundry. The BWP team donates 58L per day of this wastewater processed in Building 7.

  10. Performance of Distributed CFAR Processors in Pearson Distributed Clutter

    Directory of Open Access Journals (Sweden)

    Messali Zoubeida

    2007-01-01

    Full Text Available This paper deals with the distributed constant false alarm rate (CFAR radar detection of targets embedded in heavy-tailed Pearson distributed clutter. In particular, we extend the results obtained for the cell averaging (CA, order statistics (OS, and censored mean level CMLD CFAR processors operating in positive alpha-stable (P&S random variables to more general situations, specifically to the presence of interfering targets and distributed CFAR detectors. The receiver operating characteristics of the greatest of (GO and the smallest of (SO CFAR processors are also determined. The performance characteristics of distributed systems are presented and compared in both homogeneous and in presence of interfering targets. We demonstrate, via simulation results, that the distributed systems when the clutter is modelled as positive alpha-stable distribution offer robustness properties against multiple target situations especially when using the "OR" fusion rule.

  11. Performance of Distributed CFAR Processors in Pearson Distributed Clutter

    Directory of Open Access Journals (Sweden)

    Faouzi Soltani

    2007-01-01

    Full Text Available This paper deals with the distributed constant false alarm rate (CFAR radar detection of targets embedded in heavy-tailed Pearson distributed clutter. In particular, we extend the results obtained for the cell averaging (CA, order statistics (OS, and censored mean level CMLD CFAR processors operating in positive alpha-stable (P&S random variables to more general situations, specifically to the presence of interfering targets and distributed CFAR detectors. The receiver operating characteristics of the greatest of (GO and the smallest of (SO CFAR processors are also determined. The performance characteristics of distributed systems are presented and compared in both homogeneous and in presence of interfering targets. We demonstrate, via simulation results, that the distributed systems when the clutter is modelled as positive alpha-stable distribution offer robustness properties against multiple target situations especially when using the “OR” fusion rule.

  12. Fast digital processor for event selection according to particle number difference

    International Nuclear Information System (INIS)

    Basiladze, S.G.; Gus'kov, B.N.; Li Van Sun; Maksimov, A.N.; Parfenov, A.N.

    1978-01-01

    A fast digital processor for a magnetic spectrometer is described. It is used in experimental searches for charmed particles. The basic purpose of the processor is discriminating events in the difference of numbers of particles passing through two proportional chambers (PC). The processor consists of three units for detecting signals with PC, and a binary coder. The number of inputs of the processor is 32 for the first PC and 64 for the second. The difference in the number of particles discriminated is from 0 to 8. The resolution time is 180 ns. The processor is built in the CAMAC standard

  13. The Interface Between Redundant Processor Modules Of Safety Grade PLC Using Mass Storage DPRAM

    International Nuclear Information System (INIS)

    Hwang, Sung Jae; Song, Seong Hwan; No, Young Hun; Yun, Dong Hwa; Park, Gang Min; Kim, Min Gyu; Choi, Kyung Chul; Lee, Ui Taek

    2010-01-01

    Processor module of safety grade PLC (hereinafter called as POSAFE-Q) developed by POSCO ICT provides high reliability and safety. However, POSAFEQ would have suffered a malfunction when we think taking place of abnormal operation by exceptional environmental. POSAFE-Q would not able to conduct its function normally in such case. To prevent these situations, the necessity of redundant processor module has been raised. Therefore, redundant processor module, NCPU-2Q, has been developed which has not only functions of single processor module with high reliability and safety but also functions of redundant processor

  14. Accelerating molecular dynamic simulation on the cell processor and Playstation 3.

    Science.gov (United States)

    Luttmann, Edgar; Ensign, Daniel L; Vaidyanathan, Vishal; Houston, Mike; Rimon, Noam; Øland, Jeppe; Jayachandran, Guha; Friedrichs, Mark; Pande, Vijay S

    2009-01-30

    Implementation of molecular dynamics (MD) calculations on novel architectures will vastly increase its power to calculate the physical properties of complex systems. Herein, we detail algorithmic advances developed to accelerate MD simulations on the Cell processor, a commodity processor found in PlayStation 3 (PS3). In particular, we discuss issues regarding memory access versus computation and the types of calculations which are best suited for streaming processors such as the Cell, focusing on implicit solvation models. We conclude with a comparison of improved performance on the PS3's Cell processor over more traditional processors. (c) 2008 Wiley Periodicals, Inc.

  15. Probabilistic programmable quantum processors with multiple copies of program states

    International Nuclear Information System (INIS)

    Brazier, Adam; Buzek, Vladimir; Knight, Peter L.

    2005-01-01

    We examine the execution of general U(1) transformations on programmable quantum processors. We show that, with only the minimal assumption of availability of copies of the 1-qubit program state, the apparent advantage of existing schemes proposed by G. Vidal et al. [Phys. Rev. Lett. 88, 047905 (2002)] and M. Hillery et al. [Phys. Rev. A 65, 022301 (2003)] to execute a general U(1) transformation with greater probability using complex program states appears not to hold

  16. Reconfigurable Secure Video Codec Based on DWT and AES Processor

    OpenAIRE

    Rached Tourki; M. Machhout; B. Bouallegue; M. Atri; M. Zeghid; D. Dia

    2010-01-01

    In this paper, we proposed a secure video codec based on the discrete wavelet transformation (DWT) and the Advanced Encryption Standard (AES) processor. Either, use of video coding with DWT or encryption using AES is well known. However, linking these two designs to achieve secure video coding is leading. The contributions of our work are as follows. First, a new method for image and video compression is proposed. This codec is a synthesis of JPEG and JPEG2000,which is implemented using Huffm...

  17. DEMAND FOR WILD BLUEBERRIES AT FARM AND PROCESSOR LEVELS

    OpenAIRE

    Cheng, Hsiang-Tai; Peavey, Stephanie R.; Kezis, Alan S.

    2000-01-01

    The wild blueberry crop harvested in Maine and eastern Canada has increased considerably in recent years. The purpose of this study is to understand the recent trends in demand for wild blueberries with particular attention to the effects of production and the marketing of wild and cultivated blueberries. A price response model was developed to analyze farm-gate price and the processor price, using annual data from 1978 through 1997. Key explanatory variables in the model include quantity of ...

  18. Environmental data processor of the adaptive intrusion data system

    International Nuclear Information System (INIS)

    Rogers, M.S.

    1977-06-01

    A data acquisition system oriented specifically toward collection and processing of various meteorological and environmental parameters has been designed around a National Semiconductor IMP-16 microprocessor, This system, called the Environmental Data Processor (EDP), was developed specifically for use with the Adaptive Intrusion Data System (AIDS) in a perimeter intrusion alarm evaluation, although its design is sufficiently general to permit use elsewhere. This report describes in general detail the design of the EDP and its interaction with other AIDS components

  19. Processor tradeoffs in distributed real-time systems

    Science.gov (United States)

    Krishna, C. M.; Shin, Kang G.; Bhandari, Inderpal S.

    1987-01-01

    The problem of the optimization of the design of real-time distributed systems is examined with reference to a class of computer architectures similar to the continuously reconfigurable multiprocessor flight control system structure, CM2FCS. Particular attention is given to the impact of processor replacement and the burn-in time on the probability of dynamic failure and mean cost. The solution is obtained numerically and interpreted in the context of real-time applications.

  20. Event Pre Processor for the CZT Detector on MIRAX

    International Nuclear Information System (INIS)

    Kendziorra, Eckhard; Schanz, Thomas; Distratis, Giuseppe; Suchy, Slawomir

    2006-01-01

    We describe the Event Pre Processor (EPP) for the Hard X-ray Imager (HXI) on MIRAX. The EPP provides on board data reduction and event filtering for the HXI Cadmium Zinc Telluride strip detector. Emphasis is placed upon the EPP requirements, its implementation as VHDL design in a Field Programmable Gate Array (FPGA), and the description of a test environment for both the VHDL code and the FPGA hardware

  1. A Parallel Workload Model and its Implications for Processor Allocation

    Science.gov (United States)

    1996-11-01

    with SEV or AVG, both of which can tolerate c = 0.4 { 0.6 before their performance deteriorates signi cantly. On the other hand, Setia [10] has...Sanjeev. K Setia . The interaction between memory allocation and adaptive partitioning in message-passing multicomputers. In IPPS 󈨣 Workshop on Job...Scheduling Strategies for Parallel Processing, pages 89{99, 1995. [11] Sanjeev K. Setia and Satish K. Tripathi. An analysis of several processor

  2. Performance of Distributed CFAR Processors in Pearson Distributed Clutter

    OpenAIRE

    Messali Zoubeida; Soltani Faouzi

    2007-01-01

    This paper deals with the distributed constant false alarm rate (CFAR) radar detection of targets embedded in heavy-tailed Pearson distributed clutter. In particular, we extend the results obtained for the cell averaging (CA), order statistics (OS), and censored mean level CMLD CFAR processors operating in positive alpha-stable (P&S) random variables to more general situations, specifically to the presence of interfering targets and distributed CFAR detectors. The receiver operating ...

  3. A Geometric Algebra Co-Processor for Color Edge Detection

    Directory of Open Access Journals (Sweden)

    Biswajit Mishra

    2015-01-01

    Full Text Available This paper describes advancement in color edge detection, using a dedicated Geometric Algebra (GA co-processor implemented on an Application Specific Integrated Circuit (ASIC. GA provides a rich set of geometric operations, giving the advantage that many signal and image processing operations become straightforward and the algorithms intuitive to design. The use of GA allows images to be represented with the three R, G, B color channels defined as a single entity, rather than separate quantities. A novel custom ASIC is proposed and fabricated that directly targets GA operations and results in significant performance improvement for color edge detection. Use of the hardware described in this paper also shows that the convolution operation with the rotor masks within GA belongs to a class of linear vector filters and can be applied to image or speech signals. The contribution of the proposed approach has been demonstrated by implementing three different types of edge detection schemes on the proposed hardware. The overall performance gains using the proposed GA Co-Processor over existing software approaches are more than 3.2× faster than GAIGEN and more than 2800× faster than GABLE. The performance of the fabricated GA co-processor is approximately an order of magnitude faster than previously published results for hardware implementations.

  4. Broadband set-top box using MAP-CA processor

    Science.gov (United States)

    Bush, John E.; Lee, Woobin; Basoglu, Chris

    2001-12-01

    Advances in broadband access are expected to exert a profound impact in our everyday life. It will be the key to the digital convergence of communication, computer and consumer equipment. A common thread that facilitates this convergence comprises digital media and Internet. To address this market, Equator Technologies, Inc., is developing the Dolphin broadband set-top box reference platform using its MAP-CA Broadband Signal ProcessorT chip. The Dolphin reference platform is a universal media platform for display and presentation of digital contents on end-user entertainment systems. The objective of the Dolphin reference platform is to provide a complete set-top box system based on the MAP-CA processor. It includes all the necessary hardware and software components for the emerging broadcast and the broadband digital media market based on IP protocols. Such reference design requires a broadband Internet access and high-performance digital signal processing. By using the MAP-CA processor, the Dolphin reference platform is completely programmable, allowing various codecs to be implemented in software, such as MPEG-2, MPEG-4, H.263 and proprietary codecs. The software implementation also enables field upgrades to keep pace with evolving technology and industry demands.

  5. Monitoring the performance of off-site processors

    International Nuclear Information System (INIS)

    Miller, C.C.

    1995-01-01

    Commercial nuclear power plants have been able to utilize the latest technologies and achieve large volume reduction by obtaining off-site waste processor services. Although the use of such services reduce the burden of waste processing it also reduces the utility's control over the process. Monitoring the performance of off-site processors is important so that the utility is cognizant of the waste disposition for required regulatory reporting. In addition to obtaining data for Reg Guide 1.21 reporting, Performance monitoring is important to determine which vendor and which services to utilize. Off-site processor services were initially offered for the decontamination of metallic waste. Since that time the list of services has expanded to include supercompaction, survey for release, incineration and metal melting. The number of vendors offering off-site services has increased and the services they offer vary. processing rates vary between vendors and have different charge bases. Determining which vendor to use for what service can be complicated and confusing

  6. Evaluation of the Intel Westmere-EX server processor

    CERN Document Server

    Jarp, S; Leduc, J; Nowak, A; CERN. Geneva. IT Department

    2011-01-01

    One year after the arrival of the Intel Xeon 7500 systems (“Nehalem-EX”), CERN openlab is presenting a set of benchmark results obtained when running on the new Xeon E7-4870 Processors, representing the “Westmere-EX” family. A modern 4-socket, 40-core system is confronted with the previous generation of expandable (“EX”) platforms, represented by a 4-socket, 32-core Intel Xeon X7560 based system – both being “top of the line” systems. Benchmarking of modern processors is a very complex affair. One has to control (at least) the following features: processor frequency, overclocking via Turbo mode, the number of physical cores in use, the use of logical cores via Symmetric MultiThreading (SMT), the cache sizes available, the configured memory topology, as well as the power configuration if throughput per watt is to be measured. As in previous activities, we have tried to do a good job of comparing like with like. In a “top of the line” comparison based on the HEPSPEC06 benchmark, the “We...

  7. Statistical analysis of quality control of automatic processor

    International Nuclear Information System (INIS)

    Niu Yantao; Zhao Lei; Zhang Wei; Yan Shulin

    2002-01-01

    Objective: To strengthen the scientific management of automatic processor and promote QC, based on analyzing QC management chart for automatic processor by statistical method, evaluating and interpreting the data and trend of the chart. Method: Speed, contrast, minimum density of step wedge of film strip were measured everyday and recorded on the QC chart. Mean (x-bar), standard deviation (s) and range (R) were calculated. The data and the working trend were evaluated and interpreted for management decisions. Results: Using relative frequency distribution curve constructed by measured data, the authors can judge whether it is a symmetric bell-shaped curve or not. If not, it indicates a few extremes overstepping control limits possibly are pulling the curve to the left or right. If it is a normal distribution, standard deviation (s) is observed. When x-bar +- 2s lies in upper and lower control limits of relative performance indexes, it indicates the processor works in stable status in this period. Conclusion: Guided by statistical method, QC work becomes more scientific and quantified. The authors can deepen understanding and application of the trend chart, and improve the quality management to a new step

  8. A fast inner product processor based on equal alignments

    Energy Technology Data Exchange (ETDEWEB)

    Smith, S.P.; Torng, H.C.

    1985-11-01

    Inner product computation is an important operation, invoked repeatedly in matrix multiplications. A high-speed inner product processor can be very useful (among many possible applications) in real-time signal processing. This paper presents the design of a fast inner product processor, with appreciably reduced latency and cost. The inner product processor is implemented with a tree of carry-propagate or carry-save adders; this structure is obtained with the incorporation of three innovations in the conventional multiply/add tree: The leaf-multipliers are expanded into adder subtrees, thus achieving an O(log Nb) latency, where N denotes the number of elements in a vector and b the number of bits in each element. The partial products, to be summed in producing an inner product, are reordered according to their ''minimum alignments.'' This reordering brings approximately a 20% savings in hardware-including adders and data paths. The reduction in adder widths also yields savings in carry propagation time for carry-propagate adders. For trees implemented with carry-save adders, the partial product reordering also serves to truncate the carry propagation chain in the final propagation stage by 2 log b - 1 positions, thus significantly reducing the latency further. A form of the Baugh and Wooley algorithm is adopted to implement two's complement notation with changes only in peripheral hardware.

  9. The hardware track finder processor in CMS at CERN

    International Nuclear Information System (INIS)

    Kluge, A.

    1997-07-01

    The work covers the design of the Track Finder Processor in the high energy experiment CMS at CERN/Geneva. The task of this processor is to identify muons and to measure their transverse momentum. The Track Finder makes it possible to determine the physical relevance of each high energetic collision and to forward only interesting data to the data analysis units. Data of more than two hundred thousand detector cells are used to determine the location of muons and to measure their transverse momentum. Each 25 ns a new data set is generated. Measurement of location and transverse momentum of the muons can be terminated within 350 ns by using an ASIC. The classical method in high energy physics experiments is to employ a pattern comparison method. The predefined patterns are compared to the found patterns. The high number of data channels and the complex requirements to the spatial detector resolution do not permit to employ a pattern comparison method. A so called track following algorithm was designed, which is able to assemble complete tracks through the whole detector starting from single track segments. Instead of storing a high number of track patterns the problem is brought back to the algorithm level. Comprehensive simulations, employing the hardware simulation language VHDL, were conducted in order to optimize the algorithm and its hardware implementation. A FPGA (field program able gate array)-prototype was designed. A feasibility study to implement the track finder processor employing ASICs was conducted. (author)

  10. 3081/E processor and its on-line use

    International Nuclear Information System (INIS)

    Rankin, P.; Bricaud, B.; Gravina, M.

    1985-05-01

    The 3081/E is a second generation emulator of a mainframe IBM. One of it's applications will be to form part of the data acquisition system of the upgraded Mark II detector for data taking at the SLAC linear collider. Since the processor does not have direct connections to I/O devices a FASTBUS interface will be provided to allow communication with both SLAC Scanner Processors (which are responsible for the accumulation of data at a crate level) and the experiment's VAX 8600 mainframe. The 3081/E's will supply a significant amount of on-line computing power to the experiment (a single 3081/E is equivalent to 4 to 5 VAX 11/780's). A major advantage of the 3081/E is that program development can be done on an IBM mainframe (such as the one used for off-line analysis) which gives the programmer access to a full range of debugging tools. The processor's performance can be continually monitored by comparison of the results obtained using it to those given when the same program is run on an IBM computer. 9 refs

  11. Performance of Artificial Intelligence Workloads on the Intel Core 2 Duo Series Desktop Processors

    Directory of Open Access Journals (Sweden)

    Abdul Kareem PARCHUR

    2010-12-01

    Full Text Available As the processor architecture becomes more advanced, Intel introduced its Intel Core 2 Duo series processors. Performance impact on Intel Core 2 Duo processors are analyzed using SPEC CPU INT 2006 performance numbers. This paper studied the behavior of Artificial Intelligence (AI benchmarks on Intel Core 2 Duo series processors. Moreover, we estimated the task completion time (TCT @1 GHz, @2 GHz and @3 GHz Intel Core 2 Duo series processors frequency. Our results show the performance scalability in Intel Core 2 Duo series processors. Even though AI benchmarks have similar execution time, they have dissimilar characteristics which are identified using principal component analysis and dendogram. As the processor frequency increased from 1.8 GHz to 3.167 GHz the execution time is decreased by ~370 sec for AI workloads. In the case of Physics/Quantum Computing programs it was ~940 sec.

  12. The Heidelberg POLYP - a flexible and fault-tolerant poly-processor

    International Nuclear Information System (INIS)

    Maenner, R.; Deluigi, B.

    1981-01-01

    The Heidelberg poly-processor system POLYP is described. It is intended to be used in nuclear physics for reprocessing of experimental data, in high energy physics as second-stage trigger processor, and generally in other applications requiring high-computing power. The POLYP system consists of any number of I/O-processors, processor modules (eventually of different types), global memory segments, and a host processor. All modules (up to several hundred) are connected by a multiple common-data-bus system; all processors, additionally, by a multiple sync bus system for processor/task-scheduling. All hard- and software is designed to be decentralized and free of bottle-necks. Most hardware-faults like single-bit errors in memory or multi-bit errors during transfers are automatically corrected. Defective modules, buses, etc., can be removed with only a graceful degradation of the system-throughput. (orig.)

  13. CASPER: Embedding Power Estimation and Hardware-Controlled Power Management in a Cycle-Accurate Micro-Architecture Simulation Platform for Many-Core Multi-Threading Heterogeneous Processors

    Directory of Open Access Journals (Sweden)

    Arun Ravindran

    2012-02-01

    Full Text Available Despite the promising performance improvement observed in emerging many-core architectures in high performance processors, high power consumption prohibitively affects their use and marketability in the low-energy sectors, such as embedded processors, network processors and application specific instruction processors (ASIPs. While most chip architects design power-efficient processors by finding an optimal power-performance balance in their design, some use sophisticated on-chip autonomous power management units, which dynamically reduce the voltage or frequencies of idle cores and hence extend battery life and reduce operating costs. For large scale designs of many-core processors, a holistic approach integrating both these techniques at different levels of abstraction can potentially achieve maximal power savings. In this paper we present CASPER, a robust instruction trace driven cycle-accurate many-core multi-threading micro-architecture simulation platform where we have incorporated power estimation models of a wide variety of tunable many-core micro-architectural design parameters, thus enabling processor architects to explore a sufficiently large design space and achieve power-efficient designs. Additionally CASPER is designed to accommodate cycle-accurate models of hardware controlled power management units, enabling architects to experiment with and evaluate different autonomous power-saving mechanisms to study the run-time power-performance trade-offs in embedded many-core processors. We have implemented two such techniques in CASPER–Chipwide Dynamic Voltage and Frequency Scaling, and Performance Aware Core-Specific Frequency Scaling, which show average power savings of 35.9% and 26.2% on a baseline 4-core SPARC based architecture respectively. This power saving data accounts for the power consumption of the power management units themselves. The CASPER simulation platform also provides users with complete support of SPARCV9

  14. Investigating the effectiveness of many-core network processors for high performance cyber protection systems. Part I, FY2011.

    Energy Technology Data Exchange (ETDEWEB)

    Wheeler, Kyle Bruce; Naegle, John Hunt; Wright, Brian J.; Benner, Robert E., Jr.; Shelburg, Jeffrey Scott; Pearson, David Benjamin; Johnson, Joshua Alan; Onunkwo, Uzoma A.; Zage, David John; Patel, Jay S.

    2011-09-01

    This report documents our first year efforts to address the use of many-core processors for high performance cyber protection. As the demands grow for higher bandwidth (beyond 1 Gbits/sec) on network connections, the need to provide faster and more efficient solution to cyber security grows. Fortunately, in recent years, the development of many-core network processors have seen increased interest. Prior working experiences with many-core processors have led us to investigate its effectiveness for cyber protection tools, with particular emphasis on high performance firewalls. Although advanced algorithms for smarter cyber protection of high-speed network traffic are being developed, these advanced analysis techniques require significantly more computational capabilities than static techniques. Moreover, many locations where cyber protections are deployed have limited power, space and cooling resources. This makes the use of traditionally large computing systems impractical for the front-end systems that process large network streams; hence, the drive for this study which could potentially yield a highly reconfigurable and rapidly scalable solution.

  15. A FPGA-Based, Granularity-Variable Neuromorphic Processor and Its Application in a MIMO Real-Time Control System.

    Science.gov (United States)

    Zhang, Zhen; Ma, Cheng; Zhu, Rong

    2017-08-23

    Artificial Neural Networks (ANNs), including Deep Neural Networks (DNNs), have become the state-of-the-art methods in machine learning and achieved amazing success in speech recognition, visual object recognition, and many other domains. There are several hardware platforms for developing accelerated implementation of ANN models. Since Field Programmable Gate Array (FPGA) architectures are flexible and can provide high performance per watt of power consumption, they have drawn a number of applications from scientists. In this paper, we propose a FPGA-based, granularity-variable neuromorphic processor (FBGVNP). The traits of FBGVNP can be summarized as granularity variability, scalability, integrated computing, and addressing ability: first, the number of neurons is variable rather than constant in one core; second, the multi-core network scale can be extended in various forms; third, the neuron addressing and computing processes are executed simultaneously. These make the processor more flexible and better suited for different applications. Moreover, a neural network-based controller is mapped to FBGVNP and applied in a multi-input, multi-output, (MIMO) real-time, temperature-sensing and control system. Experiments validate the effectiveness of the neuromorphic processor. The FBGVNP provides a new scheme for building ANNs, which is flexible, highly energy-efficient, and can be applied in many areas.

  16. A FPGA-Based, Granularity-Variable Neuromorphic Processor and Its Application in a MIMO Real-Time Control System

    Directory of Open Access Journals (Sweden)

    Zhen Zhang

    2017-08-01

    Full Text Available Artificial Neural Networks (ANNs, including Deep Neural Networks (DNNs, have become the state-of-the-art methods in machine learning and achieved amazing success in speech recognition, visual object recognition, and many other domains. There are several hardware platforms for developing accelerated implementation of ANN models. Since Field Programmable Gate Array (FPGA architectures are flexible and can provide high performance per watt of power consumption, they have drawn a number of applications from scientists. In this paper, we propose a FPGA-based, granularity-variable neuromorphic processor (FBGVNP. The traits of FBGVNP can be summarized as granularity variability, scalability, integrated computing, and addressing ability: first, the number of neurons is variable rather than constant in one core; second, the multi-core network scale can be extended in various forms; third, the neuron addressing and computing processes are executed simultaneously. These make the processor more flexible and better suited for different applications. Moreover, a neural network-based controller is mapped to FBGVNP and applied in a multi-input, multi-output, (MIMO real-time, temperature-sensing and control system. Experiments validate the effectiveness of the neuromorphic processor. The FBGVNP provides a new scheme for building ANNs, which is flexible, highly energy-efficient, and can be applied in many areas.

  17. Graphical user interface for TOUGH/TOUGH2 - development of database, pre-processor, and post-processor

    Energy Technology Data Exchange (ETDEWEB)

    Sato, Tatsuya; Okabe, Takashi; Osato, Kazumi [Geothermal Energy Research and Development Co., Ltd., Tokyo (Japan)

    1995-03-01

    One of the advantages of the TOUGH/TOUGH2 (Pruess, 1987 and 1991) is the modeling using {open_quotes}free shape{close_quotes} polygonal blocks. However, the treatment of three-dimensional information, particularly for TOUGH/TOUGH2 is not easy because of the {open_quotes}free shape{close_quotes} polygonal blocks. Therefore, we have developed a database named {open_quotes}GEOBASE{close_quotes} and a pre/post-processor named {open_quotes}GEOGRAPH{close_quotes} for TOUGH/TOUGH2 on engineering work station (EWS). {open_quotes}GEOGRAPH{close_quotes} is based on the ORACLE{sup *1} relational database manager system to access data sets of surface exploration (geology, geophysics, geochemistry, etc.), drilling (well trajectory, geological column, logging, etc.), well testing (production test, injection test, interference test, tracer test, etc.) and production/injection history.{open_quotes}GEOGRAPH{close_quotes} consists of {open_quotes}Pre-processor{close_quotes} that can construct the three-dimensional free shape reservoir modeling by mouse operation on X-window and {open_quotes}Post-processor{close_quotes} that can display several kinds of two/three-dimensional maps and X-Y plots to compile data on {open_quotes}GEOBASE{close_quotes} and result of TOUGH/TOUGH2 calculation. This paper shows concept of the systems and examples of utilization.

  18. Real Time Phase Noise Meter Based on a Digital Signal Processor

    Science.gov (United States)

    Angrisani, Leopoldo; D'Arco, Mauro; Greenhall, Charles A.; Schiano Lo Morille, Rosario

    2006-01-01

    A digital signal-processing meter for phase noise measurement on sinusoidal signals is dealt with. It enlists a special hardware architecture, made up of a core digital signal processor connected to a data acquisition board, and takes advantage of a quadrature demodulation-based measurement scheme, already proposed by the authors. Thanks to an efficient measurement process and an optimized implementation of its fundamental stages, the proposed meter succeeds in exploiting all hardware resources in such an effective way as to gain high performance and real-time operation. For input frequencies up to some hundreds of kilohertz, the meter is capable both of updating phase noise power spectrum while seamlessly capturing the analyzed signal into its memory, and granting as good frequency resolution as few units of hertz.

  19. Real time implementation of a linear predictive coding algorithm on digital signal processor DSP32C

    International Nuclear Information System (INIS)

    Sheikh, N.M.; Usman, S.R.; Fatima, S.

    2002-01-01

    Pulse Code Modulation (PCM) has been widely used in speech coding. However, due to its high bit rate. PCM has severe limitations in application where high spectral efficiency is desired, for example, in mobile communication, CD quality broadcasting system etc. These limitation have motivated research in bit rate reduction techniques. Linear predictive coding (LPC) is one of the most powerful complex techniques for bit rate reduction. With the introduction of powerful digital signal processors (DSP) it is possible to implement the complex LPC algorithm in real time. In this paper we present a real time implementation of the LPC algorithm on AT and T's DSP32C at a sampling frequency of 8192 HZ. Application of the LPC algorithm on two speech signals is discussed. Using this implementation , a bit rate reduction of 1:3 is achieved for better than tool quality speech, while a reduction of 1.16 is possible for speech quality required in military applications. (author)

  20. Performance of the AMBFTK board for the FastTracker processor for the ATLAS detector upgrade

    International Nuclear Information System (INIS)

    Alberti, F; Citterio, M; Liberali, V; Meroni, C; Andreani, A; Stabile, A; Annovi, A; Beretta, M; Crescioli, F; Dell'Orso, M; Piendibene, M; Volpi, G; Giannetti, P; Lanza, A; Magalotti, D; Sacco, I

    2013-01-01

    Modern experiments at hadron colliders search for extremely rare processes hidden in a very large background. As the experiment complexity and the accelerator backgrounds and luminosity increase we need increasingly complex and exclusive selections. The FastTracker (FTK) processor for the ATLAS experiment offers extremely powerful, very compact and low power consumption processing units for the future, which is essential for increased efficiency and purity in the Level 2 trigger selection through the intensive use of tracking. Pattern recognition is performed with Associative Memories (AM). The AMBFTK board and the AMchip04 integrated circuit have been designed specifically for this purpose. We report on the preliminary test results of the first prototypes of the AMBFTK board and of the AMchip04.

  1. A Track Reconstructing Low-latency Trigger Processor for High-energy Physics

    CERN Document Server

    AUTHOR|(CDS)2067518

    2009-01-01

    The detection and analysis of the large number of particles emerging from high-energy collisions between atomic nuclei is a major challenge in experimental heavy-ion physics. Efficient trigger systems help to focus the analysis on relevant events. A primary objective of the Transition Radiation Detector of the ALICE experiment at the LHC is to trigger on high-momentum electrons. In this thesis, a trigger processor is presented that employs massive parallelism to perform the required online event reconstruction within 2 µs to contribute to the Level-1 trigger decision. Its three-stage hierarchical architecture comprises 109 nodes based on FPGA technology. Ninety processing nodes receive data from the detector front-end at an aggregate net bandwidth of 2.16 Tbps via 1080 optical links. Using specifically developed components and interconnections, the system combines high bandwidth with minimum latency. The employed tracking algorithm three-dimensionally reassembles the track segments found in the detector's dr...

  2. Heat dissipation for the Intel Core i5 processor using multiwalled carbon-nanotube-based ethylene glycol

    Energy Technology Data Exchange (ETDEWEB)

    Thang, Bui Hung; Trinh, Pham Van; Quang, Le Dinh; Khoi, Phan Hong; Minh, Phan Ngoc [Vietnam Academy of Science and Technology, Ho Chi Minh CIty (Viet Nam); Huong, Nguyen Thi [Hanoi University of Science, Hanoi (Viet Nam); Vietnam National University, Hanoi (Viet Nam)

    2014-08-15

    Carbon nanotubes (CNTs) are some of the most valuable materials with high thermal conductivity. The thermal conductivity of individual multiwalled carbon nanotubes (MWCNTs) grown by using chemical vapor deposition is 600 ± 100 Wm{sup -1}K{sup -1} compared with the thermal conductivity 419 Wm{sup -1}K{sup -1} of Ag. Carbon-nanotube-based liquids - a new class of nanomaterials, have shown many interesting properties and distinctive features offering potential in heat dissipation applications for electronic devices, such as computer microprocessor, high power LED, etc. In this work, a multiwalled carbon-nanotube-based liquid was made of well-dispersed hydroxyl-functional multiwalled carbon nanotubes (MWCNT-OH) in ethylene glycol (EG)/distilled water (DW) solutions by using Tween-80 surfactant and an ultrasonication method. The concentration of MWCNT-OH in EG/DW solutions ranged from 0.1 to 1.2 gram/liter. The dispersion of the MWCNT-OH-based EG/DW solutions was evaluated by using a Zeta-Sizer analyzer. The MWCNT-OH-based EG/DW solutions were used as coolants in the liquid cooling system for the Intel Core i5 processor. The thermal dissipation efficiency and the thermal response of the system were evaluated by directly measuring the temperature of the micro-processor using the Core Temp software and the temperature sensors built inside the micro-processor. The results confirmed the advantages of CNTs in thermal dissipation systems for computer processors and other high-power electronic devices.

  3. A track reconstructing low-latency trigger processor for high-energy physics

    International Nuclear Information System (INIS)

    Cuveland, Jan de

    2009-01-01

    The detection and analysis of the large number of particles emerging from high-energy collisions between atomic nuclei is a major challenge in experimental heavy-ion physics. Efficient trigger systems help to focus the analysis on relevant events. A primary objective of the Transition Radiation Detector of the ALICE experiment at the LHC is to trigger on high-momentum electrons. In this thesis, a trigger processor is presented that employs massive parallelism to perform the required online event reconstruction within 2 μs to contribute to the Level-1 trigger decision. Its three-stage hierarchical architecture comprises 109 nodes based on FPGA technology. Ninety processing nodes receive data from the detector front-end at an aggregate net bandwidth of 2.16 Tbit/s via 1080 optical links. Using specifically developed components and interconnections, the system combines high bandwidth with minimum latency. The employed tracking algorithm three-dimensionally reassembles the track segments found in the detector's drift chambers based on explicit value comparisons, calculates the momentum of the originating particles from the course of the reconstructed tracks, and finally leads to a trigger decision. The architecture is capable of processing up to 20 000 track segments in less than 2 μs with high detection efficiency and reconstruction precision for high-momentum particles. As a result, this thesis shows how a trigger processor performing complex online track reconstruction within tight real-time requirements can be realized. The presented hardware has been built and is in continuous data taking operation in the ALICE experiment. (orig.)

  4. A track reconstructing low-latency trigger processor for high-energy physics

    Energy Technology Data Exchange (ETDEWEB)

    Cuveland, Jan de

    2009-09-17

    The detection and analysis of the large number of particles emerging from high-energy collisions between atomic nuclei is a major challenge in experimental heavy-ion physics. Efficient trigger systems help to focus the analysis on relevant events. A primary objective of the Transition Radiation Detector of the ALICE experiment at the LHC is to trigger on high-momentum electrons. In this thesis, a trigger processor is presented that employs massive parallelism to perform the required online event reconstruction within 2 {mu}s to contribute to the Level-1 trigger decision. Its three-stage hierarchical architecture comprises 109 nodes based on FPGA technology. Ninety processing nodes receive data from the detector front-end at an aggregate net bandwidth of 2.16 Tbit/s via 1080 optical links. Using specifically developed components and interconnections, the system combines high bandwidth with minimum latency. The employed tracking algorithm three-dimensionally reassembles the track segments found in the detector's drift chambers based on explicit value comparisons, calculates the momentum of the originating particles from the course of the reconstructed tracks, and finally leads to a trigger decision. The architecture is capable of processing up to 20 000 track segments in less than 2 {mu}s with high detection efficiency and reconstruction precision for high-momentum particles. As a result, this thesis shows how a trigger processor performing complex online track reconstruction within tight real-time requirements can be realized. The presented hardware has been built and is in continuous data taking operation in the ALICE experiment. (orig.)

  5. Implementation of 4-way Superscalar Hash MIPS Processor Using FPGA

    Science.gov (United States)

    Sahib Omran, Safaa; Fouad Jumma, Laith

    2018-05-01

    Due to the quick advancements in the personal communications systems and wireless communications, giving data security has turned into a more essential subject. This security idea turns into a more confounded subject when next-generation system requirements and constant calculation speed are considered in real-time. Hash functions are among the most essential cryptographic primitives and utilized as a part of the many fields of signature authentication and communication integrity. These functions are utilized to acquire a settled size unique fingerprint or hash value of an arbitrary length of message. In this paper, Secure Hash Algorithms (SHA) of types SHA-1, SHA-2 (SHA-224, SHA-256) and SHA-3 (BLAKE) are implemented on Field-Programmable Gate Array (FPGA) in a processor structure. The design is described and implemented using a hardware description language, namely VHSIC “Very High Speed Integrated Circuit” Hardware Description Language (VHDL). Since the logical operation of the hash types of (SHA-1, SHA-224, SHA-256 and SHA-3) are 32-bits, so a Superscalar Hash Microprocessor without Interlocked Pipelines (MIPS) processor are designed with only few instructions that were required in invoking the desired Hash algorithms, when the four types of hash algorithms executed sequentially using the designed processor, the total time required equal to approximately 342 us, with a throughput of 4.8 Mbps while the required to execute the same four hash algorithms using the designed four-way superscalar is reduced to 237 us with improved the throughput to 5.1 Mbps.

  6. GPU: the biggest key processor for AI and parallel processing

    Science.gov (United States)

    Baji, Toru

    2017-07-01

    Two types of processors exist in the market. One is the conventional CPU and the other is Graphic Processor Unit (GPU). Typical CPU is composed of 1 to 8 cores while GPU has thousands of cores. CPU is good for sequential processing, while GPU is good to accelerate software with heavy parallel executions. GPU was initially dedicated for 3D graphics. However from 2006, when GPU started to apply general-purpose cores, it was noticed that this architecture can be used as a general purpose massive-parallel processor. NVIDIA developed a software framework Compute Unified Device Architecture (CUDA) that make it possible to easily program the GPU for these application. With CUDA, GPU started to be used in workstations and supercomputers widely. Recently two key technologies are highlighted in the industry. The Artificial Intelligence (AI) and Autonomous Driving Cars. AI requires a massive parallel operation to train many-layers of neural networks. With CPU alone, it was impossible to finish the training in a practical time. The latest multi-GPU system with P100 makes it possible to finish the training in a few hours. For the autonomous driving cars, TOPS class of performance is required to implement perception, localization, path planning processing and again SoC with integrated GPU will play a key role there. In this paper, the evolution of the GPU which is one of the biggest commercial devices requiring state-of-the-art fabrication technology will be introduced. Also overview of the GPU demanding key application like the ones described above will be introduced.

  7. An Alternative Water Processor for Long Duration Space Missions

    Science.gov (United States)

    Barta, Daniel J.; Pickering, Karen D.; Meyer, Caitlin; Pennsinger, Stuart; Vega, Leticia; Flynn, Michael; Jackson, Andrew; Wheeler, Raymond

    2014-01-01

    A new wastewater recovery system has been developed that combines novel biological and physicochemical components for recycling wastewater on long duration human space missions. Functionally, this Alternative Water Processor (AWP) would replace the Urine Processing Assembly on the International Space Station and reduce or eliminate the need for the multi-filtration beds of the Water Processing Assembly (WPA). At its center are two unique game changing technologies: 1) a biological water processor (BWP) to mineralize organic forms of carbon and nitrogen and 2) an advanced membrane processor (Forward Osmosis Secondary Treatment) for removal of solids and inorganic ions. The AWP is designed for recycling larger quantities of wastewater from multiple sources expected during future exploration missions, including urine, hygiene (hand wash, shower, oral and shave) and laundry. The BWP utilizes a single-stage membrane-aerated biological reactor for simultaneous nitrification and denitrification. The Forward Osmosis Secondary Treatment (FOST) system uses a combination of forward osmosis (FO) and reverse osmosis (RO), is resistant to biofouling and can easily tolerate wastewaters high in non-volatile organics and solids associated with shower and/or hand washing. The BWP has been operated continuously for over 300 days. After startup, the mature biological system averaged 85% organic carbon removal and 44% nitrogen removal, close to stoichiometric maximum based on available carbon. To date, the FOST has averaged 93% water recovery, with a maximum of 98%. If the wastewater is slighty acidified, ammonia rejection is optimal. This paper will provide a description of the technology and summarize results from ground-based testing using real wastewater

  8. A Real-Time Sound Field Rendering Processor

    Directory of Open Access Journals (Sweden)

    Tan Yiyu

    2017-12-01

    Full Text Available Real-time sound field renderings are computationally intensive and memory-intensive. Traditional rendering systems based on computer simulations suffer from memory bandwidth and arithmetic units. The computation is time-consuming, and the sample rate of the output sound is low because of the long computation time at each time step. In this work, a processor with a hybrid architecture is proposed to speed up computation and improve the sample rate of the output sound, and an interface is developed for system scalability through simply cascading many chips to enlarge the simulated area. To render a three-minute Beethoven wave sound in a small shoe-box room with dimensions of 1.28 m × 1.28 m × 0.64 m, the field programming gate array (FPGA-based prototype machine with the proposed architecture carries out the sound rendering at run-time while the software simulation with the OpenMP parallelization takes about 12.70 min on a personal computer (PC with 32 GB random access memory (RAM and an Intel i7-6800K six-core processor running at 3.4 GHz. The throughput in the software simulation is about 194 M grids/s while it is 51.2 G grids/s in the prototype machine even if the clock frequency of the prototype machine is much lower than that of the PC. The rendering processor with a processing element (PE and interfaces consumes about 238,515 gates after fabricated by the 0.18 µm processing technology from the ROHM semiconductor Co., Ltd. (Kyoto Japan, and the power consumption is about 143.8 mW.

  9. FPGA Based Intelligent Co-operative Processor in Memory Architecture

    DEFF Research Database (Denmark)

    Ahmed, Zaki; Sotudeh, Reza; Hussain, Dil Muhammad Akbar

    2011-01-01

    benefits of PIM, a concept of Co-operative Intelligent Memory (CIM) was developed by the intelligent system group of University of Hertfordshire, based on the previously developed Co-operative Pseudo Intelligent Memory (CPIM). This paper provides an overview on previous works (CPIM, CIM) and realization......In a continuing effort to improve computer system performance, Processor-In-Memory (PIM) architecture has emerged as an alternative solution. PIM architecture incorporates computational units and control logic directly on the memory to provide immediate access to the data. To exploit the potential...

  10. Does the Intel Xeon Phi processor fit HEP workloads?

    Science.gov (United States)

    Nowak, A.; Bitzes, G.; Dotti, A.; Lazzaro, A.; Jarp, S.; Szostek, P.; Valsan, L.; Botezatu, M.; Leduc, J.

    2014-06-01

    This paper summarizes the five years of CERN openlab's efforts focused on the Intel Xeon Phi co-processor, from the time of its inception to public release. We consider the architecture of the device vis a vis the characteristics of HEP software and identify key opportunities for HEP processing, as well as scaling limitations. We report on improvements and speedups linked to parallelization and vectorization on benchmarks involving software frameworks such as Geant4 and ROOT. Finally, we extrapolate current software and hardware trends and project them onto accelerators of the future, with the specifics of offline and online HEP processing in mind.

  11. High performance deformable image registration algorithms for manycore processors

    CERN Document Server

    Shackleford, James; Sharp, Gregory

    2013-01-01

    High Performance Deformable Image Registration Algorithms for Manycore Processors develops highly data-parallel image registration algorithms suitable for use on modern multi-core architectures, including graphics processing units (GPUs). Focusing on deformable registration, we show how to develop data-parallel versions of the registration algorithm suitable for execution on the GPU. Image registration is the process of aligning two or more images into a common coordinate frame and is a fundamental step to be able to compare or fuse data obtained from different sensor measurements. E

  12. Safety-critical Java on a Java processor

    DEFF Research Database (Denmark)

    Schoeberl, Martin; Rios Rivas, Juan Ricardo

    2012-01-01

    The safety-critical Java (SCJ) specification is developed within the Java Community Process under specification request number JSR 302. The specification is available as public draft, but details are still discussed by the expert group. In this stage of the specification we need prototype...... implementations of SCJ and first test applications that are written with SCJ, even when the specification is not finalized. The feedback from those prototype implementations is needed for final decisions. To help the SCJ expert group, a prototype implementation of SCJ on top of the Java optimized processor...

  13. Application of digital beam position processor Libera on tune measurement

    International Nuclear Information System (INIS)

    Zhang Chunhui; Sun Baogen; Cao Yong; Lu Ping; Li Jihao

    2006-01-01

    Digital signal processing (DSP) is widely used in the field of beam diagnostics. Especially, DSP achieves very good performance in beam position signal analysis and betatron tune measurement. In Hefei light source, when beam was excited by narrow-band Gaussian white nose, Libera, a digital beam position processor, was used to process the signals from beam position monitor (BPM), which contained betatron oscillation. Fast Fourier transform (FFT) was applied to finding out betatron resonance frequency, from which the decimal part of betatron oscillation tune was calculated. By this means, the measure of horizontal tune was 3.5352 and the measure of vertical tune is 2.6299. (authors)

  14. Does the Intel Xeon Phi processor fit HEP workloads?

    International Nuclear Information System (INIS)

    Nowak, A; Bitzes, G; Dotti, A; Lazzaro, A; Jarp, S; Szostek, P; Valsan, L; Botezatu, M; Leduc, J

    2014-01-01

    This paper summarizes the five years of CERN openlab's efforts focused on the Intel Xeon Phi co-processor, from the time of its inception to public release. We consider the architecture of the device vis a vis the characteristics of HEP software and identify key opportunities for HEP processing, as well as scaling limitations. We report on improvements and speedups linked to parallelization and vectorization on benchmarks involving software frameworks such as Geant4 and ROOT. Finally, we extrapolate current software and hardware trends and project them onto accelerators of the future, with the specifics of offline and online HEP processing in mind.

  15. An introduction to programming multiple-processor computers

    International Nuclear Information System (INIS)

    Hicks, H.R.; Lynch, V.E.

    1986-01-01

    Fortran applications programs can be executed on multiprocessor computers in either a unitasking (traditional) or multitasking form. The later allows a single job to use more than one processor simultaneously, with a consequent reduction in elapsed time and, perhaps, the cost of the calculation. An introduction to programming in this environment is presented. The concept of synchronization and data sharing using EVENTS and LOCKS are illustrated with examples. The strategy of strong synchronization and the use of synchronization templates are proposed. We emphasize that incorrect multitasking programs can produce irreducible results, which makes debugging more difficult

  16. Power estimation on functional level for programmable processors

    Directory of Open Access Journals (Sweden)

    M. Schneider

    2004-01-01

    Full Text Available In diesem Beitrag werden verschiedene Ansätze zur Verlustleistungsschätzung von programmierbaren Prozessoren vorgestellt und bezüglich ihrer Übertragbarkeit auf moderne Prozessor-Architekturen wie beispielsweise Very Long Instruction Word (VLIW-Architekturen bewertet. Besonderes Augenmerk liegt hierbei auf dem Konzept der sogenannten Functional-Level Power Analysis (FLPA. Dieser Ansatz basiert auf der Einteilung der Prozessor-Architektur in funktionale Blöcke wie beispielsweise Processing-Unit, Clock-Netzwerk, interner Speicher und andere. Die Verlustleistungsaufnahme dieser Bl¨ocke wird parameterabhängig durch arithmetische Modellfunktionen beschrieben. Durch automatisierte Analyse von Assemblercodes des zu schätzenden Systems mittels eines Parsers können die Eingangsparameter wie beispielsweise der erzielte Parallelitätsgrad oder die Art des Speicherzugriffs gewonnen werden. Dieser Ansatz wird am Beispiel zweier moderner digitaler Signalprozessoren durch eine Vielzahl von Basis-Algorithmen der digitalen Signalverarbeitung evaluiert. Die ermittelten Schätzwerte für die einzelnen Algorithmen werden dabei mit physikalisch gemessenen Werten verglichen. Es ergibt sich ein sehr kleiner maximaler Schätzfehler von 3%. In this contribution different approaches for power estimation for programmable processors are presented and evaluated concerning their capability to be applied to modern digital signal processor architectures like e.g. Very Long InstructionWord (VLIW -architectures. Special emphasis will be laid on the concept of so-called Functional-Level Power Analysis (FLPA. This approach is based on the separation of the processor architecture into functional blocks like e.g. processing unit, clock network, internal memory and others. The power consumption of these blocks is described by parameter dependent arithmetic model functions. By application of a parser based automized analysis of assembler codes of the systems to be estimated

  17. Power estimation on functional level for programmable processors

    Science.gov (United States)

    Schneider, M.; Blume, H.; Noll, T. G.

    2004-05-01

    In diesem Beitrag werden verschiedene Ansätze zur Verlustleistungsschätzung von programmierbaren Prozessoren vorgestellt und bezüglich ihrer Übertragbarkeit auf moderne Prozessor-Architekturen wie beispielsweise Very Long Instruction Word (VLIW)-Architekturen bewertet. Besonderes Augenmerk liegt hierbei auf dem Konzept der sogenannten Functional-Level Power Analysis (FLPA). Dieser Ansatz basiert auf der Einteilung der Prozessor-Architektur in funktionale Blöcke wie beispielsweise Processing-Unit, Clock-Netzwerk, interner Speicher und andere. Die Verlustleistungsaufnahme dieser Bl¨ocke wird parameterabhängig durch arithmetische Modellfunktionen beschrieben. Durch automatisierte Analyse von Assemblercodes des zu schätzenden Systems mittels eines Parsers können die Eingangsparameter wie beispielsweise der erzielte Parallelitätsgrad oder die Art des Speicherzugriffs gewonnen werden. Dieser Ansatz wird am Beispiel zweier moderner digitaler Signalprozessoren durch eine Vielzahl von Basis-Algorithmen der digitalen Signalverarbeitung evaluiert. Die ermittelten Schätzwerte für die einzelnen Algorithmen werden dabei mit physikalisch gemessenen Werten verglichen. Es ergibt sich ein sehr kleiner maximaler Schätzfehler von 3%. In this contribution different approaches for power estimation for programmable processors are presented and evaluated concerning their capability to be applied to modern digital signal processor architectures like e.g. Very Long InstructionWord (VLIW) -architectures. Special emphasis will be laid on the concept of so-called Functional-Level Power Analysis (FLPA). This approach is based on the separation of the processor architecture into functional blocks like e.g. processing unit, clock network, internal memory and others. The power consumption of these blocks is described by parameter dependent arithmetic model functions. By application of a parser based automized analysis of assembler codes of the systems to be estimated the input

  18. Formal characterizations of FA-based string processors

    CSIR Research Space (South Africa)

    Ngassam, EK

    2010-08-01

    Full Text Available stream_source_info Ngassam_2010.pdf.txt stream_content_type text/plain stream_size 7434 Content-Encoding UTF-8 stream_name Ngassam_2010.pdf.txt Content-Type text/plain; charset=UTF-8 Formal Characterizations of FA...-based String Processors Ernest Ketcha Ngassam1,2,?, Bruce W. Watson3, and Derrick G. Kourie3 1SAP Meraka UTD, Pretoria, South Africa 2School of Computing University of South Africa Pretoria 0001 ernest.ngassam@sap.com 3Department of Computer Science...

  19. BWR thermohydraulics simulation on the AD-10 peripheral processor

    International Nuclear Information System (INIS)

    Wulff, W.; Cheng, H.S.; Lekach, S.V.; Mallen, A.N.

    1983-01-01

    This presentation demonstrates the feasibility of simulating plant transients and severe abnormal transients in nuclear power plants at much faster than real-time computing speeds in a low-cost, dedicated, interactive minicomputer. This is achieved by implementing advanced modeling techniques in modern, special-purpose peripheral processors for high-speed system simulation. The results of this demonstration will impact safety analyses and parametric studies, studies on operator responses and control system failures and it will make possible the continuous on-line monitoring of plant performance and the detection and diagnosis of system or component failures

  20. Pulses processor modeling of the AR-PET tomograph

    International Nuclear Information System (INIS)

    Martinez Garbino, Lucio J.; Venialgo, E.; Estryk, Daniel S.; Verrastro, Claudio A.

    2009-01-01

    The detection of two gamma photons in time coincidence is the main process in Positron Emission Tomography. The front end processor estimate the energy and the time stamp of each incident gamma photon, the accuracy of such estimation improves the quality of contrast and resolution of final images. In this work a modeling tool of the full detection chain is described. Starting from stochastic generation of light photons, followed by photoelectrons time transit spread inside the photomultiplier, preamplifier response and digitalisation process were modeling and finally, several algorithms of Energy and Time Stamp estimation were evaluated and compared. (author)

  1. Addressing Thermal and Performance Variability Issues in Dynamic Processors

    Energy Technology Data Exchange (ETDEWEB)

    Yoshii, Kazutomo [Argonne National Lab. (ANL), Argonne, IL (United States); Llopis, Pablo [Univ. Carlos III de Madrid (Spain); Zhang, Kaicheng [Northwestern Univ., Evanston, IL (United States); Luo, Yingyi [Northwestern Univ., Evanston, IL (United States); Ogrenci-Memik, Seda [Northwestern Univ., Evanston, IL (United States); Memik, Gokhan [Northwestern Univ., Evanston, IL (United States); Sankaran, Rajesh [Argonne National Lab. (ANL), Argonne, IL (United States); Beckman, Pete [Argonne National Lab. (ANL), Argonne, IL (United States)

    2017-03-01

    As CMOS scaling nears its end, parameter variations (process, temperature and voltage) are becoming a major concern. To overcome parameter variations and provide stability, modern processors are becoming dynamic, opportunistically adjusting voltage and frequency based on thermal and energy constraints, which negatively impacts traditional bulk-synchronous parallelism-minded hardware and software designs. As node-level architecture is growing in complexity, implementing variation control mechanisms only with hardware can be a challenging task. In this paper we investigate a software strategy to manage hardwareinduced variations, leveraging low-level monitoring/controlling mechanisms.

  2. Sn transport calculations on vector and parallel processors

    International Nuclear Information System (INIS)

    Rhoades, W.A.; Childs, R.L.

    1987-01-01

    The transport of radiation from the source to the location of people or equipment gives rise to some of the most challenging of calculations. A problem may involve as many as a billion unknowns, each evaluated several times to resolve interdependence. Such calculations run many hours on a Cray computer, and a typical study involves many such calculations. This paper will discuss the steps taken to vectorize the DOT code, which solves transport problems in two space dimensions (2-D); the extension of this code to 3-D; and the plans for extension to parallel processors

  3. The Level 0 Trigger Processor for the NA62 experiment

    International Nuclear Information System (INIS)

    Chiozzi, S.; Gamberini, E.; Gianoli, A.; Mila, G.; Neri, I.; Petrucci, F.; Soldi, D.

    2016-01-01

    In the NA62 experiment at CERN, the intense flux of particles requires a high-performance trigger for the data acquisition system. A Level 0 Trigger Processor (L0TP) was realized, performing the event selection based on trigger primitives coming from sub-detectors and reducing the trigger rate from 10 to 1 MHz. The L0TP is based on a commercial FPGA device and has been implemented in two different solutions. The performance of the two systems are highlighted and compared.

  4. The Level 0 Trigger Processor for the NA62 experiment

    Energy Technology Data Exchange (ETDEWEB)

    Chiozzi, S. [INFN, Ferrara (Italy); Gamberini, E. [University of Ferrara and INFN, Ferrara (Italy); Gianoli, A. [INFN, Ferrara (Italy); Mila, G. [University of Turin and INFN, Turin (Italy); Neri, I., E-mail: neri@fe.infn.it [University of Ferrara and INFN, Ferrara (Italy); Petrucci, F. [University of Ferrara and INFN, Ferrara (Italy); Soldi, D. [University of Turin and INFN, Turin (Italy)

    2016-07-11

    In the NA62 experiment at CERN, the intense flux of particles requires a high-performance trigger for the data acquisition system. A Level 0 Trigger Processor (L0TP) was realized, performing the event selection based on trigger primitives coming from sub-detectors and reducing the trigger rate from 10 to 1 MHz. The L0TP is based on a commercial FPGA device and has been implemented in two different solutions. The performance of the two systems are highlighted and compared.

  5. Eight-Channel Digital Signal Processor and Universal Trigger Module

    Science.gov (United States)

    Skulski, Wojtek; Wolfs, Frank

    2003-04-01

    A 10-bit, 8-channel, 40 megasamples per second digital signal processor and waveform digitizer DDC-8 (nicknamed Universal Trigger Module) is presented. The digitizer features 8 analog inputs, 1 analog output for a reconstructed analog waveform, 16 NIM logic inputs, 8 NIM logic outputs, and a pool of 16 TTL logic lines which can be individually configured as either inputs or outputs. The first application of this device is to enhance the present trigger electronics for PHOBOS at RHIC. The status of the development and the first results are presented. Possible applications of the new device are discussed. Supported by the NSF grant PHY-0072204.

  6. Development of Softcore Processor based RTU for FBR

    International Nuclear Information System (INIS)

    Gour, Aditya; Santhana Raj, A.; Behera, R.P.; Murali, N.; Swaminathan, P.

    2010-01-01

    Remote Terminal Units (RTU) are used to acquire analog/digital signals and generate potential free contact outputs and send the acquired data through LAN. The aim of this design is to develop a Soft-Core Processor based RTU by implementing the glue logic along with 8051 microcontroller present in existing RTUs into a single FPGA, so that component count and power consumption on the board will be reduced and thereby achieving a higher reliability than before. Implementation of glue logic was done using VHDL and Altium's TSK51 Softcore was used in place of 8051 microcontroller. (author)

  7. The Associative Memory Boards for the FTK Processor at ATLAS

    CERN Document Server

    Calabro, D; The ATLAS collaboration; Citraro, S; Donati, S; Giannetti, P; Lanza, A; Luciano, P; Magalotti, D; Piendibene, M

    2013-01-01

    The Associative Memory (AM) system, the main part of the FastTracker (FTK) processor, is designed to perform pattern matching using the information of the silicon tracking detectors. It finds track candidates at low resolution that are seeds for the following step performing precise track fitting. The system has to support challenging data traffic, handled by a group of modern low cost FPGAs, the Xilinx Spartan6 chips, which have Low-Power Gigabit Transceivers (GTP). Each GTP transceiver is a combined transmitter and receiver capable of operating at data rates up to 3.2 Gb/s. \

  8. Researching, building a soft-processor and Ethernet interface circuit using EDK

    International Nuclear Information System (INIS)

    Tuong Thi Thu Huong; Pham Ngoc Tuan; Truong Van Dat, Dang Lanh; Chau Thi Nhu Quynh

    2014-01-01

    The processor is an indispensable component in the measurement and automatic control systems. This report describes the fabrication of a soft-processor (32-bits, on-chip block RAM 64K, 50M clock, internal and peripheral bus) for receiving, sending and processing of data Ethernet packets. This processor is fabricated using the XPS component from EDK (Xilinx) software toolkit. After that, it is configured on the FPGA named Spartan XC3S500E circuit. A firmware of a processor for controlling the interface between processor and Ethernet port is written in C language and can play a role of a HOST (station) which has its own IP to connect to Ethernet network. Besides, there are some needed parts as follows: an Ethernet interfacing controller chip, a suitable cable providing a speed up to 100 Mbs and an application program running under Window XP environment written in LabView to communicate with soft-processor. (author)

  9. Processors for wavelet analysis and synthesis: NIFS and TI-C80 MVP

    Science.gov (United States)

    Brooks, Geoffrey W.

    1996-03-01

    Two processors are considered for image quadrature mirror filtering (QMF). The neuromorphic infrared focal-plane sensor (NIFS) is an existing prototype analog processor offering high speed spatio-temporal Gaussian filtering, which could be used for the QMF low- pass function, and difference of Gaussian filtering, which could be used for the QMF high- pass function. Although not designed specifically for wavelet analysis, the biologically- inspired system accomplishes the most computationally intensive part of QMF processing. The Texas Instruments (TI) TMS320C80 Multimedia Video Processor (MVP) is a 32-bit RISC master processor with four advanced digital signal processors (DSPs) on a single chip. Algorithm partitioning, memory management and other issues are considered for optimal performance. This paper presents these considerations with simulated results leading to processor implementation of high-speed QMF analysis and synthesis.

  10. Effects of PEMFC operating parameters on the performance of an integrated ethanol processor

    Energy Technology Data Exchange (ETDEWEB)

    Francesconi, Javier A.; Mussati, Miguel C.; Aguirre, Pio A. [INGAR Instituto de Desarrollo y Diseno (CONICET-UTN), Avellaneda 3657, CP:S3002GJC, Santa Fe (Argentina)

    2010-06-15

    In this paper the performance of a complete fuel cell system processing ethanol fuel has been analyzed as a function of the main fuel cell operating parameters. The fuel processor is based on the steam reforming process, followed by high- and low-temperature shift reactors, and carbon monoxide preferential oxidation reactor, which are coupled to a polymeric fuel cell (PEMFC). The goal was to analyze and improve the fuel cell system performance by simulation techniques. PEMFC operation has been analyzed using an available parametric model, which was implemented within HYSYS environment software. Pinch Analysis concepts were used to investigate the process energy integration and determine the maximum efficiency minimizing ethanol consumption. The system performance was analyzed for the SR-12 Modular PEM Generator, the Ballard Mark V fuel cell and the BCS 500 W stack. The net system efficiency is dependent on the required power demand. Efficiency values higher than 50% at low loads and less than 30% at high power demands are computed. In addition, the effect of fuel cell temperature, pressure and hydrogen utilization was analyzed. The trade-off between the reformer yield and the fuel cell performance defines the optimal operation pressure. The cell temperature determines operating zones where the water, involved in the reforming reactions, can be produced or demanded. (author)

  11. Application of the Computer Capacity to the Analysis of Processors Evolution

    OpenAIRE

    Ryabko, Boris; Rakitskiy, Anton

    2017-01-01

    The notion of computer capacity was proposed in 2012, and this quantity has been estimated for computers of different kinds. In this paper we show that, when designing new processors, the manufacturers change the parameters that affect the computer capacity. This allows us to predict the values of parameters of future processors. As the main example we use Intel processors, due to the accessibility of detailed description of all their technical characteristics.

  12. Array processors: an introduction to their architecture, software, and applications in nuclear medicine

    International Nuclear Information System (INIS)

    King, M.A.; Doherty, P.W.; Rosenberg, R.J.; Cool, S.L.

    1983-01-01

    Array processors are ''number crunchers'' that dramatically enhance the processing power of nuclear medicine computer systems for applicatons dealing with the repetitive operations involved in digital image processing of large segments of data. The general architecture and the programming of array processors are introduced, along with some applications of array processors to the reconstruction of emission tomographic images, digital image enhancement, and functional image formation

  13. Realization Of Algebraic Processor For XML Documents Processing

    International Nuclear Information System (INIS)

    Georgiev, Bozhidar; Georgieva, Adriana

    2010-01-01

    In this paper, are presented some possibilities concerning the implementation of an algebraic method for XML hierarchical data processing which makes faster the XML search mechanism. Here is offered a different point of view for creation of advanced algebraic processor (with all necessary software tools and programming modules respectively). Therefore, this nontraditional approach for fast XML navigation with the presented algebraic processor may help to build an easier user-friendly interface provided XML transformations, which can avoid the difficulties in the complicated language constructions of XSL, XSLT and XPath. This approach allows comparatively simple search of XML hierarchical data by means of the following types of functions: specification functions and so named build-in functions. The choice of programming language Java may appear strange at first, but it isn't when you consider that the applications can run on different kinds of computers. The specific search mechanism based on the linear algebra theory is faster in comparison with MSXML parsers (on the basis of the developed examples with about 30%). Actually, there exists the possibility for creating new software tools based on the linear algebra theory, which cover the whole navigation and search techniques characterizing XSLT/XPath. The proposed method is able to replace more complicated operations in other SOA components.

  14. Video rate morphological processor based on a redundant number representation

    Science.gov (United States)

    Kuczborski, Wojciech; Attikiouzel, Yianni; Crebbin, Gregory A.

    1992-03-01

    This paper presents a video rate morphological processor for automated visual inspection of printed circuit boards, integrated circuit masks, and other complex objects. Inspection algorithms are based on gray-scale mathematical morphology. Hardware complexity of the known methods of real-time implementation of gray-scale morphology--the umbra transform and the threshold decomposition--has prompted us to propose a novel technique which applied an arithmetic system without carrying propagation. After considering several arithmetic systems, a redundant number representation has been selected for implementation. Two options are analyzed here. The first is a pure signed digit number representation (SDNR) with the base of 4. The second option is a combination of the base-2 SDNR (to represent gray levels of images) and the conventional twos complement code (to represent gray levels of structuring elements). Operation principle of the morphological processor is based on the concept of the digit level systolic array. Individual processing units and small memory elements create a pipeline. The memory elements store current image windows (kernels). All operation primitives of processing units apply a unified direction of digit processing: most significant digit first (MSDF). The implementation technology is based on the field programmable gate arrays by Xilinx. This paper justified the rationality of a new approach to logic design, which is the decomposition of Boolean functions instead of Boolean minimization.

  15. A day in the life of a pharmacovigilance case processor

    Directory of Open Access Journals (Sweden)

    Ritesh Bhangale

    2017-01-01

    Full Text Available Pharmacovigilance (PV has grown significantly in India in the last couple of decades. The etymological roots for the word “pharmacovigilance” are “Pharmakon” (Greek for drug and “Vigilare” (Latin for to keep watch. It relies on information gathered from the collection of individual case safety reports and other pharmacoepidemiological data. The PV data processing cycle starts with data collection in computerized systems followed by complete data entry which includes adverse event coding, drug coding, causality and expectedness assessment, narrative writing, quality control, and report submissions followed by data storage and maintenance. A case processor plays an important role in conducting these various tasks. The case processor should also manage drug safety information, possess updated knowledge about global drug safety regulations, summarize clinical safety data, participate in meetings, write narratives with medical input from a physician, report serious adverse events to the regulatory authorities, participate in the training of operational staff on drug safety issues, quality control work of other staff in the department, and take on any other task as assigned by the manager or medical director within the capabilities of the drug safety associate. There can be challenges while handling all these tasks at a time, hence the associate will have to maintain a balance to overcome them and keep on updating their knowledge on drug safety regulations, which in turn, would help in increasing their learning curve.

  16. A programmable two-qubit quantum processor in silicon.

    Science.gov (United States)

    Watson, T F; Philips, S G J; Kawakami, E; Ward, D R; Scarlino, P; Veldhorst, M; Savage, D E; Lagally, M G; Friesen, Mark; Coppersmith, S N; Eriksson, M A; Vandersypen, L M K

    2018-03-29

    Now that it is possible to achieve measurement and control fidelities for individual quantum bits (qubits) above the threshold for fault tolerance, attention is moving towards the difficult task of scaling up the number of physical qubits to the large numbers that are needed for fault-tolerant quantum computing. In this context, quantum-dot-based spin qubits could have substantial advantages over other types of qubit owing to their potential for all-electrical operation and ability to be integrated at high density onto an industrial platform. Initialization, readout and single- and two-qubit gates have been demonstrated in various quantum-dot-based qubit representations. However, as seen with small-scale demonstrations of quantum computers using other types of qubit, combining these elements leads to challenges related to qubit crosstalk, state leakage, calibration and control hardware. Here we overcome these challenges by using carefully designed control techniques to demonstrate a programmable two-qubit quantum processor in a silicon device that can perform the Deutsch-Josza algorithm and the Grover search algorithm-canonical examples of quantum algorithms that outperform their classical analogues. We characterize the entanglement in our processor by using quantum-state tomography of Bell states, measuring state fidelities of 85-89 per cent and concurrences of 73-82 per cent. These results pave the way for larger-scale quantum computers that use spins confined to quantum dots.

  17. Soft electron processor for surface sterilization of food material

    International Nuclear Information System (INIS)

    Baba, Takashi; Kaneko, Hiromi; Taniguchi, Shuichi

    2004-01-01

    As frozen or chilled foods have become popular nowadays, it has become very important to provide raw materials with lower level microbial contamination to food processing companies. Consequently, the sterilization of food material is one of the major topics for food processing. Dried materials like grains, beans and spices, etc., are not typically deeply contaminated by microorganisms, which reside on the surfaces of materials, so it is very useful to take low energetic, lower than 300 keV, electrons with small penetration power (Soft-Electrons), as a sterilization method for such materials. Soft-Electrons is researched and named by Dr. Hayashi et al. This is a non-thermal method, so one can keep foods hygienic without serious deterioration. It is also a physical method, so is free from residues of chemicals in foods. Recently, Nissin-High Voltage Co., Ltd. have developed and manufactured equipment for commercial use of Soft-Electrons (Soft Electron Processor), which can process 500 kg/h of grains. This report introduces the Soft Electron Processor and shows the results of sterilization of wheat and brown rice by the equipment

  18. A day in the life of a pharmacovigilance case processor.

    Science.gov (United States)

    Bhangale, Ritesh; Vaity, Sayali; Kulkarni, Niranjan

    2017-01-01

    Pharmacovigilance (PV) has grown significantly in India in the last couple of decades. The etymological roots for the word "pharmacovigilance" are "Pharmakon" (Greek for drug) and "Vigilare" (Latin for to keep watch). It relies on information gathered from the collection of individual case safety reports and other pharmacoepidemiological data. The PV data processing cycle starts with data collection in computerized systems followed by complete data entry which includes adverse event coding, drug coding, causality and expectedness assessment, narrative writing, quality control, and report submissions followed by data storage and maintenance. A case processor plays an important role in conducting these various tasks. The case processor should also manage drug safety information, possess updated knowledge about global drug safety regulations, summarize clinical safety data, participate in meetings, write narratives with medical input from a physician, report serious adverse events to the regulatory authorities, participate in the training of operational staff on drug safety issues, quality control work of other staff in the department, and take on any other task as assigned by the manager or medical director within the capabilities of the drug safety associate. There can be challenges while handling all these tasks at a time, hence the associate will have to maintain a balance to overcome them and keep on updating their knowledge on drug safety regulations, which in turn, would help in increasing their learning curve.

  19. The ATLAS Level-1 Muon to Central Trigger Processor Interface

    CERN Document Server

    Berge, D; Farthouat, P; Haas, S; Klofver, P; Krasznahorkay, A; Messina, A; Pauly, T; Schuler, G; Spiwoks, R; Wengler, T; PH-EP

    2007-01-01

    The Muon to Central Trigger Processor Interface (MUCTPI) is part of the ATLAS Level-1 trigger system and connects the output of muon trigger system to the Central Trigger Processor (CTP). At every bunch crossing (BC), the MUCTPI receives information on muon candidates from each of the 208 muon trigger sectors and calculates the total multiplicity for each of six transverse momentum (pT) thresholds. This multiplicity value is then sent to the CTP, where it is used together with the input from the Calorimeter trigger to make the final Level-1 Accept (L1A) decision. In addition the MUCTPI provides summary information to the Level-2 trigger and to the data acquisition (DAQ) system for events selected at Level-1. This information is used to define the regions of interest (RoIs) that drive the Level-2 muontrigger processing. The MUCTPI system consists of a 9U VME chassis with a dedicated active backplane and 18 custom designed modules. The design of the modules is based on state-of-the-art FPGA devices and special ...

  20. Discovering Motifs in Biological Sequences Using the Micron Automata Processor.

    Science.gov (United States)

    Roy, Indranil; Aluru, Srinivas

    2016-01-01

    Finding approximately conserved sequences, called motifs, across multiple DNA or protein sequences is an important problem in computational biology. In this paper, we consider the (l, d) motif search problem of identifying one or more motifs of length l present in at least q of the n given sequences, with each occurrence differing from the motif in at most d substitutions. The problem is known to be NP-complete, and the largest solved instance reported to date is (26,11). We propose a novel algorithm for the (l,d) motif search problem using streaming execution over a large set of non-deterministic finite automata (NFA). This solution is designed to take advantage of the micron automata processor, a new technology close to deployment that can simultaneously execute multiple NFA in parallel. We demonstrate the capability for solving much larger instances of the (l, d) motif search problem using the resources available within a single automata processor board, by estimating run-times for problem instances (39,18) and (40,17). The paper serves as a useful guide to solving problems using this new accelerator technology.

  1. Computing prime factors with a Josephson phase qubit quantum processor

    Science.gov (United States)

    Lucero, Erik; Barends, R.; Chen, Y.; Kelly, J.; Mariantoni, M.; Megrant, A.; O'Malley, P.; Sank, D.; Vainsencher, A.; Wenner, J.; White, T.; Yin, Y.; Cleland, A. N.; Martinis, John M.

    2012-10-01

    A quantum processor can be used to exploit quantum mechanics to find the prime factors of composite numbers. Compiled versions of Shor's algorithm and Gauss sum factorizations have been demonstrated on ensemble quantum systems, photonic systems and trapped ions. Although proposed, these algorithms have yet to be shown using solid-state quantum bits. Using a number of recent qubit control and hardware advances, here we demonstrate a nine-quantum-element solid-state quantum processor and show three experiments to highlight its capabilities. We begin by characterizing the device with spectroscopy. Next, we produce coherent interactions between five qubits and verify bi- and tripartite entanglement through quantum state tomography. In the final experiment, we run a three-qubit compiled version of Shor's algorithm to factor the number 15, and successfully find the prime factors 48% of the time. Improvements in the superconducting qubit coherence times and more complex circuits should provide the resources necessary to factor larger composite numbers and run more intricate quantum algorithms.

  2. Image matrix processor for fast multi-dimensional computations

    Science.gov (United States)

    Roberson, George P.; Skeate, Michael F.

    1996-01-01

    An apparatus for multi-dimensional computation which comprises a computation engine, including a plurality of processing modules. The processing modules are configured in parallel and compute respective contributions to a computed multi-dimensional image of respective two dimensional data sets. A high-speed, parallel access storage system is provided which stores the multi-dimensional data sets, and a switching circuit routes the data among the processing modules in the computation engine and the storage system. A data acquisition port receives the two dimensional data sets representing projections through an image, for reconstruction algorithms such as encountered in computerized tomography. The processing modules include a programmable local host, by which they may be configured to execute a plurality of different types of multi-dimensional algorithms. The processing modules thus include an image manipulation processor, which includes a source cache, a target cache, a coefficient table, and control software for executing image transformation routines using data in the source cache and the coefficient table and loading resulting data in the target cache. The local host processor operates to load the source cache with a two dimensional data set, loads the coefficient table, and transfers resulting data out of the target cache to the storage system, or to another destination.

  3. The data handling processor of the Belle II DEPFET detector

    Energy Technology Data Exchange (ETDEWEB)

    Germic, Leonard; Hemperek, Tomasz; Kishishita, Tetsuichi; Paschen, Botho; Luetticke, Florian; Krueger, Hans; Marinas, Carlos; Wermes, Norbert [Universitaet Bonn (Germany); Collaboration: Belle II-Collaboration

    2016-07-01

    A two layer highly granular DEPFET pixel detector will be operated as the innermost subsystem of the Belle II experiment, at the new Japanese super flavor factory (SuperKEKB). Such a finely segmented system will allow to improve the vertex reconstruction in such ultra high luminosity environment but, at the same time, the raw data stream generated by the 8 million pixel detector will exceed the capability of real-time processing due to its high frame rate, considering the limited material budged and strict space constrains. For this reason a new ASIC, the Data Handling Processor (DHP) is designed to provide data processing at the level of the front-end electronics, such as zero-suppression and common mode correction. Additional feature of the Data Handling Processor is the control block, providing control signals for the on-module ASICs used in the pixel detector. In this contribution, the description of the latest chip revision in TSMC 65 nm technology together with the latest test results of the interface functionality tests are presented.

  4. The Topological Processor for the future ATLAS Level-1 Trigger

    CERN Document Server

    Kahra, C; The ATLAS collaboration

    2014-01-01

    ATLAS is an experiment on the Large Hadron Collider (LHC), located at the European Organization for Nuclear Research (CERN) in Switzerland. By 2015 the LHC instantaneous luminosity will be increased from $10^{34}$ up to $3\\cdot 10^{34} \\mathrm{cm}^{-2} \\mathrm{s}^{-1}$. This places stringent operational and physical requirements on the ATLAS Trigger in order to reduce the 40MHz collision rate to a manageable event storage rate of 1kHz while at the same time, selecting those events that contain interesting physics events. The Level-1 Trigger is the first rate-reducing step in the ATLAS Trigger, with an output rate of 100kHz and decision latency of less than $2.5 \\mu \\mathrm{s}$. It is composed of the Calorimeter Trigger, the Muon Trigger and the Central Trigger Processor (CTP). In 2014, there will be a new electronics module: the Topological Processor (L1Topo). The L1Topo will make it possible, for the first time, to use detailed information from subdetectors in a single Level-1 module. This allows the determi...

  5. Fuel processor and method for generating hydrogen for fuel cells

    Science.gov (United States)

    Ahmed, Shabbir [Naperville, IL; Lee, Sheldon H. D. [Willowbrook, IL; Carter, John David [Bolingbrook, IL; Krumpelt, Michael [Naperville, IL; Myers, Deborah J [Lisle, IL

    2009-07-21

    A method of producing a H.sub.2 rich gas stream includes supplying an O.sub.2 rich gas, steam, and fuel to an inner reforming zone of a fuel processor that includes a partial oxidation catalyst and a steam reforming catalyst or a combined partial oxidation and stream reforming catalyst. The method also includes contacting the O.sub.2 rich gas, steam, and fuel with the partial oxidation catalyst and the steam reforming catalyst or the combined partial oxidation and stream reforming catalyst in the inner reforming zone to generate a hot reformate stream. The method still further includes cooling the hot reformate stream in a cooling zone to produce a cooled reformate stream. Additionally, the method includes removing sulfur-containing compounds from the cooled reformate stream by contacting the cooled reformate stream with a sulfur removal agent. The method still further includes contacting the cooled reformate stream with a catalyst that converts water and carbon monoxide to carbon dioxide and H.sub.2 in a water-gas-shift zone to produce a final reformate stream in the fuel processor.

  6. A programmable two-qubit quantum processor in silicon

    Science.gov (United States)

    Watson, T. F.; Philips, S. G. J.; Kawakami, E.; Ward, D. R.; Scarlino, P.; Veldhorst, M.; Savage, D. E.; Lagally, M. G.; Friesen, Mark; Coppersmith, S. N.; Eriksson, M. A.; Vandersypen, L. M. K.

    2018-03-01

    Now that it is possible to achieve measurement and control fidelities for individual quantum bits (qubits) above the threshold for fault tolerance, attention is moving towards the difficult task of scaling up the number of physical qubits to the large numbers that are needed for fault-tolerant quantum computing. In this context, quantum-dot-based spin qubits could have substantial advantages over other types of qubit owing to their potential for all-electrical operation and ability to be integrated at high density onto an industrial platform. Initialization, readout and single- and two-qubit gates have been demonstrated in various quantum-dot-based qubit representations. However, as seen with small-scale demonstrations of quantum computers using other types of qubit, combining these elements leads to challenges related to qubit crosstalk, state leakage, calibration and control hardware. Here we overcome these challenges by using carefully designed control techniques to demonstrate a programmable two-qubit quantum processor in a silicon device that can perform the Deutsch–Josza algorithm and the Grover search algorithm—canonical examples of quantum algorithms that outperform their classical analogues. We characterize the entanglement in our processor by using quantum-state tomography of Bell states, measuring state fidelities of 85–89 per cent and concurrences of 73–82 per cent. These results pave the way for larger-scale quantum computers that use spins confined to quantum dots.

  7. Point and track-finding processors for multiwire chambers

    CERN Document Server

    Hansroul, M

    1973-01-01

    The hardware processors described below are designed to be used in conjunction with multi-wire chambers. They have the characteristic of being based on computational methods in contrast to analogue procedures. In a sense, they are hardware implementations of computer programs. But, being specially designed for their purpose, they are free of the restrictions imposed by the architecture of the computer on which the equivalent program is to run. The parallelism inherent in the algorithms can thus be fully exploited. Combined with the use of fast access scratch-pad memories and the non-sequential nature of the control program, the parallelism accounts for the fact that these processors are expected to execute 2-3 orders of magnitude faster than the equivalent Fortran programs on a CDC 7600 or 6600. As a consequence, methods which are simple and straightforward, but which are impractical because they require an exorbitant amount of computer time can on the contrary be very attractive for hardware implementation. ...

  8. A fast track trigger processor for the OPAL detector at LEP

    International Nuclear Information System (INIS)

    Carter, A.A.; Jaroslawski, S.; Wagner, A.

    1986-01-01

    A fast hardware track trigger processor being built for the OPAL experiment is described. The processor will analyse data from the central drift chambers of OPAL to determine whether any tracks come from the interaction region, and thereby eliminate background events. The processor will find tracks over a large angular range, vertical strokecos thetavertical stroke < or approx. 0.95. The design of the processor is described, together with a brief account of its hardware implementation for OPAL. The results of feasibility studies are also presented. (orig.)

  9. A Performance-Prediction Model for PIC Applications on Clusters of Symmetric MultiProcessors: Validation with Hierarchical HPF+OpenMP Implementation

    Directory of Open Access Journals (Sweden)

    Sergio Briguglio

    2003-01-01

    Full Text Available A performance-prediction model is presented, which describes different hierarchical workload decomposition strategies for particle in cell (PIC codes on Clusters of Symmetric MultiProcessors. The devised workload decomposition is hierarchically structured: a higher-level decomposition among the computational nodes, and a lower-level one among the processors of each computational node. Several decomposition strategies are evaluated by means of the prediction model, with respect to the memory occupancy, the parallelization efficiency and the required programming effort. Such strategies have been implemented by integrating the high-level languages High Performance Fortran (at the inter-node stage and OpenMP (at the intra-node one. The details of these implementations are presented, and the experimental values of parallelization efficiency are compared with the predicted results.

  10. Efficient Sorting on the Tilera Manycore Architecture

    Energy Technology Data Exchange (ETDEWEB)

    Morari, Alessandro; Tumeo, Antonino; Villa, Oreste; Secchi, Simone; Valero, Mateo

    2012-10-24

    e present an efficient implementation of the radix sort algo- rithm for the Tilera TILEPro64 processor. The TILEPro64 is one of the first successful commercial manycore processors. It is com- posed of 64 tiles interconnected through multiple fast Networks- on-chip and features a fully coherent, shared distributed cache. The architecture has a large degree of flexibility, and allows various optimization strategies. We describe how we mapped the algorithm to this architecture. We present an in-depth analysis of the optimizations for each phase of the algorithm with respect to the processor’s sustained performance. We discuss the overall throughput reached by our radix sort implementation (up to 132 MK/s) and show that it provides comparable or better performance-per-watt with respect to state-of-the art implemen- tations on x86 processors and graphic processing units.

  11. 3D Seismic Imaging through Reverse-Time Migration on Homogeneous and Heterogeneous Multi-Core Processors

    Directory of Open Access Journals (Sweden)

    Mauricio Araya-Polo

    2009-01-01

    Full Text Available Reverse-Time Migration (RTM is a state-of-the-art technique in seismic acoustic imaging, because of the quality and integrity of the images it provides. Oil and gas companies trust RTM with crucial decisions on multi-million-dollar drilling investments. But RTM requires vastly more computational power than its predecessor techniques, and this has somewhat hindered its practical success. On the other hand, despite multi-core architectures promise to deliver unprecedented computational power, little attention has been devoted to mapping efficiently RTM to multi-cores. In this paper, we present a mapping of the RTM computational kernel to the IBM Cell/B.E. processor that reaches close-to-optimal performance. The kernel proves to be memory-bound and it achieves a 98% utilization of the peak memory bandwidth. Our Cell/B.E. implementation outperforms a traditional processor (PowerPC 970MP in terms of performance (with an 15.0× speedup and energy-efficiency (with a 10.0× increase in the GFlops/W delivered. Also, it is the fastest RTM implementation available to the best of our knowledge. These results increase the practical usability of RTM. Also, the RTM-Cell/B.E. combination proves to be a strong competitor in the seismic arena.

  12. Memory-efficient optimization of Gyrokinetic particle-to-grid interpolation for multicore processors

    Energy Technology Data Exchange (ETDEWEB)

    Madduri, Kamesh [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Williams, Samuel [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Ethier, Stephane [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States); Oliker, Leonid [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Shalf, John [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Strohmaier, Erich [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Yelicky, Katherine [Univ. of California, Berkeley, CA (United States)

    2009-01-01

    We present multicore parallelization strategies for the particle-to-grid interpolation step in the Gyrokinetic Toroidal Code (GTC), a 3D particle-in-cell (PIC) application to study turbulent transport in magnetic-confinement fusion devices. Particle-grid interpolation is a known performance bottleneck in several PIC applications. In GTC, this step involves particles depositing charges to a 3D toroidal mesh, and multiple particles may contribute to the charge at a grid point. We design new parallel algorithms for the GTC charge deposition kernel, and analyze their performance on three leading multicore platforms. We implement thirteen different variants for this kernel and identify the best-performing ones given typical PIC parameters such as the grid size, number of particles per cell, and the GTC-specific particle Larmor radius variation. We find that our best strategies can be 2x faster than the reference optimized MPI implementation, and our analysis provides insight into desirable architectural features for high-performance PIC simulation codes.

  13. Efficient hardware implementation of a full COFDM processor with robust channel equalization and reduced power consumption

    Directory of Open Access Journals (Sweden)

    Alexander López Parrado

    2013-01-01

    Full Text Available Este trabajo presenta el diseño de un procesador banda-base para multiplexación por división de frecuencias ortogonales codificada (COFDM de 12 Mb/s para el estándar IEEE 802.11a. El procesador COFDM banda- base fue diseñado usando circuitos diseñados por nosotros para corrección de fase de portadora, sincronización de tiempo de símbolo, ecualización de canal robusta y decodificación Viterbi. Estos circuitos son flexibles, parametrizados y descritos usando VHDL estructural y genérico. El procesador COFDM banda- base tiene dos dominios de reloj para reducción del consumo de potencia, fue sintetizado sobre un FPGA Stratix II y fue probado experimentalmente usando circuitería de radio frecuencia (RF a 2.4 GHz.

  14. Efficient pseudo-random number generation for monte-carlo simulations using graphic processors

    Science.gov (United States)

    Mohanty, Siddhant; Mohanty, A. K.; Carminati, F.

    2012-06-01

    A hybrid approach based on the combination of three Tausworthe generators and one linear congruential generator for pseudo random number generation for GPU programing as suggested in NVIDIA-CUDA library has been used for MONTE-CARLO sampling. On each GPU thread, a random seed is generated on fly in a simple way using the quick and dirty algorithm where mod operation is not performed explicitly due to unsigned integer overflow. Using this hybrid generator, multivariate correlated sampling based on alias technique has been carried out using both CUDA and OpenCL languages.

  15. Efficient pseudo-random number generation for Monte-Carlo simulations using graphic processors

    International Nuclear Information System (INIS)

    Mohanty, Siddhant; Mohanty, A K; Carminati, F

    2012-01-01

    A hybrid approach based on the combination of three Tausworthe generators and one linear congruential generator for pseudo random number generation for GPU programing as suggested in NVIDIA-CUDA library has been used for MONTE-CARLO sampling. On each GPU thread, a random seed is generated on fly in a simple way using the quick and dirty algorithm where mod operation is not performed explicitly due to unsigned integer overflow. Using this hybrid generator, multivariate correlated sampling based on alias technique has been carried out using both CUDA and OpenCL languages.

  16. Clock generators for SOC processors circuits and architectures

    CERN Document Server

    Fahim, Amr

    2004-01-01

    This book explores the design of fully-integrated frequency synthesizers suitable for system-on-a-chip (SOC) processors. The text takes a more global design perspective in jointly examining the design space at the circuit level as well as at the architectural level. The comprehensive coverage includes summary chapters on circuit theory as well as feedback control theory relevant to the operation of phase locked loops (PLLs). On the circuit level, the discussion includes low-voltage analog design in deep submicron digital CMOS processes, effects of supply noise, substrate noise, as well device noise. On the architectural level, the discussion includes PLL analysis using continuous-time as well as discrete-time models, linear and nonlinear effects of PLL performance, and detailed analysis of locking behavior. The book provides numerous real world applications, as well as practical rules-of-thumb for modern designers to use at the system, architectural, as well as the circuit level.

  17. Deep Trek Re-configurable Processor for Data Acquisition (RPDA)

    Energy Technology Data Exchange (ETDEWEB)

    Bruce Ohme; Michael Johnson

    2009-06-30

    This report summarizes technical progress achieved during the cooperative research agreement between Honeywell and U.S. Department of Energy to develop a high-temperature Re-configurable Processor for Data Acquisition (RPDA). The RPDA development has incorporated multiple high-temperature (225C) electronic components within a compact co-fired ceramic Multi-Chip-Module (MCM) package. This assembly is suitable for use in down-hole oil and gas applications. The RPDA module is programmable to support a wide range of functionality. Specifically this project has demonstrated functional integrity of the RPDA package and internal components, as well as functional integrity of the RPDA configured to operate as a Multi-Channel Data Acquisition Controller. This report reviews the design considerations, electrical hardware design, MCM package design, considerations for manufacturing assembly, test and screening, and results from prototype assembly and characterization testing.

  18. State-based Communication on Time-predictable Multicore Processors

    DEFF Research Database (Denmark)

    Sørensen, Rasmus Bo; Schoeberl, Martin; Sparsø, Jens

    2016-01-01

    Some real-time systems use a form of task-to-task communication called state-based or sample-based communication that does not impose any flow control among the communicating tasks. The concept is similar to a shared variable, where a reader may read the same value multiple times or may not read...... a given value at all. This paper explores time-predictable implementations of state-based communication in network-on-chip based multicore platforms through five algorithms. With the presented analysis of the implemented algorithms, the communicating tasks of one core can be scheduled independently...... of tasks on other cores. Assuming a specific time-predictable multicore processor, we evaluate how the read and write primitives of the five algorithms contribute to the worst-case execution time of the communicating tasks. Each of the five algorithms has specific capabilities that make them suitable...

  19. The Associative Memory system for the FTK processor at ATLAS

    CERN Document Server

    Cipriani, R; The ATLAS collaboration; Donati, S; Giannetti, P; Lanza, A; Luciano, P; Magalotti, D; Piendibene, M

    2013-01-01

    Modern experiments search for extremely rare processes hidden in much larger background levels. As the experiment complexity, the accelerator backgrounds and luminosity increase we need increasingly complex and exclusive selections. We present results and performances of a new prototype of Associative Memory system, the core of the Fast Tracker processor (FTK). FTK is a real time tracking device for the Atlas experiment trigger upgrade. The AM system provides massive computing power to minimize the online execution time of complex tracking algorithms. The time consuming pattern recognition problem, generally referred to as the “combinatorial challenge”, is beat by the Associative Memory (AM) technology exploiting parallelism to the maximum level: it compares the event to pre-calculated “expectations” or “patterns” (pattern matching) at once looking for candidate tracks called “roads”. The problem is solved by the time data are loaded into the AM devices. We report on the tests of the integrate...

  20. The Associative Memory system for the FTK processor at ATLAS

    CERN Document Server

    Cipriani, R; The ATLAS collaboration; Donati, S; Giannetti, P; Lanza, A; Luciano, P; Magalotti, D; Piendibene, M

    2014-01-01

    Modern experiments search for extremely rare processes hidden in much larger background levels. As the experiment complexity, the accelerator backgrounds and luminosity increase we need increasingly complex and exclusive selections. We present results and performances of a new prototype of Associative Memory system, the core of the Fast Tracker processor (FTK). FTK is a real time tracking device for the Atlas experiment trigger upgrade. The AM system provides massive computing power to minimize the online execution time of complex tracking algorithms. The time consuming pattern recognition problem, generally referred to as the “combinatorial challenge”, is beat by the Associative Memory (AM) technology exploiting parallelism to the maximum level: it compares the event to pre-calculated “expectations” or “patterns” (pattern matching) at once looking for candidate tracks called “roads”. The problem is solved by the time data are loaded into the AM devices. We report on the tests of the integrate...