WorldWideScience

Sample records for vlsi neural processor

  1. VLSI Processor For Vector Quantization

    Science.gov (United States)

    Tawel, Raoul

    1995-01-01

    Pixel intensities in each kernel compared simultaneously with all code vectors. Prototype high-performance, low-power, very-large-scale integrated (VLSI) circuit designed to perform compression of image data by vector-quantization method. Contains relatively simple analog computational cells operating on direct or buffered outputs of photodetectors grouped into blocks in imaging array, yielding vector-quantization code word for each such block in sequence. Scheme exploits parallel-processing nature of vector-quantization architecture, with consequent increase in speed.

  2. Bilinear Interpolation Image Scaling Processor for VLSI

    Directory of Open Access Journals (Sweden)

    Ms. Pawar Ashwini Dilip

    2014-05-01

    Full Text Available We introduce image scaling processor using VLSI technique. It consist of Bilinear interpolation, clamp filter and a sharpening spatial filter. Bilinear interpolation algorithm is popular due to its computational efficiency and image quality. But resultant image consist of blurring edges and aliasing artifacts after scaling. To reduce the blurring and aliasing artifacts sharpening spatial filter and clamp filters are used as pre-filter. These filters are realized by using T-model and inversed T-model convolution kernels. To reduce the memory buffer and computing resources for proposed image processor design two T-model or inversed T-model filters are combined into combined filter which requires only one line buffer memory. Also, to reduce hardware cost Reconfigurable calculation unit (RCUis invented. The VLSI architecture in this work can achieve 280 MHz with 6.08-K gate counts, and its core area is 30 378 μm2 synthesized by a 0.13-μm CMOS process

  3. Wavelength-encoded OCDMA system using opto-VLSI processors.

    Science.gov (United States)

    Aljada, Muhsen; Alameh, Kamal

    2007-07-01

    We propose and experimentally demonstrate a 2.5 Gbits/sper user wavelength-encoded optical code-division multiple-access encoder-decoder structure based on opto-VLSI processing. Each encoder and decoder is constructed using a single 1D opto-very-large-scale-integrated (VLSI) processor in conjunction with a fiber Bragg grating (FBG) array of different Bragg wavelengths. The FBG array spectrally and temporally slices the broadband input pulse into several components and the opto-VLSI processor generates codewords using digital phase holograms. System performance is measured in terms of the autocorrelation and cross-correlation functions as well as the eye diagram.

  4. VLSI implementation of neural networks.

    Science.gov (United States)

    Wilamowski, B M; Binfet, J; Kaynak, M O

    2000-06-01

    Currently, fuzzy controllers are the most popular choice for hardware implementation of complex control surfaces because they are easy to design. Neural controllers are more complex and hard to train, but provide an outstanding control surface with much less error than that of a fuzzy controller. There are also some problems that have to be solved before the networks can be implemented on VLSI chips. First, an approximation function needs to be developed because CMOS neural networks have an activation function different than any function used in neural network software. Next, this function has to be used to train the network. Finally, the last problem for VLSI designers is the quantization effect caused by discrete values of the channel length (L) and width (W) of MOS transistor geometries. Two neural networks were designed in 1.5 microm technology. Using adequate approximation functions solved the problem of activation function. With this approach, trained networks were characterized by very small errors. Unfortunately, when the weights were quantized, errors were increased by an order of magnitude. However, even though the errors were enlarged, the results obtained from neural network hardware implementations were superior to the results obtained with fuzzy system approach.

  5. Embedded Processor Based Automatic Temperature Control of VLSI Chips

    Directory of Open Access Journals (Sweden)

    Narasimha Murthy Yayavaram

    2009-01-01

    Full Text Available This paper presents embedded processor based automatic temperature control of VLSI chips, using temperature sensor LM35 and ARM processor LPC2378. Due to the very high packing density, VLSI chips get heated very soon and if not cooled properly, the performance is very much affected. In the present work, the sensor which is kept very near proximity to the IC will sense the temperature and the speed of the fan arranged near to the IC is controlled based on the PWM signal generated by the ARM processor. A buzzer is also provided with the hardware, to indicate either the failure of the fan or overheating of the IC. The entire process is achieved by developing a suitable embedded C program.

  6. VLSI digital demodulator co-processor

    Science.gov (United States)

    Stephen, Karen J.; Buznitsky, Mitchell A.; Lindsey, Mark J.

    A demodulation coprocessor that incorporates into a single VLSI package a number of important arithmetic functions commonly encountered in demodulation processing is developed. The LD17 demodulator is designed for use in a digital modem as a companion to any of the commercially available digital signal processing (DSP) microprocessors. The LD17 includes an 8-b complex multiplier-accumulator (MAC), a programmable tone generator, a preintegrator, a dedicated noncoherent differential phase-shift keying (DPSK) calculator, and a program/data sequencer. By using a simple generic interface and small but powerful instruction set, the LD17 has the capability to operate in several architectural schemes with a minimum of glue logic. Speed, size, and power constraints will dictate which of these schemes is best for a particular application. The LD17 will be implemented in a 1.5-micron DLM CMOS gate array and packaged in an 84-pin JLCC. With the LD17 and its memory, the real-time processing compatibility of a typical DSP microprocessor can be extended to sampling rates from hundreds to thousands of kilosamples per second.

  7. On-board neural processor design for intelligent multisensor microspacecraft

    Science.gov (United States)

    Fang, Wai-Chi; Sheu, Bing J.; Wall, James

    1996-03-01

    A compact VLSI neural processor based on the Optimization Cellular Neural Network (OCNN) has been under development to provide a wide range of support for an intelligent remote sensing microspacecraft which requires both high bandwidth communication and high- performance computing for on-board data analysis, thematic data reduction, synergy of multiple types of sensors, and other advanced smart-sensor functions. The OCNN is developed with emphasis on its capability to find global optimal solutions by using a hardware annealing method. The hardware annealing function is embedded in the network. It is a parallel version of fast mean-field annealing in analog networks, and is highly efficient in finding globally optimal solutions for cellular neural networks. The OCNN is designed to perform programmable functions for fine-grained processing with annealing control to enhance the output quality. The OCNN architecture is a programmable multi-dimensional array of neurons which are locally connected with their local neurons. Major design features of the OCNN neural processor includes massively parallel neural processing, hardware annealing capability, winner-take-all mechanism, digitally programmable synaptic weights, and multisensor parallel interface. A compact current-mode VLSI design feasibility of the OCNN neural processor is demonstrated by a prototype 5 X 5-neuroprocessor array chip in a 2-micrometers CMOS technology. The OCNN operation theory, architecture, design and implementation, prototype chip, and system applications have been investigated in detail and presented in this paper.

  8. Experimental demonstration of a tunable laser using an SOA and an Opto-VLSI Processor.

    Science.gov (United States)

    Aljada, Muhsen; Zheng, Rong; Alameh, Kamal; Lee, Yong-Tak

    2007-07-23

    In this paper we propose and experimentally demonstrate a tunable laser structure cascading a semiconductor optical amplifier (SOA) that generates broadband amplified spontaneous emission and a reflective Opto-VLSI processor that dynamically reflects arbitrarily wavelengths and injects them back into the SOA, thus synthesizing an output signal of variable wavelength. The wavelength tunablility is performed using digital phase holograms uploaded on the Opto-VLSI processor. Experimental results demonstrate a tuning range from 1524nm to 1534nm, and show that the proposed tunable laser structure has a stable performance.

  9. Geometric Design Rule Check of VLSI Layouts in Mesh Connected Processors

    Directory of Open Access Journals (Sweden)

    S. K. Nandy

    1994-01-01

    Full Text Available Design Rule Checking is a compute-intensive VLSI CAD tool. In this paper we propose a parallel algorithm to perform Design Rule Check (DRC of Layout geometries in a VLSI layout. The algorithm assumes the parallel architecture to be a two-dimensional mesh of processors. The algorithm is based on a linear quadtree representation of the layout. Through a complexity analysis it is shown that it is possible to achieve a linear speedup in DRC with respect to the number of processors.

  10. A VLSI Processor Design of Real-Time Data Compression for High-Resolution Imaging Radar

    Science.gov (United States)

    Fang, W.

    1994-01-01

    For the high-resolution imaging radar systems, real-time data compression of raw imaging data is required to accomplish the science requirements and satisfy the given communication and storage constraints. The Block Adaptive Quantizer (BAQ) algorithm and its associated VLSI processor design have been developed to provide a real-time data compressor for high-resolution imaging radar systems.

  11. Implementing neural architectures using analog VLSI circuits

    Science.gov (United States)

    Maher, Mary Ann C.; Deweerth, Stephen P.; Mahowald, Misha A.; Mead, Carver A.

    1989-05-01

    Analog very large-scale integrated (VLSI) technology can be used not only to study and simulate biological systems, but also to emulate them in designing artificial sensory systems. A methodology for building these systems in CMOS VLSI technology has been developed using analog micropower circuit elements that can be hierarchically combined. Using this methodology, experimental VLSI chips of visual and motor subsystems have been designed and fabricated. These chips exhibit behavior similar to that of biological systems, and perform computations useful for artificial sensory systems.

  12. Tunable multi-wavelength fiber lasers based on an Opto-VLSI processor and optical amplifiers.

    Science.gov (United States)

    Xiao, Feng; Alameh, Kamal; Lee, Yong Tak

    2009-12-07

    A multi-wavelength tunable fiber laser based on the use of an Opto-VLSI processor in conjunction with different optical amplifiers is proposed and experimentally demonstrated. The Opto-VLSI processor can simultaneously select any part of the gain spectrum from each optical amplifier into its associated fiber ring, leading to a multiport tunable fiber laser source. We experimentally demonstrate a 3-port tunable fiber laser source, where each output wavelength of each port can independently be tuned within the C-band with a wavelength step of about 0.05 nm. Experimental results demonstrate a laser linewidth as narrow as 0.05 nm and an optical side-mode-suppression-ratio (SMSR) of about 35 dB. The demonstrated three fiber lasers have excellent stability at room temperature and output power uniformity less than 0.5 dB over the whole C-band.

  13. A novel VLSI processor architecture for supercomputing arrays

    Science.gov (United States)

    Venkateswaran, N.; Pattabiraman, S.; Devanathan, R.; Ahmed, Ashaf; Venkataraman, S.; Ganesh, N.

    1993-01-01

    Design of the processor element for general purpose massively parallel supercomputing arrays is highly complex and cost ineffective. To overcome this, the architecture and organization of the functional units of the processor element should be such as to suit the diverse computational structures and simplify mapping of complex communication structures of different classes of algorithms. This demands that the computation and communication structures of different class of algorithms be unified. While unifying the different communication structures is a difficult process, analysis of a wide class of algorithms reveals that their computation structures can be expressed in terms of basic IP,IP,OP,CM,R,SM, and MAA operations. The execution of these operations is unified on the PAcube macro-cell array. Based on this PAcube macro-cell array, we present a novel processor element called the GIPOP processor, which has dedicated functional units to perform the above operations. The architecture and organization of these functional units are such to satisfy the two important criteria mentioned above. The structure of the macro-cell and the unification process has led to a very regular and simpler design of the GIPOP processor. The production cost of the GIPOP processor is drastically reduced as it is designed on high performance mask programmable PAcube arrays.

  14. Specification for a reconfigurable optoelectronic VLSI processor suitable for digital signal processing.

    Science.gov (United States)

    Fey, D; Kasche, B; Burkert, C; Tschäche, O

    1998-01-10

    A concept for a parallel digital signal processor based on opticalinterconnections and optoelectronic VLSI circuits is presented. Itis shown that the proper combination of optical communication, architecture, and algorithms allows a throughput that outperformspurely electronic solutions. The usefulness of low-level algorithmsfrom the add-and-shift class is emphasized. These algorithms leadto fine-grain, massively parallel on-chip processor architectures withhigh demands for optical off-chip interconnections. A comparativeperformance analysis shows the superiority of a bit-serialarchitecture. This architecture is mapped onto an optoelectronicthree-dimensional circuit, and the necessary optical interconnectionscheme is specified.

  15. VLSI design

    CERN Document Server

    Einspruch, Norman G

    1986-01-01

    VLSI Electronics Microstructure Science, Volume 14: VLSI Design presents a comprehensive exposition and assessment of the developments and trends in VLSI (Very Large Scale Integration) electronics. This volume covers topics that range from microscopic aspects of materials behavior and device performance to the comprehension of VLSI in systems applications. Each article is prepared by a recognized authority. The subjects discussed in this book include VLSI processor design methodology; the RISC (Reduced Instruction Set Computer); the VLSI testing program; silicon compilers for VLSI; and special

  16. VLSI neural system architecture for finite ring recursive reduction.

    Science.gov (United States)

    Zhang, D; Jullien, G A

    1996-12-01

    The use of neural-like networks to implement finite ring computations has been presented in a previous paper. This paper develops efficient VLSI neural system architecture for the finite ring recursive reduction (FRRR), including module reduction, MSB carry iteration and feedforward processing. These techniques deal with the basic principles involved in constructing a FRRR, and their implementations are efficiently matched to the VLSI medium. Compared with the other structure models for finite ring computation (e.g. modification of binary arithmetic logic and bit-steered ROM's), the FRRR structure has the lowest area complexity in silicon while maintaining a high throughput rate. Examples of several implementations are used to illustrate the effectiveness of the FRRR architecture.

  17. DESIGN AND ANALOG VLSI IMPLEMENTATION OF ARTIFICIAL NEURAL NETWORK

    OpenAIRE

    2011-01-01

    Nature has evolved highly advanced systems capable of performing complex computations, adoption and learning using analog computations. Furthermore nature has evolved techniques to deal with imprecise analog computations by using redundancy and massive connectivity. In this paper we are making use of Artificial Neural Network to demonstrate the way in which the biological system processes in analog domain. We are using 180nm CMOS VLSI technology for implementing circuits which ...

  18. Implementation Issues for Algorithmic VLSI (Very Large Scale Integration) Processor Arrays.

    Science.gov (United States)

    1984-10-01

    analysis of the various algorithms are described in Appendiccs 5.A, 5.B and 5.C. A note on notation: Following Ottmann ei aL [40], the variable n is used...redundant operations OK. Ottmann log i I log 1 up to n wasted processors. X-tree topology. Atallah log n I 1 redundant operations OK. up to n wasted...for Computing Machinery 14(2):203-241, April, 1967. 40] Thomas A. Ottmann , Arnold L. Rosenberg and Larry J. Stockmeyer. A dictionary machine (for VLSI

  19. Analog VLSI neural network integrated circuits

    Science.gov (United States)

    Kub, F. J.; Moon, K. K.; Just, E. A.

    1991-01-01

    Two analog very large scale integration (VLSI) vector matrix multiplier integrated circuit chips were designed, fabricated, and partially tested. They can perform both vector-matrix and matrix-matrix multiplication operations at high speeds. The 32 by 32 vector-matrix multiplier chip and the 128 by 64 vector-matrix multiplier chip were designed to perform 300 million and 3 billion multiplications per second, respectively. An additional circuit that has been developed is a continuous-time adaptive learning circuit. The performance achieved thus far for this circuit is an adaptivity of 28 dB at 300 KHz and 11 dB at 15 MHz. This circuit has demonstrated greater than two orders of magnitude higher frequency of operation than any previous adaptive learning circuit.

  20. Event-driven neural integration and synchronicity in analog VLSI.

    Science.gov (United States)

    Yu, Theodore; Park, Jongkil; Joshi, Siddharth; Maier, Christoph; Cauwenberghs, Gert

    2012-01-01

    Synchrony and temporal coding in the central nervous system, as the source of local field potentials and complex neural dynamics, arises from precise timing relationships between spike action population events across neuronal assemblies. Recently it has been shown that coincidence detection based on spike event timing also presents a robust neural code invariant to additive incoherent noise from desynchronized and unrelated inputs. We present spike-based coincidence detection using integrate-and-fire neural membrane dynamics along with pooled conductance-based synaptic dynamics in a hierarchical address-event architecture. Within this architecture, we encode each synaptic event with parameters that govern synaptic connectivity, synaptic strength, and axonal delay with additional global configurable parameters that govern neural and synaptic temporal dynamics. Spike-based coincidence detection is observed and analyzed in measurements on a log-domain analog VLSI implementation of the integrate-and-fire neuron and conductance-based synapse dynamics.

  1. Learning in Neural Networks: VLSI Implementation Strategies

    Science.gov (United States)

    Duong, Tuan Anh

    1995-01-01

    Fully-parallel hardware neural network implementations may be applied to high-speed recognition, classification, and mapping tasks in areas such as vision, or can be used as low-cost self-contained units for tasks such as error detection in mechanical systems (e.g. autos). Learning is required not only to satisfy application requirements, but also to overcome hardware-imposed limitations such as reduced dynamic range of connections.

  2. A fast neural-network algorithm for VLSI cell placement.

    Science.gov (United States)

    Aykanat, Cevdet; Bultan, Tevfik; Haritaoğlu, Ismail

    1998-12-01

    Cell placement is an important phase of current VLSI circuit design styles such as standard cell, gate array, and Field Programmable Gate Array (FPGA). Although nondeterministic algorithms such as Simulated Annealing (SA) were successful in solving this problem, they are known to be slow. In this paper, a neural network algorithm is proposed that produces solutions as good as SA in substantially less time. This algorithm is based on Mean Field Annealing (MFA) technique, which was successfully applied to various combinatorial optimization problems. A MFA formulation for the cell placement problem is derived which can easily be applied to all VLSI design styles. To demonstrate that the proposed algorithm is applicable in practice, a detailed formulation for the FPGA design style is derived, and the layouts of several benchmark circuits are generated. The performance of the proposed cell placement algorithm is evaluated in comparison with commercial automated circuit design software Xilinx Automatic Place and Route (APR) which uses SA technique. Performance evaluation is conducted using ACM/SIGDA Design Automation benchmark circuits. Experimental results indicate that the proposed MFA algorithm produces comparable results with APR. However, MFA is almost 20 times faster than APR on the average.

  3. A novel reconfigurable optical interconnect architecture using an Opto-VLSI processor and a 4-f imaging system.

    Science.gov (United States)

    Shen, Mingya; Xiao, Feng; Alameh, Kamal

    2009-12-07

    A novel reconfigurable optical interconnect architecture for on-board high-speed data transmission is proposed and experimentally demonstrated. The interconnect architecture is based on the use of an Opto-VLSI processor in conjunction with a 4-f imaging system to achieve reconfigurable chip-to-chip or board-to-board data communications. By reconfiguring the phase hologram of an Opto-VLSI processor, optical data generated by a vertical Cavity Surface Emitting Laser (VCSEL) associated to a chip (or a board) is arbitrarily steered to the photodetector associated to another chip (or another board). Experimental results show that the optical interconnect losses range from 5.8dB to 9.6dB, and that the maximum crosstalk level is below -36dB. The proposed architecture is tested for high-speed data transmission, and measured eye diagrams display good eye opening for data rate of up to 10Gb/s.

  4. High-speed (2.5 Gbps) reconfigurable inter-chip optical interconnects using opto-VLSI processors.

    Science.gov (United States)

    Aljada, Muhsen; Alameh, Kamal E; Lee, Yong-Tak; Chung, Il-Sug

    2006-07-24

    Reconfigurablele optical interconnects enable flexible and high-performance communication in multi-chip architectures to be arbitrarily adapted, leading to efficient parallel signal processing. The use of Opto-VLSI processors as beam steerers and multicasters for reconfigurable inter-chip optical interconnection is discussed. We demonstrate, as proof-of-concept, 2.5 Gbps reconfigurable optical interconnects between an 850nm vertical cavity surface emitting lasers (VCSEL) array and a photodiode (PD) array integrated onto a PCB by driving two Opto-VLSI processors with steering and multicasting digital phase holograms. The architecture is experimentally demonstrated through three scenarios showing its flexibility to perform single, multicasting, and parallel reconfigurable optical interconnects. To our knowledge, this is the first reported high-speed reconfigurable N-to-N optical interconnects architecture, which will have a significant impact on the flexibility and efficiency of large shared-memory multiprocessor machines.

  5. DESIGN AND ANALOG VLSI IMPLEMENTATION OF ARTIFICIAL NEURAL NETWORK

    Directory of Open Access Journals (Sweden)

    D.Yammenavar

    2011-08-01

    Full Text Available Nature has evolved highly advanced systems capable of performing complex computations, adoption andlearning using analog computations. Furthermore nature has evolved techniques to deal with impreciseanalog computations by using redundancy and massive connectivity. In this paper we are making use ofArtificial Neural Network to demonstrate the way in which the biological system processes in analogdomain.We are using 180nm CMOS VLSI technology for implementing circuits which performs arithmeticoperations and for implementing Neural Network. The arithmetic circuits presented here are based onMOS transistors operating in subthreshold region. The basic blocks of artificial neuron are multiplier,adder and neuron activation function.The functionality of designed neural network is verified for analog operations like signal amplificationand frequency multiplication. The network designed can be adopted for digital operations like AND, ORand NOT. The network realizes its functionality for the trained targets which is verified using simulationresults. The schematic, Layout design and verification of proposed Neural Network is carried out usingCadence Virtuoso tool.

  6. Design and Analog VLSI Implementation of Artificial Neural Network

    Directory of Open Access Journals (Sweden)

    Prof. Bapuray.D.Yammenavar

    2011-07-01

    Full Text Available Nature has evolved highly advanced systems capable of performing complex computations, adoption and learning using analog computations. Furthermore nature has evolved techniques to deal with imprecise analog computations by using redundancy and massive connectivity. In this paper we are making use of Artificial Neural Network to demonstrate the way in which the biological system processes in analog domain. We are using 180nm CMOS VLSI technology for implementing circuits which performs arithmetic operations and for implementing Neural Network. The arithmetic circuits presented here are based on MOS transistors operating in subthreshold region. The basic blocks of artificial neuron are multiplier, adder and neuron activation function. The functionality of designed neural network is verified for analog operations like signal amplification and frequency multiplication. The network designed can be adopted for digital operations like AND, OR and NOT. The network realizes its functionality for the trained targets which is verified using simulation results. The schematic, Layout design and verification of proposed Neural Network is carried out using Cadence Virtuoso tool.

  7. Low complexity VLSI implementation of CORDIC-based exponent calculation for neural networks

    Science.gov (United States)

    Aggarwal, Supriya; Khare, Kavita

    2012-11-01

    This article presents a low hardware complexity for exponent calculations based on CORDIC. The proposed CORDIC algorithm is designed to overcome major drawbacks (scale-factor compensation, low range of convergence and optimal selection of micro-rotations) of the conventional CORDIC in hyperbolic mode of operation. The micro-rotations are identified using leading-one bit detection with uni-direction rotations to eliminate redundant iterations and improve throughput. The efficiency and performance of the processor are independent of the probability of rotation angles being known prior to implementation. The eight-staged pipelined architecture implementation requires an 8 × N ROM in the pre-processing unit for storing the initial coordinate values; it no longer requires the ROM for storing the elementary angles. It provides an area-time efficient design for VLSI implementation for calculating exponents in activation functions and Gaussain Potential Functions (GPF) in neural networks. The proposed CORDIC processor requires 32.68% less adders and 72.23% less registers compared to that of the conventional design. The proposed design when implemented on Virtex 2P (2vp50ff1148-6) device, dissipates 55.58% less power and has 45.09% less total gate count and 16.91% less delay as compared to Xilinx CORDIC Core. The detailed algorithm design along with FPGA implementation and area and time complexities is presented.

  8. High-Level Synthesis of VLSI Processors for Intelligent Integrated SystemsBased on Logic-in-Memory Structure

    Science.gov (United States)

    Kudoh, Takao; Kameyama, Michitaka

    One of the most serious problems in recent VLSI systems is data transfer bottleneck between memories and processing elements. To solve the problem, a model of highly parallel VLSI processors for intelligent integrated systems is presented. A logic-in-memory module composed of a processing element, a register and a local memory is defined as a basic building block to form a regular parallel structure. The data transfer between adjacent modules are done simply in a single clock period by a shift-register chain. A high-level synthesis method is discussed on the hardware model, when a data-dependency graph corresponding to a processing algorithm is given. We must simultaneously consider both scheduling and allocation for the time optimization problem under a constraint of an chip area. That is, we consider the best scheduling together with allocation such that the processing time becomes minimum under a constraint of a fixed number of modules. Not only an exhaustive enumeration method but also a branch-and-bound method is proposed for the problem. As a result, it is made clear that the proposed high-level synthesis method is very effective to design special-purpose VLSI processors free from data transfer bottleneck.

  9. Novel broadband reconfigurable optical add-drop multiplexer employing custom fiber arrays and Opto-VLSI processors.

    Science.gov (United States)

    Xiao, Feng; Juswardy, Budi; Alameh, Kamal; Lee, Yong Tak

    2008-08-04

    A reconfigurable optical add/drop multiplexer (ROADM) structure based on using a custom-made fiber array and an Opto-VLSI processor is proposed and demonstrated. The fiber array consists of N pairs of angled fibers corresponding to N channels, each of which can independently perform add, drop, and thru functions through a reconfigurable Opto-VLSI beam steerer. Experimental results show that the ROADM structure can attain an average add, drop/thru insertion loss of 5.5 dB and a uniformity of 0.3 dB over a wide bandwidth from 1524 nm to 1576 nm, and a drop/thru crosstalk level as small as -40 dB.

  10. Design of 10Gbps optical encoder/decoder structure for FE-OCDMA system using SOA and opto-VLSI processors.

    Science.gov (United States)

    Aljada, Muhsen; Hwang, Seow; Alameh, Kamal

    2008-01-21

    In this paper we propose and experimentally demonstrate a reconfigurable 10Gbps frequency-encoded (1D) encoder/decoder structure for optical code division multiple access (OCDMA). The encoder is constructed using a single semiconductor optical amplifier (SOA) and 1D reflective Opto-VLSI processor. The SOA generates broadband amplified spontaneous emission that is dynamically sliced using digital phase holograms loaded onto the Opto-VLSI processor to generate 1D codewords. The selected wavelengths are injected back into the same SOA for amplifications. The decoder is constructed using single Opto-VLSI processor only. The encoded signal can successfully be retrieved at the decoder side only when the digital phase holograms of the encoder and the decoder are matched. The system performance is measured in terms of the auto-correlation and cross-correlation functions as well as the eye diagram.

  11. VLSI System Implementation of 200 MHz, 8-bit, 90nm CMOS Arithmetic and Logic Unit (ALU) Processor Controller

    OpenAIRE

    2012-01-01

    In this present study includes the Very Large Scale Integration (VLSI) system implementation of 200MHz, 8-bit, 90nm Complementary Metal Oxide Semiconductor (CMOS) Arithmetic and Logic Unit (ALU) processor control with logic gate design style and 0.12µm six metal 90nm CMOS fabrication technology. The system blocks and the behaviour are defined and the logical design is implemented in gate level in the design phase. Then, the logic circuits are simulated and the subunits are converted in to 90n...

  12. Constant fan-in digital neural networks are VLSI-optimal

    Energy Technology Data Exchange (ETDEWEB)

    Beiu, V.

    1995-12-31

    The paper presents a theoretical proof revealing an intrinsic limitation of digital VLSI technology: its inability to cope with highly connected structures (e.g. neural networks). We are in fact able to prove that efficient digital VLSI implementations (known as VLSI-optimal when minimizing the AT{sup 2} complexity measure - A being the area of the chip, and T the delay for propagating the inputs to the outputs) of neural networks are achieved for small-constant fan-in gates. This result builds on quite recent ones dealing with a very close estimate of the area of neural networks when implemented by threshold gates, but it is also valid for classical Boolean gates. Limitations and open questions are presented in the conclusions.

  13. VLSI System Implementation of 200 MHz, 8-bit, 90nm CMOS Arithmetic and Logic Unit (ALU Processor Controller

    Directory of Open Access Journals (Sweden)

    Fazal NOORBASHA

    2012-08-01

    Full Text Available In this present study includes the Very Large Scale Integration (VLSI system implementation of 200MHz, 8-bit, 90nm Complementary Metal Oxide Semiconductor (CMOS Arithmetic and Logic Unit (ALU processor control with logic gate design style and 0.12µm six metal 90nm CMOS fabrication technology. The system blocks and the behaviour are defined and the logical design is implemented in gate level in the design phase. Then, the logic circuits are simulated and the subunits are converted in to 90nm CMOS layout. Finally, in order to construct the VLSI system these units are placed in the floor plan and simulated with analog and digital, logic and switch level simulators. The results of the simulations indicates that the VLSI system can control different instructions which can divided into sub groups: transfer instructions, arithmetic and logic instructions, rotate and shift instructions, branch instructions, input/output instructions, control instructions. The data bus of the system is 16-bit. It runs at 200MHz, and operating power is 1.2V. In this paper, the parametric analysis of the system, the design steps and obtained results are explained.

  14. VLSI neuroprocessors

    Science.gov (United States)

    Kemeny, Sabrina E.

    1994-01-01

    Electronic and optoelectronic hardware implementations of highly parallel computing architectures address several ill-defined and/or computation-intensive problems not easily solved by conventional computing techniques. The concurrent processing architectures developed are derived from a variety of advanced computing paradigms including neural network models, fuzzy logic, and cellular automata. Hardware implementation technologies range from state-of-the-art digital/analog custom-VLSI to advanced optoelectronic devices such as computer-generated holograms and e-beam fabricated Dammann gratings. JPL's concurrent processing devices group has developed a broad technology base in hardware implementable parallel algorithms, low-power and high-speed VLSI designs and building block VLSI chips, leading to application-specific high-performance embeddable processors. Application areas include high throughput map-data classification using feedforward neural networks, terrain based tactical movement planner using cellular automata, resource optimization (weapon-target assignment) using a multidimensional feedback network with lateral inhibition, and classification of rocks using an inner-product scheme on thematic mapper data. In addition to addressing specific functional needs of DOD and NASA, the JPL-developed concurrent processing device technology is also being customized for a variety of commercial applications (in collaboration with industrial partners), and is being transferred to U.S. industries. This viewgraph p resentation focuses on two application-specific processors which solve the computation intensive tasks of resource allocation (weapon-target assignment) and terrain based tactical movement planning using two extremely different topologies. Resource allocation is implemented as an asynchronous analog competitive assignment architecture inspired by the Hopfield network. Hardware realization leads to a two to four order of magnitude speed-up over conventional

  15. The design, fabrication, and test of a new VLSI hybrid analog-digital neural processing element

    Science.gov (United States)

    Deyong, Mark R.; Findley, Randall L.; Fields, Chris

    1992-01-01

    A hybrid analog-digital neural processing element with the time-dependent behavior of biological neurons has been developed. The hybrid processing element is designed for VLSI implementation and offers the best attributes of both analog and digital computation. Custom VLSI layout reduces the layout area of the processing element, which in turn increases the expected network density. The hybrid processing element operates at the nanosecond time scale, which enables it to produce real-time solutions to complex spatiotemporal problems found in high-speed signal processing applications. VLSI prototype chips have been designed, fabricated, and tested with encouraging results. Systems utilizing the time-dependent behavior of the hybrid processing element have been simulated and are currently in the fabrication process. Future applications are also discussed.

  16. The design, fabrication, and test of a new VLSI hybrid analog-digital neural processing element

    Science.gov (United States)

    Deyong, Mark R.; Findley, Randall L.; Fields, Chris

    1992-01-01

    A hybrid analog-digital neural processing element with the time-dependent behavior of biological neurons has been developed. The hybrid processing element is designed for VLSI implementation and offers the best attributes of both analog and digital computation. Custom VLSI layout reduces the layout area of the processing element, which in turn increases the expected network density. The hybrid processing element operates at the nanosecond time scale, which enables it to produce real-time solutions to complex spatiotemporal problems found in high-speed signal processing applications. VLSI prototype chips have been designed, fabricated, and tested with encouraging results. Systems utilizing the time-dependent behavior of the hybrid processing element have been simulated and are currently in the fabrication process. Future applications are also discussed.

  17. Accurate and Precise Computation Using Analog VLSI, with Applications to Computer Graphics and Neural Networks.

    Science.gov (United States)

    Kirk, David Blair

    This thesis develops an engineering practice and design methodology to enable us to use CMOS analog VLSI chips to perform more accurate and precise computation. These techniques form the basis of an approach that permits us to build computer graphics and neural network applications using analog VLSI. The nature of the design methodology focuses on defining goals for circuit behavior to be met as part of the design process. To increase the accuracy of analog computation, we develop techniques for creating compensated circuit building blocks, where compensation implies the cancellation of device variations, offsets, and nonlinearities. These compensated building blocks can be used as components in larger and more complex circuits, which can then also be compensated. To this end, we develop techniques for automatically determining appropriate parameters for circuits, using constrained optimization. We also fabricate circuits that implement multi-dimensional gradient estimation for a gradient descent optimization technique. The parameter-setting and optimization tools allow us to automatically choose values for compensating our circuit building blocks, based on our goals for the circuit performance. We can also use the techniques to optimize parameters for larger systems, applying the goal-based techniques hierarchically. We also describe a set of thought experiments involving circuit techniques for increasing the precision of analog computation. Our engineering design methodology is a step toward easier use of analog VLSI to solve problems in computer graphics and neural networks. We provide data measured from compensated multipliers built using these design techniques. To demonstrate the feasibility of using analog VLSI for more quantitative computation, we develop small applications using the goal-based design approach and compensated components. Finally, we conclude by discussing the expected significance of this work for the wider use of analog VLSI for

  18. Current-mode subthreshold MOS circuits for analog VLSI neural systems

    Science.gov (United States)

    Andreou, Andreas G.; Boahen, Kwabena A.; Pouliquen, Philippe O.; Pavasovic, Aleksandra; Jenkins, Robert E.

    1991-03-01

    An overview of the current-mode approach for designing analog VLSI neural systems in subthreshold CMOS technology is presented. Emphasis is given to design techniques at the device level using the current-controlled current conveyor and the translinear principle. Circuits for associative memory and silicon retina systems are used as examples. The design methodology and how it relates to actual biological microcircuits are discussed.

  19. Current-mode subthreshold MOS circuits for analog VLSI neural systems.

    Science.gov (United States)

    Andreou, A G; Boahen, K A; Pouliquen, P O; Pavasovic, A; Jenkins, R E; Strohbehn, K

    1991-01-01

    An overview of the current-mode approach for designing analog VLSI neural systems in subthreshold CMOS technology is presented. Emphasis is given to design techniques at the device level using the current-controlled current conveyor and the translinear principle. Circuits for associative memory and silicon retina systems are used as examples. The design methodology and how it relates to actual biological microcircuits are discussed.

  20. VLSI Design of a Variable-Length FFT/IFFT Processor for OFDM-Based Communication Systems

    Directory of Open Access Journals (Sweden)

    Jen-Chih Kuo

    2003-12-01

    Full Text Available The technique of {orthogonal frequency division multiplexing (OFDM} is famous for its robustness against frequency-selective fading channel. This technique has been widely used in many wired and wireless communication systems. In general, the {fast Fourier transform (FFT} and {inverse FFT (IFFT} operations are used as the modulation/demodulation kernel in the OFDM systems, and the sizes of FFT/IFFT operations are varied in different applications of OFDM systems. In this paper, we design and implement a variable-length prototype FFT/IFFT processor to cover different specifications of OFDM applications. The cached-memory FFT architecture is our suggested VLSI system architecture to design the prototype FFT/IFFT processor for the consideration of low-power consumption. We also implement the twiddle factor butterfly {processing element (PE} based on the {{coordinate} rotation digital computer (CORDIC} algorithm, which avoids the use of conventional multiplication-and-accumulation unit, but evaluates the trigonometric functions using only add-and-shift operations. Finally, we implement a variable-length prototype FFT/IFFT processor with TSMC 0.35 μm 1P4M CMOS technology. The simulations results show that the chip can perform (64-2048-point FFT/IFFT operations up to 80 MHz operating frequency which can meet the speed requirement of most OFDM standards such as WLAN, ADSL, VDSL (256∼2K, DAB, and 2K-mode DVB.

  1. Design and Realization of Array Signal Processor VLSI Architecture for Phased Array System

    Directory of Open Access Journals (Sweden)

    D. Govind Rao

    2016-08-01

    Full Text Available A method for implementing an array signal processor for phased array radars. The array signal processor can receive planar array antenna inputs and can process it. It is based on the application of Adaptive Digital beam formers using FPGAs. Adaptive filter algorithm used here is Inverse Q-R Decomposition based Recursive Least Squares (IQRD-RLS [1] algorithm. Array signal processor based on FPGAs is suitable in the areas of Phased Array Radar receiver, where speed, accuracy and numerical stability are of utmost important. Using IQRD-RLS algorithm, optimal weights are calculated in much less time compared to conventional QRD-RLS algorithm. A customized multiple FPGA board comprising three Kintex-7 FPGAs is employed to implement array signal processor. The proposed architecture can form multiple beams from planar array antenna elements

  2. Cascaded VLSI Chips Help Neural Network To Learn

    Science.gov (United States)

    Duong, Tuan A.; Daud, Taher; Thakoor, Anilkumar P.

    1993-01-01

    Cascading provides 12-bit resolution needed for learning. Using conventional silicon chip fabrication technology of VLSI, fully connected architecture consisting of 32 wide-range, variable gain, sigmoidal neurons along one diagonal and 7-bit resolution, electrically programmable, synaptic 32 x 31 weight matrix implemented on neuron-synapse chip. To increase weight nominally from 7 to 13 bits, synapses on chip individually cascaded with respective synapses on another 32 x 32 matrix chip with 7-bit resolution synapses only (without neurons). Cascade correlation algorithm varies number of layers effectively connected into network; adds hidden layers one at a time during learning process in such way as to optimize overall number of neurons and complexity and configuration of network.

  3. A pipeline VLSI design of fast singular value decomposition processor for real-time EEG system based on on-line recursive independent component analysis.

    Science.gov (United States)

    Huang, Kuan-Ju; Shih, Wei-Yeh; Chang, Jui Chung; Feng, Chih Wei; Fang, Wai-Chi

    2013-01-01

    This paper presents a pipeline VLSI design of fast singular value decomposition (SVD) processor for real-time electroencephalography (EEG) system based on on-line recursive independent component analysis (ORICA). Since SVD is used frequently in computations of the real-time EEG system, a low-latency and high-accuracy SVD processor is essential. During the EEG system process, the proposed SVD processor aims to solve the diagonal, inverse and inverse square root matrices of the target matrices in real time. Generally, SVD requires a huge amount of computation in hardware implementation. Therefore, this work proposes a novel design concept for data flow updating to assist the pipeline VLSI implementation. The SVD processor can greatly improve the feasibility of real-time EEG system applications such as brain computer interfaces (BCIs). The proposed architecture is implemented using TSMC 90 nm CMOS technology. The sample rate of EEG raw data adopts 128 Hz. The core size of the SVD processor is 580×580 um(2), and the speed of operation frequency is 20MHz. It consumes 0.774mW of power during the 8-channel EEG system per execution time.

  4. VLSI implementation of a nonlinear neuronal model: a "neural prosthesis" to restore hippocampal trisynaptic dynamics.

    Science.gov (United States)

    Hsiao, Min-Chi; Chan, Chiu-Hsien; Srinivasan, Vijay; Ahuja, Ashish; Erinjippurath, Gopal; Zanos, Theodoros P; Gholmieh, Ghassan; Song, Dong; Wills, Jack D; LaCoss, Jeff; Courellis, Spiros; Tanguay, Armand R; Granacki, John J; Marmarelis, Vasilis Z; Berger, Theodore W

    2006-01-01

    We are developing a biomimetic electronic neural prosthesis to replace regions of the hippocampal brain area that have been damaged by disease or insult. We have used the hippocampal slice preparation as the first step in developing such a prosthesis. The major intrinsic circuitry of the hippocampus consists of an excitatory cascade involving the dentate gyrus (DG), CA3, and CA1 subregions; this trisynaptic circuit can be maintained in a transverse slice preparation. Our demonstration of a neural prosthesis for the hippocampal slice involves: (i) surgically removing CA3 function from the trisynaptic circuit by transecting CA3 axons, (ii) replacing biological CA3 function with a hardware VLSI (very large scale integration) model of the nonlinear dynamics of CA3, and (iii) through a specially designed multi-site electrode array, transmitting DG output to the hardware device, and routing the hardware device output to the synaptic inputs of the CA1 subregion, thus by-passing the damaged CA3. Field EPSPs were recorded from the CA1 dendritic zone in intact slices and "hybrid" DG-VLSI-CA1 slices. Results show excellent agreement between data from intact slices and transected slices with the hardware-substituted CA3: propagation of temporal patterns of activity from DG-->VLSI-->CA1 reproduces that observed experimentally in the biological DG-->CA3-->CA1 circuit.

  5. A configurable realtime DWT-based neural data compression and communication VLSI system for wireless implants.

    Science.gov (United States)

    Yang, Yuning; Kamboh, Awais M; Mason, Andrew J

    2014-04-30

    This paper presents the design of a complete multi-channel neural recording compression and communication system for wireless implants that addresses the challenging simultaneous requirements for low power, high bandwidth and error-free communication. The compression engine implements discrete wavelet transform (DWT) and run length encoding schemes and offers a practical data compression solution that faithfully preserves neural information. The communication engine encodes data and commands separately into custom-designed packet structures utilizing a protocol capable of error handling. VLSI hardware implementation of these functions, within the design constraints of a 32-channel neural compression implant, is presented. Designed in 0.13μm CMOS, the core of the neural compression and communication chip occupies only 1.21mm(2) and consumes 800μW of power (25μW per channel at 26KS/s) demonstrating an effective solution for intra-cortical neural interfaces.

  6. A unified approach to VLSI layout automation and algorithm mapping on processor arrays

    Science.gov (United States)

    Venkateswaran, N.; Pattabiraman, S.; Srinivasan, Vinoo N.

    1993-01-01

    Development of software tools for designing supercomputing systems is highly complex and cost ineffective. To tackle this a special purpose PAcube silicon compiler which integrates different design levels from cell to processor arrays has been proposed. As a part of this, we present in this paper a novel methodology which unifies the problems of Layout Automation and Algorithm Mapping.

  7. VLSI based FFT Processor with Improvement in Computation Speed and Area Reduction

    Directory of Open Access Journals (Sweden)

    M.Sheik Mohamed

    2013-06-01

    Full Text Available In this paper, a modular approach is presented to develop parallel pipelined architectures for the fast Fourier transform (FFT processor. The new pipelined FFT architecture has the advantage of underutilized hardware based on the complex conjugate of final stage results without increasing the hardware complexity. The operating frequency of the new architecture can be decreased that in turn reduces the power consumption. A comparison of area and computing time are drawn between the new design and the previous architectures. The new structure is synthesized using Xilinx ISE and simulated using ModelSim Starter Edition. The designed FFT algorithm is realized in our processor to reduce the number of complex computations.

  8. Design and VLSI Implementation of Anticollision Enabled Robot Processor Using RFID Technology

    Directory of Open Access Journals (Sweden)

    Joyashree Bag

    2012-12-01

    Full Text Available RFID is a low power wireless emerging technology which has given rise to highly promising applications in real life. It can be employed for robot navigation. In multi-robot environment, when many robots are moving in the same work space, there is a possibility of their physical collision with themselves as well as with physical objects. In the present work, we have proposed and developed a processor incorporating smart algorithm for avoiding such collisions with the help of RFID technology and implemented it by using VHDL. The design procedure and the simulated results are very useful in designing and implementing a practical RFID system. The RTL schematic view of the processor is achieved by successfully synthesizing the proposed design.KEYWORDS

  9. Optimal neural computations require analog processors

    Energy Technology Data Exchange (ETDEWEB)

    Beiu, V.

    1998-12-31

    This paper discusses some of the limitations of hardware implementations of neural networks. The authors start by presenting neural structures and their biological inspirations, while mentioning the simplifications leading to artificial neural networks. Further, the focus will be on hardware imposed constraints. They will present recent results for three different alternatives of parallel implementations of neural networks: digital circuits, threshold gate circuits, and analog circuits. The area and the delay will be related to the neurons` fan-in and to the precision of their synaptic weights. The main conclusion is that hardware-efficient solutions require analog computations, and suggests the following two alternatives: (i) cope with the limitations imposed by silicon, by speeding up the computation of the elementary silicon neurons; (2) investigate solutions which would allow the use of the third dimension (e.g. using optical interconnections).

  10. Review: “Implementation of Feedforward and Feedback Neural Network for Signal Processing Using Analog VLSI Technology”

    Directory of Open Access Journals (Sweden)

    Miss. Rachana R. Patil

    2015-01-01

    Full Text Available Main focus of project is on implementation of Neural Network Architecture (NNA with on chip learning on Analog VLSI Technology for signal processing application. In the proposed paper the analog components like Gilbert Cell Multiplier (GCM, Neuron Activation Function (NAF are used to implement artificial NNA. Analog components used comprises of multiplier, adder and tan sigmoidal function circuit using MOS transistor. This Neural Architecture is trained using Back Propagation (BP Algorithm in analog domain with new techniques of weight storage. Layout design and verification of above design is carried out using VLSI Backend Microwind 3.1 software Tool. The technology used to design layout is 32 nm CMOS Technology

  11. VLSI Implementation of a Bio-inspired Olfactory Spiking Neural Network

    Science.gov (United States)

    Hsieh, Hung-Yi; Tang, Kea-Tiong

    2011-11-01

    This paper proposes a VLSI circuit implementing a low power, high-resolution spiking neural network (SNN) with STDP synapses, inspired by mammalian olfactory systems. By representing mitral cell action potential by a step function, the power consumption and the chip area can be reduced. By cooperating sub-threshold oscillation and inhibition, the network outputs can be distinct. This circuit was fabricated using the TSMC 0.18 μm 1P6M CMOS process. Post-layout simulation results are reported.

  12. Non-linear feedback neural networks VLSI implementations and applications

    CERN Document Server

    Ansari, Mohd Samar

    2014-01-01

    This book aims to present a viable alternative to the Hopfield Neural Network (HNN) model for analog computation. It is well known that the standard HNN suffers from problems of convergence to local minima, and requirement of a large number of neurons and synaptic weights. Therefore, improved solutions are needed. The non-linear synapse neural network (NoSyNN) is one such possibility and is discussed in detail in this book. This book also discusses the applications in computationally intensive tasks like graph coloring, ranking, and linear as well as quadratic programming. The material in the book is useful to students, researchers and academician working in the area of analog computation.

  13. VLSI Neural Networks Help To Compress Video Signals

    Science.gov (United States)

    Fang, Wai-Chi; Sheu, Bing J.

    1996-01-01

    Advanced analog/digital electronic system for compression of video signals incorporates artificial neural networks. Performs motion-estimation and image-data-compression processing. Effectively eliminates temporal and spatial redundancies of sequences of video images; processes video image data, retaining only nonredundant parts to be transmitted, then transmits resulting data stream in form of efficient code. Reduces bandwidth and storage requirements for transmission and recording of video signal.

  14. VLSI synthesis of digital application specific neural networks

    Science.gov (United States)

    Beagles, Grant; Winters, Kel

    1991-01-01

    Neural networks tend to fall into two general categories: (1) software simulations, or (2) custom hardware that must be trained. The scope of this project is the merger of these two classifications into a system whereby a software model of a network is trained to perform a specific task and the results used to synthesize a standard cell realization of the network using automated tools.

  15. How to build VLSI-efficient neural chips

    Energy Technology Data Exchange (ETDEWEB)

    Beiu, V.

    1998-02-01

    This paper presents several upper and lower bounds for the number-of-bits required for solving a classification problem, as well as ways in which these bounds can be used to efficiently build neural network chips. The focus will be on complexity aspects pertaining to neural networks: (1) size complexity and depth (size) tradeoffs, and (2) precision of weights and thresholds as well as limited interconnectivity. They show difficult problems-exponential growth in either space (precision and size) and/or time (learning and depth)-when using neural networks for solving general classes of problems (particular cases may enjoy better performances). The bounds for the number-of-bits required for solving a classification problem represent the first step of a general class of constructive algorithms, by showing how the quantization of the input space could be done in O (m{sup 2}n) steps. Here m is the number of examples, while n is the number of dimensions. The second step of the algorithm finds its roots in the implementation of a class of Boolean functions using threshold gates. It is substantiated by mathematical proofs for the size O (mn/{Delta}), and the depth O [log(mn)/log{Delta}] of the resulting network (here {Delta} is the maximum fan in). Using the fan in as a parameter, a full class of solutions can be designed. The third step of the algorithm represents a reduction of the size and an increase of its generalization capabilities. Extensions by using analogue COMPARISONs, allows for real inputs, and increase the generalization capabilities at the expense of longer training times. Finally, several solutions which can lower the size of the resulting neural network are detailed. The interesting aspect is that they are obtained for limited, or even constant, fan-ins. In support of these claims many simulations have been performed and are called upon.

  16. Cellular pulse-coupled neural network with adaptive weights for image segmentation and its VLSI implementation

    Science.gov (United States)

    Schreiter, Juerg; Ramacher, Ulrich; Heittmann, Arne; Matolin, Daniel; Schuffny, Rene

    2004-05-01

    We present a cellular pulse coupled neural network with adaptive weights and its analog VLSI implementation. The neural network operates on a scalar image feature, such as grey scale or the output of a spatial filter. It detects segments and marks them with synchronous pulses of the corresponding neurons. The network consists of integrate-and-fire neurons, which are coupled to their nearest neighbors via adaptive synaptic weights. Adaptation follows either one of two empirical rules. Both rules lead to spike grouping in wave like patterns. This synchronous activity binds groups of neurons and labels the corresponding image segments. Applications of the network also include feature preserving noise removal, image smoothing, and detection of bright and dark spots. The adaptation rules are insensitive for parameter deviations, mismatch and non-ideal approximation of the implied functions. That makes an analog VLSI implementation feasible. Simulations showed no significant differences in the synchronization properties between networks using the ideal adaptation rules and networks resembling implementation properties such as randomly distributed parameters and roughly implemented adaptation functions. A prototype is currently being designed and fabricated using an Infineon 130nm technology. It comprises a 128 × 128 neuron array, analog image memory, and an address event representation pulse output.

  17. A radial basis function neurocomputer implemented with analog VLSI circuits

    Science.gov (United States)

    Watkins, Steven S.; Chau, Paul M.; Tawel, Raoul

    1992-01-01

    An electronic neurocomputer which implements a radial basis function neural network (RBFNN) is described. The RBFNN is a network that utilizes a radial basis function as the transfer function. The key advantages of RBFNNs over existing neural network architectures include reduced learning time and the ease of VLSI implementation. This neurocomputer is based on an analog/digital hybrid design and has been constructed with both custom analog VLSI circuits and a commercially available digital signal processor. The hybrid architecture is selected because it offers high computational performance while compensating for analog inaccuracies, and it features the ability to model large problems.

  18. A VLSI Neural Monitoring System With Ultra-Wideband Telemetry for Awake Behaving Subjects.

    Science.gov (United States)

    Greenwald, E; Mollazadeh, M; Hu, C; Wei Tang; Culurciello, E; Thakor, V

    2011-04-01

    Long-term monitoring of neuronal activity in awake behaving subjects can provide fundamental information about brain dynamics for neuroscience and neuroengineering applications. Here, we present a miniature, lightweight, and low-power recording system for monitoring neural activity in awake behaving animals. The system integrates two custom designed very-large-scale integrated chips, a neural interface module fabricated in 0.5 μm complementary metal-oxide semiconductor technology and an ultra-wideband transmitter module fabricated in a 0.5 μm silicon-on-sapphire (SOS) technology. The system amplifies, filters, digitizes, and transmits 16 channels of neural data at a rate of 1 Mb/s. The entire system, which includes the VLSI circuits, a digital interface board, a battery, and a custom housing, is small and lightweight (24 g) and, thus, can be chronically mounted on small animals. The system consumes 4.8 mA and records continuously for up to 40 h powered by a 3.7-V, 200-mAh rechargeable lithium-ion battery. Experimental benchtop characterizations as well as in vivo multichannel neural recordings from awake behaving rats are presented here.

  19. An efficient VLSI implementation of on-line recursive ICA processor for real-time multi-channel EEG signal separation.

    Science.gov (United States)

    Shih, Wei-Yeh; Liao, Jui-Chieh; Huang, Kuan-Ju; Fang, Wai-Chi; Cauwenberghs, Gert; Jung, Tzyy-Ping

    2013-01-01

    This paper presents an efficient VLSI implementation of on-line recursive ICA (ORICA) processor for real-time multi-channel EEG signal separation. The proposed design contains a system control unit, a whitening unit, a singular value decomposition unit, a floating matrix multiply unit and, and an ORICA weight training unit. Because the input sample rate of the ORICA processor is 128 Hz, the ORICA processor should produce independent components before the next sample is input in 1/128 s. Under the timing constraints of commutating multi-channel ORICA in real time, the design of the ORICA processor is a mixed architecture, which is designed as different hardware parallelism according to the complexity of processing units. The shared arithmetic processing unit and shared register can reduce hardware complexity and power consumption. The proposed design is implemented used TSMC 90 nm CMOS technology with 8-channel EEG processing in 128 Hz sample rate of raw data and consumes 2.827 mW at 50 MHz clock rate. The performance of the proposed design is also shown to reach 0.0078125 s latency after each EEG sample time, and the average correlation coefficient between the original source signals and extracted ORICA signals for each 1 s frame is 0.9763.

  20. Biophysical Neural Spiking, Bursting, and Excitability Dynamics in Reconfigurable Analog VLSI.

    Science.gov (United States)

    Yu, T; Sejnowski, T J; Cauwenberghs, G

    2011-10-01

    We study a range of neural dynamics under variations in biophysical parameters underlying extended Morris-Lecar and Hodgkin-Huxley models in three gating variables. The extended models are implemented in NeuroDyn, a four neuron, twelve synapse continuous-time analog VLSI programmable neural emulation platform with generalized channel kinetics and biophysical membrane dynamics. The dynamics exhibit a wide range of time scales extending beyond 100 ms neglected in typical silicon models of tonic spiking neurons. Circuit simulations and measurements show transition from tonic spiking to tonic bursting dynamics through variation of a single conductance parameter governing calcium recovery. We similarly demonstrate transition from graded to all-or-none neural excitability in the onset of spiking dynamics through the variation of channel kinetic parameters governing the speed of potassium activation. Other combinations of variations in conductance and channel kinetic parameters give rise to phasic spiking and spike frequency adaptation dynamics. The NeuroDyn chip consumes 1.29 mW and occupies 3 mm × 3 mm in 0.5 μm CMOS, supporting emerging developments in neuromorphic silicon-neuron interfaces.

  1. Robust Working Memory in an Asynchronously Spiking Neural Network Realized with Neuromorphic VLSI.

    Science.gov (United States)

    Giulioni, Massimiliano; Camilleri, Patrick; Mattia, Maurizio; Dante, Vittorio; Braun, Jochen; Del Giudice, Paolo

    2011-01-01

    We demonstrate bistable attractor dynamics in a spiking neural network implemented with neuromorphic VLSI hardware. The on-chip network consists of three interacting populations (two excitatory, one inhibitory) of leaky integrate-and-fire (LIF) neurons. One excitatory population is distinguished by strong synaptic self-excitation, which sustains meta-stable states of "high" and "low"-firing activity. Depending on the overall excitability, transitions to the "high" state may be evoked by external stimulation, or may occur spontaneously due to random activity fluctuations. In the former case, the "high" state retains a "working memory" of a stimulus until well after its release. In the latter case, "high" states remain stable for seconds, three orders of magnitude longer than the largest time-scale implemented in the circuitry. Evoked and spontaneous transitions form a continuum and may exhibit a wide range of latencies, depending on the strength of external stimulation and of recurrent synaptic excitation. In addition, we investigated "corrupted" "high" states comprising neurons of both excitatory populations. Within a "basin of attraction," the network dynamics "corrects" such states and re-establishes the prototypical "high" state. We conclude that, with effective theoretical guidance, full-fledged attractor dynamics can be realized with comparatively small populations of neuromorphic hardware neurons.

  2. Optogenetics in Silicon: A Neural Processor for Predicting Optically Active Neural Networks.

    Science.gov (United States)

    Luo, Junwen; Nikolic, Konstantin; Evans, Benjamin D; Dong, Na; Sun, Xiaohan; Andras, Peter; Yakovlev, Alex; Degenaar, Patrick

    2016-08-17

    We present a reconfigurable neural processor for real-time simulation and prediction of opto-neural behaviour. We combined a detailed Hodgkin-Huxley CA3 neuron integrated with a four-state Channelrhodopsin-2 (ChR2) model into reconfigurable silicon hardware. Our architecture consists of a Field Programmable Gated Array (FPGA) with a custom-built computing data-path, a separate data management system and a memory approach based router. Advancements over previous work include the incorporation of short and long-term calcium and light-dependent ion channels in reconfigurable hardware. Also, the developed processor is computationally efficient, requiring only 0.03 ms processing time per sub-frame for a single neuron and 9.7 ms for a fully connected network of 500 neurons with a given FPGA frequency of 56.7 MHz. It can therefore be utilized for exploration of closed loop processing and tuning of biologically realistic optogenetic circuitry.

  3. Optogenetics in Silicon: A Neural Processor for Predicting Optically Active Neural Networks.

    Science.gov (United States)

    Junwen Luo; Nikolic, Konstantin; Evans, Benjamin D; Na Dong; Xiaohan Sun; Andras, Peter; Yakovlev, Alex; Degenaar, Patrick

    2017-02-01

    We present a reconfigurable neural processor for real-time simulation and prediction of opto-neural behaviour. We combined a detailed Hodgkin-Huxley CA3 neuron integrated with a four-state Channelrhodopsin-2 (ChR2) model into reconfigurable silicon hardware. Our architecture consists of a Field Programmable Gated Array (FPGA) with a custom-built computing data-path, a separate data management system and a memory approach based router. Advancements over previous work include the incorporation of short and long-term calcium and light-dependent ion channels in reconfigurable hardware. Also, the developed processor is computationally efficient, requiring only 0.03 ms processing time per sub-frame for a single neuron and 9.7 ms for a fully connected network of 500 neurons with a given FPGA frequency of 56.7 MHz. It can therefore be utilized for exploration of closed loop processing and tuning of biologically realistic optogenetic circuitry.

  4. Phase-Synchronization Early Epileptic Seizure Detector VLSI Architecture.

    Science.gov (United States)

    Abdelhalim, K; Smolyakov, V; Genov, R

    2011-10-01

    A low-power VLSI processor architecture that computes in real time the magnitude and phase-synchronization of two input neural signals is presented. The processor is a part of an envisioned closed-loop implantable microsystem for adaptive neural stimulation. The architecture uses three CORDIC processing cores that require shift-and-add operations but no multiplication. The 10-bit processor synthesized and prototyped in a standard 1.2 V 0.13 μm CMOS technology utilizes 41,000 logic gates. It dissipates 3.6 μW per input pair, and provides 1.7 kS/s per-channel throughput when clocked at 2.5 MHz. The power scales linearly with the number of input channels or the sampling rate. The efficacy of the processor in early epileptic seizure detection is validated on human intracranial EEG data.

  5. VLSI design

    CERN Document Server

    Basu, D K

    2014-01-01

    Very Large Scale Integrated Circuits (VLSI) design has moved from costly curiosity to an everyday necessity, especially with the proliferated applications of embedded computing devices in communications, entertainment and household gadgets. As a result, more and more knowledge on various aspects of VLSI design technologies is becoming a necessity for the engineering/technology students of various disciplines. With this goal in mind the course material of this book has been designed to cover the various fundamental aspects of VLSI design, like Categorization and comparison between various technologies used for VLSI design Basic fabrication processes involved in VLSI design Design of MOS, CMOS and Bi CMOS circuits used in VLSI Structured design of VLSI Introduction to VHDL for VLSI design Automated design for placement and routing of VLSI systems VLSI testing and testability The various topics of the book have been discussed lucidly with analysis, when required, examples, figures and adequate analytical and the...

  6. Neural network surface acoustic wave RF signal processor for digital modulation recognition.

    Science.gov (United States)

    Kavalov, Dimitar; Kalinin, Victor

    2002-09-01

    An architecture of a surface acoustic wave (SAW) processor based on an artificial neural network is proposed for an automatic recognition of different types of digital passband modulation. Three feed-forward networks are trained to recognize filtered and unfiltered binary phase shift keying (BPSK) and quadrature phase shift keying (QPSK) signals, as well as unfiltered BPSK, QPSK, and 16 quadrature amplitude (16QAM) signals. Performance of the processor in the presence of additive white Gaussian noise (AWGN) is simulated. The influence of second-order effects in SAW devices, phase, and amplitude errors on the performance of the processor also is studied.

  7. VLSI design

    CERN Document Server

    Chandrasetty, Vikram Arkalgud

    2011-01-01

    This book provides insight into the practical design of VLSI circuits. It is aimed at novice VLSI designers and other enthusiasts who would like to understand VLSI design flows. Coverage includes key concepts in CMOS digital design, design of DSP and communication blocks on FPGAs, ASIC front end and physical design, and analog and mixed signal design. The approach is designed to focus on practical implementation of key elements of the VLSI design process, in order to make the topic accessible to novices. The design concepts are demonstrated using software from Mathworks, Xilinx, Mentor Graphic

  8. VLSI metallization

    CERN Document Server

    Einspruch, Norman G; Gildenblat, Gennady Sh

    1987-01-01

    VLSI Electronics Microstructure Science, Volume 15: VLSI Metallization discusses the various issues and problems related to VLSI metallization. It details the available solutions and presents emerging trends.This volume is comprised of 10 chapters. The two introductory chapters, Chapter 1 and 2 serve as general references for the electrical and metallurgical properties of thin conducting films. Subsequent chapters review the various aspects of VLSI metallization. The order of presentation has been chosen to follow the common processing sequence. In Chapter 3, some relevant metal deposition tec

  9. The digi-neocognitron: a digital neocognitron neural network model for VLSI.

    Science.gov (United States)

    White, B A; Elmasry, M I

    1992-01-01

    One of the most complicated ANN models, the neocognitron (NC), is adapted to an efficient all-digital implementation for VLSI. The new model, the digi-neocognitron (DNC), has the same pattern recognition performance as the NC. The DNC model is derived from the NC model by a combination of preprocessing approximation and the definition of new model functions, e.g., multiplication and division are eliminated by conversion of factors to powers of 2, requiring only shift operations. The NC model is reviewed, the DNC model is presented, a methodology to convert NC models to DNC models is discussed, and the performances of the two models are compared on a character recognition example. The DNC model has substantial advantages over the NC model for VLSI implementation. The area-delay product is improved by two to three orders of magnitude, and I/O and memory requirements are reduced by representation of weights with 3 bits or less and neuron outputs with 4 bits or 7 bits.

  10. Synaptic dynamics in analog VLSI.

    Science.gov (United States)

    Bartolozzi, Chiara; Indiveri, Giacomo

    2007-10-01

    Synapses are crucial elements for computation and information transfer in both real and artificial neural systems. Recent experimental findings and theoretical models of pulse-based neural networks suggest that synaptic dynamics can play a crucial role for learning neural codes and encoding spatiotemporal spike patterns. Within the context of hardware implementations of pulse-based neural networks, several analog VLSI circuits modeling synaptic functionality have been proposed. We present an overview of previously proposed circuits and describe a novel analog VLSI synaptic circuit suitable for integration in large VLSI spike-based neural systems. The circuit proposed is based on a computational model that fits the real postsynaptic currents with exponentials. We present experimental data showing how the circuit exhibits realistic dynamics and show how it can be connected to additional modules for implementing a wide range of synaptic properties.

  11. An Application Specific Instruction Set Processor (ASIP) for Adaptive Filters in Neural Prosthetics.

    Science.gov (United States)

    Xin, Yao; Li, Will X Y; Zhang, Zhaorui; Cheung, Ray C C; Song, Dong; Berger, Theodore W

    2015-01-01

    Neural coding is an essential process for neuroprosthetic design, in which adaptive filters have been widely utilized. In a practical application, it is needed to switch between different filters, which could be based on continuous observations or point process, when the neuron models, conditions, or system requirements have changed. As candidates of coding chip for neural prostheses, low-power general purpose processors are not computationally efficient especially for large scale neural population coding. Application specific integrated circuits (ASICs) do not have flexibility to switch between different adaptive filters while the cost for design and fabrication is formidable. In this research work, we explore an application specific instruction set processor (ASIP) for adaptive filters in neural decoding activity. The proposed architecture focuses on efficient computation for the most time-consuming matrix/vector operations among commonly used adaptive filters, being able to provide both flexibility and throughput. Evaluation and implementation results are provided to demonstrate that the proposed ASIP design is area-efficient while being competitive to commercial CPUs in computational performance.

  12. Implantable neurotechnologies: bidirectional neural interfaces--applications and VLSI circuit implementations.

    Science.gov (United States)

    Greenwald, Elliot; Masters, Matthew R; Thakor, Nitish V

    2016-01-01

    A bidirectional neural interface is a device that transfers information into and out of the nervous system. This class of devices has potential to improve treatment and therapy in several patient populations. Progress in very large-scale integration has advanced the design of complex integrated circuits. System-on-chip devices are capable of recording neural electrical activity and altering natural activity with electrical stimulation. Often, these devices include wireless powering and telemetry functions. This review presents the state of the art of bidirectional circuits as applied to neuroprosthetic, neurorepair, and neurotherapeutic systems.

  13. A study of the Intel ETANN VLSI neural network for an electron isolation trigger

    Energy Technology Data Exchange (ETDEWEB)

    Lindsey, C.S.; Denby, B.

    1992-10-01

    A neural network circuit structure was previously proposed for making an isolated electron trigger for the CDF plug calorimeter. Here we discuss a study of the isolation performance of the Intel ETANN (Electrically Trainable Analog Neural Network) chip. We used the PC based ETANN development system and Monte Carlo generated shower patterns to test the chip. A C ++ application program was developed that runs within the development system but extends chip control and testing capabilities beyond the standard routines. With this program several configurations of the chip parameters were tested and the results are discussed here.

  14. 低成本可调FFT处理器的超大规模集成电路设计%Low Cost VLSI Design of a Flexible FFT Processor

    Institute of Scientific and Technical Information of China (English)

    戴亦奇

    2011-01-01

    In this paper, a radix-22/23 based pipeline structure is presented, in order to implement a low-cost VLSI fast Fourier transform (FFT) processor. As well as reducing the steps of normal complex multiplications, it minimizes the memory words to get the FFT results with the single-path delay feedback (SDF) memory access method. As for the data-path in the pipeline FFT processor, the hybrid floating point data-scaling scheme is adopted to achieve enough signal-to-quantization-noise ratio with minimum data width and RAM requirements.%文章提出了一种以基-22/23为基础的流水线结构,用以实现低成本、超大规模集成电路(VLSI)的快速傅里叶变换(FFT)处理器设计。该处理器在减少普通复数乘法器级数的同时,通过单路延时反馈(SDF)存取方式,以最少的存储字来获得FFT结果。对于数据通路,我们采用了混合浮点的数据缩放方式,在保证信噪比的同时,降低了数据长度和RAM容量的需求。

  15. Parallel optimization algorithms and their implementation in VLSI design

    Science.gov (United States)

    Lee, G.; Feeley, J. J.

    1991-01-01

    Two new parallel optimization algorithms based on the simplex method are described. They may be executed by a SIMD parallel processor architecture and be implemented in VLSI design. Several VLSI design implementations are introduced. An application example is reported to demonstrate that the algorithms are effective.

  16. VLSI architecture of NEO spike detection with noise shaping filter and feature extraction using informative samples.

    Science.gov (United States)

    Hoang, Linh; Yang, Zhi; Liu, Wentai

    2009-01-01

    An emerging class of multi-channel neural recording systems aims to simultaneously monitor the activity of many neurons by miniaturizing and increasing the number of recording channels. Vast volume of data from the recording systems, however, presents a challenge for processing and transmitting wirelessly. An on-chip neural signal processor is needed for filtering uninterested recording samples and performing spike sorting. This paper presents a VLSI architecture of a neural signal processor that can reliably detect spike via a nonlinear energy operator, enhance spike signal over noise ratio by a noise shaping filter, and select meaningful recording samples for clustering by using informative samples. The architecture is implemented in 90-nm CMOS process, occupies 0.2 mm(2), and consumes 0.5 mW of power.

  17. A novel low-voltage low-power analogue VLSI implementation of neural networks with on-chip back-propagation learning

    Science.gov (United States)

    Carrasco, Manuel; Garde, Andres; Murillo, Pilar; Serrano, Luis

    2005-06-01

    In this paper a novel design and implementation of a VLSI Analogue Neural Net based on Multi-Layer Perceptron (MLP) with on-chip Back Propagation (BP) learning algorithm suitable for the resolution of classification problems is described. In order to implement a general and programmable analogue architecture, the design has been carried out in a hierarchical way. In this way the net has been divided in synapsis-blocks and neuron-blocks providing an easy method for the analysis. These blocks basically consist on simple cells, which are mainly, the activation functions (NAF), derivatives (DNAF), multipliers and weight update circuits. The analogue design is based on current-mode translinear techniques using MOS transistors working in the weak inversion region in order to reduce both the voltage supply and the power consumption. Moreover, with the purpose of minimizing the noise, offset and distortion of even order, the topologies are fully-differential and balanced. The circuit, named ANNE (Analogue Neural NEt), has been prototyped and characterized as a proof of concept on CMOS AMI-0.5A technology occupying a total area of 2.7mm2. The chip includes two versions of neural nets with on-chip BP learning algorithm, which are respectively a 2-1 and a 2-2-1 implementations. The proposed nets have been experimentally tested using supply voltages from 2.5V to 1.8V, which is suitable for single cell lithium-ion battery supply applications. Experimental results of both implementations included in ANNE exhibit a good performance on solving classification problems. These results have been compared with other proposed Analogue VLSI implementations of Neural Nets published in the literature demonstrating that our proposal is very efficient in terms of occupied area and power consumption.

  18. VLSI implementation of a bio-inspired olfactory spiking neural network.

    Science.gov (United States)

    Hsieh, Hung-Yi; Tang, Kea-Tiong

    2012-07-01

    This paper presents a low-power, neuromorphic spiking neural network (SNN) chip that can be integrated in an electronic nose system to classify odor. The proposed SNN takes advantage of sub-threshold oscillation and onset-latency representation to reduce power consumption and chip area, providing a more distinct output for each odor input. The synaptic weights between the mitral and cortical cells are modified according to an spike-timing-dependent plasticity learning rule. During the experiment, the odor data are sampled by a commercial electronic nose (Cyranose 320) and are normalized before training and testing to ensure that the classification result is only caused by learning. Measurement results show that the circuit only consumed an average power of approximately 3.6 μW with a 1-V power supply to discriminate odor data. The SNN has either a high or low output response for a given input odor, making it easy to determine whether the circuit has made the correct decision. The measurement result of the SNN chip and some well-known algorithms (support vector machine and the K-nearest neighbor program) is compared to demonstrate the classification performance of the proposed SNN chip.The mean testing accuracy is 87.59% for the data used in this paper.

  19. Analogue VLSI for probabilistic networks and spike-time computation.

    Science.gov (United States)

    Murray, A

    2001-02-01

    The history and some of the methods of analogue neural VLSI are described. The strengths of analogue techniques are described, along with residual problems to be solved. The nature of hardware-friendly and hardware-appropriate algorithms is reviewed and suggestions are offered as to where analogue neural VLSI's future lies.

  20. Classification of correlated patterns with a configurable analog VLSI neural network of spiking neurons and self-regulating plastic synapses.

    Science.gov (United States)

    Giulioni, Massimilian; Pannunzi, Mario; Badoni, Davide; Dante, Vittorio; Del Giudice, Paolo

    2009-11-01

    We describe the implementation and illustrate the learning performance of an analog VLSI network of 32 integrate-and-fire neurons with spike-frequency adaptation and 2016 Hebbian bistable spike-driven stochastic synapses, endowed with a self-regulating plasticity mechanism, which avoids unnecessary synaptic changes. The synaptic matrix can be flexibly configured and provides both recurrent and external connectivity with address-event representation compliant devices. We demonstrate a marked improvement in the efficiency of the network in classifying correlated patterns, owing to the self-regulating mechanism.

  1. CMOS VLSI Layout and Verification of a SIMD Computer

    Science.gov (United States)

    Zheng, Jianqing

    1996-01-01

    A CMOS VLSI layout and verification of a 3 x 3 processor parallel computer has been completed. The layout was done using the MAGIC tool and the verification using HSPICE. Suggestions for expanding the computer into a million processor network are presented. Many problems that might be encountered when implementing a massively parallel computer are discussed.

  2. VLSI electronics microstructure science

    CERN Document Server

    1982-01-01

    VLSI Electronics: Microstructure Science, Volume 4 reviews trends for the future of very large scale integration (VLSI) electronics and the scientific base that supports its development.This book discusses the silicon-on-insulator for VLSI and VHSIC, X-ray lithography, and transient response of electron transport in GaAs using the Monte Carlo method. The technology and manufacturing of high-density magnetic-bubble memories, metallic superlattices, challenge of education for VLSI, and impact of VLSI on medical signal processing are also elaborated. This text likewise covers the impact of VLSI t

  3. Three-dimensional optoelectronic stacked processor by use of free-space optical interconnection and three-dimensional VLSI chip stacks.

    Science.gov (United States)

    Li, Guoqiang; Huang, Dawei; Yuceturk, Emel; Marchand, Philippe J; Esener, Sadik C; Ozguz, Volkan H; Liu, Yue

    2002-01-10

    We present a demonstration system under the three-dimensional (3D) optoelectronic stacked processor consortium. The processor combines the advantages of optics in global, high-density, high-speed parallel interconnections with the density and computational power of 3D chip stacks. In particular, a compact and scalable optoelectronic switching system with a high bandwidth is designed. The system consists of three silicon chip stacks, each integrated with a single vertical-cavity-surface-emitting-laser-metal-semiconductor-metal detector array and an optical interconnection module. Any input signal at one end stack can be switched through the central crossbar stack to any output channel on the opposite end stack. The crossbar bandwidth is designed to be 256 Gb/s. For the free-space optical interconnection, a novel folded hybrid micro-macro optical system with a concave reflection mirror has been designed. The optics module can provide a high resolution, a large field of view, a high link efficiency, and low optical cross talk. It is also symmetric and modular. Off-the-shelf macro-optical components are used. The concave reflection mirror can significantly improve the image quality and tolerate a large misalignment of the optical components, and it can also compensate for the lateral shift of the chip stacks. Scaling of the macrolens can be used to adjust the interconnection length between the chip stacks or make the system more compact. The components are easy to align, and only passive alignment is required. Optics and electronics are separated until the final assembly step, and the optomechanic module can be removed and replaced. By use of 3D chip stacks, commercially available optical components, and simple passive packaging techniques, it is possible to achieve a high-performance optoelectronic switching system.

  4. Distinct rhythmic locomotor patterns can be generated by a simple adaptive neural circuit: biology, simulation, and VLSI implementation.

    Science.gov (United States)

    Ryckebusch, S; Wehr, M; Laurent, G

    1994-12-01

    Rhythmic motor patterns can be induced in leg motor neurons of isolated locust thoracic ganglia by bath application of pilocarpine. We observed that the relative phases of levators and depressors differed in the three thoracic ganglia. Assuming that the central pattern generating circuits underlying these three segmental rhythms are probably very similar, we developed a simple model circuit that can produce any one of the three activity patterns and characteristic phase relationships by modifying a single synaptic weight. We show results of a computer simulation of this circuit using the neuronal simulator NeuraLOG/Spike. We built and tested an analog VLSI circuit implementation of this model circuit that exhibits the same range of "behaviors" as the computer simulation. This multidisciplinary strategy will be useful to explore the dynamics of central pattern generating networks coupled to physical actuators, and ultimately should allow the design of biologically realistic walking robots.

  5. Real-Time Neural Signals Decoding onto Off-the-Shelf DSP Processors for Neuroprosthetic Applications.

    Science.gov (United States)

    Pani, Danilo; Barabino, Gianluca; Citi, Luca; Meloni, Paolo; Raspopovic, Stanisa; Micera, Silvestro; Raffo, Luigi

    2016-09-01

    The control of upper limb neuroprostheses through the peripheral nervous system (PNS) can allow restoring motor functions in amputees. At present, the important aspect of the real-time implementation of neural decoding algorithms on embedded systems has been often overlooked, notwithstanding the impact that limited hardware resources have on the efficiency/effectiveness of any given algorithm. Present study is addressing the optimization of a template matching based algorithm for PNS signals decoding that is a milestone for its real-time, full implementation onto a floating-point digital signal processor (DSP). The proposed optimized real-time algorithm achieves up to 96% of correct classification on real PNS signals acquired through LIFE electrodes on animals, and can correctly sort spikes of a synthetic cortical dataset with sufficiently uncorrelated spike morphologies (93% average correct classification) comparably to the results obtained with top spike sorter (94% on average on the same dataset). The power consumption enables more than 24 h processing at the maximum load, and latency model has been derived to enable a fair performance assessment. The final embodiment demonstrates the real-time performance onto a low-power off-the-shelf DSP, opening to experiments exploiting the efferent signals to control a motor neuroprosthesis.

  6. VLSI electronics microstructure science

    CERN Document Server

    1981-01-01

    VLSI Electronics: Microstructure Science, Volume 3 evaluates trends for the future of very large scale integration (VLSI) electronics and the scientific base that supports its development.This book discusses the impact of VLSI on computer architectures; VLSI design and design aid requirements; and design, fabrication, and performance of CCD imagers. The approaches, potential, and progress of ultra-high-speed GaAs VLSI; computer modeling of MOSFETs; and numerical physics of micron-length and submicron-length semiconductor devices are also elaborated. This text likewise covers the optical linewi

  7. VLSI in medicine

    CERN Document Server

    Einspruch, Norman G

    1989-01-01

    VLSI Electronics Microstructure Science, Volume 17: VLSI in Medicine deals with the more important applications of VLSI in medical devices and instruments.This volume is comprised of 11 chapters. It begins with an article about medical electronics. The following three chapters cover diagnostic imaging, focusing on such medical devices as magnetic resonance imaging, neurometric analyzer, and ultrasound. Chapters 5, 6, and 7 present the impact of VLSI in cardiology. The electrocardiograph, implantable cardiac pacemaker, and the use of VLSI in Holter monitoring are detailed in these chapters. The

  8. VLSI placement

    Energy Technology Data Exchange (ETDEWEB)

    Hojat, S.

    1986-01-01

    The placement problem of assigning modules to module sites in a regular array must be addressed in VLSI and WSI. The placement problem of assigning heterogeneous modules to module sites in a regular array is NP-complete. The placement problem could be simplified if one could find a footprint with the property that all modules of the optimum placement occupy locations in the footprint, with no vacancies within the footprint region. If such footprints were known, they could be precomputed for each system size and the optimization problem would be reduced to a search of placements meeting the footprint constraint. The author shows that the placement problem could not be simplified by finding footprints. As result, several heuristic algorithms for the placement problem were developed and compared to each other and other established algorithms with respect to time complexity and performance measured, by the expected distance traversed by an intermodule message. Compared to previous algorithms, one new heuristic algorithm gave better performance in a shorter execution time on all test examples.

  9. VLSI 'smart' I/O module development

    Science.gov (United States)

    Kirk, Dan

    The developmental history, design, and operation of the MIL-STD-1553A/B discrete and serial module (DSM) for the U.S. Navy AN/AYK-14(V) avionics computer are described and illustrated with diagrams. The ongoing preplanned product improvement for the AN/AYK-14(V) includes five dual-redundant MIL-STD-1553 channels based on DSMs. The DSM is a front-end processor for transferring data to and from a common memory, sharing memory with a host processor to provide improved 'smart' input/output performance. Each DSM comprises three hardware sections: three VLSI-6000 semicustomized CMOS arrays, memory units to support the arrays, and buffers and resynchronization circuits. The DSM hardware module design, VLSI-6000 design tools, controlware and test software, and checkout procedures (using a hardware simulator) are characterized in detail.

  10. VLSI Research

    Science.gov (United States)

    1984-04-01

    massive amounts of data pertaining to seismic exploration or weather observation require much more processing power. These scientific calculations...1« IC *• Number of Processors it 3* (a) 5g - *• * C > «i o •• u w »- a • c a. MM , / \\ i i T2C sp«r*ttoni •*l«y > M unit...algorithms can be divided into two categories; namely, single-input single-output (SISO) and multi-input multi- output ( MIMO ) systems. A highly

  11. Opto-VLSI-based reconfigurable free-space optical interconnects architecture

    DEFF Research Database (Denmark)

    Aljada, Muhsen; Alameh, Kamal; Chung, Il-Sug;

    2007-01-01

    is the Opto-VLSI processor which can be driven by digital phase steering and multicasting holograms that reconfigure the optical interconnects between the input and output ports. The optical interconnects architecture is experimentally demonstrated at 2.5 Gbps using high-speed 1×3 VCSEL array and 1......This paper presents a short-distance reconfigurable high-speed optical interconnects architecture employing a Vertical Cavity Surface Emitting Laser (VCSEL) array, Opto-very-large-scale-integrated (Opto-VLSI) processors, and a photodetector (PD) array. The core component of the architecture......×3 photoreceiver array in conjunction with two 1×4096 pixel Opto-VLSI processors. The minimisation of the crosstalk between the output ports is achieved by appropriately aligning the VCSEL and PD elements with respect to the Opto-VLSI processors and driving the latter with optimal steering phase holograms....

  12. Reconfigurable optical power splitter/combiner based on Opto-VLSI processing.

    Science.gov (United States)

    Mustafa, Haithem; Xiao, Feng; Alameh, Kamal

    2011-10-24

    A novel 1×4 reconfigurable optical splitter/combiner structure based on Opto-VLSI processor and 4-f imaging system with high resolution is proposed and experimentally demonstrated. By uploading optimized multicasting phase holograms onto the software-driven Opto-VLSI processor, an input optical signal is dynamically split into different output fiber ports with user-defined splitting ratios. Also, multiple input optical signals are dynamically combined with arbitrary user-defined weights.

  13. Transformational VLSI Design

    DEFF Research Database (Denmark)

    Rasmussen, Ole Steen

    This thesis introduces a formal approach to deriving VLSI circuits by the use of correctness-preserving transformations. Both the specification and the implementation are descibed by the relation based language Ruby. In order to prove the transformation rules a proof tool called RubyZF has been...... in connection with VLSI design are defined in terms of Pure Ruby and their properties proved. The design process is illustrated by several non-trivial examples of standard VLSI problems....

  14. VLSI Universal Noiseless Coder

    Science.gov (United States)

    Rice, Robert F.; Lee, Jun-Ji; Fang, Wai-Chi

    1989-01-01

    Proposed universal noiseless coder (UNC) compresses stream of data signals for efficient transmission in channel of limited bandwidth. Noiseless in sense original data completely recoverable from output code. System built as very-large-scale integrated (VLSI) circuit, compressing data in real time at input rates as high as 24 Mb/s, and possibly faster, depending on specific design. Approach yields small, lightweight system operating reliably and consuming little power. Constructed as single, compact, low-power VLSI circuit chip. Design of VLSI circuit chip made specific to code algorithms. Entire UNC fabricated in single chip, worst-case power dissipation less than 1 W.

  15. Plasma processing for VLSI

    CERN Document Server

    Einspruch, Norman G

    1984-01-01

    VLSI Electronics: Microstructure Science, Volume 8: Plasma Processing for VLSI (Very Large Scale Integration) discusses the utilization of plasmas for general semiconductor processing. It also includes expositions on advanced deposition of materials for metallization, lithographic methods that use plasmas as exposure sources and for multiple resist patterning, and device structures made possible by anisotropic etching.This volume is divided into four sections. It begins with the history of plasma processing, a discussion of some of the early developments and trends for VLSI. The second section

  16. A Hardware-Efficient Scalable Spike Sorting Neural Signal Processor Module for Implantable High-Channel-Count Brain Machine Interfaces.

    Science.gov (United States)

    Yang, Yuning; Boling, Sam; Mason, Andrew J

    2017-08-01

    Next-generation brain machine interfaces demand a high-channel-count neural recording system to wirelessly monitor activities of thousands of neurons. A hardware efficient neural signal processor (NSP) is greatly desirable to ease the data bandwidth bottleneck for a fully implantable wireless neural recording system. This paper demonstrates a complete multichannel spike sorting NSP module that incorporates all of the necessary spike detector, feature extractor, and spike classifier blocks. To meet high-channel-count and implantability demands, each block was designed to be highly hardware efficient and scalable while sharing resources efficiently among multiple channels. To process multiple channels in parallel, scalability analysis was performed, and the utilization of each block was optimized according to its input data statistics and the power, area and/or speed of each block. Based on this analysis, a prototype 32-channel spike sorting NSP scalable module was designed and tested on an FPGA using synthesized datasets over a wide range of signal to noise ratios. The design was mapped to 130 nm CMOS to achieve 0.75 μW power and 0.023 mm(2) area consumptions per channel based on post synthesis simulation results, which permits scalability of digital processing to 690 channels on a 4×4 mm(2) electrode array.

  17. A Method of Neural Network Controller Implementation in VLSI Design%将神经网络控制器用于VLSI设计的方法研究

    Institute of Scientific and Technical Information of China (English)

    詹璨铭

    2015-01-01

    This article presents an approach to neural network implementation in VLSI ,which is called as neural network controller based on Petri net .The structure of the neurons in the network is uniform ;it is triggered by external events ,and output place and transition signals of petri net .Two types of neurons are introduced ,one is standing for serial process ,and the other is used in synchronization of processes .The dual types of neurons are chained together by stimulate inputs ,and compose the fabric .The controller is designed to conquer the side effects of state machine ,and improves the performance and reliability .Typically ,the controller is a precise description of the circuits .It is optimized to timing closure against constraints much easier than state machine .Reduplication of each node in neural network decrease single event upset (SEU) .Finally ,the controller is easy to rebuild .The new design flow has applied in practice ,and proved effectively .%探索在超大规模集成电路中应用神经网络控制器的方法.根据Petri网理论,将库所与变迁组合成神经节点,节点通过输入触发信号链接组成复杂控制网络.定义两种类型神经节点,一种是节点组成串行分枝,另外一种用于同步并发分枝.通过两种节点组合,形成三种基本网络结构,三种结构再次组合又可形成任意复杂控制器结构.根据控制器分枝内串行、分枝间并行的特点,设计编译软件,输入更抽象的分枝描述代码,自动生成对应神经网络控制器逻辑电路描述代码.VLSI设计中使用神经网络控制器,能够更接近了寄存器传输级电路,以及更精确地描述电路,还能提高设计性能与可靠性.复制神经节点减小单节点负载,可优化电路时序;复制节点还可构成冗余缓解空间单粒子翻转.神经网络控制器可以处理各种异常情况,提高功能容错性和可维护性.这种方法已经用

  18. VLSI Reliability in Europe

    NARCIS (Netherlands)

    Verweij, Jan F.

    1993-01-01

    Several issue's regarding VLSI reliability research in Europe are discussed. Organizations involved in stimulating the activities on reliability by exchanging information or supporting research programs are described. Within one such program, ESPRIT, a technical interest group on IC reliability was

  19. NASA Space Engineering Research Center for VLSI systems design

    Science.gov (United States)

    1991-01-01

    This annual review reports the center's activities and findings on very large scale integration (VLSI) systems design for 1990, including project status, financial support, publications, the NASA Space Engineering Research Center (SERC) Symposium on VLSI Design, research results, and outreach programs. Processor chips completed or under development are listed. Research results summarized include a design technique to harden complementary metal oxide semiconductors (CMOS) memory circuits against single event upset (SEU); improved circuit design procedures; and advances in computer aided design (CAD), communications, computer architectures, and reliability design. Also described is a high school teacher program that exposes teachers to the fundamentals of digital logic design.

  20. An application of mapping neural networks and a digital signal processor for cochlear neuroprostheses.

    Science.gov (United States)

    Zadák, J; Unbehauen, R

    1993-01-01

    Cochlear neuroprostheses strive to restore the sensation of hearing to patients with a profound sensorineural deafness. They exhibit a stimulation of the surviving auditory nerve neurons by electrical currents delivered through electrodes placed on or within the cochlea. The present article describes a new method for an efficient derivation of the required information from the incoming speech signal necessary for the implant stimulation. Also some realization aspects of the new approach are addressed. In the new strategy, a multilayer neural network is employed in the formant frequency estimation having some suitable speech signal descriptors as particular input signals. The proposed method allows us a fast formant frequency estimation necessary for the implant stimulation. With the developed strategy, the prosthesis can be adjusted to the environment which the patient is supposed to live in. Moreover, the neural network concept offers us an alternative for dealing with the areas of neural loss or "holes" in the frequency map of the patient's ear.

  1. A VLSI field-programmable mixed-signal array to perform neural signal processing and neural modeling in a prosthetic system.

    Science.gov (United States)

    Bamford, Simeon A; Hogri, Roni; Giovannucci, Andrea; Taub, Aryeh H; Herreros, Ivan; Verschure, Paul F M J; Mintz, Matti; Del Giudice, Paolo

    2012-07-01

    A very-large-scale integration field-programmable mixed-signal array specialized for neural signal processing and neural modeling has been designed. This has been fabricated as a core on a chip prototype intended for use in an implantable closed-loop prosthetic system aimed at rehabilitation of the learning of a discrete motor response. The chosen experimental context is cerebellar classical conditioning of the eye-blink response. The programmable system is based on the intimate mixing of switched capacitor analog techniques with low speed digital computation; power saving innovations within this framework are presented. The utility of the system is demonstrated by the implementation of a motor classical conditioning model applied to eye-blink conditioning in real time with associated neural signal processing. Paired conditioned and unconditioned stimuli were repeatedly presented to an anesthetized rat and recordings were taken simultaneously from two precerebellar nuclei. These paired stimuli were detected in real time from this multichannel data. This resulted in the acquisition of a trigger for a well-timed conditioned eye-blink response, and repetition of unpaired trials constructed from the same data led to the extinction of the conditioned response trigger, compatible with natural cerebellar learning in awake animals.

  2. VLSI Circuit Configuration Using Satisfiability Logic in Hopfield Network

    Directory of Open Access Journals (Sweden)

    Mohd Asyraf Mansor

    2016-09-01

    Full Text Available Very large scale integration (VLSI circuit comprises of integrated circuit (IC with transistors in a single chip, widely used in many sophisticated electronic devices. In our paper, we proposed VLSI circuit design by implementing satisfiability problem in Hopfield neural network as circuit verification technique. We restrict our logic construction to 2-Satisfiability (2-SAT and 3- Satisfiability (3-SAT clauses in order to suit with the transistor configuration in VLSI circuit. In addition, we developed VLSI circuit based on Hopfield neural network in order to detect any possible error earlier than the manual circuit design. Microsoft Visual C++ 2013 is used as a platform for training, testing and validating of our proposed design. Hence, the performance of our proposed technique evaluated based on global VLSI configuration, circuit accuracy and the runtime. It has been observed that the VLSI circuits (HNN-2SAT and HNN-3SAT circuit developed by proposed design are better than the conventional circuit due to the early error detection in our circuit.

  3. Lithography for VLSI

    CERN Document Server

    Einspruch, Norman G

    1987-01-01

    VLSI Electronics Microstructure Science, Volume 16: Lithography for VLSI treats special topics from each branch of lithography, and also contains general discussion of some lithographic methods.This volume contains 8 chapters that discuss the various aspects of lithography. Chapters 1 and 2 are devoted to optical lithography. Chapter 3 covers electron lithography in general, and Chapter 4 discusses electron resist exposure modeling. Chapter 5 presents the fundamentals of ion-beam lithography. Mask/wafer alignment for x-ray proximity printing and for optical lithography is tackled in Chapter 6.

  4. Opto-VLSI-based tunable single-mode fiber laser.

    Science.gov (United States)

    Xiao, Feng; Alameh, Kamal; Lee, Tongtak

    2009-10-12

    A new tunable fiber ring laser structure employing an Opto-VLSI processor and an erbium-doped fiber amplifier (EDFA) is reported. The Opto-VLSI processor is able to dynamically select and couple a waveband from the gain spectrum of the EDFA into a fiber ring, leading to a narrow-linewidth high-quality tunable laser output. Experimental results demonstrate a tunable fiber laser of linewidth 0.05 nm and centre wavelength tuned over the C-band with a 0.05 nm step. The measured side mode suppression ratio (SMSR) is greater than 35 dB and the laser output power uniformity is better than 0.25 dB. The laser output is very stable at room temperature.

  5. Analog and VLSI circuits

    CERN Document Server

    Chen, Wai-Kai

    2009-01-01

    Featuring hundreds of illustrations and references, this book provides the information on analog and VLSI circuits. It focuses on analog integrated circuits, presenting the knowledge on monolithic device models, analog circuit cells, high performance analog circuits, RF communication circuits, and PLL circuits.

  6. CMOS VLSI Hyperbolic Tangent Function & its Derivative Circuits for Neuron Implementation

    Directory of Open Access Journals (Sweden)

    Hussein CHIBLE,

    2013-10-01

    Full Text Available The hyperbolic tangent function and its derivative are key essential element in analog signal processing and especially in analog VLSI implementation of neuron of artificial neural networks. The main conditions of these types of circuits are the small silicon area, and the low power consumption. The objective of this paper is to study and design CMOS VLSI hyperbolic tangent function and its derivative circuit for neural network implementation. A circuit is designed and the results are presented

  7. Opto-VLSI-based N × M wavelength selective switch.

    Science.gov (United States)

    Xiao, Feng; Alameh, Kamal

    2013-07-29

    In this paper, we propose and experimentally demonstrate a novel N × M wavelength selective switch (WSS) architecture based on the use of an Opto-VLSI processor. Through a two-stage beamsteering process, wavelength channels from any input optical fiber port can be switched into any output optical fiber port. A proof-of-concept 2 × 3 WSS structure is developed, demonstrating flexible wavelength selective switching with an insertion loss around 15 dB.

  8. A coherent VLSI design environment

    Science.gov (United States)

    Penfield, Paul, Jr.

    1988-05-01

    The CAD effort is centered on timing analysis and circuit simulation. Advances have been made in tightening the bounds of timing analysis. The superiority of the Gauss-Jacobi technique for matrix solution, over the Gauss-Seidel method, has been proven when the algorithms are implemented on massively parallel machines. In the circuits area, one result of importance is a new technique for calculating the highest frequency of operation of transistors with parasitic elements present. Work on a synthesis technique is under way. In the architecture area, many new results have been derived for parallel algorithms and complexity. One of the most astonishing is that a hypercube with a large number of faulty nodes can be used, with high probability, as another perfectly functioning hypercube of half the size, by using reconfiguration algorithms that are simple, fast, and require only local information. Also, the design of the message-driven processor is continuing, with several advances in architecture, software, communications, and ALU design. Many of these are being implemented in VLSI circuits. The theory work has as a central theme that the cost of communication should be included in complexity analyses. This has led to advances in models for computation, including volume-universal networks, routing, network flow, fault avoidance, queue management, and network simulation.

  9. Low-Power and Optimized VLSI Implementation of Compact Recursive Discrete Fourier Transform (RDFT Processor for the Computations of DFT and Inverse Modified Cosine Transform (IMDCT in a Digital Radio Mondiale (DRM and DRM+ Receiver

    Directory of Open Access Journals (Sweden)

    Sheau-Fang Lei

    2013-05-01

    Full Text Available This paper presents a compact structure of recursive discrete Fourier transform (RDFT with prime factor (PF and common factor (CF algorithms to calculate variable-length DFT coefficients. Low-power optimizations in VLSI implementation are applied to the proposed RDFT design. In the algorithm, for 256-point DFT computation, the results show that the proposed method greatly reduces the number of multiplications/additions/computational cycles by 97.40/94.31/46.50% compared to a recent approach. In chip realization, the core size and chip size are, respectively, 0.84 × 0.84 and 1.38 × 1.38 mm2. The power consumption for the 288- and 256-point DFT computations are, respectively, 10.2 (or 0.1051 and 11.5 (or 0.1176 mW at 25 (or 0.273 MHz simulated by NanoSim. It would be more efficient and more suitable than previous works for DRM and DRM+ applications.

  10. Very Large Scale Integration (VLSI).

    Science.gov (United States)

    Yeaman, Andrew R. J.

    Very Large Scale Integration (VLSI), the state-of-the-art production techniques for computer chips, promises such powerful, inexpensive computing that, in the future, people will be able to communicate with computer devices in natural language or even speech. However, before full-scale VLSI implementation can occur, certain salient factors must be…

  11. A special purpose silicon compiler for designing supercomputing VLSI systems

    Science.gov (United States)

    Venkateswaran, N.; Murugavel, P.; Kamakoti, V.; Shankarraman, M. J.; Rangarajan, S.; Mallikarjun, M.; Karthikeyan, B.; Prabhakar, T. S.; Satish, V.; Venkatasubramaniam, P. R.

    1991-01-01

    Design of general/special purpose supercomputing VLSI systems for numeric algorithm execution involves tackling two important aspects, namely their computational and communication complexities. Development of software tools for designing such systems itself becomes complex. Hence a novel design methodology has to be developed. For designing such complex systems a special purpose silicon compiler is needed in which: the computational and communicational structures of different numeric algorithms should be taken into account to simplify the silicon compiler design, the approach is macrocell based, and the software tools at different levels (algorithm down to the VLSI circuit layout) should get integrated. In this paper a special purpose silicon (SPS) compiler based on PACUBE macrocell VLSI arrays for designing supercomputing VLSI systems is presented. It is shown that turn-around time and silicon real estate get reduced over the silicon compilers based on PLA's, SLA's, and gate arrays. The first two silicon compiler characteristics mentioned above enable the SPS compiler to perform systolic mapping (at the macrocell level) of algorithms whose computational structures are of GIPOP (generalized inner product outer product) form. Direct systolic mapping on PLA's, SLA's, and gate arrays is very difficult as they are micro-cell based. A novel GIPOP processor is under development using this special purpose silicon compiler.

  12. Hybrid VLSI/QCA Architecture for Computing FFTs

    Science.gov (United States)

    Fijany, Amir; Toomarian, Nikzad; Modarres, Katayoon; Spotnitz, Matthew

    2003-01-01

    A data-processor architecture that would incorporate elements of both conventional very-large-scale integrated (VLSI) circuitry and quantum-dot cellular automata (QCA) has been proposed to enable the highly parallel and systolic computation of fast Fourier transforms (FFTs). The proposed circuit would complement the QCA-based circuits described in several prior NASA Tech Briefs articles, namely Implementing Permutation Matrices by Use of Quantum Dots (NPO-20801), Vol. 25, No. 10 (October 2001), page 42; Compact Interconnection Networks Based on Quantum Dots (NPO-20855) Vol. 27, No. 1 (January 2003), page 32; and Bit-Serial Adder Based on Quantum Dots (NPO-20869), Vol. 27, No. 1 (January 2003), page 35. The cited prior articles described the limitations of very-large-scale integrated (VLSI) circuitry and the major potential advantage afforded by QCA. To recapitulate: In a VLSI circuit, signal paths that are required not to interact with each other must not cross in the same plane. In contrast, for reasons too complex to describe in the limited space available for this article, suitably designed and operated QCAbased signal paths that are required not to interact with each other can nevertheless be allowed to cross each other in the same plane without adverse effect. In principle, this characteristic could be exploited to design compact, coplanar, simple (relative to VLSI) QCA-based networks to implement complex, advanced interconnection schemes.

  13. UW VLSI chip tester

    Science.gov (United States)

    McKenzie, Neil

    1989-12-01

    We present a design for a low-cost, functional VLSI chip tester. It is based on the Apple MacIntosh II personal computer. It tests chips that have up to 128 pins. All pin drivers of the tester are bidirectional; each pin is programmed independently as an input or an output. The tester can test both static and dynamic chips. Rudimentary speed testing is provided. Chips are tested by executing C programs written by the user. A software library is provided for program development. Tests run under both the Mac Operating System and A/UX. The design is implemented using Xilinx Logic Cell Arrays. Price/performance tradeoffs are discussed.

  14. The VLSI handbook

    CERN Document Server

    Chen, Wai-Kai

    2007-01-01

    Written by a stellar international panel of expert contributors, this handbook remains the most up-to-date, reliable, and comprehensive source for real answers to practical problems. In addition to updated information in most chapters, this edition features several heavily revised and completely rewritten chapters, new chapters on such topics as CMOS fabrication and high-speed circuit design, heavily revised sections on testing of digital systems and design languages, and two entirely new sections on low-power electronics and VLSI signal processing. An updated compendium of references and othe

  15. Artificial immune system algorithm in VLSI circuit configuration

    Science.gov (United States)

    Mansor, Mohd. Asyraf; Sathasivam, Saratha; Kasihmuddin, Mohd Shareduwan Mohd

    2017-08-01

    In artificial intelligence, the artificial immune system is a robust bio-inspired heuristic method, extensively used in solving many constraint optimization problems, anomaly detection, and pattern recognition. This paper discusses the implementation and performance of artificial immune system (AIS) algorithm integrated with Hopfield neural networks for VLSI circuit configuration based on 3-Satisfiability problems. Specifically, we emphasized on the clonal selection technique in our binary artificial immune system algorithm. We restrict our logic construction to 3-Satisfiability (3-SAT) clauses in order to outfit with the transistor configuration in VLSI circuit. The core impetus of this research is to find an ideal hybrid model to assist in the VLSI circuit configuration. In this paper, we compared the artificial immune system (AIS) algorithm (HNN-3SATAIS) with the brute force algorithm incorporated with Hopfield neural network (HNN-3SATBF). Microsoft Visual C++ 2013 was used as a platform for training, simulating and validating the performances of the proposed network. The results depict that the HNN-3SATAIS outperformed HNN-3SATBF in terms of circuit accuracy and CPU time. Thus, HNN-3SATAIS can be used to detect an early error in the VLSI circuit design.

  16. Mixed voltage VLSI design

    Science.gov (United States)

    Panwar, Ramesh; Rennels, David; Alkalaj, Leon

    1993-01-01

    A technique for minimizing the power dissipated in a Very Large Scale Integration (VLSI) chip by lowering the operating voltage without any significant penalty in the chip throughput even though low voltage operation results in slower circuits. Since the overall throughput of a VLSI chip depends on the speed of the critical path(s) in the chip, it may be possible to sustain the throughput rates attained at higher voltages by operating the circuits in the critical path(s) with a high voltage while operating the other circuits with a lower voltage to minimize the power dissipation. The interface between the gates which operate at different voltages is crucial for low power dissipation since the interface may possibly have high static current dissipation thus negating the gains of the low voltage operation. The design of a voltage level translator which does the interface between the low voltage and high voltage circuits without any significant static dissipation is presented. Then, the results of the mixed voltage design using a greedy algorithm on three chips for various operating voltages are presented.

  17. VLSI signal processing technology

    CERN Document Server

    Swartzlander, Earl

    1994-01-01

    This book is the first in a set of forthcoming books focussed on state-of-the-art development in the VLSI Signal Processing area. It is a response to the tremendous research activities taking place in that field. These activities have been driven by two factors: the dramatic increase in demand for high speed signal processing, especially in consumer elec­ tronics, and the evolving microelectronic technologies. The available technology has always been one of the main factors in determining al­ gorithms, architectures, and design strategies to be followed. With every new technology, signal processing systems go through many changes in concepts, design methods, and implementation. The goal of this book is to introduce the reader to the main features of VLSI Signal Processing and the ongoing developments in this area. The focus of this book is on: • Current developments in Digital Signal Processing (DSP) pro­ cessors and architectures - several examples and case studies of existing DSP chips are discussed in...

  18. VLSI Architectures for Computing DFT's

    Science.gov (United States)

    Truong, T. K.; Chang, J. J.; Hsu, I. S.; Reed, I. S.; Pei, D. Y.

    1986-01-01

    Simplifications result from use of residue Fermat number systems. System of finite arithmetic over residue Fermat number systems enables calculation of discrete Fourier transform (DFT) of series of complex numbers with reduced number of multiplications. Computer architectures based on approach suitable for design of very-large-scale integrated (VLSI) circuits for computing DFT's. General approach not limited to DFT's; Applicable to decoding of error-correcting codes and other transform calculations. System readily implemented in VLSI.

  19. A VLSI design concept for parallel iterative algorithms

    Directory of Open Access Journals (Sweden)

    C. C. Sun

    2009-05-01

    Full Text Available Modern VLSI manufacturing technology has kept shrinking down to the nanoscale level with a very fast trend. Integration with the advanced nano-technology now makes it possible to realize advanced parallel iterative algorithms directly which was almost impossible 10 years ago. In this paper, we want to discuss the influences of evolving VLSI technologies for iterative algorithms and present design strategies from an algorithmic and architectural point of view. Implementing an iterative algorithm on a multiprocessor array, there is a trade-off between the performance/complexity of processors and the load/throughput of interconnects. This is due to the behavior of iterative algorithms. For example, we could simplify the parallel implementation of the iterative algorithm (i.e., processor elements of the multiprocessor array in any way as long as the convergence is guaranteed. However, the modification of the algorithm (processors usually increases the number of required iterations which also means that the switch activity of interconnects is increasing. As an example we show that a 25×25 full Jacobi EVD array could be realized into one single FPGA device with the simplified μ-rotation CORDIC architecture.

  20. VLSI implementations for image communications

    CERN Document Server

    Pirsch, P

    1993-01-01

    The past few years have seen a rapid growth in image processing and image communication technologies. New video services and multimedia applications are continuously being designed. Essential for all these applications are image and video compression techniques. The purpose of this book is to report on recent advances in VLSI architectures and their implementation for video signal processing applications with emphasis on video coding for bit rate reduction. Efficient VLSI implementation for video signal processing spans a broad range of disciplines involving algorithms, architectures, circuits

  1. A digital neuron-type processor and its VLSI design

    Science.gov (United States)

    Akel, H.; Habib, Mahmoud K.

    1989-05-01

    A set of neuron-type circuits elements based on logic gate circuits with multiinput multifan output capability is described. Three types of elements are introduced, one called the cell body with its dendritic inputs and synaptic junction, another representing the axon base, and the axon circuit. These three elements are cascaded to form a neuron-type processing element. The circuit performs input temporal and spatial summation as well as thresholding. The entire neuron circuit is simulated and a design is given using VSLI techniques.

  2. Design and Implementation of VLSI Prime Factor Algorithm Processor.

    Science.gov (United States)

    1987-12-01

    for A, 1i ’ 1 1 Fh Ai ,,r equjAtIInI. art, ( , 4 10 4,t’ 4 ( - /’ cr tht, (-arr% sur ma% akL, be represented as Figure 36 Carry Select Adder Blocking... Select Adder Blocking .......................................................... 81 Figure 37: ALU Adder Cell...ALU Logic Implementation............................................................ 81 viii J,.. in List of Figures (continued) Figure 36: Carry

  3. Micro mirrors based coupling of light to multi-core fiber realizing in-fiber photonic neural network processor

    Science.gov (United States)

    Cohen, Eyal; Malka, Dror; Shemer, Amir; Shahmoon, Asaf; London, Michael; Zalevsky, Zeev

    2017-02-01

    Hardware implementation of artificial neural networks facilitates real-time parallel processing of massive data sets. Optical neural networks offer low-volume 3D connectivity together with large bandwidth and minimal heat production in contrast to electronic implementation. Here, we present a DMD based approaches to realize energetically efficient light coupling into a multi-core fiber realizing a unique design for in-fiber optical neural networks. Neurons and synapses are realized as individual cores in a multi-core fiber. Optical signals are transferred transversely between cores by means of optical coupling. Pump driven amplification in Erbium-doped cores mimics synaptic interactions. In order to dynamically and efficiently couple light into the multi-core fiber a DMD based micro mirror device is used to perform proper beam shaping operation. The beam shaping reshapes the light into a large set of points in space matching the positions of the required cores in the entrance plane to the multi-core fiber.

  4. A cost-effective methodology for the design of massively-parallel VLSI functional units

    Science.gov (United States)

    Venkateswaran, N.; Sriram, G.; Desouza, J.

    1993-01-01

    In this paper we propose a generalized methodology for the design of cost-effective massively-parallel VLSI Functional Units. This methodology is based on a technique of generating and reducing a massive bit-array on the mask-programmable PAcube VLSI array. This methodology unifies (maintains identical data flow and control) the execution of complex arithmetic functions on PAcube arrays. It is highly regular, expandable and uniform with respect to problem-size and wordlength, thereby reducing the communication complexity. The memory-functional unit interface is regular and expandable. Using this technique functional units of dedicated processors can be mask-programmed on the naked PAcube arrays, reducing the turn-around time. The production cost of such dedicated processors can be drastically reduced since the naked PAcube arrays can be mass-produced. Analysis of the the performance of functional units designed by our method yields promising results.

  5. Recovery Act - CAREER: Sustainable Silicon -- Energy-Efficient VLSI Interconnect for Extreme-Scale Computing

    Energy Technology Data Exchange (ETDEWEB)

    Chiang, Patrick [Oregon State Univ., Corvallis, OR (United States)

    2014-01-31

    The research goal of this CAREER proposal is to develop energy-efficient, VLSI interconnect circuits and systems that will facilitate future massively-parallel, high-performance computing. Extreme-scale computing will exhibit massive parallelism on multiple vertical levels, from thou­ sands of computational units on a single processor to thousands of processors in a single data center. Unfortunately, the energy required to communicate between these units at every level (on­ chip, off-chip, off-rack) will be the critical limitation to energy efficiency. Therefore, the PI's career goal is to become a leading researcher in the design of energy-efficient VLSI interconnect for future computing systems.

  6. Neuromorphic neural interfaces: from neurophysiological inspiration to biohybrid coupling with nervous systems

    Science.gov (United States)

    Broccard, Frédéric D.; Joshi, Siddharth; Wang, Jun; Cauwenberghs, Gert

    2017-08-01

    Objective. Computation in nervous systems operates with different computational primitives, and on different hardware, than traditional digital computation and is thus subjected to different constraints from its digital counterpart regarding the use of physical resources such as time, space and energy. In an effort to better understand neural computation on a physical medium with similar spatiotemporal and energetic constraints, the field of neuromorphic engineering aims to design and implement electronic systems that emulate in very large-scale integration (VLSI) hardware the organization and functions of neural systems at multiple levels of biological organization, from individual neurons up to large circuits and networks. Mixed analog/digital neuromorphic VLSI systems are compact, consume little power and operate in real time independently of the size and complexity of the model. Approach. This article highlights the current efforts to interface neuromorphic systems with neural systems at multiple levels of biological organization, from the synaptic to the system level, and discusses the prospects for future biohybrid systems with neuromorphic circuits of greater complexity. Main results. Single silicon neurons have been interfaced successfully with invertebrate and vertebrate neural networks. This approach allowed the investigation of neural properties that are inaccessible with traditional techniques while providing a realistic biological context not achievable with traditional numerical modeling methods. At the network level, populations of neurons are envisioned to communicate bidirectionally with neuromorphic processors of hundreds or thousands of silicon neurons. Recent work on brain-machine interfaces suggests that this is feasible with current neuromorphic technology. Significance. Biohybrid interfaces between biological neurons and VLSI neuromorphic systems of varying complexity have started to emerge in the literature. Primarily intended as a

  7. A systematic method for configuring VLSI networks of spiking neurons.

    Science.gov (United States)

    Neftci, Emre; Chicca, Elisabetta; Indiveri, Giacomo; Douglas, Rodney

    2011-10-01

    An increasing number of research groups are developing custom hybrid analog/digital very large scale integration (VLSI) chips and systems that implement hundreds to thousands of spiking neurons with biophysically realistic dynamics, with the intention of emulating brainlike real-world behavior in hardware and robotic systems rather than simply simulating their performance on general-purpose digital computers. Although the electronic engineering aspects of these emulation systems is proceeding well, progress toward the actual emulation of brainlike tasks is restricted by the lack of suitable high-level configuration methods of the kind that have already been developed over many decades for simulations on general-purpose computers. The key difficulty is that the dynamics of the CMOS electronic analogs are determined by transistor biases that do not map simply to the parameter types and values used in typical abstract mathematical models of neurons and their networks. Here we provide a general method for resolving this difficulty. We describe a parameter mapping technique that permits an automatic configuration of VLSI neural networks so that their electronic emulation conforms to a higher-level neuronal simulation. We show that the neurons configured by our method exhibit spike timing statistics and temporal dynamics that are the same as those observed in the software simulated neurons and, in particular, that the key parameters of recurrent VLSI neural networks (e.g., implementing soft winner-take-all) can be precisely tuned. The proposed method permits a seamless integration between software simulations with hardware emulations and intertranslatability between the parameters of abstract neuronal models and their emulation counterparts. Most important, our method offers a route toward a high-level task configuration language for neuromorphic VLSI systems.

  8. Neurovision processor for designing intelligent sensors

    Science.gov (United States)

    Gupta, Madan M.; Knopf, George K.

    1992-03-01

    A programmable multi-task neuro-vision processor, called the Positive-Negative (PN) neural processor, is proposed as a plausible hardware mechanism for constructing robust multi-task vision sensors. The computational operations performed by the PN neural processor are loosely based on the neural activity fields exhibited by certain nervous tissue layers situated in the brain. The neuro-vision processor can be programmed to generate diverse dynamic behavior that may be used for spatio-temporal stabilization (STS), short-term visual memory (STVM), spatio-temporal filtering (STF) and pulse frequency modulation (PFM). A multi- functional vision sensor that performs a variety of information processing operations on time- varying two-dimensional sensory images can be constructed from a parallel and hierarchical structure of numerous individually programmed PN neural processors.

  9. VLSI mixed signal processing system

    Science.gov (United States)

    Alvarez, A.; Premkumar, A. B.

    1993-01-01

    An economical and efficient VLSI implementation of a mixed signal processing system (MSP) is presented in this paper. The MSP concept is investigated and the functional blocks of the proposed MSP are described. The requirements of each of the blocks are discussed in detail. A sample application using active acoustic cancellation technique is described to demonstrate the power of the MSP approach.

  10. Fundamentals of Microelectronics Processing (VLSI).

    Science.gov (United States)

    Takoudis, Christos G.

    1987-01-01

    Describes a 15-week course in the fundamentals of microelectronics processing in chemical engineering, which emphasizes the use of very large scale integration (VLSI). Provides a listing of the topics covered in the course outline, along with a sample of some of the final projects done by students. (TW)

  11. A configurable simulation environment for the efficient simulation of large-scale spiking neural networks on graphics processors.

    Science.gov (United States)

    Nageswaran, Jayram Moorkanikara; Dutt, Nikil; Krichmar, Jeffrey L; Nicolau, Alex; Veidenbaum, Alexander V

    2009-01-01

    Neural network simulators that take into account the spiking behavior of neurons are useful for studying brain mechanisms and for various neural engineering applications. Spiking Neural Network (SNN) simulators have been traditionally simulated on large-scale clusters, super-computers, or on dedicated hardware architectures. Alternatively, Compute Unified Device Architecture (CUDA) Graphics Processing Units (GPUs) can provide a low-cost, programmable, and high-performance computing platform for simulation of SNNs. In this paper we demonstrate an efficient, biologically realistic, large-scale SNN simulator that runs on a single GPU. The SNN model includes Izhikevich spiking neurons, detailed models of synaptic plasticity and variable axonal delay. We allow user-defined configuration of the GPU-SNN model by means of a high-level programming interface written in C++ but similar to the PyNN programming interface specification. PyNN is a common programming interface developed by the neuronal simulation community to allow a single script to run on various simulators. The GPU implementation (on NVIDIA GTX-280 with 1 GB of memory) is up to 26 times faster than a CPU version for the simulation of 100K neurons with 50 Million synaptic connections, firing at an average rate of 7 Hz. For simulation of 10 Million synaptic connections and 100K neurons, the GPU SNN model is only 1.5 times slower than real-time. Further, we present a collection of new techniques related to parallelism extraction, mapping of irregular communication, and network representation for effective simulation of SNNs on GPUs. The fidelity of the simulation results was validated on CPU simulations using firing rate, synaptic weight distribution, and inter-spike interval analysis. Our simulator is publicly available to the modeling community so that researchers will have easy access to large-scale SNN simulations.

  12. From neural-based object recognition toward microelectronic eyes

    Science.gov (United States)

    Sheu, Bing J.; Bang, Sa Hyun

    1994-01-01

    Engineering neural network systems are best known for their abilities to adapt to the changing characteristics of the surrounding environment by adjusting system parameter values during the learning process. Rapid advances in analog current-mode design techniques have made possible the implementation of major neural network functions in custom VLSI chips. An electrically programmable analog synapse cell with large dynamic range can be realized in a compact silicon area. New designs of the synapse cells, neurons, and analog processor are presented. A synapse cell based on Gilbert multiplier structure can perform the linear multiplication for back-propagation networks. A double differential-pair synapse cell can perform the Gaussian function for radial-basis network. The synapse cells can be biased in the strong inversion region for high-speed operation or biased in the subthreshold region for low-power operation. The voltage gain of the sigmoid-function neurons is externally adjustable which greatly facilitates the search of optimal solutions in certain networks. Various building blocks can be intelligently connected to form useful industrial applications. Efficient data communication is a key system-level design issue for large-scale networks. We also present analog neural processors based on perceptron architecture and Hopfield network for communication applications. Biologically inspired neural networks have played an important role towards the creation of powerful intelligent machines. Accuracy, limitations, and prospects of analog current-mode design of the biologically inspired vision processing chips and cellular neural network chips are key design issues.

  13. The Fifth NASA Symposium on VLSI Design

    Science.gov (United States)

    1993-01-01

    The fifth annual NASA Symposium on VLSI Design had 13 sessions including Radiation Effects, Architectures, Mixed Signal, Design Techniques, Fault Testing, Synthesis, Signal Processing, and other Featured Presentations. The symposium provides insights into developments in VLSI and digital systems which can be used to increase data systems performance. The presentations share insights into next generation advances that will serve as a basis for future VLSI design.

  14. A Design Methodology for Optoelectronic VLSI

    Science.gov (United States)

    2007-01-01

    it for the layout of large-scale VLSI circuits such as bit-parallel datapaths , crossbars, RAMs, megacells and cores. These VLSI circuits have custom...by the 64-bit ALU and the 64-bit register file circuits. Typically, these VLSI circuits use a datapath layout style that creates a highly regular row...and column structure. The datapath layout style is preferred for multiple-bit processing circuits because it achieves uniform timing for all bits in a

  15. Opto-VLSI-based photonic true-time delay architecture for broadband adaptive nulling in phased array antennas.

    Science.gov (United States)

    Juswardy, Budi; Xiao, Feng; Alameh, Kamal

    2009-03-16

    This paper proposes a novel Opto-VLSI-based tunable true-time delay generation unit for adaptively steering the nulls of microwave phased array antennas. Arbitrary single or multiple true-time delays can simultaneously be synthesized for each antenna element by slicing an RF-modulated broadband optical source and routing specific sliced wavebands through an Opto-VLSI processor to a high-dispersion fiber. Experimental results are presented, which demonstrate the principle of the true-time delay unit through the generation of 5 arbitrary true-time delays of up to 2.5 ns each.

  16. An Experimental Digital Image Processor

    Science.gov (United States)

    Cok, Ronald S.

    1986-12-01

    A prototype digital image processor for enhancing photographic images has been built in the Research Laboratories at Kodak. This image processor implements a particular version of each of the following algorithms: photographic grain and noise removal, edge sharpening, multidimensional image-segmentation, image-tone reproduction adjustment, and image-color saturation adjustment. All processing, except for segmentation and analysis, is performed by massively parallel and pipelined special-purpose hardware. This hardware runs at 10 MHz and can be adjusted to handle any size digital image. The segmentation circuits run at 30 MHz. The segmentation data are used by three single-board computers for calculating the tonescale adjustment curves. The system, as a whole, has the capability of completely processing 10 million three-color pixels per second. The grain removal and edge enhancement algorithms represent the largest part of the pipelined hardware, operating at over 8 billion integer operations per second. The edge enhancement is performed by unsharp masking, and the grain removal is done using a collapsed Walsh-hadamard transform filtering technique (U.S. Patent No. 4549212). These two algo-rithms can be realized using four basic processing elements, some of which have been imple-mented as VLSI semicustom integrated circuits. These circuits implement the algorithms with a high degree of efficiency, modularity, and testability. The digital processor is controlled by a Digital Equipment Corporation (DEC) PDP 11 minicomputer and can be interfaced to electronic printing and/or electronic scanning de-vices. The processor has been used to process over a thousand diagnostic images.

  17. VLSI Watermark Implementations and Applications

    OpenAIRE

    Shoshan, Yonatan; Fish, Alexander; Li, Xin; Jullien, Graham,; Yadid-Pecht, Orly

    2008-01-01

    This paper presents an up to date review of digital watermarking (WM) from a VLSI designer point of view. The reader is introduced to basic principles and terms in the field of image watermarking. It goes through a brief survey on WM theory, laying out common classification criterions and discussing important design considerations and trade-offs. Elementary WM properties such as robustness, computational complexity and their influence on image quality are discussed. Common att...

  18. Implementation of Plasmonics in VLSI

    Directory of Open Access Journals (Sweden)

    Shreya Bhattacharya

    2012-12-01

    Full Text Available This Paper presents the idea of Very Large Scale Integration (VLSI using Plasmonic Waveguides.Current VLSI techniques are facing challenges with respect to clock frequencies which tend to scale up, making it more difficult for the designers to distribute and maintain low clock skew between these high frequency clocks across the entire chip. Surface Plasmons are light waves that occur at a metal/dielectric interface, where a group of electrons is collectively moving back and forth. These waves are trapped near the surface as they interact with the plasma of electrons near the surface of the metal. The decay length of SPs into the metal is two orders of magnitude smaller than the wavelength of the light in air. This feature of SPs provides the possibility of localization and the guiding of light in sub wavelength metallic structures, and it can be used to construct miniaturized optoelectronic circuits with sub wavelength components. In this paper, various methods of doing the same have been discussed some of which include DLSPPW’s, Plasmon waveguides by self-assembly, Silicon-based plasmonic waveguides etc. Hence by using Plasmonic chips, the speed, size and efficiency of microprocessor chips can be revolutionized thus bringing a whole new dimension to VLSI design.

  19. Implementation of Plasmonics in VLSI

    Directory of Open Access Journals (Sweden)

    Shreya Bhattacharya

    2012-12-01

    Full Text Available This Paper presents the idea of Very Large Scale Integration (VLSI using Plasmonic Waveguides. Current VLSI techniques are facing challenges with respect to clock frequencies which tend to scale up, making it more difficult for the designers to distribute and maintain low clock skew between these high frequency clocks across the entire chip. Surface Plasmons are light waves that occur at a metal/dielectric interface, where a group of electrons is collectively moving back and forth. These waves are trapped near the surface as they interact with the plasma of electrons near the surface of the metal. The decay length of SPs into the metal is two orders of magnitude smaller than the wavelength of the light in air. This feature of SPs provides the possibility of localization and the guiding of light in sub wavelength metallic structures, and it can be used to construct miniaturized optoelectronic circuits with sub wavelength components. In this paper, various methods of doing the same have been discussed some of which include DLSPPW’s, Plasmon waveguides by self-assembly, Silicon-based plasmonic waveguides etc. Hence by using Plasmonic chips, the speed, size and efficiency of microprocessor chips can be revolutionized thus bringing a whole new dimension to VLSI design.

  20. Parallel optical interconnects utilizing VLSI/FLC spatial light modulators

    Science.gov (United States)

    Genco, Sheryl M.

    1991-12-01

    Interconnection architectures are a cornerstone of parallel computing systems. However, interconnections can be a bottleneck in conventional computer architectures because of queuing structures that are necessary to handle the traffic through a switch at very high data rates and bandwidths. These issues must find new solutions to advance the state of the art in computing beyond the fundamental limit of silicon logic technology. Today's optoelectronic (OE) technology in particular VLSI/FLC spatial light modulators (SLMs) can provide a unique and innovative solution to these issues. This paper reports on the motivations for the system, describes the major areas of architectural requirements, discusses interconnection topologies and processor element alternatives, and documents an optical arbitration (i.e., control) scheme using `smart' SLMs and optical logic gates. The network topology is given in section 2.1 `Architectural Requirements -- Networks,' but it should be noted that the emphasis is on the optical control scheme (section 2.4) and the system.

  1. Surface and interface effects in VLSI

    CERN Document Server

    Einspruch, Norman G

    1985-01-01

    VLSI Electronics Microstructure Science, Volume 10: Surface and Interface Effects in VLSI provides the advances made in the science of semiconductor surface and interface as they relate to electronics. This volume aims to provide a better understanding and control of surface and interface related properties. The book begins with an introductory chapter on the intimate link between interfaces and devices. The book is then divided into two parts. The first part covers the chemical and geometric structures of prototypical VLSI interfaces. Subjects detailed include, the technologically most import

  2. New VLSI complexity results for threshold gate comparison

    Energy Technology Data Exchange (ETDEWEB)

    Beiu, V.

    1996-12-31

    The paper overviews recent developments concerning optimal (from the point of view of size and depth) implementations of COMPARISON using threshold gates. We detail a class of solutions which also covers another particular solution, and spans from constant to logarithmic depths. These circuit complexity results are supplemented by fresh VLSI complexity results having applications to hardware implementations of neural networks and to VLSI-friendly learning algorithms. In order to estimate the area (A) and the delay (T), as well as the classical AT{sup 2}, we shall use the following {open_quote}cost functions{close_quote}: (i) the connectivity (i.e., sum of fan-ins) and the number-of-bits for representing the weights and thresholds are used as closer approximations of the area; while (ii) the fan-ins and the length of the wires are used for closer estimates of the delay. Such approximations allow us to compare the different solutions-which present very interesting fan-in dependent depth-size and area-delay tradeoffs - with respect to AT{sup 2}.

  3. Analog VLSI implementation of resonate-and-fire neuron.

    Science.gov (United States)

    Nakada, Kazuki; Asai, Tetsuya; Hayashi, Hatsuo

    2006-12-01

    We propose an analog integrated circuit that implements a resonate-and-fire neuron (RFN) model based on the Lotka-Volterra (LV) system. The RFN model is a spiking neuron model that has second-order membrane dynamics, and thus exhibits fast damped subthreshold oscillation, resulting in the coincidence detection, frequency preference, and post-inhibitory rebound. The RFN circuit has been derived from the LV system to mimic such dynamical behavior of the RFN model. Through circuit simulations, we demonstrate that the RFN circuit can act as a coincidence detector and a band-pass filter at circuit level even in the presence of additive white noise and background random activity. These results show that our circuit is expected to be useful for very large-scale integration (VLSI) implementation of functional spiking neural networks.

  4. Programmable retinal dynamics in a CMOS mixed-signal array processor chip

    Science.gov (United States)

    Carmona, Ricardo A.; Jimenez-Garrido, Francisco J.; Dominguez-Castro, Rafael; Espejo, Servando; Rodriguez-Vazquez, Angel

    2003-04-01

    The retina is responsible of the treatment of visual information at early stages. Visual stimuli generate patterns of activity that are transmitted through its layered structure up to the ganglion cells that interface it to the optical nerve. In this trip of micrometers, information is sustained by continuous signals that interact in excitatory and inhibitory ways. This low-level processing compresses the relevant information of the images to a manageable size. The behavior of the more external layers of the biological retina has been successfully modelled within the Cellular Neural Network framework. Interactions between cells are realized on a local basic. Each cell interacts with its nearest neighbors and every cell in the same layer follows the same interconnection pattern. Intra- and inter-layer interactions are continuous in magnitude and time. The evolution of the network can be described by a set of coupled nonlinear differential equations. A mixed-signal VLSI implementation of focal-plane low-level image processing based upon this biological model constitutes a feasible and cost effective alternative to conventional digital processing in real-time applications. A CMOS Programmable Array Processor prototype chip has been designed and fabricated in a standard technology. It has been successfully tested, validating the proposed design techniques. The integrated system consists of a network of 2 coupled layers, containing 32×32 elementary processors, running at different time constants. Involved image processing algorithms can be programmed on this chip by tuning the appropriate interconnection weights, internally coded as analog but programmed via a digital interface. Propagative, active wave phenomena and retina-lake effects can be observed in this chip. Low-level image processing tasks for early vision applications can be developed based on these high-order dynamics.

  5. A Novel Efficient VLSI Architecture for IEEE 754 Floating point multiplier using Modified CSA

    Directory of Open Access Journals (Sweden)

    Nishi Pandey

    2015-10-01

    Full Text Available Due to advancement of new technology in the field of VLSI and Embedded system, there is an increasing demand of high speed and low power consumption processor. Speed of processor greatly depends on its multiplier as well as adder performance. In spite of complexity involved in floating point arithmetic, its implementation is increasing day by day. Due to which high speed adder architecture become important. Several adder architecture designs have been developed to increase the efficiency of the adder. In this paper, we introduce an architecture that performs high speed IEEE 754 floating point multiplier using modified carry select adder (CSA. Modified CSA depend on booth encoder (BEC Technique. Booth encoder, Mathematics is an ancient Indian system of Mathematics. Here we are introduced two carry select based design. These designs are implementation Xilinx Vertex device family

  6. Associative Pattern Recognition In Analog VLSI Circuits

    Science.gov (United States)

    Tawel, Raoul

    1995-01-01

    Winner-take-all circuit selects best-match stored pattern. Prototype cascadable very-large-scale integrated (VLSI) circuit chips built and tested to demonstrate concept of electronic associative pattern recognition. Based on low-power, sub-threshold analog complementary oxide/semiconductor (CMOS) VLSI circuitry, each chip can store 128 sets (vectors) of 16 analog values (vector components), vectors representing known patterns as diverse as spectra, histograms, graphs, or brightnesses of pixels in images. Chips exploit parallel nature of vector quantization architecture to implement highly parallel processing in relatively simple computational cells. Through collective action, cells classify input pattern in fraction of microsecond while consuming power of few microwatts.

  7. Compact MOSFET models for VLSI design

    CERN Document Server

    Bhattacharyya, A B

    2009-01-01

    Practicing designers, students, and educators in the semiconductor field face an ever expanding portfolio of MOSFET models. In Compact MOSFET Models for VLSI Design , A.B. Bhattacharyya presents a unified perspective on the topic, allowing the practitioner to view and interpret device phenomena concurrently using different modeling strategies. Readers will learn to link device physics with model parameters, helping to close the gap between device understanding and its use for optimal circuit performance. Bhattacharyya also lays bare the core physical concepts that will drive the future of VLSI.

  8. Compensating Inhomogeneities of Neuromorphic VLSI Devices Via Short-Term Synaptic Plasticity.

    Science.gov (United States)

    Bill, Johannes; Schuch, Klaus; Brüderle, Daniel; Schemmel, Johannes; Maass, Wolfgang; Meier, Karlheinz

    2010-01-01

    Recent developments in neuromorphic hardware engineering make mixed-signal VLSI neural network models promising candidates for neuroscientific research tools and massively parallel computing devices, especially for tasks which exhaust the computing power of software simulations. Still, like all analog hardware systems, neuromorphic models suffer from a constricted configurability and production-related fluctuations of device characteristics. Since also future systems, involving ever-smaller structures, will inevitably exhibit such inhomogeneities on the unit level, self-regulation properties become a crucial requirement for their successful operation. By applying a cortically inspired self-adjusting network architecture, we show that the activity of generic spiking neural networks emulated on a neuromorphic hardware system can be kept within a biologically realistic firing regime and gain a remarkable robustness against transistor-level variations. As a first approach of this kind in engineering practice, the short-term synaptic depression and facilitation mechanisms implemented within an analog VLSI model of I&F neurons are functionally utilized for the purpose of network level stabilization. We present experimental data acquired both from the hardware model and from comparative software simulations which prove the applicability of the employed paradigm to neuromorphic VLSI devices.

  9. VLSI implementation of a template subtraction algorithm for real-time stimulus artifact rejection.

    Science.gov (United States)

    Limnuson, Kanokwan; Lu, Hui; Chiel, Hillel J; Mohseni, Pedram

    2010-01-01

    In this paper, we present very-large-scale integrated (VLSI) implementation of a template subtraction algorithm for stimulus artifact rejection (SAR) in real time with applicability to closed-loop neuroprostheses. The SAR algorithm is based upon an infinite impulse response (IIR) temporal filtering technique, which can be efficiently implemented in VLSI with reduced power consumption and silicon area. We demonstrate that initialization of the memory within the system architecture using the first recorded stimulus artifact significantly decreases system response time as compared to the case without memory initialization. Two sets of pre-recorded neural data from an Aplysia californica are used to simulate the functionality of the proposed VLSI architecture in AMS 0.35 microm complementary metal-oxide-semiconductor (CMOS) technology. Depending upon the reproducibility in the shape of stimulus artifacts in vivo, the system eliminates virtually all artifacts in real time and recovers the extracellular neural activity with microW-level power consumption from 1.5 V.

  10. An analog VLSI implementation of a visual interneuron: enhanced sensory processing through biophysical modeling.

    Science.gov (United States)

    Harrison, R R; Koch, C

    1999-10-01

    Flies are capable of rapid, coordinated flight through unstructured environments. This flight is guided by visual motion information that is extracted from photoreceptors in a robust manner. One feature of the fly's visual processing that adds to this robustness is the saturation of wide-field motion-sensitive neuron responses with increasing pattern size. This makes the cell's responses less dependent on the sparseness of the optical flow field while retaining motion information. By implementing a compartmental neuronal model in silicon, we add this "gain control" to an existing analog VLSI model of fly vision. This results in enhanced performance in a compact, low-power CMOS motion sensor. Our silicon system also demonstrates that modern, biophysically-detailed models of neural sensory processing systems can be instantiated in VLSI hardware.

  11. Technology Development and Circuit Design for a Parallel Laser Programmable Floating Point Application Specific Processor

    Science.gov (United States)

    1989-12-01

    The laser programmable floating point application specific processor (LPASP) is a new approach at rapid development of custom VLSI chips. The LPASP...double precision floating point adder and the laser programmable read-only memory (LPROM) that are macrocells within the LPASP. In addition, the...thesis analyzes the applicability of an LPASP parallel processing system. The double precision floating point adder is an adder/subtractor macrocell

  12. SSI/MSI/LSI/VLSI/ULSI.

    Science.gov (United States)

    Alexander, George

    1984-01-01

    Discusses small-scale integrated (SSI), medium-scale integrated (MSI), large-scale integrated (LSI), very large-scale integrated (VLSI), and ultra large-scale integrated (ULSI) chips. The development and properties of these chips, uses of gallium arsenide, Josephson devices (two superconducting strips sandwiching a thin insulator), and future…

  13. Research into Self-Timed VLSI Circuits.

    Science.gov (United States)

    1984-10-22

    without reorganization. and identical processors are spaced around the store. In the Gauss- Seidel method [8] the grid point Computer simulation of the...convergence rates intermediate between those of grid point values are used throughout each iteration. the Jacobi and Gauss- Seidel methods are obtained... Seidel method , number of processors) of 40-60% is achieved with as when it uses one processor per grid point, it reduces to many as N processors, where

  14. Fault tolerant, radiation hard, high performance digital signal processor

    Science.gov (United States)

    Holmann, Edgar; Linscott, Ivan R.; Maurer, Michael J.; Tyler, G. L.; Libby, Vibeke

    1990-01-01

    An architecture has been developed for a high-performance VLSI digital signal processor that is highly reliable, fault-tolerant, and radiation-hard. The signal processor, part of a spacecraft receiver designed to support uplink radio science experiments at the outer planets, organizes the connections between redundant arithmetic resources, register files, and memory through a shuffle exchange communication network. The configuration of the network and the state of the processor resources are all under microprogram control, which both maps the resources according to algorithmic needs and reconfigures the processing should a failure occur. In addition, the microprogram is reloadable through the uplink to accommodate changes in the science objectives throughout the course of the mission. The processor will be implemented with silicon compiler tools, and its design will be verified through silicon compilation simulation at all levels from the resources to full functionality. By blending reconfiguration with redundancy the processor implementation is fault-tolerant and reliable, and possesses the long expected lifetime needed for a spacecraft mission to the outer planets.

  15. Software implementation of floating-Point arithmetic on a reduced-Instruction-set processor

    Energy Technology Data Exchange (ETDEWEB)

    Gross, T.

    1985-11-01

    Current single chip implementations of reduced-instruction-set processors do not support hardware floating-point operations. Instead, floating-point operations have to be provided either by a coprocessor or by software. This paper discusses issues arising from a software implementation of floating-point arithmetic for the MIPS processor, an experimental VLSI architecture. Measurements indicate that an acceptable level of performance is achieved, but this approach is no substitute for a hardware accelerator if higher-precision results are required. This paper includes instruction profiles for the basic floating-point operations and evaluates the usefulness of some aspects of the instruction set.

  16. Associative Memory Design for the FastTrack Processor (FTK) at ATLAS

    CERN Document Server

    Annovi, A; The ATLAS collaboration; Volpi, G; Beccherle, R; Bossini, E; Crescioli, F; Dell'Orso, M; Giannetti, P; Amerio, S; Hoff, J; Liu, T; Sacco, I; Liberali, V; Stabile, A; Schoening, A; Soltveit, H; Tripiccione, R

    2011-01-01

    We propose a new generation of VLSI processor for pattern recognition based on Associative Memory architecture, optimized for on-line track finding in high-energy physics experiments. We describe the architecture, the technology studies and the prototype design of a new Associative Memory project: it maximizes the pattern density on ASICs, minimizes the power consumption and improves the functionality for the fast tracker processor proposed to upgrade the ATLAS trigger at LHC. Finally we will focus on possible future applications inside and outside high physics energy.

  17. Next generation Associative Memory devices for the FTK tracking processor of the ATLAS experiment

    CERN Document Server

    Andreani, A; The ATLAS collaboration; Beccherle, B; Beretta, M; Citterio, M; Crescioli, F; Colombo, A; Giannetti, P; Liberali, V; Shojaii, J; Stabile, A

    2013-01-01

    The AMchip is a VLSI device that implements the associative memory function, a special content addressable memory specifically designed for high energy physics applications and first used in the CDF experiment at Tevatron. The 4th generation of AMchip has been developed for the core pattern recognition stage of the Fast TracKer (FTK) processor: a hardware processor for online reconstruction of particle trajectories at the ATLAS experiment at LHC. We present the architecture, design considerations, power consumption and performance measurements of the 4th generation of AMchip. We present also the design innovations toward the 5th generation and the first prototype results.

  18. ASIC Design of Floating-Point FFT Processor

    Institute of Scientific and Technical Information of China (English)

    陈禾; 赵忠武

    2004-01-01

    An application specific integrated circuit (ASIC) design of a 1024 points floating-point fast Fourier transform(FFT) processor is presented. It can satisfy the requirement of high accuracy FFT result in related fields. Several novel design techniques for floating-point adder and multiplier are introduced in detail to enhance the speed of the system. At the same time, the power consumption is decreased. The hardware area is effectively reduced as an improved butterfly processor is developed. There is a substantial increase in the performance of the design since a pipelined architecture is adopted, and very large scale integrated (VLSI) is easy to realize due to the regularity. A result of validation using field programmable gate array (FPGA) is shown at the end. When the system clock is set to 50 MHz, 204.8 μs is needed to complete the operation of FFT computation.

  19. Embedded Processor Laboratory

    Data.gov (United States)

    Federal Laboratory Consortium — The Embedded Processor Laboratory provides the means to design, develop, fabricate, and test embedded computers for missile guidance electronics systems in support...

  20. Harnessing VLSI System Design with EDA Tools

    CERN Document Server

    Kamat, Rajanish K; Gaikwad, Pawan K; Guhilot, Hansraj

    2012-01-01

    This book explores various dimensions of EDA technologies for achieving different goals in VLSI system design. Although the scope of EDA is very broad and comprises diversified hardware and software tools to accomplish different phases of VLSI system design, such as design, layout, simulation, testability, prototyping and implementation, this book focuses only on demystifying the code, a.k.a. firmware development and its implementation with FPGAs. Since there are a variety of languages for system design, this book covers various issues related to VHDL, Verilog and System C synergized with EDA tools, using a variety of case studies such as testability, verification and power consumption. * Covers aspects of VHDL, Verilog and Handel C in one text; * Enables designers to judge the appropriateness of each EDA tool for relevant applications; * Omits discussion of design platforms and focuses on design case studies; * Uses design case studies from diversified application domains such as network on chip, hospital on...

  1. VLSI Microsystem for Rapid Bioinformatic Pattern Recognition

    Science.gov (United States)

    Fang, Wai-Chi; Lue, Jaw-Chyng

    2009-01-01

    A system comprising very-large-scale integrated (VLSI) circuits is being developed as a means of bioinformatics-oriented analysis and recognition of patterns of fluorescence generated in a microarray in an advanced, highly miniaturized, portable genetic-expression-assay instrument. Such an instrument implements an on-chip combination of polymerase chain reactions and electrochemical transduction for amplification and detection of deoxyribonucleic acid (DNA).

  2. Leak detection utilizing analog binaural (VLSI) techniques

    Science.gov (United States)

    Hartley, Frank T. (Inventor)

    1995-01-01

    A detection method and system utilizing silicon models of the traveling wave structure of the human cochlea to spatially and temporally locate a specific sound source in the presence of high noise pandemonium. The detection system combines two-dimensional stereausis representations, which are output by at least three VLSI binaural hearing chips, to generate a three-dimensional stereausis representation including both binaural and spectral information which is then used to locate the sound source.

  3. Modular VLSI Reed-Solomon Decoder

    Science.gov (United States)

    Hsu, In-Shek; Truong, Trieu-Kie

    1991-01-01

    Proposed Reed-Solomon decoder contains multiple very-large-scale integrated (VLSI) circuit chips of same type. Each chip contains sets of logic cells and subcells performing functions from all stages of decoding process. Full decoder assembled by concatenating chips, with selective utilization of cells in particular chips. Cost of development reduced by factor of 5. In addition, decoder programmable in field and switched between 8-bit and 10-bit symbol sizes.

  4. Modular VLSI Reed-Solomon Decoder

    Science.gov (United States)

    Hsu, In-Shek; Truong, Trieu-Kie

    1991-01-01

    Proposed Reed-Solomon decoder contains multiple very-large-scale integrated (VLSI) circuit chips of same type. Each chip contains sets of logic cells and subcells performing functions from all stages of decoding process. Full decoder assembled by concatenating chips, with selective utilization of cells in particular chips. Cost of development reduced by factor of 5. In addition, decoder programmable in field and switched between 8-bit and 10-bit symbol sizes.

  5. Generating Weighted Test Patterns for VLSI Chips

    Science.gov (United States)

    Siavoshi, Fardad

    1990-01-01

    Improved built-in self-testing circuitry for very-large-scale integrated (VLSI) digital circuits based on version of weighted-test-pattern-generation concept, in which ones and zeros in pseudorandom test patterns occur with probabilities weighted to enhance detection of certain kinds of faults. Requires fewer test patterns and less computation time and occupies less area on circuit chips. Easy to relate switching activity in outputs with fault-detection activity by use of probabilistic fault-detection techniques.

  6. Spike-based VLSI modeling of the ILD system in the echolocating bat.

    Science.gov (United States)

    Horiuchi, T; Hynna, K

    2001-01-01

    The azimuthal localization of objects by echolocating bats is based on the difference of echo intensity received at the two ears, known as the interaural level difference (ILD). Mimicking the neural circuitry in the bat associated with the computation of ILD, we have constructed a spike-based VLSI model that can produce responses similar to those seen in the lateral superior olive (LSO) and some parts of the inferior colliculus (IC). We further explore some of the interesting computational consequences of the dynamics of both synapses and cellular mechanisms.

  7. Analog CMOS Nonlinear Cells and Their Applications in VLSI Signal and Information Processing

    Science.gov (United States)

    Khachab, Nabil Ibrahim

    1990-01-01

    The development of reconfigurable analog CMOS building blocks and their applications in analog VLSI is discussed and introduced in much the same way a logic gate is used in digital VLSI. They simultaneously achieve four -quadrant multiplication and division. These applications include multiplication, signal squaring, division, signal inversion, amplitude modulation. New all MOS implementations of the Hopfield like neural networks are developed by using the new cells. In addition new and novel techniques for sensor linearization and for MOSFET-C programmable-Q and omega_{n} filters are introduced. The new designs are simple, versatile, programmable and make effective use of analog CAD tools. Moreover, they are easily extendable to other technologies such as GaAs and BiCMOS. The objective of these designs is to achieve reduction in Silicon area and power consumption and reduce the interconnections between cells. It is also sought to provide a robust design that is CAD-compatible and make effective use of the standard cell library approach. This will offer more versatility and flexibility for analog signal processing systems and neural networks. Some of these new cells and a 3-neuron neural system are fabricated in a 2mum CMOS process. Experimental results of these circuits verify the validity of this new design approach.

  8. Array processors in chemistry

    Energy Technology Data Exchange (ETDEWEB)

    Ostlund, N.S.

    1980-01-01

    The field of attached scientific processors (''array processors'') is surveyed, and an attempt is made to indicate their present and possible future use in computational chemistry. The current commercial products from Floating Point Systems, Inc., Datawest Corporation, and CSP, Inc. are discussed.

  9. Training probabilistic VLSI models on-chip to recognise biomedical signals under hardware nonidealities.

    Science.gov (United States)

    Jiang, P C; Chen, H

    2006-01-01

    VLSI implementation of probabilistic models is attractive for many biomedical applications. However, hardware non-idealities can prevent probabilistic VLSI models from modelling data optimally through on-chip learning. This paper investigates the maximum computational errors that a probabilistic VLSI model can tolerate when modelling real biomedical data. VLSI circuits capable of achieving the required precision are also proposed.

  10. Trends and challenges in VLSI technology scaling towards 100 nm

    NARCIS (Netherlands)

    Rusu, S.; Sachdev, M.; Svensson, C.; Nauta, Bram

    Summary form only given. Moore's Law drives VLSI technology to continuous increases in transistor densities and higher clock frequencies. This tutorial will review the trends in VLSI technology scaling in the last few years and discuss the challenges facing process and circuit engineers in the 100nm

  11. On VLSI Design of Rank-Order Filtering using DCRAM Architecture.

    Science.gov (United States)

    Lin, Meng-Chun; Dung, Lan-Rong

    2008-02-01

    This paper addresses on VLSI design of rank-order filtering (ROF) with a maskable memory for real-time speech and image processing applications. Based on a generic bit-sliced ROF algorithm, the proposed design uses a special-defined memory, called the dual-cell random-access memory (DCRAM), to realize major operations of ROF: threshold decomposition and polarization. Using the memory-oriented architecture, the proposed ROF processor can benefit from high flexibility, low cost and high speed. The DCRAM can perform the bit-sliced read, partial write, and pipelined processing. The bit-sliced read and partial write are driven by maskable registers. With recursive execution of the bit-slicing read and partial write, the DCRAM can effectively realize ROF in terms of cost and speed. The proposed design has been implemented using TSMC 0.18 μm 1P6M technology. As shown in the result of physical implementation, the core size is 356.1 × 427.7μm(2) and the VLSI implementation of ROF can operate at 256 MHz for 1.8V supply.

  12. Adaptive WTA with an analog VLSI neuromorphic learning chip.

    Science.gov (United States)

    Häfliger, Philipp

    2007-03-01

    In this paper, we demonstrate how a particular spike-based learning rule (where exact temporal relations between input and output spikes of a spiking model neuron determine the changes of the synaptic weights) can be tuned to express rate-based classical Hebbian learning behavior (where the average input and output spike rates are sufficient to describe the synaptic changes). This shift in behavior is controlled by the input statistic and by a single time constant. The learning rule has been implemented in a neuromorphic very large scale integration (VLSI) chip as part of a neurally inspired spike signal image processing system. The latter is the result of the European Union research project Convolution AER Vision Architecture for Real-Time (CAVIAR). Since it is implemented as a spike-based learning rule (which is most convenient in the overall spike-based system), even if it is tuned to show rate behavior, no explicit long-term average signals are computed on the chip. We show the rule's rate-based Hebbian learning ability in a classification task in both simulation and chip experiment, first with artificial stimuli and then with sensor input from the CAVIAR system.

  13. VLSI Circuits for High Speed Data Conversion

    Science.gov (United States)

    1994-05-16

    Meeting, pp. 289-292, Sept. 199 1. [4] Behzad Razavi , "High-Speed, Nigh-Resolution Analog-to-Digital Conversion in VLSI Technologies, Ph.D. Thesis... Behzad Razavi and Bruce A. Wooley, "Design Techniques for High-Speed, High- Resolution Comparators," IEEE J. Solid-State Circuits, vol. 27, pp. 1916-192...Dec. 1992. [8] Behzad Razavi and Bruce A. Wooley, "A 12-Bkt 5-MSamplesoc Two-Step CMOS A/D Converter," IEEE J. Solid-State Circuits, vol. 27, pp

  14. Self arbitrated VLSI asynchronous sequential circuits

    Science.gov (United States)

    Whitaker, S.; Maki, G.

    1990-01-01

    A new class of asynchronous sequential circuits is introduced in this paper. The new design procedures are oriented towards producing asynchronous sequential circuits that are implemented with CMOS VLSI and take advantage of pass transistor technology. The first design algorithm utilizes a standard Single Transition Time (STT) state assignment. The second method introduces a new class of self synchronizing asynchronous circuits which eliminates the need for critical race free state assignments. These circuits arbitrate the transition path action by forcing the circuit to sequence through proper unstable states. These methods result in near minimum hardware since only the transition paths associated with state variable changes need to be implemented with pass transistor networks.

  15. Single Spin Logic Implementation of VLSI Adders

    CERN Document Server

    Shukla, Soumitra

    2011-01-01

    Some important VLSI adder circuits are implemented using quantum dots (qd) and Spin Polarized Scanning Tunneling Microscopy (SPSTM) in Single Spin Logic (SSL) paradigm. A simple comparison between these adder circuits shows that the mirror adder implementation in SSL does not carry any advantage over CMOS adder in terms of complexity and number of qds, opposite to the trend observed in their charge-based counterparts. On the contrary, the transmission gate adder, Static and Dynamic Manchester carry gate adders in SSL reduce the complexity and number of qds, in harmony with the trend shown in transistor adders.

  16. An Analog VLSI Saccadic Eye Movement System

    OpenAIRE

    1994-01-01

    In an effort to understand saccadic eye movements and their relation to visual attention and other forms of eye movements, we - in collaboration with a number of other laboratories - are carrying out a large-scale effort to design and build a complete primate oculomotor system using analog CMOS VLSI technology. Using this technology, a low power, compact, multi-chip system has been built which works in real-time using real-world visual inputs. We describe in this paper the performance of a...

  17. Communication Protocols Augmentation in VLSI Design Applications

    Directory of Open Access Journals (Sweden)

    Kanhu Charan Padhy

    2015-05-01

    Full Text Available With the advancement in communication System, the use of various protocols got a sharp rise in the different applications. Especially in the VLSI design for FPGAs, ASICS, CPLDs, the application areas got expanded to FPGA based technologies. Today, it has moved from commercial application to the defence sectors like missiles & aerospace controls. In this paper the use of FPGAs and its interface with various application circuits in the communication field for data (textual & visual & control transfer is discussed. To be specific, the paper discusses the use of FPGA in various communication protocols like SPI, I2C, and TMDS in synchronous mode in Digital System Design using VHDL/Verilog.

  18. VLSI binary multiplier using residue number systems

    Energy Technology Data Exchange (ETDEWEB)

    Barsi, F.; Di Cola, A.

    1982-01-01

    The idea of performing multiplication of n-bit binary numbers using a hardware based on residue number systems is considered. This paper develops the design of a VLSI chip deriving area and time upper bounds of a n-bit multiplier. To perform multiplication using residue arithmetic, numbers are converted from binary to residue representation and, after residue multiplication, the result is reconverted to the original notation. It is shown that the proposed design requires an area a=o(n/sup 2/ log n) and an execution time t=o(log/sup 2/n). 7 references.

  19. Benchmarking a DSP processor

    OpenAIRE

    Lennartsson, Per; Nordlander, Lars

    2002-01-01

    This Master thesis describes the benchmarking of a DSP processor. Benchmarking means measuring the performance in some way. In this report, we have focused on the number of instruction cycles needed to execute certain algorithms. The algorithms we have used in the benchmark are all very common in signal processing today. The results we have reached in this thesis have been compared to benchmarks for other processors, performed by Berkeley Design Technology, Inc. The algorithms were programm...

  20. Real-Time Classification of Complex Patterns Using Spike-Based Learning in Neuromorphic VLSI.

    Science.gov (United States)

    Mitra, S; Fusi, S; Indiveri, G

    2009-02-01

    Real-time classification of patterns of spike trains is a difficult computational problem that both natural and artificial networks of spiking neurons are confronted with. The solution to this problem not only could contribute to understanding the fundamental mechanisms of computation used in the biological brain, but could also lead to efficient hardware implementations of a wide range of applications ranging from autonomous sensory-motor systems to brain-machine interfaces. Here we demonstrate real-time classification of complex patterns of mean firing rates, using a VLSI network of spiking neurons and dynamic synapses which implement a robust spike-driven plasticity mechanism. The learning rule implemented is a supervised one: a teacher signal provides the output neuron with an extra input spike-train during training, in parallel to the spike-trains that represent the input pattern. The teacher signal simply indicates if the neuron should respond to the input pattern with a high rate or with a low one. The learning mechanism modifies the synaptic weights only as long as the current generated by all the stimulated plastic synapses does not match the output desired by the teacher, as in the perceptron learning rule. We describe the implementation of this learning mechanism and present experimental data that demonstrate how the VLSI neural network can learn to classify patterns of neural activities, also in the case in which they are highly correlated.

  1. APRON: A Cellular Processor Array Simulation and Hardware Design Tool

    Directory of Open Access Journals (Sweden)

    David R. W. Barr

    2009-01-01

    Full Text Available We present a software environment for the efficient simulation of cellular processor arrays (CPAs. This software (APRON is used to explore algorithms that are designed for massively parallel fine-grained processor arrays, topographic multilayer neural networks, vision chips with SIMD processor arrays, and related architectures. The software uses a highly optimised core combined with a flexible compiler to provide the user with tools for the design of new processor array hardware architectures and the emulation of existing devices. We present performance benchmarks for the software processor array implemented on standard commodity microprocessors. APRON can be configured to use additional processing hardware if necessary and can be used as a complete graphical user interface and development environment for new or existing CPA systems, allowing more users to develop algorithms for CPA systems.

  2. Technology computer aided design simulation for VLSI MOSFET

    CERN Document Server

    Sarkar, Chandan Kumar

    2013-01-01

    Responding to recent developments and a growing VLSI circuit manufacturing market, Technology Computer Aided Design: Simulation for VLSI MOSFET examines advanced MOSFET processes and devices through TCAD numerical simulations. The book provides a balanced summary of TCAD and MOSFET basic concepts, equations, physics, and new technologies related to TCAD and MOSFET. A firm grasp of these concepts allows for the design of better models, thus streamlining the design process, saving time and money. This book places emphasis on the importance of modeling and simulations of VLSI MOS transistors and

  3. The VLSI-PLM Board: Design, Construction, and Testing

    Science.gov (United States)

    1989-03-01

    Computer Aided Design CB- Xenologic Corporation’s X-1 cache board DAS - Digital Analysis System EECS - Electrical Engineering and Computer...PLM Board is to debug the VLSI-PLM Chip [STN88] and to interface the chip to the Xenologic Corporation’s X-1 cache board. The chip is a high...a wire-wrapped board designed for debugging VLSI-PLM [STN88] and connecting VLSI- PLM to the cache board of Xenologic Corporation’s X-1 system. The

  4. VLSI circuits for high speed data conversion

    Science.gov (United States)

    Wooley, Bruce A.

    1994-05-01

    The focus of research has been the study of fundamental issues in the design and testing of data conversion interfaces for high performance VLSI signal processing and communications systems. Because of the increased speed and density that accompany the continuing scaling of VLSI technologies, digital means of processing, communicating, and storing information are rapidly displacing their analog counterparts across a broadening spectrum of applications. In such systems, the limitations on system performance generally occur at the interfaces between the digital representation of information and the analog environment in which the system is embedded. Specific results of this research include the design and implementation of low-power BiCMOS comparators and sample-and-hold amplifiers operating at clock rates as high as 200 MHz, the design and integration of a 12-bit, 5 MHz CMOS A/D converter employing a two-step architecture and a novel self-calibrating comparator, the design and integration of an optoelectronic communications receiver front-end in a GaAs-on-Si technology, the initiation of research into the use of an active silicon substrate probe card for fully testing high-performance mixed-signal circuits at the wafer level, and a preliminary study of means for correcting dynamic errors in high-performance A/D converters.

  5. Analog VLSI Models of Range-Tuned Neurons in the Bat Echolocation System

    Directory of Open Access Journals (Sweden)

    Horiuchi Timothy

    2003-01-01

    Full Text Available Bat echolocation is a fascinating topic of research for both neuroscientists and engineers, due to the complex and extremely time-constrained nature of the problem and its potential for application to engineered systems. In the bat's brainstem and midbrain exist neural circuits that are sensitive to the specific difference in time between the outgoing sonar vocalization and the returning echo. While some of the details of the neural mechanisms are known to be species-specific, a basic model of reafference-triggered, postinhibitory rebound timing is reasonably well supported by available data. We have designed low-power, analog VLSI circuits to mimic this mechanism and have demonstrated range-dependent outputs for use in a real-time sonar system. These circuits are being used to implement range-dependent vocalization amplitude, vocalization rate, and closest target isolation.

  6. Hardware multiplier processor

    Science.gov (United States)

    Pierce, Paul E.

    1986-01-01

    A hardware processor is disclosed which in the described embodiment is a memory mapped multiplier processor that can operate in parallel with a 16 bit microcomputer. The multiplier processor decodes the address bus to receive specific instructions so that in one access it can write and automatically perform single or double precision multiplication involving a number written to it with or without addition or subtraction with a previously stored number. It can also, on a single read command automatically round and scale a previously stored number. The multiplier processor includes two concatenated 16 bit multiplier registers, two 16 bit concatenated 16 bit multipliers, and four 16 bit product registers connected to an internal 16 bit data bus. A high level address decoder determines when the multiplier processor is being addressed and first and second low level address decoders generate control signals. In addition, certain low order address lines are used to carry uncoded control signals. First and second control circuits coupled to the decoders generate further control signals and generate a plurality of clocking pulse trains in response to the decoded and address control signals.

  7. Signal processor packaging design

    Science.gov (United States)

    McCarley, Paul L.; Phipps, Mickie A.

    1993-10-01

    The Signal Processor Packaging Design (SPPD) program was a technology development effort to demonstrate that a miniaturized, high throughput programmable processor could be fabricated to meet the stringent environment imposed by high speed kinetic energy guided interceptor and missile applications. This successful program culminated with the delivery of two very small processors, each about the size of a large pin grid array package. Rockwell International's Tactical Systems Division in Anaheim, California developed one of the processors, and the other was developed by Texas Instruments' (TI) Defense Systems and Electronics Group (DSEG) of Dallas, Texas. The SPPD program was sponsored by the Guided Interceptor Technology Branch of the Air Force Wright Laboratory's Armament Directorate (WL/MNSI) at Eglin AFB, Florida and funded by SDIO's Interceptor Technology Directorate (SDIO/TNC). These prototype processors were subjected to rigorous tests of their image processing capabilities, and both successfully demonstrated the ability to process 128 X 128 infrared images at a frame rate of over 100 Hz.

  8. The 1992 4th NASA SERC Symposium on VLSI Design

    Science.gov (United States)

    Whitaker, Sterling R.

    1992-01-01

    Papers from the fourth annual NASA Symposium on VLSI Design, co-sponsored by the IEEE, are presented. Each year this symposium is organized by the NASA Space Engineering Research Center (SERC) at the University of Idaho and is held in conjunction with a quarterly meeting of the NASA Data System Technology Working Group (DSTWG). One task of the DSTWG is to develop new electronic technologies that will meet next generation electronic data system needs. The symposium provides insights into developments in VLSI and digital systems which can be used to increase data systems performance. The NASA SERC is proud to offer, at its fourth symposium on VLSI design, presentations by an outstanding set of individuals from national laboratories, the electronics industry, and universities. These speakers share insights into next generation advances that will serve as a basis for future VLSI design.

  9. Interaction of algorithm and implementation for analog VLSI stereo vision

    Science.gov (United States)

    Hakkarainen, J. M.; Little, James J.; Lee, Hae-Seung; Wyatt, John L., Jr.

    1991-07-01

    Design of a high-speed stereo vision system in analog VLSI technology is reported. The goal is to determine how the advantages of analog VLSI--small area, high speed, and low power-- can be exploited, and how the effects of its principal disadvantages--limited accuracy, inflexibility, and lack of storage capacity--can be minimized. Three stereo algorithms are considered, and a simulation study is presented to examine details of the interaction between algorithm and analog VLSI implementation. The Marr-Poggio-Drumheller algorithm is shown to be best suited for analog VLSI implementation. A CCD/CMOS stereo system implementation is proposed, capable of operation at 6000 image frame pairs per second for 48 X 48 images, and faster than frame rate operation on 256 X 256 binocular image pairs.

  10. Associative Memory design for the Fast TracK processor (FTK) at Atlas

    CERN Document Server

    Annovi, A; The ATLAS collaboration; Bossini, E; Crescioli, F; Dell'Orso, M; Giannetti, P; Piendibene, M; Sacco, I; Sartori, L; Tripiccione, R

    2010-01-01

    We describe a VLSI processor for pattern recognition based on Content Addressable Memory (CAM) architecture, optimized for on-line track finding in high-energy physics experiments. A large CAM bank stores all trajectories of interest and extracts the ones compatible with a given event. This task is naturally parallelized by a CAM architecture able to output identified trajectories, recognized among a huge amount of possible combinations, in just a few 100 MHz clock cycles. We have developed this device (called the AMchip03 processor), using 180 nm technology, for the Silicon Vertex Trigger (SVT) upgrade at CDF using a standard-cell VLSI design methodology. We propose now a new design (90 nm technology) where we introduce a full custom standard cell. This is a customized design that allows to maximize the pattern density and to minimize the power consumption. We discuss also possible future extensions based on 3-D technology. This processor has a flexible and easily configurable structure that makes it suitabl...

  11. Memory Based Machine Intelligence Techniques in VLSI hardware

    OpenAIRE

    James, Alex Pappachen

    2012-01-01

    We briefly introduce the memory based approaches to emulate machine intelligence in VLSI hardware, describing the challenges and advantages. Implementation of artificial intelligence techniques in VLSI hardware is a practical and difficult problem. Deep architectures, hierarchical temporal memories and memory networks are some of the contemporary approaches in this area of research. The techniques attempt to emulate low level intelligence tasks and aim at providing scalable solutions to high ...

  12. NASA Space Engineering Research Center for VLSI System Design

    Science.gov (United States)

    1993-01-01

    This annual report outlines the activities of the past year at the NASA SERC on VLSI Design. Highlights for this year include the following: a significant breakthrough was achieved in utilizing commercial IC foundries for producing flight electronics; the first two flight qualified chips were designed, fabricated, and tested and are now being delivered into NASA flight systems; and a new technology transfer mechanism has been established to transfer VLSI advances into NASA and commercial systems.

  13. Design and Verification of High-Speed VLSI Physical Design

    Institute of Scientific and Technical Information of China (English)

    Dian Zhou; Rui-Ming Li

    2005-01-01

    With the rapid development of deep submicron (DSM) VLSI circuit designs, many issues such as time closure and power consumption are making the physical designs more and more challenging. In this review paper we provide readers with some recent progress of the VLSI physical designs. The recent developments of floorplanning and placement,interconnect effects, modeling and delay, buffer insertion and wire sizing, circuit order reduction, power grid analysis,parasitic extraction, and clock signal distribution are briefly reviewed.

  14. Memory Based Machine Intelligence Techniques in VLSI hardware

    CERN Document Server

    James, Alex Pappachen

    2012-01-01

    We briefly introduce the memory based approaches to emulate machine intelligence in VLSI hardware, describing the challenges and advantages. Implementation of artificial intelligence techniques in VLSI hardware is a practical and difficult problem. Deep architectures, hierarchical temporal memories and memory networks are some of the contemporary approaches in this area of research. The techniques attempt to emulate low level intelligence tasks and aim at providing scalable solutions to high level intelligence problems such as sparse coding and contextual processing.

  15. The Milstar Advanced Processor

    Science.gov (United States)

    Tjia, Khiem-Hian; Heely, Stephen D.; Morphet, John P.; Wirick, Kevin S.

    The Milstar Advanced Processor (MAP) is a 'drop-in' replacement for its predecessor which preserves existing interfaces with other Milstar satellite processors and minimizes the impact of such upgrading to already-developed application software. In addition to flight software development, and hardware development that involves the application of VHSIC technology to the electrical design, the MAP project is developing two sophisticated and similar test environments. High density RAM and ROM are employed by the MAP memory array. Attention is given to the fine-pitch VHSIC design techniques and lead designs used, as well as the tole of TQM and concurrent engineering in the development of the MAP manufacturing process.

  16. Cascaded VLSI neural network architecture for on-line learning

    Science.gov (United States)

    Duong, Tuan A. (Inventor); Daud, Taher (Inventor); Thakoor, Anilkumar P. (Inventor)

    1995-01-01

    High-speed, analog, fully-parallel and asynchronous building blocks are cascaded for larger sizes and enhanced resolution. A hardware-compatible algorithm permits hardware-in-the-loop learning despite limited weight resolution. A comparison-intensive feature classification application has been demonstrated with this flexible hardware and new algorithm at high speed. This result indicates that these building block chips can be embedded as application-specific-coprocessors for solving real-world problems at extremely high data rates.

  17. Associative Memory Design for the Fast TracKer Processor (FTK) at ATLAS

    CERN Document Server

    Beretta, M; The ATLAS collaboration

    2011-01-01

    We describe a VLSI processor for pattern recognition based on Content Addressable Memory (CAM) architecture, optimized for on-line track finding in high-energy physics experiments. We have developed this device using 65 nm technology combining a full custom CAM cell with standard-cell control logic. The customized design maximizes the pattern density, minimizes the power consumption and implements the functionalities needed for the planned Fast Tracker (FTK) [2], an ATLAS trigger upgrade project at LHC. We introduce a new variable resolution pattern matching technique using “don’t care” bits to set the pattern-matching window for each pattern and each layer can be independently.

  18. Associative Memory design for the FastTrack processor (FTK) at ATLAS

    CERN Document Server

    Annovi, A; The ATLAS collaboration; Bossini, E; Crescioli, F; Dell'Orso, M; Giannetti, P; Piendibene, M; Sacco, I; Sartori, L; Tripiccione, R

    2010-01-01

    We propose a new generation of VLSI processor for pattern recognition based on Associative Memory architecture, optimized for on-line track finding in high-energy physics experiments. We describe the architecture, the technology studies and the prototype design of a new R&D Associative Memory project: it maximizes the pattern density on ASICs, minimizes the power consumption and improves the functionality for the Fast Tracker (FTK) proposed to upgrade the ATLAS trigger at LHC. Finally we will focus on possible future applications inside and outside High Physics Energy (HEP).

  19. VLSI Design of a Turbo Decoder

    Science.gov (United States)

    Fang, Wai-Chi

    2007-01-01

    A very-large-scale-integrated-circuit (VLSI) turbo decoder has been designed to serve as a compact, high-throughput, low-power, lightweight decoder core of a receiver in a data-communication system. In a typical contemplated application, such a decoder core would be part of a single integrated circuit that would include the rest of the receiver circuitry and possibly some or all of the transmitter circuitry, all designed and fabricated together according to an advanced communication-system-on-a-chip design concept. Turbo codes are forward-error-correction (FEC) codes. Relative to older FEC codes, turbo codes enable communication at lower signal-to-noise ratios and offer greater coding gain. In addition, turbo codes can be implemented by relatively simple hardware. Therefore, turbo codes have been adopted as standard for some advanced broadband communication systems.

  20. Relaxation Based Electrical Simulation for VLSI Circuits

    Directory of Open Access Journals (Sweden)

    S. Rajkumar

    2012-06-01

    Full Text Available Electrical circuit simulation was one of the first CAD tools developed for IC design. The conventional circuit simulators like SPICE and ASTAP were designed initially for the cost effective analysis of circuits containing a few hundred transistors or less. A number of approaches have been used to improve the performances of congenital circuit simulators for the analysis of large circuits. Thereafter relaxation methods was proposed to provide more accurate waveforms than standard circuit simulators with up to two orders of magnitude speed improvement for large circuits. In this paper we have tried to highlights recently used waveform and point relaxation techniques for simulation of VLSI circuits. We also propose a simple parallelization technique and experimentally demonstrate that we can solve digital circuits with tens of million transistors in a few hours.

  1. Multi-net optimization of VLSI interconnect

    CERN Document Server

    Moiseev, Konstantin; Wimer, Shmuel

    2015-01-01

    This book covers layout design and layout migration methodologies for optimizing multi-net wire structures in advanced VLSI interconnects. Scaling-dependent models for interconnect power, interconnect delay and crosstalk noise are covered in depth, and several design optimization problems are addressed, such as minimization of interconnect power under delay constraints, or design for minimal delay in wire bundles within a given routing area. A handy reference or a guide for design methodologies and layout automation techniques, this book provides a foundation for physical design challenges of interconnect in advanced integrated circuits.  • Describes the evolution of interconnect scaling and provides new techniques for layout migration and optimization, focusing on multi-net optimization; • Presents research results that provide a level of design optimization which does not exist in commercially-available design automation software tools; • Includes mathematical properties and conditions for optimal...

  2. PLA realizations for VLSI state machines

    Science.gov (United States)

    Gopalakrishnan, S.; Whitaker, S.; Maki, G.; Liu, K.

    1990-01-01

    A major problem associated with state assignment procedures for VLSI controllers is obtaining an assignment that produces minimal or near minimal logic. The key item in Programmable Logic Array (PLA) area minimization is the number of unique product terms required by the design equations. This paper presents a state assignment algorithm for minimizing the number of product terms required to implement a finite state machine using a PLA. Partition algebra with predecessor state information is used to derive a near optimal state assignment. A maximum bound on the number of product terms required can be obtained by inspecting the predecessor state information. The state assignment algorithm presented is much simpler than existing procedures and leads to the same number of product terms or less. An area-efficient PLA structure implemented in a 1.0 micron CMOS process is presented along with a summary of the performance for a controller implemented using this design procedure.

  3. Interactive Digital Signal Processor

    Science.gov (United States)

    Mish, W. H.

    1985-01-01

    Interactive Digital Signal Processor, IDSP, consists of set of time series analysis "operators" based on various algorithms commonly used for digital signal analysis. Processing of digital signal time series to extract information usually achieved by applications of number of fairly standard operations. IDSP excellent teaching tool for demonstrating application for time series operators to artificially generated signals.

  4. Beyond processor sharing

    NARCIS (Netherlands)

    Aalto, S.; Ayesta, U.; Borst, S.C.; Misra, V.; Núñez Queija, R.

    2007-01-01

    While the (Egalitarian) Processor-Sharing (PS) discipline offers crucial insights in the performance of fair resource allocation mechanisms, it is inherently limited in analyzing and designing differentiated scheduling algorithms such as Weighted Fair Queueing and Weighted Round-Robin. The Discrimin

  5. Design and Evaluation of Fault-Tolerant VLSI/WSI Processor Arrays.

    Science.gov (United States)

    1987-12-31

    way that production quantitiis. However, the -arR-- adata item is not only used when it is input advantages of this technology are nut fully FIR, IIR...rla’lt fP i m iodhile is r , -aIict i tn o’ raftlier low% wh~en N i. very big . ’lf’s i 𔃾n -V4let2t Ito Ill’ statement thlat, regardless of the relia

  6. The Central Trigger Processor (CTP)

    CERN Multimedia

    Franchini, Matteo

    2016-01-01

    The Central Trigger Processor (CTP) receives trigger information from the calorimeter and muon trigger processors, as well as from other sources of trigger. It makes the Level-1 decision (L1A) based on a trigger menu.

  7. Analog VLSI Biophysical Neurons and Synapses With Programmable Membrane Channel Kinetics.

    Science.gov (United States)

    Yu, Theodore; Cauwenberghs, Gert

    2010-06-01

    We present and characterize an analog VLSI network of 4 spiking neurons and 12 conductance-based synapses, implementing a silicon model of biophysical membrane dynamics and detailed channel kinetics in 384 digitally programmable parameters. Each neuron in the analog VLSI chip (NeuroDyn) implements generalized Hodgkin-Huxley neural dynamics in 3 channel variables, each with 16 parameters defining channel conductance, reversal potential, and voltage-dependence profile of the channel kinetics. Likewise, 12 synaptic channel variables implement a rate-based first-order kinetic model of neurotransmitter and receptor dynamics, accounting for NMDA and non-NMDA type chemical synapses. The biophysical origin of all 384 parameters in 24 channel variables supports direct interpretation of the results of adapting/tuning the parameters in terms of neurobiology. We present experimental results from the chip characterizing single neuron dynamics, single synapse dynamics, and multi-neuron network dynamics showing phase-locking behavior as a function of synaptic coupling strength. Uniform temporal scaling of the dynamics of membrane and gating variables is demonstrated by tuning a single current parameter, yielding variable speed output exceeding real time. The 0.5 CMOS chip measures 3 mm 3 mm, and consumes 1.29 mW.

  8. A Domain Specific DSP Processor

    OpenAIRE

    Tell, Eric

    2001-01-01

    This thesis describes the design of a domain specific DSP processor. The thesis is divided into two parts. The first part gives some theoretical background, describes the different steps of the design process (both for DSP processors in general and for this project) and motivates the design decisions made for this processor. The second part is a nearly complete design specification. The intended use of the processor is as a platform for hardware acceleration units. Support for this has howe...

  9. Processor register error correction management

    Science.gov (United States)

    Bose, Pradip; Cher, Chen-Yong; Gupta, Meeta S.

    2016-12-27

    Processor register protection management is disclosed. In embodiments, a method of processor register protection management can include determining a sensitive logical register for executable code generated by a compiler, generating an error-correction table identifying the sensitive logical register, and storing the error-correction table in a memory accessible by a processor. The processor can be configured to generate a duplicate register of the sensitive logical register identified by the error-correction table.

  10. Dual-core Itanium Processor

    CERN Multimedia

    2006-01-01

    Intel’s first dual-core Itanium processor, code-named "Montecito" is a major release of Intel's Itanium 2 Processor Family, which implements the Intel Itanium architecture on a dual-core processor with two cores per die (integrated circuit). Itanium 2 is much more powerful than its predecessor. It has lower power consumption and thermal dissipation.

  11. Drift chamber tracking with neural networks

    Energy Technology Data Exchange (ETDEWEB)

    Lindsey, C.S.; Denby, B.; Haggerty, H.

    1992-10-01

    We discuss drift chamber tracking with a commercial log VLSI neural network chip. Voltages proportional to the drift times in a 4-layer drift chamber were presented to the Intel ETANN chip. The network was trained to provide the intercept and slope of straight tracks traversing the chamber. The outputs were recorded and later compared off line to conventional track fits. Two types of network architectures were studied. Applications of neural network tracking to high energy physics detector triggers is discussed.

  12. Digital systems for artificial neural networks

    Energy Technology Data Exchange (ETDEWEB)

    Atlas, L.E. (Interactive Systems Design Lab., Univ. of Washington, WA (US)); Suzuki, Y. (NTT Human Interface Labs. (US))

    1989-11-01

    A tremendous flurry of research activity has developed around artificial neural systems. These systems have also been tested in many applications, often with positive results. Most of this work has taken place as digital simulations on general-purpose serial or parallel digital computers. Specialized neural network emulation systems have also been developed for more efficient learning and use. The authors discussed how dedicated digital VLSI integrated circuits offer the highest near-term future potential for this technology.

  13. New Generation Processor Architecture Research

    Institute of Scientific and Technical Information of China (English)

    Chen Hongsong(陈红松); Hu Mingzeng; Ji Zhenzhou

    2003-01-01

    With the rapid development of microelectronics and hardware,the use of ever faster micro-processors and new architecture must be continued to meet tomorrow′s computing needs. New processor microarchitectures are needed to push performance further and to use higher transistor counts effectively.At the same time,aiming at different usages,the processor has been optimized in different aspects,such as high performace,low power consumption,small chip area and high security. SOC (System on chip)and SCMP (Single Chip Multi Processor) constitute the main processor system architecture.

  14. Stereoscopic Optical Signal Processor

    Science.gov (United States)

    Graig, Glenn D.

    1988-01-01

    Optical signal processor produces two-dimensional cross correlation of images from steroscopic video camera in real time. Cross correlation used to identify object, determines distance, or measures movement. Left and right cameras modulate beams from light source for correlation in video detector. Switch in position 1 produces information about range of object viewed by cameras. Position 2 gives information about movement. Position 3 helps to identify object.

  15. Tiled Multicore Processors

    Science.gov (United States)

    Taylor, Michael B.; Lee, Walter; Miller, Jason E.; Wentzlaff, David; Bratt, Ian; Greenwald, Ben; Hoffmann, Henry; Johnson, Paul R.; Kim, Jason S.; Psota, James; Saraf, Arvind; Shnidman, Nathan; Strumpen, Volker; Frank, Matthew I.; Amarasinghe, Saman; Agarwal, Anant

    For the last few decades Moore’s Law has continually provided exponential growth in the number of transistors on a single chip. This chapter describes a class of architectures, called tiled multicore architectures, that are designed to exploit massive quantities of on-chip resources in an efficient, scalable manner. Tiled multicore architectures combine each processor core with a switch to create a modular element called a tile. Tiles are replicated on a chip as needed to create multicores with any number of tiles. The Raw processor, a pioneering example of a tiled multicore processor, is examined in detail to explain the philosophy, design, and strengths of such architectures. Raw addresses the challenge of building a general-purpose architecture that performs well on a larger class of stream and embedded computing applications than existing microprocessors, while still running existing ILP-based sequential programs with reasonable performance. Central to achieving this goal is Raw’s ability to exploit all forms of parallelism, including ILP, DLP, TLP, and Stream parallelism. Raw approaches this challenge by implementing plenty of on-chip resources - including logic, wires, and pins - in a tiled arrangement, and exposing them through a new ISA, so that the software can take advantage of these resources for parallel applications. Compared to a traditional superscalar processor, Raw performs within a factor of 2x for sequential applications with a very low degree of ILP, about 2x-9x better for higher levels of ILP, and 10x-100x better when highly parallel applications are coded in a stream language or optimized by hand.

  16. A Model of Stimulus-Specific Adaptation in Neuromorphic Analog VLSI.

    Science.gov (United States)

    Mill, R; Sheik, S; Indiveri, G; Denham, S L

    2011-10-01

    Stimulus-specific adaptation (SSA) is a phenomenon observed in neural systems which occurs when the spike count elicited in a single neuron decreases with repetitions of the same stimulus, and recovers when a different stimulus is presented. SSA therefore effectively highlights rare events in stimulus sequences, and suppresses responses to repetitive ones. In this paper we present a model of SSA based on synaptic depression and describe its implementation in neuromorphic analog very-large-scale integration (VLSI). The hardware system is evaluated using biologically realistic spike trains with parameters chosen to reflect those of the stimuli used in physiological experiments. We examine the effect of input parameters and stimulus history upon SSA and show that the trends apparent in the results obtained in silico compare favorably with those observed in biological neurons.

  17. A dynamic CMOS multiplier for analog VLSI based on exponential pulse-decay modulation

    Science.gov (United States)

    Massengill, Lloyd W.

    1991-03-01

    A clocked, charge-based, CMOS modulator circuit is presented. The circuit, which performs a semilinear multiplication function, has applications in arrayed analog VLSI architectures such as parallel filters and neural network systems. The design presented is simple in structure, uses no operational amplifiers for the actual multiplication function, and uses no power in the static mode. Two-quadrant weighting of an input signal is accomplished by control of the magnitude and decay time of an exponential current pulse, resulting in the delivery of charge packets to a shared capacitive summing bus. The cell is modular in structure and can be fabricated in a standard CMOS process. An analytical derivation of the operation of the circuit, SPICE simulations, and MOSIS fabrication results are presented. The simulation studies indicate that the circuit is inherently tolerant to temperature effects, absolute device sizing errors, and clock-feedthrough transients.

  18. An Opto-VLSI-based reconfigurable optical adddrop multiplexer employing an off-axis 4-f imaging system.

    Science.gov (United States)

    Shen, Mingya; Xiao, Feng; Ahderom, Selam; Alameh, Kamal

    2009-08-03

    A novel reconfigurable optical add-drop multiplexer (ROADM) structure is proposed and demonstrated experimentally. The ROADM structure employs two arrayed waveguide gratings (AWGs), an array of optical fiber pairs, an array of 4-f imaging microlenses that are offset in relation to the axis of symmetry of the fiber pairs, and a reconfigurable Opto-VLSI processor that switches various wavelength channels between the fiber pairs to achieve add or drop multiplexing. Experimental results are shown, which demonstrate the principle of add/drop multiplexing with crosstalk of less than -27dB and insertion loss of less than 8dB over the Cband for drop and through operation modes.

  19. On limited fan-in optimal neural networks

    Energy Technology Data Exchange (ETDEWEB)

    Beiu, V.; Makaruk, H.E. [Los Alamos National Lab., NM (United States); Draghici, S. [Wayne State Univ., Detroit, MI (United States). Vision and Neural Networks Lab.

    1998-03-01

    Because VLSI implementations do not cope well with highly interconnected nets the area of a chip growing as the cube of the fan-in--this paper analyses the influence of limited fan in on the size and VLSI optimality of such nets. Two different approaches will show that VLSI- and size-optimal discrete neural networks can be obtained for small (i.e. lower than linear) fan-in values. They have applications to hardware implementations of neural networks. The first approach is based on implementing a certain sub class of Boolean functions, IF{sub n,m} functions. The authors will show that this class of functions can be implemented in VLSI optimal (i.e., minimizing AT{sup 2}) neural networks of small constant fan ins. The second approach is based on implementing Boolean functions for which the classical Shannon`s decomposition can be used. Such a solution has already been used to prove bounds on neural networks with fan-ins limited to 2. They generalize the result presented there to arbitrary fan-in, and prove that the size is minimized by small fan in values, while relative minimum size solutions can be obtained for fan-ins strictly lower than linear. Finally, a size-optimal neural network having small constant fan-ins will be suggested for IF{sub n,m} functions.

  20. Associative Memory Design for the Fast TracKer Processor (FTK)at ATLAS

    CERN Document Server

    Annovi, A; The ATLAS collaboration; Beretta, M; Bossini, E; Crescioli, F; Dell'Orso, M; Giannetti, P; Hoff, J; Liu, T; Liberali, V; Sacco, I; Schoening, A; Soltveit, H K; Stabile, A; Tripiccione, R

    2011-01-01

    We describe a VLSI processor for pattern recognition based on Content Addressable Memory (CAM) architecture, optimized for on-line track finding in high-energy physics experiments. A large CAM bank stores all trajectories of interest and extracts the ones compatible with a given event. This task is naturally parallelized by a CAM architecture able to output identified trajectories, recognized among a huge amount of possible combinations, in just a few 100 MHz clock cycles. We have developed this device (called the AMchip03 processor), using 180 nm technology, for the Silicon Vertex Trigger (SVT) upgrade at CDF [1] using a standard-cell VLSI design methodology. We propose a new design that introduces a full custom CAM cell and takes advantage of 65 nm technology. The customized design maximizes the pattern density, minimizes the power consumption and implements the functionalities needed for the planned Fast Tracker (FTK) [2], an ATLAS trigger upgrade project at LHC. We introduce a new variable resolution patt...

  1. Distributed processor allocation for launching applications in a massively connected processors complex

    Science.gov (United States)

    Pedretti, Kevin

    2008-11-18

    A compute processor allocator architecture for allocating compute processors to run applications in a multiple processor computing apparatus is distributed among a subset of processors within the computing apparatus. Each processor of the subset includes a compute processor allocator. The compute processor allocators can share a common database of information pertinent to compute processor allocation. A communication path permits retrieval of information from the database independently of the compute processor allocators.

  2. VLSI micro- and nanophotonics science, technology, and applications

    CERN Document Server

    Lee, El-Hang; Razeghi, Manijeh; Jagadish, Chennupati

    2011-01-01

    Addressing the growing demand for larger capacity in information technology, VLSI Micro- and Nanophotonics: Science, Technology, and Applications explores issues of science and technology of micro/nano-scale photonics and integration for broad-scale and chip-scale Very Large Scale Integration photonics. This book is a game-changer in the sense that it is quite possibly the first to focus on ""VLSI Photonics"". Very little effort has been made to develop integration technologies for micro/nanoscale photonic devices and applications, so this reference is an important and necessary early-stage pe

  3. Handbook of VLSI chip design and expert systems

    CERN Document Server

    Schwarz, A F

    1993-01-01

    Handbook of VLSI Chip Design and Expert Systems provides information pertinent to the fundamental aspects of expert systems, which provides a knowledge-based approach to problem solving. This book discusses the use of expert systems in every possible subtask of VLSI chip design as well as in the interrelations between the subtasks.Organized into nine chapters, this book begins with an overview of design automation, which can be identified as Computer-Aided Design of Circuits and Systems (CADCAS). This text then presents the progress in artificial intelligence, with emphasis on expert systems.

  4. AN ALGORITHM FOR ASSEMBLING A COMMON IMAGE OF VLSI LAYOUT

    Directory of Open Access Journals (Sweden)

    Y. Y. Lankevich

    2015-01-01

    Full Text Available We consider problem of assembling a common image of VLSI layout. Common image is composedof frames obtained by electron microscope photographing. Many frames require a lot of computation for positioning each frame inside the common image. Employing graphics processing units enables acceleration of computations. We realize algorithms and programs for assembling a common image of VLSI layout. Specificity of this work is to use abilities of CUDA to reduce computation time. Experimental results show efficiency of the proposed programs.

  5. A Systolic Array RLS Processor

    OpenAIRE

    Asai, T.; Matsumoto, T.

    2000-01-01

    This paper presents the outline of the systolic array recursive least-squares (RLS) processor prototyped primarily with the aim of broadband mobile communication applications. To execute the RLS algorithm effectively, this processor uses an orthogonal triangularization technique known in matrix algebra as QR decomposition for parallel pipelined processing. The processor board comprises 19 application-specific integrated circuit chips, each with approximately one million gates. Thirty-two bit ...

  6. AMD's 64-bit Opteron processor

    CERN Document Server

    CERN. Geneva

    2003-01-01

    This talk concentrates on issues that relate to obtaining peak performance from the Opteron processor. Compiler options, memory layout, MPI issues in multi-processor configurations and the use of a NUMA kernel will be covered. A discussion of recent benchmarking projects and results will also be included.BiographiesDavid RichDavid directs AMD's efforts in high performance computing and also in the use of Opteron processors...

  7. Emerging Trends in Embedded Processors

    Directory of Open Access Journals (Sweden)

    Gurvinder Singh

    2014-05-01

    Full Text Available An Embedded Processors is simply a µProcessors that has been “Embedded” into a device. Embedded systems are important part of human life. For illustration, one cannot visualize life without mobile phones for personal communication. Embedded systems are used in many places like healthcare, automotive, daily life, and in different offices and industries.Embedded Processors develop new research area in the field of hardware designing.

  8. Spaceborne Processor Array

    Science.gov (United States)

    Chow, Edward T.; Schatzel, Donald V.; Whitaker, William D.; Sterling, Thomas

    2008-01-01

    A Spaceborne Processor Array in Multifunctional Structure (SPAMS) can lower the total mass of the electronic and structural overhead of spacecraft, resulting in reduced launch costs, while increasing the science return through dynamic onboard computing. SPAMS integrates the multifunctional structure (MFS) and the Gilgamesh Memory, Intelligence, and Network Device (MIND) multi-core in-memory computer architecture into a single-system super-architecture. This transforms every inch of a spacecraft into a sharable, interconnected, smart computing element to increase computing performance while simultaneously reducing mass. The MIND in-memory architecture provides a foundation for high-performance, low-power, and fault-tolerant computing. The MIND chip has an internal structure that includes memory, processing, and communication functionality. The Gilgamesh is a scalable system comprising multiple MIND chips interconnected to operate as a single, tightly coupled, parallel computer. The array of MIND components shares a global, virtual name space for program variables and tasks that are allocated at run time to the distributed physical memory and processing resources. Individual processor- memory nodes can be activated or powered down at run time to provide active power management and to configure around faults. A SPAMS system is comprised of a distributed Gilgamesh array built into MFS, interfaces into instrument and communication subsystems, a mass storage interface, and a radiation-hardened flight computer.

  9. Emergent Auditory Feature Tuning in a Real-Time Neuromorphic VLSI System.

    Science.gov (United States)

    Sheik, Sadique; Coath, Martin; Indiveri, Giacomo; Denham, Susan L; Wennekers, Thomas; Chicca, Elisabetta

    2012-01-01

    Many sounds of ecological importance, such as communication calls, are characterized by time-varying spectra. However, most neuromorphic auditory models to date have focused on distinguishing mainly static patterns, under the assumption that dynamic patterns can be learned as sequences of static ones. In contrast, the emergence of dynamic feature sensitivity through exposure to formative stimuli has been recently modeled in a network of spiking neurons based on the thalamo-cortical architecture. The proposed network models the effect of lateral and recurrent connections between cortical layers, distance-dependent axonal transmission delays, and learning in the form of Spike Timing Dependent Plasticity (STDP), which effects stimulus-driven changes in the pattern of network connectivity. In this paper we demonstrate how these principles can be efficiently implemented in neuromorphic hardware. In doing so we address two principle problems in the design of neuromorphic systems: real-time event-based asynchronous communication in multi-chip systems, and the realization in hybrid analog/digital VLSI technology of neural computational principles that we propose underlie plasticity in neural processing of dynamic stimuli. The result is a hardware neural network that learns in real-time and shows preferential responses, after exposure, to stimuli exhibiting particular spectro-temporal patterns. The availability of hardware on which the model can be implemented, makes this a significant step toward the development of adaptive, neurobiologically plausible, spike-based, artificial sensory systems.

  10. Implementation of neuromorphic systems: from discrete components to analog VLSI chips (testing and communication issues).

    Science.gov (United States)

    Dante, V; Del Giudice, P; Mattia, M

    2001-01-01

    We review a series of implementations of electronic devices aiming at imitating to some extent structure and function of simple neural systems, with particular emphasis on communication issues. We first provide a short overview of general features of such "neuromorphic" devices and the implications of setting up "tests" for them. We then review the developments directly related to our work at the Istituto Superiore di Sanità (ISS): a pilot electronic neural network implementing a simple classifier, autonomously developing internal representations of incoming stimuli; an output network, collecting information from the previous classifier and extracting the relevant part to be forwarded to the observer; an analog, VLSI (very large scale integration) neural chip implementing a recurrent network of spiking neurons and plastic synapses, and the test setup for it; a board designed to interface the standard PCI (peripheral component interconnect) bus of a PC with a special purpose, asynchronous bus for communication among neuromorphic chips; a short and preliminary account of an application-oriented device, taking advantage of the above communication infrastructure.

  11. Emergent auditory feature tuning in a real-time neuromorphic VLSI system

    Directory of Open Access Journals (Sweden)

    Sadique eSheik

    2012-02-01

    Full Text Available Many sounds of ecological importance, such as communication calls, are characterised by time-varying spectra. However, most neuromorphic auditory models to date have focused on distinguishing mainly static patterns, under the assumption that dynamic patterns can be learned as sequences of static ones. In contrast, the emergence of dynamic feature sensitivity through exposure to formative stimuli has been recently modeled in a network of spiking neurons based on the thalamocortical architecture. The proposed network models the effect of lateral and recurrent connections between cortical layers, distance-dependent axonal transmission delays, and learning in the form of Spike Timing Dependent Plasticity (STDP, which effects stimulus-driven changes in the pattern of network connectivity. In this paper we demonstrate how these principles can be efficiently implemented in neuromorphic hardware. In doing so we address two principle problems in the design of neuromorphic systems: real-time event-based asynchronous communication in multi-chip systems, and the realization in hybrid analog/digital VLSI technology of neural computational principles that we propose underlie plasticity in neural processing of dynamic stimuli. The result is a hardware neural network that learns in real-time and shows preferential responses, after exposure, to stimuli exhibiting particular spectrotemporal patterns. The availability of hardware on which the model can be implemented, makes this a significant step towards the development of adaptive, neurobiologically plausible, spike-based, artificial sensory systems.

  12. An efficient interpolation filter VLSI architecture for HEVC standard

    Science.gov (United States)

    Zhou, Wei; Zhou, Xin; Lian, Xiaocong; Liu, Zhenyu; Liu, Xiaoxiang

    2015-12-01

    The next-generation video coding standard of High-Efficiency Video Coding (HEVC) is especially efficient for coding high-resolution video such as 8K-ultra-high-definition (UHD) video. Fractional motion estimation in HEVC presents a significant challenge in clock latency and area cost as it consumes more than 40 % of the total encoding time and thus results in high computational complexity. With aims at supporting 8K-UHD video applications, an efficient interpolation filter VLSI architecture for HEVC is proposed in this paper. Firstly, a new interpolation filter algorithm based on the 8-pixel interpolation unit is proposed in this paper. It can save 19.7 % processing time on average with acceptable coding quality degradation. Based on the proposed algorithm, an efficient interpolation filter VLSI architecture, composed of a reused data path of interpolation, an efficient memory organization, and a reconfigurable pipeline interpolation filter engine, is presented to reduce the implement hardware area and achieve high throughput. The final VLSI implementation only requires 37.2k gates in a standard 90-nm CMOS technology at an operating frequency of 240 MHz. The proposed architecture can be reused for either half-pixel interpolation or quarter-pixel interpolation, which can reduce the area cost for about 131,040 bits RAM. The processing latency of our proposed VLSI architecture can support the real-time processing of 4:2:0 format 7680 × 4320@78fps video sequences.

  13. Boolean approaches to graph embeddings related to VLSI

    Institute of Scientific and Technical Information of China (English)

    刘彦佩

    2001-01-01

    This paper discusses the development of Boolean methods in some topics on graph em-beddings which are related to VLSI. They are mainly the general theory of graph embeddability, the orientabilities of a graph and the rectilinear layout of an electronic circuit.

  14. Tungsten and other refractory metals for VLSI applications II

    Energy Technology Data Exchange (ETDEWEB)

    Broadbent, E.K.

    1987-01-01

    This book presents papers on tungsten and other refractory metals for VLSI applications. Topics include the following: Selectivity loss and nucleation on insulators, fundamental reaction and growth studies, chemical vapor deposition of tungsten, chemical vapor deposition of molybdenum, reactive ion etching of refractory metal films; and properties of refractory metals deposited by sputtering.

  15. An Interactive Multimedia Learning Environment for VLSI Built with COSMOS

    Science.gov (United States)

    Angelides, Marios C.; Agius, Harry W.

    2002-01-01

    This paper presents Bigger Bits, an interactive multimedia learning environment that teaches students about VLSI within the context of computer electronics. The system was built with COSMOS (Content Oriented semantic Modelling Overlay Scheme), which is a modelling scheme that we developed for enabling the semantic content of multimedia to be used…

  16. An Analogue VLSI Implementation of the Meddis Inner Hair Cell Model

    Directory of Open Access Journals (Sweden)

    McEwan Alistair

    2003-01-01

    Full Text Available The Meddis inner hair cell model is a widely accepted, but computationally intensive computer model of mammalian inner hair cell function. We have produced an analogue VLSI implementation of this model that operates in real time in the current domain by using translinear and log-domain circuits. The circuit has been fabricated on a chip and tested against the Meddis model for (a rate level functions for onset and steady-state response, (b recovery after masking, (c additivity, (d two-component adaptation, (e phase locking, (f recovery of spontaneous activity, and (g computational efficiency. The advantage of this circuit, over other electronic inner hair cell models, is its nearly exact implementation of the Meddis model which can be tuned to behave similarly to the biological inner hair cell. This has important implications on our ability to simulate the auditory system in real time. Furthermore, the technique of mapping a mathematical model of first-order differential equations to a circuit of log-domain filters allows us to implement real-time neuromorphic signal processors for a host of models using the same approach.

  17. Embedded Processor Oriented Compiler Infrastructure

    Directory of Open Access Journals (Sweden)

    DJUKIC, M.

    2014-08-01

    Full Text Available In the recent years, research of special compiler techniques and algorithms for embedded processors broaden the knowledge of how to achieve better compiler performance in irregular processor architectures. However, industrial strength compilers, besides ability to generate efficient code, must also be robust, understandable, maintainable, and extensible. This raises the need for compiler infrastructure that provides means for convenient implementation of embedded processor oriented compiler techniques. Cirrus Logic Coyote 32 DSP is an example that shows how traditional compiler infrastructure is not able to cope with the problem. That is why the new compiler infrastructure was developed for this processor, based on research. in the field of embedded system software tools and experience in development of industrial strength compilers. The new infrastructure is described in this paper. Compiler generated code quality is compared with code generated by the previous compiler for the same processor architecture.

  18. A neuromorphic VLSI design for spike timing and rate based synaptic plasticity.

    Science.gov (United States)

    Rahimi Azghadi, Mostafa; Al-Sarawi, Said; Abbott, Derek; Iannella, Nicolangelo

    2013-09-01

    Triplet-based Spike Timing Dependent Plasticity (TSTDP) is a powerful synaptic plasticity rule that acts beyond conventional pair-based STDP (PSTDP). Here, the TSTDP is capable of reproducing the outcomes from a variety of biological experiments, while the PSTDP rule fails to reproduce them. Additionally, it has been shown that the behaviour inherent to the spike rate-based Bienenstock-Cooper-Munro (BCM) synaptic plasticity rule can also emerge from the TSTDP rule. This paper proposes an analogue implementation of the TSTDP rule. The proposed VLSI circuit has been designed using the AMS 0.35 μm CMOS process and has been simulated using design kits for Synopsys and Cadence tools. Simulation results demonstrate how well the proposed circuit can alter synaptic weights according to the timing difference amongst a set of different patterns of spikes. Furthermore, the circuit is shown to give rise to a BCM-like learning rule, which is a rate-based rule. To mimic an implementation environment, a 1000 run Monte Carlo (MC) analysis was conducted on the proposed circuit. The presented MC simulation analysis and the simulation result from fine-tuned circuits show that it is possible to mitigate the effect of process variations in the proof of concept circuit; however, a practical variation aware design technique is required to promise a high circuit performance in a large scale neural network. We believe that the proposed design can play a significant role in future VLSI implementations of both spike timing and rate based neuromorphic learning systems.

  19. Gesture Recognition Using Neural Networks Based on HW/SW Cosimulation Platform

    Directory of Open Access Journals (Sweden)

    Priyanka Mekala

    2013-01-01

    Full Text Available Hardware/software (HW/SW cosimulation integrates software simulation and hardware simulation simultaneously. Usually, HW/SW co-simulation platform is used to ease debugging and verification for very large-scale integration (VLSI design. To accelerate the computation of the gesture recognition technique, an HW/SW implementation using field programmable gate array (FPGA technology is presented in this paper. The major contributions of this work are: (1 a novel design of memory controller in the Verilog Hardware Description Language (Verilog HDL to reduce memory consumption and load on the processor. (2 The testing part of the neural network algorithm is being hardwired to improve the speed and performance. The American Sign Language gesture recognition is chosen to verify the performance of the approach. Several experiments were carried out on four databases of the gestures (alphabet signs A to Z. (3 The major benefit of this design is that it takes only few milliseconds to recognize the hand gesture which makes it computationally more efficient.

  20. Neural Computing in High Energy Physics

    Institute of Scientific and Technical Information of China (English)

    O.D.Joukov; N.D.Rishe

    2001-01-01

    Artifical neural networks (ANN) are now widely used successfully as tools for hith energy physics.The paper covers two aspects.First,mapping ANNs onto the proposed ring and linear systolic array provides an efficient implementation of VLSI-based architectures since in this case all connections among processing elements are local and regular,Second.it is discussed algorthmic organizing of such structures on the base of modular algebra whose use can provide an essential increase of system throughput.

  1. Cognitive and Neural Sciences Division 1991 Programs

    Science.gov (United States)

    1991-08-01

    Neuroscience. Davis, J. and Eichenbaum , H., (Eds) MIT Press. Leen, T., Means, E. and Hammerstrom, D. (1990) Analysis and VLSI implementation of self-organizing...Davis and H. Eichenbaum (Eds.) MIT Press. 154 TITLE: Intelligent Controls Research Using the Dexterous Robotic Hand and Neural Networks Laboratory...Electronic ’.pproach PRINCIPAL INVESTIGATOR: Kevan A. Martin Medical Research Council Department of Pharmacology (0865) 275184 R&T PROJECT CODE: 4426411

  2. Advanced symbolic analysis for VLSI systems methods and applications

    CERN Document Server

    Shi, Guoyong; Tlelo Cuautle, Esteban

    2014-01-01

    This book provides comprehensive coverage of the recent advances in symbolic analysis techniques for design automation of nanometer VLSI systems. The presentation is organized in parts of fundamentals, basic implementation methods and applications for VLSI design. Topics emphasized include  statistical timing and crosstalk analysis, statistical and parallel analysis, performance bound analysis and behavioral modeling for analog integrated circuits . Among the recent advances, the Binary Decision Diagram (BDD) based approaches are studied in depth. The BDD-based hierarchical symbolic analysis approaches, have essentially broken the analog circuit size barrier. In particular, this book   • Provides an overview of classical symbolic analysis methods and a comprehensive presentation on the modern  BDD-based symbolic analysis techniques; • Describes detailed implementation strategies for BDD-based algorithms, including the principles of zero-suppression, variable ordering and canonical reduction; • Int...

  3. VLSI Design with Alliance Free CAD Tools: an Implementation Example

    Directory of Open Access Journals (Sweden)

    Chávez-Bracamontes Ramón

    2015-07-01

    Full Text Available This paper presents the methodology used for a digital integrated circuit design that implements the communication protocol known as Serial Peripheral Interface, using the Alliance CAD System. The aim of this paper is to show how the work of VLSI design can be done by graduate and undergraduate students with minimal resources and experience. The physical design was sent to be fabricated using the CMOS AMI C5 process that features 0.5 micrometer in transistor size, sponsored by the MOSIS Educational Program. Tests were made on a platform that transfers data from inertial sensor measurements to the designed SPI chip, which in turn sends the data back on a parallel bus to a common microcontroller. The results show the efficiency of the employed methodology in VLSI design, as well as the feasibility of ICs manufacturing from school projects that have insufficient or no source of funding

  4. VLSI physical design analyzer: A profiling and data mining tool

    Science.gov (United States)

    Somani, Shikha; Verma, Piyush; Madhavan, Sriram; Batarseh, Fadi; Pack, Robert C.; Capodieci, Luigi

    2015-03-01

    Traditional physical design verification tools employ a deck of known design rules, each of which has a pre-defined pass/fail criteria associated with it. While passing a design rule deck is a necessary condition for a VLSI design to be manufacturable, it is not sufficient. Other physical design profiling decks that attempt to obtain statistical information about the various critical dimensions in the VLSI design lack a systematic methodology for rule enumeration. These decks are often inadequate, unable to extract all the interlayer and intralayer dimensions in a design that have a correlation with process yield. The Physical Design Analyzer is a comprehensive design analysis tool built with the objective of exhaustively exploring design-process correlations to increase the wafer yield.

  5. A novel 3D algorithm for VLSI floorplanning

    Science.gov (United States)

    Rani, D. Gracia N.; Rajaram, S.; Sudarasan, Athira

    2013-01-01

    3-D VLSI circuit is becoming a hot issue because of its potential of enhancing performance, while it is also facing challenges such as the increased complexity on floorplanning and placement in VLSI Physical design. Efficient 3-D floorplan representations are needed to handle the placement optimization in new circuit designs. We analyze and categorize some state-of-the-art 3-D representations, and propose a Ternary tree model for 3-D nonslicing floorplans, extending the B*tree from 2D.This paper proposes a novel optimization algorithm for packing of 3D rectangular blocks. The new techniques considered are Differential evolutionary algorithm (DE) is very fast in that it evaluates the feasibility of a Ternary tree representation. Experimental results based on MCNC benchmark with constraints show that our proposed Differential Evolutionary (DE) can quickly produce optimal solutions.

  6. VLSI ARCHITECTURE OF AN AREA EFFICIENT IMAGE INTERPOLATION

    Directory of Open Access Journals (Sweden)

    John Moses C

    2014-05-01

    Full Text Available Image interpolation is widely used in many image processing applications, such as digital camera, mobile phone, tablet and display devices. Image interpolation is a method of estimating the new data points within the range of discrete set of known data points. Image interpolation can also be referred as image scaling, image resizing, image re-sampling and image zooming. This paper presents VLSI (Very Large Scale Integration architecture of an area efficient image interpolation algorithm for any two dimensional (2-D image scalar. This architecture is implemented in FPGA (Field Programmable Gate Array and the performance of this system is simulated using Xilinx system generator and synthesized using Xilinx ISE smulation tool. Various VLSI parameters such as combinational path delay, CPU time, memory usage, number of LUTs (Look Up Tables are measured from the synthesis report.

  7. VLSI design for fault-dictionary based testability

    Science.gov (United States)

    Miller, Charles D.

    The fault-dictionary approach to isolating failures in digital circuits provides inferior isolation accuracy compared to that which is now generally attained with other isolation methods. This limitation is particularly apparent when circuits which use bidirectional bus configurations are being tested. For this reason, fault-dictionary-based isolation has serious economic implications when testing digital circuits which use expensive VLSI or HSIC devices. However, by incorporating relatively minor circuit additions into the design of VLSI and HSIC devices, the normal set/scan or equivalent testability pins can additionally serve to improve actual fault-isolation accuracy. The described additions for improving fault-dictionary-based fault isolation require little semiconductor area, and one configuration even serves to prevent bus-drive conflicts.

  8. Trace-based post-silicon validation for VLSI circuits

    CERN Document Server

    Liu, Xiao

    2014-01-01

    This book first provides a comprehensive coverage of state-of-the-art validation solutions based on real-time signal tracing to guarantee the correctness of VLSI circuits.  The authors discuss several key challenges in post-silicon validation and provide automated solutions that are systematic and cost-effective.  A series of automatic tracing solutions and innovative design for debug (DfD) techniques are described, including techniques for trace signal selection for enhancing visibility of functional errors, a multiplexed signal tracing strategy for improving functional error detection, a tracing solution for debugging electrical errors, an interconnection fabric for increasing data bandwidth and supporting multi-core debug, an interconnection fabric design and optimization technique to increase transfer flexibility and a DfD design and associated tracing solution for improving debug efficiency and expanding tracing window. The solutions presented in this book improve the validation quality of VLSI circuit...

  9. Fast Forwarding with Network Processors

    OpenAIRE

    Lefèvre, Laurent; Lemoine, E.; Pham, C; Tourancheau, B.

    2003-01-01

    Forwarding is a mechanism found in many network operations. Although a regular workstation is able to perform forwarding operations it still suffers from poor performances when compared to dedicated hardware machines. In this paper we study the possibility of using Network Processors (NPs) to improve the capability of regular workstations to forward data. We present a simple model and an experimental study demonstrating that even though NPs are less powerful than Host Processors (HPs) they ca...

  10. Digital VLSI algorithms and architectures for support vector machines.

    Science.gov (United States)

    Anguita, D; Boni, A; Ridella, S

    2000-06-01

    In this paper, we propose some very simple algorithms and architectures for a digital VLSI implementation of Support Vector Machines. We discuss the main aspects concerning the realization of the learning phase of SVMs, with special attention on the effects of fixed-point math for computing and storing the parameters of the network. Some experiments on two classification problems are described that show the efficiency of the proposed methods in reaching optimal solutions with reasonable hardware requirements.

  11. VLSI circuits for bidirectional interface to peripheral and visceral nerves.

    Science.gov (United States)

    Greenwald, Elliot; Wang, Qihong; Thakor, Nitish V

    2015-08-01

    This paper presents an architecture for sensing nerve signals and delivering functional electrical stimulation to peripheral and visceral nerves. The design is based on the very large scale integration (VLSI) technology and amenable to interface to microelectrodes and building a fully implantable system. The proposed stimulator was tested on the vagus nerve and is under further evaluation and testing of various visceral nerves and their functional effects on the innervated organs.

  12. VLSI Design with Alliance Free CAD Tools: an Implementation Example

    OpenAIRE

    Chávez-Bracamontes Ramón; García-López Reyna Itzel; Gurrola-Navarro Marco Antonio; Bandala-Sánchez Manuel

    2015-01-01

    This paper presents the methodology used for a digital integrated circuit design that implements the communication protocol known as Serial Peripheral Interface, using the Alliance CAD System. The aim of this paper is to show how the work of VLSI design can be done by graduate and undergraduate students with minimal resources and experience. The physical design was sent to be fabricated using the CMOS AMI C5 process that features 0.5 micrometer in transistor size, sponsored ...

  13. Design of a VLSI Decoder for Partially Structured LDPC Codes

    Directory of Open Access Journals (Sweden)

    Fabrizio Vacca

    2008-01-01

    of their parity matrix can be partitioned into two disjoint sets, namely, the structured and the random ones. For the proposed class of codes a constructive design method is provided. To assess the value of this method the constructed codes performance are presented. From these results, a novel decoding method called split decoding is introduced. Finally, to prove the effectiveness of the proposed approach a whole VLSI decoder is designed and characterized.

  14. Diseño digital : una perspectiva VLSI-CMOS

    OpenAIRE

    Alcubilla González, Ramón; Pons Nin, Joan; Bardés Llorensí, Daniel

    1996-01-01

    Bibliografia El presente texto aporta el material necesario para un curso introductorio de Electrónica Digital. Incluye los conceptos fundamentales de diseño clásico de circuitos lógicos combinacionales y secuenciales. Adicionalmente se introducen aspectos de diseño de circuitos integrados con tecnología VLSI-CMOS. Se ha incidido particularmente en los elementos de autoaprendizaje mediante la inclusión de numerosos ejemplos y problemas.

  15. vPELS: An E-Learning Social Environment for VLSI Design with Content Security Using DRM

    Science.gov (United States)

    Dewan, Jahangir; Chowdhury, Morshed; Batten, Lynn

    2014-01-01

    This article provides a proposal for personal e-learning system (vPELS [where "v" stands for VLSI: very large scale integrated circuit])) architecture in the context of social network environment for VLSI Design. The main objective of vPELS is to develop individual skills on a specific subject--say, VLSI--and share resources with peers.…

  16. vPELS: An E-Learning Social Environment for VLSI Design with Content Security Using DRM

    Science.gov (United States)

    Dewan, Jahangir; Chowdhury, Morshed; Batten, Lynn

    2014-01-01

    This article provides a proposal for personal e-learning system (vPELS [where "v" stands for VLSI: very large scale integrated circuit])) architecture in the context of social network environment for VLSI Design. The main objective of vPELS is to develop individual skills on a specific subject--say, VLSI--and share resources with peers.…

  17. High speed vision processor with reconfigurable processing element array based on full-custom distributed memory

    Science.gov (United States)

    Chen, Zhe; Yang, Jie; Shi, Cong; Qin, Qi; Liu, Liyuan; Wu, Nanjian

    2016-04-01

    In this paper, a hybrid vision processor based on a compact full-custom distributed memory for near-sensor high-speed image processing is proposed. The proposed processor consists of a reconfigurable processing element (PE) array, a row processor (RP) array, and a dual-core microprocessor. The PE array includes two-dimensional processing elements with a compact full-custom distributed memory. It supports real-time reconfiguration between the PE array and the self-organized map (SOM) neural network. The vision processor is fabricated using a 0.18 µm CMOS technology. The circuit area of the distributed memory is reduced markedly into 1/3 of that of the conventional memory so that the circuit area of the vision processor is reduced by 44.2%. Experimental results demonstrate that the proposed design achieves correct functions.

  18. Robust Bioinformatics Recognition with VLSI Biochip Microsystem

    Science.gov (United States)

    Lue, Jaw-Chyng L.; Fang, Wai-Chi

    2006-01-01

    A microsystem architecture for real-time, on-site, robust bioinformatic patterns recognition and analysis has been proposed. This system is compatible with on-chip DNA analysis means such as polymerase chain reaction (PCR)amplification. A corresponding novel artificial neural network (ANN) learning algorithm using new sigmoid-logarithmic transfer function based on error backpropagation (EBP) algorithm is invented. Our results show the trained new ANN can recognize low fluorescence patterns better than the conventional sigmoidal ANN does. A differential logarithmic imaging chip is designed for calculating logarithm of relative intensities of fluorescence signals. The single-rail logarithmic circuit and a prototype ANN chip are designed, fabricated and characterized.

  19. VLSI (Very Large Scale Integration) Design Tools Reference Manual - Release 1.0.

    Science.gov (United States)

    1983-10-01

    34" SUBCXT Sabna N1 < N2 N3 ... > 1_V/NW VLSI Release 1 -18- * SPICE User’s Guide UW/NW VLSI Consortium Examples: .SUBCKT OPAMP 12 3 4 A circuit definition... OPAMP This card must be the last one for any subcircuit definition. The subcircuit name, if included, indicates which subcircuit definition is being

  20. Libera Electron Beam Position Processor

    CERN Document Server

    Ursic, Rok

    2005-01-01

    Libera is a product family delivering unprecedented possibilities for either building powerful single station solutions or architecting complex feedback systems in the field of accelerator instrumentation and controls. This paper presents functionality and field performance of its first member, the electron beam position processor. It offers superior performance with multiple measurement channels delivering simultaneously position measurements in digital format with MHz kHz and Hz bandwidths. This all-in-one product, facilitating pulsed and CW measurements, is much more than simply a high performance beam position measuring device delivering micrometer level reproducibility with sub-micrometer resolution. Rich connectivity options and innate processing power make it a powerful feedback building block. By interconnecting multiple Libera electron beam position processors one can build a low-latency high throughput orbit feedback system without adding additional hardware. Libera electron beam position processor ...

  1. Java Processor Optimized for RTSJ

    Directory of Open Access Journals (Sweden)

    Tu Shiliang

    2007-01-01

    Full Text Available Due to the preeminent work of the real-time specification for Java (RTSJ, Java is increasingly expected to become the leading programming language in real-time systems. To provide a Java platform suitable for real-time applications, a Java processor which can execute Java bytecode is directly proposed in this paper. It provides efficient support in hardware for some mechanisms specified in the RTSJ and offers a simpler programming model through ameliorating the scoped memory of the RTSJ. The worst case execution time (WCET of the bytecodes implemented in this processor is predictable by employing the optimization method proposed in our previous work, in which all the processing interfering predictability is handled before bytecode execution. Further advantage of this method is to make the implementation of the processor simpler and suited to a low-cost FPGA chip.

  2. Making CSB + -Trees Processor Conscious

    DEFF Research Database (Denmark)

    Samuel, Michael; Pedersen, Anders Uhl; Bonnet, Philippe

    2005-01-01

    Cache-conscious indexes, such as CSB+-tree, are sensitive to the underlying processor architecture. In this paper, we focus on how to adapt the CSB+-tree so that it performs well on a range of different processor architectures. Previous work has focused on the impact of node size on the performance...... of the CSB+-tree. We argue that it is necessary to consider a larger group of parameters in order to adapt CSB+-tree to processor architectures as different as Pentium and Itanium. We identify this group of parameters and study how it impacts the performance of CSB+-tree on Itanium 2. Finally, we propose...... a systematic method for adapting CSB+-tree to new platforms. This work is a first step towards integrating CSB+-tree in MySQL’s heap storage manager....

  3. Cluster Algorithm Special Purpose Processor

    Science.gov (United States)

    Talapov, A. L.; Shchur, L. N.; Andreichenko, V. B.; Dotsenko, Vl. S.

    We describe a Special Purpose Processor, realizing the Wolff algorithm in hardware, which is fast enough to study the critical behaviour of 2D Ising-like systems containing more than one million spins. The processor has been checked to produce correct results for a pure Ising model and for Ising model with random bonds. Its data also agree with the Nishimori exact results for spin glass. Only minor changes of the SPP design are necessary to increase the dimensionality and to take into account more complex systems such as Potts models.

  4. Cluster algorithm special purpose processor

    Energy Technology Data Exchange (ETDEWEB)

    Talapov, A.L.; Shchur, L.N.; Andreichenko, V.B.; Dotsenko, V.S. (Landau Inst. for Theoretical Physics, GSP-1 117940 Moscow V-334 (USSR))

    1992-08-10

    In this paper, the authors describe a Special Purpose Processor, realizing the Wolff algorithm in hardware, which is fast enough to study the critical behaviour of 2D Ising-like systems containing more than one million spins. The processor has been checked to produce correct results for a pure Ising model and for Ising model with random bonds. Its data also agree with the Nishimori exact results for spin glass. Only minor changes of the SPP design are necessary to increase the dimensionality and to take into account more complex systems such as Potts models.

  5. Reconfigurable Communication Processor:A New Approach for Network Processor

    Institute of Scientific and Technical Information of China (English)

    孙华; 陈青山; 张文渊

    2003-01-01

    As the traditional RISC +ASIC/ASSP approach for network processor design can not meet the today'srequirements, this paper described an alternate approach, Reconfigurable Processing Architecture, to boost theperformance to ASIC level while reserve the programmability of the traditional RISC based system. This papercovers both the hardware architecture and the software development environment architecture.

  6. An efficient ASIC implementation of 16-channel on-line recursive ICA processor for real-time EEG system.

    Science.gov (United States)

    Fang, Wai-Chi; Huang, Kuan-Ju; Chou, Chia-Ching; Chang, Jui-Chung; Cauwenberghs, Gert; Jung, Tzyy-Ping

    2014-01-01

    This is a proposal for an efficient very-large-scale integration (VLSI) design, 16-channel on-line recursive independent component analysis (ORICA) processor ASIC for real-time EEG system, implemented with TSMC 40 nm CMOS technology. ORICA is appropriate to be used in real-time EEG system to separate artifacts because of its highly efficient and real-time process features. The proposed ORICA processor is composed of an ORICA processing unit and a singular value decomposition (SVD) processing unit. Compared with previous work [1], this proposed ORICA processor has enhanced effectiveness and reduced hardware complexity by utilizing a deeper pipeline architecture, shared arithmetic processing unit, and shared registers. The 16-channel random signals which contain 8-channel super-Gaussian and 8-channel sub-Gaussian components are used to analyze the dependence of the source components, and the average correlation coefficient is 0.95452 between the original source signals and extracted ORICA signals. Finally, the proposed ORICA processor ASIC is implemented with TSMC 40 nm CMOS technology, and it consumes 15.72 mW at 100 MHz operating frequency.

  7. Biophysical synaptic dynamics in an analog VLSI network of Hodgkin-Huxley neurons.

    Science.gov (United States)

    Yu, Theodore; Cauwenberghs, Gert

    2009-01-01

    We study synaptic dynamics in a biophysical network of four coupled spiking neurons implemented in an analog VLSI silicon microchip. The four neurons implement a generalized Hodgkin-Huxley model with individually configurable rate-based kinetics of opening and closing of Na+ and K+ ion channels. The twelve synapses implement a rate-based first-order kinetic model of neurotransmitter and receptor dynamics, accounting for NMDA and non-NMDA type chemical synapses. The implemented models on the chip are fully configurable by 384 parameters accounting for conductances, reversal potentials, and pre/post-synaptic voltage-dependence of the channel kinetics. We describe the models and present experimental results from the chip characterizing single neuron dynamics, single synapse dynamics, and multi-neuron network dynamics showing phase-locking behavior as a function of synaptic coupling strength. The 3mm x 3mm microchip consumes 1.29 mW power making it promising for applications including neuromorphic modeling and neural prostheses.

  8. Imaging with polycrystalline mercuric iodide detectors using VLSI readout

    Energy Technology Data Exchange (ETDEWEB)

    Turchetta, R.; Dulinski, W.; Husson, D.; Riester, J.L.; Schieber, M.; Zuck, A.; Melekhov, L.; Saado, Y.; Hermon, H.; Nissenbaum, J

    1999-06-01

    Potentially low cost and large area polycrystalline mercuric iodide room-temperature radiation detectors, with thickness of 100-600 {mu}m have been successfully tested with dedicated low-noise, low-power mixed signal VLSI electronics which can be used for compact, imaging solutions. The detectors are fabricated by depositing HgI{sub 2} directly on an insulating substrate having electrodes in the form of microstrips and pixels with an upper continuous electrode. The deposition is made either by direct evaporation or by screen printing HgI{sub 2} mixed with glue such as Poly-Vinyl-Butiral. The properties of these first-generation detectors are quite uniform from one detector to another. Also for each single detector the response is quite uniform and no charge loss in the inter-electrode space have been detected. Because of the low cost and of the polycrystallinity, detectors can be potentially fabricated in any size and shape, using standard ceramic technology equipment, which is an attractive feature where low cost and large area applications are needed. The detectors which act as radiation counters have been tested with a beta source as well as in a high-energy beam of 100 GeV muons at CERN, connected to VLSI, low noise electronics. Charge collection efficiency and uniformity have been studied. The charge is efficiently collected even in the space between strips indicating that fill factors of 100% could be reached in imaging applications with direct detection of radiation. Single photon counting capability is reached with VLSI electronics. These results show the potential of this material for applications demanding position sensitive, radiation resistant, room-temperature operating radiation detectors, where position resolution is essential, as it can be found in some applications in high-energy physics, nuclear medicine and astrophysics.

  9. VLSI implementations of threshold logic-a comprehensive survey.

    Science.gov (United States)

    Beiu, V; Quintana, J M; Avedillo, M J

    2003-01-01

    This paper is an in-depth review on silicon implementations of threshold logic gates that covers several decades. In this paper, we will mention early MOS threshold logic solutions and detail numerous very-large-scale integration (VLSI) implementations including capacitive (switched capacitor and floating gate with their variations), conductance/current (pseudo-nMOS and output-wired-inverters, including a plethora of solutions evolved from them), as well as many differential solutions. At the end, we will briefly mention other implementations, e.g., based on negative resistance devices and on single electron technologies.

  10. Crystal growth and evaluation of silicon for VLSI and ULSI

    CERN Document Server

    Eranna, Golla

    2014-01-01

    PrefaceAbout the AuthorIntroductionSilicon: The SemiconductorWhy Single CrystalsRevolution in Integrated Circuit Fabrication Technology and the Art of Device MiniaturizationUse of Silicon as a SemiconductorSilicon Devices for Boolean ApplicationsIntegration of Silicon Devices and the Art of Circuit MiniaturizationMOS and CMOS Devices for Digital ApplicationsLSI, VLSI, and ULSI Circuits and ApplicationsSilicon for MEMS ApplicationsSummaryReferencesSilicon: The Key Material for Integrated Circuit Fabrication TechnologyIntroductionPreparation of Raw Silicon MaterialMetallurgical-Grade SiliconPuri

  11. A VLSI architecture for simplified arithmetic Fourier transform algorithm

    Science.gov (United States)

    Reed, Irving S.; Shih, Ming-Tang; Truong, T. K.; Hendon, E.; Tufts, D. W.

    1992-01-01

    The arithmetic Fourier transform (AFT) is a number-theoretic approach to Fourier analysis which has been shown to perform competitively with the classical FFT in terms of accuracy, complexity, and speed. Theorems developed in a previous paper for the AFT algorithm are used here to derive the original AFT algorithm which Bruns found in 1903. This is shown to yield an algorithm of less complexity and of improved performance over certain recent AFT algorithms. A VLSI architecture is suggested for this simplified AFT algorithm. This architecture uses a butterfly structure which reduces the number of additions by 25 percent of that used in the direct method.

  12. VLSI architectures for modern error-correcting codes

    CERN Document Server

    Zhang, Xinmiao

    2015-01-01

    Error-correcting codes are ubiquitous. They are adopted in almost every modern digital communication and storage system, such as wireless communications, optical communications, Flash memories, computer hard drives, sensor networks, and deep-space probing. New-generation and emerging applications demand codes with better error-correcting capability. On the other hand, the design and implementation of those high-gain error-correcting codes pose many challenges. They usually involve complex mathematical computations, and mapping them directly to hardware often leads to very high complexity. VLSI

  13. VLSI IMPLEMENTATION OF CHANNEL ESTIMATION FOR MIMO-OFDM TRANSCEIVER

    Directory of Open Access Journals (Sweden)

    Joseph Gladwin Sekar

    2013-01-01

    Full Text Available In this study the VLSI architecture for MIMO-OFDM transceiver and the algorithm for the implementation of MMSE detection in MIMO-OFDM system is proposed. The implemented MIMO-OFDM system is capable of transmitting data at high throughput in physical layer and provides optimized hardware resources while achieving the same data rate. The proposed architecture has low latency, high throughput and efficient resource utilization. The result obtained is compared with the MATLAB results for verification. The main aim is to reduce the hardware complexity of the channel estimation.

  14. Formal verification an essential toolkit for modern VLSI design

    CERN Document Server

    Seligman, Erik; Kumar, M V Achutha Kiran

    2015-01-01

    Formal Verification: An Essential Toolkit for Modern VLSI Design presents practical approaches for design and validation, with hands-on advice for working engineers integrating these techniques into their work. Building on a basic knowledge of System Verilog, this book demystifies FV and presents the practical applications that are bringing it into mainstream design and validation processes at Intel and other companies. The text prepares readers to effectively introduce FV in their organization and deploy FV techniques to increase design and validation productivity. Presents formal verific

  15. Low-power Analog VLSI Implementation of Wavelet Transform

    Institute of Scientific and Technical Information of China (English)

    ZHANG Jiang-hong

    2009-01-01

    For applications requiring low-power, low-voltage and real-time, a novel analog VLSI implementation of continuous Marr wavelet transform based on CMOS log-domain integrator is proposed.Mart wavelet is approximated by a parameterized class of function and with Levenbery-Marquardt nonlinear least square method,the optimum parameters of this function are obtained.The circuits of implementating Mart wavelet transform are composed of analog filter whose impulse response is the required wavelet.The filter design is based on IFLF structure with CMOS log-domain integrators as the main building blocks.SPICE simulations indicate an excellent approximations of ideal wavelet.

  16. VLSI implementation of a fairness ATM buffer system

    DEFF Research Database (Denmark)

    Nielsen, J.V.; Dittmann, Lars; Madsen, Jens Kargaard

    1996-01-01

    This paper presents a VLSI implementation of a resource allocation scheme, based on the concept of weighted fair queueing. The design can be used in asynchronous transfer mode (ATM) networks to ensure fairness and robustness. Weighted fair queueing is a scheduling and buffer management scheme...... that can provide a resource allocation policy and enforcement of this policy. It can be used in networks in order to provide defined allocation policies (fairness) and improve network robustness. The presented design illustrates how the theoretical weighted fair queueing model can be approximated...

  17. An adaptive, lossless data compression algorithm and VLSI implementations

    Science.gov (United States)

    Venbrux, Jack; Zweigle, Greg; Gambles, Jody; Wiseman, Don; Miller, Warner H.; Yeh, Pen-Shu

    1993-01-01

    This paper first provides an overview of an adaptive, lossless, data compression algorithm originally devised by Rice in the early '70s. It then reports the development of a VLSI encoder/decoder chip set developed which implements this algorithm. A recent effort in making a space qualified version of the encoder is described along with several enhancements to the algorithm. The performance of the enhanced algorithm is compared with those from other currently available lossless compression techniques on multiple sets of test data. The results favor our implemented technique in many applications.

  18. A VLSI Algorithm for Calculating the Treee to Tree Distance

    Institute of Scientific and Technical Information of China (English)

    徐美瑞; 刘小林

    1993-01-01

    Given two ordered,labeled trees βand α,to find the distance from tree β to tree α is an important problem in many fields,for example,the pattern recognition field.In this paper,a VLSI algorithm for calculating the tree-to-tree distance is presented.The computation structure of the algorithm is a 2-D Mesh with the size m&n.and the time is O(m=n),where m,n are the numbers of nodes of the tree βand tree α,respectively.

  19. Fast, Massively Parallel Data Processors

    Science.gov (United States)

    Heaton, Robert A.; Blevins, Donald W.; Davis, ED

    1994-01-01

    Proposed fast, massively parallel data processor contains 8x16 array of processing elements with efficient interconnection scheme and options for flexible local control. Processing elements communicate with each other on "X" interconnection grid with external memory via high-capacity input/output bus. This approach to conditional operation nearly doubles speed of various arithmetic operations.

  20. ASSP Advanced Sensor Signal Processor.

    Science.gov (United States)

    1984-06-01

    transfer data sad cimeds . When a Processor receives the required data (Image) md/or oamand, that data will be operated on B-3 I I I autonomouly. The...BAN is provided by two separately controled DMA address generator chips (Am29o40). Each of these DMA chips create an 8 bit address. One DMA chip gene

  1. Cassava processors' awareness of occupational and environmental ...

    African Journals Online (AJOL)

    Cassava processors' awareness of occupational and environmental hazards ... Majority of the respondents also complained of lack of water (78.4%), lack of ... so as to reduce the problems faced by cassava processors during processing.

  2. Design Principles for Synthesizable Processor Cores

    DEFF Research Database (Denmark)

    Schleuniger, Pascal; McKee, Sally A.; Karlsson, Sven

    2012-01-01

    As FPGAs get more competitive, synthesizable processor cores become an attractive choice for embedded computing. Currently popular commercial processor cores do not fully exploit current FPGA architectures. In this paper, we propose general design principles to increase instruction throughput...... on FPGA-based processor cores: first, superpipelining enables higher-frequency system clocks, and second, predicated instructions circumvent costly pipeline stalls due to branches. To evaluate their effects, we develop Tinuso, a processor architecture optimized for FPGA implementation. We demonstrate...

  3. Spike timing dependent plasticity (STDP) can ameliorate process variations in neuromorphic VLSI.

    Science.gov (United States)

    Cameron, Katherine; Boonsobhak, Vasin; Murray, Alan; Renshaw, David

    2005-11-01

    A transient-detecting very large scale integration (VLSI) pixel is described, suitable for use in a visual-processing, depth-recovery algorithm based upon spike timing. A small array of pixels is coupled to an adaptive system, based upon spike timing dependent plasticity (STDP), that aims to reduce the effect of VLSI process variations on the algorithm's performance. Results from 0.35 microm CMOS temporal differentiating pixels and STDP circuits show that the system is capable of adapting to substantially reduce the effects of process variations without interrupting the algorithm's natural processes. The concept is generic to all spike timing driven processing algorithms in a VLSI.

  4. Methodology of Efficient Energy Design for Noisy Deep Submicron VLSI Chips

    Institute of Scientific and Technical Information of China (English)

    WANGJun

    2004-01-01

    Power dissipation is becoming increasingly important as technology continues to scale. This paper describes a way to consider the dynamic, static and shortcircuit power dissipation simultaneously for making complete, quantitative prediction on the total power dissipation of noisy VLSI chip. Especially, this new method elucidates the mechanism of power dissipation caused by the intrinsic noise of deep submicron VLSI chip. To capture the noise dependency of efficient energy design strategies for VLSI chip, the simulation of two illustrative cases are observed. Finally, the future works are proposed for the optimum tradeoff among the power, speed and area, which includes the use of floating-body partially depleted silicon-on-insulator CMOS technology.

  5. 40 CFR 791.45 - Processors.

    Science.gov (United States)

    2010-07-01

    ... 40 Protection of Environment 31 2010-07-01 2010-07-01 true Processors. 791.45 Section 791.45 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) TOXIC SUBSTANCES CONTROL ACT (CONTINUED) DATA REIMBURSEMENT Basis for Proposed Order § 791.45 Processors. (a) Generally, processors will be...

  6. Testing interconnected VLSI circuits in the Big Viterbi Decoder

    Science.gov (United States)

    Onyszchuk, I. M.

    1991-01-01

    The Big Viterbi Decoder (BVD) is a powerful error-correcting hardware device for the Deep Space Network (DSN), in support of the Galileo and Comet Rendezvous Asteroid Flyby (CRAF)/Cassini Missions. Recently, a prototype was completed and run successfully at 400,000 or more decoded bits per second. This prototype is a complex digital system whose core arithmetic unit consists of 256 identical very large scale integration (VLSI) gate-array chips, 16 on each of 16 identical boards which are connected through a 28-layer, printed-circuit backplane using 4416 wires. Special techniques were developed for debugging, testing, and locating faults inside individual chips, on boards, and within the entire decoder. The methods are based upon hierarchical structure in the decoder, and require that chips or boards be wired themselves as Viterbi decoders. The basic procedure consists of sending a small set of known, very noisy channel symbols through a decoder, and matching observables against values computed by a software simulation. Also, tests were devised for finding open and short-circuited wires which connect VLSI chips on the boards and through the backplane.

  7. Communications systems and methods for subsea processors

    Science.gov (United States)

    Gutierrez, Jose; Pereira, Luis

    2016-04-26

    A subsea processor may be located near the seabed of a drilling site and used to coordinate operations of underwater drilling components. The subsea processor may be enclosed in a single interchangeable unit that fits a receptor on an underwater drilling component, such as a blow-out preventer (BOP). The subsea processor may issue commands to control the BOP and receive measurements from sensors located throughout the BOP. A shared communications bus may interconnect the subsea processor and underwater components and the subsea processor and a surface or onshore network. The shared communications bus may be operated according to a time division multiple access (TDMA) scheme.

  8. Invasive tightly coupled processor arrays

    CERN Document Server

    LARI, VAHID

    2016-01-01

    This book introduces new massively parallel computer (MPSoC) architectures called invasive tightly coupled processor arrays. It proposes strategies, architecture designs, and programming interfaces for invasive TCPAs that allow invading and subsequently executing loop programs with strict requirements or guarantees of non-functional execution qualities such as performance, power consumption, and reliability. For the first time, such a configurable processor array architecture consisting of locally interconnected VLIW processing elements can be claimed by programs, either in full or in part, using the principle of invasive computing. Invasive TCPAs provide unprecedented energy efficiency for the parallel execution of nested loop programs by avoiding any global memory access such as GPUs and may even support loops with complex dependencies such as loop-carried dependencies that are not amenable to parallel execution on GPUs. For this purpose, the book proposes different invasion strategies for claiming a desire...

  9. The UA1 trigger processor

    CERN Document Server

    Grayer, G H

    1981-01-01

    Experiment UA1 is a large multipurpose spectrometer at the CERN proton-antiproton collider. The principal trigger is formed on the basis of the energy deposition in calorimeters. A trigger decision taken in under 2.4 microseconds can avoid dead-time losses due to the bunched nature of the beam. To achieve this fast 8-bit charge to digital converters have been built followed by two identical digital processors tailored to the experiment. The outputs of groups of the 2440 photomultipliers in the calorimeters are summed to form a total of 288 input channels to the ADCs. A look-up table in RAM is used to convert the digitised photomultiplier signals to energy in one processor, and to transverse energy in the other. Each processor forms four sums from a chosen combination of input channels, and also counts the number of clusters with electromagnetic or hadronic energy above pre-determined levels. Up to twelve combinations of these conditions, together with external information, may be combined in coincidence or in...

  10. Taxonomy of Data Prefetching for Multicore Processors

    Institute of Scientific and Technical Information of China (English)

    Surendra Byna; Yong Chen; Xian-He Sun

    2009-01-01

    Data prefetching is an effective data access latency hiding technique to mask the CPU stall caused by cache misses and to bridge the performance gap between processor and memory. With hardware and/or software support, data prefetching brings data closer to a processor before it is actually needed. Many prefetching techniques have been developed for single-core processors. Recent developments in processor technology have brought multicore processors into mainstream.While some of the single-core prefetching techniques are directly applicable to multicore processors, numerous novel strategies have been proposed in the past few years to take advantage of multiple cores. This paper aims to provide a comprehensive review of the state-of-the-art prefetching techniques, and proposes a taxonomy that classifies various design concerns in developing a prefetching strategy, especially for multicore processors. We compare various existing methods through analysis as well.

  11. An Efficient Circulant MIMO Equalizer for CDMA Downlink: Algorithm and VLSI Architecture

    Directory of Open Access Journals (Sweden)

    Cavallaro Joseph R

    2006-01-01

    Full Text Available We present an efficient circulant approximation-based MIMO equalizer architecture for the CDMA downlink. This reduces the direct matrix inverse (DMI of size with complexity to some FFT operations with complexity and the inverse of some submatrices. We then propose parallel and pipelined VLSI architectures with Hermitian optimization and reduced-state FFT for further complexity optimization. Generic VLSI architectures are derived for the high-order receiver from partitioned submatrices. This leads to more parallel VLSI design with further complexity reduction. Comparative study with both the conjugate-gradient and DMI algorithms shows very promising performance/complexity tradeoff. VLSI design space in terms of area/time efficiency is explored extensively for layered parallelism and pipelining with a Catapult C high-level-synthesis methodology.

  12. Real-time simulation of biologically realistic stochastic neurons in VLSI.

    Science.gov (United States)

    Chen, Hsin; Saighi, Sylvain; Buhry, Laure; Renaud, Sylvie

    2010-09-01

    Neuronal variability has been thought to play an important role in the brain. As the variability mainly comes from the uncertainty in biophysical mechanisms, stochastic neuron models have been proposed for studying how neurons compute with noise. However, most papers are limited to simulating stochastic neurons in a digital computer. The speed and the efficiency are thus limited especially when a large neuronal network is of concern. This brief explores the feasibility of simulating the stochastic behavior of biological neurons in a very large scale integrated (VLSI) system, which implements a programmable and configurable Hodgkin-Huxley model. By simply injecting noise to the VLSI neuron, various stochastic behaviors observed in biological neurons are reproduced realistically in VLSI. The noise-induced variability is further shown to enhance the signal modulation of a neuron. These results point toward the development of analog VLSI systems for exploring the stochastic behaviors of biological neuronal networks in large scale.

  13. Microfluidic very large scale integration (VLSI) modeling, simulation, testing, compilation and physical synthesis

    CERN Document Server

    Pop, Paul; Madsen, Jan

    2016-01-01

    This book presents the state-of-the-art techniques for the modeling, simulation, testing, compilation and physical synthesis of mVLSI biochips. The authors describe a top-down modeling and synthesis methodology for the mVLSI biochips, inspired by microelectronics VLSI methodologies. They introduce a modeling framework for the components and the biochip architecture, and a high-level microfluidic protocol language. Coverage includes a topology graph-based model for the biochip architecture, and a sequencing graph to model for biochemical application, showing how the application model can be obtained from the protocol language. The techniques described facilitate programmability and automation, enabling developers in the emerging, large biochip market. · Presents the current models used for the research on compilation and synthesis techniques of mVLSI biochips in a tutorial fashion; · Includes a set of "benchmarks", that are presented in great detail and includes the source code of several of the techniques p...

  14. Vertically Coupled Microring Resonator Filter :Versatile Building Block for VLSI Filter Circuits

    Institute of Scientific and Technical Information of China (English)

    Yasuo; Kokubun

    2003-01-01

    In this review, the recent progress in the development of vertically coupled micro-ring resonator filters is summarized and the potential applications of the filters leading to the development of VLSI photonics are described.

  15. Vertically Coupled Microring Resonator Filter : Versatile Building Block for VLSI Filter Circuits

    Institute of Scientific and Technical Information of China (English)

    Yasuo Kokubun

    2003-01-01

    In this review, the recent progress in the development of vertically coupled micro-ring resonator filters is summarized and the potential applications of the filters leading to the development of VLSI photonics are described.

  16. Application of evolutionary algorithms for multi-objective optimization in VLSI and embedded systems

    CERN Document Server

    2015-01-01

    This book describes how evolutionary algorithms (EA), including genetic algorithms (GA) and particle swarm optimization (PSO) can be utilized for solving multi-objective optimization problems in the area of embedded and VLSI system design. Many complex engineering optimization problems can be modelled as multi-objective formulations. This book provides an introduction to multi-objective optimization using meta-heuristic algorithms, GA and PSO, and how they can be applied to problems like hardware/software partitioning in embedded systems, circuit partitioning in VLSI, design of operational amplifiers in analog VLSI, design space exploration in high-level synthesis, delay fault testing in VLSI testing, and scheduling in heterogeneous distributed systems. It is shown how, in each case, the various aspects of the EA, namely its representation, and operators like crossover, mutation, etc. can be separately formulated to solve these problems. This book is intended for design engineers and researchers in the field ...

  17. Compilation Techniques Specific for a Hardware Cryptography-Embedded Multimedia Mobile Processor

    Directory of Open Access Journals (Sweden)

    Masa-aki FUKASE

    2007-12-01

    Full Text Available The development of single chip VLSI processors is the key technology of ever growing pervasive computing to answer overall demands for usability, mobility, speed, security, etc. We have so far developed a hardware cryptography-embedded multimedia mobile processor architecture, HCgorilla. Since HCgorilla integrates a wide range of techniques from architectures to applications and languages, one-sided design approach is not always useful. HCgorilla needs more complicated strategy, that is, hardware/software (H/S codesign. Thus, we exploit the software support of HCgorilla composed of a Java interface and parallelizing compilers. They are assumed to be installed in servers in order to reduce the load and increase the performance of HCgorilla-embedded clients. Since compilers are the essence of software's responsibility, we focus in this article on our recent results about the design, specifications, and prototyping of parallelizing compilers for HCgorilla. The parallelizing compilers are composed of a multicore compiler and a LIW compiler. They are specified to abstract parallelism from executable serial codes or the Java interface output and output the codes executable in parallel by HCgorilla. The prototyping compilers are written in Java. The evaluation by using an arithmetic test program shows the reasonability of the prototyping compilers compared with hand compilers.

  18. Compilation Techniques Specific for a Hardware Cryptography-Embedded Multimedia Mobile Processor

    Directory of Open Access Journals (Sweden)

    Masa-aki FUKASE

    2007-12-01

    Full Text Available The development of single chip VLSI processors is the key technology of ever growing pervasive computing to answer overall demands for usability, mobility, speed, security, etc. We have so far developed a hardware cryptography-embedded multimedia mobile processor architecture, HCgorilla. Since HCgorilla integrates a wide range of techniques from architectures to applications and languages, one-sided design approach is not always useful. HCgorilla needs more complicated strategy, that is, hardware/software (H/S codesign. Thus, we exploit the software support of HCgorilla composed of a Java interface and parallelizing compilers. They are assumed to be installed in servers in order to reduce the load and increase the performance of HCgorilla-embedded clients. Since compilers are the essence of software's responsibility, we focus in this article on our recent results about the design, specifications, and prototyping of parallelizing compilers for HCgorilla. The parallelizing compilers are composed of a multicore compiler and a LIW compiler. They are specified to abstract parallelism from executable serial codes or the Java interface output and output the codes executable in parallel by HCgorilla. The prototyping compilers are written in Java. The evaluation by using an arithmetic test program shows the reasonability of the prototyping compilers compared with hand compilers.

  19. POWER DRIVEN SYNTHESIS OF COMBINATIONAL CIRCUITS ON THE BASE OF CMOS VLSI LIBRARY ELEMENTS

    Directory of Open Access Journals (Sweden)

    D. I. Cheremisinov

    2013-01-01

    Full Text Available A problem of synthesis of multi-level logical networks using CMOS VLSI cell library is considered. The networks are optimized with respect to the die size and average dissipated power by CMOS-circuit implemented on a VLSI chip. The suggested approach is based on covering multilevel gate network and on taking into account specific features of the CMOS cell basis.

  20. A CNN-Specific Integrated Processor

    Directory of Open Access Journals (Sweden)

    Suleyman Malki

    2009-01-01

    Full Text Available Integrated Processors (IP are algorithm-specific cores that either by programming or by configuration can be re-used within many microelectronic systems. This paper looks at Cellular Neural Networks (CNN to become realized as IP. First current digital implementations are reviewed, and the memoryprocessor bandwidth issues are analyzed. Then a generic view is taken on the structure of the network, and a new intra-communication protocol based on rotating wheels is proposed. It is shown that this provides for guaranteed high-performance with a minimal network interface. The resulting node is small and supports multi-level CNN designs, giving the system a 30-fold increase in capacity compared to classical designs. As it facilitates multiple operations on a single image, and single operations on multiple images, with minimal access to the external image memory, balancing the internal and external data transfer requirements optimizes the system operation. In conventional digital CNN designs, the treatment of boundary nodes requires additional logic to handle the CNN value propagation scheme. In the new architecture, only a slight modification of the existing cells is necessary to model the boundary effect. A typical prototype for visual pattern recognition will house 4096 CNN cells with a 2% overhead for making it an IP.

  1. A CNN-Specific Integrated Processor

    Science.gov (United States)

    Malki, Suleyman; Spaanenburg, Lambert

    2009-12-01

    Integrated Processors (IP) are algorithm-specific cores that either by programming or by configuration can be re-used within many microelectronic systems. This paper looks at Cellular Neural Networks (CNN) to become realized as IP. First current digital implementations are reviewed, and the memoryprocessor bandwidth issues are analyzed. Then a generic view is taken on the structure of the network, and a new intra-communication protocol based on rotating wheels is proposed. It is shown that this provides for guaranteed high-performance with a minimal network interface. The resulting node is small and supports multi-level CNN designs, giving the system a 30-fold increase in capacity compared to classical designs. As it facilitates multiple operations on a single image, and single operations on multiple images, with minimal access to the external image memory, balancing the internal and external data transfer requirements optimizes the system operation. In conventional digital CNN designs, the treatment of boundary nodes requires additional logic to handle the CNN value propagation scheme. In the new architecture, only a slight modification of the existing cells is necessary to model the boundary effect. A typical prototype for visual pattern recognition will house 4096 CNN cells with a 2% overhead for making it an IP.

  2. Adaptive Neurons For Artificial Neural Networks

    Science.gov (United States)

    Tawel, Raoul

    1990-01-01

    Training time decreases dramatically. In improved mathematical model of neural-network processor, temperature of neurons (in addition to connection strengths, also called weights, of synapses) varied during supervised-learning phase of operation according to mathematical formalism and not heuristic rule. Evidence that biological neural networks also process information at neuronal level.

  3. A neural network device for on-line particle identification in cosmic ray experiments

    Energy Technology Data Exchange (ETDEWEB)

    Scrimaglio, R. E-mail: renato.scrimaglio@aquila.infn.it; Finetti, N.; D' Altorio, L.; Rantucci, E.; Raso, M.; Segreto, E.; Tassoni, A.; Cardarilli, G.C

    2004-05-21

    On-line particle identification is one of the main goals of many experiments in space both for rare event studies and for optimizing measurements along the orbital trajectory. Neural networks can be a useful tool for signal processing and real time data analysis in such experiments. In this document we report on the performances of a programmable neural device which was developed in VLSI analog/digital technology. Neurons and synapses were accomplished by making use of Operational Transconductance Amplifier (OTA) structures. In this paper we report on the results of measurements performed in order to verify the agreement of the characteristic curves of each elementary cell with simulations and on the device performances obtained by implementing simple neural structures on the VLSI chip. A feed-forward neural network (Multi-Layer Perceptron, MLP) was implemented on the VLSI chip and trained to identify particles by processing the signals of two-dimensional position-sensitive Si detectors. The radiation monitoring device consisted of three double-sided silicon strip detectors. From the analysis of a set of simulated data it was found that the MLP implemented on the neural device gave results comparable with those obtained with the standard method of analysis confirming that the implemented neural network could be employed for real time particle identification.

  4. VLSI technology for smaller, cheaper, faster return link systems

    Science.gov (United States)

    Nanzetta, Kathy; Ghuman, Parminder; Bennett, Toby; Solomon, Jeff; Dowling, Jason; Welling, John

    1994-01-01

    Very Large Scale Integration (VLSI) Application-specific Integrated Circuit (ASIC) technology has enabled substantially smaller, cheaper, and more capable telemetry data systems. However, the rapid growth in available ASIC fabrication densities has far outpaced the application of this technology to telemetry systems. Available densities have grown by well over an order magnitude since NASA's Goddard Space Flight Center (GSFC) first began developing ASIC's for ground telemetry systems in 1985. To take advantage of these higher integration levels, a new generation of ASIC's for return link telemetry processing is under development. These new submicron devices are designed to further reduce the cost and size of NASA return link processing systems while improving performance. This paper describes these highly integrated processing components.

  5. Efficient VLSI architecture for training radial basis function networks.

    Science.gov (United States)

    Fan, Zhe-Cheng; Hwang, Wen-Jyi

    2013-03-19

    This paper presents a novel VLSI architecture for the training of radial basis function (RBF) networks. The architecture contains the circuits for fuzzy C-means (FCM) and the recursive Least Mean Square (LMS) operations. The FCM circuit is designed for the training of centers in the hidden layer of the RBF network. The recursive LMS circuit is adopted for the training of connecting weights in the output layer. The architecture is implemented by the field programmable gate array (FPGA). It is used as a hardware accelerator in a system on programmable chip (SOPC) for real-time training and classification. Experimental results reveal that the proposed RBF architecture is an effective alternative for applications where fast and efficient RBF training is desired.

  6. Modeling selective attention using a neuromorphic analog VLSI device.

    Science.gov (United States)

    Indiveri, G

    2000-12-01

    Attentional mechanisms are required to overcome the problem of flooding a limited processing capacity system with information. They are present in biological sensory systems and can be a useful engineering tool for artificial visual systems. In this article we present a hardware model of a selective attention mechanism implemented on a very large-scale integration (VLSI) chip, using analog neuromorphic circuits. The chip exploits a spike-based representation to receive, process, and transmit signals. It can be used as a transceiver module for building multichip neuromorphic vision systems. We describe the circuits that carry out the main processing stages of the selective attention mechanism and provide experimental data for each circuit. We demonstrate the expected behavior of the model at the system level by stimulating the chip with both artificially generated control signals and signals obtained from a saliency map, computed from an image containing several salient features.

  7. VLSI-based Video Event Triggering for Image Data Compression

    Science.gov (United States)

    Williams, Glenn L.

    1994-01-01

    Long-duration, on-orbit microgravity experiments require a combination of high resolution and high frame rate video data acquisition. The digitized high-rate video stream presents a difficult data storage problem. Data produced at rates of several hundred million bytes per second may require a total mission video data storage requirement exceeding one terabyte. A NASA-designed, VLSI-based, highly parallel digital state machine generates a digital trigger signal at the onset of a video event. High capacity random access memory storage coupled with newly available fuzzy logic devices permits the monitoring of a video image stream for long term (DC-like) or short term (AC-like) changes caused by spatial translation, dilation, appearance, disappearance, or color change in a video object. Pre-trigger and post-trigger storage techniques are then adaptable to archiving only the significant video images.

  8. A VLSI implementation of DCT using pass transistor technology

    Science.gov (United States)

    Kamath, S.; Lynn, Douglas; Whitaker, Sterling

    1992-01-01

    A VLSI design for performing the Discrete Cosine Transform (DCT) operation on image blocks of size 16 x 16 in a real time fashion operating at 34 MHz (worst case) is presented. The process used was Hewlett-Packard's CMOS26--A 3 metal CMOS process with a minimum feature size of 0.75 micron. The design is based on Multiply-Accumulate (MAC) cells which make use of a modified Booth recoding algorithm for performing multiplication. The design of these cells is straight forward, and the layouts are regular with no complex routing. Two versions of these MAC cells were designed and their layouts completed. Both versions were simulated using SPICE to estimate their performance. One version is slightly faster at the cost of larger silicon area and higher power consumption. An improvement in speed of almost 20 percent is achieved after several iterations of simulation and re-sizing.

  9. VLSI design techniques for floating-point computation

    Energy Technology Data Exchange (ETDEWEB)

    Bose, B. K.

    1988-01-01

    The thesis presents design techniques for floating-point computation in VLSI. A basis for area-time design decisions for arithmetic and memory operations is formulated from a study of computationally intensive programs. Tradeoffs in the design and implementation of an efficient coprocessor interface are studied, together with the implications of hardware support for the IEEE Floating-Point Standard. Algorithm area-time tradeoffs for basic arithmetic functions are analyzed in light of changing technology. Details of a single-chip floating-point unit designed in two-micron CMOS for SPUR are described, including special design considerations for very wide data paths. The pervasive effects of scaling technology on different levels of design are explored, from devices and circuits, through logic and micro-architecture, to algorithms and systems.

  10. Efficient VLSI Architecture for Training Radial Basis Function Networks

    Directory of Open Access Journals (Sweden)

    Wen-Jyi Hwang

    2013-03-01

    Full Text Available This paper presents a novel VLSI architecture for the training of radial basis function (RBF networks. The architecture contains the circuits for fuzzy C-means (FCM and the recursive Least Mean Square (LMS operations. The FCM circuit is designed for the training of centers in the hidden layer of the RBF network. The recursive LMS circuit is adopted for the training of connecting weights in the output layer. The architecture is implemented by the field programmable gate array (FPGA. It is used as a hardware accelerator in a system on programmable chip (SOPC for real-time training and classification. Experimental results reveal that the proposed RBF architecture is an effective alternative for applications where fast and efficient RBF training is desired.

  11. Carbon nanotube based VLSI interconnects analysis and design

    CERN Document Server

    Kaushik, Brajesh Kumar

    2015-01-01

    The brief primarily focuses on the performance analysis of CNT based interconnects in current research scenario. Different CNT structures are modeled on the basis of transmission line theory. Performance comparison for different CNT structures illustrates that CNTs are more promising than Cu or other materials used in global VLSI interconnects. The brief is organized into five chapters which mainly discuss: (1) an overview of current research scenario and basics of interconnects; (2) unique crystal structures and the basics of physical properties of CNTs, and the production, purification and applications of CNTs; (3) a brief technical review, the geometry and equivalent RLC parameters for different single and bundled CNT structures; (4) a comparative analysis of crosstalk and delay for different single and bundled CNT structures; and (5) various unique mixed CNT bundle structures and their equivalent electrical models.

  12. Realistic model of compact VLSI FitzHugh-Nagumo oscillators

    Science.gov (United States)

    Cosp, Jordi; Binczak, Stéphane; Madrenas, Jordi; Fernández, Daniel

    2014-02-01

    In this article, we present a compact analogue VLSI implementation of the FitzHugh-Nagumo neuron model, intended to model large-scale, biologically plausible, oscillator networks. As the model requires a series resistor and a parallel capacitor with the inductor, which is the most complex part of the design, it is possible to greatly simplify the active inductor implementation compared to other implementations of this device as typically found in filters by allowing appreciable, but well modelled, nonidealities. We model and obtain the parameters of the inductor nonideal model as an inductance in series with a parasitic resistor and a second order low-pass filter with a large cut-off frequency. Post-layout simulations for a CMOS 0.35 μm double-poly technology using the MOSFET Spice BSIM3v3 model confirm the proper behaviour of the design.

  13. Power Efficient Sub-Array in Reconfigurable VLSI Meshes

    Institute of Scientific and Technical Information of China (English)

    Ji-Gang Wu; Thambipillai Srikanthan

    2005-01-01

    Given an m × n mesh-connected VLSI array with some faulty elements, the reconfiguration problem is to find a maximum-sized fault-free sub-array under the row and column rerouting scheme. This problem has already been shown to be NP-complete. In this paper, new techniques are proposed, based on heuristic strategy, to minimize the number of switches required for the power efficient sub-array. Our algorithm shows that notable improvements in the reduction of the number of long interconnects could be realized in linear time and without sacrificing on the size of the sub-array. Simulations based on several random and clustered fault scenarios clearly reveal the superiority of the proposed techniques.

  14. Replacing design rules in the VLSI design cycle

    Science.gov (United States)

    Hurley, Paul; Kryszczuk, Krzysztof

    2012-03-01

    We make a case for the migration of Design Rule Check (DRC), the first step in the modern VLSI design process, to a model-based system. DRC uses a large set of rules to determine permitted designs. We argue that it is a legacy of the past: slow, labor intensive, ad-hoc, inaccurate and too restrictive. We envisage the replacement of DRC and printability simulation by a signal processing and machine learning-based approach for 22nm technology nodes and beyond. Such a process would produce fast, accurate, autonomous printability prediction for optical lithography. As such, we built a proof-of-concept demonstrator that can predict OPC problems using a trained classifier without the need to fall back on costly first-principle simulation. For one sample test site, and for the OPC Line Width error type OPC violation marker, the demonstrator obtained an Equal Error Rate of ca. 4%.

  15. A VLSI implementation of DCT using pass transistor technology

    Science.gov (United States)

    Kamath, S.; Lynn, Douglas; Whitaker, Sterling

    A VLSI design for performing the Discrete Cosine Transform (DCT) operation on image blocks of size 16 x 16 in a real time fashion operating at 34 MHz (worst case) is presented. The process used was Hewlett-Packard's CMOS26--A 3 metal CMOS process with a minimum feature size of 0.75 micron. The design is based on Multiply-Accumulate (MAC) cells which make use of a modified Booth recoding algorithm for performing multiplication. The design of these cells is straight forward, and the layouts are regular with no complex routing. Two versions of these MAC cells were designed and their layouts completed. Both versions were simulated using SPICE to estimate their performance. One version is slightly faster at the cost of larger silicon area and higher power consumption. An improvement in speed of almost 20 percent is achieved after several iterations of simulation and re-sizing.

  16. Platforms for artificial neural networks : neurosimulators and performance prediction of MIMD-parallel systems

    NARCIS (Netherlands)

    Vuurpijl, L.G.

    1998-01-01

    In this thesis, two platforms for simulating artificial neural networks are discussed: MIMD-parallel processor systems as an execution platform and neurosimulators as a research and development platform. Because of the parallelism encountered in neural networks, distributed processor systems seem to

  17. Platforms for artificial neural networks : neurosimulators and performance prediction of MIMD-parallel systems

    NARCIS (Netherlands)

    Vuurpijl, L.G.

    1998-01-01

    In this thesis, two platforms for simulating artificial neural networks are discussed: MIMD-parallel processor systems as an execution platform and neurosimulators as a research and development platform. Because of the parallelism encountered in neural networks, distributed processor systems seem to

  18. Multiprocessor Realization of Neural Networks

    Science.gov (United States)

    1990-04-01

    the unique capabilities of receiving, processing, and transmitting electo-chemical signals. These signals are sent over neural pathways that make up...these switching nodes and a clever arrangement of internode links to guaranteee at least one’ path between each processor and memory. These types of

  19. Functional Verification of Enhanced RISC Processor

    OpenAIRE

    SHANKER NILANGI; SOWMYA L

    2013-01-01

    This paper presents design and verification of a 32-bit enhanced RISC processor core having floating point computations integrated within the core, has been designed to reduce the cost and complexity. The designed 3 stage pipelined 32-bit RISC processor is based on the ARM7 processor architecture with single precision floating point multiplier, floating point adder/subtractor for floating point operations and 32 x 32 booths multiplier added to the integer core of ARM7. The binary representati...

  20. Digital Signal Processor For GPS Receivers

    Science.gov (United States)

    Thomas, J. B.; Meehan, T. K.; Srinivasan, J. M.

    1989-01-01

    Three innovative components combined to produce all-digital signal processor with superior characteristics: outstanding accuracy, high-dynamics tracking, versatile integration times, lower loss-of-lock signal strengths, and infrequent cycle slips. Three components are digital chip advancer, digital carrier downconverter and code correlator, and digital tracking processor. All-digital signal processor intended for use in receivers of Global Positioning System (GPS) for geodesy, geodynamics, high-dynamics tracking, and ionospheric calibration.

  1. Design Principles for Synthesizable Processor Cores

    DEFF Research Database (Denmark)

    Schleuniger, Pascal; McKee, Sally A.; Karlsson, Sven

    2012-01-01

    As FPGAs get more competitive, synthesizable processor cores become an attractive choice for embedded computing. Currently popular commercial processor cores do not fully exploit current FPGA architectures. In this paper, we propose general design principles to increase instruction throughput...... through the use of micro-benchmarks that our principles guide the design of a processor core that improves performance by an average of 38% over a similar Xilinx MicroBlaze configuration....

  2. The case for a generic implant processor.

    Science.gov (United States)

    Strydis, Christos; Gaydadjiev, Georgi N

    2008-01-01

    A more structured and streamlined design of implants is nowadays possible. In this paper we focus on implant processors located in the heart of implantable systems. We present a real and representative biomedical-application scenario where such a new processor can be employed. Based on a suitably selected processor simulator, various operational aspects of the application are being monitored. Findings on performance, cache behavior, branch prediction, power consumption, energy expenditure and instruction mixes are presented and analyzed. The suitability of such an implant processor and directions for future work are given.

  3. Alternative Water Processor Test Development

    Science.gov (United States)

    Pickering, Karen D.; Mitchell, Julie; Vega, Leticia; Adam, Niklas; Flynn, Michael; Wjee (er. Rau); Lunn, Griffin; Jackson, Andrew

    2012-01-01

    The Next Generation Life Support Project is developing an Alternative Water Processor (AWP) as a candidate water recovery system for long duration exploration missions. The AWP consists of biological water processor (BWP) integrated with a forward osmosis secondary treatment system (FOST). The basis of the BWP is a membrane aerated biological reactor (MABR), developed in concert with Texas Tech University. Bacteria located within the MABR metabolize organic material in wastewater, converting approximately 90% of the total organic carbon to carbon dioxide. In addition, bacteria convert a portion of the ammonia-nitrogen present in the wastewater to nitrogen gas, through a combination of nitrogen and denitrification. The effluent from the BWP system is low in organic contaminants, but high in total dissolved solids. The FOST system, integrated downstream of the BWP, removes dissolved solids through a combination of concentration-driven forward osmosis and pressure driven reverse osmosis. The integrated system is expected to produce water with a total organic carbon less than 50 mg/l and dissolved solids that meet potable water requirements for spaceflight. This paper describes the test definition, the design of the BWP and FOST subsystems, and plans for integrated testing.

  4. Alternative Water Processor Test Development

    Science.gov (United States)

    Pickering, Karen D.; Mitchell, Julie L.; Adam, Niklas M.; Barta, Daniel; Meyer, Caitlin E.; Pensinger, Stuart; Vega, Leticia M.; Callahan, Michael R.; Flynn, Michael; Wheeler, Ray; hide

    2013-01-01

    The Next Generation Life Support Project is developing an Alternative Water Processor (AWP) as a candidate water recovery system for long duration exploration missions. The AWP consists of biological water processor (BWP) integrated with a forward osmosis secondary treatment system (FOST). The basis of the BWP is a membrane aerated biological reactor (MABR), developed in concert with Texas Tech University. Bacteria located within the MABR metabolize organic material in wastewater, converting approximately 90% of the total organic carbon to carbon dioxide. In addition, bacteria convert a portion of the ammonia-nitrogen present in the wastewater to nitrogen gas, through a combination of nitrification and denitrification. The effluent from the BWP system is low in organic contaminants, but high in total dissolved solids. The FOST system, integrated downstream of the BWP, removes dissolved solids through a combination of concentration-driven forward osmosis and pressure driven reverse osmosis. The integrated system is expected to produce water with a total organic carbon less than 50 mg/l and dissolved solids that meet potable water requirements for spaceflight. This paper describes the test definition, the design of the BWP and FOST subsystems, and plans for integrated testing.

  5. Ultrafast Fourier-transform parallel processor

    Energy Technology Data Exchange (ETDEWEB)

    Greenberg, W.L.

    1980-04-01

    A new, flexible, parallel-processing architecture is developed for a high-speed, high-precision Fourier transform processor. The processor is intended for use in 2-D signal processing including spatial filtering, matched filtering and image reconstruction from projections.

  6. Adapting implicit methods to parallel processors

    Energy Technology Data Exchange (ETDEWEB)

    Reeves, L.; McMillin, B.; Okunbor, D.; Riggins, D. [Univ. of Missouri, Rolla, MO (United States)

    1994-12-31

    When numerically solving many types of partial differential equations, it is advantageous to use implicit methods because of their better stability and more flexible parameter choice, (e.g. larger time steps). However, since implicit methods usually require simultaneous knowledge of the entire computational domain, these methods axe difficult to implement directly on distributed memory parallel processors. This leads to infrequent use of implicit methods on parallel/distributed systems. The usual implementation of implicit methods is inefficient due to the nature of parallel systems where it is common to take the computational domain and distribute the grid points over the processors so as to maintain a relatively even workload per processor. This creates a problem at the locations in the domain where adjacent points are not on the same processor. In order for the values at these points to be calculated, messages have to be exchanged between the corresponding processors. Without special adaptation, this will result in idle processors during part of the computation, and as the number of idle processors increases, the lower the effective speed improvement by using a parallel processor.

  7. The TM3270 Media-processor

    NARCIS (Netherlands)

    van de Waerdt, J.W.

    2006-01-01

    I n this thesis, we present the TM3270 VLIW media-processor, the latest of TriMedia processors, and describe the innovations with respect to its prede- cessor: the TM3260. We describe enhancements to the load/store unit design, such as a new data prefetching technique, and architectural

  8. Multi-output programmable quantum processor

    OpenAIRE

    Yu, Yafei; Feng, Jian; Zhan, Mingsheng

    2002-01-01

    By combining telecloning and programmable quantum gate array presented by Nielsen and Chuang [Phys.Rev.Lett. 79 :321(1997)], we propose a programmable quantum processor which can be programmed to implement restricted set of operations with several identical data outputs. The outputs are approximately-transformed versions of input data. The processor successes with certain probability.

  9. 7 CFR 1215.14 - Processor.

    Science.gov (United States)

    2010-01-01

    ... AND ORDERS; MISCELLANEOUS COMMODITIES), DEPARTMENT OF AGRICULTURE POPCORN PROMOTION, RESEARCH, AND CONSUMER INFORMATION Popcorn Promotion, Research, and Consumer Information Order Definitions § 1215.14 Processor. Processor means a person engaged in the preparation of unpopped popcorn for the market who...

  10. The TM3270 Media-processor

    NARCIS (Netherlands)

    van de Waerdt, J.W.

    2006-01-01

    I n this thesis, we present the TM3270 VLIW media-processor, the latest of TriMedia processors, and describe the innovations with respect to its prede- cessor: the TM3260. We describe enhancements to the load/store unit design, such as a new data prefetching technique, and architectural enhancements

  11. Advanced Multiple Processor Configuration Study. Final Report.

    Science.gov (United States)

    Clymer, S. J.

    This summary of a study on multiple processor configurations includes the objectives, background, approach, and results of research undertaken to provide the Air Force with a generalized model of computer processor combinations for use in the evaluation of proposed flight training simulator computational designs. An analysis of a real-time flight…

  12. The Case for a Generic Implant Processor

    NARCIS (Netherlands)

    Strydis, C.; Gaydadjiev, G.N.

    2008-01-01

    A more structured and streamlined design of implants is nowadays possible. In this paper we focus on implant processors located in the heart of implantable systems. We present a real and representative biomedical-application scenario where such a new processor can be employed. Based on a suitably se

  13. An Empirical Evaluation of XQuery Processors

    NARCIS (Netherlands)

    Manegold, S.

    2008-01-01

    This paper presents an extensive and detailed experimental evaluation of XQuery processors. The study consists of running five publicly available XQuery benchmarks --- the Michigan benchmark (MBench), XBench, XMach-1, XMark and X007 --- on six XQuery processors, three stand-alone (file-based) XQuery

  14. The Case for a Generic Implant Processor

    NARCIS (Netherlands)

    Strydis, C.; Gaydadjiev, G.N.

    2008-01-01

    A more structured and streamlined design of implants is nowadays possible. In this paper we focus on implant processors located in the heart of implantable systems. We present a real and representative biomedical-application scenario where such a new processor can be employed. Based on a suitably

  15. Towards a Process Algebra for Shared Processors

    DEFF Research Database (Denmark)

    Buchholtz, Mikael; Andersen, Jacob; Løvengreen, Hans Henrik

    2002-01-01

    We present initial work on a timed process algebra that models sharing of processor resources allowing preemption at arbitrary points in time. This enables us to model both the functional and the timely behaviour of concurrent processes executed on a single processor. We give a refinement relation...

  16. Verilog Implementation of 32-Bit CISC Processor

    Directory of Open Access Journals (Sweden)

    P.Kanaka Sirisha

    2016-04-01

    Full Text Available The Project deals with the design of the 32-Bit CISC Processor and modeling of its components using Verilog language. The Entire Processor uses 32-Bit bus to deal with all the registers and the memories. This Processor implements various arithmetic, logical, Data Transfer operations etc., using variable length instructions, which is the core property of the CISC Architecture. The Processor also supports various addressing modes to perform a 32-Bit instruction. Our Processor uses Harvard Architecture (i.e., to have a separate program and data memory and hence has different buses to negotiate with the Program Memory and Data Memory individually. This feature enhances the speed of our processor. Hence it has two different Program Counters to point to the memory locations of the Program Memory and Data Memory.Our processor has ‘Instruction Queuing’ which enables it to save the time needed to fetch the instruction and hence increases the speed of operation. ‘Interrupt Service Routine’ is provided in our Processor to make it address the Interrupts.

  17. A Primal-dual Neural Network for Shortest Path Problem

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The shortest path (SP) problem is a classical combinatorial optimization problem which plays an important role in a packet-switched computer and communication network. A new primal-dual neural network to solve the shortest path problem (PDSPN) is presented in this paper. The proposed neural network combines many features such as no network coefficients set,easy implementation in a VLSI circuit, and is proved to be completely stable to the exact solutions. The simulation example shows its efficiency in finding the "optimum" path(s) for data transmission in computer and communication network.

  18. Boolean approaches to graph embeddings related to VLSI

    Institute of Scientific and Technical Information of China (English)

    LIU; Yanpei(

    2001-01-01

    [1]Hu, T. C., Kuh, S. E., Theory and concepts of circuit layout, in VLSI Circuit Layout: Theory and Design, New York:IEEE Press, 1985, 3-18.[2]Liu Yanpei, Embeddability in Graphs, Boston-Beijing: Kluwer Science, 1995.[3]Liu Yanpei, Some combinatorial optimization problems arising from VLSI circuit design, Applied Math. -JCU, 1993, 38:218-235.[4]Liu Yanpei, Marchioro, P. , Petreschi, R., At most single bend embeddings of cubic graphs, Applied Math. -CJU, 1994,39: 127-142.[5]Liu Yanpei, Marchioro, P. , Petreschi, R. et al. , Theoretical results on at most 1-bend embeddability of graphs, Acta Math.Appl. Sinica, 1992, 8: 188-192.[6]Liu Yanpei, Morgana, A., Simeone, B., General theoretical results on rectilinear embeddability of graphs, Acta Math. Ap- pl. Simca, 1991, 7: 187-192.[7]Calamoneri, T., Petreschi, R., Liu Yanpei, Optimally Extending Bistandard Graphs on the Orthogonal Grid, ASCM2000 Symposium, Tailand, Dec.17-21, 2000.[8]Liu Yanpei, Morgana, A., Simeone, B., A graph partition problem, Acta Math. Appl. Sinica, 1996, 12: 393-400.[9]Liu Yanpei, Morgana, A. , Simeone, B. , A linear algorithm for 2-bend embeddings of planar graphs in the two dimensional grid, Discrete Appl. Math., 1998, 81: 69-91.[10]Liu Yanpei, Boolean approach to planar embeddings of a graph, Acta Math. Sinica, New Series, 1989, 5: 64-79.[11]Hammer, P. L., Liu Yanpei, Simeone, B., Boolean approaches to combinatorial optimization, J. Math. Res. Expos.,1990, 10: 300-312, 455-468, 619-628.[12]Liu Yanpei, Boolean planarity characterization of graphs, Acta Math. Sinica, New Series, 1988, 4: 316-329.[13]Liu Yanpei, Boolean characterizations of planarity and planar embeddings of graphs, Ann. O. R., 1990, 24: 165-174.

  19. High-energy heavy ion testing of VLSI devices for single event upsets and latch up

    Indian Academy of Sciences (India)

    S B Umesh; S R Kulkarni; R Sandhya; G R Joshi; R Damle; M Ravindra

    2005-08-01

    Several very large scale integrated (VLSI) devices which are not available in radiation hardened version are still required to be used in spacecraft systems. Thus these components need to be tested for highenergy heavy ion irradiation to find out their tolerance and suitability in specific space applications. This paper describes the high-energy heavy ion radiation testing of VLSI devices for single event upset (SEU) and single event latch up (SEL). The experimental set up employed to produce low flux of heavy ions viz. silicon (Si), and silver (Ag), for studying single event effects (SEE) is briefly described. The heavy ion testing of a few VLSI devices is performed in the general purpose scattering chamber of the Pelletron facility, available at Nuclear Science Centre, New Delhi. The test results with respect to SEU and SEL are discussed.

  20. Pruned Continuous Haar Transform of 2D Polygonal Patterns with Application to VLSI Layouts

    CERN Document Server

    Scheibler, Robin; Chebira, Amina

    2011-01-01

    We introduce an algorithm for the efficient computation of the continuous Haar transform of 2D patterns that can be described by polygons. These patterns are ubiquitous in VLSI processes where they are used to describe design and mask layouts. There, speed is of paramount importance due to the magnitude of the problems to be solved and hence very fast algorithms are needed. We show that by techniques borrowed from computational geometry we are not only able to compute the continuous Haar transform directly, but also to do it quickly. This is achieved by massively pruning the transform tree and thus dramatically decreasing the computational load when the number of vertices is small, as is the case for VLSI layouts. We call this new algorithm the pruned continuous Haar transform. We implement this algorithm and show that for patterns found in VLSI layouts the proposed algorithm was in the worst case as fast as its discrete counterpart and up to 12 times faster.

  1. Simulation Study on Quantum Capacitances of Graphene Nanoribbon VLSI Interconnects

    Science.gov (United States)

    Dutta, Arin; Rahman, Silvia; Nandy, Turja; Mahmood, Zahid Hasan

    2016-03-01

    In this paper, study on the capacitive effects of Graphene nanoribbon (GNR) in VLSI interconnect has been studied as a function of GNR width, Fermi function and gate voltage. The quantum capacitance of GNR has been simulated in terms of Fermi function for three different values of insulator thickness — 1.5nm, 2nm and 2.5nm. After that, quantum capacitance is studied in both degenerate and nondegenerate region with respect to Fermi function and gate voltage of range 1-5V. Then, the total capacitance of GNR is studied as a function of gate voltage of -2-5V range at degenerate and nondegenerate regions, where width of GNR is considered 4nm. Finally, the total capacitance of GNR is studied in both regions with varying GNR width, considering fixed gate voltage of 3V. After analyzing these simulations, it has been found that GNR in degenerate region shows nearly steady capacitance under a certain applied gate voltage.

  2. VLSI Implementation of Hybrid Algorithm Architecture for Speech Enhancement

    Directory of Open Access Journals (Sweden)

    Jigar Shah

    2012-07-01

    Full Text Available The speech enhancement techniques are required to improve the speech signal quality without causing any offshoot in many applications. Recently the growing use of cellular and mobile phones, hands free systems, VoIP phones, voice messaging service, call service centers etc. require efficient real time speech enhancement and detection strategies to make them superior over conventional speech communication systems. The speech enhancement algorithms are required to deal with additive noise and convolutive distortion that occur in any wireless communication system. Also the single channel (one microphone signal is available in real environments. Hence a single channel hybrid algorithm is used which combines minimum mean square error-log spectral amplitude (MMSE-LSA algorithm for additive noise removal and the relative spectral amplitude (RASTA algorithm for reverberation cancellation. The real time and embedded implementation on directly available DSP platforms like TMS320C6713 shows some defects. Hence the VLSI implementation using semi-custom (e.g. FPGA or full-custom approach is required. One such architecture is proposed in this paper.

  3. Efficient VLSI architecture of CAVLC decoder with power optimized

    Institute of Scientific and Technical Information of China (English)

    CHEN Guang-hua; HU Deng-ji; ZHANG Jin-yi; ZHENG Wei-feng; ZENG Wei-min

    2009-01-01

    This paper presents an efficient VLSI architecture of the contest-based adaptive variable length code (CAVLC) decoder with power optimized for the H.264/advanced video coding (AVC) standard. In the proposed design, according to the regularity of the codewords, the first one detector is used to solve the low efficiency and high power dissipation problem within the traditional method of table-searching. Considering the relevance of the data used in the process of runbefore's decoding,arithmetic operation is combined with finite state machine (FSM), which achieves higher decoding efficiency. According to the CAVLC decoding flow, clock gating is employed in the module level and the register level respectively, which reduces 43% of the overall dynamic power dissipation. The proposed design can decode every syntax element in one clock cycle. When the proposed design is synthesized at the clock constraint of 100 MHz, the synthesis result shows that the design costs 11 300gates under a 0.25 μm CMOS technology, which meets the demand of real time decoding in the H.264/AVC standard.

  4. A bioinspired collision detection algorithm for VLSI implementation

    Science.gov (United States)

    Cuadri, J.; Linan, G.; Stafford, R.; Keil, M. S.; Roca, E.

    2005-06-01

    In this paper a bioinspired algorithm for collision detection is proposed, based on previous models of the locust (Locusta migratoria) visual system reported by F.C. Rind and her group, in the University of Newcastle-upon-Tyne. The algorithm is suitable for VLSI implementation in standard CMOS technologies as a system-on-chip for automotive applications. The working principle of the algorithm is to process a video stream that represents the current scenario, and to fire an alarm whenever an object approaches on a collision course. Moreover, it establishes a scale of warning states, from no danger to collision alarm, depending on the activity detected in the current scenario. In the worst case, the minimum time before collision at which the model fires the collision alarm is 40 msec (1 frame before, at 25 frames per second). Since the average time to successfully fire an airbag system is 2 msec, even in the worst case, this algorithm would be very helpful to more efficiently arm the airbag system, or even take some kind of collision avoidance countermeasures. Furthermore, two additional modules have been included: a "Topological Feature Estimator" and an "Attention Focusing Algorithm". The former takes into account the shape of the approaching object to decide whether it is a person, a road line or a car. This helps to take more adequate countermeasures and to filter false alarms. The latter centres the processing power into the most active zones of the input frame, thus saving memory and processing time resources.

  5. High performance genetic algorithm for VLSI circuit partitioning

    Science.gov (United States)

    Dinu, Simona

    2016-12-01

    Partitioning is one of the biggest challenges in computer-aided design for VLSI circuits (very large-scale integrated circuits). This work address the min-cut balanced circuit partitioning problem- dividing the graph that models the circuit into almost equal sized k sub-graphs while minimizing the number of edges cut i.e. minimizing the number of edges connecting the sub-graphs. The problem may be formulated as a combinatorial optimization problem. Experimental studies in the literature have shown the problem to be NP-hard and thus it is important to design an efficient heuristic algorithm to solve it. The approach proposed in this study is a parallel implementation of a genetic algorithm, namely an island model. The information exchange between the evolving subpopulations is modeled using a fuzzy controller, which determines an optimal balance between exploration and exploitation of the solution space. The results of simulations show that the proposed algorithm outperforms the standard sequential genetic algorithm both in terms of solution quality and convergence speed. As a direction for future study, this research can be further extended to incorporate local search operators which should include problem-specific knowledge. In addition, the adaptive configuration of mutation and crossover rates is another guidance for future research.

  6. Parallel VLSI design for the fast -D DWT core algorithm

    Institute of Scientific and Technical Information of China (English)

    WEI Benjie; LIU Mingye; ZHOU Yihua; CHENG Baodong

    2007-01-01

    By studying the core algorithm of a three-dimensional discrete wavelet transform (3-D DWT) in depth,this Paper divides it into three one-dimensional discrete wavelet transforms (1-D DWTs).Based on the implementation of a 3-D DWT software,a parallel architecture design of a very large-scale integration(VLSI)is produced.It needs three dual-port random-access memory(RAM)to store the temporary results and transpose the matrix,then builds up a pipeline model composed of the three 1-D DWTs.In the design.the finite state machine(FSM)is used well to control the flow.Compared with the serial mode.the experimental results of the post synthesized simulation show that the design method is correct and effective.It can increase the processing speed by about 66%.work at 59 MHz,and meet the real-time needs of the video encoder.

  7. Systolic implementation of neural networks

    Energy Technology Data Exchange (ETDEWEB)

    De Groot, A.J.; Parker, S.R.

    1989-01-01

    The backpropagation algorithm for error gradient calculations in multilayer, feed-forward neural networks is derived in matrix form involving inner and outer products. It is demonstrated that these calculations can be carried out efficiently using systolic processing techniques, particularly using the SPRINT, a 64-element systolic processor developed at Lawrence Livermore National Laboratory. This machine contains one million synapses, and forward-propagates 12 million connections per second, using 100 watts of power. When executing the algorithm, each SPRINT processor performs useful work 97% of the time. The theory and applications are confirmed by some nontrivial examples involving seismic signal recognition. 4 refs., 7 figs.

  8. Enabling Future Robotic Missions with Multicore Processors

    Science.gov (United States)

    Powell, Wesley A.; Johnson, Michael A.; Wilmot, Jonathan; Some, Raphael; Gostelow, Kim P.; Reeves, Glenn; Doyle, Richard J.

    2011-01-01

    Recent commercial developments in multicore processors (e.g. Tilera, Clearspeed, HyperX) have provided an option for high performance embedded computing that rivals the performance attainable with FPGA-based reconfigurable computing architectures. Furthermore, these processors offer more straightforward and streamlined application development by allowing the use of conventional programming languages and software tools in lieu of hardware design languages such as VHDL and Verilog. With these advantages, multicore processors can significantly enhance the capabilities of future robotic space missions. This paper will discuss these benefits, along with onboard processing applications where multicore processing can offer advantages over existing or competing approaches. This paper will also discuss the key artchitecural features of current commercial multicore processors. In comparison to the current art, the features and advancements necessary for spaceflight multicore processors will be identified. These include power reduction, radiation hardening, inherent fault tolerance, and support for common spacecraft bus interfaces. Lastly, this paper will explore how multicore processors might evolve with advances in electronics technology and how avionics architectures might evolve once multicore processors are inserted into NASA robotic spacecraft.

  9. Circuit design of VLSI for microelectronic coordinate-sensitive detector for material element analysis

    Directory of Open Access Journals (Sweden)

    Sidorenko V. P.

    2012-08-01

    Full Text Available There has been designed, manufactured and tested a VLSI providing as a part of the microelectronic coordinate-sensitive detector the simultaneous elemental analysis of all the principles of the substance. VLSI ensures the amplifier-converter response on receiving of 1,6.10–13 С negative charge to its input. Response speed of the microcircuit is at least 3 MHz in the counting mode and more than 4 MHz in the counter information read-out mode. The power consumption of the microcircuit is no more than 7 mA.

  10. The GLUEchip: A custom VLSI chip for detectors readout and associative memories circuits

    Energy Technology Data Exchange (ETDEWEB)

    Amendolia, S.R. (Univ. of Sassari and INFN, Pisa (Italy)); Galeotti, S.; Morsani, F.; Passuello, D.; Ristori, L. (Univ. and Scuola Normale Superiore, Pisa (Italy). INFN); Sciacca, G. (Univ. and LNS, Catania (Italy)); Turini, N. (Univ. and INFN, Bologna (Italy))

    1993-08-01

    An associative memory full-custom VLSI chip for pattern recognition has been designed and tested in the past years. It's the AMchip, that contains 128 patterns of 60 bits each. To expand the pattern capacity of an Associative Memory bank, the custom VLSI GLUEchip has been developed. The GLUEchip allows the interconnection of up to 16 AMchips or up to 16 GLUEchips: the resulting tree-like structure works like a single AMchip with an output pipelined structure and a pattern capacity increased by a factor 16 for each GLUEchip used.

  11. Fast VLSI Implementation of Modular Inversion in Galois Field GF(p)

    Institute of Scientific and Technical Information of China (English)

    周涛; 吴行军; 白国强; 陈弘毅

    2003-01-01

    Modular inversion is one of the key arithmetic operations in public key cryptosystems, so low-cost, high-speed hardware implementation is absolutely necessary. This paper presents an algorithm for prime fields for hardware implementation. The algorithm involves only ordinary addition/subtraction and does not need any modular operations, multiplications or divisions. All of the arithmetic operations in the algorithm can be accomplished by only one adder, so it is very suitable for fast very large scale integration (VLSI) implementation. The VLSI implementation of the algorithm is also given with good performance and low silicon penalty.

  12. Digital VLSI design with Verilog a textbook from Silicon Valley Technical Institute

    CERN Document Server

    Williams, John

    2008-01-01

    This unique textbook is structured as a step-by-step course of study along the lines of a VLSI IC design project. In a nominal schedule of 12 weeks, two days and about 10 hours per week, the entire verilog language is presented, from the basics to everything necessary for synthesis of an entire 70,000 transistor, full-duplex serializer - deserializer, including synthesizable PLLs. Digital VLSI Design With Verilog is all an engineer needs for in-depth understanding of the verilog language: Syntax, synthesis semantics, simulation, and test. Complete solutions for the 27 labs are provided on the

  13. A Multiple—Valued Algebra for Modeling MOS VLSI Circuits at Switch—Level

    Institute of Scientific and Technical Information of China (English)

    胡谋

    1992-01-01

    A multiple-valued algebra for modeling MOS VLSI circuits at switch-level is proposed in this paper,Its structure and properties are studied.This algebra can be used to transform a MOS digital circuit to a swith-level algebraic expression so as to generate the truth table for the circuit and to derive a Boolean expression for it.In the paper,methods to construct a switch-level algebraic expression for a circuit and methods to simplify expressions are given.This algebra provides a new tool for MOS VLSI circuit design and analysis.

  14. Making CSB+-Tree Processor Conscious

    DEFF Research Database (Denmark)

    Samuel, Michael; Pedersen, Anders Uhl; Bonnet, Philippe

    2005-01-01

    Cache-conscious indexes, such as CSB+-tree, are sensitive to the underlying processor architecture. In this paper, we focus on how to adapt the CSB+-tree so that it performs well on a range of different processor architectures. Previous work has focused on the impact of node size on the performance...... of the CSB+-tree. We argue that it is necessary to consider a larger group of parameters in order to adapt CSB+-tree to processor architectures as different as Pentium and Itanium. We identify this group of parameters and study how it impacts the performance of CSB+-tree on Itanium 2. Finally, we propose...

  15. Processor arrays with asynchronous TDM optical buses

    Science.gov (United States)

    Li, Y.; Zheng, S. Q.

    1997-04-01

    We propose a pipelined asynchronous time division multiplexing optical bus. Such a bus can use one of the two hardwared priority schemes, the linear priority scheme and the round-robin priority scheme. Our simulation results show that the performances of our proposed buses are significantly better than the performances of known pipelined synchronous time division multiplexing optical buses. We also propose a class of processor arrays connected by pipelined asynchronous time division multiplexing optical buses. We claim that our proposed processor array not only have better performance, but also have better scalabilities than the existing processor arrays connected by pipelined synchronous time division multiplexing optical buses.

  16. Optically excited synapse for neural networks.

    Science.gov (United States)

    Boyd, G D

    1987-07-15

    What can optics with its promise of parallelism do for neural networks which require matrix multipliers? An all optical approach requires optical logic devices which are still in their infancy. An alternative is to retain electronic logic while optically addressing the synapse matrix. This paper considers several versions of an optically addressed neural network compatible with VLSI that could be fabricated with the synapse connection unspecified. This optical matrix multiplier circuit is compared to an all electronic matrix multiplier. For the optical version a synapse consisting of back-to-back photodiodes is found to have a suitable i-v characteristic for optical matrix multiplication (a linear region) plus a clipping or nonlinear region as required for neural networks. Four photodiodes per synapse are required. The strength of the synapse connection is controlled by the optical power and is thus an adjustable parameter. The synapse network can be programmed in various ways such as a shadow mask of metal, imaged mask (static), or light valve or an acoustooptic scanned laser beam or array of beams (dynamic). A milliwatt from LEDs or lasers is adequate power. The neuron has a linear transfer function and is either a summing amplifier, in which case the synapse signal is current, or an integrator, in which case the synapse signal is charge, the choice of which depends on the programming mode. Optical addressing and settling times of microseconds are anticipated. Electronic neural networks using single-value resistor synapses or single-bit programmable synapses have been demonstrated in the high-gain region of discrete single-value feedback. As an alternative to these networks and the above proposed optical synapses, an electronic analog-voltage vector matrix multiplier is considered using MOSFETS as the variable conductance in CMOS VLSI. It is concluded that a shadow mask addressed (static) optical neural network is promising.

  17. A 600-µW ultra-low-power associative processor for image pattern recognition employing magnetic tunnel junction-based nonvolatile memories with autonomic intelligent power-gating scheme

    Science.gov (United States)

    Ma, Yitao; Miura, Sadahiko; Honjo, Hiroaki; Ikeda, Shoji; Hanyu, Takahiro; Ohno, Hideo; Endoh, Tetsuo

    2016-04-01

    A novel associative processor using magnetic tunnel junction (MTJ)-based nonvolatile memories has been proposed and fabricated under a 90 nm CMOS/70 nm perpendicular-MTJ (p-MTJ) hybrid process for achieving the exceptionally low-power performance of image pattern recognition. A four-transistor 2-MTJ (4T-2MTJ) spin transfer torque magnetoresistive random access memory was adopted to completely eliminate the standby power. A self-directed intelligent power-gating (IPG) scheme specialized for this associative processor is employed to optimize the operation power by only autonomously activating currently accessed memory cells. The operations of a prototype chip at 20 MHz are demonstrated by measurement. The proposed processor can successfully carry out single texture pattern matching within 6.5 µs using 128-dimension bag-of-feature patterns, and the measured average operation power of the entire processor core is only 600 µW. Compared with the twin chip designed with 6T static random access memory, 91.2% power reductions are achieved. More than 88.0% power reductions are obtained compared with the latest associative memories. The further power performance analysis is discussed in detail, which verifies the special superiority of the proposed processor in power consumption for large-capacity memory-based VLSI systems.

  18. Concept of a Supervector Processor: A Vector Approach to Superscalar Processor, Design and Performance Analysis

    Directory of Open Access Journals (Sweden)

    Deepak Kumar, Ranjan Kumar Behera, K. S. Pandey

    2013-07-01

    Full Text Available To maximize the available performance is always a goal in microprocessor design. In this paper a new technique has been implemented which exploits the advantage of both superscalar and vector processing technique in a proposed processor called Supervector processor. Vector processor operates on array of data called vector and can greatly improve certain task such as numerical simulation and tasks which requires huge number crunching. On other handsuperscalar processor issues multiple instructions per cyclewhich can enhance the throughput. To implement parallelism multiple vector instructions were issued and executed per cycle in superscalar fashion. Case study has been done on various benchmarks to compare the performance of proposedsupervector processor architecture with superscalar and vectorprocessor architecture. Trimaran Framework has been used in order to evaluate the performance of the proposed supervector processor scheme.

  19. Photonics and Fiber Optics Processor Lab

    Data.gov (United States)

    Federal Laboratory Consortium — The Photonics and Fiber Optics Processor Lab develops, tests and evaluates high speed fiber optic network components as well as network protocols. In addition, this...

  20. Radiation Tolerant Software Defined Video Processor Project

    Data.gov (United States)

    National Aeronautics and Space Administration — MaXentric's is proposing a radiation tolerant Software Define Video Processor, codenamed SDVP, for the problem of advanced motion imaging in the space environment....

  1. Processor-Dependent Malware... and codes

    CERN Document Server

    Desnos, Anthony; Filiol, Eric

    2010-01-01

    Malware usually target computers according to their operating system. Thus we have Windows malwares, Linux malwares and so on ... In this paper, we consider a different approach and show on a technical basis how easily malware can recognize and target systems selectively, according to the onboard processor chip. This technology is very easy to build since it does not rely on deep analysis of chip logical gates architecture. Floating Point Arithmetic (FPA) looks promising to define a set of tests to identify the processor or, more precisely, a subset of possible processors. We give results for different families of processors: AMD, Intel (Dual Core, Atom), Sparc, Digital Alpha, Cell, Atom ... As a conclusion, we propose two {\\it open problems} that are new, to the authors' knowledge.

  2. Critical review of programmable media processor architectures

    Science.gov (United States)

    Berg, Stefan G.; Sun, Weiyun; Kim, Donglok; Kim, Yongmin

    1998-12-01

    In the past several years, there has been a surge of new programmable mediaprocessors introduced to provide an alternative solution to ASICs and dedicated hardware circuitries in the multimedia PC and embedded consumer electronics markets. These processors attempt to combine the programmability of multimedia-enhanced general purpose processors with the performance and low cost of dedicated hardware. We have reviewed five current multimedia architectures and evaluated their strengths and weaknesses.

  3. First Cluster Algorithm Special Purpose Processor

    Science.gov (United States)

    Talapov, A. L.; Andreichenko, V. B.; Dotsenko S., Vi.; Shchur, L. N.

    We describe the architecture of the special purpose processor built to realize in hardware cluster Wolff algorithm, which is not hampered by a critical slowing down. The processor simulates two-dimensional Ising-like spin systems. With minor changes the same very effective architecture, which can be defined as a Memory Machine, can be used to study phase transitions in a wide range of models in two or three dimensions.

  4. A New Echeloned Poisson Series Processor (EPSP)

    Science.gov (United States)

    Ivanova, Tamara

    2001-07-01

    A specialized Echeloned Poisson Series Processor (EPSP) is proposed. It is a typical software for the implementation of analytical algorithms of Celestial Mechanics. EPSP is designed for manipulating long polynomial-trigonometric series with literal divisors. The coefficients of these echeloned series are the rational or floating-point numbers. The Keplerian processor and analytical generator of special celestial mechanics functions based on the EPSP are also developed.

  5. Evaluating current processors performance and machines stability

    CERN Document Server

    Esposito, R; Tortone, G; Taurino, F M

    2003-01-01

    Accurately estimate performance of currently available processors is becoming a key activity, particularly in HENP environment, where high computing power is crucial. This document describes the methods and programs, opensource or freeware, used to benchmark processors, memory and disk subsystems and network connection architectures. These tools are also useful to stress test new machines, before their acquisition or before their introduction in a production environment, where high uptimes are requested.

  6. A fast lightstripe rangefinding system with smart VLSI sensor

    Science.gov (United States)

    Gruss, Andrew; Carley, L. Richard; Kanade, Takeo

    1989-01-01

    The focus of the research is to build a compact, high performance lightstripe rangefinder using a Very Large Scale Integration (VLSI) smart photosensor array. Rangefinding, the measurement of the three-dimensional profile of an object or scene, is a critical component for many robotic applications, and therefore many techniques were developed. Of these, lightstripe rangefinding is one of the most widely used and reliable techniques available. Though practical, the speed of sampling range data by the conventional light stripe technique is severely limited. A conventional light stripe rangefinder operates in a step-and-repeat manner. A stripe source is projected on an object, a video image is acquired, range data is extracted from the image, the stripe is stepped, and the process repeats. Range acquisition is limited by the time needed to grab the video images, increasing linearly with the desired horizontal resolution. During the acquisition of a range image, the objects in the scene being scanned must be stationary. Thus, the long scene sampling time of step-and-repeat rangefinders limits their application. The fast range sensor proposed is based on the modification of this basic lightstripe ranging technique in a manner described by Sato and Kida. This technique does not require a sampling of images at various stripe positions to build a range map. Rather, an entire range image is acquired in parallel while the stripe source is swept continuously across the scene. Total time to acquire the range image data is independent of the range map resolution. The target rangefinding system will acquire 1,000 100 x 100 point range images per second with 0.5 percent range accuracy. It will be compact and rugged enough to be mounted on the end effector of a robot arm to aid in object manipulation and assembly tasks.

  7. SMART AS A CRYPTOGRAPHIC PROCESSOR

    Directory of Open Access Journals (Sweden)

    Saroja Kanchi

    2016-05-01

    Full Text Available SMaRT is a 16-bit 2.5-address RISC-type single-cycle processor, which was recently designed and successfully mapped into a FPGA chip in our ECE department. In this paper, we use SMaRT to run the well-known encryption algorithm, Data Encryption Standard. For information security purposes, encryption is a must in today’s sophisticated and ever-increasing computer communications such as ATM machines and SIM cards. For comparison and evaluation purposes, we also map the same algorithm on the HC12, a same-size but CISC-type off-the-shelf microcontroller, Our results show that compared to HC12, SMaRT code is only 14% longer in terms of the static number of instructions but about 10 times faster in terms of the number of clock cycles, and 7% smaller in terms of code size. Our results also show that 2.5- address instructions, a SMaRT selling point, amount to 45% of the whole R-type instructions resulting in significant improvement in static number of instructions hence code size as well as performance. Additionally, we see that the SMaRT short-branch range is sufficiently wide in 90% of cases in the SMaRT code. Our results also reveal that the SMaRT novel concept of locality of reference in using the MSBs of the registers in non-subroutine branch instructions stays valid with a remarkable hit rate of 95%!

  8. 3-D VLSI Architecture Implementation for Data Fusion Problems

    Science.gov (United States)

    Duong, T.; Weldon, D.; Thomas, T.

    1999-01-01

    This paper gives an overview of hardware implementation techniques employed in solving real-time classification problems using Neural Network, Principle Component Analysis (PCA), and Independent Component Analysis (ICA) techniques.

  9. Percola: a special purpose programmable 64-BIT floating-point processor

    Energy Technology Data Exchange (ETDEWEB)

    Normand, J.M.

    1988-01-01

    The computer PERCOLA is designed for lengthy numerical simulations on a percolation problem in Statistical Mechanics of disordered media. The project that the computer is engaged on at present is intended to improve the true values of critical indices characteristic of the behaviour of electrical conductivity at percolation threshold in a system of random resistors. The architecture is based on an efficient highly iterative algorithm to compute the electrical conductivity of random resistor networks. This computer runs programs of percolation problems considered 10 percent faster than the Cray X-MP with the same 64-bit floating-point precision. Operating since May 1987, months of calculation have already been performed. Although best suited to a class of algorithms, the processor includes a powerful 32-bit integer random number generator and has many characteristics of general purpose 64-bit floating-point microprogrammable computers that can run programs for various type of problems with a peak performance of more than 25 Mflops. This high computing speed is not the result of one all-powerful feature, but a balance among several factors including supercomputer derived features such as mainly: concurrent functional units separately controlled from independent microcode fields, distinct buses for data, addresses and instructions, flexible and powerful address generators for matrix processing and an advanced pipeline design implementing high performance VLSI Weitek's and Analog Devices components.

  10. DESIGN OF A RECONFIGURABLE DSP PROCESSOR WITH BIT EFFICIENT RESIDUE NUMBER SYSTEM

    Directory of Open Access Journals (Sweden)

    Chaitali Biswas Dutta

    2012-10-01

    Full Text Available Residue Number System (RNS, which originates from the Chinese Remainder Theorem, offers a promising future in VLSI because of its carry-free operations in addition, subtraction and multiplication. This property of RNS is very helpful to reduce the complexity of calculation in many applications. A residue number system represents a large integer using a set of smaller integers, called residues. But the area overhead, cost and speed not only depend on this word length, but also the selection of moduli, which is a very crucial step for residue system. This parameter determines bit efficiency, area, frequency etc. In this paper a new moduli set selection technique is proposed to improve bit efficiency which can be used to construct a residue system for digital signal processing environment. Subsequently, it is theoretically proved and illustrated using examples, that the proposed solution gives better results than the schemes reported in the literature. The novelty of the architecture is shown by comparison the different schemes reported in the literature. Using the novel moduli set, a guideline for a Reconfigurable Processor is presented here that can process some predefined functions. As RNS minimizes the carry propagation, the scheme can be implemented in Real Time Signal Processing & other fields where high speed computations are required.

  11. Fully-depleted silicon-on-sapphire and its application to advanced VLSI design

    Science.gov (United States)

    Offord, Bruce W.

    1992-01-01

    In addition to the widely recognized advantages of full dielectric isolation, e.g., reduced parasitic capacitance, transient radiation hardness, and processing simplicity, fully-depleted silicon-on-sapphire offers reduced floating body effects and improved thermal characteristics when compared to other silicon-on-insulator technologies. The properties of this technology and its potential impact on advanced VLSI circuitry will be discussed.

  12. VLSI chip-set for data compression using the Rice algorithm

    Science.gov (United States)

    Venbrux, J.; Liu, N.

    1990-01-01

    A full custom VLSI implementation of a data compression encoder and decoder which implements the lossless Rice data compression algorithm is discussed in this paper. The encoder and decoder reside on single chips. The data rates are to be 5 and 10 Mega-samples-per-second for the decoder and encoder respectively.

  13. Synthesis of on-chip control circuits for mVLSI biochips

    DEFF Research Database (Denmark)

    Potluri, Seetal; Schneider, Alexander Rüdiger; Hørslev-Petersen, Martin

    2017-01-01

    them to laboratory environments. To address this issue, researchers have proposed methods to reduce the number of offchip pressure sources, through integration of on-chip pneumatic control logic circuits fabricated using three-layer monolithic membrane valve technology. Traditionally, mVLSI biochip...... applied to generate biochip layouts with integrated on-chip pneumatic control....

  14. A VLSI analog pipeline read-out for electrode segmented ionization chambers

    CERN Document Server

    Bonazzola, G C; Cirio, R; Donetti, M; Figus, M; Marchetto, F; Peroni, C; Pernigotti, E; Thénard, J M; Zampieri, A

    1999-01-01

    We report on the design and test of a 32-channel VLSI chip based on the analog pipeline memory concept. The charge from a strip of a ionization chamber, is stored as a function of time in a switched capacitor array. The cell reading can be done in parallel with the writing.

  15. VLSI Technology: Impact and Promise. Identifying Emerging Issues and Trends in Technology for Special Education.

    Science.gov (United States)

    Bayoumi, Magdy

    As part of a 3-year study to identify emerging issues and trends in technology for special education, this paper addresses the implications of very large scale integrated (VLSI) technology. The first section reviews the development of educational technology, particularly microelectronics technology, from the 1950s to the present. The implications…

  16. A VLSI analog pipeline read-out for electrode segmented ionization chambers

    Science.gov (United States)

    Bonazzola, G. C.; Bouvier, S.; Cirio, R.; Donetti, M.; Figus, M.; Marchetto, F.; Peroni, C.; Pernigotti, E.; Thenard, J. M.; Zampieri, A.

    1999-05-01

    We report on the design and test of a 32-channel VLSI chip based on the analog pipeline memory concept. The charge from a strip of a ionization chamber, is stored as a function of time in a switched capacitor array. The cell reading can be done in parallel with the writing.

  17. A Proposal for Energy-Efficient Cellular Neural Network based on Spintronic Devices

    OpenAIRE

    2016-01-01

    Due to the massive parallel computing capability and outstanding image and signal processing performance, cellular neural network (CNN) is one promising type of non-Boolean computing system that can outperform the traditional digital logic computation and mitigate the physical scaling limit of the conventional CMOS technology. The CNN was originally implemented by VLSI analog technologies with operational amplifiers and operational transconductance amplifiers as neurons and synapses, respecti...

  18. Neural networks: Application to medical imaging

    Science.gov (United States)

    Clarke, Laurence P.

    1994-01-01

    The research mission is the development of computer assisted diagnostic (CAD) methods for improved diagnosis of medical images including digital x-ray sensors and tomographic imaging modalities. The CAD algorithms include advanced methods for adaptive nonlinear filters for image noise suppression, hybrid wavelet methods for feature segmentation and enhancement, and high convergence neural networks for feature detection and VLSI implementation of neural networks for real time analysis. Other missions include (1) implementation of CAD methods on hospital based picture archiving computer systems (PACS) and information networks for central and remote diagnosis and (2) collaboration with defense and medical industry, NASA, and federal laboratories in the area of dual use technology conversion from defense or aerospace to medicine.

  19. IDSP- INTERACTIVE DIGITAL SIGNAL PROCESSOR

    Science.gov (United States)

    Mish, W. H.

    1994-01-01

    The Interactive Digital Signal Processor, IDSP, consists of a set of time series analysis "operators" based on the various algorithms commonly used for digital signal analysis work. The processing of a digital time series to extract information is usually achieved by the application of a number of fairly standard operations. However, it is often desirable to "experiment" with various operations and combinations of operations to explore their effect on the results. IDSP is designed to provide an interactive and easy-to-use system for this type of digital time series analysis. The IDSP operators can be applied in any sensible order (even recursively), and can be applied to single time series or to simultaneous time series. IDSP is being used extensively to process data obtained from scientific instruments onboard spacecraft. It is also an excellent teaching tool for demonstrating the application of time series operators to artificially-generated signals. IDSP currently includes over 43 standard operators. Processing operators provide for Fourier transformation operations, design and application of digital filters, and Eigenvalue analysis. Additional support operators provide for data editing, display of information, graphical output, and batch operation. User-developed operators can be easily interfaced with the system to provide for expansion and experimentation. Each operator application generates one or more output files from an input file. The processing of a file can involve many operators in a complex application. IDSP maintains historical information as an integral part of each file so that the user can display the operator history of the file at any time during an interactive analysis. IDSP is written in VAX FORTRAN 77 for interactive or batch execution and has been implemented on a DEC VAX-11/780 operating under VMS. The IDSP system generates graphics output for a variety of graphics systems. The program requires the use of Versaplot and Template plotting

  20. A CMOS VLSI IC for real-time opto-electronic two-dimensional histogram generation

    Science.gov (United States)

    Richstein, James K.

    1993-12-01

    Histogram generation, a standard image processing operation, is a record of the intensity distribution in the image. Histogram generation has straightforward implementations on digital computers using high level languages. A prototype of an optical-electronic histogram generator was designed and tested for 1-D objects using wirewrapped MSI TTL components. The system has shown to be fairly modular in design. The aspects of the extension to two dimensions and the VLSI implementation of this design are discussed. In this paper, we report a VLSI design to be used in a two-dimensional real-time histogram generation scheme. The overall system design is such that the electronic signal obtained from the optically scanned two-dimensional semi-opaque image is processed and displayed within a period of one cycle of the scanning process. Specifically, in the VLSI implementation of the two-dimensional histogram generator, modifications were made to the original design. For the two-dimensional application, the output controller was analyzed as a finite state machine. The process used to describe the required timing signals and translate them to a VLSI finite state machine using Computer Aided Design Tools is discussed. In addition, the circuitry for sampling, binning, and display were combined with the timing circuitry on one IC. In the original design, the pulse width of the electronically sampled photodetector is limited with an analog one-shot. The high sampling rates associated with the extension to two dimensions requires significant reduction in the original 1-D prototype's sample pulse width of approximately 75 ns. The alternate design using VLSI logic gates will provide one-shot pulse widths of approximately 3 ns.

  1. Using Beowulf clusters to speed up neural simulations.

    Science.gov (United States)

    Smith, Leslie S.

    2002-06-01

    Simulation of large neural systems on PCs requires large amounts of memory, and takes a long time. Parallel computers can speed them up. A new form of parallel computer, the Beowulf cluster, is an affordable version. Event-driven simulation and processor farming are two ways of exploiting this parallelism in neural simulations.

  2. Relaxed fault-tolerant hardware implementation of neural networks in the presence of multiple transient errors.

    Science.gov (United States)

    Mahdiani, Hamid Reza; Fakhraie, Sied Mehdi; Lucas, Caro

    2012-08-01

    Reliability should be identified as the most important challenge in future nano-scale very large scale integration (VLSI) implementation technologies for the development of complex integrated systems. Normally, fault tolerance (FT) in a conventional system is achieved by increasing its redundancy, which also implies higher implementation costs and lower performance that sometimes makes it even infeasible. In contrast to custom approaches, a new class of applications is categorized in this paper, which is inherently capable of absorbing some degrees of vulnerability and providing FT based on their natural properties. Neural networks are good indicators of imprecision-tolerant applications. We have also proposed a new class of FT techniques called relaxed fault-tolerant (RFT) techniques which are developed for VLSI implementation of imprecision-tolerant applications. The main advantage of RFT techniques with respect to traditional FT solutions is that they exploit inherent FT of different applications to reduce their implementation costs while improving their performance. To show the applicability as well as the efficiency of the RFT method, the experimental results for implementation of a face-recognition computationally intensive neural network and its corresponding RFT realization are presented in this paper. The results demonstrate promising higher performance of artificial neural network VLSI solutions for complex applications in faulty nano-scale implementation environments.

  3. CMOS VLSI Active-Pixel Sensor for Tracking

    Science.gov (United States)

    Pain, Bedabrata; Sun, Chao; Yang, Guang; Heynssens, Julie

    2004-01-01

    An architecture for a proposed active-pixel sensor (APS) and a design to implement the architecture in a complementary metal oxide semiconductor (CMOS) very-large-scale integrated (VLSI) circuit provide for some advanced features that are expected to be especially desirable for tracking pointlike features of stars. The architecture would also make this APS suitable for robotic- vision and general pointing and tracking applications. CMOS imagers in general are well suited for pointing and tracking because they can be configured for random access to selected pixels and to provide readout from windows of interest within their fields of view. However, until now, the architectures of CMOS imagers have not supported multiwindow operation or low-noise data collection. Moreover, smearing and motion artifacts in collected images have made prior CMOS imagers unsuitable for tracking applications. The proposed CMOS imager (see figure) would include an array of 1,024 by 1,024 pixels containing high-performance photodiode-based APS circuitry. The pixel pitch would be 9 m. The operations of the pixel circuits would be sequenced and otherwise controlled by an on-chip timing and control block, which would enable the collection of image data, during a single frame period, from either the full frame (that is, all 1,024 1,024 pixels) or from within as many as 8 different arbitrarily placed windows as large as 8 by 8 pixels each. A typical prior CMOS APS operates in a row-at-a-time ( grolling-shutter h) readout mode, which gives rise to exposure skew. In contrast, the proposed APS would operate in a sample-first/readlater mode, suppressing rolling-shutter effects. In this mode, the analog readout signals from the pixels corresponding to the windows of the interest (which windows, in the star-tracking application, would presumably contain guide stars) would be sampled rapidly by routing them through a programmable diagonal switch array to an on-chip parallel analog memory array. The

  4. Advanced plasma etching processes for dielectric materials in VLSI technology

    Science.gov (United States)

    Wang, Juan Juan

    Manufacturable plasma etching processes for dielectric materials have played an important role in the Integrated Circuits (IC) industry in recent decades. Dielectric materials such as SiO2 and SiN are widely used to electrically isolate the active device regions (like the gate, source and drain from the first level of metallic interconnects) and to isolate different metallic interconnect levels from each other. However, development of new state-of-the-art etching processes is urgently needed for higher aspect ratio (oxide depth/hole diameter---6:1) in Very Large Scale Integrated (VLSI) circuits technology. The smaller features can provide greater packing density of devices on a single chip and greater number of chips on a single wafer. This dissertation focuses on understanding and optimizing of several key aspects of etching processes for dielectric materials. The challenges are how to get higher selectivity of oxide/Si for contact and oxide/TiN for vias; tight Critical Dimension (CD) control; wide process margin (enough over-etch); uniformity and repeatability. By exploring all of the parameters for the plasma etch process, the key variables are found and studied extensively. The parameters investigated here are Power, Pressure, Gas ratio, and Temperature. In particular, the novel gases such as C4F8, C5F8, and C4F6 were studied in order to meet the requirements of the design rules. We also studied CF4 that is used frequently for dielectric material etching in the industry. Advanced etch equipment was used for the above applications: the medium-density plasma tools (like Magnet-Enhanced Reactive Ion Etching (MERIE) tool) and the high-density plasma tools. By applying the Design of Experiments (DOE) method, we found the key factors needed to predict the trend of the etch process (such as how to increase the etch rates, selectivity, etc.; and how to control the stability of the etch process). We used JMP software to analyze the DOE data. The characterization of the

  5. VLSI circuits implementing computational models of neocortical circuits.

    Science.gov (United States)

    Wijekoon, Jayawan H B; Dudek, Piotr

    2012-09-15

    This paper overviews the design and implementation of three neuromorphic integrated circuits developed for the COLAMN ("Novel Computing Architecture for Cognitive Systems based on the Laminar Microcircuitry of the Neocortex") project. The circuits are implemented in a standard 0.35 μm CMOS technology and include spiking and bursting neuron models, and synapses with short-term (facilitating/depressing) and long-term (STDP and dopamine-modulated STDP) dynamics. They enable execution of complex nonlinear models in accelerated-time, as compared with biology, and with low power consumption. The neural dynamics are implemented using analogue circuit techniques, with digital asynchronous event-based input and output. The circuits provide configurable hardware blocks that can be used to simulate a variety of neural networks. The paper presents experimental results obtained from the fabricated devices, and discusses the advantages and disadvantages of the analogue circuit approach to computational neural modelling.

  6. Design and implementation of multipattern generators in analog VLSI.

    Science.gov (United States)

    Kier, Ryan J; Ames, Jeffrey C; Beer, Randall D; Harrison, Reid R

    2006-07-01

    In recent years, computational biologists have shown through simulation that small neural networks with fixed connectivity are capable of producing multiple output rhythms in response to transient inputs. It is believed that such networks may play a key role in certain biological behaviors such as dynamic gait control. In this paper, we present a novel method for designing continuous-time recurrent neural networks (CTRNNs) that contain multiple embedded limit cycles, and we show that it is possible to switch the networks between these embedded limit cycles with simple transient inputs. We also describe the design and testing of a fully integrated four-neuron CTRNN chip that is used to implement the neural network pattern generators. We provide two example multipattern generators and show that the measured waveforms from the chip agree well with numerical simulations.

  7. Real time processor for array speckle interferometry

    Science.gov (United States)

    Chin, Gordon; Florez, Jose; Borelli, Renan; Fong, Wai; Miko, Joseph; Trujillo, Carlos

    1989-01-01

    The authors are constructing a real-time processor to acquire image frames, perform array flat-fielding, execute a 64 x 64 element two-dimensional complex FFT (fast Fourier transform) and average the power spectrum, all within the 25 ms coherence time for speckles at near-IR (infrared) wavelength. The processor will be a compact unit controlled by a PC with real-time display and data storage capability. This will provide the ability to optimize observations and obtain results on the telescope rather than waiting several weeks before the data can be analyzed and viewed with offline methods. The image acquisition and processing, design criteria, and processor architecture are described.

  8. Making CSB+-Tree Processor Conscious

    DEFF Research Database (Denmark)

    Samuel, Michael; Pedersen, Anders Uhl; Bonnet, Philippe

    2005-01-01

    Cache-conscious indexes, such as CSB+-tree, are sensitive to the underlying processor architecture. In this paper, we focus on how to adapt the CSB+-tree so that it performs well on a range of different processor architectures. Previous work has focused on the impact of node size on the performance...... of the CSB+-tree. We argue that it is necessary to consider a larger group of parameters in order to adapt CSB+-tree to processor architectures as different as Pentium and Itanium. We identify this group of parameters and study how it impacts the performance of CSB+-tree on Itanium 2. Finally, we propose...... a systematic method for adapting CSB+-tree to new platforms. This work is a first step towards integrating CSB+-tree in MySQL’s heap storage manager....

  9. Dynamic Load Balancing using Graphics Processors

    Directory of Open Access Journals (Sweden)

    R Mohan

    2014-04-01

    Full Text Available To get maximum performance on the many-core graphics processors, it is important to have an even balance of the workload so that all processing units contribute equally to the task at hand. This can be hard to achieve when the cost of a task is not known beforehand and when new sub-tasks are created dynamically during execution. Both the dynamic load balancing methods using Static task assignment and work stealing using deques are compared to see which one is more suited to the highly parallel world of graphics processors. They have been evaluated on the task of simulating a computer move against the human move, in the famous four in a row game. The experiments showed that synchronization can be very expensive, and those new methods which use graphics processor features wisely might be required.

  10. Intrusion Detection Architecture Utilizing Graphics Processors

    Directory of Open Access Journals (Sweden)

    Branislav Madoš

    2012-12-01

    Full Text Available With the thriving technology and the great increase in the usage of computer networks, the risk of having these network to be under attacks have been increased. Number of techniques have been created and designed to help in detecting and/or preventing such attacks. One common technique is the use of Intrusion Detection Systems (IDS. Today, number of open sources and commercial IDS are available to match enterprises requirements. However, the performance of these systems is still the main concern. This paper examines perceptions of intrusion detection architecture implementation, resulting from the use of graphics processor. It discusses recent research activities, developments and problems of operating systems security. Some exploratory evidence is presented that shows capabilities of using graphical processors and intrusion detection systems. The focus is on how knowledge experienced throughout the graphics processor inclusion has played out in the design of intrusion detection architecture that is seen as an opportunity to strengthen research expertise.

  11. ETHERNET PACKET PROCESSOR FOR SOC APPLICATION

    Directory of Open Access Journals (Sweden)

    Raja Jitendra Nayaka

    2012-07-01

    Full Text Available As the demand for Internet expands significantly in numbers of users, servers, IP addresses, switches and routers, the IP based network architecture must evolve and change. The design of domain specific processors that require high performance, low power and high degree of programmability is the bottleneck in many processor based applications. This paper describes the design of ethernet packet processor for system-on-chip (SoC which performs all core packet processing functions, including segmentation and reassembly, packetization classification, route and queue management which will speedup switching/routing performance. Our design has been configured for use with multiple projects ttargeted to a commercial configurable logic device the system is designed to support 10/100/1000 links with a speed advantage. VHDL has been used to implement and simulated the required functions in FPGA.

  12. Programmable DNA-mediated multitasking processor

    CERN Document Server

    Shu, Jian-Jun; Yong, Kian-Yan; Shao, Fangwei; Lee, Kee Jin

    2015-01-01

    Because of DNA appealing features as perfect material, including minuscule size, defined structural repeat and rigidity, programmable DNA-mediated processing is a promising computing paradigm, which employs DNAs as information storing and processing substrates to tackle the computational problems. The massive parallelism of DNA hybridization exhibits transcendent potential to improve multitasking capabilities and yield a tremendous speed-up over the conventional electronic processors with stepwise signal cascade. As an example of multitasking capability, we present an in vitro programmable DNA-mediated optimal route planning processor as a functional unit embedded in contemporary navigation systems. The novel programmable DNA-mediated processor has several advantages over the existing silicon-mediated methods, such as conducting massive data storage and simultaneous processing via much fewer materials than conventional silicon devices.

  13. SWIFT Privacy: Data Processor Becomes Data Controller

    Directory of Open Access Journals (Sweden)

    Edwin Jacobs

    2007-04-01

    Full Text Available Last month, SWIFT emphasised the urgent need for a solution to compliance with US Treasury subpoenas that provides legal certainty for the financial industry as well as for SWIFT. SWIFT will continue its activities to adhere to the Safe Harbor framework of the European data privacy legislation. Safe Harbor is a framework negotiated by the EU and US in 2000 to provide a way for companies in Europe, with operations in the US, to conform to EU data privacy regulations. This seems to conclude a complex privacy case, widely covered by the US and European media. A fundamental question in this case was who is a data controller and who is a mere data processor. Both the Belgian and the European privacy authorities considered SWIFT, jointly with the banks, as a data controller whereas SWIFT had considered itself as a mere data processor that processed financial data for banks. The difference between controller and processor has far reaching consequences.

  14. An efficient VLSI implementation of H.264/AVC entropy decoder

    Institute of Scientific and Technical Information of China (English)

    Jongsik; PARK; Jeonhak; MOON; Seongsoo; LEE

    2010-01-01

    <正>This paper proposes an efficient H.264/AVC entropy decoder.It requires no ROM/RAM fabrication process that decreases fabrication cost and increases operation speed.It was achieved by optimizing lookup tables and internal buffers,which significantly improves area,speed,and power.The proposed entropy decoder does not exploit embedded processor for bitstream manipulation, which also improves area,speed,and power.Its gate counts and maximum operation frequency are 77515 gates and 175MHz in 0.18um fabrication process,respectively.The proposed entropy decoder needs 2303 cycles in average for one macroblock decoding.It can run at 28MHz to meet the real-time processing requirement for CIF format video decoding on mobile applications.

  15. Efficient SIMD optimization for media processors

    Institute of Scientific and Technical Information of China (English)

    Jian-peng ZHOU; Ce SHI

    2008-01-01

    Single instruction multiple data (SIMD) instructions are often implemented in modem media processors. Although SIMD instructions are useful in multimedia applications, most compilers do not have good support for SIMD instructions. This paper focuses on SIMD instructions generation for media processors. We present an efficient code optimization approach that is integrated into a retargetable C compiler. SIMD instructions are generated by finding and combining the same operations in programs. Experimental results for the UltraSPARC VIS instruction set show that a speedup factor up to 2.639 is obtained.

  16. 8 Bit RISC Processor Using Verilog HDL

    Directory of Open Access Journals (Sweden)

    Ramandeep Kaur

    2014-03-01

    Full Text Available RISC is a design philosophy to reduce the complexity of instruction set that in turn reduces the amount of space, cycle time, cost and other parameters taken into account during the implementation of the design. The advent of FPGA has enabled the complex logical systems to be implemented on FPGA. The intent of this paper is to design and implement 8 bit RISC processor using FPGA Spartan 3E tool. This processor design depends upon design specification, analysis and simulation. It takes into consideration very simple instruction set. The momentous components include Control unit, ALU, shift registers and accumulator register.

  17. Models of Communication for Multicore Processors

    DEFF Research Database (Denmark)

    Schoeberl, Martin; Sørensen, Rasmus Bo; Sparsø, Jens

    2015-01-01

    To efficiently use multicore processors we need to ensure that almost all data communication stays on chip, i.e., the bits moved between tasks executing on different processor cores do not leave the chip. Different forms of on-chip communication are supported by different hardware mechanism, e.......g., shared caches with cache coherency protocols, core-to-core networks-on-chip, and shared scratchpad memories. In this paper we explore the different hardware mechanism for on-chip communication and how they support or favor different models of communication. Furthermore, we discuss the usability...... of the different models of communication for real-time systems....

  18. SPROC: A multiple-processor DSP IC

    Science.gov (United States)

    Davis, R.

    1991-01-01

    A large, single-chip, multiple-processor, digital signal processing (DSP) integrated circuit (IC) fabricated in HP-Cmos34 is presented. The innovative architecture is best suited for analog and real-time systems characterized by both parallel signal data flows and concurrent logic processing. The IC is supported by a powerful development system that transforms graphical signal flow graphs into production-ready systems in minutes. Automatic compiler partitioning of tasks among four on-chip processors gives the IC the signal processing power of several conventional DSP chips.

  19. Models of Communication for Multicore Processors

    DEFF Research Database (Denmark)

    Schoeberl, Martin; Sørensen, Rasmus Bo; Sparsø, Jens

    2015-01-01

    To efficiently use multicore processors we need to ensure that almost all data communication stays on chip, i.e., the bits moved between tasks executing on different processor cores do not leave the chip. Different forms of on-chip communication are supported by different hardware mechanism, e.......g., shared caches with cache coherency protocols, core-to-core networks-on-chip, and shared scratchpad memories. In this paper we explore the different hardware mechanism for on-chip communication and how they support or favor different models of communication. Furthermore, we discuss the usability...... of the different models of communication for real-time systems....

  20. Multi-core processors - An overview

    CERN Document Server

    Venu, Balaji

    2011-01-01

    Microprocessors have revolutionized the world we live in and continuous efforts are being made to manufacture not only faster chips but also smarter ones. A number of techniques such as data level parallelism, instruction level parallelism and hyper threading (Intel's HT) already exists which have dramatically improved the performance of microprocessor cores. This paper briefs on evolution of multi-core processors followed by introducing the technology and its advantages in today's world. The paper concludes by detailing on the challenges currently faced by multi-core processors and how the industry is trying to address these issues.

  1. Comparison of Processor Performance of SPECint2006 Benchmarks of some Intel Xeon Processors

    Directory of Open Access Journals (Sweden)

    Abdul Kareem PARCHUR

    2012-08-01

    Full Text Available High performance is a critical requirement to all microprocessors manufacturers. The present paper describes the comparison of performance in two main Intel Xeon series processors (Type A: Intel Xeon X5260, X5460, E5450 and L5320 and Type B: Intel Xeon X5140, 5130, 5120 and E5310. The microarchitecture of these processors is implemented using the basis of a new family of processors from Intel starting with the Pentium 4 processor. These processors can provide a performance boost for many key application areas in modern generation. The scaling of performance in two major series of Intel Xeon processors (Type A: Intel Xeon X5260, X5460, E5450 and L5320 and Type B: Intel Xeon X5140, 5130, 5120 and E5310 has been analyzed using the performance numbers of 12 CPU2006 integer benchmarks, performance numbers that exhibit significant differences in performance. The results and analysis can be used by performance engineers, scientists and developers to better understand the performance scaling in modern generation processors.

  2. Inter Processor Communication for Fault Diagnosis in Multiprocessor Systems

    Directory of Open Access Journals (Sweden)

    C. D. Malleswar

    1994-04-01

    Full Text Available In the preseJlt paper a simple technique is proposed for fault diagnosis for multiprocessor and multiple system environments, wherein all microprocessors in the system are used in part to check the health of their neighbouring processors. It involves building simple fail-safe serial communication links between processors. Processors communicate with each other over these links and each processor is made to go through certain sequences of actions intended for diagnosis, under the observation of another processor .With limited overheads, fault detection can be done by this method. Also outlined are some of the popular techniques used for health check of processor-based systems.

  3. 超大规模集成电路可调试性设计综述%Survey of Design-for-Debug of VLSI

    Institute of Scientific and Technical Information of China (English)

    钱诚; 沈海华; 陈天石; 陈云霁

    2012-01-01

    随着硬件复杂度的不断提高和并行软件调试的需求不断增长,可调试性设计已经成为集成电路设计中的重要内容.一方面,仅靠传统的硅前验证已经无法保证现代超大规模复杂集成电路设计验证的质量,因此作为硅后验证重要支撑技术的可调试性设计日渐成为大规模集成电路设计领域的研究热点.另一方面,并行程序的调试非常困难,很多细微的bug无法直接用传统的单步、断点等方法进行调试,如果没有专门的硬件支持,需要耗费极大的人力和物力.全面分析了现有的可调试性设计,在此基础上归纳总结了可调试性设计技术的主要研究方向并介绍了各个方向的研究进展,深入探讨了可调试性结构设计研究中的热点问题及其产生根源,给出了可调试性结构设计领域的发展趋势.%Design-for-debug (DFD) has become an important feature of modern VLSI. On the one hand, traditional pre-silicon verification methods are not sufficient to enssure the quality of modern complex VLSI designs, thus employing DFD to facilitate post-silicon verification has attracted wide interests from both academia and industry; on the other hand, debugging parallel program is a worldwide difficult problem, which cries out for DFD hardware supports. In this paper, we analyze the existing structures of DFD comprehensively and introduce different fields of DFD for debugging hardware and software. These fields contain various kinds of DFD infrastructures, such as the DFD infrastructure for the pipe line of processor, the system-on-chips (SOC) and the networks on multi-cores processor. We also introduce the recent researches on how to design the DFD infrastructures with certain processor architecture and how to use the DFD infrastructures to solve the debug problems in these different fields. The topologic of the whole infrastructure, the hardware design of components, the methods of analyzing signals, the

  4. Space Station Water Processor Process Pump

    Science.gov (United States)

    Parker, David

    1995-01-01

    This report presents the results of the development program conducted under contract NAS8-38250-12 related to the International Space Station (ISS) Water Processor (WP) Process Pump. The results of the Process Pumps evaluation conducted on this program indicates that further development is required in order to achieve the performance and life requirements for the ISSWP.

  5. Fuel processors for fuel cell APU applications

    Science.gov (United States)

    Aicher, T.; Lenz, B.; Gschnell, F.; Groos, U.; Federici, F.; Caprile, L.; Parodi, L.

    The conversion of liquid hydrocarbons to a hydrogen rich product gas is a central process step in fuel processors for auxiliary power units (APUs) for vehicles of all kinds. The selection of the reforming process depends on the fuel and the type of the fuel cell. For vehicle power trains, liquid hydrocarbons like gasoline, kerosene, and diesel are utilized and, therefore, they will also be the fuel for the respective APU systems. The fuel cells commonly envisioned for mobile APU applications are molten carbonate fuel cells (MCFC), solid oxide fuel cells (SOFC), and proton exchange membrane fuel cells (PEMFC). Since high-temperature fuel cells, e.g. MCFCs or SOFCs, can be supplied with a feed gas that contains carbon monoxide (CO) their fuel processor does not require reactors for CO reduction and removal. For PEMFCs on the other hand, CO concentrations in the feed gas must not exceed 50 ppm, better 20 ppm, which requires additional reactors downstream of the reforming reactor. This paper gives an overview of the current state of the fuel processor development for APU applications and APU system developments. Furthermore, it will present the latest developments at Fraunhofer ISE regarding fuel processors for high-temperature fuel cell APU systems on board of ships and aircrafts.

  6. A Demo Processor as an Educational Tool

    NARCIS (Netherlands)

    van Moergestel, L.; van Nieuwland, K.; Vermond, L.; Meyer, John-Jules Charles

    2014-01-01

    Explaining the workings of a processor can be done in several ways. Just a written explanation, some pictures, a simulator program or a real hardware demo. The research presented here is based on the idea that a slowly working hardware demo could be a nice tool to explain to IT students the inner

  7. Quantum Algorithm Processor For Finding Exact Divisors

    OpenAIRE

    Burger, John Robert

    2005-01-01

    Wiring diagrams are given for a quantum algorithm processor in CMOS to compute, in parallel, all divisors of an n-bit integer. Lines required in a wiring diagram are proportional to n. Execution time is proportional to the square of n.

  8. Continuous history variable for programmable quantum processors

    CERN Document Server

    Vlasov, Alexander Yu

    2010-01-01

    In this brief note is discussed application of continuous quantum history ("trash") variable for simplification of scheme of programmable quantum processor. Similar scheme may be tested also in other models of the theory of quantum algorithms and complexity, because provides modification of a standard operation: quantum function evaluation.

  9. Focal-plane sensor-processor chips

    CERN Document Server

    Zarándy, Ákos

    2011-01-01

    Focal-Plane Sensor-Processor Chips explores both the implementation and application of state-of-the-art vision chips. Presenting an overview of focal plane chip technology, the text discusses smart imagers and cellular wave computers, along with numerous examples of current vision chips.

  10. Microarchitecture of the Godson-2 Processor

    Institute of Scientific and Technical Information of China (English)

    Wei-Wu Hu; Fu-Xin Zhang; Zu-Song Li

    2005-01-01

    The Godson project is the first attempt to design high performance general-purpose microprocessors in China.This paper introduces the microarchitecture of the Godson-2 processor which is a 64-bit, 4-issue, out-of-order execution RISC processor that implements the 64-bit MIPS-like instruction set. The adoption of the aggressive out-of-order execution techniques (such as register mapping, branch prediction, and dynamic scheduling) and cache techniques (such as non-blocking cache, load speculation, dynamic memory disambiguation) helps the Godson-2 processor to achieve high performance even at not so high frequency. The Godson-2 processor has been physically implemented on a 6-metal 0.18μm CMOS technology based on the automatic placing and routing flow with the help of some crafted library cells and macros. The area of the chip is 6,700 micrometers by 6,200 micrometers and the clock cycle at typical corner is 2.3ns.

  11. Noise limitations in optical linear algebra processors.

    Science.gov (United States)

    Batsell, S G; Jong, T L; Walkup, J F; Krile, T F

    1990-05-10

    A general statistical noise model is presented for optical linear algebra processors. A statistical analysis which includes device noise, the multiplication process, and the addition operation is undertaken. We focus on those processes which are architecturally independent. Finally, experimental results which verify the analytical predictions are also presented.

  12. Practical guide to energy management for processors

    CERN Document Server

    Consortium, Energywise

    2012-01-01

    Do you know how best to manage and reduce your energy consumption? This book gives comprehensive guidance on effective energy management for organisations in the polymer processing industry. This book is one of three which support the ENERGYWISE Plastics Project eLearning platform for European plastics processors to increase their knowledge and understanding of energy management. Topics covered include: Understanding Energy,

  13. Globe hosts launch of new processor

    CERN Multimedia

    2006-01-01

    Launch of the quadecore processor chip at the Globe. On 14 November, in a series of major media events around the world, the chip-maker Intel launched its new 'quadcore' processor. For the regions of Europe, the Middle East and Africa, the day-long launch event took place in CERN's Globe of Science and Innovation, with over 30 journalists in attendance, coming from as far away as Johannesburg and Dubai. CERN was a significant choice for the event: the first tests of this new generation of processor in Europe had been made at CERN over the preceding months, as part of CERN openlab, a research partnership with leading IT companies such as Intel, HP and Oracle. The event also provided the opportunity for the journalists to visit ATLAS and the CERN Computer Centre. The strategy of putting multiple processor cores on the same chip, which has been pursued by Intel and other chip-makers in the last few years, represents an important departure from the more traditional improvements in the sheer speed of such chips. ...

  14. Analysis of Reconfigurable Processors Using Petri Net

    Directory of Open Access Journals (Sweden)

    Hadis Heidari

    2013-07-01

    Full Text Available In this paper, we propose Petri net models for processing elements. The processing elements include: a general-purpose processor (GPP, a reconfigurable element (RE, and a hybrid element (combining a GPP with an RE. The models consist of many transitions and places. The model and associated analysis methods provide a promising tool for modeling and performance evaluation of reconfigurable processors. The model is demonstrated by considering a simple example. This paper describes the development of a reconfigurable processor; the developed system is based on the Petri net concept. Petri nets are becoming suitable as a formal model for hardware system design. Designers can use Petri net as a modeling language to perform high level analysis of complex processors designs processing chips. The simulation does with PIPEv4.1 simulator. The simulation results show that Petri net state spaces are bounded and safe and have not deadlock and the average of number tokens in first token is 0.9901 seconds. In these models, there are only 5% errors; also the analysis time in these models is 0.016 seconds.

  15. VLSI architectures for computing multiplications and inverses in GF(2-m)

    Science.gov (United States)

    Wang, C. C.; Truong, T. K.; Shao, H. M.; Deutsch, L. J.; Omura, J. K.; Reed, I. S.

    1983-01-01

    Finite field arithmetic logic is central in the implementation of Reed-Solomon coders and in some cryptographic algorithms. There is a need for good multiplication and inversion algorithms that are easily realized on VLSI chips. Massey and Omura recently developed a new multiplication algorithm for Galois fields based on a normal basis representation. A pipeline structure is developed to realize the Massey-Omura multiplier in the finite field GF(2m). With the simple squaring property of the normal-basis representation used together with this multiplier, a pipeline architecture is also developed for computing inverse elements in GF(2m). The designs developed for the Massey-Omura multiplier and the computation of inverse elements are regular, simple, expandable and, therefore, naturally suitable for VLSI implementation.

  16. VLSI architectures for computing multiplications and inverses in GF(2m)

    Science.gov (United States)

    Wang, C. C.; Truong, T. K.; Shao, H. M.; Deutsch, L. J.; Omura, J. K.

    1985-01-01

    Finite field arithmetic logic is central in the implementation of Reed-Solomon coders and in some cryptographic algorithms. There is a need for good multiplication and inversion algorithms that are easily realized on VLSI chips. Massey and Omura recently developed a new multiplication algorithm for Galois fields based on a normal basis representation. A pipeline structure is developed to realize the Massey-Omura multiplier in the finite field GF(2m). With the simple squaring property of the normal-basis representation used together with this multiplier, a pipeline architecture is also developed for computing inverse elements in GF(2m). The designs developed for the Massey-Omura multiplier and the computation of inverse elements are regular, simple, expandable and, therefore, naturally suitable for VLSI implementation.

  17. Power gating of VLSI circuits using MEMS switches in low power applications

    KAUST Repository

    Shobak, Hosam

    2011-12-01

    Power dissipation poses a great challenge for VLSI designers. With the intense down-scaling of technology, the total power consumption of the chip is made up primarily of leakage power dissipation. This paper proposes combining a custom-designed MEMS switch to power gate VLSI circuits, such that leakage power is efficiently reduced while accounting for performance and reliability. The designed MEMS switch is characterized by an 0.1876 ? ON resistance and requires 4.5 V to switch. As a result of implementing this novel power gating technique, a standby leakage power reduction of 99% and energy savings of 33.3% are achieved. Finally the possible effects of surge currents and ground bounce noise are studied. These findings allow longer operation times for battery-operated systems characterized by long standby periods. © 2011 IEEE.

  18. Implementation of Optimized Reversible Sequential and Combinational Circuits for VLSI Applications

    Directory of Open Access Journals (Sweden)

    P. Mohan Krishna

    2014-04-01

    Full Text Available Reversible logic has emerged as one of the most important approaches for the power optimization with its application in low power VLSI design. They are also the fundamental requirement for the emerging field of the Quantum computing having with applications in the domains like Nano-technology, Digital signal processing, Cryptography, Communications. Implementing the reversible logic has the advantages of reducing gate counts, garbage outputs as well as constant inputs. In this project we present sequential and combinational circuit with reversible logic gates which are simulated in Xilinx ISE and by writing the code in VHDL . we have proposed a new design technique of BCD Adder using newly constructed reversible gates are based on CMOS with pass transistor gates . Here the total reversible Adder is designed using EDA tools. We will analyze the VLSI limitations like power consumption and area of designed circuits.

  19. Las Vegas is better than determinism in VLSI and distributed computing

    DEFF Research Database (Denmark)

    Mehlhorn, Kurt; Schmidt, Erik Meineche

    1982-01-01

    to (accepting) nondeterministic computations as well as to deterministic computations. Hence whenever a boolean function f is such that f and -&-fmarc; (the complement of f, -&-fmarc; -&-equil; 1 -&-minus; f) have efficient nondeterministic chips then the known techniques are of no help for proving lower bounds...... on the complexity of deterministic chips. In this paper we describe a lower bound technique (Thm 1) which only applies to deterministic computations......In this paper we describe a new method for proving lower bounds on the complexity of VLSI - computations and more generally distributed computations. Lipton and Sedgewick observed that the crossing sequence arguments used to prove lower bounds in VLSI (or TM or distributed computing) apply...

  20. Cyclic Redundancy Checking (CRC) Accelerator for Embedded Processor Datapaths

    National Research Council Canada - National Science Library

    Abdul Rehman Buzdar; Liguo Sun; Rao Kashif; Muhammad Waqar Azhar; Muhammad Imran Khan

    2017-01-01

    We present the integration of a multimode Cyclic Redundancy Checking (CRC) accelerator unit with an embedded processor datapath to enhance the processor performance in terms of execution time and energy efficiency...

  1. Area and Energy Efficient Viterbi Accelerator for Embedded Processor Datapaths

    National Research Council Canada - National Science Library

    Abdul Rehman Buzdar; Liguo Sun; Muhammad Waqar Azhar; Muhammad Imran Khan; Rao Kashif

    2017-01-01

    .... We present the integration of a mixed hardware/software viterbi accelerator unit with an embedded processor datapath to enhance the processor performance in terms of execution time and energy efficiency...

  2. International Conference on VLSI, Communication, Advanced Devices, Signals & Systems and Networking

    CERN Document Server

    Shirur, Yasha; Prasad, Rekha

    2013-01-01

    This book is a collection of papers presented by renowned researchers, keynote speakers and academicians in the International Conference on VLSI, Communication, Analog Designs, Signals and Systems, and Networking (VCASAN-2013), organized by B.N.M. Institute of Technology, Bangalore, India during July 17-19, 2013. The book provides global trends in cutting-edge technologies in electronics and communication engineering. The content of the book is useful to engineers, researchers and academicians as well as industry professionals.

  3. A VLSI Design Flow for Secure Side-Channel Attack Resistant ICs

    OpenAIRE

    Tiri, Kris; Verbauwhede, Ingrid

    2007-01-01

    Submitted on behalf of EDAA (http://www.edaa.com/); International audience; This paper presents a digital VLSI design flow to create secure, side-channel attack (SCA) resistant integrated circuits. The design flow starts from a normal design in a hardware description language such as VHDL or Verilog and provides a direct path to a SCA resistant layout. Instead of a full custom layout or an iterative design process with extensive simulations, a few key modifications are incorporated in a regul...

  4. The Design, Simulation, and Fabrication of a BiCMOS VLSI Digitally Programmable GIC Filter

    Science.gov (United States)

    2001-09-01

    December 2000. Michael, S., Analog VLSI: Class Notes, Naval Postgraduate School, Monterey, CA, 1999. Sedra , A.S., Smith , K.C., Microelectronic...loop gain for an opamp is defined by the following equation ( Sedra , 1998) The ideal opamp has an infinite open loop gain, which can be seen from...Response (from Lee, 2000). The slew rate is defined by the following equation ( Sedra , 1998) 0 103102 10 104 105 106 107 f (Hz) 20 40 60

  5. Array processors based on Gaussian fraction-free method

    Energy Technology Data Exchange (ETDEWEB)

    Peng, S.; Sedukhin, S. [Aizu Univ., Aizuwakamatsu, Fukushima (Japan); Sedukhin, I.

    1998-03-01

    The design of algorithmic array processors for solving linear systems of equations using fraction-free Gaussian elimination method is presented. The design is based on a formal approach which constructs a family of planar array processors systematically. These array processors are synthesized and analyzed. It is shown that some array processors are optimal in the framework of linear allocation of computations and in terms of number of processing elements and computing time. (author)

  6. VLSI Architecture for Configurable and Low-Complexity Design of Hard-Decision Viterbi Decoding Algorithm

    Directory of Open Access Journals (Sweden)

    Rachmad Vidya Wicaksana Putra

    2016-06-01

    Full Text Available Convolutional encoding and data decoding are fundamental processes in convolutional error correction. One of the most popular error correction methods in decoding is the Viterbi algorithm. It is extensively implemented in many digital communication applications. Its VLSI design challenges are about area, speed, power, complexity and configurability. In this research, we specifically propose a VLSI architecture for a configurable and low-complexity design of a hard-decision Viterbi decoding algorithm. The configurable and low-complexity design is achieved by designing a generic VLSI architecture, optimizing each processing element (PE at the logical operation level and designing a conditional adapter. The proposed design can be configured for any predefined number of trace-backs, only by changing the trace-back parameter value. Its computational process only needs N + 2 clock cycles latency, with N is the number of trace-backs. Its configurability function has been proven for N = 8, N = 16, N = 32 and N = 64. Furthermore, the proposed design was synthesized and evaluated in Xilinx and Altera FPGA target boards for area consumption and speed performance.

  7. 49 CFR 234.275 - Processor-based systems.

    Science.gov (United States)

    2010-10-01

    ..., DEPARTMENT OF TRANSPORTATION GRADE CROSSING SIGNAL SYSTEM SAFETY AND STATE ACTION PLANS Maintenance, Inspection, and Testing Requirements for Processor-Based Systems § 234.275 Processor-based systems. (a... 49 Transportation 4 2010-10-01 2010-10-01 false Processor-based systems. 234.275 Section...

  8. A lock circuit for a multi-core processor

    DEFF Research Database (Denmark)

    2015-01-01

    An integrated circuit comprising a multiple processor cores and a lock circuit that comprises a queue register with respective bits set or reset via respective, connections dedicated to respective processor cores, whereby the queue register identifies those among the multiple processor cores that...

  9. Finite difference programs and array processors. [High-speed floating point processing by coupling host computer to programable array processor

    Energy Technology Data Exchange (ETDEWEB)

    Rudy, T.E.

    1977-08-01

    An alternative to maxi computers for high-speed floating-point processing capabilities is the coupling of a host computer to a programable array processor. This paper compares the performance of two finite difference programs on various computers and their expected performance on commercially available array processors. The significance of balancing array processor computation, host-array processor control traffic, and data transfer operations is emphasized. 3 figures, 1 table.

  10. Cache Energy Optimization Techniques For Modern Processors

    Energy Technology Data Exchange (ETDEWEB)

    Mittal, Sparsh [ORNL

    2013-01-01

    Modern multicore processors are employing large last-level caches, for example Intel's E7-8800 processor uses 24MB L3 cache. Further, with each CMOS technology generation, leakage energy has been dramatically increasing and hence, leakage energy is expected to become a major source of energy dissipation, especially in last-level caches (LLCs). The conventional schemes of cache energy saving either aim at saving dynamic energy or are based on properties specific to first-level caches, and thus these schemes have limited utility for last-level caches. Further, several other techniques require offline profiling or per-application tuning and hence are not suitable for product systems. In this book, we present novel cache leakage energy saving schemes for single-core and multicore systems; desktop, QoS, real-time and server systems. Also, we present cache energy saving techniques for caches designed with both conventional SRAM devices and emerging non-volatile devices such as STT-RAM (spin-torque transfer RAM). We present software-controlled, hardware-assisted techniques which use dynamic cache reconfiguration to configure the cache to the most energy efficient configuration while keeping the performance loss bounded. To profile and test a large number of potential configurations, we utilize low-overhead, micro-architecture components, which can be easily integrated into modern processor chips. We adopt a system-wide approach to save energy to ensure that cache reconfiguration does not increase energy consumption of other components of the processor. We have compared our techniques with state-of-the-art techniques and have found that our techniques outperform them in terms of energy efficiency and other relevant metrics. The techniques presented in this book have important applications in improving energy-efficiency of higher-end embedded, desktop, QoS, real-time, server processors and multitasking systems. This book is intended to be a valuable guide for both

  11. A VLSI optimal constructive algorithm for classification problems

    Energy Technology Data Exchange (ETDEWEB)

    Beiu, V. [Los Alamos National Lab., NM (United States); Draghici, S.; Sethi, I.K. [Wayne State Univ., Detroit, MI (United States)

    1997-10-01

    If neural networks are to be used on a large scale, they have to be implemented in hardware. However, the cost of the hardware implementation is critically sensitive to factors like the precision used for the weights, the total number of bits of information and the maximum fan-in used in the network. This paper presents a version of the Constraint Based Decomposition training algorithm which is able to produce networks using limited precision integer weights and units with limited fan-in. The algorithm is tested on the 2-spiral problem and the results are compared with other existing algorithms.

  12. Multi-Core Processor Memory Contention Benchmark Analysis Case Study

    Science.gov (United States)

    Simon, Tyler; McGalliard, James

    2009-01-01

    Multi-core processors dominate current mainframe, server, and high performance computing (HPC) systems. This paper provides synthetic kernel and natural benchmark results from an HPC system at the NASA Goddard Space Flight Center that illustrate the performance impacts of multi-core (dual- and quad-core) vs. single core processor systems. Analysis of processor design, application source code, and synthetic and natural test results all indicate that multi-core processors can suffer from significant memory subsystem contention compared to similar single-core processors.

  13. Multiple core computer processor with globally-accessible local memories

    Energy Technology Data Exchange (ETDEWEB)

    Shalf, John; Donofrio, David; Oliker, Leonid

    2016-09-20

    A multi-core computer processor including a plurality of processor cores interconnected in a Network-on-Chip (NoC) architecture, a plurality of caches, each of the plurality of caches being associated with one and only one of the plurality of processor cores, and a plurality of memories, each of the plurality of memories being associated with a different set of at least one of the plurality of processor cores and each of the plurality of memories being configured to be visible in a global memory address space such that the plurality of memories are visible to two or more of the plurality of processor cores.

  14. Design of Processors with Reconfigurable Microarchitecture

    Directory of Open Access Journals (Sweden)

    Andrey Mokhov

    2014-01-01

    Full Text Available Energy becomes a dominating factor for a wide spectrum of computations: from intensive data processing in “big data” companies resulting in large electricity bills, to infrastructure monitoring with wireless sensors relying on energy harvesting. In this context it is essential for a computation system to be adaptable to the power supply and the service demand, which often vary dramatically during runtime. In this paper we present an approach to building processors with reconfigurable microarchitecture capable of changing the way they fetch and execute instructions depending on energy availability and application requirements. We show how to use Conditional Partial Order Graphs to formally specify the microarchitecture of such a processor, explore the design possibilities for its instruction set, and synthesise the instruction decoder using correct-by-construction techniques. The paper is focused on the design methodology, which is evaluated by implementing a power-proportional version of Intel 8051 microprocessor.

  15. Multifunction nonlinear signal processor - Deconvolution and correlation

    Science.gov (United States)

    Javidi, Bahram; Horner, Joseph L.

    1989-08-01

    A multifuncional nonlinear optical signal processor is described that allows different types of operations, such as image deconvolution and nonlinear correlation. In this technique, the joint power spectrum of the input signal is thresholded with varying nonlinearity to produce different specific operations. In image deconvolution, the joint power spectrum is modified and hard-clip thresholded to remove the amplitude distortion effects and to restore the correct phase of the original image. In optical correlation, the Fourier transform interference intensity is thresholded to provide higher correlation peak intensity and a better-defined correlation spot. Various types of correlation signals can be produced simply by varying the severity of the nonlinearity, without the need for synthesis of specific matched filter. An analysis of the nonlinear processor for image deconvolution is presented.

  16. Model of computation for Fourier optical processors

    Science.gov (United States)

    Naughton, Thomas J.

    2000-05-01

    We present a novel and simple theoretical model of computation that captures what we believe are the most important characteristics of an optical Fourier transform processor. We use this abstract model to reason about the computational properties of the physical systems it describes. We define a grammar for our model's instruction language, and use it to write algorithms for well-known filtering and correlation techniques. We also suggest suitable computational complexity measures that could be used to analyze any coherent optical information processing technique, described with the language, for efficiency. Our choice of instruction language allows us to argue that algorithms describable with this model should have optical implementations that do not require a digital electronic computer to act as a master unit. Through simulation of a well known model of computation from computer theory we investigate the general-purpose capabilities of analog optical processors.

  17. Communication Efficient Multi-processor FFT

    Science.gov (United States)

    Lennart Johnsson, S.; Jacquemin, Michel; Krawitz, Robert L.

    1992-10-01

    Computing the fast Fourier transform on a distributed memory architecture by a direct pipelined radix-2, a bi-section, or a multisection algorithm, all yield the same communications requirement, if communication for all FFT stages can be performed concurrently, the input data is in normal order, and the data allocation is consecutive. With a cyclic data allocation, or bit-reversed input data and a consecutive allocation, multi-sectioning offers a reduced communications requirement by approximately a factor of two. For a consecutive data allocation, normal input order, a decimation-in-time FFT requires that P/ N + d-2 twiddle factors be stored for P elements distributed evenly over N processors, and the axis that is subject to transformation be distributed over 2 d processors. No communication of twiddle factors is required. The same storage requirements hold for a decimation-in-frequency FFT, bit-reversed input order, and consecutive data allocation. The opposite combination of FFT type and data ordering requires a factor of log 2N more storage for N processors. The peak performance for a Connection Machine system CM-200 implementation is 12.9 Gflops/s in 32-bit precision, and 10.7 Gflops/s in 64-bit precision for unordered transforms local to each processor. The corresponding execution rates for ordered transforms are 11.1 Gflops/s and 8.5 Gflops/s, respectively. For distributed one- and two-dimensional transforms the peak performance for unordered transforms exceeds 5 Gflops/s in 32-bit precision and 3 Gflops/s in 64-bit precision. Three-dimensional transforms execute at a slightly lower rate. Distributed ordered transforms execute at a rate of about {1}/{2}to {2}/{3} of the unordered transforms.

  18. Breadboard Signal Processor for Arraying DSN Antennas

    Science.gov (United States)

    Jongeling, Andre; Sigman, Elliott; Chandra, Kumar; Trinh, Joseph; Soriano, Melissa; Navarro, Robert; Rogstad, Stephen; Goodhart, Charles; Proctor, Robert; Jourdan, Michael; hide

    2008-01-01

    A recently developed breadboard version of an advanced signal processor for arraying many antennas in NASA s Deep Space Network (DSN) can accept inputs in a 500-MHz-wide frequency band from six antennas. The next breadboard version is expected to accept inputs from 16 antennas, and a following developed version is expected to be designed according to an architecture that will be scalable to accept inputs from as many as 400 antennas. These and similar signal processors could also be used for combining multiple wide-band signals in non-DSN applications, including very-long-baseline interferometry and telecommunications. This signal processor performs functions of a wide-band FX correlator and a beam-forming signal combiner. [The term "FX" signifies that the digital samples of two given signals are fast Fourier transformed (F), then the fast Fourier transforms of the two signals are multiplied (X) prior to accumulation.] In this processor, the signals from the various antennas are broken up into channels in the frequency domain (see figure). In each frequency channel, the data from each antenna are correlated against the data from each other antenna; this is done for all antenna baselines (that is, for all antenna pairs). The results of the correlations are used to obtain calibration data to align the antenna signals in both phase and delay. Data from the various antenna frequency channels are also combined and calibration corrections are applied. The frequency-domain data thus combined are then synthesized back to the time domain for passing on to a telemetry receiver

  19. A post-processor for Gurmukhi OCR

    Indian Academy of Sciences (India)

    G S Lehal; Chandan Singh

    2002-02-01

    A post-processing system for OCR of Gurmukhi script has been developed. Statistical information of Punjabi language syllable combinations, corpora look-up and certain heuristics based on Punjabi grammar rules have been combined to design the post-processor. An improvement of 3% in recognition rate, from 94.35% to 97.34%, has been reported on clean images using the post-processing techniques.

  20. Modules for Pipelined Mixed Radix FFT Processors

    Directory of Open Access Journals (Sweden)

    Anatolij Sergiyenko

    2016-01-01

    Full Text Available A set of soft IP cores for the Winograd r-point fast Fourier transform (FFT is considered. The cores are designed by the method of spatial SDF mapping into the hardware, which provides the minimized hardware volume at the cost of slowdown of the algorithm by r times. Their clock frequency is equal to the data sampling frequency. The cores are intended for the high-speed pipelined FFT processors, which are implemented in FPGA.

  1. High-pressure coal fuel processor development

    Energy Technology Data Exchange (ETDEWEB)

    Greenhalgh, M.L.

    1992-11-01

    The objective of Subtask 1.1 Engine Feasibility was to conduct research needed to establish the technical feasibility of ignition and stable combustion of directly injected, 3,000 psi, low-Btu gas with glow plug ignition assist at diesel engine compression ratios. This objective was accomplished by designing, fabricating, testing and analyzing the combustion performance of synthesized low-Btu coal gas in a single-cylinder test engine combustion rig located at the Caterpillar Technical Center engine lab in Mossville, Illinois. The objective of Subtask 1.2 Fuel Processor Feasibility was to conduct research needed to establish the technical feasibility of air-blown, fixed-bed, high-pressure coal fuel processing at up to 3,000 psi operating pressure, incorporating in-bed sulfur and particulate capture. This objective was accomplished by designing, fabricating, testing and analyzing the performance of bench-scale processors located at Coal Technology Corporation (subcontractor) facilities in Bristol, Virginia. These two subtasks were carried out at widely separated locations and will be discussed in separate sections of this report. They were, however, independent in that the composition of the synthetic coal gas used to fuel the combustion rig was adjusted to reflect the range of exit gas compositions being produced on the fuel processor rig. Two major conclusions resulted from this task. First, direct injected, ignition assisted Diesel cycle engine combustion systems can be suitably modified to efficiently utilize these low-Btu gas fuels. Second, high pressure gasification of selected run-of-the-mine coals in batch-loaded fuel processors is feasible. These two findings, taken together, significantly reduce the perceived technical risks associated with the further development of the proposed coal gas fueled Diesel cycle power plant concept.

  2. Keystone Business Models for Network Security Processors

    Directory of Open Access Journals (Sweden)

    Arthur Low

    2013-07-01

    Full Text Available Network security processors are critical components of high-performance systems built for cybersecurity. Development of a network security processor requires multi-domain experience in semiconductors and complex software security applications, and multiple iterations of both software and hardware implementations. Limited by the business models in use today, such an arduous task can be undertaken only by large incumbent companies and government organizations. Neither the “fabless semiconductor” models nor the silicon intellectual-property licensing (“IP-licensing” models allow small technology companies to successfully compete. This article describes an alternative approach that produces an ongoing stream of novel network security processors for niche markets through continuous innovation by both large and small companies. This approach, referred to here as the "business ecosystem model for network security processors", includes a flexible and reconfigurable technology platform, a “keystone” business model for the company that maintains the platform architecture, and an extended ecosystem of companies that both contribute and share in the value created by innovation. New opportunities for business model innovation by participating companies are made possible by the ecosystem model. This ecosystem model builds on: i the lessons learned from the experience of the first author as a senior integrated circuit architect for providers of public-key cryptography solutions and as the owner of a semiconductor startup, and ii the latest scholarly research on technology entrepreneurship, business models, platforms, and business ecosystems. This article will be of interest to all technology entrepreneurs, but it will be of particular interest to owners of small companies that provide security solutions and to specialized security professionals seeking to launch their own companies.

  3. Intelligent trigger processor for the crystal box

    CERN Document Server

    Sanders, G H; Cooper, M D; Hart, G W; Hoffman, C M; Hogan, G E; Hughes, E B; Matis, H S; Rolfe, J; Sandberg, V D; Williams, R A; Wilson, S; Zeman, H

    1981-01-01

    A large solid angle angular modular NaI(Tl) detector with 432 phototubes and 88 trigger scintillators is being used to search simultaneously for three lepton flavor-changing decays of the muon. A beam of up to 10/sup 6/ muons stopping per second with a 6% duty factor would yield up to 1000 triggers per second from random triple coincidences. A reduction of the trigger rate to 10 Hz is required from a hardwired primary trigger processor. Further reduction to <1 Hz is achieved by a microprocessor-based secondary trigger processor. The primary trigger hardware imposes voter coincidence logic, stringent timing requirements, and a non-adjacency requirement in the trigger scintillators defined by hardwired circuits. Sophisticated geometric requirements are imposed by a PROM-based matrix logic, and energy and vector-momentum cuts are imposed by a hardwired processor using LSI flash ADC's and digital arithmetic logic. The secondary trigger employs four satellite microprocessors to do a sparse data scan, multiplex ...

  4. Software-Reconfigurable Processors for Spacecraft

    Science.gov (United States)

    Farrington, Allen; Gray, Andrew; Bell, Bryan; Stanton, Valerie; Chong, Yong; Peters, Kenneth; Lee, Clement; Srinivasan, Jeffrey

    2005-01-01

    A report presents an overview of an architecture for a software-reconfigurable network data processor for a spacecraft engaged in scientific exploration. When executed on suitable electronic hardware, the software performs the functions of a physical layer (in effect, acts as a software radio in that it performs modulation, demodulation, pulse-shaping, error correction, coding, and decoding), a data-link layer, a network layer, a transport layer, and application-layer processing of scientific data. The software-reconfigurable network processor is undergoing development to enable rapid prototyping and rapid implementation of communication, navigation, and scientific signal-processing functions; to provide a long-lived communication infrastructure; and to provide greatly improved scientific-instrumentation and scientific-data-processing functions by enabling science-driven in-flight reconfiguration of computing resources devoted to these functions. This development is an extension of terrestrial radio and network developments (e.g., in the cellular-telephone industry) implemented in software running on such hardware as field-programmable gate arrays, digital signal processors, traditional digital circuits, and mixed-signal application-specific integrated circuits (ASICs).

  5. Issue Mechanism for Embedded Simultaneous Multithreading Processor

    Science.gov (United States)

    Zang, Chengjie; Imai, Shigeki; Frank, Steven; Kimura, Shinji

    Simultaneous Multithreading (SMT) technology enhances instruction throughput by issuing multiple instructions from multiple threads within one clock cycle. For in-order pipeline to each thread, SMT processors can provide large number of issued instructions close to or surpass than using out-of-order pipeline. In this work, we show an efficient issue logic for predicated instruction sequence with the parallel flag in each instruction, where the predicate register based issue control is adopted and the continuous instructions with the parallel flag of ‘0’ are executed in parallel. The flag is pre-defined by a compiler. Instructions from different threads are issued based on the round-robin order. We also introduce an Instruction Queue skip mechanism for thread if the queue is empty. Using this kind of issue logic, we designed a 6 threads, 7-stage, in-order pipeline processor. Based on this processor, we compare round-robin issue policy (RR(T1-Tn)) with other policies: thread one always has the highest priority (PR(T1)) and thread one or thread n has the highest priority in turn (PR(T1-Tn)). The results show that RR(T1-Tn) policy outperforms others and PR(T1-Tn) is almost the same to RR(T1-Tn) from the point of view of the issued instructions per cycle.

  6. Efficient searching and sorting applications using an associative array processor

    Science.gov (United States)

    Pace, W.; Quinn, M. J.

    1978-01-01

    The purpose of this paper is to describe a method of searching and sorting data by using some of the unique capabilities of an associative array processor. To understand the application, the associative array processor is described in detail. In particular, the content addressable memory and flip network are discussed because these two unique elements give the associative array processor the power to rapidly sort and search. A simple alphanumeric sorting example is explained in hardware and software terms. The hardware used to explain the application is the STARAN (Goodyear Aerospace Corporation) associative array processor. The software used is the APPLE (Array Processor Programming Language) programming language. Some applications of the array processor are discussed. This summary tries to differentiate between the techniques of the sequential machine and the associative array processor.

  7. Testing and operating a multiprocessor chip with processor redundancy

    Energy Technology Data Exchange (ETDEWEB)

    Bellofatto, Ralph E; Douskey, Steven M; Haring, Rudolf A; McManus, Moyra K; Ohmacht, Martin; Schmunkamp, Dietmar; Sugavanam, Krishnan; Weatherford, Bryan J

    2014-10-21

    A system and method for improving the yield rate of a multiprocessor semiconductor chip that includes primary processor cores and one or more redundant processor cores. A first tester conducts a first test on one or more processor cores, and encodes results of the first test in an on-chip non-volatile memory. A second tester conducts a second test on the processor cores, and encodes results of the second test in an external non-volatile storage device. An override bit of a multiplexer is set if a processor core fails the second test. In response to the override bit, the multiplexer selects a physical-to-logical mapping of processor IDs according to one of: the encoded results in the memory device or the encoded results in the external storage device. On-chip logic configures the processor cores according to the selected physical-to-logical mapping.

  8. Merged ozone profiles from four MIPAS processors

    Science.gov (United States)

    Laeng, Alexandra; von Clarmann, Thomas; Stiller, Gabriele; Dinelli, Bianca Maria; Dudhia, Anu; Raspollini, Piera; Glatthor, Norbert; Grabowski, Udo; Sofieva, Viktoria; Froidevaux, Lucien; Walker, Kaley A.; Zehner, Claus

    2017-04-01

    The Michelson Interferometer for Passive Atmospheric Sounding (MIPAS) was an infrared (IR) limb emission spectrometer on the Envisat platform. Currently, there are four MIPAS ozone data products, including the operational Level-2 ozone product processed at ESA, with the scientific prototype processor being operated at IFAC Florence, and three independent research products developed by the Istituto di Fisica Applicata Nello Carrara (ISAC-CNR)/University of Bologna, Oxford University, and the Karlsruhe Institute of Technology-Institute of Meteorology and Climate Research/Instituto de Astrofísica de Andalucía (KIT-IMK/IAA). Here we present a dataset of ozone vertical profiles obtained by merging ozone retrievals from four independent Level-2 MIPAS processors. We also discuss the advantages and the shortcomings of this merged product. As the four processors retrieve ozone in different parts of the spectra (microwindows), the source measurements can be considered as nearly independent with respect to measurement noise. Hence, the information content of the merged product is greater and the precision is better than those of any parent (source) dataset. The merging is performed on a profile per profile basis. Parent ozone profiles are weighted based on the corresponding error covariance matrices; the error correlations between different profile levels are taken into account. The intercorrelations between the processors' errors are evaluated statistically and are used in the merging. The height range of the merged product is 20-55 km, and error covariance matrices are provided as diagnostics. Validation of the merged dataset is performed by comparison with ozone profiles from ACE-FTS (Atmospheric Chemistry Experiment-Fourier Transform Spectrometer) and MLS (Microwave Limb Sounder). Even though the merging is not supposed to remove the biases of the parent datasets, around the ozone volume mixing ratio peak the merged product is found to have a smaller (up to 0.1 ppmv

  9. Technology development and circuit design for a parallel laser programmable floating-point application specific processor. Master's thesis

    Energy Technology Data Exchange (ETDEWEB)

    Scriber, M.W.

    1989-12-01

    The laser programmable floating point application specific processor (LPASP) is a new approach at rapid development of custom VLSI chips. The LPASP is a generic application specific processor that can be programmed to perform a specific function. The effort of this thesis is to develop and test the double precision floating point adder and the laser programmable read-only memory (LPROM) that are macrocells within the LPASP. In addition, the thesis analyzes the applicability of an LPASP parallel processing system. The double precision floating point adder is an adder/subtractor macrocell designed to comply with the IEEE double precision floating point standard. An 84-pin chip of the adder was fabricated using 2 micron feature sizes. The fastest processing time was measured at 120 nanoseconds over 23 worst case test vectors. The adder uses the optimized carry multiplexed (OCM) adder that was developed at AFIT. The OCM adder is a new adder architecture that uses four parallel carry paths to attain a performance time on the order of (cubed root of M) with a gate count on the order of O (n). The redundant logic associated with the parallel propagation banks is eliminated in the OCM adder so that the largest bit-slice of the adder contains only eight 2-to-1 multiplexer gates. A 57-bit adder was fabricated using 2 micron feature sizes. The processing time for the adder is 31 nsec.

  10. Neuromorphic VLSI vision system for real-time texture segregation.

    Science.gov (United States)

    Shimonomura, Kazuhiro; Yagi, Tetsuya

    2008-10-01

    The visual system of the brain can perceive an external scene in real-time with extremely low power dissipation, although the response speed of an individual neuron is considerably lower than that of semiconductor devices. The neurons in the visual pathway generate their receptive fields using a parallel and hierarchical architecture. This architecture of the visual cortex is interesting and important for designing a novel perception system from an engineering perspective. The aim of this study is to develop a vision system hardware, which is designed inspired by a hierarchical visual processing in V1, for real time texture segregation. The system consists of a silicon retina, orientation chip, and field programmable gate array (FPGA) circuit. The silicon retina emulates the neural circuits of the vertebrate retina and exhibits a Laplacian-Gaussian-like receptive field. The orientation chip selectively aggregates multiple pixels of the silicon retina in order to produce Gabor-like receptive fields that are tuned to various orientations by mimicking the feed-forward model proposed by Hubel and Wiesel. The FPGA circuit receives the output of the orientation chip and computes the responses of the complex cells. Using this system, the neural images of simple cells were computed in real-time for various orientations and spatial frequencies. Using the orientation-selective outputs obtained from the multi-chip system, a real-time texture segregation was conducted based on a computational model inspired by psychophysics and neurophysiology. The texture image was filtered by the two orthogonally oriented receptive fields of the multi-chip system and the filtered images were combined to segregate the area of different texture orientation with the aid of FPGA. The present system is also useful for the investigation of the functions of the higher-order cells that can be obtained by combining the simple and complex cells.

  11. Decentralized neural control application to robotics

    CERN Document Server

    Garcia-Hernandez, Ramon; Sanchez, Edgar N; Alanis, Alma y; Ruz-Hernandez, Jose A

    2017-01-01

    This book provides a decentralized approach for the identification and control of robotics systems. It also presents recent research in decentralized neural control and includes applications to robotics. Decentralized control is free from difficulties due to complexity in design, debugging, data gathering and storage requirements, making it preferable for interconnected systems. Furthermore, as opposed to the centralized approach, it can be implemented with parallel processors. This approach deals with four decentralized control schemes, which are able to identify the robot dynamics. The training of each neural network is performed on-line using an extended Kalman filter (EKF). The first indirect decentralized control scheme applies the discrete-time block control approach, to formulate a nonlinear sliding manifold. The second direct decentralized neural control scheme is based on the backstepping technique, approximated by a high order neural network. The third control scheme applies a decentralized neural i...

  12. A reconfigurable on-line learning spiking neuromorphic processor comprising 256 neurons and 128K synapses.

    Science.gov (United States)

    Qiao, Ning; Mostafa, Hesham; Corradi, Federico; Osswald, Marc; Stefanini, Fabio; Sumislawska, Dora; Indiveri, Giacomo

    2015-01-01

    Implementing compact, low-power artificial neural processing systems with real-time on-line learning abilities is still an open challenge. In this paper we present a full-custom mixed-signal VLSI device with neuromorphic learning circuits that emulate the biophysics of real spiking neurons and dynamic synapses for exploring the properties of computational neuroscience models and for building brain-inspired computing systems. The proposed architecture allows the on-chip configuration of a wide range of network connectivities, including recurrent and deep networks, with short-term and long-term plasticity. The device comprises 128 K analog synapse and 256 neuron circuits with biologically plausible dynamics and bi-stable spike-based plasticity mechanisms that endow it with on-line learning abilities. In addition to the analog circuits, the device comprises also asynchronous digital logic circuits for setting different synapse and neuron properties as well as different network configurations. This prototype device, fabricated using a 180 nm 1P6M CMOS process, occupies an area of 51.4 mm(2), and consumes approximately 4 mW for typical experiments, for example involving attractor networks. Here we describe the details of the overall architecture and of the individual circuits and present experimental results that showcase its potential. By supporting a wide range of cortical-like computational modules comprising plasticity mechanisms, this device will enable the realization of intelligent autonomous systems with on-line learning capabilities.

  13. An Evolutionary Transition of conventional n MOS VLSI to CMOS considering Scaling, Low Power and Higher Mobility

    Directory of Open Access Journals (Sweden)

    Md Mobarok Hossain Rubel

    2016-07-01

    Full Text Available This paper emphasizes on the gradual revolution of CMOS scaling by delivering the modern concepts of newly explored device structures and new materials. After analyzing the improvements in sources, performance of CMOS technology regarding conventional semiconductor devices has been thoroughly discussed. This has been done by considering the significant semiconductor evolution devices like metal gate electrode, double gate FET, FinFET, high dielectric constant (high k and strained silicon FET. Considering the power level while scaling, the paper showed how nMOS VLSI chips have been gradually replaced by CMOS aiming for the reduction in the growing power of VLSI systems.

  14. A hardware implementation of neural network with modified HANNIBAL architecture

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Bum youb; Chung, Duck Jin [Inha University, Inchon (Korea, Republic of)

    1996-03-01

    A digital hardware architecture for artificial neural network with learning capability is described in this paper. It is a modified hardware architecture known as HANNIBAL(Hardware Architecture for Neural Networks Implementing Back propagation Algorithm Learning). For implementing an efficient neural network hardware, we analyzed various type of multiplier which is major function block of neuro-processor cell. With this result, we design a efficient digital neural network hardware using serial/parallel multiplier, and test the operation. We also analyze the hardware efficiency with logic level simulation. (author). 14 refs., 10 figs., 3 tabs.

  15. VLSI Architectures for Sliding-Window-Based Space-Time Turbo Trellis Code Decoders

    Directory of Open Access Journals (Sweden)

    Georgios Passas

    2012-01-01

    Full Text Available The VLSI implementation of SISO-MAP decoders used for traditional iterative turbo coding has been investigated in the literature. In this paper, a complete architectural model of a space-time turbo code receiver that includes elementary decoders is presented. These architectures are based on newly proposed building blocks such as a recursive add-compare-select-offset (ACSO unit, A-, B-, Γ-, and LLR output calculation modules. Measurements of complexity and decoding delay of several sliding-window-technique-based MAP decoder architectures and a proposed parameter set lead to defining equations and comparison between those architectures.

  16. New Metric Based Algorithm for Test Vector Generation in VLSI Testing

    Directory of Open Access Journals (Sweden)

    M. V. Atre

    1995-07-01

    Full Text Available A new algorithm for test-vector-generation (TVG for combinational circuits has been presented for testing VLSI chips. This is done by defining a suitable metric or distance, in the space of all input vectors, between a vector and a set of vectors. The test vectors are generated by suitably maximising the above distance. Two different methods of maximising the distance are suggested. Performances of the two methods for different circuits are presented and compared with the random method of TVG. It was observed that method B is superior to the other two methods. Also, method A is slightly better than method R.

  17. Wide-range, picoampere-sensitivity multichannel VLSI potentiostat for neurotransmitter sensing.

    Science.gov (United States)

    Murari, Kartikeya; Thakor, Nitish; Stanacevic, Milutin; Cauwenberghs, Gert

    2004-01-01

    Neurotransmitter sensing is critical in studying nervous pathways and neurological disorders. A 16-channel current-measuring VLSI potentiostat with multiple ranges from picoamperes to microamperes is presented for electrochemical detection of electroactive neurotransmitters like dopamine, nitric oxide etc. The analog-to-digital converter design employs a current-mode, first-order single-bit delta-sigma modulator architecture with a two-stage, digitally reconfigurable oversampling ratio for ranging the conversion scale. An integrated prototype is fabricated in CMOS technology, and experimentally characterized. Real-time multi-channel acquisition of dopamine concentration in vitro is performed with a microfabricated sensor array.

  18. VLSI (Very Large Scale Integration) Design Tools, Reference Manual, Release 3.0.

    Science.gov (United States)

    1985-08-01

    purpose of the Consortium is to advance the state of the art in VLSI technology and to transfer this technology between industry and the university...it is passed to Lyra with the -r switch to indicate a specific ruleset. Otherwise, the current technology is used as the ruleset. sacro < character...symbols art aligned so that the symbolic point n1 on the top of si is adjacent to the symbolic point n2 on the bottom of s2. Both points are taken to be

  19. VLSI Structure for an All Digital Receiver for CDMA PABX Handset

    Institute of Scientific and Technical Information of China (English)

    ZhouShidong; BiGuangguo

    1995-01-01

    In this paper,a VLSI architecture of a CDMA receiver is put forward for wirelesss PABX handset.To meet the critically low cost and power consumption requirement with neglectable per-formance degradation,some new techniques are employed to reduce hardware complexity,including base band processing,chip-rate sampling,low ADC resolution,absolute value detector,double branch acquisition ,and modified carrier phase compensation.Performance of experimental system fits well with theoretical predition ,and the practical SNR lose compared with ideal reception is about 2-3dB.

  20. Performance Analysis of Low Power, High Gain Operational Amplifier Using CMOS VLSI Design

    Directory of Open Access Journals (Sweden)

    Ankush S. Patharkar

    2014-07-01

    Full Text Available The operational amplifier is one of the most useful and important component of analog electronics. They are widely used in popular electronics. Their primary limitation is that they are not especially fast. The typical performance degrades rapidly for frequencies greater than about 1 MHz, although some models are designed specifically to handle higher frequencies. The primary use of op-amps in amplifier and related circuits is closely connected to the concept of negative feedback. The operational amplifier has high gain, high input impedance and low output impedance. Here the operational amplifier designed by using CMOS VLSI technology having low power consumption and high gain.

  1. Control of autonomous mobile robots using custom-designed qualitative reasoning VLSI chips and boards

    Energy Technology Data Exchange (ETDEWEB)

    Pin, F.G.; Pattay, R.S.

    1991-01-01

    Two types of computer boards including custom-designed VLSI chips have been developed to provide a qualitative reasoning capability for the real-time control of autonomous mobile robots. The design and operation of these boards are described and an example of application of qualitative reasoning for the autonomous navigation of a mobile robot in a-priori unknown environments is presented. Results concerning consistency and modularity in the development of qualitative reasoning schemes as well as the general applicability of these techniques to robotic control domains are also discussed. 17 refs., 4 figs.

  2. Vlsi implementation of flexible architecture for decision tree classification in data mining

    Science.gov (United States)

    Sharma, K. Venkatesh; Shewandagn, Behailu; Bhukya, Shankar Nayak

    2017-07-01

    The Data mining algorithms have become vital to researchers in science, engineering, medicine, business, search and security domains. In recent years, there has been a terrific raise in the size of the data being collected and analyzed. Classification is the main difficulty faced in data mining. In a number of the solutions developed for this problem, most accepted one is Decision Tree Classification (DTC) that gives high precision while handling very large amount of data. This paper presents VLSI implementation of flexible architecture for Decision Tree classification in data mining using c4.5 algorithm.

  3. Optical linear algebra processors - Architectures and algorithms

    Science.gov (United States)

    Casasent, David

    1986-01-01

    Attention is given to the component design and optical configuration features of a generic optical linear algebra processor (OLAP) architecture, as well as the large number of OLAP architectures, number representations, algorithms and applications encountered in current literature. Number-representation issues associated with bipolar and complex-valued data representations, high-accuracy (including floating point) performance, and the base or radix to be employed, are discussed, together with case studies on a space-integrating frequency-multiplexed architecture and a hybrid space-integrating and time-integrating multichannel architecture.

  4. Message-Driven Processor Architecture Version 11

    Science.gov (United States)

    1988-08-18

    UNCLASSIFIED . $CUUIT. v A$SIf9CAYON Or IMIS SAGE ’Whlken Dese E,...’lld) __ REPO_Or T CU NT PAGE ateREAD INSTRUCTIONS REPORT DOCUmtNTATION PAGE...fields instead of 2. This reflects the change in machine topology from 2D to 3D . Also, the NNR is no longer set to zero on a reset; it is left to...an X field, a Y field and a Z field indicating the position of the node in the 3D network grid. Its value identifies the processor on the network and

  5. Optical linear algebra processors - Architectures and algorithms

    Science.gov (United States)

    Casasent, David

    1986-01-01

    Attention is given to the component design and optical configuration features of a generic optical linear algebra processor (OLAP) architecture, as well as the large number of OLAP architectures, number representations, algorithms and applications encountered in current literature. Number-representation issues associated with bipolar and complex-valued data representations, high-accuracy (including floating point) performance, and the base or radix to be employed, are discussed, together with case studies on a space-integrating frequency-multiplexed architecture and a hybrid space-integrating and time-integrating multichannel architecture.

  6. Bio-Inspired Neural Model for Learning Dynamic Models

    Science.gov (United States)

    Duong, Tuan; Duong, Vu; Suri, Ronald

    2009-01-01

    A neural-network mathematical model that, relative to prior such models, places greater emphasis on some of the temporal aspects of real neural physical processes, has been proposed as a basis for massively parallel, distributed algorithms that learn dynamic models of possibly complex external processes by means of learning rules that are local in space and time. The algorithms could be made to perform such functions as recognition and prediction of words in speech and of objects depicted in video images. The approach embodied in this model is said to be "hardware-friendly" in the following sense: The algorithms would be amenable to execution by special-purpose computers implemented as very-large-scale integrated (VLSI) circuits that would operate at relatively high speeds and low power demands.

  7. The ISS Water Processor Catalytic Reactor as a Post Processor for Advanced Water Reclamation Systems

    Science.gov (United States)

    Nalette, Tim; Snowdon, Doug; Pickering, Karen D.; Callahan, Michael

    2007-01-01

    Advanced water processors being developed for NASA s Exploration Initiative rely on phase change technologies and/or biological processes as the primary means of water reclamation. As a result of the phase change, volatile compounds will also be transported into the distillate product stream. The catalytic reactor assembly used in the International Space Station (ISS) water processor assembly, referred to as Volatile Removal Assembly (VRA), has demonstrated high efficiency oxidation of many of these volatile contaminants, such as low molecular weight alcohols and acetic acid, and is considered a viable post treatment system for all advanced water processors. To support this investigation, two ersatz solutions were defined to be used for further evaluation of the VRA. The first solution was developed as part of an internal research and development project at Hamilton Sundstrand (HS) and is based primarily on ISS experience related to the development of the VRA. The second ersatz solution was defined by NASA in support of a study contract to Hamilton Sundstrand to evaluate the VRA as a potential post processor for the Cascade Distillation system being developed by Honeywell. This second ersatz solution contains several low molecular weight alcohols, organic acids, and several inorganic species. A range of residence times, oxygen concentrations and operating temperatures have been studied with both ersatz solutions to provide addition performance capability of the VRA catalyst.

  8. Compact optical processor for Hough and frequency domain features

    Science.gov (United States)

    Ott, Peter

    1996-11-01

    Shape recognition is necessary in a broad band of applications such as traffic sign or work piece recognition. It requires not only neighborhood processing of the input image pixels but global interconnection of them. The Hough transform (HT) performs such a global operation and it is well suited in the preprocessing stage of a shape recognition system. Translation invariant features can be easily calculated form the Hough domain. We have implemented on the computer a neural network shape recognition system which contains a HT, a feature extraction, and a classification layer. The advantage of this approach is that the total system can be optimized with well-known learning techniques and that it can explore the parallelism of the algorithms. However, the HT is a time consuming operation. Parallel, optical processing is therefore advantageous. Several systems have been proposed, based on space multiplexing with arrays of holograms and CGH's or time multiplexing with acousto-optic processors or by image rotation with incoherent and coherent astigmatic optical processors. We took up the last mentioned approach because 2D array detectors are read out line by line, so a 2D detector can achieve the same speed and is easier to implement. Coherent processing can allow the implementation of tilers in the frequency domain. Features based on wedge/ring, Gabor, or wavelet filters have been proven to show good discrimination capabilities for texture and shape recognition. The astigmatic lens system which is derived form the mathematical formulation of the HT is long and contains a non-standard, astigmatic element. By methods of lens transformation s for coherent applications we map the original design to a shorter lens with a smaller number of well separated standard elements and with the same coherent system response. The final lens design still contains the frequency plane for filtering and ray-tracing shows diffraction limited performance. Image rotation can be done

  9. Retargetable Code Generation based on Structural Processor Descriptions

    OpenAIRE

    Leupers, Rainer; Marwedel, Peter

    1998-01-01

    Design automation for embedded systems comprising both hardware and software components demands for code generators integrated into electronic CAD systems. These code generators provide the necessary link between software synthesis tools in HW/SW codesign systems and embedded processors. General-purpose compilers for standard processors are often insufficient, because they do not provide flexibility with respect to different target processors and also suffer from inferior code quality....

  10. User microprogrammable processors for high data rate telemetry preprocessing

    Science.gov (United States)

    Pugsley, J. H.; Ogrady, E. P.

    1973-01-01

    The use of microprogrammable processors for the preprocessing of high data rate satellite telemetry is investigated. The following topics are discussed along with supporting studies: (1) evaluation of commercial microprogrammable minicomputers for telemetry preprocessing tasks; (2) microinstruction sets for telemetry preprocessing; and (3) the use of multiple minicomputers to achieve high data processing. The simulation of small microprogrammed processors is discussed along with examples of microprogrammed processors.

  11. MPC Related Computational Capabilities of ARMv7A Processors

    DEFF Research Database (Denmark)

    Frison, Gianluca; Jørgensen, John Bagterp

    2015-01-01

    In recent years, the mass market of mobile devices has pushed the demand for increasingly fast but cheap processors. ARM, the world leader in this sector, has developed the Cortex-A series of processors with focus on computationally intensive applications. If properly programmed, these processors...... are powerful enough to solve the complex optimization problems arising in MPC in real-time, while keeping the traditional low-cost and low-power consumption. This makes these processors ideal candidates for use in embedded MPC. In this paper, we investigate the floating-point capabilities of Cortex A7, A9...

  12. Dynamically Reconfigurable Processor for Floating Point Arithmetic

    Directory of Open Access Journals (Sweden)

    S. Anbumani,

    2014-01-01

    Full Text Available Recently, development of embedded processors is toward miniaturization and energy saving for ecology. On the other hand, high performance arithmetic circuits are required in a lot of application in science and technology. Dynamically reconfigurable processors have been developed to meet these requests. They can change circuit configuration according to instructions in program instantly during operations.This paper describes, a dynamically reconfigurable circuit for floating-point arithmetic is proposed. The arithmetic circuit consists of two single precision floating-point arithmetic circuits. It performs double precision floating-point arithmetic by reconfiguration. Dynamic reconfiguration changes circuit construction at one clock cycle during operation without stopping circuits. It enables reconfiguration of circuits in a few nano seconds. The proposed circuit is reconfigured in two modes. In first mode it performs one double precision floating-point arithmetic or else the circuit will perform two parallel operations of single precision floating-point arithmetic. The new system design reduces implementation area by reconfiguring common parts of each operation. It also increases the processing speed with a very little number of clocks.

  13. Speed Scaling on Parallel Processors with Migration

    CERN Document Server

    Angel, Eric; Kacem, Fadi; Letsios, Dimitrios

    2011-01-01

    We study the problem of scheduling a set of jobs with release dates, deadlines and processing requirements (or works), on parallel speed-scaled processors so as to minimize the total energy consumption. We consider that both preemption and migration of jobs are allowed. An exact polynomial-time algorithm has been proposed for this problem, which is based on the Ellipsoid algorithm. Here, we formulate the problem as a convex program and we propose a simpler polynomial-time combinatorial algorithm which is based on a reduction to the maximum flow problem. Our algorithm runs in $O(nf(n)logP)$ time, where $n$ is the number of jobs, $P$ is the range of all possible values of processors' speeds divided by the desired accuracy and $f(n)$ is the complexity of computing a maximum flow in a layered graph with O(n) vertices. Independently, Albers et al. \\cite{AAG11} proposed an $O(n^2f(n))$-time algorithm exploiting the same relation with the maximum flow problem. We extend our algorithm to the multiprocessor speed scal...

  14. Coordinated Energy Management in Heterogeneous Processors

    Directory of Open Access Journals (Sweden)

    Indrani Paul

    2014-01-01

    Full Text Available This paper examines energy management in a heterogeneous processor consisting of an integrated CPU–GPU for high-performance computing (HPC applications. Energy management for HPC applications is challenged by their uncompromising performance requirements and complicated by the need for coordinating energy management across distinct core types – a new and less understood problem. We examine the intra-node CPU–GPU frequency sensitivity of HPC applications on tightly coupled CPU–GPU architectures as the first step in understanding power and performance optimization for a heterogeneous multi-node HPC system. The insights from this analysis form the basis of a coordinated energy management scheme, called DynaCo, for integrated CPU–GPU architectures. We implement DynaCo on a modern heterogeneous processor and compare its performance to a state-of-the-art power- and performance-management algorithm. DynaCo improves measured average energy-delay squared (ED2 product by up to 30% with less than 2% average performance loss across several exascale and other HPC workloads.

  15. Broadband monitoring simulation with massively parallel processors

    Science.gov (United States)

    Trubetskov, Mikhail; Amotchkina, Tatiana; Tikhonravov, Alexander

    2011-09-01

    Modern efficient optimization techniques, namely needle optimization and gradual evolution, enable one to design optical coatings of any type. Even more, these techniques allow obtaining multiple solutions with close spectral characteristics. It is important, therefore, to develop software tools that can allow one to choose a practically optimal solution from a wide variety of possible theoretical designs. A practically optimal solution provides the highest production yield when optical coating is manufactured. Computational manufacturing is a low-cost tool for choosing a practically optimal solution. The theory of probability predicts that reliable production yield estimations require many hundreds or even thousands of computational manufacturing experiments. As a result reliable estimation of the production yield may require too much computational time. The most time-consuming operation is calculation of the discrepancy function used by a broadband monitoring algorithm. This function is formed by a sum of terms over wavelength grid. These terms can be computed simultaneously in different threads of computations which opens great opportunities for parallelization of computations. Multi-core and multi-processor systems can provide accelerations up to several times. Additional potential for further acceleration of computations is connected with using Graphics Processing Units (GPU). A modern GPU consists of hundreds of massively parallel processors and is capable to perform floating-point operations efficiently.

  16. Efficiency of Cache Mechanism for Network Processors

    Institute of Scientific and Technical Information of China (English)

    XU Bo; CHANG Jian; HUANG Shimeng; XUE Yibo; LI Jun

    2009-01-01

    With the explosion of network bandwidth and the ever-changing requirements for diverse net-work-based applications, the traditional processing architectures, i.e., general purpose processor (GPP) and application specific integrated circuits (ASIC) cannot provide sufficient flexibility and high performance at the same time. Thus, the network processor (NP) has emerged as an altemative to meet these dual demands for today's network processing. The NP combines embedded multi-threaded cores with a dch memory hierarchy that can adapt to different networking circumstances when customized by the application developers. In to-day's NP architectures, muitithreading prevails over cache mechanism, which has achieved great success in GPP to hide memory access latencies. This paper focuses on the efficiency of the cache mechanism in an NP. Theoretical timing models of packet processing are established for evaluating cache efficiency and experi-ments are performed based on real-life network backbone traces. Testing results show that an improvement of neady 70% can be gained in throughput with assistance from the cache mechanism. Accordingly, the cache mechanism is still efficient and irreplaceable in network processing, despite the existing of multithreading.

  17. The ATLAS Level-1 Central Trigger Processor

    CERN Document Server

    Pauly, T; Ellis, Nick; Farthouat, P; Gällnö, P; Haller, J; Krasznahorkay, A; Maeno, T; Pessoa-Lima, H; Resurreccion-Arcas, I; Schuler, G; De Seixas, J M; Spiwoks, R; Torga-Teixeira, R; Wengler, T; 14th IEEE-NPSS Real Time Conference 2005

    2005-01-01

    ATLAS is a multi-purpose particle physics detector at CERN’s Large Hadron Collider where two pulsed beams of protons are brought to collision at very high energy. There are collisions every 25 ns, corresponding to a rate of 40 MHz. A three-level trigger system reduces this rate to about 200 Hz while keeping bunch crossings which potentially contain interesting processes. The Level-1 trigger, implemented in electronics and firmware, makes an initial selection in under 2.5 us with an output rate of less than 100 kHz. A key element of this is the Central Trigger Processor (CTP) which combines trigger information from the calorimeter and muon trigger processors to make the final Level-1 accept decision in under 100 ns on the basis of lists of selection criteria, implemented as a trigger menu. Timing and trigger signals are fanned out to all sub-detectors, while busy signals from all sub-detector read-out systems are collected and fed into the CTP in order to throttle the generation of Level-1 triggers.

  18. Neural network post-processing of grayscale optical correlator

    Science.gov (United States)

    Lu, Thomas T; Hughlett, Casey L.; Zhoua, Hanying; Chao, Tien-Hsin; Hanan, Jay C.

    2005-01-01

    In this paper we present the use of a radial basis function neural network (RBFNN) as a post-processor to assist the optical correlator to identify the objects and to reject false alarms. Image plane features near the correlation peaks are extracted and fed to the neural network for analysis. The approach is capable of handling large number of object variations and filter sets. Preliminary experimental results are presented and the performance is analyzed.

  19. Area Efficient 3.3GHZ Phase Locked Loop with Four Multiple Output Using 45NM VLSI Technology

    Directory of Open Access Journals (Sweden)

    Ms. Ujwala A. Belorkar

    2011-03-01

    Full Text Available This paper present area efficient layout designs for 3.3GigaHertz (GHz Phase Locked loop (PLL withfour multiple output. Effort has been taken to design Low Power Phase locked loop with multiple output,using VLSI technology. VLSI Technology includes process design, trends, chip fabrication, real circuitparameters, circuit design, electrical characteristics, configuration building blocks, switching circuitry,translation onto silicon, CAD and practical experience in layout design. The proposed PLL is designedusing 45 nm CMOS/VLSI technology with microwind 3.1. This software allows designing and simulatingan integrated circuit at physical description level. The main novelties related to the 45 nm technology arethe high-k gate oxide, metal gate and very low-k interconnect dielectric. The effective gate lengthrequired for 45 nm technology is 25nm. Low Power (0.211miliwatt phase locked loop with four multipleoutputs as PLL8x, PLL4x, PLL2x, & PLL1x of 3.3 GHz, 1.65 GHz, 0.825 GHz, and 0.412 GHzrespectively is obtained using 45 nm VLSI technology.

  20. New VLSI smart sensor for collision avoidance inspired by insect vision

    Science.gov (United States)

    Abbott, Derek; Moini, Alireza; Yakovleff, Andre; Nguyen, X. Thong; Blanksby, Andrew; Kim, Gyudong; Bouzerdoum, Abdesselam; Bogner, Robert E.; Eshraghian, Kamran

    1995-01-01

    An analog VLSI implementation of a smart microsensor that mimics the early visual processing stage in insects is described with an emphasis on the overall concept and the front- end detection. The system employs the `smart sensor' paradigm in that the detectors and processing circuitry are integrated on the one chip. The integrated circuit is composed of sixty channels of photodetectors and parallel processing elements. The photodetection circuitry includes p-well junction diodes on a 2 micrometers CMOS process and a logarithmic compression to increase the dynamic range of the system. The future possibility of gallium arsenide implementation is discussed. The processing elements behind each photodetector contain a low frequency differentiator where subthreshold design methods have been used. The completed IC is ideal for motion detection, particularly collision avoidance tasks, as it essentially detects distance, speed & bearing of an object. The Horridge Template Model for insect vision has been directly mapped into VLSI and therefore the IC truly exploits the beauty of nature in that the insect eye is so compact with parallel processing, enabling compact motion detection without the computational overhead of intensive imaging, full image extraction and interpretation. This world-first has exciting applications in the areas of automobile anti- collision, IVHS, autonomous robot guidance, aids for the blind, continuous process monitoring/web inspection and automated welding, for example.

  1. VLSI circuit techniques and technologies for ultrahigh speed data conversion interfaces

    Science.gov (United States)

    Wooley, Bruce A.

    1991-04-01

    The performance of digital VLSI signal processing and communications systems is often limited by the data conversion interfaces between digital system-level components and the analog environment in which those components are embedded. The focus of this program has been research into the fundamental nature of such interfaces in systems that digitally process high-bandwidth signals for purposes such as radar imaging, high-resolution graphics, high-definition video, mobile and fiber-optic communications, and broadband instrumentation. Effort has been devoted to the study of both generic circuit functions, such as sampling and comparison, and architectural alternatives relevant to the implementation of high-speed data converters in present and emerging VLSI technologies. Specific results of the research include the design and realization of novel low-power CMOS and BiCMOS sampled-data comparators operating at rates as high as 200 MHz, the exploration of various design approaches to the implementation of high-speed sample-and-hold circuits in CMOS and BiCMOS technologies, and the design of a subranging CMOS analog-to-digital converter that provides 12-bit resolution at a conversion rate of 10 MHz.

  2. A multi coding technique to reduce transition activity in VLSI circuits

    Science.gov (United States)

    Vithyalakshmi, N.; Rajaram, M.

    2014-02-01

    Advances in VLSI technology have enabled the implementation of complex digital circuits in a single chip, reducing system size and power consumption. In deep submicron low power CMOS VLSI design, the main cause of energy dissipation is charging and discharging of internal node capacitances due to transition activity. Transition activity is one of the major factors that also affect the dynamic power dissipation. This paper proposes power reduction analyzed through algorithm and logic circuit levels. In algorithm level the key aspect of reducing power dissipation is by minimizing transition activity and is achieved by introducing a data coding technique. So a novel multi coding technique is introduced to improve the efficiency of transition activity up to 52.3% on the bus lines, which will automatically reduce the dynamic power dissipation. In addition, 1 bit full adders are introduced in the Hamming distance estimator block, which reduces the device count. This coding method is implemented using Verilog HDL. The overall performance is analyzed by using Modelsim and Xilinx Tools. In total 38.2% power saving capability is achieved compared to other existing methods.

  3. Design of Low Power Phase Locked Loop (PLL Using 45NM VLSI Technology

    Directory of Open Access Journals (Sweden)

    Ms. Ujwala A. Belorkar

    2010-06-01

    Full Text Available Power has become one of the most important paradigms of design convergence for multigigahertz communication systems such as optical data links, wireless products, microprocessor &ASIC/SOC designs. POWER consumption has become a bottleneck in microprocessor design. The coreof a microprocessor, which includes the largest power density on the microprocessor. In an effort toreduce the power consumption of the circuit, the supply voltage can be reduced leading to reduction ofdynamic and static power consumption. Lowering the supply voltage, however, also reduces theperformance of the circuit, which is usually unacceptable. One way to overcome this limitation, availablein some application domains, is to replicate the circuit block whose supply voltage is being reduced inorder to maintain the same throughput .This paper introduces a design aspects for low power phaselocked loop using VLSI technology. This phase locked loop is designed using latest 45nm processtechnology parameters, which in turn offers high speed performance at low power. The main noveltyrelated to the 45nm technology such as the high-k gate oxide ,metal-gate and very low-k interconnectdielectric described. VLSI Technology includes process design, trends, chip fabrication, real circuitparameters, circuit design, electrical characteristics, configuration building blocks, switching circuitry,translation onto silicon, CAD, practical experience in layout design

  4. VLSI architecture of a K-best detector for MIMO-OFDM wireless communication systems

    Energy Technology Data Exchange (ETDEWEB)

    Jian Haifang; Shi Yin, E-mail: jhf@semi.ac.c [Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083 (China)

    2009-07-15

    The K-best detector is considered as a promising technique in the MIMO-OFDM detection because of its good performance and low complexity. In this paper, a new K-best VLSI architecture is presented. In the proposed architecture, the metric computation units (MCUs) expand each surviving path only to its partial branches, based on the novel expansion scheme, which can predetermine the branches' ascending order by their local distances. Then a distributed sorter sorts out the new K surviving paths from the expanded branches in pipelines. Compared to the conventional K-best scheme, the proposed architecture can approximately reduce fundamental operations by 50% and 75% for the 16-QAM and the 64-QAM cases, respectively, and, consequently, lower the demand on the hardware resource significantly. Simulation results prove that the proposed architecture can achieve a performance very similar to conventional K-best detectors. Hence, it is an efficient solution to the K-best detector's VLSI implementation for high-throughput MIMO-OFDM systems.

  5. Neural simulations on multi-core architectures

    Directory of Open Access Journals (Sweden)

    Hubert Eichner

    2009-07-01

    Full Text Available Neuroscience is witnessing increasing knowledge about the anatomy and electrophysiological properties of neurons and their connectivity, leading to an ever increasing computational complexity of neural simulations. At the same time, a rather radical change in personal computer technology emerges with the establishment of multi-cores: high-density, explicitly parallel processor architectures for both high performance as well as standard desktop computers. This work introduces strategies for the parallelization of biophysically realistic neural simulations based on the compartmental modeling technique and results of such an implementation, with a strong focus on multi-core architectures and automation, i. e. user-transparent load balancing.

  6. GA103 a microprogrammable processor for online filtering

    CERN Document Server

    Calzas, A; Danon, G

    1981-01-01

    GA103 is a 16 bit microprogrammable processor, which emulates the PDP 11 instruction set. It is based on the Am2900 slices. It allows user- implemented microinstructions and addition of hardwired processors. It will perform online filtering tasks in the NA14 experiment at CERN, based on the reconstruction of transverse momentum of photons detected in a lead glass calorimeter. (3 refs).

  7. Expert System Constant False Alarm Rate (CFAR) Processor

    Science.gov (United States)

    2006-09-01

    Processor has been developed on a Sun Sparc Station 4/470 using a commercial-off-the-shelf software development package called G2 by Gensym Corporation...size of the training data set. A prototype expert system CFAR Processor has been presented which applies artificial intelligence to CFAR detection

  8. Digital Signal Processor System for AC Power Drivers

    OpenAIRE

    Ovidiu Neamtu

    2009-01-01

    DSP (Digital Signal Processor) is the bestsolution for motor control systems to make possible thedevelopment of advanced motor drive systems. The motorcontrol processor calculates the required motor windingvoltage magnitude and frequency to operate the motor atthe desired speed. A PWM (Pulse Width Modulation)circuit controls the on and off duty cycle of the powerinverter switches to vary the magnitude of the motorvoltages.

  9. Digital Signal Processor System for AC Power Drivers

    Directory of Open Access Journals (Sweden)

    Ovidiu Neamtu

    2009-10-01

    Full Text Available DSP (Digital Signal Processor is the bestsolution for motor control systems to make possible thedevelopment of advanced motor drive systems. The motorcontrol processor calculates the required motor windingvoltage magnitude and frequency to operate the motor atthe desired speed. A PWM (Pulse Width Modulationcircuit controls the on and off duty cycle of the powerinverter switches to vary the magnitude of the motorvoltages.

  10. High speed matrix processors using floating point representation

    Energy Technology Data Exchange (ETDEWEB)

    Birkner, D.A.

    1980-01-01

    The author describes the architecture of a high-speed matrix processor which uses a floating-point format for data representation. It is shown how multipliers and other LSI devices are used in the design to obtain the high speed of the processor.

  11. Temporal Partitioning and Multi-Processor Scheduling for Reconfigurable Architectures

    DEFF Research Database (Denmark)

    Popp, Andreas; Le Moullec, Yannick; Koch, Peter

    This poster presentation outlines a proposed framework for handling mapping of signal processing applications to heterogeneous reconfigurable architectures. The methodology consists of an extension to traditional multi-processor scheduling by creating a separate HW track for generation of groups...... of tasks that are handled similarly to SW processes in a traditional multi-processor scheduling context....

  12. Message Passing on a Time-predictable Multicore Processor

    DEFF Research Database (Denmark)

    Sørensen, Rasmus Bo; Puffitsch, Wolfgang; Schoeberl, Martin

    2015-01-01

    Real-time systems need time-predictable computing platforms. For a multicore processor to be time-predictable, communication between processor cores needs to be time-predictable as well. This paper presents a time-predictable message-passing library for such a platform. We show how to build up...

  13. A Simple and Affordable TTL Processor for the Classroom

    Science.gov (United States)

    Feinberg, Dave

    2007-01-01

    This paper presents a simple 4 bit computer processor design that may be built using TTL chips for less than $65. In addition to describing the processor itself in detail, we discuss our experience using the laboratory kit and its associated machine instruction set to teach computer architecture to high school students. (Contains 3 figures and 5…

  14. Bibliographic Pattern Matching Using the ICL Distributed Array Processor.

    Science.gov (United States)

    Carroll, David M.; And Others

    1988-01-01

    Describes the use of a highly parallel array processor for pattern matching operations in a bibliographic retrieval system. The discussion covers the hardware and software features of the processor, the pattern matching algorithm used, and the results of experimental tests of the system. (37 references) (Author/CLB)

  15. Designing a dataflow processor using CλaSH

    NARCIS (Netherlands)

    Niedermeier, Anja; Wester, Rinse; Rovers, Kenneth; Baaij, Christiaan; Kuper, Jan; Smit, Gerard

    2010-01-01

    In this paper we show how a simple dataflow processor can be fully implemented using CλaSH, a high level HDL based on the functional programming language Haskell. The processor was described using Haskell, the CλaSH compiler was then used to translate the design into a fully synthesisable VHDL code.

  16. EARLY EXPERIENCE WITH A HYBRID PROCESSOR: K-MEANS CLUSTERING

    Energy Technology Data Exchange (ETDEWEB)

    M. GOKHALE; ET AL

    2001-02-01

    We discuss hardware/software coprocessing on a hybrid processor for a compute- and data-intensive hyper-spectral imaging algorithm, K-Means Clustering. The experiments are performed on the Altera Excalibur board using the soft IP core 32-bit NIOS RISC processor. In our experiments, we compare performance of the sequential algorithm with two different accelerated versions. We consider granularity and synchronization issues when mapping an algorithm to a hybrid processor. Our results show that on the Excalibur NIOS, a 15% speedup can be achieved over the sequential algorithm on images with 8 spectral bands where the pixels are divided into 8 categories. Speedup is limited by the communication cost of transferring data from external memory through the NIOS processor to the customized circuits. Our results indicate that future hybrid processors must either (1) have a clock rate 10X the speed of the configurable logic circuits or (2) include dual port memories that both the processor and configurable logic can access. If either of these conditions is met, the hybrid processor will show a factor of 10 speedup over the sequential algorithm. Such systems will combine the convenience of conventional processors with the speed of configurable logic.

  17. Evaluation of the Intel Sandy Bridge-EP server processor

    CERN Document Server

    Jarp, S; Leduc, J; Nowak, A; CERN. Geneva. IT Department

    2012-01-01

    In this paper we report on a set of benchmark results recently obtained by CERN openlab when comparing an 8-core “Sandy Bridge-EP” processor with Intel’s previous microarchitecture, the “Westmere-EP”. The Intel marketing names for these processors are “Xeon E5-2600 processor series” and “Xeon 5600 processor series”, respectively. Both processors are produced in a 32nm process, and both platforms are dual-socket servers. Multiple benchmarks were used to get a good understanding of the performance of the new processor. We used both industry-standard benchmarks, such as SPEC2006, and specific High Energy Physics benchmarks, representing both simulation of physics detectors and data analysis of physics events. Before summarizing the results we must stress the fact that benchmarking of modern processors is a very complex affair. One has to control (at least) the following features: processor frequency, overclocking via Turbo mode, the number of physical cores in use, the use of logical cores ...

  18. Signal Processor for Spring8 Linac BPM

    CERN Document Server

    Yanagida, K; Dewa, H; Hanaki, H; Hori, T; Kobayashi, T; Mizuno, A; Sasaki, S; Suzuki, S; Takashima, T; Taniushi, T; Tomizawa, H

    2001-01-01

    A signal processor of the single shot BPM system consists of a narrow-band BPF unit, a detector unit, a P/H circuit, an S/H IC and a 16-bit ADC. The BPF unit extracts a pure 2856MHz RF signal component from a BPM and makes the pulse width longer than 100ns. The detector unit that includes a demodulating logarithmic amplifier is used to detect an S-band RF amplitude. A wide dynamic range of beam current has been achieved; 0.01 ~ 3.5nC for below 100ns input pulse width, or 0.06 ~ 20mA for above 100ns input pulse width. The maximum acquisition rate with a VME system has been achieved up to 1kHz.

  19. Efficient quantum walk on a quantum processor

    Science.gov (United States)

    Qiang, Xiaogang; Loke, Thomas; Montanaro, Ashley; Aungskunsiri, Kanin; Zhou, Xiaoqi; O'Brien, Jeremy L.; Wang, Jingbo B.; Matthews, Jonathan C. F.

    2016-05-01

    The random walk formalism is used across a wide range of applications, from modelling share prices to predicting population genetics. Likewise, quantum walks have shown much potential as a framework for developing new quantum algorithms. Here we present explicit efficient quantum circuits for implementing continuous-time quantum walks on the circulant class of graphs. These circuits allow us to sample from the output probability distributions of quantum walks on circulant graphs efficiently. We also show that solving the same sampling problem for arbitrary circulant quantum circuits is intractable for a classical computer, assuming conjectures from computational complexity theory. This is a new link between continuous-time quantum walks and computational complexity theory and it indicates a family of tasks that could ultimately demonstrate quantum supremacy over classical computers. As a proof of principle, we experimentally implement the proposed quantum circuit on an example circulant graph using a two-qubit photonics quantum processor.

  20. Scaling the ion trap quantum processor.

    Science.gov (United States)

    Monroe, C; Kim, J

    2013-03-08

    Trapped atomic ions are standards for quantum information processing, serving as quantum memories, hosts of quantum gates in quantum computers and simulators, and nodes of quantum communication networks. Quantum bits based on trapped ions enjoy a rare combination of attributes: They have exquisite coherence properties, they can be prepared and measured with nearly 100% efficiency, and they are readily entangled with each other through the Coulomb interaction or remote photonic interconnects. The outstanding challenge is the scaling of trapped ions to hundreds or thousands of qubits and beyond, at which scale quantum processors can outperform their classical counterparts in certain applications. We review the latest progress and prospects in that effort, with the promise of advanced architectures and new technologies, such as microfabricated ion traps and integrated photonics.