WorldWideScience

Sample records for high bandwidth parallel

  1. Automatic high-bandwidth calibration and reconstruction of arbitrarily sampled parallel MRI.

    Directory of Open Access Journals (Sweden)

    Jan Aelterman

    Full Text Available Today, many MRI reconstruction techniques exist for undersampled MRI data. Regularization-based techniques inspired by compressed sensing allow for the reconstruction of undersampled data that would otherwise lead to an ill-posed reconstruction problem. Parallel imaging enables the reconstruction of MRI images from undersampled multi-coil data that leads to a well-posed reconstruction problem. Autocalibrating pMRI techniques are those pMRI techniques in which no explicit knowledge of the coil sensitivities is required. The first purpose of this paper is to derive a novel autocalibration approach for pMRI that allows for the estimation and use of smooth, but high-bandwidth, coil profiles instead of a compactly supported kernel. These high-bandwidth models adhere more accurately to the physics of an antenna system. The second purpose of this paper is to demonstrate the feasibility of a parameter-free reconstruction algorithm that combines autocalibrating pMRI and compressed sensing. To this end, we present several techniques for automatic parameter estimation in MRI reconstruction. Experiments show that higher reconstruction accuracy can be achieved using high-bandwidth coil models and that the automatic parameter choices yield acceptable results.

  2. Low latency, high bandwidth data communications between compute nodes in a parallel computer

    Science.gov (United States)

    Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2010-11-02

    Methods, parallel computers, and computer program products are disclosed for low latency, high bandwidth data communications between compute nodes in a parallel computer. Embodiments include receiving, by an origin direct memory access (`DMA`) engine of an origin compute node, data for transfer to a target compute node; sending, by the origin DMA engine of the origin compute node to a target DMA engine on the target compute node, a request to send (`RTS`) message; transferring, by the origin DMA engine, a predetermined portion of the data to the target compute node using a memory FIFO operation; determining, by the origin DMA engine, whether an acknowledgement of the RTS message has been received from the target DMA engine; if an acknowledgement of the RTS message has not been received, transferring, by the origin DMA engine, another predetermined portion of the data to the target compute node using a memory FIFO operation; and if the acknowledgement of the RTS message has been received by the origin DMA engine, transferring, by the origin DMA engine, any remaining portion of the data to the target compute node using a direct put operation.
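
    As a rough illustration of the control flow described in this record, the sketch below (Python, all names hypothetical, not the patented implementation) eagerly pushes memory-FIFO chunks until the RTS acknowledgement arrives and then hands the remainder to a direct put:

        CHUNK = 4096  # bytes per memory-FIFO operation (assumed size)

        def origin_transfer(data: bytes, send_rts, fifo_send, direct_put, ack_received) -> None:
            """Origin-side flow: RTS, eager FIFO chunks, then a direct put for the rest."""
            send_rts()                                  # request to send (RTS) to the target DMA engine
            offset = 0
            while offset < len(data) and not ack_received():
                fifo_send(data[offset:offset + CHUNK])  # predetermined portion via memory FIFO
                offset += CHUNK
            if offset < len(data):
                direct_put(data[offset:])               # remaining portion via direct put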

  3. Design, analysis and testing of a parallel-kinematic high-bandwidth XY nanopositioning stage

    Energy Technology Data Exchange (ETDEWEB)

    Li, Chun-Xia; Gu, Guo-Ying; Yang, Mei-Ju; Zhu, Li-Min, E-mail: zhulm@sjtu.edu.cn [State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200240 (China)

    2013-12-15

    This paper presents the design, analysis, and testing of a parallel-kinematic high-bandwidth XY nanopositioning stage driven by piezoelectric stack actuators. The stage is designed with two kinematic chains. In each kinematic chain, the end-effector of the stage is connected to the base by two symmetrically distributed flexure modules, respectively. Each flexure module comprises a fixed-fixed beam and a parallelogram flexure serving as two orthogonal prismatic joints. With the purpose to achieve high resonance frequencies of the stage, a novel center-thickened beam which has large stiffness is proposed to act as the fixed-fixed beam. The center-thickened beam also contributes to reducing cross-coupling and restricting parasitic motion. To decouple the motion in two axes totally, a symmetric configuration is adopted for the parallelogram flexures. Based on the analytical models established in static and dynamic analysis, the dimensions of the stage are optimized in order to maximize the first resonance frequency. Then finite element analysis is utilized to validate the design and a prototype of the stage is fabricated for performance tests. According to the results of static and dynamic tests, the resonance frequencies of the developed stage are over 13.6 kHz and the workspace is 11.2 μm × 11.6 μm with the cross-coupling between two axes less than 0.52%. It is clearly demonstrated that the developed stage has high resonance frequencies, a relatively large travel range, and nearly decoupled performance between two axes. For high-speed tracking performance tests, an inversion-based feedforward controller is implemented for the stage to compensate for the positioning errors caused by mechanical vibration. The experimental results show that good tracking performance at high speed is achieved, which validates the effectiveness of the developed stage.

  4. 3-D readout-electronics packaging for high-bandwidth massively paralleled imager

    Science.gov (United States)

    Kwiatkowski, Kris; Lyke, James

    2007-12-18

    Dense, massively parallel signal processing electronics are co-packaged behind associated sensor pixels. Microchips containing a linear or bilinear arrangement of photo-sensors, together with associated complex electronics, are integrated into a simple 3-D structure (a "mirror cube"). An array of photo-sensitive cells is disposed on a stacked CMOS chip's surface at a 45° angle from light-reflecting mirror surfaces formed on a neighboring CMOS chip surface. Image processing electronics are held within the stacked CMOS chip layers. Electrical connections couple each of the stacked CMOS chip layers to a distribution grid, distributing power and signals to the components associated with each stacked CMOS chip layer.

  5. Low latency, high bandwidth data communications between compute nodes in a parallel computer

    Energy Technology Data Exchange (ETDEWEB)

    Blocksome, Michael A

    2014-04-01

    Methods, systems, and products are disclosed for data transfers between nodes in a parallel computer that include: receiving, by an origin DMA on an origin node, a buffer identifier for a buffer containing data for transfer to a target node; sending, by the origin DMA to the target node, an RTS message; transferring, by the origin DMA, a data portion to the target node using a memory FIFO operation that specifies one end of the buffer from which to begin transferring the data; receiving, by the origin DMA, an acknowledgement of the RTS message from the target node; and transferring, by the origin DMA in response to receiving the acknowledgement, any remaining data portion to the target node using a direct put operation that specifies the other end of the buffer from which to begin transferring the data, including initiating the direct put operation without invoking an origin processing core.

  6. Low latency, high bandwidth data communications between compute nodes in a parallel computer

    Energy Technology Data Exchange (ETDEWEB)

    Blocksome, Michael A

    2014-04-22

    Methods, systems, and products are disclosed for data transfers between nodes in a parallel computer that include: receiving, by an origin DMA on an origin node, a buffer identifier for a buffer containing data for transfer to a target node; sending, by the origin DMA to the target node, an RTS message; transferring, by the origin DMA, a data portion to the target node using a memory FIFO operation that specifies one end of the buffer from which to begin transferring the data; receiving, by the origin DMA, an acknowledgement of the RTS message from the target node; and transferring, by the origin DMA in response to receiving the acknowledgement, any remaining data portion to the target node using a direct put operation that specifies the other end of the buffer from which to begin transferring the data, including initiating the direct put operation without invoking an origin processing core.
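
    Records 5 and 6 add the detail that the two phases consume the buffer from opposite ends, so the direct put can be initiated without involving an origin processing core. A hedged sketch of that bookkeeping (hypothetical names, not the patented code):

        CHUNK = 4096  # assumed memory-FIFO chunk size

        def origin_transfer_two_ended(buf: bytes, fifo_send, direct_put, ack_received) -> None:
            front = 0              # the memory-FIFO phase walks forward from one end of the buffer
            while front < len(buf) and not ack_received():
                fifo_send(buf[front:front + CHUNK])
                front += CHUNK
            if front < len(buf):
                # the direct put covers the untouched remainder, addressed from the buffer's other end
                direct_put(buf[front:len(buf)])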

  7. Efficiently parallelized modeling of tightly focused, large bandwidth laser pulses

    CERN Document Server

    Dumont, Joey; Lefebvre, Catherine; Gagnon, Denis; MacLean, Steve

    2016-01-01

    The Stratton-Chu integral representation of electromagnetic fields is used to study the spatio-temporal properties of large bandwidth laser pulses focused by high numerical aperture mirrors. We review the formal aspects of the derivation of diffraction integrals from the Stratton-Chu representation and discuss the use of the Hadamard finite part in the derivation of the physical optics approximation. By analyzing the formulation we show that, for the specific case of a parabolic mirror, the integrands involved in the description of the reflected field near the focal spot do not possess the strong oscillations characteristic of diffraction integrals. Consequently, the integrals can be evaluated with simple and efficient quadrature methods rather than with specialized, more costly approaches. We report on the development of an efficiently parallelized algorithm that evaluates the Stratton-Chu diffraction integrals for incident fields of arbitrary temporal and spatial dependence. We use our method to show that t...

  8. Efficiently parallelized modeling of tightly focused, large bandwidth laser pulses

    Science.gov (United States)

    Dumont, Joey; Fillion-Gourdeau, François; Lefebvre, Catherine; Gagnon, Denis; MacLean, Steve

    2017-02-01

    The Stratton-Chu integral representation of electromagnetic fields is used to study the spatio-temporal properties of large bandwidth laser pulses focused by high numerical aperture mirrors. We review the formal aspects of the derivation of diffraction integrals from the Stratton-Chu representation and discuss the use of the Hadamard finite part in the derivation of the physical optics approximation. By analyzing the formulation we show that, for the specific case of a parabolic mirror, the integrands involved in the description of the reflected field near the focal spot do not possess the strong oscillations characteristic of diffraction integrals. Consequently, the integrals can be evaluated with simple and efficient quadrature methods rather than with specialized, more costly approaches. We report on the development of an efficiently parallelized algorithm that evaluates the Stratton-Chu diffraction integrals for incident fields of arbitrary temporal and spatial dependence. This method has the advantage that its input is the unfocused field coming from the laser chain, which is experimentally known with high accuracy. We use our method to show that the reflection of a linearly polarized Gaussian beam of femtosecond duration off a high numerical aperture parabolic mirror induces ellipticity in the dominant field components and generates strong longitudinal components. We also estimate that future high-power laser facilities may reach intensities of 10²⁴ W cm⁻².

  9. Network Bandwidth Utilization Forecast Model on High Bandwidth Network

    Energy Technology Data Exchange (ETDEWEB)

    Yoo, Wucherl; Sim, Alex

    2014-07-07

    With the increasing number of geographically distributed scientific collaborations and the scale of the data size growth, it has become more challenging for users to achieve the best possible network performance on a shared network. We have developed a forecast model to predict expected bandwidth utilization for a high-bandwidth wide area network. The forecast model can improve the efficiency of resource utilization and the scheduling of data movements on high-bandwidth networks to accommodate the ever-increasing data volume for large-scale scientific data applications. A univariate model is developed with STL and ARIMA on SNMP path utilization data. Compared with a traditional approach such as the Box-Jenkins methodology, our forecast model reduces computation time by 83.2%. It also shows resilience against abrupt network usage changes. The accuracy of the forecast model is within the standard deviation of the monitored measurements.
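
    For orientation, a minimal sketch of such a univariate STL + ARIMA forecast in Python with statsmodels is shown below; the sampling period, seasonal period, and ARIMA order are assumptions, not values taken from the record:

        import pandas as pd
        from statsmodels.tsa.seasonal import STL
        from statsmodels.tsa.arima.model import ARIMA

        def forecast_utilization(series: pd.Series, steps: int = 24) -> pd.Series:
            """Forecast link utilization from an hourly SNMP utilization series."""
            decomp = STL(series, period=24, robust=True).fit()     # daily seasonality assumed
            deseasonalized = series - decomp.seasonal
            model = ARIMA(deseasonalized, order=(2, 1, 2)).fit()   # illustrative order
            forecast = model.forecast(steps)
            seasonal_cycle = decomp.seasonal[-24:].to_numpy()      # reuse the last seasonal cycle
            return forecast + seasonal_cycle[:steps]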

  10. High-Bandwidth, High-Efficiency Envelope Tracking Power Supply for 40W RF Power Amplifier Using Paralleled Bandpass Current Sources

    DEFF Research Database (Denmark)

    Høyerby, Mikkel Christian Wendelboe; Andersen, Michael Andreas E.

    2005-01-01

    This paper presents a high-performance power conversion scheme for power supply applications that require very high output voltage slew rates (dV/dt). The concept is to parallel two switching bandpass current sources, each optimized for its passband frequency space and the expected load current. The principle is demonstrated with a power supply designed to supply a 40 W linear RF power amplifier for efficient amplification of a 16-QAM modulated data stream...

  11. High-Bandwidth Hybrid Sensor (HYSENS) Project

    Data.gov (United States)

    National Aeronautics and Space Administration — ATA has demonstrated the primary innovation of combining a precision MEMS gyro (BAE SiRRS01) with a high bandwidth angular rate sensor, ATA's ARS-14 resulting in a...

  12. On the Bandwidth of High-Impedance Frequency Selective Surfaces

    CERN Document Server

    Costa, Filippo; Monorchio, Agostino; 10.1109/LAWP.2009.2038346

    2010-01-01

    In this letter, the bandwidth of high-impedance surfaces (HISs) is discussed by an equivalent circuit approach. Although these surfaces have been employed for almost 10 years, it is sometimes unclear how to choose the shape of the frequency selective surface (FSS) on the top of the grounded slab in order to achieve the largest possible bandwidth. Here, we will show that the conventional approach describing the HIS as a parallel connection between the inductance given by the grounded dielectric substrate and the capacitance of the FSS may yield inaccurate results in the determination of the operating bandwidth of the structure. Indeed, in order to derive a more complete model and to provide a more accurate estimate of the operating bandwidth, it is also necessary to introduce the series inductance of the FSS. We will present the explicit expression for defining the bandwidth of a HIS, and we will show that the reduction of the FSS inductance is the best choice for achieving a wide operating bandwidth in c...
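
    In LaTeX notation, the two circuit models contrasted in this letter can be sketched as follows (a textbook-style summary, not the authors' exact expressions), with L the grounded-substrate inductance, C the FSS capacitance, L_FSS the FSS series inductance, and \eta_0 \approx 377\,\Omega the free-space impedance:

        Z_{\mathrm{conventional}}(\omega) = \frac{j\omega L}{1-\omega^{2}LC},
        \qquad
        Z_{\mathrm{HIS}}(\omega) = \frac{j\omega L\left(j\omega L_{\mathrm{FSS}} + \frac{1}{j\omega C}\right)}
                                        {j\omega L + j\omega L_{\mathrm{FSS}} + \frac{1}{j\omega C}},
        \qquad
        \mathrm{BW}_{\mathrm{conventional}} \approx \frac{1}{\eta_0}\sqrt{\frac{L}{C}} .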

  13. VISA IB Ultra-High Bandwidth, High Gain SASE FEL

    CERN Document Server

    Andonian, Gerard; Murokh, Alex; Pellegrini, Claudio; Reiche, Sven; Rosenzweig, J B; Travish, Gil

    2004-01-01

    The results of a high energy-spread SASE FEL experiment, the intermediary experiment linking the VISA I and VISA II projects, are presented. A highly chirped beam (~1.7%) was transported without correction of longitudinal aberrations in the ATF dogleg, and injected into the VISA undulator. The output FEL radiation displayed an uncharacteristically large bandwidth (~11%) with extremely stable lasing and a measured energy of about 2 microjoules. Start-to-end simulations reproduce key features of the measured results and provide insight into the mechanisms giving rise to such a high bandwidth. These analyses are described as they relate to important considerations for the VISA II experiment.

  14. Fast Faraday Cup With High Bandwidth

    Science.gov (United States)

    Deibele, Craig E [Knoxville, TN

    2006-03-14

    A circuit card stripline Fast Faraday cup quantitatively measures the picosecond time structure of a charged particle beam. The stripline configuration maintains signal integrity, and stitching of the stripline increases the bandwidth. A calibration procedure ensures the measurement of the absolute charge and time structure of the charged particle beam.

  15. High-bandwidth hybrid quantum repeater.

    Science.gov (United States)

    Munro, W J; Van Meter, R; Louis, Sebastien G R; Nemoto, Kae

    2008-07-25

    We present a physical- and link-level design for the creation of entangled pairs to be used in quantum repeater applications where one can control the noise level of the initially distributed pairs. The system can tune dynamically, trading initial fidelity for success probability, from high fidelity pairs (F=0.98 or above) to moderate fidelity pairs. The same physical resources that create the long-distance entanglement are used to implement the local gates required for entanglement purification and swapping, creating a homogeneous repeater architecture. Optimizing the noise properties of the initially distributed pairs significantly improves the rate of generating long-distance Bell pairs. Finally, we discuss the performance trade-off between spatial and temporal resources.

  16. Simple High-Bandwidth Sideband Locking with Heterodyne Readout

    CERN Document Server

    Reinhardt, Christoph; Sankey, Jack C

    2016-01-01

    We present a robust sideband laser locking technique that is ideally suited for applications requiring low probe power and heterodyne readout. By feeding back to a high-bandwidth voltage controlled oscillator, we lock a first-order phase-modulation sideband to a table-top high-finesse Fabry-Perot cavity, achieving a feedback bandwidth of 3.5 MHz with a single integrator, limited fundamentally by the signal delay. The directly measured transfer function of the closed feedback loop agrees with a model assuming ideal system components, and from this we suggest a modified design that should realistically achieve a bandwidth exceeding 6 MHz with a near-causally limited feedback gain of 4 × 10⁷ at 1 kHz. The off-resonance optical carrier is used for alignment-free heterodyne readout, alleviating the need for a second laser or additional optical modulators.

  17. High speed and wide bandwidth delta-sigma ADCs

    CERN Document Server

    Bolatkale, Muhammed; Makinwa, Kofi A A

    2014-01-01

    This book describes techniques for realizing wide bandwidth (125 MHz) over-sampled analog-to-digital converters (ADCs) in nanometer-CMOS processes. The authors offer a clear and complete picture of system level challenges and practical design solutions in high-speed Delta-Sigma modulators. Readers will learn to implement ADCs as continuous-time delta-sigma (CTΔΣ) modulators, which offer simple resistive inputs that do not require power-hungry input buffers, as well as inherent anti-aliasing, which simplifies system integration. The authors focus on the design of high-speed, wide-bandwidth ΔΣMs that reach a bandwidth range previously only possible with Nyquist converters. More specifically, this book describes the stability, power efficiency, and linearity limits of ΔΣMs, aiming at a GHz sampling frequency. • Provides an overview of trends in wide-bandwidth and high-dynamic-range analog-to-digital converters (ADCs); • Enables the design of a wide band...

  18. High-bandwidth remote flat panel display interconnect system

    Science.gov (United States)

    Peterson, Darrel G.

    1999-08-01

    High performance electronic displays (CRT, AMLCD, TFEL, plasma, etc.) require wide bandwidth electrical drive signals to produce the desired display images. When the image generation and/or image processing circuitry is located within the same line replaceable unit (LRU) as the display media, the transmission of the display drive signals to the display media presents no unusual design problems. However, many aircraft cockpits are severely constrained for available space behind the instrument panel. This often forces the system designer to specify that only the display media and its immediate support circuitry are to be mounted in the instrument panel. A wide bandwidth interconnect system is then required to transfer image data from the display generation circuitry to the display unit. Image data transfer rates of nearly 1.5 Gbits/second may be required when displaying full motion video at a 60 Hz field rate. In addition to wide bandwidth, this interconnect system must exhibit several additional key characteristics: (1) Lossless transmission of image data; (2) High reliability and high integrity; (3) Ease of installation and field maintenance; (4) High immunity to HIRF and electrical noise; (5) Low EMI emissions; (6) Long term supportability; and (7) Low acquisition and maintenance cost. Rockwell Collins has developed an avionics grade remote display interconnect system based on the American National Standards Institute Fibre Channel standard which meets these requirements. Readily available low cost commercial off the shelf (COTS) components are utilized, and qualification tests have confirmed system performance.
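
    The quoted 1.5 Gbit/s figure is consistent with a simple payload estimate; the display resolution below is an assumption for illustration, not a value from the record:

        1024 \times 768 \ \text{pixels} \times 24\ \tfrac{\text{bit}}{\text{pixel}} \times 60\ \text{Hz} \approx 1.13\ \text{Gbit/s},

    and the 8b/10b line coding used by Fibre Channel adds 25% to the serial rate, giving roughly 1.4 Gbit/s before protocol overhead.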

  19. Managing high-bandwidth real-time data storage

    Energy Technology Data Exchange (ETDEWEB)

    Bigelow, David D. [Los Alamos National Laboratory; Brandt, Scott A [Los Alamos National Laboratory; Bent, John M [Los Alamos National Laboratory; Chen, Hsing-Bung [Los Alamos National Laboratory

    2009-09-23

    There exist certain systems which generate real-time data at high bandwidth, but do not necessarily require the long-term retention of that data in normal conditions. In some cases, the data may not actually be useful, and in others, there may be too much data to permanently retain in long-term storage whether it is useful or not. However, certain portions of the data may be identified as being vitally important from time to time, and must therefore be retained for further analysis or permanent storage without interrupting the ongoing collection of new data. We have developed a system, Mahanaxar, intended to address this problem. It provides quality of service guarantees for incoming real-time data streams and simultaneous access to already-recorded data on a best-effort basis utilizing any spare bandwidth. It has built-in mechanisms for reliability and indexing, can scale upwards to meet increasing bandwidth requirements, and handles both small and large data elements equally well. We will show that a prototype version of this system provides better performance than a flat file (traditional filesystem) based version, particularly with regard to quality of service guarantees and hard real-time requirements.

  20. Ultra-high bandwidth quantum secured data transmission

    Science.gov (United States)

    Dynes, James F.; Tam, Winci W.-S.; Plews, Alan; Fröhlich, Bernd; Sharpe, Andrew W.; Lucamarini, Marco; Yuan, Zhiliang; Radig, Christian; Straw, Andrew; Edwards, Tim; Shields, Andrew J.

    2016-10-01

    Quantum key distribution (QKD) provides an attractive means for securing communications in optical fibre networks. However, deployment of the technology has been hampered by the frequent need for dedicated dark fibres to segregate the very weak quantum signals from conventional traffic. Up until now the coexistence of QKD with data has been limited to bandwidths that are orders of magnitude below those commonly employed in fibre optic communication networks. Using an optimised wavelength division multiplexing scheme, we transport QKD and the prevalent 100 Gb/s data format in the forward direction over the same fibre for the first time. We show a full quantum encryption system operating with a bandwidth of 200 Gb/s over a 100 km fibre. Exploring the ultimate limits of the technology by experimental measurements of the Raman noise, we demonstrate it is feasible to combine QKD with 10 Tb/s of data over a 50 km link. These results suggest it will be possible to integrate QKD and other quantum photonic technologies into high bandwidth data communication infrastructures, thereby allowing their widespread deployment.

  1. Ultra-high bandwidth quantum secured data transmission

    Science.gov (United States)

    Dynes, James F.; Tam, Winci W-S.; Plews, Alan; Fröhlich, Bernd; Sharpe, Andrew W.; Lucamarini, Marco; Yuan, Zhiliang; Radig, Christian; Straw, Andrew; Edwards, Tim; Shields, Andrew J.

    2016-01-01

    Quantum key distribution (QKD) provides an attractive means for securing communications in optical fibre networks. However, deployment of the technology has been hampered by the frequent need for dedicated dark fibres to segregate the very weak quantum signals from conventional traffic. Up until now the coexistence of QKD with data has been limited to bandwidths that are orders of magnitude below those commonly employed in fibre optic communication networks. Using an optimised wavelength division multiplexing scheme, we transport QKD and the prevalent 100 Gb/s data format in the forward direction over the same fibre for the first time. We show a full quantum encryption system operating with a bandwidth of 200 Gb/s over a 100 km fibre. Exploring the ultimate limits of the technology by experimental measurements of the Raman noise, we demonstrate it is feasible to combine QKD with 10 Tb/s of data over a 50 km link. These results suggest it will be possible to integrate QKD and other quantum photonic technologies into high bandwidth data communication infrastructures, thereby allowing their widespread deployment. PMID:27734921

  2. Modulator-Based, High Bandwidth Optical Links for HEP Experiments

    CERN Document Server

    Underwood, D G; Fernando, W S; Stanek, R W

    2012-01-01

    Owing to concerns about the reliability, bandwidth, and mass of future optical links in LHC experiments, we are investigating CW lasers and light modulators as an alternative to VCSELs. These links will be particularly useful if they utilize light modulators which are very small, low power, high bandwidth, and very radiation hard. We have constructed a test system with 3 such links, each operating at 10 Gb/s. We present the quality of these links (jitter, rise and fall time, BER) and eye mask margins (10GbE) for 3 different types of modulators: LiNbO3-based, InP-based, and Si-based. We present the results of radiation hardness measurements with up to ~10¹² protons/cm² and ~65 krad total ionizing dose (TID), confirming no single event effects (SEE) at 10 Gb/s with any of the 3 types of modulators. These optical links will be an integral part of intelligent tracking systems at various scales from coupled sensors through intra-module and off detector communication. We have used a Si-based photonic transceiver to...

  3. High speed InAs electron avalanche photodiodes overcome the conventional gain-bandwidth product limit.

    Science.gov (United States)

    Marshall, Andrew R J; Ker, Pin Jern; Krysa, Andrey; David, John P R; Tan, Chee Hing

    2011-11-07

    High bandwidth, uncooled, Indium Arsenide (InAs) electron avalanche photodiodes (e-APDs) with unique and highly desirable characteristics are reported. The e-APDs exhibit a 3dB bandwidth of 3.5 GHz which, unlike that of conventional APDs, is shown not to reduce with increasing avalanche gain. Hence these InAs e-APDs demonstrate a characteristic of theoretically ideal electron only APDs, the absence of a gain-bandwidth product limit. This is important because gain-bandwidth products restrict the maximum exploitable gain in all conventional high bandwidth APDs. Non-limiting gain-bandwidth products up to 580 GHz have been measured on these first high bandwidth e-APDs.
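
    The limit being overcome here can be stated compactly (a standard APD relation, included for context rather than taken from the paper): a conventional APD with gain-bandwidth product GBW has

        f_{3\,\mathrm{dB}} \approx \frac{\mathrm{GBW}}{M} \quad (M \gg 1),

    so its usable bandwidth falls as the avalanche gain M rises, whereas the InAs e-APDs reported above keep f_{3 dB} near 3.5 GHz independent of M, which is why the measured product M \cdot f_{3 dB} of up to 580 GHz is described as non-limiting.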

  4. Highly efficient frequency conversion with bandwidth compression of quantum light

    Science.gov (United States)

    Allgaier, Markus; Ansari, Vahid; Sansoni, Linda; Eigner, Christof; Quiring, Viktor; Ricken, Raimund; Harder, Georg; Brecht, Benjamin; Silberhorn, Christine

    2017-01-01

    Hybrid quantum networks rely on efficient interfacing of dissimilar quantum nodes, as elements based on parametric downconversion sources, quantum dots, colour centres or atoms are fundamentally different in their frequencies and bandwidths. Although pulse manipulation has been demonstrated in very different systems, to date no interface exists that provides both an efficient bandwidth compression and a substantial frequency translation at the same time. Here we demonstrate an engineered sum-frequency-conversion process in lithium niobate that achieves both goals. We convert pure photons at telecom wavelengths to the visible range while compressing the bandwidth by a factor of 7.47 under preservation of non-classical photon-number statistics. We achieve internal conversion efficiencies of 61.5%, significantly outperforming spectral filtering for bandwidth compression. Our system thus makes the connection between previously incompatible quantum systems as a step towards usable quantum networks. PMID:28134242

  5. Highly efficient frequency conversion with bandwidth compression of quantum light

    Science.gov (United States)

    Allgaier, Markus; Ansari, Vahid; Sansoni, Linda; Eigner, Christof; Quiring, Viktor; Ricken, Raimund; Harder, Georg; Brecht, Benjamin; Silberhorn, Christine

    2017-01-01

    Hybrid quantum networks rely on efficient interfacing of dissimilar quantum nodes, as elements based on parametric downconversion sources, quantum dots, colour centres or atoms are fundamentally different in their frequencies and bandwidths. Although pulse manipulation has been demonstrated in very different systems, to date no interface exists that provides both an efficient bandwidth compression and a substantial frequency translation at the same time. Here we demonstrate an engineered sum-frequency-conversion process in lithium niobate that achieves both goals. We convert pure photons at telecom wavelengths to the visible range while compressing the bandwidth by a factor of 7.47 under preservation of non-classical photon-number statistics. We achieve internal conversion efficiencies of 61.5%, significantly outperforming spectral filtering for bandwidth compression. Our system thus makes the connection between previously incompatible quantum systems as a step towards usable quantum networks.

  6. Highly efficient frequency conversion with bandwidth compression of quantum light

    CERN Document Server

    Allgaier, Markus; Sansoni, Linda; Quiring, Viktor; Ricken, Raimund; Harder, Georg; Brecht, Benjamin; Silberhorn, Christine

    2016-01-01

    Hybrid quantum networks rely on efficient interfacing of dissimilar quantum nodes, since elements based on parametric down-conversion sources, quantum dots, color centres or atoms are fundamentally different in their frequencies and bandwidths. While pulse manipulation has been demonstrated in very different systems, to date no interface exists that provides both an efficient bandwidth compression and a substantial frequency translation at the same time. Here, we demonstrate an engineered sum-frequency-conversion process in Lithium Niobate that achieves both goals. We convert pure photons at telecom wavelengths to the visible range while compressing the bandwidth by a factor of 7.47 under preservation of non-classical photon-number statistics. We achieve internal conversion efficiencies of 75.5%, significantly outperforming spectral filtering for bandwidth compression. Our system thus makes the connection between previously incompatible quantum systems as a step towards usable quantum networks.

  7. High Bandwidth Short Stroke Rotary Fast Tool Servo

    Energy Technology Data Exchange (ETDEWEB)

    Montesanti, R C; Trumper, D L

    2003-08-22

    This paper presents the design and performance of a new rotary fast tool servo (FTS) capable of developing the 40 g's tool tip acceleration required to follow a 5 micron PV sinusoidal surface at 2 kHz with a planned accuracy of 50 nm, and having a full stroke of 50 micron PV at lower frequencies. Tests with de-rated power supplies have demonstrated a closed-loop unity-gain bandwidth of 2 kHz with 20 g's tool acceleration, and we expect to achieve 40 g's with supplies providing ±16 A to the Lorentz force actuator. The use of a fast tool servo with a diamond turning machine for producing non-axisymmetric or textured surfaces on a workpiece is well known. Our new rotary FTS was designed to specifically accommodate fabricating prescription textured surfaces on 5 mm diameter spherical target components for High Energy Density Physics experiments on the National Ignition Facility Laser (NIF).
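
    The 40 g requirement follows directly from the stated trajectory; for a sinusoid of peak-to-valley height A_PV = 5 μm tracked at f = 2 kHz, the peak tool acceleration is

        a_{\max} = (2\pi f)^{2}\,\frac{A_{\mathrm{PV}}}{2}
                 = (2\pi \times 2000\ \mathrm{Hz})^{2} \times 2.5\times 10^{-6}\ \mathrm{m}
                 \approx 395\ \mathrm{m/s^{2}} \approx 40\,g .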

  8. An Octave Bandwidth, High PAE, Linear, Class J GaN High Power Amplifier

    Science.gov (United States)

    2012-03-12

    Report AFFTC-PA-12055, technical paper, dates covered 11/11 – 03/12. Authors: Kris Skowronski, Steve Nelson, Rajesh Mongia, Howard... Presents the measured versus modeled small-signal gain and return-loss response of the Class J amplifier using a 45-W CREE GaN HEMT; the amplifier has a gain of 13 to...

  9. Adaptive slope compensation for high bandwidth digital current mode controller

    DEFF Research Database (Denmark)

    Taeed, Fazel; Nymand, Morten

    2015-01-01

    An adaptive slope compensation method for digital current mode control of dc-dc converters is proposed in this paper. The compensation slope is used for stabilizing the inner current loop in peak current mode control. In this method, the compensation slope is adapted to the variations in converter duty cycle. The adaptive slope compensation provides optimum controller operation in terms of bandwidth over a wide range of operating points. In this paper, the operation principle of the controller is discussed. The proposed controller is implemented in an FPGA to control a 100 W buck converter...

  10. Ultra-low Noise, High Bandwidth, 1550nm HgCdTe APD Project

    Data.gov (United States)

    National Aeronautics and Space Administration — To meet the demands of future high-capacity free space optical communications links, a high bandwidth, near infrared (NIR), single photon sensitive optoelectronic...

  11. Highly Parallel Modern Signal Processing.

    Science.gov (United States)

    1982-02-28

    Topics include the application of the SVD to signal processing (NOSC, USC) and parallel Kalman filter algorithms (Stanford and ISI) [25-27]... The spectrum estimate developed converges to the maximum-entropy spectral estimate asymptotically. Currently we have begun... Methods for spectral estimation and array processing: it may seem too ambitious to compare all currently popular high-resolution spectra...

  12. Knee implant imaging at 3 Tesla using high-bandwidth radiofrequency pulses.

    Science.gov (United States)

    Bachschmidt, Theresa J; Sutter, Reto; Jakob, Peter M; Pfirrmann, Christian W A; Nittka, Mathias

    2015-06-01

    To investigate the impact of high-bandwidth radiofrequency (RF) pulses used in turbo spin echo (TSE) sequences, alone or combined with slice encoding for metal artifact correction (SEMAC), on artifact reduction at 3 Tesla in the knee in the presence of metal. Local transmit/receive coils feature an increased maximum B1 amplitude and reduced SAR exposure and thus enable the application of high-bandwidth RF pulses. Susceptibility-induced through-plane distortion scales inversely with the RF bandwidth, while the view angle, and hence blurring, increases for higher RF bandwidths when SEMAC is used. These effects were assessed for a phantom containing a total knee arthroplasty. TSE and SEMAC sequences with conventional and high RF bandwidths and different contrasts were tested on eight patients with different types of implants. To realize scan times of 7 to 9 min, SEMAC was always applied with eight slice-encoding steps, and distortion was rated by two radiologists. A local transmit/receive knee coil enables the use of an RF bandwidth of 4 kHz compared with 850 Hz in conventional sequences. Phantom scans confirm the relation of RF bandwidth and through-plane distortion, which can be reduced by up to 79%, and demonstrate the increased blurring for high-bandwidth RF pulses. On average, artifacts in this RF mode are rated hardly visible for patients with joint arthroplasties, when eight SEMAC slice-encoding steps are applied, and for patients with titanium fixtures, when TSE is used. The application of high-bandwidth RF pulses by local transmit coils substantially reduces through-plane distortion artifacts at 3 Tesla. © 2014 Wiley Periodicals, Inc.
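
    The scaling referred to above is commonly written as follows (a standard metal-artifact relation, quoted here as background rather than from the paper): the through-plane displacement of a slice is approximately

        \Delta z \approx \frac{\Delta f_{\mathrm{off}}}{\mathrm{BW}_{\mathrm{RF}}}\,\Delta z_{\mathrm{slice}},

    where \Delta f_{off} is the metal-induced off-resonance frequency, BW_RF the RF pulse bandwidth, and \Delta z_{slice} the nominal slice thickness; raising BW_RF from 850 Hz to 4 kHz thus shrinks the displacement to 850/4000 ≈ 21% of its former value, consistent with the reported reduction of up to 79%.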

  13. Study on Dielectric Resonator Antenna with Annular Patch for High Gain and Large Bandwidth

    Institute of Scientific and Technical Information of China (English)

    FENG Kuisheng; LI Na; MENG Qingwei; WANG Yongfeng; ZHANG Jingwei

    2015-01-01

    A new high-gain cylindrical Dielectric resonator antenna (DRA) with a large bandwidth is proposed. A cylindrical Dielectric resonator (DR), a double-annular patch and a metallic cylinder are used to obtain a large bandwidth and a high gain. The mode TM12 excited in the patch is used to enhance the gain of the DRA, and the cavity formed by the metallic cylinder provides a further higher gain and a larger bandwidth. The measured results demonstrate that the proposed DRA achieves a large bandwidth of 23% from 5.3 to 6.8 GHz with VSWR less than two and a high gain around 11 dBi.

  14. THE IMPROVEMENT OF COMPUTER NETWORK PERFORMANCE WITH BANDWIDTH MANAGEMENT IN KEMURNIAN II SENIOR HIGH SCHOOL

    Directory of Open Access Journals (Sweden)

    Bayu Kanigoro

    2012-05-01

    Full Text Available This research describes the improvement of computer network performance with bandwidth management in Kemurnian II Senior High School. The main issue addressed is the absence of bandwidth division among computers: when a user downloads data, that user absorbs all of the available bandwidth, so other users get none. In addition, IP addresses have been allocated per room (computer, teacher, and administration rooms) to support the learning process in Kemurnian II Senior High School, so a wireless network is needed. The method consisted of on-site observation and interviews with the related parties at Kemurnian II Senior High School; the existing network was analyzed and a new topology was designed, including the wireless network, along with its configuration and bandwidth separation and limiting on a MikroTik router. The result is that network traffic at Kemurnian II Senior High School can be shared evenly among users and IX and IIX traffic are separated, which improves the speed of network access at the school, together with the implementation of the wireless network. Keywords: Bandwidth Management; Wireless Network

  15. High modulation bandwidth of a light-emitting diode with surface plasmon coupling (Conference Presentation)

    Science.gov (United States)

    Lin, Chun-Han; Tu, Charng-Gan; Yao, Yu-Feng; Chen, Sheng-Hung; Su, Chia-Ying; Chen, Hao-Tsung; Kiang, Yean-Woei; Yang, Chih-Chung

    2017-02-01

    Besides lighting, LEDs can be used for indoor data transmission. Therefore, a large modulation bandwidth becomes an important target in the development of visible LEDs. In this regard, enhancing the radiative recombination rate of carriers in the quantum wells of an LED is a useful method, since the modulation bandwidth of an LED is related to the carrier decay rate besides the device RC time constant. To increase the carrier decay rate in an LED without sacrificing its output power, the technique of surface plasmon (SP) coupling in an LED is useful. In this paper, the increases of modulation bandwidth achieved by reducing mesa size, decreasing active layer thickness, and inducing SP coupling in blue- and green-emitting LEDs are illustrated. The results are demonstrated by comparing three different LED surface structures, including a bare p-type surface, a GaZnO current spreading layer, and Ag nanoparticles (NPs) for inducing SP coupling. In a single-quantum-well, blue-emitting LED with a circular mesa of 10 microns in radius, SP coupling results in a modulation bandwidth of 528.8 MHz, which is believed to be a record-high level. A smaller RC time constant can lead to a higher modulation bandwidth. However, when the RC time constant is smaller than 0.2 ns, its effect on modulation bandwidth saturates. The dependencies of modulation bandwidth on injected current density and carrier decay time confirm that the modulation bandwidth is essentially inversely proportional to a time constant, which is inversely proportional to the square-root of the carrier decay rate and injected current density.
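
    The dependence described in the last sentence matches the usual carrier-lifetime-limited expression (textbook form, stated here as an interpretation rather than a quotation): when the RC contribution is negligible,

        f_{3\,\mathrm{dB}} \approx \frac{\sqrt{3}}{2\pi\,\tau}, \qquad \frac{1}{\tau} \propto \sqrt{\frac{B\,J}{q\,d}} ,

    where \tau is the differential carrier lifetime, B the radiative (bimolecular) recombination coefficient, J the injected current density, q the electron charge, and d the active-layer thickness; this is also why thinning the active layer and enhancing the recombination rate through SP coupling both raise the bandwidth.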

  16. High Speed and Wide Bandwidth Delta-Sigma ADCs

    NARCIS (Netherlands)

    Bolatkale, M.

    2013-01-01

    This thesis describes the theory, design and implementation of a high-speed, high-performance continuous-time delta-sigma (CTΔΣ) ADC for applications such as medical imaging, high-definition video processing, and wireline and wireless communications. In order to achieve a GHz clocking speed, this

  17. High Speed and Wide Bandwidth Delta-Sigma ADCs

    NARCIS (Netherlands)

    Bolatkale, M.

    2013-01-01

    This thesis describes the theory, design and implementation of a high-speed, high-performance continuous-time delta-sigma (CTΔΣ) ADC for applications such as medical imaging, high-definition video processing, and wireline and wireless communications. In order to achieve a GHz clocking speed, this th

  18. Extremelly High Bandwidth Rad Hard Data Acquisition System Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Analog-to-digital converters (ADCs) are the key components for digitizing high-speed analog data in modern data acquisition systems, which is a critical part of...

  19. Extremelly High Bandwidth Rad Hard Data Acquisition System Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Advancements in sensors/detectors are needed to support future NASA mission concepts including polarimetry, large format imaging arrays, and high-sensitivity...

  20. Plasma Sensor for High Bandwidth Mass-Flow Measurements at High Mach Numbers with RF Link Project

    Data.gov (United States)

    National Aeronautics and Space Administration — The proposal is aimed at the development of a miniature high bandwidth (1 MHz class) plasma sensor for flow measurements at high enthalpies. This device uses a...

  1. High-Bandwidth Tactical-Network Data Analysis in a High-Performance-Computing (HPC) Environment: Device Status Data

    Science.gov (United States)

    2015-09-01

    Report ARL-CR-0780, US Army Research Laboratory, September 2015: High-Bandwidth Tactical-Network Data Analysis in a High-Performance-Computing (HPC) Environment. From Section 5.1.1, Basic Components: the Hydra data processing framework provides an object-oriented hierarchy for organizing data processing within an HPC...

  2. Effective Actuation: High Bandwidth Actuators and Actuator Scaling Laws

    Science.gov (United States)

    2007-11-02

    Piezo elements mounted on structural members and devices that exhibited aeroacoustic resonance; the former type of actuator (piezo) was considered... Raman and Kibens (Raman et al. 2000) performed experiments involving high-frequency forcing applied to low-speed flows using wedge piezo actuators and... Cited works include "Subharmonic Interaction and Wall Influence," AIAA-86-1047, May 1986, and Davis, S. A., 2000, "The manipulation of large and small flow structures in single and...

  3. Fully Controllable Pancharatnam-Berry Metasurface Array with High Conversion Efficiency and Broad Bandwidth

    Science.gov (United States)

    Liu, Chuanbao; Bai, Yang; Zhao, Qian; Yang, Yihao; Chen, Hongsheng; Zhou, Ji; Qiao, Lijie

    2016-01-01

    Metasurfaces have powerful abilities to manipulate the properties of electromagnetic waves flexibly, especially the modulation of the polarization state for both linearly polarized (LP) and circularly polarized (CP) waves. However, the transmission efficiency of cross-polarization conversion by a single-layer metasurface has a low theoretical upper limit of 25%, and the bandwidth is usually narrow, which cannot be resolved by simply stacking such layers. Here, we efficiently manipulate polarization coupling in a multilayer metasurface to promote the transmission of cross-polarization by Fabry-Perot resonance, so that a high conversion coefficient of 80–90% for the CP wave is achieved within a broad bandwidth in a metasurface with C-shaped scatterers, by theoretical calculation, numerical simulation and experiments. Further, fully controlling the Pancharatnam-Berry phase enables the realization of a polarized beam splitter, which is demonstrated to produce abnormal transmission with high conversion efficiency and broad bandwidth. PMID:27703254

  4. High-Level Parallel Programming.

    Science.gov (United States)

    parallel programming languages. These issues were evaluated via the utilization of a language called UC. UC is a programming language aimed at balancing notational simplicity with execution efficiency and portability. UC accomplishes this by separating the programming task from the efficiency issues. This report gives a description of the language, its current implementation, its verification methodology and its use in designing various

  5. Using the Sirocco File System for high-bandwidth checkpoints.

    Energy Technology Data Exchange (ETDEWEB)

    Klundt, Ruth Ann; Curry, Matthew L.; Ward, H. Lee

    2012-02-01

    The Sirocco File System, a file system for exascale under active development, is designed to allow the storage software to maximize quality of service through increased flexibility and local decision-making. By allowing the storage system to manage a range of storage targets that have varying speeds and capacities, the system can increase the speed and surety of storage to the application. We instrument CTH to use a group of RAM-based Sirocco storage servers allocated within the job as a high-performance storage tier to accept checkpoints, allowing computation to continue while checkpoint data potentially migrates asynchronously to slower, more permanent storage. The result is a 10-60x speedup in constructing and moving checkpoint data from the compute nodes. This demonstration of early Sirocco functionality shows a significant benefit for a real I/O workload, checkpointing, in a real application, CTH. By running Sirocco storage servers within a job as RAM-only stores, CTH was able to store checkpoints 10-60x faster than storing to PanFS, allowing the job to continue computing sooner. While this prototype did not include automatic data migration, the checkpoint was available to be pushed or pulled to disk-based storage as needed after the compute nodes continued computing. Future developments include the ability to dynamically spawn Sirocco nodes to absorb checkpoints, expanding this mechanism to other fast tiers of storage like flash memory, and sharing of dynamic Sirocco nodes between multiple jobs as needed.

  6. High-gain, high-bandwidth, rail-to-rail, constant-gm CMOS operational amplifier

    Science.gov (United States)

    Huang, Hong-Yi; Wang, Bo-Ruei

    2013-01-01

    This study presents a high-gain, high-bandwidth, constant-gm, rail-to-rail operational amplifier (op-amp). The constant transconductance is improved with a source-to-bulk bias control of an input pair. A source degeneration scheme is also applied to the output stage to accept a wide input range without degradation of the gain. Additionally, several compensation schemes are employed to enhance the stability. A test chip is fabricated in a 0.18 µm complementary metal-oxide semiconductor process. The active area of the op-amp is 181 × 173 µm² and it consumes a power of 2.41 mW at a supply voltage of 1.8 V. The op-amp achieves a dc gain of 94.3 dB and a bandwidth of 45 MHz when driving an effective capacitive load of 42.5 pF. A class-AB output stage combined with a slew-rate (SR) boost circuit provides a sinking current of 6 mA and an SR of 17 V/µs.

  7. Parallel Modem Architectures for High-Data-Rate Space Modems

    Science.gov (United States)

    Satorius, E.

    2014-08-01

    Existing software-defined radios (SDRs) for space are limited in data volume by several factors, including bandwidth, space-qualified analog-to-digital converter (ADC) technology, and processor throughput, e.g., the throughput of a space-qualified field-programmable gate array (FPGA). In an attempt to further improve the throughput of space-based SDRs and to fully exploit the newer and more capable space-qualified technology (ADCs, FPGAs), we are evaluating parallel transmitter/receiver architectures for space SDRs. These architectures would improve data volume for both deep-space and particularly proximity (e.g., relay) links. In this article, designs for FPGA implementation of a high-rate parallel modem are presented as well as both fixed- and floating-point simulated performance results based on a functional design that is suitable for FPGA implementation.

  8. High performance parallel I/O

    CERN Document Server

    Prabhat

    2014-01-01

    Gain Critical Insight into the Parallel I/O Ecosystem. Parallel I/O is an integral component of modern high performance computing (HPC), especially in storing and processing very large datasets to facilitate scientific discovery. Revealing the state of the art in this field, High Performance Parallel I/O draws on insights from leading practitioners, researchers, software architects, developers, and scientists who shed light on the parallel I/O ecosystem. The first part of the book explains how large-scale HPC facilities scope, configure, and operate systems, with an emphasis on choices of I/O har

  9. Full phase stabilization of a Yb:fiber femtosecond frequency comb via high-bandwidth transducers

    NARCIS (Netherlands)

    Benko, C.; Ruehl, A.; Martin, M.J.; Eikema, K.S.E.; Fermann, M.E.; Hartl, I.; Ye, J.

    2012-01-01

    We present full phase stabilization of an amplified Yb:fiber femtosecond frequency comb using an intracavity electro-optic modulator and an acousto-optic modulator. These transducers provide high servo bandwidths of 580 kHz and 250 kHz for f_rep and f_ceo, producing a robust and low phase noise fi

  10. High Bandwidth Rotary Fast Tool Servos and a Hybrid Rotary/Linear Electromagnetic Actuator

    Energy Technology Data Exchange (ETDEWEB)

    Montesanti, Richard Clement [Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States)

    2005-09-01

    This thesis describes the development of two high bandwidth short-stroke rotary fast tool servos and the hybrid rotary/linear electromagnetic actuator developed for one of them. Design insights, trade-off methodologies, and analytical tools are developed for precision mechanical systems, power and signal electronic systems, control systems, normal-stress electromagnetic actuators, and the dynamics of the combined systems.

  11. Domain Decomposition Based High Performance Parallel Computing

    CERN Document Server

    Raju, Mandhapati P

    2009-01-01

    The study deals with the parallelization of finite element based Navier-Stokes codes using domain decomposition and state-of-the-art sparse direct solvers. There has been significant improvement in the performance of sparse direct solvers. Parallel sparse direct solvers are not found to exhibit good scalability. Hence, the parallelization of sparse direct solvers is done using domain decomposition techniques. A highly efficient sparse direct solver PARDISO is used in this study. The scalability of both Newton and modified Newton algorithms is tested.

  12. High-resolution and wide-bandwidth light intensity fiber optic displacement sensor for MEMS metrology.

    Science.gov (United States)

    Orłowska, Karolina; Świątkowski, Michał; Kunicki, Piotr; Kopiec, Daniel; Gotszalk, Teodor

    2016-08-01

    We report on the design, properties, and applications of a high-resolution and wide-bandwidth light intensity fiber optic displacement sensor for microelectromechanical system (MEMS) metrology. The system is dedicated to two types of structures: those vibrating at high frequencies and those vibrating at low frequencies. In order to ensure high-frequency and high-resolution measurements, frequency down-mixing and selective signal processing were applied. The obtained effective measuring bandwidth ranges from a few hertz to 1 MHz. The achieved resolution is 116 pm/√Hz and 138 pm/√Hz for the low-frequency and high-frequency operation modes, respectively, whereas the measurement range for static displacement is 100 μm.

  13. Parallel Algebraic Multigrid Methods - High Performance Preconditioners

    Energy Technology Data Exchange (ETDEWEB)

    Yang, U M

    2004-11-11

    The development of high performance, massively parallel computers and the increasing demands of computationally challenging applications have necessitated the development of scalable solvers and preconditioners. One of the most effective ways to achieve scalability is the use of multigrid or multilevel techniques. Algebraic multigrid (AMG) is a very efficient algorithm for solving large problems on unstructured grids. While much of it can be parallelized in a straightforward way, some components of the classical algorithm, particularly the coarsening process and some of the most efficient smoothers, are highly sequential, and require new parallel approaches. This chapter presents the basic principles of AMG and gives an overview of various parallel implementations of AMG, including descriptions of parallel coarsening schemes and smoothers, some numerical results as well as references to existing software packages.
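
    As a small illustration of the smoother issue mentioned above (a generic numpy sketch, not code from the chapter): weighted Jacobi updates every unknown from the previous iterate only and therefore parallelizes trivially, whereas Gauss-Seidel reuses freshly updated values and is sequential by construction.

        import numpy as np

        def weighted_jacobi(A, b, x, omega=2.0 / 3.0, sweeps=2):
            D = np.diag(A)
            for _ in range(sweeps):
                x = x + omega * (b - A @ x) / D          # all unknowns updated at once
            return x

        def gauss_seidel(A, b, x, sweeps=2):
            for _ in range(sweeps):
                for i in range(len(b)):                  # each row depends on rows already updated
                    x[i] = (b[i] - A[i, :i] @ x[:i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
            return x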

  14. Demonstration of OCDM Coder and Variable Bandwidth Filter Using Parallel Topology of Quadruple Series Coupled Microring Resonators

    National Research Council Canada - National Science Library

    Tanaka, K; Kokubun, Y

    2011-01-01

    ...) coding circuit using two wavelength selective switches (WSSs) consisting of quadruple series coupled MRRs laid out in parallel topology and a phase shifter incorporated in the one arm between them...

  15. Pickup design for high bandwidth bunch arrival-time monitors in free-electron lasers

    Energy Technology Data Exchange (ETDEWEB)

    Angelovski, Aleksandar; Penirschke, Andreas; Jakoby, Rolf [TU Darmstadt (Germany). Institut fuer Mikrowellentechnik und Photonik; Kuhl, Alexander; Schnepp, Sascha [TU Darmstadt (Germany). Graduate School of Computational Engineering; Bock, Marie Kristin; Bousonville, Michael; Schlarb, Holger [Deutsches Elektronen-Synchrotron DESY, Hamburg (Germany); Weiland, Thomas [TU Darmstadt (Germany). Institut fuer Theorie Elektromagnetischer Felder

    2012-07-01

    The increased demands for low bunch charge operation mode in the free-electron lasers (FELs) require an upgrade of the existing synchronization equipment. As a part of the laser-based synchronization system, the bunch arrival-time monitors (BAMs) should have a sub-10 femtosecond precision for high and low bunch charge operation. In order to fulfill the resolution demands for both modes of operation, the bandwidth of such a BAM should be increased up to a cutoff frequency of 40 GHz. In this talk, we present the design and the realization of high bandwidth cone-shaped pickup electrodes as a part of the BAM for the FEL in Hamburg (FLASH) and the European X-ray free-electron laser (European XFEL). The proposed pickup was simulated with CST STUDIO SUITE, and a non-hermetic model was built up for radio frequency (rf) measurements.

  16. Bullet: high bandwidth data dissemination using an overlay mesh

    OpenAIRE

    Kostic, D.; Rodriguez, A.; J. Albrecht; Vahdat, A.

    2003-01-01

    In recent years, overlay networks have become an effective alternative to IP multicast for efficient point to multipoint communication across the Internet. Typically, nodes self-organize with the goal of forming an efficient overlay tree, one that meets performance targets without placing undue burden on the underlying network. In this paper, we target high-bandwidth data distribution from a single source to a large number of receivers. Applications include large-file transfers and real-time ...

  17. High-Bandwidth Dynamic Full-Field Profilometry for Nano-Scale Characterization of MEMS

    Energy Technology Data Exchange (ETDEWEB)

    Chen, L-C [Graduate Institute of Automation Technology, National Taipei University of Technology, 1 Sec. 3 Chung-Hsiao East Rd., Taipei, 106, Taiwan (China); Huang, Y-T [Graduate Institute of Automation Technology, National Taipei University of Technology, 1 Sec. 3 Chung-Hsiao East Rd., Taipei, 106, Taiwan (China); Chang, P-B [Graduate Institute of Mechanical and Electrical Engineering, National Taipei University of Technology, 1 Sec. 3 Chung-Hsiao East Rd., Taipei, 106, Taiwan (China)

    2006-10-15

    The article describes an innovative optical interferometric methodology to deliver dynamic surface profilometry with a measurement bandwidth up to 10 MHz or higher and a vertical resolution as fine as 1 nm. Previous work using stroboscopic microscopic interferometry for dynamic characterization of micro (opto)electromechanical systems (M(O)EMS) has been limited in measurement bandwidth, mainly to within a couple of MHz. For high resonant mode analysis, the stroboscopic light pulse is insufficiently short to capture the moving fringes from the dynamic motion of the detected structure. In view of this need, a microscopic prototype based on white-light stroboscopic interferometry with an innovative light superposition strategy was developed to achieve dynamic full-field profilometry with a high measurement bandwidth up to 10 MHz or higher. The system primarily consists of an optical microscope, on which a Mirau interferometric objective with an embedded piezoelectric vertical translator, a high-power LED light module with dual operation modes, and a light-synchronizing electronics unit are integrated. A micro cantilever beam used in AFM was measured to verify the system's capability for accurate characterisation of the dynamic behaviour of the device. The full-field seventh-mode vibration at a vibratory frequency of 3.7 MHz was fully characterized, with nano-scale vertical measurement resolution and a vertical measurement range of tens of micrometers.

  18. A HIGH BANDWIDTH BIPOLAR POWER SUPPLY FOR THE FAST CORRECTORS IN THE APS UPGRADE*

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Ju; Sprau, Gary

    2017-06-25

    The APS Upgrade of a multi-bend achromat (MBA) storage ring requires a fast bipolar power supply for the fast correction magnets. The key performance requirement of the power supply is a small-signal bandwidth of 10 kHz for the output current. This requirement presents a challenge to the design because of the high inductance of the magnet load and a limited input DC voltage. A prototype DC/DC power supply utilizing a MOSFET H-bridge circuit with a 500 kHz PWM has been developed and tested successfully. The prototype achieved a 10-kHz bandwidth with less than 3-dB attenuation for a signal 0.5% of the maximum operating current of 15 amperes. This paper presents the design of the power circuit, the PWM method, the control loop, and the test results.

  19. High-speed 405-nm superluminescent diode (SLD) with 807-MHz modulation bandwidth

    KAUST Repository

    Shen, Chao

    2016-08-25

    III-nitride LEDs are fundamental components for visible-light communication (VLC). However, their modulation bandwidth is inherently limited by the relatively long carrier lifetime. In this letter, we present a 405 nm emitting superluminescent diode (SLD) with a tilted facet design on a semipolar GaN substrate, showing a broad emission of ∼9 nm at 20 mW optical power. Owing to the fast recombination (τ<0.35 ns) through the amplified spontaneous emission, the SLD exhibits a significantly large 3-dB bandwidth of 807 MHz. A data rate of 1.3 Gbps with a bit-error rate of 2.9 × 10 was obtained using an on-off keying modulation scheme, suggesting that the SLD is a promising high-speed transmitter for VLC applications.

  20. Maximizing the bandwidth of coherent, mid-IR supercontinuum using highly nonlinear aperiodic nanofibers

    Science.gov (United States)

    Baili, Amira; Cherif, Rim; Heidt, Alexander; Zghal, Mourad

    2014-05-01

    We describe in detail a new procedure of maximizing the bandwidth of mid-infrared (mid-IR) supercontinuum (SC) in highly nonlinear microstructured As2Se3 and tellurite aperiodic nanofibers. By introducing aperiodic rings of first and secondary air holes into the cross-sections of our microstructured fiber designs, we achieve flattened and all-normal dispersion profiles over much broader bandwidths than would be possible with simple periodic designs. These fiber designs are optimized for efficient, broadband, and coherent SC generation in the mid-IR spectral region. Numerical simulations show that these designs enable the generation of a SC spanning over 2290 nm extending from 1140 to 3430 nm in 8 cm length of tellurite nanofiber with input energy of E = 200 pJ and a SC bandwidth of over 4700 nm extending from 1795 to 6525 nm generated in only 8 mm-length of As2Se3-based nanofiber with input energy as low as E = 100 pJ. This work provides a new type of broadband mid-IR SC source with flat spectral shape as well as excellent coherence and temporal properties by using aperiodic nanofibers with all-normal dispersion suitable for applications in ultrafast science, metrology, coherent control, non-destructive testing, spectroscopy, and optical coherence tomography in the mid-IR region.

  1. A Synthetic Bandwidth Method for High-Resolution SAR Based on PGA in the Range Dimension

    Directory of Open Access Journals (Sweden)

    Jincheng Li

    2015-06-01

    Full Text Available The synthetic bandwidth technique is an effective method to achieve ultra-high range resolution in an SAR system. There are mainly two challenges in its implementation. The first one is the estimation and compensation of system errors, such as the timing deviation and the amplitude-phase error. Due to precision limitations of the radar instrument, construction of the sub-band signals becomes much more complicated with these errors. The second challenge lies in the combination method, that is, how to fit the sub-band signals together into a much wider bandwidth. In this paper, a novel synthetic bandwidth approach is presented. It considers two main errors of the multi-sub-band SAR system and compensates them by a two-order PGA (phase gradient autofocus)-based method, named TRPGA. Furthermore, an improved cut-paste method is proposed to combine the signals in the frequency domain. It exploits the redundancy of errors and requires only a limited amount of data in the azimuth direction for error estimation. Moreover, the up-sampling operation can be avoided in the combination process. Imaging results based on both simulated and real data are presented to validate the proposed approach.
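
    To make the combination step above concrete, the sketch below stitches two error-free sub-band measurements of a single point target together in the frequency domain and compares the -3 dB width of the compressed pulse before and after the combination; doubling the synthetic bandwidth roughly halves the width. All parameters are illustrative assumptions, and the error estimation and compensation (the TRPGA step) that the paper performs before combination is deliberately omitted here.

        # Frequency-domain "cut-paste" of two adjacent sub-bands (idealized).
        import numpy as np

        c = 3e8
        B = 150e6                                   # bandwidth of one sub-band [Hz]
        n = 512                                     # frequency samples per sub-band
        r0 = 30.3                                   # point-target range [m]
        df = B / n                                  # frequency step (same in both sub-bands)

        def echo(f):
            """Ideal point-target response sampled at frequencies f."""
            return np.exp(-1j * 4 * np.pi * f * r0 / c)

        f1 = np.arange(n) * df                      # sub-band 1 covers [0, B)
        f2 = B + np.arange(n) * df                  # sub-band 2 covers [B, 2B)

        def halfpower_width_m(F, pad=16):
            """-3 dB width of the compressed pulse, via a zero-padded IFFT."""
            N = len(F)
            prof = np.abs(np.fft.ifft(F, n=pad * N))
            dr = c / (2 * df * pad * N)             # range spacing of the padded profile
            return np.sum(prof > prof.max() / np.sqrt(2)) * dr

        print("one sub-band  -3 dB width [m]:", halfpower_width_m(echo(f1)))
        print("two sub-bands -3 dB width [m]:",
              halfpower_width_m(np.concatenate([echo(f1), echo(f2)])))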

  2. ICE-Based Custom Full-Mesh Network for the CHIME High Bandwidth Radio Astronomy Correlator

    Science.gov (United States)

    Bandura, K.; Cliche, J. F.; Dobbs, M. A.; Gilbert, A. J.; Ittah, D.; Mena Parra, J.; Smecher, G.

    New generation radio interferometers encode signals from thousands of antenna feeds across a large bandwidth. Channelizing and correlating this data requires networking capabilities that can handle unprecedented data rates at reasonable cost. The Canadian Hydrogen Intensity Mapping Experiment (CHIME) correlator processes 8 bits from N=2,048 digitizer inputs across 400 MHz of bandwidth. Measured in N² × bandwidth, it is the largest radio correlator currently being commissioned. Its digital back-end must exchange and reorganize the 6.6 terabit/s produced by its 128 digitizing and channelizing nodes, and feed it to the 256 graphics processing unit (GPU) node spatial correlator in a way that each node obtains data from all digitizer inputs but across a small fraction of the bandwidth (i.e. ‘corner-turn’). In order to maximize performance and reliability of the corner-turn system while minimizing cost, a custom networking solution has been implemented. The system makes use of Field Programmable Gate Array (FPGA) transceivers to implement direct, passive copper, full-mesh, high speed serial connections between sixteen circuit boards in a crate, to exchange data between crates, and to offload the data to a cluster of 256 GPU nodes using standard 10 Gbit/s Ethernet links. The GPU nodes complete the corner-turn by combining data from all crates and then computing visibilities. Eye diagrams and frame error counters confirm error-free operation of the corner-turn network in both the currently operating CHIME Pathfinder telescope (a prototype for the full CHIME telescope) and a representative fraction of the full CHIME hardware, providing an end-to-end system validation. An analysis of an equivalent corner-turn system built with Ethernet switches instead of custom passive data links is provided.
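
    The corner-turn described above is, at its core, a distributed transpose: channelizer nodes hold every frequency channel for a few inputs, while each correlator node needs every input for a few channels. The sketch below shows the data movement pattern for a single time sample using in-memory arrays; the node and channel counts are illustrative assumptions rather than the actual CHIME configuration, and the real system performs this exchange over FPGA serial links rather than a reshape.

        # Corner-turn as an in-memory transpose followed by a band split.
        import numpy as np

        n_fpga, inputs_per_fpga, n_freq, n_gpu = 8, 16, 64, 16
        freq_per_gpu = n_freq // n_gpu

        # data[f, i, c]: channelized sample from FPGA node f, local input i, channel c
        rng = np.random.default_rng(0)
        data = (rng.standard_normal((n_fpga, inputs_per_fpga, n_freq))
                + 1j * rng.standard_normal((n_fpga, inputs_per_fpga, n_freq)))

        # Gather every input together, then split the band across GPU nodes
        all_inputs = data.reshape(n_fpga * inputs_per_fpga, n_freq)     # (inputs, channels)
        gpu_buffers = [all_inputs[:, g * freq_per_gpu:(g + 1) * freq_per_gpu]
                       for g in range(n_gpu)]

        # Each GPU node now forms visibilities for its channels over all input pairs
        g0 = gpu_buffers[0]                                 # (128 inputs, 4 channels)
        vis = np.einsum('ic,jc->ijc', g0, g0.conj())        # input x input x channel
        print(vis.shape)                                    # (128, 128, 4)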

  3. Applied Techniques for High Bandwidth Data Transfers across Wide Area Networks

    Institute of Scientific and Technical Information of China (English)

    JasonLee; BillAllcock; 等

    2001-01-01

    Large distributed systems such as Computational/Data Grids require large amounts of data to be co-located with the computing facilities for processing. From our work developing a scalable distributed network cache, we have gained experience with techniques necessary to achieve high data throughput over high bandwidth Wide Area Networks (WAN). In this paper, we discuss several hardware and software design techniques, and then describe their application to an implementation of an enhanced FTP protocol called GridFTP. We describe results from the Supercomputing 2000 conference.

  4. Designing and implementing Multibeam Smart Antennas for high bandwidth UAV communications using FPGAs

    Science.gov (United States)

    Porcello, J. C.

    High bandwidth UAV communications are often necessary in order to move large amounts of mission information to/from users in real time. The focus of this paper is antenna beamforming for point-to-point, high bandwidth UAV communications in order to optimize transmit and receive power and support high data throughput communications. Specifically, this paper looks at the design and implementation of Multibeam Smart Antennas to implement antenna beamforming in an aerospace communications environment. The Smart Antenna is contrasted against Fast Fourier Transform (FFT) based beamforming in order to quantify the increase in both computational load and FPGA resources required for multibeam adaptive signal processing in the Smart Antenna. The paper begins with an overall discussion of Smart Antenna design and general beamforming issues in high bandwidth communications. Important design considerations such as processing complexity in a constrained Size, Weight and Power (SWaP) environment are discussed. The focus of the paper is the design and implementation of digital beamforming for wideband communications waveforms using FPGAs. A Multibeam Time Delay element is introduced based on Lagrange Interpolation. Design data for Multibeam Smart Antennas in FPGAs is provided in the paper as well as reference circuits for implementation. Finally, an example Multibeam Smart Antenna design is provided based on a Xilinx Virtex-7 FPGA. The Multibeam Smart Antenna example design illustrates the concepts discussed in the paper and provides design insight into Multibeam Smart Antenna implementation from the point of view of implementation complexity, required hardware, and overall system performance gain.
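
    The Multibeam Time Delay element mentioned above relies on the fact that a fractional sample delay can be realized as a short FIR filter whose taps are Lagrange interpolation coefficients, which is what allows true-time-delay steering of wideband waveforms. The sketch below generates such coefficients and verifies the achieved delay on a test tone; the filter order, sample rate and tone frequency are illustrative assumptions, not values taken from the paper's FPGA design.

        # Lagrange-interpolation fractional-delay filter (illustrative sketch).
        import numpy as np

        def lagrange_fd_coeffs(delay, order=3):
            """FIR taps that delay a signal by `delay` samples via Lagrange interpolation."""
            h = np.ones(order + 1)
            for i in range(order + 1):
                for m in range(order + 1):
                    if m != i:
                        h[i] *= (delay - m) / (i - m)
            return h

        fs = 100e6                                   # sample rate [Hz]
        f0 = 4e6                                     # test tone [Hz]
        t = np.arange(512) / fs
        x = np.cos(2*np.pi*f0*t)

        delay = 1.37                                 # desired delay in samples (integer + fraction)
        y = np.convolve(x, lagrange_fd_coeffs(delay))[:len(x)]

        # Estimate the achieved delay from the tone's phase shift (skip the filter transient)
        ref = np.exp(2j*np.pi*f0*t[16:])
        phase = np.angle(np.vdot(ref, y[16:]) / np.vdot(ref, x[16:]))
        print("achieved delay [samples]:", -phase / (2*np.pi*f0) * fs)   # close to 1.37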

  5. Re-use of Low Bandwidth Equipment for High Bit Rate Transmission Using Signal Slicing Technique

    DEFF Research Database (Denmark)

    Wagner, Christoph; Spolitis, S.; Vegas Olmos, Juan José;

    Massive fiber-to-the-home network deployment requires never-ending equipment upgrades operating at higher bandwidth. We show an effective signal slicing method, which can reuse low-bandwidth opto-electronic components for optical communications at higher bit rates.

  6. Influence of the fiber Bragg gratings with different reflective bandwidths in high power all-fiber laser oscillator

    Science.gov (United States)

    Wang, Jianming; Yan, Dapeng; Xiong, Songsong; Huang, Bao; Li, Cheng

    2017-01-01

    The effects of large-mode-area (LMA) fiber Bragg gratings (FBGs) with different reflective bandwidths on a bi-directionally pumped ytterbium-doped single-mode all-fiber laser oscillator have been investigated experimentally. The forward laser output power and the backward signal leakage were measured and analyzed. It was found that the laser output power and efficiency depended on the bandwidth of the high-reflection (HR) FBG used in the laser cavity. The broader bandwidth gives higher laser efficiency, especially at high power levels.

  7. Narrow-bandwidth high-order harmonics driven by long-duration hot spots

    Science.gov (United States)

    Kozlov, Maxim; Kfir, Ofer; Fleischer, Avner; Kaplan, Alex; Carmon, Tal; Schwefel, Harald G. L.; Bartal, Guy; Cohen, Oren

    2012-06-01

    We predict and investigate the emission of high-order harmonics by atoms that cross intense laser hot spots that last for a nanosecond or longer. An atom that moves through a nanometer-scale hot spot at characteristic thermal velocity can emit high-order harmonics in a similar fashion to an atom that is irradiated by a short-duration (picosecond-scale) laser pulse. We analyze the collective emission from a thermal gas and from a jet of atoms. In both cases, the line shape of a high-order harmonic exhibits a narrow spike with spectral width that is determined by the bandwidth of the driving laser. Finally, we discuss a scheme for producing long-duration laser hot spots with intensity in the range of the intensity threshold for high-harmonic generation. In the proposed scheme, the hot spot is produced by a long laser pulse that is consecutively coupled to a high-quality micro-resonator and a metallic nano-antenna. This system may be used for generating ultra-narrow bandwidth extreme-ultraviolet radiation through frequency up-conversion of a low-cost compact pump laser.

  8. Three-Axis Attitude Estimation With a High-Bandwidth Angular Rate Sensor

    Science.gov (United States)

    Bayard, David S.; Green, Joseph J.

    2013-01-01

    A continuing challenge for modern instrument pointing control systems is to meet the increasingly stringent pointing performance requirements imposed by emerging advanced scientific, defense, and civilian payloads. Instruments such as adaptive optics telescopes, space interferometers, and optical communications make unprecedented demands on precision pointing capabilities. A cost-effective method was developed for increasing the pointing performance for this class of NASA applications. The solution was to develop an attitude estimator that fuses star tracker and gyro measurements with a high-bandwidth angular rotation sensor (ARS). An ARS is a rate sensor whose bandwidth extends well beyond that of the gyro, typically up to 1,000 Hz or higher. The most promising ARS sensor technology is based on a magnetohydrodynamic concept, and has recently become available commercially. The key idea is that the sensor fusion of the star tracker, gyro, and ARS provides a high-bandwidth attitude estimate suitable for supporting pointing control with a fast-steering mirror or other type of tip/tilt correction for increased performance. The ARS is relatively inexpensive and can be bolted directly next to the gyro and star tracker on the spacecraft bus. The high-bandwidth attitude estimator fuses an ARS sensor with a standard three-axis suite comprised of a gyro and star tracker. The estimation architecture is based on a dual-complementary filter (DCF) structure. The DCF takes a frequency- weighted combination of the sensors such that each sensor is most heavily weighted in a frequency region where it has the lowest noise. An important property of the DCF is that it avoids the need to model disturbance torques in the filter mechanization. This is important because the disturbance torques are generally not known in applications. This property represents an advantage over the prior art because it overcomes a weakness of the Kalman filter that arises when fusing more than one rate
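
    A single-axis sketch of the frequency-weighted blending idea behind the DCF is given below: the integrated high-bandwidth ARS rate supplies the high-frequency attitude content, while the star-tracker/gyro attitude reference is trusted below a crossover frequency. The crossover frequency, sample rates, noise levels and simulated motion are illustrative assumptions, and this first-order blend is far simpler than the flight estimator described above.

        # First-order complementary blend of an attitude reference and an ARS rate.
        import numpy as np

        fs = 2000.0                                  # ARS sample rate [Hz]
        dt = 1.0 / fs
        f_cross = 5.0                                # crossover frequency [Hz]
        alpha = 1.0 / (1.0 + 2*np.pi*f_cross*dt)     # first-order blend coefficient

        t = np.arange(0.0, 2.0, dt)
        true_att = 1e-3*np.sin(2*np.pi*0.2*t) + 1e-5*np.sin(2*np.pi*50*t)       # rad
        ars_rate = np.gradient(true_att, dt) + 1e-4*np.random.randn(len(t))     # rad/s, noisy ARS
        ref_att  = true_att + 2e-5*np.random.randn(len(t))                      # noisy low-freq reference

        est = np.zeros_like(t)
        for k in range(1, len(t)):
            propagated = est[k-1] + ars_rate[k]*dt              # high-frequency path (ARS)
            est[k] = alpha*propagated + (1 - alpha)*ref_att[k]  # blend with low-frequency reference

        print("rms attitude error [rad]:", np.sqrt(np.mean((est - true_att)**2)))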

  9. APEnet+: high bandwidth 3D torus direct network for petaflops scale commodity clusters

    Science.gov (United States)

    Ammendola, R.; Biagioni, A.; Frezza, O.; Lo Cicero, F.; Lonardo, A.; Paolucci, P. S.; Rossetti, D.; Salamon, A.; Salina, G.; Simula, F.; Tosoratto, L.; Vicini, P.

    2011-12-01

    We describe herein the APElink+ board, a PCIe interconnect adapter featuring the latest advances in wire speed and interface technology plus hardware support for a RDMA programming model and experimental acceleration of GPU networking; this design allows us to build a low latency, high bandwidth PC cluster, the APEnet+ network, the new generation of our cost-effective, tens-of-thousands-scalable cluster network architecture. Some test results and characterization of data transmission of a complete testbench, based on a commercial development card mounting an Altera® FPGA, are provided.

  10. APEnet+: high bandwidth 3D torus direct network for petaflops scale commodity clusters

    CERN Document Server

    Ammendola, Roberto; Frezza, Ottorino; Cicero, Francesca Lo; Lonardo, Alessandro; Paolucci, Pier Stanislao; Rossetti, Davide; Salamon, Andrea; Salina, Gaetano; Simula, Francesco; Tosoratto, Laura; Vicini, Piero

    2011-01-01

    We describe herein the APElink+ board, a PCIe interconnect adapter featuring the latest advances in wire speed and interface technology plus hardware support for a RDMA programming model and experimental acceleration of GPU networking; this design allows us to build a low latency, high bandwidth PC cluster, the APEnet+ network, the new generation of our cost-effective, tens-of-thousands-scalable cluster network architecture. Some test results and characterization of data transmission of a complete testbench, based on a commercial development card mounting an Altera FPGA, are provided.

  11. Engineering the CernVM-Filesystem as a High Bandwidth Distributed Filesystem for Auxiliary Physics Data

    Science.gov (United States)

    Dykstra, D.; Bockelman, B.; Blomer, J.; Herner, K.; Levshina, T.; Slyz, M.

    2015-12-01

    A common use pattern in the computing models of particle physics experiments is running many distributed applications that read from a shared set of data files. We refer to this data as auxiliary data, to distinguish it from (a) event data from the detector (which tends to be different for every job), and (b) conditions data about the detector (which tends to be the same for each job in a batch of jobs). Conditions data also tends to be relatively small per job, whereas both event data and auxiliary data are larger per job. Unlike event data, auxiliary data comes from a limited working set of shared files. Since there is spatial locality of the auxiliary data access, the use case appears to be identical to that of the CernVM-Filesystem (CVMFS). However, we show that distributing auxiliary data through CVMFS causes the existing CVMFS infrastructure to perform poorly. We utilize a CVMFS client feature called "alien cache" to cache data on existing local high-bandwidth data servers that were engineered for storing event data. This cache is shared between the worker nodes at a site and replaces caching CVMFS files on both the worker node local disks and on the site's local squids. We have tested this alien cache with the dCache NFSv4.1 interface, Lustre, and the Hadoop Distributed File System (HDFS) FUSE interface, and measured performance. In addition, we use high-bandwidth data servers at central sites to perform the CVMFS Stratum 1 function instead of the low-bandwidth web servers deployed for the CVMFS software distribution function. We have tested this using the dCache HTTP interface. As a result, we have a design for an end-to-end high-bandwidth distributed caching read-only filesystem, using existing client software already widely deployed to grid worker nodes and existing file servers already widely installed at grid sites. Files are published in a central place and are soon available on demand throughout the grid and cached locally on the

  12. Engineering the CernVM-Filesystem as a High Bandwidth Distributed Filesystem for Auxiliary Physics Data

    Energy Technology Data Exchange (ETDEWEB)

    Dykstra, D. [Fermilab; Bockelman, B. [Nebraska U.; Blomer, J. [CERN; Herner, K. [Fermilab; Levshina, T. [Fermilab; Slyz, M. [Fermilab

    2015-12-23

    A common use pattern in the computing models of particle physics experiments is running many distributed applications that read from a shared set of data files. We refer to this data as auxiliary data, to distinguish it from (a) event data from the detector (which tends to be different for every job), and (b) conditions data about the detector (which tends to be the same for each job in a batch of jobs). Conditions data also tends to be relatively small per job, whereas both event data and auxiliary data are larger per job. Unlike event data, auxiliary data comes from a limited working set of shared files. Since there is spatial locality of the auxiliary data access, the use case appears to be identical to that of the CernVM-Filesystem (CVMFS). However, we show that distributing auxiliary data through CVMFS causes the existing CVMFS infrastructure to perform poorly. We utilize a CVMFS client feature called 'alien cache' to cache data on existing local high-bandwidth data servers that were engineered for storing event data. This cache is shared between the worker nodes at a site and replaces caching CVMFS files on both the worker node local disks and on the site's local squids. We have tested this alien cache with the dCache NFSv4.1 interface, Lustre, and the Hadoop Distributed File System (HDFS) FUSE interface, and measured performance. In addition, we use high-bandwidth data servers at central sites to perform the CVMFS Stratum 1 function instead of the low-bandwidth web servers deployed for the CVMFS software distribution function. We have tested this using the dCache HTTP interface. As a result, we have a design for an end-to-end high-bandwidth distributed caching read-only filesystem, using existing client software already widely deployed to grid worker nodes and existing file servers already widely installed at grid sites. Files are published in a central place and are soon available on demand throughout the grid and cached

  13. Memory bandwidth efficient two-layer reduced-resolution decoding of high-definition video

    Science.gov (United States)

    Comer, Mary L.

    2000-12-01

    This paper addresses the problem of efficiently decoding high-definition (HD) video for display at a reduced resolution. The decoder presented in this paper is intended for applications that are constrained not only in memory size, but also in peak memory bandwidth. This is the case, for example, during decoding of a high-definition television (HDTV) channel for picture-in-picture (PIP) display, if the reduced-resolution PIP-channel decoder is sharing memory with the full-resolution main-channel decoder. The most significant source of video quality degradation in a reduced-resolution decoder is prediction drift, which is caused by the mismatch between the full-resolution reference frames used by the encoder and the subsampled reference frames used by the decoder. To mitigate the visually annoying effects of prediction drift, the decoder described in this paper operates at two different resolutions -- a lower resolution for B pictures, which do not contribute to prediction drift, and a higher resolution for I and P pictures. This means that the motion-compensation unit (MCU) essentially operates at the higher resolution, but the peak memory bandwidth is the same as that required to decode at the lower resolution. Storage of additional data, representing the higher resolution for I and P pictures, requires a relatively small amount of additional memory as compared to decoding at the lower resolution. Experimental results will demonstrate the improvement in video quality achieved by the addition of the higher-resolution data in forming predictions for P pictures.

  14. Novel high bandwidth wall shear stress sensor for ultrasonic cleaning applications

    Science.gov (United States)

    Gonzalez-Avila, S. Roberto; Prabowo, Firdaus; Ohl, Claus-Dieter

    2010-11-01

    Ultrasonic cleaning is due to the action of cavitation bubbles. The details of the cleaning mechanisms have not been revealed or confirmed experimentally, yet several studies suggest that the wall shear stresses generated are very high, i.e. of the order of several thousand pascals. Ultrasonic cleaning applications span a wide range, from semiconductor manufacturing to low-pressure membrane cleaning and, in the medical field, the cleaning of surgical instruments. We have developed a novel sensor to monitor and quantify cleaning activity which is (1) very sturdy, (2) has a high bandwidth of several megahertz, (3) is cheap to manufacture, and (4) of very small size. We analyze the sensor signal by correlating its response in time with single laser-induced cavitation bubbles recorded using high-speed photography. Additionally, we will present first measurements in ultrasonic cleaning baths, again using high-speed photography. A preliminary discussion of the working mechanism of the sensor will be presented.

  15. High-bandwidth squeezed light at 1550 nm from a compact monolithic PPKTP cavity

    CERN Document Server

    Ast, Stefan; Schnabel, Roman

    2013-01-01

    We report the generation of squeezed vacuum states of light at 1550 nm with a broadband quantum noise reduction of up to 4.8 dB ranging from 5 MHz to 1.2 GHz sideband frequency. We used a custom-designed 2.6 mm long biconvex periodically-poled potassium titanyl phosphate (PPKTP) crystal. It featured reflectively coated end surfaces, 2.26 GHz of linewidth and generated the squeezing via optical parametric amplification. Two homodyne detectors with different quantum efficiencies and bandwidths were used to characterize the non-classical noise suppression. We measured squeezing values of up to 4.8 dB from 5 to 100 MHz and up to 3 dB from 100 MHz to 1.2 GHz. The squeezed vacuum measurements were limited by detection loss. We propose an improved detection scheme to measure up to 10 dB squeezing over 1 GHz. Our results of GHz bandwidth squeezed light generation provide new prospects for high-speed quantum key distribution.

  16. Mahanaxar: quality of service guarantees in high-bandwidth, real-time streaming data storage

    Energy Technology Data Exchange (ETDEWEB)

    Bigelow, David [Los Alamos National Laboratory; Bent, John [Los Alamos National Laboratory; Chen, Hsing-Bung [Los Alamos National Laboratory; Brandt, Scott [UCSC

    2010-04-05

    Large radio telescopes, cyber-security systems monitoring real-time network traffic, and others have specialized data storage needs: guaranteed capture of an ultra-high-bandwidth data stream, retention of the data long enough to determine what is 'interesting,' retention of interesting data indefinitely, and concurrent read/write access to determine what data is interesting, without interrupting the ongoing capture of incoming data. Mahanaxar addresses this problem. Mahanaxar guarantees streaming real-time data capture at (nearly) the full rate of the raw device, allows concurrent read and write access to the device on a best-effort basis without interrupting the data capture, and retains data as long as possible given the available storage. It has built-in mechanisms for reliability and indexing, can scale to meet arbitrary bandwidth requirements, and handles both small and large data elements equally well. Results from our prototype implementation show that Mahanaxar provides both better guarantees and better performance than traditional file systems.
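
    A toy version of the bandwidth-guarantee idea is sketched below: in each scheduling period the capture stream is granted its full reservation first, and best-effort read requests only share whatever device bandwidth is left over, so reads can never starve the real-time capture. The rates and request sizes are illustrative assumptions, not Mahanaxar's actual scheduler.

        # Reservation-first bandwidth scheduling for one period (illustrative sketch).
        def schedule_period(device_bw_mb, capture_mb, read_queue_mb):
            granted_capture = min(capture_mb, device_bw_mb)      # the reservation comes first
            leftover = device_bw_mb - granted_capture
            granted_reads = []
            for req in read_queue_mb:                            # best-effort reads, in order
                g = min(req, leftover)
                granted_reads.append(g)
                leftover -= g
            return granted_capture, granted_reads

        # 1 s period: device sustains 500 MB/s, capture needs 420 MB/s, three readers queue up
        print(schedule_period(500, 420, [100, 60, 40]))          # -> (420, [80, 0, 0])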

  17. High Bandwidth Pickup Design for Bunch Arrival-time Monitors for Free-Electron Laser

    CERN Document Server

    Angelovski, Aleksandar; Hansli, Matthias; Penirschke, Andreas; Schnepp, Sascha M; Bousonville, Michael; Schlarb, Holger; Bock, Marie Kristin; Weiland, Thomas; Jakoby, Rolf

    2012-01-01

    In this paper, we present the design and realization of high bandwidth pickup electrodes with a cutoff frequency above 40 GHz. The proposed cone-shaped pickups are part of a bunch arrival-time monitor (BAM) designed for high (> 500 pC) and low (20 pC) bunch charge operation mode providing for a time resolution of less than 10 fs for both operation modes. The proposed design has a fast voltage response, low ringing, and a resonance-free spectrum. For assessing the influence of manufacturing tolerances on the performance of the pickups, an extensive tolerance study has been performed via numerical simulations. A non-hermetic model of the pickups was built for measurement and validation purposes. The measurement and simulation results are in good agreement and demonstrate the capability of the proposed pickup system to meet the given specifications.

  18. High bandwidth pickup design for bunch arrival-time monitors for free-electron laser

    Directory of Open Access Journals (Sweden)

    Aleksandar Angelovski

    2012-11-01

    Full Text Available In this paper, we present the design and realization of high bandwidth pickup electrodes with a cutoff frequency above 40 GHz. The proposed cone-shaped pickups are part of a bunch arrival-time monitor designed for high (>500 pC) and low (20 pC) bunch charge operation modes, providing for a time resolution of less than 10 fs for both operation modes. The proposed design has a fast voltage response, low ringing, and a resonance-free spectrum. For assessing the influence of manufacturing tolerances on the performance of the pickups, an extensive tolerance study has been performed via numerical simulations. A nonhermetic model of the pickups was built for measurement and validation purposes. The measurement and simulation results are in good agreement and demonstrate the capability of the proposed pickup system to meet the given specifications.

  19. On Bandwidth Efficient Modulation for High-Data-Rate Wireless LAN Systems

    Directory of Open Access Journals (Sweden)

    Stolpman Victor

    2002-01-01

    Full Text Available We address the problem of high-data-rate orthogonal frequency division multiplexed (OFDM) systems under restrictive bandwidth constraints. Based on recent theoretic results, multiple-input multiple-output (MIMO) configurations are best suited for this problem. In this paper, we examine several MIMO configurations suitable for high rate transmission. In all scenarios considered, perfect channel state information (CSI) is assumed at the receiver. In contrast, availability of CSI at the transmitter is addressed separately. We show that powerful space-time codes can be developed by combining some simple well-known techniques. In fact, we show that for certain configurations, these space-time MIMO configurations are near optimum in terms of outage capacity as compared to previously published codes. Performance evaluation of these techniques is demonstrated within the IEEE 802.11a framework via Monte Carlo simulations.

  20. Applied techniques for high bandwidth data transfers across wide area networks

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Jason; Gunter, Dan; Tierney, Brian; Allcock, Bill; Bester, Joe; Bresnahan, John; Tuecke, Steve

    2001-04-30

    Large distributed systems such as Computational/Data Grids require large amounts of data to be co-located with the computing facilities for processing. Ensuring that the data is there in time for the computation in today's Internet is a massive problem. From our work developing a scalable distributed network cache, we have gained experience with techniques necessary to achieve high data throughput over high bandwidth Wide Area Networks (WAN). In this paper, we discuss several hardware and software design techniques and issues, and then describe their application to an implementation of an enhanced FTP protocol called GridFTP. We also describe results from two applications using these techniques, which were obtained at the Supercomputing 2000 conference.

  1. High-Bandwidth Photon-Counting Detectors with Enhanced Near-Infrared Response Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Long-range optical telecommunications (LROT) impose challenging requirements on detector array sensitivity at 1064 nm and on array timing bandwidth. Large photonic...

  2. High bandwidth all-optical 3×3 switch based on multimode interference structures

    Science.gov (United States)

    Le, Duy-Tien; Truong, Cao-Dung; Le, Trung-Thanh

    2017-03-01

    A high bandwidth all-optical 3×3 switch based on a general interference multimode interference (GI-MMI) structure is proposed in this study. Two 3×3 multimode interference couplers are cascaded to realize an all-optical switch operating at both wavelengths of 1550 nm and 1310 nm. Two nonlinear directional couplers at the two outer arms of the structure are used as all-optical phase shifters to achieve all switching states and to control the switching states. Analytical expressions for switching operation using the transfer matrix method are presented. The beam propagation method (BPM) is used to design and optimize the whole structure. The optimal design of the all-optical phase shifters and 3×3 MMI couplers is carried out to reduce the switching power and loss.

  3. Call Admission Control with Bandwidth Reallocation for Adaptive Multimedia in High-Rate Short-Range Wireless Networks

    Institute of Scientific and Technical Information of China (English)

    ZHAI Xuping; BI Guangguo; XU Pingping

    2005-01-01

    In high-rate short-range wireless networks, a CAC (Call admission control) scheme plays an important role in quality of service provisioning for adaptive multimedia services. Three functions, namely a bandwidth satisfaction function, a revenue rate function and a bandwidth reallocation cost function, are firstly introduced. Based on these functions, an efficient CAC scheme, the Rev-RT-BRA (Reservation-based and Revenue test with Bandwidth reallocation) CAC scheme, is proposed. The main idea is that it reserves some bandwidth for service classes with higher admission priority. The performance of the Rev-RT-BRA CAC scheme is analyzed by solving a multidimensional Markov process. Both numerical and simulation results are given. The advantages of the proposed Rev-RT-BRA CAC scheme are as follows. (1) It maximizes the overall bandwidth satisfaction function at any system state. (2) It solves the unfairness problem in admitting multiple classes of services with different bandwidth requirements. (3) The required admission priority level can be guaranteed for various classes of services.
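
    The reservation idea can be illustrated with a toy admission test, sketched below: a new call is admitted only if, assuming every active adaptive call is squeezed down to its minimum rate, enough capacity remains once the bandwidth reserved for the other service classes is set aside. The class parameters and the test itself are illustrative assumptions and do not reproduce the Rev-RT-BRA scheme's satisfaction, revenue and reallocation-cost functions.

        # Toy reservation-based call admission test (illustrative sketch).
        from dataclasses import dataclass

        @dataclass
        class ServiceClass:
            name: str
            min_bw: float      # minimum acceptable bandwidth per call (after degradation)
            reserved: float    # capacity reserved for this class

        CAPACITY = 100.0
        classes = [ServiceClass("voice", 1.0, 10.0),
                   ServiceClass("video", 4.0, 20.0),
                   ServiceClass("data",  2.0,  0.0)]
        active = {"voice": 5, "video": 3, "data": 4}          # currently admitted calls

        def admit(new_class):
            # Worst case: every active adaptive call squeezed down to its minimum rate
            used_min = sum(c.min_bw * active[c.name] for c in classes)
            # Leave the capacity reserved for the other classes untouched
            reserved_others = sum(c.reserved for c in classes if c.name != new_class.name)
            return CAPACITY - used_min - reserved_others >= new_class.min_bw

        print(admit(classes[2]))      # can another (unreserved) data call be admitted?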

  4. Tri-material multilayer coatings with high reflectivity and wide bandwidth for 25 to 50 nm extreme ultraviolet light

    Energy Technology Data Exchange (ETDEWEB)

    Aquila, Andrew; Salmassi, Farhad; Liu, Yanwei; Gullikson, Eric M.

    2009-09-09

    Magnesium/silicon carbide (Mg/SiC) multilayers have been fabricated with normal incidence reflectivity in the vicinity of 40% to 50% for wavelengths in the 25 to 50 nm range. However, many applications, for example solar telescopes and ultrafast studies using high harmonic generation sources, desire larger bandwidths than provided by high reflectivity Mg/SiC multilayers. We investigate introducing a third material, scandium, to create a tri-material Mg/Sc/SiC multilayer, allowing an increase in bandwidth while maintaining high reflectivity.

  5. International distance education and the transition from ISDN to high-bandwidth Internet connectivity.

    Science.gov (United States)

    Vincent, Dale S; Berg, Benjamin W; Chitpatima, Suwicha; Hudson, Donald

    2002-12-01

    The Thailand Hawaii Assessment of Interactive Healthcare Initiative (THAI-HI) is an international distance-education project between two teaching hospitals in Honolulu and Bangkok that uses videoconferencing over three ISDN lines. A 'morning report' format is used to discuss clinical cases primarily covering infectious disease and critical-care topics. An audience response system is used at both sites to add interactivity. From July 2001 to May 2002, 816 health-care providers attended 20 clinical conferences. Audiences rated the conferences as highly relevant and as having high training value. Since the ISDN connection is expensive, we plan to convert the telecommunications to a high-bandwidth Internet connection. The Honolulu site will use a 45 Mbit/s commercial connection to the Hawaii Intranetwork Consortium, which links to the Abilene Network on the US mainland. The Bangkok hospital will use a 155 Mbit/s wireless optical connection to UNINET Thailand, which has a 45 Mbit/s circuit to Abilene.

  6. Radiation-tolerant, low-mass, high bandwidth, flexible printed circuit cables for particle physics experiments

    Science.gov (United States)

    McFadden, N. C.; Hoeferkamp, M. R.; Seidel, S.

    2016-09-01

    The design of meter long flexible printed circuit cables required for low-mass ultra-high speed signal transmission in the high radiation environment of the High Luminosity Large Hadron Collider is described. The design geometry is a differential embedded microstrip with 100 Ω nominal impedance. Minimal mass and maximal radiation hardness are pre-eminent considerations. Several dielectric materials are compared. To reduce mass, a cross hatched ground plane is applied. The long flexible printed circuit cables are characterized in bit error rate tests, attenuation versus frequency, mechanical response to temperature induced stress, and dimensional implications on radiation length. These tests are performed before and after irradiation with 1 MeV neutrons to 2×10¹⁶/cm² and 800 MeV protons to 2×10¹⁶ 1-MeV neutron equivalent/cm². A 1.0 m Kapton cable with cross hatched ground plane, effective bandwidth of 4.976 gigabits per second, 0.0160% of a radiation length, and no detectable radiation-induced mechanical or electrical degradation is obtained.

  7. Radiation-tolerant, low-mass, high bandwidth, flexible printed circuit cables for particle physics experiments

    Energy Technology Data Exchange (ETDEWEB)

    McFadden, N.C.; Hoeferkamp, M.R.; Seidel, S.

    2016-09-11

    The design of meter long flexible printed circuit cables required for low-mass ultra-high speed signal transmission in the high radiation environment of the High Luminosity Large Hadron Collider is described. The design geometry is a differential embedded microstrip with 100 Ω nominal impedance. Minimal mass and maximal radiation hardness are pre-eminent considerations. Several dielectric materials are compared. To reduce mass, a cross hatched ground plane is applied. The long flexible printed circuit cables are characterized in bit error rate tests, attenuation versus frequency, mechanical response to temperature induced stress, and dimensional implications on radiation length. These tests are performed before and after irradiation with 1 MeV neutrons to 2×10¹⁶/cm² and 800 MeV protons to 2×10¹⁶ 1-MeV neutron equivalent/cm². A 1.0 m Kapton cable with cross hatched ground plane, effective bandwidth of 4.976 gigabits per second, 0.0160% of a radiation length, and no detectable radiation-induced mechanical or electrical degradation is obtained.

  8. Gbps wireless transceivers for high bandwidth interconnections in distributed cyber physical systems

    Science.gov (United States)

    Saponara, Sergio; Neri, Bruno

    2015-05-01

    In Cyber Physical Systems there is a growing use of high speed sensors such as photo and video cameras, and radio and light detection and ranging (Radar/Lidar) sensors. Hence Cyber Physical Systems can benefit from the high communication data rates, of several Gbps, that can be provided by mm-wave wireless transceivers. At such high frequencies the wavelength is a few mm, and hence the whole transceiver, including the antenna, can be integrated in a single chip. To this aim, this paper presents the design of a 60 GHz transceiver architecture to ensure connection distances up to 10 m and data rates up to 4 Gbps. At 60 GHz there are more than 7 GHz of unlicensed bandwidth (available for free for the development of new services). By using a CMOS SOI technology, RF, analog and digital baseband circuitry can be integrated in the same chip, minimizing noise coupling. Even the antenna is integrated on chip, reducing cost and size vs. classic off-chip antenna solutions. Therefore the proposed transceiver can enable, at the physical layer, the implementation of low cost nodes for a Cyber Physical System with data rates of several Gbps and with a communication distance suitable for home/office scenarios, or on-board vehicles such as cars, trains, ships and airplanes.

  9. A 750MHz and a 8GHz High Bandwidth Digital FFT Spectrometer Project

    Data.gov (United States)

    National Aeronautics and Space Administration — The scope of this project is to develop a wide bandwidth, low power, and compact single board digital Fast Fourier Transform spectrometer (FFTS) optimized for the...

  10. High-Bandwidth Photon-Counting Detectors with Enhanced Near-Infrared Response Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Laser optical communications offer the potential to dramatically increase the link bandwidth and decrease the emitter power in long-range space communications....

  11. Controllable high bandwidth storage of optical information in a Bose-Einstein Condensate

    Science.gov (United States)

    Jayaseelan, Maitreyi; Schultz, Justin T.; Murphree, Joseph D.; Hansen, Azure; Bigelow, Nicholas P.

    2016-05-01

    The storage and retrieval of optical information has been of interest for a variety of applications including quantum information processing, quantum networks and quantum memories. Several schemes have been investigated and realized with weak, narrowband pulses, including techniques using EIT in solid state systems and both hot and cold atomic vapors. In contrast, we investigate the storage and manipulation of strong, high bandwidth pulses in a Bose-Einstein Condensate (BEC) of ultracold 87 Rb atoms. As a storage medium for optical pulses, BECs offer long storage times and preserve the coherence properties of the input information, suppressing unwanted thermal decoherence effects. We present numerical simulations of nanosecond pulses addressing a three-level lambda system on the D2 line of 87 Rb. The signal pulse is stored as a localized spin excitation in the condensate and can be moved or retrieved by reapplication of successive control pulses. The relative Rabi frequencies and areas of the pulses and the local atomic density in the condensate determine the storage location and readout of the signal pulse. Extending this scheme to use beams with a variety of spatial modes such as Hermite- and Laguerre-Gaussian modes offers an expanded alphabet for information storage.

  12. Wide bandwidth optical signals for high range resolution measurements in water

    Science.gov (United States)

    Nash, Justin; Lee, Robert; Mullen, Linda

    2016-05-01

    Measurements with high range resolution are needed to identify underwater threats, especially when two-dimensional contrast information is insufficient to extract object details. The challenge is that optical measurements are limited by scattering phenomena induced by the underwater channel. Back-scatter results in transmitted photons being directed back to the receiver before reaching the target of interest, which induces a clutter signal for ranging and a reduction in contrast for imaging. Multiple small-angle scattering (forward-scatter) results in transmitted photons being directed to unintended regions of the target of interest (spatial spreading), while also stretching the temporal profile of a short optical pulse (temporal spreading). Spatial and temporal spreading of the optical signal combine to cause a reduction in range resolution in conventional laser imaging systems. NAVAIR has investigated ways in which wide bandwidth, modulated optical signals can be utilized to improve ranging and imaging performance in turbid water environments. Experimental efforts have been conducted to investigate channel effects on the propagated frequency content, as well as different filtering and processing techniques on the return signals to maximize range resolution. Of particular interest for the modulated pulses are coherent detection and processing techniques employed by the radar community, including methods to reduce sidelobe clutter. This paper will summarize NAVAIR's work and show that wideband optical signals, in combination with the CLEAN algorithm, can indeed provide enhancements to range resolution and 3D imagery in turbid water environments.
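
    For reference, the CLEAN step mentioned above amounts to iteratively locating the strongest peak of the measured profile, subtracting a scaled copy of the known system response at that location, and accumulating the subtracted components; the sketch below shows the 1-D version. The sinc-shaped response and the two-target scene are illustrative assumptions, not NAVAIR's actual pulse shapes.

        # 1-D CLEAN deconvolution of a range profile (illustrative sketch).
        import numpy as np

        n = 256
        psf = np.sinc(np.linspace(-8, 8, 33))                 # assumed system response
        scene = np.zeros(n)
        scene[100], scene[115] = 1.0, 0.4                     # two point targets
        dirty = np.convolve(scene, psf, mode="same")          # measured profile with sidelobes

        def clean(dirty, psf, gain=0.2, n_iter=200, threshold=0.01):
            residual = dirty.copy()
            components = np.zeros_like(dirty)
            half = len(psf) // 2
            for _ in range(n_iter):
                k = int(np.argmax(np.abs(residual)))
                peak = residual[k]
                if abs(peak) < threshold:
                    break
                components[k] += gain * peak                  # accumulate the clean component
                lo, hi = max(0, k - half), min(len(dirty), k + half + 1)
                residual[lo:hi] -= gain * peak * psf[half - (k - lo):half + (hi - k)]
            return components, residual

        components, residual = clean(dirty, psf)
        print("recovered target positions:", np.nonzero(components > 0.05)[0])   # [100 115]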

  13. A Lowpass Filter with Sharp Roll - off and High Relative Stopband Bandwidth Using Asymmetric High - Low Impedance Patches

    Directory of Open Access Journals (Sweden)

    As. Abdipour

    2015-09-01

    Full Text Available In this letter, a microstrip lowpass filter with a -3 dB cut-off frequency at 1.286 GHz is proposed. By using two main resonators which are placed symmetrically around the Y axis, a sharp roll-off rate (250 dB/GHz) is obtained. The proposed resonators consist of two asymmetric high-low impedance patches. To achieve a high relative stopband bandwidth (1.82), four high-low impedance resonators and four radial stubs are employed as suppressing cells. Furthermore, a flat insertion loss in the passband and a low return loss in the stopband confirm the desired in-band and out-of-band frequency response. The proposed LPF has a high FOM of about 63483.

  14. An InP-Based Dual-Depletion-Region Electroabsorption Modulator with Low Capacitance and Predicted High Bandwidth

    Institute of Scientific and Technical Information of China (English)

    SHAO Yong-Bo; ZHAO Ling-Juan; YU Hong-Yan; QIU Ji-Fang; QIU Ying-Ping; PAN Jiao-Qing; WANG Bao-Jun; ZHU Hong-Liang; WANG Wei

    2011-01-01

    A novel dual-depletion-region electroabsorption modulator (DDR-EAM) based on InP at 1550 nm is fabricated. The measured capacitance and extinction ratio of the DDR-EAM reveal that the dual depletion region structure can reduce the device capacitance significantly without any degradation of extinction ratio. Moreover, the bandwidth of the DDR-EAM predicted by using an equivalent circuit model is larger than twice the bandwidth of the conventional lumped-electrode EAM (L-EAM). The electroabsorption modulator (EAM) is highly desirable as an external electro-optical modulator due to its high speed, low cost and capability of integration with other optical components such as DFB lasers, DBR lasers or semiconductor optical amplifiers.[1-4] So far, EAMs are typically fabricated by using lumped electrodes[1-4] and travelling-wave electrodes.[5-15]
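
    The capacitance-bandwidth scaling claimed above follows from the usual first-order picture of a lumped-electrode modulator, whose electrical bandwidth is set by the RC product of the junction capacitance and the source and load resistances. A commonly used estimate (not stated in the abstract; the symbols are generic assumptions) is

        f_3dB ≈ 1 / (2π (R_s + R_L) C_j),

    so reducing the junction capacitance C_j at fixed resistance raises the predicted bandwidth in direct proportion, consistent with the "larger than twice" prediction quoted for the DDR-EAM.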

  15. Performance Evaluation of a High Bandwidth Liquid Fuel Modulation Valve for Active Combustion Control

    Science.gov (United States)

    Saus, Joseph R.; DeLaat, John C.; Chang, Clarence T.; Vrnak, Daniel R.

    2012-01-01

    At the NASA Glenn Research Center, a characterization rig was designed and constructed for the purpose of evaluating high bandwidth liquid fuel modulation devices to determine their suitability for active combustion control research. Incorporated into the rig's design are features that approximate conditions similar to those that would be encountered by a candidate device if it were installed on an actual combustion research rig. The characterized dynamic performance measures obtained through testing in the rig are intended to be accurate indicators of expected performance in an actual combustion testing environment. To evaluate how well the characterization rig predicts fuel modulator dynamic performance, characterization rig data were compared with performance data for a fuel modulator candidate when the candidate was in operation during combustion testing. Specifically, the nominal and off-nominal performance data for a magnetostrictive-actuated proportional fuel modulation valve are described. Valve performance data were collected with the characterization rig configured to emulate two different combustion rig fuel feed systems. Fuel mass flows and pressures, fuel feed line lengths, and fuel injector orifice sizes were approximated in the characterization rig. Valve performance data were also collected with the valve modulating the fuel into the two combustor rigs. Comparison of the predicted and actual valve performance data shows that when the valve is operated near its design condition the characterization rig can appropriately predict the installed performance of the valve. Improvements to the characterization rig and accompanying modeling activities are underway to more accurately predict performance, especially for the devices under development to modulate fuel into the much smaller fuel injectors anticipated in future lean-burning low-emissions aircraft engine combustors.

  16. ICE-based Custom Full-Mesh Network for the CHIME High Bandwidth Radio Astronomy Correlator

    CERN Document Server

    Bandura, Kevin; Dobbs, Matt; Gilbert, Adam; Ittah, David; Parra, Juan Mena; Smecher, Graeme

    2016-01-01

    New generation radio interferometers encode signals from thousands of antenna feeds across large bandwidth. Channelizing and correlating this data requires networking capabilities that can handle unprecedented data rates with reasonable cost. The Canadian Hydrogen Intensity Mapping Experiment (CHIME) correlator processes 8-bits from N=2048 digitizer inputs across 400 MHz of bandwidth. Measured in N² × bandwidth, it is the largest radio correlator that has been built. Its digital back-end must exchange and reorganize the 6.6 terabit/s produced by its 128 digitizing and channelizing nodes, and feed it to the 256-node spatial correlator in a way that each node obtains data from all digitizer inputs but across a small fraction of the bandwidth (i.e. `corner-turn'). In order to maximize performance and reliability of the corner-turn system while minimizing cost, a custom networking solution has been implemented. The system makes use of Field Programmable Gate Array (FPGA) transceivers to implement direct,...

  17. A high performance long-reach passive optical network with a novel excess bandwidth distribution scheme

    Science.gov (United States)

    Chao, I.-Fen; Zhang, Tsung-Min

    2015-06-01

    Long-reach passive optical networks (LR-PONs) have been considered to be promising solutions for future access networks. In this paper, we propose a distributed medium access control (MAC) scheme over an advantageous LR-PON network architecture that reroutes the control information from and back to all ONUs through an (N + 1) × (N + 1) star coupler (SC) deployed near the ONUs, thereby overcoming the extremely long propagation delay problem in LR-PONs. In the network, the control slot is designed to contain all bandwidth requirements of all ONUs and is in-band time-division-multiplexed with a number of data slots within a cycle. In the proposed MAC scheme, a novel profit-weight-based dynamic bandwidth allocation (P-DBA) scheme is presented. The algorithm is designed to efficiently and fairly distribute the amount of excess bandwidth based on a profit value derived from the excess bandwidth usage of each ONU, which resolves the problems of previously reported DBA schemes that are either unfair or inefficient. The simulation results show that the proposed decentralized algorithms exhibit a nearly three-order-of-magnitude improvement in delay performance compared to the centralized algorithms over LR-PONs. Moreover, the newly proposed P-DBA scheme guarantees low delay performance and fairness even when under attack by a malevolent ONU, irrespective of traffic loads and burstiness.
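
    A toy version of weighted excess-bandwidth distribution, the mechanism P-DBA builds on, is sketched below: each ONU first receives its guaranteed share (capped by its request), and the leftover capacity is divided among still-unsatisfied ONUs in proportion to a per-ONU weight. The weights here are arbitrary placeholders standing in for the profit values that P-DBA derives from past excess-bandwidth usage.

        # Weighted distribution of excess bandwidth among ONUs (illustrative sketch).
        def allocate(requests, guaranteed, weights, capacity):
            """Grant guaranteed bandwidth first, then share the excess by weight."""
            grant = {i: min(requests[i], guaranteed[i]) for i in requests}
            excess = capacity - sum(grant.values())
            unmet = {i: requests[i] - grant[i] for i in requests if requests[i] > grant[i]}
            while excess > 1e-9 and unmet:
                total_w = sum(weights[i] for i in unmet)
                share = {i: excess * weights[i] / total_w for i in unmet}
                for i in list(unmet):
                    extra = min(share[i], unmet[i])   # never grant more than was requested
                    grant[i] += extra
                    unmet[i] -= extra
                    excess -= extra
                    if unmet[i] <= 1e-9:
                        del unmet[i]                  # satisfied; redistribute to the rest
            return grant

        requests   = {0: 40, 1: 10, 2: 55}            # per-ONU requests in a cycle
        guaranteed = {0: 25, 1: 25, 2: 25}            # guaranteed (minimum) shares
        weights    = {0: 0.5, 1: 0.3, 2: 0.2}         # placeholder "profit" weights
        print(allocate(requests, guaranteed, weights, capacity=90))
        # -> roughly {0: 40, 1: 10, 2: 40}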

  18. Speeding up parallel GROMACS on high-latency networks.

    Science.gov (United States)

    Kutzner, Carsten; van der Spoel, David; Fechner, Martin; Lindahl, Erik; Schmitt, Udo W; de Groot, Bert L; Grubmüller, Helmut

    2007-09-01

    We investigate the parallel scaling of the GROMACS molecular dynamics code on Ethernet Beowulf clusters and what prerequisites are necessary for decent scaling even on such clusters with only limited bandwidth and high latency. GROMACS 3.3 scales well on supercomputers like the IBM p690 (Regatta) and on Linux clusters with a special interconnect like Myrinet or Infiniband. Because of the high single-node performance of GROMACS, however, on the widely used Ethernet switched clusters, the scaling typically breaks down when more than two computer nodes are involved, limiting the absolute speedup that can be gained to about 3 relative to a single-CPU run. With the LAM MPI implementation, the main scaling bottleneck is here identified to be the all-to-all communication which is required every time step. During such an all-to-all communication step, a huge amount of messages floods the network, and as a result many TCP packets are lost. We show that Ethernet flow control prevents network congestion and leads to substantial scaling improvements. For 16 CPUs, e.g., a speedup of 11 has been achieved. However, for more nodes this mechanism also fails. Having optimized an all-to-all routine, which sends the data in an ordered fashion, we show that it is possible to completely prevent packet loss for any number of multi-CPU nodes. Thus, the GROMACS scaling dramatically improves, even for switches that lack flow control. In addition, for the common HP ProCurve 2848 switch we find that for optimum all-to-all performance it is essential how the nodes are connected to the switch's ports. This is also demonstrated for the example of the Car-Parrinello MD code.
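
    The ordering idea can be illustrated with a generic pairwise-exchange schedule, sketched below: communication proceeds in rounds, and in round r node i exchanges data only with node i XOR r, so every node sends and receives exactly one message per round instead of all nodes flooding the switch at once. This is a textbook pattern shown for illustration, not the actual GROMACS all-to-all routine, and the node count must be a power of two for the XOR pairing to work.

        # Ordered (pairwise-exchange) all-to-all schedule, simulated in memory.
        n_nodes = 8                                   # power of two for XOR pairing
        data = {i: {j: f"block {i}->{j}" for j in range(n_nodes)} for i in range(n_nodes)}
        received = {i: {} for i in range(n_nodes)}

        for rnd in range(n_nodes):                    # round 0 just keeps the local block
            for i in range(n_nodes):
                partner = i ^ rnd                     # each node talks to exactly one partner
                received[i][partner] = data[partner][i]

        # Every node now holds one block from every other node, one message per round
        assert all(set(received[i]) == set(range(n_nodes)) for i in range(n_nodes))
        print(received[3][5])                         # the block produced on node 5 for node 3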

  19. A Parallel, High-Fidelity Radar Model

    Science.gov (United States)

    Horsley, M.; Fasenfest, B.

    2010-09-01

    Accurate modeling of Space Surveillance sensors is necessary for a variety of applications. Accurate models can be used to perform trade studies on sensor designs, locations, and scheduling. In addition, they can be used to predict system-level performance of the Space Surveillance Network to a collision or satellite break-up event. A high fidelity physics-based radar simulator has been developed for Space Surveillance applications. This simulator is designed in a modular fashion, where each module describes a particular physical process or radar function (radio wave propagation & scattering, waveform generation, noise sources, etc.) involved in simulating the radar and its environment. For each of these modules, multiple versions are available in order to meet the end-users needs and requirements. For instance, the radar simulator supports different atmospheric models in order to facilitate different methods of simulating refraction of the radar beam. The radar model also has the capability to use highly accurate radar cross sections generated by the method of moments, accelerated by the fast multipole method. To accelerate this computationally expensive model, it is parallelized using MPI. As a testing framework for the radar model, it is incorporated into the Testbed Environment for Space Situational Awareness (TESSA). TESSA is based on a flexible, scalable architecture, designed to exploit high-performance computing resources and allow physics-based simulation of the SSA enterprise. In addition to the radar models, TESSA includes hydrodynamic models of satellite intercept and debris generation, orbital propagation algorithms, optical brightness calculations, optical system models, object detection algorithms, orbit determination algorithms, simulation analysis and visualization tools. Within this framework, observations and tracks generated by the new radar model are compared to results from a phenomenological radar model. In particular, the new model will be

  20. Multi-petascale highly efficient parallel supercomputer

    Energy Technology Data Exchange (ETDEWEB)

    Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.; Blumrich, Matthias A.; Boyle, Peter; Brunheroto, Jose R.; Chen, Dong; Cher, Chen -Yong; Chiu, George L.; Christ, Norman; Coteus, Paul W.; Davis, Kristan D.; Dozsa, Gabor J.; Eichenberger, Alexandre E.; Eisley, Noel A.; Ellavsky, Matthew R.; Evans, Kahn C.; Fleischer, Bruce M.; Fox, Thomas W.; Gara, Alan; Giampapa, Mark E.; Gooding, Thomas M.; Gschwind, Michael K.; Gunnels, John A.; Hall, Shawn A.; Haring, Rudolf A.; Heidelberger, Philip; Inglett, Todd A.; Knudson, Brant L.; Kopcsay, Gerard V.; Kumar, Sameer; Mamidala, Amith R.; Marcella, James A.; Megerian, Mark G.; Miller, Douglas R.; Miller, Samuel J.; Muff, Adam J.; Mundy, Michael B.; O' Brien, John K.; O' Brien, Kathryn M.; Ohmacht, Martin; Parker, Jeffrey J.; Poole, Ruth J.; Ratterman, Joseph D.; Salapura, Valentina; Satterfield, David L.; Senger, Robert M.; Smith, Brian; Steinmacher-Burow, Burkhard; Stockdell, William M.; Stunkel, Craig B.; Sugavanam, Krishnan; Sugawara, Yutaka; Takken, Todd E.; Trager, Barry M.; Van Oosten, James L.; Wait, Charles D.; Walkup, Robert E.; Watson, Alfred T.; Wisniewski, Robert W.; Wu, Peng

    2015-07-14

    A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enable a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, each having full access to all system resources, enabling adaptive partitioning of the processors to functions such as compute or messaging I/O on an application-by-application basis and, preferably, adaptive partitioning of functions in accordance with various algorithmic phases within an application; if I/O or other processors are underutilized, they can participate in computation or communication. Nodes are interconnected by a five-dimensional torus network with DMA that optimally maximizes the throughput of packet communications between nodes and minimizes latency.
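
    As a small illustration of why a torus topology keeps communication latency low (a generic sketch, not the patented routing logic), the minimum hop count between two nodes uses the shorter wrap-around direction in every dimension:

        def torus_hops(a, b, dims):
            """Minimum hop count between nodes a and b of a torus with the given
            per-dimension sizes; each dimension takes the shorter wrap direction."""
            return sum(min((x - y) % d, (y - x) % d) for x, y, d in zip(a, b, dims))

        # Example: two nodes of a hypothetical 4x4x4x4x2 five-dimensional torus.
        print(torus_hops((0, 0, 0, 0, 0), (3, 2, 1, 0, 1), (4, 4, 4, 4, 2)))  # 1+2+1+0+1 = 5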

  1. High Speed Peltier Calorimeter for the Calibration of High Bandwidth Power Measurement Equipment

    CERN Document Server

    Frost, Damien F

    2015-01-01

    Accurate power measurements of electronic components operating at high frequencies are vital in determining where power losses occur in a system such as a power converter. Such power measurements must be carried out with equipment that can accurately measure real power at high frequency. We present the design of a high speed calorimeter to address this requirement, capable of reaching a steady state in less than 10 minutes. The system uses Peltier thermoelectric coolers to remove heat generated in a load resistance, and was calibrated against known real power measurements using an artificial neural network. A dead zone controller was used to achieve stable power measurements. The calibration was validated and shown to have an absolute accuracy of +/-8 mW (95% confidence interval) for measurements of real power from 0.1 to 5 W.

  2. Integrated high-speed DFB light source and narrow-bandwidth RCE photodetector for WDM fiber communication network application

    Science.gov (United States)

    Wang, Qiming; Li, Cheng; Pan, Zhong; Luo, Yi

    2000-10-01

    Electroabsorption (EA) modulators integrated with partially gain-coupled distributed feedback (DFB) lasers have been fabricated and show high single-mode yield and wavelength stability. The small signal bandwidth is about 7.5 GHz. Strained Si1-xGex/Si multiple quantum well (MQW) resonant-cavity enhanced (RCE) photodetectors with SiO2/Si distributed Bragg reflectors (DBRs) as the mirrors have been fabricated and show a clear narrow-bandwidth response. The external quantum efficiency at 1.3 micrometers is measured to be about 3.5% under a reverse bias of 16 V. A novel GaInNAs/GaAs MQW RCE p-i-n photodetector with high-reflectance GaAs/AlAs DBR mirrors has also been demonstrated, showing a wavelength-selective detection function with a peak-response FWHM of 12 nm.

  3. XPP A High Performance Parallel Signal Processing Platform for Space Applications

    Science.gov (United States)

    Schueler, E.; Syed, M.; Helfers, T.

    This document presents the eXtreme Processing Platform (XPP), a new runtime-reconfigurable data processing technology developed by PACT GmbH which combines the performance of ASICs with the flexibility of DSPs. The XPP is built using a scalable array of arithmetic processing elements, embedded memories, high bandwidth I/O, an auto-synchronizing packet-oriented data network, and an internal event network that enables the control of program flow using execution flags, and it is designed to support different types of parallelism, such as multithreading, multitasking and multiple parallel instances. The technology promises highly flexible payloads for future telecommunication satellites and scientific missions, and a short time to market, which is much needed to cope with the significant changes in the telecommunication market and rapidly changing customer needs.

  4. Resource Centered Computing delivering high parallel performance

    OpenAIRE

    2014-01-01

    International audience; Modern parallel programming requires a combination of different paradigms, expertise and tuning that correspond to the different levels in today's hierarchical architectures. To cope with the inherent difficulty, ORWL (ordered read-write locks) presents a new paradigm and toolbox centered around local or remote resources, such as data, processors or accelerators. ORWL programmers describe their computation in terms of access to these resources during critical sections. Exclu...

  5. Silicon Photonics for All-Optical Processing and High-Bandwidth-Density Interconnects

    Science.gov (United States)

    Ophir, Noam

    The first chapter of the thesis provides motivation for the integration of silicon photonic modules into compute systems and surveys some of the recent developments in the field. The second chapter then proceeds to detail a technical case study of silicon photonic microring-based WDM links' scalability and power efficiency for these chip I/O applications which could be developed in the intermediate future. The analysis, initiated originally for a workshop on optical and electrical board- and rack-level interconnects, looks into a detailed model of the optical power budget for such a link, capturing both single-channel aspects as well as WDM-operation-related considerations which are unique to the microrings' physical characteristics. The third chapter, while continuing the theme of silicon photonic high-bandwidth-density links, proceeds to detail the first experimental demonstration and characterization of an on-chip spatial division multiplexing (SDM) scheme based on microrings for the multiplexing and demultiplexing functionalities. In the context of more forward-looking optical network-on-chip environments, SDM-enabled WDM photonic interconnects can potentially achieve superior bandwidth densities per waveguide compared to WDM-only photonic interconnects. The microring-based implementation allows dynamic tuning of the multiplexing and demultiplexing characteristics of the system, which allows operation on a WDM grid as well as device tuning to combat intra-channel crosstalk. The characterization focuses on the first reported power penalty measurements for an on-chip silicon photonic SDM link, showing minimal penalties achievable with 3 spatial modes concurrently operating on a single waveguide with 10-Gb/s data carried by each mode. The fourth, fifth, and sixth chapters shift in topic from the application of silicon photonics to communication links to the evolving use of silicon waveguides for nonlinear all-optical processing. Chapter four primarily introduces and motivates

  6. FEREBUS: Highly parallelized engine for kriging training.

    Science.gov (United States)

    Di Pasquale, Nicodemo; Bane, Michael; Davie, Stuart J; Popelier, Paul L A

    2016-11-05

    FFLUX is a novel force field based on quantum topological atoms, combining multipolar electrostatics with IQA intraatomic and interatomic energy terms. The program FEREBUS calculates the hyperparameters of models produced by the machine learning method kriging. Calculation of kriging hyperparameters (θ and p) requires the optimization of the concentrated log-likelihood L̂(θ,p). FEREBUS uses Particle Swarm Optimization (PSO) and Differential Evolution (DE) algorithms to find the maximum of L̂(θ,p). PSO and DE are two heuristic algorithms that each use a set of particles or vectors to explore the space in which L̂(θ,p) is defined, searching for the maximum. The log-likelihood is a computationally expensive function, which needs to be calculated several times during each optimization iteration. The cost scales quickly with the problem dimension and speed becomes critical in model generation. We present the strategy used to parallelize FEREBUS, and the optimization of L̂(θ,p) through PSO and DE. The code is parallelized in two ways. MPI parallelization distributes the particles or vectors among the different processes, whereas the OpenMP implementation takes care of the calculation of L̂(θ,p), which involves the calculation and inversion of a particular matrix, whose size increases quickly with the dimension of the problem. The run time shows a speed-up of 61 times going from single core to 90 cores with a saving, in one case, of ∼98% of the single core time. In fact, the parallelization scheme presented reduces computational time from 2871 s for a single core calculation, to 41 s for 90 cores calculation. © 2016 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc.
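
    The particle swarm step that FEREBUS parallelizes can be illustrated with a generic serial PSO maximizer (a sketch under simplifying assumptions: a stand-in objective instead of the concentrated log-likelihood, and none of the MPI/OpenMP distribution described above):

        import numpy as np

        def pso_maximize(f, bounds, n_particles=30, n_iter=200,
                         w=0.7, c1=1.5, c2=1.5, seed=0):
            """Minimal particle swarm optimizer (maximization)."""
            rng = np.random.default_rng(seed)
            lo, hi = np.asarray(bounds, dtype=float).T      # bounds: (low, high) per dimension
            x = rng.uniform(lo, hi, size=(n_particles, lo.size))
            v = np.zeros_like(x)
            pbest, pbest_val = x.copy(), np.array([f(p) for p in x])
            gbest = pbest[np.argmax(pbest_val)].copy()
            for _ in range(n_iter):
                r1, r2 = rng.random((2, *x.shape))
                v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
                x = np.clip(x + v, lo, hi)
                val = np.array([f(p) for p in x])
                better = val > pbest_val
                pbest[better], pbest_val[better] = x[better], val[better]
                gbest = pbest[np.argmax(pbest_val)].copy()
            return gbest, pbest_val.max()

        # Stand-in objective with a known maximum at (1, 1); FEREBUS would evaluate the
        # concentrated log-likelihood here instead.
        best, value = pso_maximize(lambda p: -np.sum((p - 1.0) ** 2), bounds=[(0, 3), (0, 3)])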

  7. A Low Power High Bandwidth Four Quadrant Analog Multiplier in 32 NM CNFET Technology

    Directory of Open Access Journals (Sweden)

    Vitrag Sheth

    2012-05-01

    Full Text Available Carbon Nanotube Field Effect Transistor (CNFET) is a promising new technology that overcomes several limitations of traditional silicon integrated circuit technology. In recent years, the potential of CNFET for analog circuit applications has been explored. This paper proposes a novel four quadrant analog multiplier design using CNFETs. The simulation based on 32nm CNFET technology shows that the proposed multiplier has very low harmonic distortion (<0.45%), large input range (±400mV), large bandwidth (~50GHz) and low power consumption (~247µW), while operating at a supply voltage of ±0.9V.

  8. Teaching RLC Parallel Circuits in High-School Physics Class

    Science.gov (United States)

    Simon, Alpár

    2015-01-01

    This paper tries to give an alternative treatment of the subjects "parallel RLC circuits" and "resonance in parallel RLC circuits" from the physics curriculum for the XIth grade in Romanian high schools, with an emphasis on practical circuits and their possible applications, and is intended as an aid for both Physics…
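
    For reference, the standard textbook relations for an ideal parallel RLC circuit driven by a current source are given below (the practical circuits emphasized in the article, with resistance in series with the coil, shift the resonance slightly):

        \[
          Y(\omega) = \frac{1}{R} + j\left(\omega C - \frac{1}{\omega L}\right), \qquad
          \omega_0 = \frac{1}{\sqrt{LC}}, \qquad
          Q = R\sqrt{\frac{C}{L}}, \qquad
          \Delta\omega = \frac{\omega_0}{Q} = \frac{1}{RC}.
        \]

    At resonance the admittance is purely real (1/R), so the impedance of the parallel combination is at its maximum.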

  9. Wide-Bandwidth, Wide-Beamwidth, High-Resolution, Millimeter-Wave Imaging for Concealed Weapon Detection

    Energy Technology Data Exchange (ETDEWEB)

    Sheen, David M.; Fernandes, Justin L.; Tedeschi, Jonathan R.; McMakin, Douglas L.; Jones, Anthony M.; Lechelt, Wayne M.; Severtsen, Ronald H.

    2013-06-12

    Active millimeter-wave imaging is currently being used for personnel screening at airports and other high-security facilities. The lateral resolution, depth resolution, clothing penetration, and image illumination quality obtained from next-generation systems can be significantly enhanced through the selection of the aperture size, antenna beamwidth, center frequency, and bandwidth. In this paper, the results of an extensive imaging trade study are presented using both planar and cylindrical three-dimensional imaging techniques at frequency ranges of 10-20 GHz, 10-40 GHz, 40-60 GHz, and 75-105 GHz.

  10. A Broadband, Spectrally Flat, High Rep-rate Frequency Comb: Bandwidth Scaling and Flatness Enhancement of Phase Modulated CW through Cascaded Four-Wave Mixing

    CERN Document Server

    Supradeepa, V R

    2010-01-01

    We demonstrate a scheme to scale the bandwidth by several times while enhancing spectral flatness of frequency combs generated by intensity and phase modulation of CW lasers using cascaded four-wave mixing in highly nonlinear fiber.

  11. Heterogeneous Highly Parallel Implementation of Matrix Exponentiation Using GPU

    CERN Document Server

    Raja, Chittampally Vasanth; Raghavendra, Prakash S; 10.5121/ijdps.2012.3209

    2012-01-01

    The vision of a supercomputer on every desk can be realized by powerful and highly parallel CPUs, GPUs or APUs. Graphics processors, once specialized for graphics applications only, are now used for highly computation-intensive general purpose applications. GFLOP and TFLOP performance, once very expensive, has become very cheap with GPGPUs. The current work focuses mainly on the highly parallel implementation of Matrix Exponentiation. Matrix Exponentiation is widely used in many areas of the scientific community, ranging from highly critical flight and CAD simulations to financial and statistical applications. The proposed solution for Matrix Exponentiation uses OpenCL to exploit the massive parallelism offered by many-core GPGPUs. It employs many general GPU optimizations and architecture-specific optimizations. This experimentation covers optimizations targeted specifically at scientific graphics cards (Tesla C2050). The heterogeneous highly parallel Matrix Exponentiation method has been tested for matrices of ...
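
    The underlying algorithm is binary (square-and-multiply) matrix exponentiation, which needs only O(log n) matrix multiplications; a plain NumPy sketch is shown below (the paper's contribution is mapping the multiplications onto many-core GPUs via OpenCL, which is not reproduced here):

        import numpy as np

        def mat_pow(a, n):
            """Return the n-th power (n >= 0) of the square matrix a by repeated squaring."""
            result = np.eye(a.shape[0], dtype=a.dtype)
            base = a.copy()
            while n > 0:
                if n & 1:              # current bit set: fold the square into the result
                    result = result @ base
                base = base @ base     # square for the next bit
                n >>= 1
            return result

        # Example: powers of the Fibonacci matrix -> [[89, 55], [55, 34]].
        print(mat_pow(np.array([[1, 1], [1, 0]]), 10))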

  12. High-bandwidth Modulation of H2/Syngas Fuel to Control Combustion Dynamics in Micro-Mixing Lean Premix Systems

    Energy Technology Data Exchange (ETDEWEB)

    Jeff Melzak; Tim Lieuwen; Adel Mansour

    2012-01-31

    The goal of this program was to develop and demonstrate fuel injection technologies that will facilitate the development of cost-effective turbine engines for Integrated Gasification Combined Cycle (IGCC) power plants, while improving efficiency and reducing emissions. The program involved developing a next-generation multi-point injector with enhanced stability performance for lean premix turbine systems that burn hydrogen (H2) or synthesis gas (syngas) fuels. A previously developed injector that demonstrated superior emissions performance was improved to enhance static flame stability through zone staging and pilot sheltering. In addition, piezo valve technology was implemented to investigate the potential for enhanced dynamic stability through high-bandwidth modulation of the fuel supply. Prototype injector and valve hardware were tested in an atmospheric combustion facility. The program was successful in meeting its objectives. Specifically, the following was accomplished: Demonstrated improvement of lean operability of the Parker multi-point injector through staging of fuel flow and primary zone sheltering; Developed a piezo valve capable of proportional and high-bandwidth modulation of gaseous fuel flow at frequencies as high as 500 Hz; The valve was shown to be capable of effecting changes to flame dynamics, heat release, and acoustic signature of an atmospheric combustor. The latter achievement indicates the viability of the Parker piezo valve technology for use in future adaptively controlled systems for the mitigation of combustion instabilities, particularly for attenuating combustion dynamics under ultra-lean conditions.

  13. High-bandwidth multimode self-sensing in bimodal atomic force microscopy

    Directory of Open Access Journals (Sweden)

    Michael G. Ruppert

    2016-02-01

    Full Text Available Using standard microelectromechanical system (MEMS) processes to coat a microcantilever with a piezoelectric layer results in a versatile transducer with inherent self-sensing capabilities. For applications in multifrequency atomic force microscopy (MF-AFM), we illustrate that a single piezoelectric layer can be simultaneously used for multimode excitation and detection of the cantilever deflection. This is achieved by a charge sensor with a bandwidth of 10 MHz and dual feedthrough cancellation to recover the resonant modes that are heavily buried in feedthrough originating from the piezoelectric capacitance. The setup enables the omission of the commonly used piezoelectric stack actuator and optical beam deflection sensor, alleviating limitations due to distorted frequency responses and instrumentation cost, respectively. The proposed method benefits from a more than two orders of magnitude increase in deflection-to-strain sensitivity on the fifth eigenmode, leading to a remarkable signal-to-noise ratio. Experimental results using bimodal AFM imaging on a two-component polymer sample validate that the self-sensing scheme can therefore be used to provide both the feedback signal, for topography imaging on the fundamental mode, and phase imaging on the higher eigenmode.

  14. Instantaneous high-resolution focus tracking and a vibrometry system using parallel phase shift interferometry

    Science.gov (United States)

    Ney, Michael; Safrani, Avner; Abdulhlaim, Ibrahim

    2016-09-01

    A high resolution, fast focus tracking and vibrometry system based on parallel phase shift polarization interferometry using three detectors is presented. The basic design and algorithm are described, followed by an experimental demonstration showing sub-nm resolution of different controlled motion profiles instantaneously monitored at a feedback rate of 100 kHz. Because the method does not rely on active optical components, it potentially allows extremely high vibration rates to be measured, limited only by the detector bandwidth and sampling rate. In addition, the relatively simple design, which relies only on standard optical equipment combined with a simple algorithm, makes setting up a high-performance vibrometry system cheap and readily accessible.
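
    One standard way a three-detector phase-shifting scheme recovers the phase is the three-step formula below, assuming (for illustration only; the paper's polarization optics may impose different shifts) that the detectors sample the interference signal with relative phase offsets of 120°:

        \[
          I_k = A\bigl[1 + \gamma\cos(\varphi + 2\pi k/3)\bigr], \quad k = 0,1,2,
          \qquad
          \varphi = \arctan\!\left(\frac{\sqrt{3}\,(I_2 - I_1)}{2I_0 - I_1 - I_2}\right),
        \]

    and for a double-pass reflective geometry the displacement follows from the unwrapped phase as d = (λ/4π)·φ.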

  15. A high-speed linear algebra library with automatic parallelism

    Science.gov (United States)

    Boucher, Michael L.

    1994-01-01

    Parallel or distributed processing is key to getting the highest performance from workstations. However, designing and implementing efficient parallel algorithms is difficult and error-prone. It is even more difficult to write code that is both portable to and efficient on many different computers. Finally, it is harder still to satisfy the above requirements and include the reliability and ease of use required of commercial software intended for use in a production environment. As a result, the application of parallel processing technology to commercial software has been extremely limited, even though there are numerous computationally demanding programs that would significantly benefit from parallel processing. This paper describes DSSLIB, which is a library of subroutines that perform many of the time-consuming computations in engineering and scientific software. DSSLIB combines the high efficiency and speed of parallel computation with a serial programming model that eliminates many undesirable side-effects of typical parallel code. The result is a simple way to incorporate the power of parallel processing into commercial software without compromising maintainability, reliability, or ease of use. This gives significant advantages over less powerful non-parallel entries in the market.

  16. Multi-input and binary reproducible, high bandwidth floating point adder in a collective network

    Energy Technology Data Exchange (ETDEWEB)

    Chen, Dong; Eisley, Noel A.; Heidelberger, Philip; Steinmacher-Burow, Burkhard

    2016-11-15

    To add floating point numbers in a parallel computing system, a collective logic device receives the floating point numbers from computing nodes. The collective logic device converts the floating point numbers to integer numbers, adds the integer numbers and generates a summation of the integer numbers. The collective logic device then converts the summation back to a floating point number. The collective logic device performs the receiving, the converting of the floating point numbers, the adding, the generating and the converting of the summation in one pass. One pass indicates that the computing nodes send inputs only once to the collective logic device and receive outputs only once from the collective logic device.
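
    The reproducibility trick, converting to integers so the addition becomes exact and order-independent, can be illustrated in a few lines (a simplified sketch with an assumed global fixed-point scale; the patented hardware instead aligns operands inside the collective logic):

        def reproducible_sum(values, fraction_bits=40):
            """Sum floats via a fixed-point integer intermediate: integer addition is
            associative, so the result is bit-identical for any summation order."""
            scale = 1 << fraction_bits
            total = sum(round(v * scale) for v in values)   # Python ints never overflow
            return total / scale

        data = [0.1, 0.2, 0.3, 1e-9]
        assert reproducible_sum(data) == reproducible_sum(list(reversed(data)))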

  17. Multi-input and binary reproducible, high bandwidth floating point adder in a collective network

    Energy Technology Data Exchange (ETDEWEB)

    Chen, Dong; Eisley, Noel A; Heidelberger, Philip; Steinmacher-Burow, Burkhard

    2015-03-10

    To add floating point numbers in a parallel computing system, a collective logic device receives the floating point numbers from computing nodes. The collective logic device converts the floating point numbers to integer numbers, adds the integer numbers and generates a summation of the integer numbers. The collective logic device then converts the summation back to a floating point number. The collective logic device performs the receiving, the converting of the floating point numbers, the adding, the generating and the converting of the summation in one pass. One pass indicates that the computing nodes send inputs only once to the collective logic device and receive outputs only once from the collective logic device.

  18. High power parallel ultrashort pulse laser processing

    Science.gov (United States)

    Gillner, Arnold; Gretzki, Patrick; Büsing, Lasse

    2016-03-01

    The class of ultra-short-pulse (USP) laser sources is used whenever high precision and high quality material processing is demanded. These laser sources deliver pulse durations in the range of ps to fs and are characterized by high peak intensities, leading to direct vaporization of the material with minimal thermal damage. With the availability of industrial laser sources with an average power of up to 1000 W, the main challenge consists of effective energy distribution and deposition. Using lasers with high repetition rates in the MHz region can cause thermal issues like overheating, melt production and low ablation quality. In this paper, we will discuss different approaches to multibeam processing for the utilization of high pulse energies. The combination of diffractive optics and a conventional galvanometer scanner can be used for high throughput laser ablation, but is limited in optical quality. We will show which applications can benefit from this hybrid optic and which improvements in productivity are expected. In addition, the optical limitations of the system will be compiled in order to evaluate the suitability of this approach for any given application.

  19. High speed single-wavelength modulation and transmission at 2 μm under bandwidth-constrained condition.

    Science.gov (United States)

    Xu, Ke; Wu, Qiong; Xie, Yongqiang; Tang, Ming; Fu, Songnian; Liu, Deming

    2017-02-20

    The 2-μm optical band has gained much attention recently due to its potential applications in optical fiber communication systems. One constraint in this wavelength region is that the electrical bandwidth of components like modulators and photodetectors is limited by the immature manufacturing technologies. Here we experimentally demonstrate high-speed signal generation and transmission under a bandwidth-constrained scenario at 2 μm. It is enabled by the direct-detection optical filter bank multicarrier (FBMC) modulation technique with constant amplitude zero autocorrelation (CAZAC) equalization. We achieved a single-wavelength 80 Gbit/s data rate using the 16-QAM FBMC modulation format, which, to the best of our knowledge, is the highest single-channel bit rate at 2 μm. The signal is transmitted through a 100 m-long solid-core fiber designed for single-mode transmission at 2 μm. The measured bit error rates of the signals are below the forward error correction limit of 3.8 × 10^-3, and the 100 m fiber transmission brings negligible penalty.
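
    CAZAC (constant amplitude zero autocorrelation) sequences such as the Zadoff-Chu family are commonly used for this kind of equalization; the sketch below only illustrates the CAZAC properties themselves and is not the training design used in the experiment:

        import numpy as np

        def zadoff_chu(root, length):
            """Zadoff-Chu sequence (odd length, root coprime with length): unit amplitude
            and zero periodic autocorrelation at all nonzero lags."""
            n = np.arange(length)
            return np.exp(-1j * np.pi * root * n * (n + 1) / length)

        z = zadoff_chu(root=7, length=139)
        acf = np.fft.ifft(np.fft.fft(z) * np.conj(np.fft.fft(z)))   # circular autocorrelation
        print(np.allclose(np.abs(z), 1.0), np.allclose(acf[1:], 0.0, atol=1e-9))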

  20. High-speed massively parallel scanning

    Science.gov (United States)

    Decker, Derek E.

    2010-07-06

    A new technique for recording a series of images of a high-speed event (such as, but not limited to: ballistics, explosives, laser induced changes in materials, etc.) is presented. Such technique(s) makes use of a lenslet array to take image picture elements (pixels) and concentrate light from each pixel into a spot that is much smaller than the pixel. This array of spots illuminates a detector region (e.g., film, as one embodiment) which is scanned transverse to the light, creating tracks of exposed regions. Each track is a time history of the light intensity for a single pixel. By appropriately configuring the array of concentrated spots with respect to the scanning direction of the detection material, different tracks fit between pixels and sufficient lengths are possible which can be of interest in several high-speed imaging applications.

  1. Low-bandwidth authentication.

    Energy Technology Data Exchange (ETDEWEB)

    Donnelly, Patrick Joseph; McIver, Lauren; Gaines, Brian R.; Anderson, Erik; Collins, Michael Joseph; Thomas,Kurt Adam; McDaniel, Austin

    2007-09-01

    Remotely-fielded unattended sensor networks generally must operate at very low power--in the milliwatt or microwatt range--and thus have extremely limited communications bandwidth. Such sensors might be asleep most of the time to conserve power, waking only occasionally to transmit a few bits. RFID tags for tracking or material control have similarly tight bandwidth constraints, and emerging nanotechnology devices will be even more limited. Since transmitted data is subject to spoofing, and since sensors might be located in uncontrolled environments vulnerable to physical tampering, the high-consequence data generated by such systems must be protected by cryptographically sound authentication mechanisms; but such mechanisms are often lacking in current sensor networks. One reason for this undesirable situation is that standard authentication methods become impractical or impossible when bandwidth is severely constrained; if messages are small, a standard digital signature or HMAC will be many times larger than the message itself, yet it might be possible to spare only a few extra bits per message for security. Furthermore, the authentication tags themselves are only one part of cryptographic overhead, as key management functions (distributing, changing, and revoking keys) consume still more bandwidth. To address this problem, we have developed algorithms that provide secure authentication while adding very little communication overhead. Such techniques will make it possible to add strong cryptographic guarantees of data integrity to a much wider range of systems.
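
    The overhead problem motivating this work is easy to see with standard library primitives (the report's own low-overhead algorithms are not shown here; the snippet only illustrates why a conventional tag dwarfs a tiny sensor message):

        import hmac, hashlib

        reading = b"\x01\x4f\x00\x2a"             # a hypothetical 4-byte sensor sample
        key = b"\x00" * 16                        # placeholder key for illustration
        tag = hmac.new(key, reading, hashlib.sha256).digest()
        print(len(reading), len(tag))             # 4-byte message, 32-byte tag: 8x overhead
        short_tag = tag[:4]                       # truncation saves bandwidth but weakens the guarantee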

  2. Low-bandwidth authentication.

    Energy Technology Data Exchange (ETDEWEB)

    Donnelly, Patrick Joseph; McIver, Lauren; Gaines, Brian R.; Anderson, Erik; Collins, Michael Joseph; Thomas,Kurt Adam; McDaniel, Austin

    2007-09-01

    Remotely-fielded unattended sensor networks generally must operate at very low power--in the milliwatt or microwatt range--and thus have extremely limited communications bandwidth. Such sensors might be asleep most of the time to conserve power, waking only occasionally to transmit a few bits. RFID tags for tracking or material control have similarly tight bandwidth constraints, and emerging nanotechnology devices will be even more limited. Since transmitted data is subject to spoofing, and since sensors might be located in uncontrolled environments vulnerable to physical tampering, the high-consequence data generated by such systems must be protected by cryptographically sound authentication mechanisms; but such mechanisms are often lacking in current sensor networks. One reason for this undesirable situation is that standard authentication methods become impractical or impossible when bandwidth is severely constrained; if messages are small, a standard digital signature or HMAC will be many times larger than the message itself, yet it might be possible to spare only a few extra bits per message for security. Furthermore, the authentication tags themselves are only one part of cryptographic overhead, as key management functions (distributing, changing, and revoking keys) consume still more bandwidth. To address this problem, we have developed algorithms that provide secure authentication while adding very little communication overhead. Such techniques will make it possible to add strong cryptographic guarantees of data integrity to a much wider range of systems.

  3. The FORCE: A highly portable parallel programming language

    Science.gov (United States)

    Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

    1989-01-01

    Here, it is explained why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low-level machine dependencies and to build machine-independent high-level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared-memory multiprocessor executing them.

  4. The FORCE - A highly portable parallel programming language

    Science.gov (United States)

    Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

    1989-01-01

    This paper explains why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low-level machine dependencies and to build machine-independent high-level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared-memory multiprocessor executing them.

  5. High temporal resolution functional MRI using parallel echo volumar imaging

    Energy Technology Data Exchange (ETDEWEB)

    Rabrait, C.; Ciuciu, P.; Ribes, A.; Poupon, C.; Dehaine-Lambertz, G.; LeBihan, D.; Lethimonnier, F. [CEA Saclay, DSV, I2BM, Neurospin, F-91191 Gif Sur Yvette (France); Le Roux, P. [GEHC, Buc (France); Dehaine-Lambertz, G. [Unite INSERM 562, Gif Sur Yvette (France)

    2008-07-01

    Purpose: To combine parallel imaging with 3D single-shot acquisition (echo volumar imaging, EVI) in order to acquire high temporal resolution volumar functional MRI (fMRI) data. Materials and Methods: An improved EVI sequence was associated with parallel acquisition and field of view reduction in order to acquire a large brain volume in 200 msec. Temporal stability and functional sensitivity were increased through optimization of all imaging parameters and Tikhonov regularization of the parallel reconstruction. Two human volunteers were scanned with parallel EVI in a 1.5 T whole-body MR system while submitted to a slow event-related auditory paradigm. Results: Thanks to parallel acquisition, the EVI volumes display a low level of geometric distortions and signal losses. After removal of low-frequency drifts and physiological artifacts, activations were detected in the temporal lobes of both volunteers and voxel-wise hemodynamic response functions (HRF) could be computed. On these HRFs, different habituation behaviors in response to sentence repetition could be identified. Conclusion: This work demonstrates the feasibility of high temporal resolution 3D fMRI with parallel EVI. Combined with advanced estimation tools, this acquisition method should prove useful to measure neural activity timing differences or to study the nonlinearities and non-stationarities of the BOLD response. (authors)
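
    Tikhonov regularization of a linear reconstruction amounts to a damped least-squares solve; the generic NumPy sketch below (with an assumed encoding matrix A and measured data b, not the actual parallel-EVI reconstruction code) shows the normal-equation form:

        import numpy as np

        def tikhonov_solve(A, b, lam):
            """Minimize ||A x - b||^2 + lam * ||x||^2 via the regularized normal equations."""
            n = A.shape[1]
            return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

        # A larger lam damps the noise amplification of an ill-conditioned encoding
        # matrix at the cost of some bias in the reconstructed image.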

  6. Comparison of State-of-the-Art Digital Control and Analogue Control for High Bandwidth Point of Load Converters

    DEFF Research Database (Denmark)

    Jakobsen, Lars Tønnes; Schneider, Henrik; Andersen, Michael Andreas E.

    2008-01-01

    The purpose of this paper is to present a comparison of state-of-the-art digital and analogue control for a Buck converter with synchronous rectification. The digital control scheme is based on a digital self-oscillating modulator that allows the sampling frequency to be higher than the switching frequency of the converter. Voltage mode control is used in both the analogue and digital control schemes. The experimental results show that it is possible to design a digitally controlled Buck converter that has the same performance as can be achieved using commercially available analogue control ICs. The performance of the analogue system can however be increased by using a separate operational amplifier as error amplifier. Thus analogue control is still the best option if high control bandwidth and fast transient response to load steps are important design parameters.

  7. High-Bandwidth Atomic Force Microscopy Reveals a Mechanical Spike Accompanying the Action Potential in Mammalian Nerve Terminals

    Science.gov (United States)

    Salzberg, Brian M.

    2008-03-01

    Information transfer from neuron to neuron within nervous systems occurs when the action potential arrives at a nerve terminal and initiates the release of a chemical messenger (neurotransmitter). In the mammalian neurohypophysis (posterior pituitary), large and rapid changes in light scattering accompany secretion of transmitter-like neuropeptides. In the mouse, these intrinsic optical signals are intimately related to the arrival of the action potential (E-wave) and the release of arginine vasopressin and oxytocin (S-wave). We have used a high bandwidth (20 kHz) atomic force microscope (AFM) to demonstrate that these light scattering signals are associated with changes in nerve terminal volume, detected as nanometer-scale movements of a cantilever positioned on top of the neurohypophysis. The most rapid mechanical response, the "spike", has a duration comparable to that of the action potential (~2 ms) and probably reflects an increase in terminal volume due to H2O movement associated with Na+ influx. Elementary calculations suggest that two H2O molecules accompanying each Na+ ion could account for the ~0.5-1.0 Å increase in the diameter of each terminal during the action potential. Distinguishable from the mechanical "spike", a slower mechanical event, the "dip", represents a decrease in nerve terminal volume, depends upon Ca2+ entry as well as on intra-terminal Ca2+ transients, and appears to monitor events associated with secretion. A simple hypothesis is that this "dip" reflects the extrusion of the dense core granule that comprises the secretory products. These dynamic high bandwidth AFM recordings are the first to monitor mechanical events in nervous systems and may provide novel insights into the mechanism(s) by which excitation is coupled to secretion at nerve terminals.

  8. Radiation-hard/high-speed parallel optical links

    Energy Technology Data Exchange (ETDEWEB)

    Gan, K.K., E-mail: gan@mps.ohio-state.edu [Department of Physics, The Ohio State University, Columbus, OH 43210 (United States); Buchholz, P. [Fachbereich Physik, Universität Siegen, Siegen (Germany); Kagan, H.P.; Kass, R.D.; Moore, J.; Smith, D.S. [Department of Physics, The Ohio State University, Columbus, OH 43210 (United States); Wiese, A.; Ziolkowski, M. [Fachbereich Physik, Universität Siegen, Siegen (Germany)

    2013-12-11

    We have designed an ASIC for use in a parallel optical engine for a new layer of the ATLAS pixel detector in the initial phase of the LHC luminosity upgrade. The ASIC is a 12-channel Vertical Cavity Surface Emitting Laser (VCSEL) array driver capable of operating up to 5 Gb/s per channel. The ASIC is designed using a 130 nm CMOS process to enhance the radiation-hardness. A scheme for redundancy has also been implemented to allow bypassing of a broken VCSEL. The ASIC also contains a power-on reset circuit that sets the ASIC to a default configuration with no signal steering. In addition, the bias and modulation currents of the individual channels are programmable. We have tested the ASIC and the performance up to 5 Gb/s is satisfactory. Furthermore, we are able to program the bias and modulation currents and to bypass a broken VCSEL channel. We are currently upgrading our design to allow operation at 10 Gb/s per channel yielding an aggregated bandwidth of 120 Gb/s. Preliminary results of the design will be presented.

  9. Radiation-hard/high-speed parallel optical links

    Energy Technology Data Exchange (ETDEWEB)

    Gan, K.K., E-mail: gan@mps.ohio-state.edu [Department of Physics, The Ohio State University, Columbus, OH 43210 (United States); Buchholz, P. [Fachbereich Physik, Universität Siegen, Siegen (Germany); Kagan, H.P.; Kass, R.D.; Moore, J.; Smith, D.S. [Department of Physics, The Ohio State University, Columbus, OH 43210 (United States); Wiese, A.; Ziolkowski, M. [Fachbereich Physik, Universität Siegen, Siegen (Germany)

    2014-11-21

    We have designed an ASIC for use in a parallel optical engine for a new layer of the ATLAS pixel detector in the initial phase of the LHC luminosity upgrade. The ASIC is a 12-channel VCSEL (Vertical Cavity Surface Emitting Laser) array driver capable of operating up to 5 Gb/s per channel. The ASIC is designed using a 130 nm CMOS process to enhance the radiation-hardness. A scheme for redundancy has also been implemented to allow bypassing of a broken VCSEL. The ASIC also contains a power-on reset circuit that sets the ASIC to a default configuration with no signal steering. In addition, the bias and modulation currents of the individual channels are programmable. The performance of the first prototype ASIC up to 5 Gb/s is satisfactory. Furthermore, we are able to program the bias and modulation currents and to bypass a broken VCSEL channel. We are currently upgrading our design to allow operation at 10 Gb/s per channel yielding an aggregated bandwidth of 120 Gb/s. Some preliminary results of the design will be presented.

  10. Design considerations of high precision 6-HTRT parallel manipulator

    Institute of Scientific and Technical Information of China (English)

    ZHANG Xiu-feng; SUN Li-ning

    2006-01-01

    This paper describes a new architecture of a parallel robot with six degrees of freedom and focuses on improving the orientation accuracy of the movable platform in terms of mechanism, error correction and control methods. A set of formulations for the inverse kinematics, Jacobian matrix, and forward kinematics of the high precision 6-HTRT parallel robot is presented. The errors existing in the manipulator are analyzed and a novel approach for error correction is advanced. Using DSP techniques, the inverse kinematics is solved in real time with high precision, and the hardware control system is given. The experimental results demonstrate the effectiveness of the proposed technique.

  11. Ultra-low Noise, High Bandwidth, 1550nm HgCdTe APD Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Voxtel Inc. proposes to optimize the design of a large area, 1.55 μm sensitive HgCdTe avalanche photodiode (APD) that achieves high gain with nearly no excess noise....

  12. Electrothermal impedance spectroscopy measurement on high power LiMO2/Li4Ti5O12 battery cell with low bandwidth test setup

    DEFF Research Database (Denmark)

    Swierczynski, Maciej Jozef; Stroe, Daniel Loan; Stanciu, Tiberiu

    2015-01-01

    Modern lithium-ion batteries, like the LiMO2/Li4Ti5O12 chemistry, have very high power capability, which drives the need for precise thermal modelling of the battery. Battery thermal models are required to avoid possible safety issues (thermal runaways, high-temperature gradients) but also … high-bandwidth and high-current capability for large format battery cells. Thus, this paper evaluates the possibility and accuracy of performing ETIS measurements with a standard battery test station (or bidirectional power supply) with low bandwidth.

  13. A High-Performance Communication Service for Parallel Servo Computing

    Directory of Open Access Journals (Sweden)

    Cheng Xin

    2010-11-01

    Full Text Available The complexity of the algorithms for servo control in multi-dimensional, ultra-precise stage applications has made multi-processor parallel computing technology necessary. Considering the specific communication requirements in parallel servo computing, we propose a communication service scheme based on the VME bus, which provides high-performance data transmission and precise synchronization trigger support for the processors involved. The communication service is implemented on both the standard VME bus and a user-defined Internal Bus (IB), and can be redefined online. This paper introduces the parallel servo computing architecture and communication service, describes the structure and implementation details of each module in the service, and finally provides a data transmission model and analysis. Experimental results show that the communication service can provide high-speed data transmission with sub-nanosecond-level error of transmission latency, and synchronous triggering with nanosecond-level synchronization error. Moreover, the performance of the communication service is not affected by an increasing number of processors.

  14. Development of Advanced Low Emission Injectors and High-Bandwidth Fuel Flow Modulation Valves

    Science.gov (United States)

    Mansour, Adel

    2015-01-01

    Parker Hannifin Corporation developed the 3-Zone fuel nozzle for NASA's Environmentally Responsible Aviation Program to meet NASA's target of 75% LTO NOx reduction relative to the CAEP6 regulation. The nozzle concept was envisioned as a drop-in replacement for the currently used fuel nozzle stem, and is built up from laminates to provide energetic mixing suitable for lean direct injection mode at high combustor pressure. A high frequency fuel valve was also developed to provide fuel modulation for the pilot injector. Final test results show the LTO NOx level falling just shy of NASA's goal at 31.

  15. High Bandwidth Zero Voltage Injection Method for Sensorless Control of PMSM

    DEFF Research Database (Denmark)

    Ge, Xie; Lu, Kaiyuan; Kumar, Dwivedi Sanjeet

    2014-01-01

    High frequency signal injection is widely used in PMSM sensorless control systems for low-speed operation. The conventional voltage injection method often needs filters to obtain a particular harmonic component in order to estimate the rotor position, or it requires several voltage pulses to be injected before the position may be estimated. In this paper, a single-pulse zero-voltage injection method is proposed. The rotor position is directly estimated from the current ripple at half of the switching frequency. No machine parameters are needed and the use of filters is avoided. This results in a fast current regulation performance. Injection of zero voltage also minimizes the inverter voltage error effects caused by the dead-time.

  16. Achieving High Resolution Measurements Within Limited Bandwidth Via Sensor Data Compression

    Science.gov (United States)

    2013-06-01

    are buffered separately and then saved when peaks are detected. The data are time stamped and inserted into a first-in, first-out (FIFO) buffer. ... 16 samples around the peak are saved. These samples are combined with 2 solar sync words and 2 time stamp words, and are buffered into a FIFO for ... (Block-diagram labels: A/D, high speed frames, low speed frames, solar buffer, peak detect, time-stamped solar pulse, TX buffer, solar FIFO, output frame.)

  17. High flux, narrow bandwidth Compton light sources via extended laser-electron interactions

    Science.gov (United States)

    Barty, V P

    2015-01-13

    New configurations of lasers and electron beams efficiently and robustly produce high flux beams of bright, tunable, polarized quasi-monoenergetic x-rays and gamma-rays via laser-Compton scattering. Specifically, the use of long-duration pulsed lasers and closely-spaced, low-charge and low-emittance bunches of electron beams increases the spectral flux of the Compton-scattered x-rays and gamma rays, increases the efficiency of the laser-electron interaction and significantly reduces the overall complexity of Compton-based light sources.

  18. Parallel Beam Dynamics Code Development for High Intensity Cyclotron

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    1 Parallel PIC algorithm The self-field solver is the key part of a high intensity beam dynamics PIC code, which usually adopts the P-M (Particle-Mesh) method to solve the space charge. The P-M method is composed of four major
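
    The first step of any P-M cycle is depositing the particle charges onto the mesh; a one-dimensional cloud-in-cell sketch is shown below (an illustration of the general method only, since the record is truncated and the code's actual 3D solver is not described):

        import numpy as np

        def deposit_charge_cic(positions, charges, n_cells, length):
            """1-D cloud-in-cell deposition: each charge is shared between the two
            nearest grid nodes in proportion to its distance from them (periodic box)."""
            charges = np.asarray(charges, dtype=float)
            dx = length / n_cells
            x = (np.asarray(positions, dtype=float) / dx) % n_cells
            i = np.floor(x).astype(int)
            w = x - i                                     # fraction toward the right-hand node
            rho = np.zeros(n_cells)
            np.add.at(rho, i % n_cells, (1.0 - w) * charges)
            np.add.at(rho, (i + 1) % n_cells, w * charges)
            return rho / dx                               # charge density per unit length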

  19. Remote parallel rendering for high-resolution tiled display walls

    KAUST Repository

    Nachbaur, Daniel

    2014-11-01

    © 2014 IEEE. We present a complete, robust and simple to use hardware and software stack delivering remote parallel rendering of complex geometrical and volumetric models to high resolution tiled display walls in a production environment. We describe the setup and configuration, present preliminary benchmarks showing interactive framerates, and describe our contributions for a seamless integration of all the software components.

  20. A parallel microfluidic flow cytometer for high-content screening.

    Science.gov (United States)

    McKenna, Brian K; Evans, James G; Cheung, Man Ching; Ehrlich, Daniel J

    2011-05-01

    A parallel microfluidic cytometer (PMC) uses a high-speed scanning photomultiplier-based detector to combine low-pixel-count, one-dimensional imaging with flow cytometry. The 384 parallel flow channels of the PMC decouple count rate from signal-to-noise ratio. Using six-pixel one-dimensional images, we investigated protein localization in a yeast model for human protein misfolding diseases and demonstrated the feasibility of a nuclear-translocation assay in Chinese hamster ovary (CHO) cells expressing an NFκB-EGFP reporter.

  1. Design of a Real-Time Face Detection Parallel Architecture Using High-Level Synthesis

    Directory of Open Access Journals (Sweden)

    Yang Fan

    2008-01-01

    Full Text Available We describe a High-Level Synthesis implementation of a parallel architecture for face detection. The chosen face detection method is the well-known Convolutional Face Finder (CFF) algorithm, which consists of a pipeline of convolution operations. We rely on dataflow modelling of the algorithm and we use a high-level synthesis tool in order to specify the local dataflows of our Processing Element (PE), by describing in C language inter-PE communication, fine scheduling of the successive convolutions, and memory distribution and bandwidth. Using this approach, we explore several implementation alternatives in order to find a compromise between processing speed and area of the PE. We then build a parallel architecture composed of a PE ring and a FIFO memory, which constitutes a generic architecture capable of processing images of different sizes. A ring of 25 PEs running at 80 MHz is able to process 127 QVGA images per second or 35 VGA images per second.

  2. Design of a Real-Time Face Detection Parallel Architecture Using High-Level Synthesis

    Directory of Open Access Journals (Sweden)

    2009-02-01

    Full Text Available We describe a High-Level Synthesis implementation of a parallel architecture for face detection. The chosen face detection method is the well-known Convolutional Face Finder (CFF) algorithm, which consists of a pipeline of convolution operations. We rely on dataflow modelling of the algorithm and we use a high-level synthesis tool in order to specify the local dataflows of our Processing Element (PE), by describing in C language inter-PE communication, fine scheduling of the successive convolutions, and memory distribution and bandwidth. Using this approach, we explore several implementation alternatives in order to find a compromise between processing speed and area of the PE. We then build a parallel architecture composed of a PE ring and a FIFO memory, which constitutes a generic architecture capable of processing images of different sizes. A ring of 25 PEs running at 80 MHz is able to process 127 QVGA images per second or 35 VGA images per second.

  3. High-Efficient Parallel CAVLC Encoders on Heterogeneous Multicore Architectures

    Directory of Open Access Journals (Sweden)

    H. Y. Su

    2012-04-01

    Full Text Available This article presents two high-efficiency parallel realizations of context-based adaptive variable length coding (CAVLC) based on heterogeneous multicore processors. By optimizing the architecture of the CAVLC encoder, three kinds of dependences are eliminated or weakened, including the context-based data dependence, the memory accessing dependence and the control dependence. The CAVLC pipeline is divided into three stages: two scans, coding, and lag packing, and is implemented on two typical heterogeneous multicore architectures. One is a block-based SIMD parallel CAVLC encoder on the multicore stream processor STORM. The other is a component-oriented SIMT parallel encoder on a massively parallel GPU architecture. Both of them exploit rich data-level parallelism. Experimental results show that, compared with the CPU version, a speedup of more than 70 times can be obtained for STORM and over 50 times for the GPU. The implementation of the encoder on STORM can achieve real-time processing for 1080p @ 30 fps and the GPU-based version can satisfy the requirements for 720p real-time encoding. The throughput of the presented CAVLC encoders is more than 10 times higher than that of published software encoders on DSP and multicore platforms.

  4. Wide bandwidth and high resolution planar filter array based on DBR-metasurface-DBR structures

    CERN Document Server

    Horie, Yu; Arbabi, Ehsan; Kamali, Seyedeh Mahsa; Faraon, Andrei

    2016-01-01

    We propose and experimentally demonstrate a planar array of optical bandpass filters composed of low loss dielectric metasurface layers sandwiched between two distributed Bragg reflectors (DBRs). The two DBRs form a Fabry-Pérot resonator whose center wavelength is controlled by the design of the transmissive metasurface layer, which functions as a phase shifting element. We demonstrate an array of bandpass filters with spatially varying center wavelengths covering a wide range of operation wavelengths of 250 nm around λ = 1550 nm (Δλ/λ = 16%). The center wavelengths of each filter are independently controlled only by changing the in-plane geometry of the sandwiched metasurfaces, and the experimentally measured quality factors are larger than 700. The demonstrated filter array can be directly integrated on top of photodetector arrays to realize on-chip high-resolution spectrometers with free-space coupling.
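
    Schematically (an idealized lossless two-mirror picture, not the reported device design), the transmission of such a resonator is the Airy function, and the metasurface tunes the passband by adding its own phase to the round trip:

        \[
          T(\lambda) = \frac{1}{1 + F\sin^2\!\bigl(\delta(\lambda)/2\bigr)}, \qquad
          F = \frac{4R}{(1 - R)^2}, \qquad
          \delta(\lambda) = \frac{4\pi n L}{\lambda} + \phi_{\mathrm{DBR}}(\lambda) + 2\,\phi_{\mathrm{ms}}(\lambda),
        \]

    with transmission peaks where \(\delta = 2\pi m\); changing the in-plane metasurface geometry changes \(\phi_{\mathrm{ms}}\) and hence the center wavelength without changing the cavity length.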

  5. Parallel computation of seismic analysis of high arch dam

    Institute of Scientific and Technical Information of China (English)

    Chen Houqun; Ma Huaifa; Tu Jin; Cheng Guangqing; Tang Juzhen

    2008-01-01

    Parallel computation programs are developed for three-dimensional meso-mechanics analysis of fully-graded dam concrete and seismic response analysis of high arch dams (ADs), based on the Parallel Finite Element Program Generator (PFEPG). The computational algorithms for numerical simulation of the meso-structure of concrete specimens were studied. Taking into account damage evolution, static preload, strain rate effect, and the heterogeneity of the meso-structure of dam concrete, the fracture processes of damage evolution and the configuration of the cracks can be directly simulated. The seismic response analysis of ADs involves all of the following factors: the nonlinear contact due to the opening and slipping of the contraction joints, energy dispersion of the far-field foundation, dynamic interactions of the dam-foundation-reservoir system, and the combined effects of seismic action with all static loads. The correctness, reliability and efficiency of the two parallel computational programs are verified with practical illustrations.

  6. Are highly parallel systems ready for prime time?

    Science.gov (United States)

    Simon, Horst D.

    1990-01-01

    This is the edited and abbreviated transcript of a panel discussion held on May 9, 1989, in Los Angeles, during the conference "Parallel Computational Fluid Dynamics-Implementations and Results Using MIMD Computers." The purpose of the conference was to discuss recent developments in the use of MIMD parallel computers in high-performance computational fluid dynamics. The intent of the panel discussion was to summarize the findings of the meeting, and to give a perspective on the state of the art in using parallel computers for solving large-scale engineering and scientific application problems. The panelists were Creon Levit (NAS Systems Division, NASA-Ames Research Center, Moffett Field, California), Kent Misegades (Cray Research, Inc., Mendota Heights, Minnesota), Gary Montry (Myrias Computer Corporation, Albuquerque, New Mexico), Ken Neves (Boeing Computer Services, Seattle, Washington), Anthony Patera (Massachusetts Institute of Technology, Cambridge, Massachusetts), and Justin Rattner (Intel Scientific Computers, Beaverton, Oregon).

  7. Development of Radiation-Tolerant, Low Mass, High Bandwidth Flexible Printed Circuit Cables for Particle Detection Applications

    Science.gov (United States)

    McFadden, Neil

    2016-03-01

    Design options for meter-long flexible printed circuit cables required for low mass ultra-high speed signal transmission in the high radiation environment of the High Luminosity run of the Large Hadron Collider (LHC) are described. Two dielectric materials were considered in this study, Kapton and a Kapton/Teflon mixture. The design geometry is a differential embedded microstrip with a nominal 100 Ω impedance. Minimal mass and maximal radiation hardness are pre-eminent considerations. The long flexible printed circuit cables are characterized in bit error rate tests (BERT), attenuation versus frequency, mechanical response to stress and temperature change, and RLC decomposition. These tests are performed before and after irradiation with 1 MeV neutrons to 2x10^16/cm^2 and 800 MeV protons to 2x10^16 1 MeV-neq/cm^2. A 1.0 m Kapton cable, with a bandwidth of 6.22 gigabits per second, 0.03% of a radiation length, and no radiation-induced mechanical or electrical degradation is obtained.

  8. A Novel Multi-carrier Radar for High-speed Wide-bandwidth Stepped-Frequency GPR

    Science.gov (United States)

    Kyoo Kim, Dong; Choi, Young Woo; Kang, Do Wook

    2015-04-01

    Ground Penetrating Radar (GPR) is one of the non-destructive testing methods for studying underground situations by using electromagnetic wave radiation. Two classical sensing techniques, impulsive GPR and stepped-frequency GPR, have been used for a long time in various GPR applications. Signal bandwidths generated by the two techniques range from several hundred MHz to several GHz. In the research area of pavement surveys the surveying speed is emphasized, thus impulsive GPR has been preferred to stepped-frequency GPR. To make a complete single scan, stepped-frequency GPR needs hundreds of continuous wave (CW) radiations at different frequencies within its signal bandwidth, which is the main time-consuming process. Impulsive GPR also needs several repeated pulses, for example from 64 to 512 repeated pulses, to complete a single scan. Although both techniques need several repeated internal operations, impulsive GPR is generally considered to be faster than stepped-frequency GPR. On the other hand, many studies of stepped-frequency GPR emphasize that high-resolution scanning accuracy can be achieved by controlling each frequency component differently, for example through the frequency power profile or flexible bandwidth control. For pavement surveys, high-accuracy scanning within one meter depth is required as well as high survey speed. The required accuracy is up to several centimeters in material whose dielectric constant is about 10. When surveying pavement, a multi-element array antenna enhances measurement accuracy, where the scanning region of a 3 meter wide paved road is divided into as many sub-regions as there are antenna elements. For example, when stepped-frequency GPR requires 6 ms for a single scan and a 15-element antenna is considered, the survey speed is limited to 15 km/h in order to scan the road every 5 cm, which is slow compared with common driving conditions on
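
    The trade-off between scan time and resolution follows from the standard stepped-frequency relations (textbook SFCW results, not specific to the proposed multi-carrier system):

        \[
          \Delta R = \frac{c}{2B}, \qquad R_{\mathrm{max}} = \frac{c}{2\,\Delta f}, \qquad N = \frac{B}{\Delta f},
        \]

    where B is the swept bandwidth, Δf the frequency step, N the number of CW tones per scan, and c the propagation velocity in the medium; finer resolution or deeper unambiguous range therefore means more tones and a longer scan, which is what motivates transmitting several carriers in parallel.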

  9. Level-1 Data Driver Card - A high bandwidth radiation tolerant aggregator board for detectors

    CERN Document Server

    Gkountoumis, Panagiotis; The ATLAS collaboration

    2017-01-01

    The Level-1 Data Driver Card (L1DDC) was designed for the needs of the future upgrades of the innermost stations of the ATLAS end-cap muon spectrometer. The L1DDC is a high speed aggregator board capable of communicating with multiple front-end electronics boards. It collects the Level-1 data along with monitoring data and transmits them to a network interface through bidirectional and/or unidirectional fiber links at 4.8 Gbps each. In addition, the L1DDC board distributes trigger, time and configuration data coming from the network interface to the front-end boards. The L1DDC is fully compatible with the Phase II upgrade, where the trigger rate is expected to reach 1 MHz. Three different types of L1DDC boards will be fabricated, handling up to 10.080 Gbps of user data. It consists of custom-made radiation-tolerant ASICs: the GigaBit Transceiver (GBTx), the FEAST DC-DC converter, the Slow Control Adapter (SCA), and the Versatile Transceivers (VTRX) and Transmitters (VTTX). The overall scheme of the data acquis...

  10. High Bandwidth, Multi-Purpose Passive Radar Receiver Design For Aerospace and Geoscience Targets

    Science.gov (United States)

    Vertatschitsch, Laura

    uninterruptible power supply (UPS) for up to 1 hour of continuous operation. In this document we provide technical details of the hardware, firmware, and software of the system and design strategies and decisions. We cover the topic of coherent processing for passive radar, specifically an overview of the cross-ambiguity function as a detection mechanism. While the applications of a system like this are incredibly broad, the initial validation and performance analysis were applied specifically to the detection of aircraft using Digital Television (DTV) broadcast as an illuminator. We present results of both stationary and mobile operation. In stationary operation, the same helicopter has been detected using two different DTV transmissions. Early mobile operation results show the Doppler-spread ground clutter and possible detection of aircraft. In addition to the fully-functional aircraft detection signal chain, alternative FPGA designs are presented with modes for fast sampling on two antennas or four antennas, with access to an aggregate 240 MHz of spectrum, with 8-bit samples. At these extremely high data rates, moderate data loss occurs while saving this data to disk, but as detailed within this document, it can be accounted for and the effects minimized, still allowing for detection of aircraft. With these modes, FM transmission and DTV transmission can be captured synchronously from a single antenna and digitizer feed, an exciting result that offers promise for both aerospace and geoscience applications.
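    The cross-ambiguity function mentioned above is the standard delay-Doppler detection statistic for passive radar. The NumPy sketch below is a generic, unoptimized implementation of that computation, not the system's actual signal chain; the circular treatment of delay and all signal parameters are illustrative assumptions.

```python
import numpy as np

def cross_ambiguity(ref: np.ndarray, surv: np.ndarray, max_delay: int, fs: float):
    """Delay-Doppler map |CAF(tau, f)| for passive-radar detection.

    ref       -- reference (direct-path) samples, complex, length N
    surv      -- surveillance-channel samples, complex, length N
    max_delay -- number of delay bins to evaluate
    fs        -- sample rate in Hz
    """
    n = len(ref)
    caf = np.empty((max_delay, n), dtype=complex)
    for d in range(max_delay):
        # multiply the surveillance channel by the conjugated, delayed reference;
        # an FFT of the product gives the Doppler profile at this delay
        caf[d] = np.fft.fftshift(np.fft.fft(surv * np.conj(np.roll(ref, d))))
    doppler_axis = np.fft.fftshift(np.fft.fftfreq(n, d=1.0 / fs))
    return np.abs(caf), doppler_axis

# Synthetic example: a copy of the reference delayed by 37 samples and Doppler
# shifted by 1 kHz should produce a peak at delay bin 37.
rng = np.random.default_rng(0)
ref = rng.standard_normal(4096) + 1j * rng.standard_normal(4096)
surv = np.roll(ref, 37) * np.exp(2j * np.pi * 1000 * np.arange(4096) / 1e6)
caf, dop = cross_ambiguity(ref, surv, max_delay=64, fs=1e6)
print(np.unravel_index(np.argmax(caf), caf.shape))  # (37, <bin near +1 kHz>)
```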

  11. Parallel preconditioners and high order elements for microwave imaging

    CERN Document Server

    Bonazzoli, M; Rapetti, F; Tournier, P -H

    2016-01-01

    This paper combines the use of high order finite element methods with parallel preconditioners of domain decomposition type for solving electromagnetic problems arising from brain microwave imaging. The numerical algorithms involved in such complex imaging systems are computationally expensive since they require solving the direct problem of Maxwell's equations several times. Moreover, wave propagation problems in the high frequency regime are challenging because a sufficiently high number of unknowns is required to accurately represent the solution. In order to use these algorithms in practice for brain stroke diagnosis, running time should be reasonable. The method presented in this paper, coupling high order finite elements and parallel preconditioners, makes it possible to reduce the overall computational cost and simulation time while maintaining accuracy.

  12. POTENTIAL: A Highly Adaptive Core of Parallel Database System

    Institute of Scientific and Technical Information of China (English)

    文继荣; 陈红; 王珊

    2000-01-01

    POTENTIAL is a virtual database machine based on general computing platforms, especially parallel computing platforms. It provides a complete solution to high-performance database systems by a 'virtual processor + virtual data bus + virtual memory' architecture. Virtual processors manage all CPU resources in the system, on which various operations are running. The virtual data bus is responsible for managing data transmission between associated operations, forming the hinge of the entire system. Virtual memory provides efficient data storage and buffering mechanisms that conform to data reference behaviors in database systems. The architecture of POTENTIAL is very clear and has many good features, including high efficiency, high scalability, high extensibility, high portability, etc.

  13. Electro-optic prism-pair setup for efficient high bandwidth isochronous CEP phase shift or group delay generation

    Science.gov (United States)

    Gobert, Olivier; Mennerat, Gabriel; Cornaggia, Christian; Lupinski, Dominique; Perdrix, Michel; Guillaumet, Delphine; Lepetit, Fabien; Oksenhendler, Thomas; Comte, Michel

    2016-05-01

    We report the experimental demonstration of an electro-optic prism pair pure carrier-envelope phase (CEP) shifter at low voltage (shift of 1 rad for a voltage of 90 V, applied to a crystal of 5 mm aperture). Validating our mathematical model, the experiments prove that this set-up, which uses two rubidium titanyl phosphate (RTP) crystals, can be used either as an efficient high bandwidth CEP shifter without modifying the group delay of an ultrashort pulse (isochronous CEP shifter) or alternatively as a group delay generator with quasi-constant CEP (Pure Group Delay generator). These two configurations, which correspond to specific geometries, are characterized by spectral interferometry with a 800 nm mode-locked Ti:sapphire laser. The results are in very good agreement with the model. In the pure group delay mode, a group delay of 2.3 fs is obtained at 1000 V/cm without significant CEP shift. In the isochronous mode, a shift of 5.5 rad at 1000 V/cm is generated without significant delay. The applied voltage is also lowered by a factor of nearly three in this configuration, compared to the case of an RTP rectangular slab of the same total length.

  14. Remote, Real-time Investigations of Extreme Environments Using High Power and Bandwidth Cabled Observatories: The OOI Regional Scale Nodes

    Science.gov (United States)

    Kelley, D. S.; Delaney, J. R.

    2012-12-01

    Methane hydrate deposits and hydrothermal vents are two of the most extreme environments on Earth. Seismic events and flow of gases from the seafloor support and modulate novel microbial communities within these systems. Although studied intensely for several decades, significant questions remain about the flux of heat, volatiles and microbial material from the subsurface to the hydrosphere in these dynamic environments. Quantification of microbial communities, their structure and abundances, and their metabolic activities is in its infancy. To better understand these systems, the National Science Foundation's Ocean Observatories Initiative has installed high power (8 kW), high bandwidth (10 Gb/s) nodes on the seafloor that provide access to active methane seeps at Southern Hydrate Ridge, and at the most magmatically robust volcano on the Juan de Fuca Ridge - Axial Seamount. The real-time interactive capabilities of the cabled observatory are critical to studying gas-hydrate systems because many of the key processes occur over short time scales. Events such as bubble plume formation, the creation of collapse zones, and increased seepage in response to earthquakes require adaptive response and sampling capabilities. To meet these challenges a suite of instruments will be connected to the cable in 2013. These sensors include full-resolution sampling by upward-looking sonars, fluid and gas chemical characterization by mass spectrometers and osmo samplers, long-duration collection of seep imagery from cameras, and in situ manipulation of chemical sensors coupled with flow meters. In concert, this instrument suite will provide quantification of transient and more stable chemical fluxes. Similarly, at Axial Seamount the high bandwidth and high power fiber optic cables will be used to communicate with and power a diverse array of sensors at the summit of the volcano. Real-time high definition video will provide unprecedented views of macrofaunal and microbial communities

  15. VisIO: enabling interactive visualization of ultra-scale, time-series data via high-bandwidth distributed I/O systems

    Energy Technology Data Exchange (ETDEWEB)

    Mitchell, Christopher J [Los Alamos National Laboratory; Ahrens, James P [Los Alamos National Laboratory; Wang, Jun [UCF

    2010-10-15

    Petascale simulations compute at resolutions ranging into billions of cells and write terabytes of data for visualization and analysis. Interactive visualization of this time series is a desired step before starting a new run. The I/O subsystem and associated network often are a significant impediment to interactive visualization of time-varying data, as they are not configured or provisioned to provide necessary I/O read rates. In this paper, we propose a new I/O library for visualization applications: VisIO. Visualization applications commonly use N-to-N reads within their parallel enabled readers, which provides an incentive for a shared-nothing approach to I/O, similar to other data-intensive approaches such as Hadoop. However, unlike other data-intensive applications, visualization requires: (1) interactive performance for large data volumes, (2) compatibility with MPI and POSIX file system semantics for compatibility with existing infrastructure, and (3) use of existing file formats and their stipulated data partitioning rules. VisIO provides a mechanism for using a non-POSIX distributed file system to provide linear scaling of I/O bandwidth. In addition, we introduce a novel scheduling algorithm that helps to co-locate visualization processes on nodes with the requested data. Testing using VisIO integrated into ParaView was conducted using the Hadoop Distributed File System (HDFS) on TACC's Longhorn cluster. A representative dataset, VPIC, across 128 nodes showed a 64.4% read performance improvement compared to the provided Lustre installation. Also tested was a dataset representing a global ocean salinity simulation that showed a 51.4% improvement in read performance over Lustre when using our VisIO system. VisIO provides powerful high-performance I/O services to visualization applications, allowing for interactive performance with ultra-scale, time-series data.

  16. Power spectrum analysis for optical tweezers. II: Laser wavelength dependence of parasitic filtering, and how to achieve high bandwidth

    DEFF Research Database (Denmark)

    Berg-Sørensen, Kirstine; Peterman, Erwin J G; Weber, Tom

    2006-01-01

    In a typical optical tweezers detection system, the position of a trapped object is determined from laser light impinging on a quadrant photodiode. When the laser is infrared and the photodiode is of silicon, they can act together as an unintended low-pass filter. This parasitic effect is due...... this detection system of optical tweezers a bandwidth, accuracy, and precision that are limited only by the data acquisition board's bandwidth and bandpass ripples, here 96.7 kHz and 0.005 dB, respectively. ©2006 American Institute of Physics...

  17. Weak-quasi-bandwidth and forward-bandwidth of graphs

    Institute of Scientific and Technical Information of China (English)

    原晋江

    1996-01-01

    Concepts of weak-quasi-bandwidth and forward-bandwidth of graphs are introduced. They are used to study the following problems in graph theory: bandwidth, topological bandwidth, fill-in, profile, path-width, tree-width.
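    For readers unfamiliar with the underlying notion, the classical bandwidth of a graph is the smallest, over all vertex orderings, of the largest index distance between adjacent vertices; the weak-quasi- and forward-bandwidth variants introduced in the paper refine this idea. The brute-force Python sketch below illustrates only the classical definition on toy graphs.

```python
from itertools import permutations

def bandwidth_of_ordering(edges, ordering):
    """Largest index distance between adjacent vertices under one ordering."""
    pos = {v: i for i, v in enumerate(ordering)}
    return max(abs(pos[u] - pos[v]) for u, v in edges)

def graph_bandwidth(vertices, edges):
    """Exact (brute-force) classical bandwidth; feasible only for tiny graphs."""
    return min(bandwidth_of_ordering(edges, p) for p in permutations(vertices))

# The path P4 has bandwidth 1; the cycle C4 has bandwidth 2.
print(graph_bandwidth([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)]))          # 1
print(graph_bandwidth([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (3, 0)]))  # 2
```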

  18. Polarization mode dispersion spectrum measurement via high-speed wavelength-parallel polarimetry.

    Science.gov (United States)

    Xu, Li; Wang, Shawn X; Miao, Houxun; Weiner, Andrew M

    2009-08-20

    We report experiments in which wavelength-parallel spectral polarimetry technology is used for measurement of the frequency-dependent polarization mode dispersion (PMD) vector. Experiments have been performed using either a grating spectral disperser, configured to provide 13.6 GHz spectral resolution over a 14 nm optical bandwidth, or a virtually imaged phased array spectral disperser, configured for 1.6 GHz spectral resolution over a 200 GHz band. Our results indicate that the spectral polarimetry data obtained via this approach are of sufficient quality to permit accurate extraction of the PMD spectrum. The wavelength-parallel spectral polarimetry approach allows data acquisition within a few milliseconds.

  19. Parallel Microcracks-based Ultrasensitive and Highly Stretchable Strain Sensors.

    Science.gov (United States)

    Amjadi, Morteza; Turan, Mehmet; Clementson, Cameron P; Sitti, Metin

    2016-03-01

    There is an increasing demand for flexible, skin-attachable, and wearable strain sensors due to their various potential applications. However, achieving strain sensors with both high sensitivity and high stretchability is still a grand challenge. Here, we propose highly sensitive and stretchable strain sensors based on the reversible microcrack formation in composite thin films. Controllable parallel microcracks are generated in graphite thin films coated on elastomer films. Sensors made of graphite thin films with short microcracks possess high gauge factors (maximum value of 522.6) and stretchability (ε ≥ 50%), whereas sensors with long microcracks show ultrahigh sensitivity (maximum value of 11,344) with limited stretchability (ε ≤ 50%). We demonstrate the high performance strain sensing of our sensors in both small and large strain sensing applications such as human physiological activity recognition, human body large motion capturing, vibration detection, pressure sensing, and soft robotics.
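    The sensitivity figures quoted above are gauge factors, i.e. the relative resistance change divided by the applied strain. The snippet below simply evaluates that definition; the numbers are hypothetical, chosen only to reproduce the order of magnitude reported for the short-microcrack sensors.

```python
def gauge_factor(delta_r_over_r: float, strain: float) -> float:
    """Gauge factor GF = (dR/R) / strain, the usual figure of merit for
    resistive strain sensors such as the microcracked graphite films above."""
    return delta_r_over_r / strain

# Hypothetical numbers: a relative resistance change of 261.3 at 50% strain
# corresponds to GF ~ 522.6, the order of magnitude quoted for short microcracks.
print(gauge_factor(261.3, 0.5))
```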

  20. Direct drive digital servo press with high parallel control

    Science.gov (United States)

    Murata, Chikara; Yabe, Jun; Endou, Junichi; Hasegawa, Kiyoshi

    2013-12-01

    The direct drive digital servo press has been developed as a university-industry joint research and development effort since 1998. On the basis of this result, a 4-axis direct drive digital servo press was developed and brought to market in April 2002. This servo press is composed of one slide supported by 4 ball screws, and each axis has a linear scale measuring its position with accuracy better than the micrometer level. Each axis is controlled independently by a servo motor and feedback system. This system can keep a high level of parallelism and high accuracy even under a highly eccentric load. Furthermore, 'full stroke full power' is obtained by using ball screws. Using these features, various new types of press forming and stamping have been developed and put into production. The new stamping and forming methods are introduced, together with a strategy for press forming with high added value that meets manufacturing needs, and the future direction of press forming is also discussed.

  1. Parallel input parallel output high voltage bi-directional converters for driving dielectric electro active polymer actuators

    DEFF Research Database (Denmark)

    Thummala, Prasanth; Zhang, Zhe; Andersen, Michael A. E.;

    2014-01-01

    is to design and implement driving circuits for the DEAP actuators for their use in various applications. This paper presents implementation of parallel input, parallel output, high voltage (~2.5 kV) bi-directional DC-DC converters for driving the DEAP actuators. The topology is a bidirectional flyback DC......-DC converter incorporating commercially available high voltage MOSFETs (4 kV) and high voltage diodes (5 kV). Although the average current of the aforementioned devices is limited to 300 mA and 150 mA, respectively, connecting the outputs of multiple converters in parallel can provide a scalable design....... This enables operating the DEAP actuators in various static and dynamic applications e.g. positioning, vibration generation or damping, and pumps. The proposed idea is experimentally verified by connecting three high voltage converters in parallel to operate a single DEAP actuator. The experimental results...

  2. Parallelism and pipelining in high-speed digital simulators

    Science.gov (United States)

    Karplus, W. J.

    1983-01-01

    The attainment of high computing speed as measured by computational throughput is seen as one of the most challenging requirements. It is noted that high speed is cardinal in several distinct classes of applications. These classes are then discussed; they comprise (1) the real-time simulation of dynamic systems, (2) distributed parameter systems, and (3) mixed lumped and distributed systems. From the 1950s on, the quest for high speed in digital simulators concentrated on overcoming the limitations imposed by the so-called von Neumann bottleneck. Two major architectural approaches have made it possible to circumvent this bottleneck and attain high speeds. These are pipelining and parallelism. Supercomputers, peripheral array processors, and microcomputer networks are then discussed.

  3. High Performance Parallel Methods for Space Weather Simulations

    Science.gov (United States)

    Hunter, Paul (Technical Monitor); Gombosi, Tamas I.

    2003-01-01

    This is the final report of our NASA AISRP grant entitled 'High Performance Parallel Methods for Space Weather Simulations'. The main thrust of the proposal was to achieve significant progress towards new high-performance methods which would greatly accelerate global MHD simulations and eventually make it possible to develop first-principles based space weather simulations which run much faster than real time. We are pleased to report that with the help of this award we made major progress in this direction and developed the first parallel implicit global MHD code with adaptive mesh refinement. The main limitation of all earlier global space physics MHD codes was the explicit time stepping algorithm. Explicit time steps are limited by the Courant-Friedrichs-Lewy (CFL) condition, which essentially ensures that no information travels more than a cell size during a time step. This condition represents a non-linear penalty for highly resolved calculations, since finer grid resolution (and consequently smaller computational cells) not only results in more computational cells, but also in smaller time steps.
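    The CFL restriction described above can be made concrete with a one-line estimate of the largest stable explicit time step. The sketch below uses the generic form dt <= C * dx / v_max with illustrative numbers; it is not code from the reported MHD solver.

```python
def explicit_cfl_timestep(dx: float, v_max: float, cfl: float = 0.8) -> float:
    """Largest stable explicit time step under the CFL condition:
    information may not cross more than (roughly) one cell per step,
    i.e. dt <= cfl * dx / v_max."""
    return cfl * dx / v_max

# Illustrative numbers only (not from the report): halving the cell size halves
# the allowed time step, so a 2x refinement of a 3-D explicit run costs ~16x
# more work (8x more cells, 2x more steps).
print(explicit_cfl_timestep(dx=1.0e7, v_max=5.0e5))  # coarse cell: 16 s
print(explicit_cfl_timestep(dx=5.0e6, v_max=5.0e5))  # refined cell: 8 s
```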

  4. High power, picosecond green laser based on a frequency-doubled, all-fiber, narrow-bandwidth, linearly polarized, Yb-doped fiber laser

    Science.gov (United States)

    Tian, Wenyan; Isyanova, Yelena; Stegeman, Robert; Huang, Ye; Chieffo, Logan R.; Moulton, Peter F.

    2016-03-01

    We report on the development of an all-fiber, 68-kW-peak-power, 16-ps-pulse-width, narrow-bandwidth, linearly polarized, 1064-nm fiber laser suitable for high-power, picosecond-pulse-width, green-light generation. Our 1064-nm fiber laser delivered an average power of up to 110 W at a repetition rate of 100 MHz in a narrow bandwidth, with minimal nonlinear distortion. We developed a high-power, picosecond green source at 532 nm through single-pass frequency-doubling of our 1064-nm fiber laser in lithium triborate (LBO). Using a 15-mm-long LBO crystal, we have generated 30 W of average power in the second harmonic from 73 W of fundamental average power, for a conversion efficiency of 41%.

  5. Design of High Speed 128 bit Parallel Prefix Adders

    OpenAIRE

    T.KIRAN KUMAR; Srikanth, P

    2014-01-01

    In this paper, we propose 128-bit Kogge-Stone, Ladner-Fischer, and Spanning Tree parallel prefix adders and compare them with the Ripple Carry adder. In general, N-bit adders like Ripple Carry Adders (slow compared to other adders) and Carry Look-Ahead adders (area-consuming adders) were used in earlier days. But now most industries use parallel prefix adders because of their advantages compared to other adders. Parallel prefix adders are faster and area efficient. Parallel pref...
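    As a reference for the prefix computation such adders perform, the following Python model mimics a Kogge-Stone adder at the bit level: generate/propagate pairs are combined in log2(N) stages so every carry becomes available without rippling. This is a behavioral sketch for checking the logic, not a hardware description of the designs in the paper.

```python
def kogge_stone_add(a: int, b: int, width: int = 128) -> int:
    """Bit-level behavioral model of a Kogge-Stone parallel-prefix adder."""
    mask = (1 << width) - 1
    a, b = a & mask, b & mask
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]   # propagate bits
    g = [((a >> i) & (b >> i)) & 1 for i in range(width)]   # generate bits
    p_orig = p[:]                                           # kept for the sum bits
    d = 1
    while d < width:
        # combine (g, p) pairs that lie d positions apart; iterating downwards
        # ensures each update reads values from the previous stage
        for i in range(width - 1, d - 1, -1):
            g[i] = g[i] | (p[i] & g[i - d])
            p[i] = p[i] & p[i - d]
        d *= 2
    carries = [0] * (width + 1)
    for i in range(1, width + 1):
        carries[i] = g[i - 1]   # carry into bit i = group generate of bits [0, i-1]
    s = 0
    for i in range(width):
        s |= (p_orig[i] ^ carries[i]) << i
    return s                     # result modulo 2**width (final carry-out dropped)

# Sanity check against ordinary integer addition
assert kogge_stone_add(123456789, 987654321) == 123456789 + 987654321
```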

  6. High-Power and High-Efficiency 1.3- µm Superluminescent Diode With Flat-Top and Ultrawide Emission Bandwidth

    KAUST Repository

    Khan, Mohammed Zahed Mustafa

    2015-02-01

    We report on a flat-top and ultrawide emission bandwidth of 125 nm from an InGaAsP/InP multiple quantum-well (MQW) superluminescent diode with an antireflection-coated and tilted ridge-waveguide device configuration. A total output power in excess of 70 mW with an average power spectral density of 0.56 mW/nm and spectral ripple ≤ 1.2 ± 0.5 dB is measured from the device. Wall-plug efficiency and output power as high as 14% and 80 mW, respectively, are demonstrated from this batch of devices. We attribute the broad emission to the inherent inhomogeneity of the electron-heavy-hole (e-hh) and electron-light-hole (e-lh) recombination of the ground state and the first excited state of the MQWs and their simultaneous emission.

  7. Novel high-gain, improved-bandwidth, finned-ladder V-band Traveling-Wave Tube slow-wave circuit design

    Science.gov (United States)

    Kory, Carol L.; Wilson, Jeffrey D.

    1994-01-01

    The V-band frequency range of 59-64 GHz is a region of the millimeter-wave spectrum that has been designated for inter-satellite communications. As a first effort to develop a high-efficiency V-band Traveling-Wave Tube (TWT), variations on a ring-plane slow-wave circuit were computationally investigated to develop an alternative to the more conventional ferruled coupled-cavity circuit. The ring-plane circuit was chosen because of its high interaction impedance, large beam aperture, and excellent thermal dissipation properties. Despite these advantages, however, low bandwidth and high voltage requirements have, until now, prevented its acceptance outside the laboratory. In this paper, the three-dimensional electrodynamic simulation code MAFIA (solution of MAxwell's Equation by the Finite-Integration-Algorithm) is used to investigate methods of increasing the bandwidth and lowering the operating voltage of the ring-plane circuit. Calculations of frequency-phase dispersion, beam on-axis interaction impedance, attenuation and small-signal gain per wavelength were performed for various geometric variations and loading distributions of the ring-plane TWT slow-wave circuit. Based on the results of the variations, a circuit termed the finned-ladder TWT slow-wave circuit was designed and is compared here to the scaled prototype ring-plane and a conventional ferruled coupled-cavity TWT circuit over the V-band frequency range. The simulation results indicate that this circuit has a much higher gain, significantly wider bandwidth, and a much lower voltage requirement than the scaled ring-plane prototype circuit, while retaining its excellent thermal dissipation properties. The finned-ladder circuit has a much larger small-signal gain per wavelength than the ferruled coupled-cavity circuit, but with a moderate sacrifice in bandwidth.

  8. Minimization of the impact of a broad bandwidth high-gain nonlinear preamplifier to the amplified spontaneous emission pedestal of the Vulcan petawatt laser facility.

    Science.gov (United States)

    Musgrave, I O; Hernandez-Gomez, C; Canny, D; Collier, J; Heathcote, R

    2007-10-01

    To generate petawatt pulses using the Vulcan Nd:glass laser requires a broad-bandwidth high-gain preamplifier. The preamplifier used is an optical parametric amplifier that provides a total gain of 10^8 in three amplification stages. We report on a detailed investigation of the effect of the Vulcan optical parametric chirped pulse amplification (OPCPA) preamplifier on contrast caused by the amplified spontaneous emission (ASE) pedestal that extends up to 2 ns before the arrival of the main pulse. The contrast after compression is improved to 4×10^8 relative to the intensity of the main pulse using near-field apertures between the stages of the OPCPA preamplifier. Further reduction of the level of the ASE pedestal can be achieved, at the cost of a reduction in amplified bandwidth, by using solely phosphate glass amplification after initial preamplification rather than a mixed-glass amplification scheme.

  9. High-Bandwidth Tactical-Network Data Analysis in a High-Performance-Computing (HPC) Environment: Time Tagging the Data

    Science.gov (United States)

    2015-09-01

    Subject terms: tactical networks, data reduction, high-performance computing, data analysis, big data. (Only report-form metadata and figure-caption fragments were available for this record.)

  10. High-Bandwidth Tactical-Network Data Analysis in a High-Performance-Computing (HPC) Environment: Introduction

    Science.gov (United States)

    2015-09-01

    ability to transport voice and data messages, with high assurance and minimal delays, as the unit maneuvers to accomplish its mission. Tactical...critical to such analysis efforts, in addition to metrics drawn from application-level interactions, such as Voice over Internet Protocol (VoIP...Network Performance Statistics. These provide information on the state of IP routing tables and radio-level connections, which informs the overall

  11. Generation of a spectrum with high flatness and high bandwidth in a short length of telecom fiber using microchip laser

    Science.gov (United States)

    Hernandez-Garcia, J. C.; Estudillo-Ayala, J. M.; Pottiez, O.; Rojas-Laguna, R.; Mata-Chavez, R. I.; Gonzalez-Garcia, A.

    2013-04-01

    In this work, we studied experimentally the generation of a supercontinuum spectrum induced in a piece of standard single-mode fiber using pulses from a microchip laser. For different values of fiber length, we obtained spectra with high flatness in the visible and IR regions. The possibility of generating a spectrum with high flatness and a spectral width of more than ˜1100 nm (600 nm to over 1700 nm) in relatively short lengths of telecom fiber (˜57 m), using pump pulses with no more than a few kW of peak power at a non-zero-dispersion wavelength, is attributed to the peculiar properties of the pulses generated by the pump source. The physical processes leading to the formation of the supercontinuum spectrum were studied by monitoring the growth of the spectrum while increasing the input power. The coupling efficiency between the microchip laser and the telecom fiber helped us obtain a very wide spectrum. This work shows that the use of conventional fiber for supercontinuum generation can be viewed as a cheap and efficient option, in particular for applications like optical metrology, coherence tomography and low-noise sources for the characterization of devices.

  12. The parallel I/O architecture of the high performance storage system (HPSS). Revision 1

    Energy Technology Data Exchange (ETDEWEB)

    Watson, R.W. [Lawrence Livermore National Lab., CA (United States); Coyne, R.A. [IBM Government Systems, Houston, TX (United States)

    1995-04-01

    Datasets up to terabyte size and petabyte capacities have created a serious imbalance between I/O and storage system performance and system functionality. One promising approach is the use of parallel data transfer techniques for client access to storage, peripheral-to-peripheral transfers, and remote file transfers. This paper describes the parallel I/O architecture and mechanisms, Parallel Transport Protocol (PTP), parallel FTP, and parallel client Application Programming Interface (API) used by the High Performance Storage System (HPSS). Parallel storage integration issues with a local parallel file system are also discussed.

  13. Overview of Parallel Platforms for Common High Performance Computing

    Directory of Open Access Journals (Sweden)

    T. Fryza

    2012-04-01

    Full Text Available The paper deals with various parallel platforms used for high performance computing in the signal processing domain. More precisely, methods exploiting multicore central processing units, such as the message passing interface and OpenMP, are taken into account. The properties of the programming methods are experimentally demonstrated in the application of a fast Fourier transform and a discrete cosine transform and they are compared with the possibilities of MATLAB's built-in functions and Texas Instruments digital signal processors with very long instruction word architectures. New FFT and DCT implementations were proposed and tested. The implementation phase was compared with CPU-based computing methods and with the possibilities of the Texas Instruments digital signal processing library on C6747 floating-point DSPs. The optimal combination of computing methods in the signal processing domain and the implementation of new, fast routines are proposed as well.
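    The data-parallel pattern compared in the paper (one transform per block of samples, distributed over cores) can be illustrated compactly. The sketch below uses Python's multiprocessing purely as a stand-in for the MPI/OpenMP implementations discussed; the block length, worker count, and function names are illustrative assumptions.

```python
import numpy as np
from multiprocessing import Pool

def block_spectrum(block: np.ndarray) -> np.ndarray:
    """Magnitude spectrum of one signal block (the per-worker task)."""
    return np.abs(np.fft.rfft(block))

def parallel_spectra(signal: np.ndarray, block_len: int, workers: int = 4) -> np.ndarray:
    """Split a long signal into blocks and transform the blocks in parallel."""
    n_blocks = len(signal) // block_len
    blocks = signal[:n_blocks * block_len].reshape(n_blocks, block_len)
    with Pool(processes=workers) as pool:
        return np.vstack(pool.map(block_spectrum, blocks))

if __name__ == "__main__":
    x = np.random.randn(1 << 20)
    spectra = parallel_spectra(x, block_len=4096)
    print(spectra.shape)  # (256, 2049)
```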

  14. Laser damage comparisons of broad-bandwidth, high-reflection optical coatings containing TiO2, Nb2O5, or Ta2O5 high-index layers

    Science.gov (United States)

    Field, Ella S.; Bellum, John C.; Kletecka, Damon E.

    2017-01-01

    Broad bandwidth coatings allow angle of incidence flexibility and accommodate spectral shifts due to aging and water absorption. Higher refractive index materials in optical coatings, such as TiO2, Nb2O5, and Ta2O5, can be used to achieve broader bandwidths compared to coatings that contain HfO2 high index layers. We have identified the deposition settings that lead to the highest index, lowest absorption layers of TiO2, Nb2O5, and Ta2O5, via e-beam evaporation using ion-assisted deposition. We paired these high index materials with SiO2 as the low index material to create broad bandwidth high reflection coatings centered at 1054 nm for 45 deg angle of incidence and P polarization. High reflection bandwidths as large as 231 nm were realized. Laser damage tests of these coatings using the ISO 11254 and NIF-MEL protocols are presented, which revealed that the Ta2O5/SiO2 coating exhibits the highest resistance to laser damage, at the expense of lower bandwidth compared to the TiO2/SiO2 and Nb2O5/SiO2 coatings.

  15. Chemical Industry Bandwidth Study

    Energy Technology Data Exchange (ETDEWEB)

    none,

    2006-12-01

    The Chemical Bandwidth Study provides a snapshot of potentially recoverable energy losses during chemical manufacturing. The advantage of this study is the use of "exergy" analysis as a tool for pinpointing inefficiencies.

  16. Bandwidth efficient coding

    CERN Document Server

    Anderson, John B

    2017-01-01

    Bandwidth Efficient Coding addresses the major challenge in communication engineering today: how to communicate more bits of information in the same radio spectrum. Energy and bandwidth are needed to transmit bits, and bandwidth affects capacity the most. Methods have been developed that are ten times as energy efficient at a given bandwidth consumption as simple methods. These employ signals with very complex patterns and are called "coding" solutions. The book begins with classical theory before introducing new techniques that combine older methods of error correction coding and radio transmission in order to create narrowband methods that are as efficient in both spectrum and energy as nature allows. Other topics covered include modulation techniques such as CPM, coded QAM and pulse design.

  17. Industrial Glass Bandwidth Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Rue, David M. [Gas Technology Inst., Des Plaines, IL (United States); Servaites, James [Gas Technology Inst., Des Plaines, IL (United States); Wolf, Warren [Gas Technology Inst., Des Plaines, IL (United States)

    2007-08-01

    This is a study on energy use and potential savings, or "bandwidth" study, for several glassmaking processes. Intended to provide a realistic estimate of the potential amount of energy that can be saved in an industrial process, the "bandwidth" refers to the difference between the amount of energy that would be consumed in a process using commercially available technology versus the minimum amount of energy needed to achieve those same results.

  18. Glass Industry Bandwidth Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Rue, David M. [Gas Technology Inst., Des Plaines, IL (United States)

    2006-07-01

    This is a study on energy use and potential savings, or "bandwidth" study, for several glassmaking processes. Intended to provide a realistic estimate of the potential amount of energy that can be saved in an industrial process, the "bandwidth" refers to the difference between the amount of energy that would be consumed in a process using commercially available technology versus the minimum amount of energy needed to achieve those same results.

  1. Highly parallel translation of DNA sequences into small molecules.

    Directory of Open Access Journals (Sweden)

    Rebecca M Weisinger

    Full Text Available A large body of in vitro evolution work establishes the utility of biopolymer libraries comprising 10^10 to 10^15 distinct molecules for the discovery of nanomolar-affinity ligands to proteins. Small-molecule libraries of comparable complexity will likely provide nanomolar-affinity small-molecule ligands. Unlike biopolymers, small molecules can offer the advantages of cell permeability, low immunogenicity, metabolic stability, rapid diffusion and inexpensive mass production. It is thought that such desirable in vivo behavior is correlated with the physical properties of small molecules, specifically a limited number of hydrogen bond donors and acceptors, a defined range of hydrophobicity, and most importantly, molecular weights less than 500 Daltons. Creating a collection of 10^10 to 10^15 small molecules that meet these criteria requires the use of hundreds to thousands of diversity elements per step in a combinatorial synthesis of three to five steps. With this goal in mind, we have reported a set of mesofluidic devices that enable DNA-programmed combinatorial chemistry in a highly parallel 384-well plate format. Here, we demonstrate that these devices can translate DNA genes encoding 384 diversity elements per coding position into corresponding small-molecule gene products. This robust and efficient procedure yields small molecule-DNA conjugates suitable for in vitro evolution experiments.

  2. Highly Parallelized Pattern Matching Execution for the ATLAS Experiment

    CERN Document Server

    Citraro, Saverio; The ATLAS collaboration

    2015-01-01

    The trigger system of the ATLAS experiment at the LHC will extend its rejection capabilities during operations in 2015-2018 by introducing the Fast TracKer system (FTK). FTK is a hardware-based system capable of finding charged particle tracks by analyzing hits in silicon detectors at a rate of 10^5 events per second. The core of track reconstruction is performed in two pipelined steps. In the first step, candidate tracks are found by matching combinations of low-resolution hits to predefined patterns; they are then used in the second step to seed a more precise track fitting algorithm. The key FTK component is an Associative Memory (AM) system that is used to perform pattern matching with a high degree of parallelism. The AM system implementation, the AM Serial Link Processor, is based on an extremely powerful network of 2 Gb/s serial links to sustain a huge traffic of data. We report on the design of the Serial Link Processor consisting of two types of boards, the Little Associative Memory Board (LAMB), a mezzan...
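    The pattern-matching step can be pictured with a toy software model: each stored pattern lists one coarse-resolution bin per silicon layer, and a pattern fires when the event contains a hit in that bin on every layer. The sketch below only illustrates that idea; the real AM chips also tolerate a missing layer and evaluate all patterns in parallel in hardware.

```python
def matching_patterns(patterns, event_hits):
    """Toy model of coarse pattern matching.

    patterns   -- list of tuples, one coarse bin ("superstrip") per layer
    event_hits -- list of sets, the coarse bins hit in each layer of the event
    Returns the indices of patterns whose bins are all present in the event.
    """
    fired = []
    for idx, pattern in enumerate(patterns):
        if all(bin_id in event_hits[layer] for layer, bin_id in enumerate(pattern)):
            fired.append(idx)
    return fired

# Hypothetical 4-layer example: only the first pattern matches the event.
patterns = [(3, 7, 2, 9), (3, 8, 2, 9), (1, 1, 1, 1)]
event = [{3, 5}, {7}, {2, 6}, {9}]
print(matching_patterns(patterns, event))  # [0]
```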

  3. Highly Parallelized Pattern Matching Execution for the ATLAS Experiment

    CERN Document Server

    Citraro, Saverio; The ATLAS collaboration

    2015-01-01

    Abstract– The Associative Memory (AM) system of the Fast Tracker (FTK) processor has been designed to perform pattern matching using as input the data from the silicon tracker in the ATLAS experiment. The AM is the primary component of the FTK system and is designed using ASIC technology (the AM chip) to execute pattern matching with a high degree of parallelism. The FTK system finds track candidates at low resolution that are seeds for a full resolution track fitting. The AM system implementation is named “Serial Link Processor” and is based on an extremely powerful network of 2 Gb/s serial links to sustain a huge traffic of data. This paper reports on the design of the Serial Link Processor consisting of two types of boards, the Little Associative Memory Board (LAMB), a mezzanine where the AM chips are mounted, and the Associative Memory Board (AMB), a 9U VME motherboard which hosts four LAMB daughterboards. We also report on the performance of the prototypes (both hardware and firmware) produced and ...

  4. A novel highly parallel algorithm for linearly unmixing hyperspectral images

    Science.gov (United States)

    Guerra, Raúl; López, Sebastián.; Callico, Gustavo M.; López, Jose F.; Sarmiento, Roberto

    2014-10-01

    Endmember extraction and abundance calculation represent critical steps within the process of linearly unmixing a given hyperspectral image, for two main reasons. The first is the need to compute a set of accurate endmembers in order to further obtain confident abundance maps. The second is the huge number of operations involved in these time-consuming processes. This work proposes an algorithm that estimates the endmembers of a hyperspectral image under analysis and its abundances at the same time. The main advantages of this algorithm are its high degree of parallelization and the mathematical simplicity of the operations implemented. The algorithm estimates the endmembers as virtual pixels. In particular, the proposed algorithm applies the gradient descent method to iteratively refine the endmembers and the abundances, reducing the mean square error in accordance with the linear unmixing model. Some mathematical restrictions must be added so that the method converges to a unique and realistic solution. Given the nature of the algorithm, these restrictions can be easily implemented. The results obtained with synthetic images demonstrate the good behavior of the proposed algorithm. Moreover, the results obtained with the well-known Cuprite dataset also corroborate the benefits of our proposal.
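    A generic version of the approach described, i.e. gradient descent on the squared reconstruction error of the linear mixing model with simple non-negativity and sum-to-one constraints, can be sketched as follows. The step sizes, initialization, and constraint handling here are assumptions for illustration, not the authors' exact update rules.

```python
import numpy as np

def unmix_gradient_descent(X, p, iters=2000, lr=0.1, seed=0):
    """Jointly refine endmembers E (bands x p) and abundances A (p x pixels)
    for the linear mixing model X ~= E @ A by plain gradient descent on the
    mean squared reconstruction error."""
    rng = np.random.default_rng(seed)
    bands, pixels = X.shape
    E = rng.random((bands, p))
    A = rng.random((p, pixels))
    A /= A.sum(axis=0, keepdims=True)
    for _ in range(iters):
        R = X - E @ A                          # residual of the mixing model
        E += lr * (R @ A.T) / pixels           # gradient step on the endmembers
        A += lr * (E.T @ R) / bands            # gradient step on the abundances
        E = np.clip(E, 0.0, None)              # non-negative reflectances
        A = np.clip(A, 1e-9, None)             # non-negative abundances ...
        A /= A.sum(axis=0, keepdims=True)      # ... that sum to one per pixel
    return E, A

# Tiny synthetic check: 3 endmembers, 50 bands, 200 pixels
rng = np.random.default_rng(1)
E_true = rng.random((50, 3))
A_true = rng.dirichlet(np.ones(3), size=200).T
X = E_true @ A_true
E_est, A_est = unmix_gradient_descent(X, p=3)
print(np.linalg.norm(X - E_est @ A_est) / np.linalg.norm(X))  # relative error
```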

  5. Bandwidth challenge teams at SC2003 conference

    CERN Multimedia

    2003-01-01

    Results from the fourth annual High-Performance Bandwidth Challenge, held in conjunction with SC2003, the international conference on high-performance computing and networking which occurred last week in Phoenix, AZ (1 page).

  6. Bandwidth Reconfigurable Metamaterial Arrays

    Directory of Open Access Journals (Sweden)

    Nathanael J. Smith

    2014-01-01

    Full Text Available Metamaterial structures provide innovative ways to manipulate electromagnetic wave responses to realize new applications. This paper presents a conformal wideband metamaterial array that achieves as much as 10 : 1 continuous bandwidth. This was done by using interelement coupling to concurrently achieve significant wave slow-down and cancel the inductance stemming from the ground plane. The corresponding equivalent circuit of the resulting array is the same as that of classic metamaterial structures. In this paper, we present a wideband Marchand-type balun with validation measurements demonstrating the metamaterial (MTM) array’s bandwidth from 280 MHz to 2800 MHz. Bandwidth reconfiguration of this class of array is then demonstrated, achieving a variety of band-pass or band-rejection responses within its original bandwidth. In contrast with previous bandwidth and frequency response reconfigurations, our approach does not change the aperture’s or ground plane’s geometry, nor does it introduce external filtering structures. Instead, the new responses are realized by making simple circuit changes to the balanced feed integrated with the wideband MTM array. A variety of circuit changes can be employed using MEMS switches or variable lumped loads within the feed, and 5 example band-pass and band-rejection responses are presented. These demonstrate the potential of the MTM array’s reconfiguration to address a variety of responses.

  7. High Performance Input/Output for Parallel Computer Systems

    Science.gov (United States)

    Ligon, W. B.

    1996-01-01

    The goal of our project is to study the I/O characteristics of parallel applications used in Earth Science data processing systems such as Regional Data Centers (RDCs) or EOSDIS. Our approach is to study the runtime behavior of typical programs and the effect of key parameters of the I/O subsystem, both under simulation and with direct experimentation on parallel systems. Our three-year activity has focused on two items: developing a test bed that facilitates experimentation with parallel I/O, and studying representative programs from the Earth science data processing application domain. The Parallel Virtual File System (PVFS) has been developed for use on a number of platforms including the Tiger Parallel Architecture Workbench (TPAW) simulator, the Intel Paragon, a cluster of DEC Alpha workstations, and the Beowulf system (at CESDIS). PVFS provides considerable flexibility in configuring I/O in a UNIX-like environment. Access to key performance parameters facilitates experimentation. We have studied several key applications from levels 1, 2 and 3 of the typical RDC processing scenario, including instrument calibration and navigation, image classification, and numerical modeling codes. We have also considered large-scale scientific database codes used to organize image data.

  8. Memory Benchmarks for SMP-Based High Performance Parallel Computers

    Energy Technology Data Exchange (ETDEWEB)

    Yoo, A B; de Supinski, B; Mueller, F; Mckee, S A

    2001-11-20

    As the speed gap between CPU and main memory continues to grow, memory accesses increasingly dominate the performance of many applications. The problem is particularly acute for symmetric multiprocessor (SMP) systems, where the shared memory may be accessed concurrently by a group of threads running on separate CPUs. Unfortunately, several key issues governing memory system performance in current systems are not well understood. Complex interactions between the levels of the memory hierarchy, buses or switches, DRAM back-ends, system software, and application access patterns can make it difficult to pinpoint bottlenecks and determine appropriate optimizations, and the situation is even more complex for SMP systems. To partially address this problem, we formulated a set of multi-threaded microbenchmarks for characterizing and measuring the performance of the underlying memory system in SMP-based high-performance computers. We report our use of these microbenchmarks on two important SMP-based machines. This paper has four primary contributions. First, we introduce a microbenchmark suite to systematically assess and compare the performance of different levels in SMP memory hierarchies. Second, we present a new tool based on hardware performance monitors to determine a wide array of memory system characteristics, such as cache sizes, quickly and easily; by using this tool, memory performance studies can be targeted to the full spectrum of performance regimes with many fewer data points than is otherwise required. Third, we present experimental results indicating that the performance of applications with large memory footprints remains largely constrained by memory. Fourth, we demonstrate that thread-level parallelism further degrades memory performance, even for the latest SMPs with hardware prefetching and switch-based memory interconnects.

  9. High-Throughput Atomic Force Microscopes Operating in Parallel

    CERN Document Server

    Sadeghian, H; Dekker, B; Winters, J; Bijnagte, T; Rijnbeek, R

    2016-01-01

    Atomic force microscopy (AFM) is an essential nanoinstrument technique for several applications such as cell biology and nanoelectronics metrology and inspection. The need for statistically significant sample sizes means that data collection can be an extremely lengthy process in AFM. The use of a single AFM instrument is known for its very low speed and not being suitable for scanning large areas, resulting in very-low-throughput measurement. We address this challenge by parallelizing AFM instruments. The parallelization is achieved by miniaturizing the AFM instrument and operating many of them simultaneously. This nanoinstrument has the advantages that each miniaturized AFM can be operated independently and that the advances in the field of AFM, both in terms of speed and imaging modalities, can be implemented more easily. Moreover, a parallel AFM instrument also allows one to measure several physical parameters simultaneously; while one instrument measures nano-scale topography, another instrument can meas...

  10. High-throughput atomic force microscopes operating in parallel

    Science.gov (United States)

    Sadeghian, Hamed; Herfst, Rodolf; Dekker, Bert; Winters, Jasper; Bijnagte, Tom; Rijnbeek, Ramon

    2017-03-01

    Atomic force microscopy (AFM) is an essential nanoinstrument technique for several applications such as cell biology and nanoelectronics metrology and inspection. The need for statistically significant sample sizes means that data collection can be an extremely lengthy process in AFM. The use of a single AFM instrument is known for its very low speed and not being suitable for scanning large areas, resulting in a very-low-throughput measurement. We address this challenge by parallelizing AFM instruments. The parallelization is achieved by miniaturizing the AFM instrument and operating many of them simultaneously. This instrument has the advantages that each miniaturized AFM can be operated independently and that the advances in the field of AFM, both in terms of speed and imaging modalities, can be implemented more easily. Moreover, a parallel AFM instrument also allows one to measure several physical parameters simultaneously; while one instrument measures nano-scale topography, another instrument can measure mechanical, electrical, or thermal properties, making it a lab-on-an-instrument. In this paper, a proof of principle of such a parallel AFM instrument has been demonstrated by analyzing the topography of large samples such as semiconductor wafers. This nanoinstrument provides new research opportunities in the nanometrology of wafers and nanolithography masks by enabling real die-to-die and wafer-level measurements and in cell biology by measuring the nano-scale properties of a large number of cells.

  11. Design of high-performance parallelized gene predictors in MATLAB

    Directory of Open Access Journals (Sweden)

    Rivard Sylvain

    2012-04-01

    Full Text Available Background: This paper proposes a method of implementing parallel gene prediction algorithms in MATLAB. The proposed designs are based on either Goertzel’s algorithm or on FFTs and have been implemented using varying amounts of parallelism on a central processing unit (CPU) and on a graphics processing unit (GPU). Findings: Results show that an implementation using a straightforward approach can require over 4.5 h to process 15 million base pairs (bps) whereas a properly designed one could perform the same task in less than five minutes. In the best case, a GPU implementation can yield these results in 57 s. Conclusions: The present work shows how parallelism can be used in MATLAB for gene prediction in very large DNA sequences to produce results that are over 270 times faster than a conventional approach. This is significant as MATLAB is typically overlooked due to its apparent slow processing time even though it offers a convenient environment for bioinformatics. From a practical standpoint, this work proposes two strategies for accelerating genome data processing which rely on different parallelization mechanisms. Using a CPU, the work shows that direct access to the MEX function increases execution speed and that the PARFOR construct should be used in order to take full advantage of the parallelizable Goertzel implementation. When the target is a GPU, the work shows that data needs to be segmented into manageable sizes within the GFOR construct before processing in order to minimize execution time.

  12. Design of high-performance parallelized gene predictors in MATLAB.

    Science.gov (United States)

    Rivard, Sylvain Robert; Mailloux, Jean-Gabriel; Beguenane, Rachid; Bui, Hung Tien

    2012-04-10

    This paper proposes a method of implementing parallel gene prediction algorithms in MATLAB. The proposed designs are based on either Goertzel's algorithm or on FFTs and have been implemented using varying amounts of parallelism on a central processing unit (CPU) and on a graphics processing unit (GPU). Results show that an implementation using a straightforward approach can require over 4.5 h to process 15 million base pairs (bps) whereas a properly designed one could perform the same task in less than five minutes. In the best case, a GPU implementation can yield these results in 57 s. The present work shows how parallelism can be used in MATLAB for gene prediction in very large DNA sequences to produce results that are over 270 times faster than a conventional approach. This is significant as MATLAB is typically overlooked due to its apparent slow processing time even though it offers a convenient environment for bioinformatics. From a practical standpoint, this work proposes two strategies for accelerating genome data processing which rely on different parallelization mechanisms. Using a CPU, the work shows that direct access to the MEX function increases execution speed and that the PARFOR construct should be used in order to take full advantage of the parallelizable Goertzel implementation. When the target is a GPU, the work shows that data needs to be segmented into manageable sizes within the GFOR construct before processing in order to minimize execution time.
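    For reference, Goertzel's algorithm evaluates a single DFT bin with a second-order recurrence, which is why it suits the period-3 analysis commonly used in gene prediction. The sketch below is a generic Python implementation applied to a binary indicator sequence; it is not the paper's MATLAB/MEX code, and the sample sequence is purely illustrative.

```python
import math

def goertzel_power(samples, k):
    """Squared magnitude of the k-th DFT bin of `samples`, computed with
    Goertzel's second-order recurrence instead of a full FFT."""
    n = len(samples)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    # |X(k)|^2 from the final two recurrence states
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

# Period-3 content (k = N/3) of one binary indicator sequence, the quantity
# such gene predictors evaluate over a sliding window of the DNA sequence.
seq = "ATGGCGATGACCATGTAA"
indicator = [1.0 if base == "A" else 0.0 for base in seq]
print(goertzel_power(indicator, k=len(indicator) // 3))
```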

  13. Bandwidth in bolometric interferometry

    Science.gov (United States)

    Charlassier, R.; Bunn, E. F.; Hamilton, J.-Ch.; Kaplan, J.; Malu, S.

    2010-05-01

    Context. Bolometric interferometry is a promising new technology with potential applications to the detection of B-mode polarization fluctuations of the cosmic microwave background (CMB). A bolometric interferometer will have to take advantage of the wide spectral detection band of its bolometers to be competitive with imaging experiments. A crucial concern is that interferometers are assumed to be significantly affected by a spoiling effect known as bandwidth smearing. Aims: We investigate how the bandwidth modifies the work principle of a bolometric interferometer and affects its sensitivity to the CMB angular power spectra. Methods: We obtain analytical expressions for the broadband visibilities measured by broadband heterodyne and bolometric interferometers. We investigate how the visibilities must be reconstructed in a broadband bolometric interferometer and show that this critically depends on hardware properties of the modulation phase shifters. If the phase shifters produce shifts that are constant with respect to frequency, the instrument works like its monochromatic version (the modulation matrix is not modified), while if they vary (linearly or otherwise) with respect to frequency, one has to perform a special reconstruction scheme, which allows the visibilities to be reconstructed in frequency subbands. Using an angular power spectrum estimator that accounts for the bandwidth, we finally calculate the sensitivity of a broadband bolometric interferometer. A numerical simulation is performed that confirms the analytical results. Results: We conclude that (i) broadband bolometric interferometers allow broadband visibilities to be reconstructed regardless of the type of phase shifters used and (ii) for dedicated B-mode bolometric interferometers, the sensitivity loss caused by bandwidth smearing is quite acceptable, even for wideband instruments (a factor of 2 loss for a typical 20% bandwidth experiment).

  14. A Highly Parallelized MIMO Detector for Vector-Based Reconfigurable Architectures

    OpenAIRE

    Zhang, Chenxin; Liu, Liang; Wang, Yian; Zhu, Meifang; Edfors, Ove; Öwall, Viktor

    2013-01-01

    This paper presents a highly parallelized MIMO signal detection algorithm targeting vector-based reconfigurable architectures. The detector achieves high data-level parallelism and near-ML performance by adopting a vector-architecture-friendly technique - parallel node perturbation. To further reduce the computational complexity, imbalanced node and successive partial node expansion schemes in conjunction with sorted QR decomposition are applied. The effectiveness of the proposed algorithm is...

  15. LOADS INFLUENCE ANALYSIS ON NOVEL HIGH PRECISION FLEXURE PARALLEL POSITIONER

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    A large-workspace flexure parallel positioner system is developed, which can attain sub-micron accuracy over a cubic-centimeter motion range by utilizing novel wide-range flexure hinges instead of conventional mechanism joints. Flexure hinges eliminate backlash and friction, but on the other hand their deformation caused by initial loads greatly influences the positioning accuracy, so analysis of the influence of loads on this flexure parallel positioner is necessary. The stiffness model of the whole mechanism is presented via a stiffness assembly method based on the stiffness model of the individual flexure hinge. The analysis results are validated by finite element analysis (FEA) simulation and experimental tests, which provide essential data for the practical application of this positioner system.

  16. Parallel Libraries to support High-Level Programming

    DEFF Research Database (Denmark)

    Larsen, Morten Nørgaard

    so is not a simple task and for many non-computer scientists, like chemists and physicists writing programs for simulating their experiments, the task can easily become overwhelming. During the last decades, a lot of research effort has been put into how to create tools that will simplify writing......The development of computer architectures during the last ten years has forced programmers to move towards writing parallel programs instead of sequential ones. The homogeneous multi-core architectures from the major CPU producers like Intel and AMD have led this trend, but the introduction......, the general increase in the usage of graphics cards for general-purpose programming (GPGPU) has meant that programmers today must be able to write parallel programs that can utilize not only a small number of computational cores but perhaps hundreds or even thousands. However, most programmers will agree that doing...

  17. A Highly Efficient Parallel Algorithm for Computing the Fiedler Vector

    CERN Document Server

    Manguoglu, Murat

    2010-01-01

    The eigenvector corresponding to the second smallest eigenvalue of the Laplacian of a graph, known as the Fiedler vector, has a number of applications in areas that include matrix reordering, graph partitioning, protein analysis, data mining, machine learning, and web search. The computation of the Fiedler vector has been regarded as an expensive process as it involves solving a large eigenvalue problem. We present a novel and efficient parallel algorithm for computing the Fiedler vector of large graphs based on the Trace Minimization algorithm (Sameh et al.). We compare the parallel performance of our method with a multilevel scheme, designed specifically for computing the Fiedler vector, which is implemented in routine MC73_Fiedler of the Harwell Subroutine Library (HSL). In addition, we compare the quality of the Fiedler vector for the application of weighted matrix reordering and provide a metric for measuring the quality of reordering.
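    To make the object of the computation concrete: the Fiedler vector is the eigenvector of the graph Laplacian L = D - A associated with the second-smallest eigenvalue. The dense toy sketch below only illustrates this definition; the paper's contribution is a parallel Trace-Minimization solver for graphs far too large for such a direct approach.

```python
import numpy as np

def fiedler_vector(adjacency):
    """Eigenvector of the graph Laplacian L = D - A for the second-smallest
    eigenvalue (the Fiedler vector), computed densely for a toy graph."""
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    vals, vecs = np.linalg.eigh(L)   # eigenvalues returned in ascending order
    return vals[1], vecs[:, 1]

# Two triangles joined by a single edge: the sign pattern of the Fiedler vector
# separates the two clusters, which is why it is used for graph partitioning.
A = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]])
lam2, v = fiedler_vector(A)
print(lam2, np.sign(v))
```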

  18. A FAMILY OF HIGH-ORDER PARALLEL ROOTFINDERS FOR POLYNOMIALS

    Institute of Scientific and Technical Information of China (English)

    Shi-ming Zheng

    2000-01-01

    In this paper we present a family of parallel iterations of order m + 2 with parameter m = 0, 1, … for simultaneously finding all zeros of a polynomial without evaluating derivatives, which includes the well-known Weierstrass-Durand-Dochev-Kerner and Borsch-Supan-Nourein iterations as special cases for m = 0 and m = 1, respectively. Some numerical examples are given.
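
    As a concrete illustration of the m = 0 member of such a family, the sketch below implements the classical Weierstrass-Durand-Kerner iteration in Python. Every root estimate is corrected simultaneously from the estimates of the previous sweep and no derivatives are evaluated, which is what makes the scheme naturally parallel; the tolerance, starting points and example polynomial are arbitrary choices, not taken from the paper.

    import numpy as np

    def durand_kerner(coeffs, tol=1e-12, max_iter=200):
        """Find all roots of the polynomial with coefficients `coeffs` (highest degree first)."""
        coeffs = np.asarray(coeffs, dtype=complex)
        coeffs = coeffs / coeffs[0]             # make the polynomial monic
        n = len(coeffs) - 1
        z0 = 0.4 + 0.9j                         # classical choice of distinct starting points
        roots = z0 ** np.arange(n)
        for _ in range(max_iter):
            p = np.polyval(coeffs, roots)       # p(z_i) for every current estimate
            diffs = roots[:, None] - roots[None, :]
            np.fill_diagonal(diffs, 1.0)
            # Each correction depends only on the previous sweep, so all n of
            # them can be evaluated in parallel (here: vectorised by NumPy).
            corrections = p / diffs.prod(axis=1)
            roots = roots - corrections
            if np.max(np.abs(corrections)) < tol:
                break
        return roots

    # Example: z^3 - 6z^2 + 11z - 6 has roots 1, 2 and 3.
    print(np.sort_complex(durand_kerner([1, -6, 11, -6])))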

  19. The parallel I/O architecture of the High Performance Storage System (HPSS)

    Energy Technology Data Exchange (ETDEWEB)

    Watson, R.W. [Lawrence Livermore National Lab., CA (United States); Coyne, R.A. [IBM Federal Systems Co., Houston, TX (United States)

    1995-02-01

    Rapid improvements in computational science, processing capability, main memory sizes, data collection devices, multimedia capabilities and integration of enterprise data are producing very large datasets (10s-100s of gigabytes to terabytes). This rapid growth of data has resulted in a serious imbalance in I/O and storage system performance and functionality. One promising approach to restoring balanced I/O and storage system performance is the use of parallel data transfer techniques for client access to storage, device-to-device transfers, and remote file transfers. This paper describes the parallel I/O architecture and mechanisms, Parallel Transport Protocol, parallel FTP, and parallel client Application Programming Interface (API) used by the High Performance Storage System (HPSS). Parallel storage integration issues with a local parallel file system are also discussed.
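
    To make the client-side parallel transfer idea concrete, here is a minimal hypothetical sketch of striped reads using a thread pool. It is not the HPSS Parallel Transport Protocol or client API; the file path and stripe size are invented for illustration.

    import os
    from concurrent.futures import ThreadPoolExecutor

    def read_stripe(path, offset, length):
        """Read one stripe of the file starting at `offset`."""
        with open(path, "rb") as f:
            f.seek(offset)
            return offset, f.read(length)

    def parallel_read(path, stripe_size=4 * 1024 * 1024, workers=8):
        """Reassemble a file from stripes fetched concurrently by several workers."""
        size = os.path.getsize(path)
        buf = bytearray(size)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [pool.submit(read_stripe, path, off, stripe_size)
                       for off in range(0, size, stripe_size)]
            for fut in futures:
                off, data = fut.result()
                buf[off:off + len(data)] = data   # place each stripe at its offset
        return bytes(buf)

    # data = parallel_read("/tmp/large_dataset.bin")   # hypothetical path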

  20. Efficient Bandwidth Management for Ethernet Passive Optical Networks

    KAUST Repository

    Elrasad, Amr Elsayed M.

    2016-05-15

    Polling (EGSIP), to compensate for the unutilized bandwidth due to frame delineation. Our solution achieves a delay reduction ratio of up to 90% at high load. We also develop a Congestion Aware Limited Time (CALT) DBA scheme to detect and resolve temporary congestion in EPONs. CALT smartly adapts the optical networking unit (ONU) maximum transmission window according to the detected congestion level. Numerical results show that CALT is more robust at high load compared to other related published schemes. Regarding LR-EPONs, the main concern is mitigating the large round-trip delay. We address two problems, namely bandwidth over-granting in Multi-Thread Polling (MTP) and on-the-fly void filling. We combine, with some modifications, EGSIP and DES to resolve bandwidth over-granting in MTP. We also adaptively tune the MTP active running threads according to the offered load. Regarding on-the-fly void filling, our approach, Parallel Void Thread (PVT), achieves a large delay reduction for delay-sensitive traffic. PVT is designed as an add-on function to DBA and can be combined with almost all DBA schemes proposed before. The powerful feature of our proposed solutions is integrability. We integrate our solutions together to form a multi-feature, robust, fairly simple, and well-performing DBA scheme over LR-TWDM-EPONs. Our final contribution concerns energy saving under target delay constraints. We tackle the problem of downstream-based sleep time sizing and scheduling under required delay constraints. Simulation results show that our approach adheres to the delay constraints and at the same time achieves an almost ideal energy saving ratio.

  1. Bandwidth in bolometric interferometry

    CERN Document Server

    Charlassier, R; Hamilton, J -Ch; Kaplan, J; Malu, S

    2009-01-01

    Bolometric interferometry is a technology currently under development that will first be dedicated to the detection of B-mode polarization fluctuations in the Cosmic Microwave Background. A bolometric interferometer will have to take advantage of the wide spectral detection band of its bolometers in order to be competitive with imaging experiments. A crucial concern is that interferometers are expected to be significantly affected by a spoiling effect known as bandwidth smearing. In this paper, we investigate how the bandwidth modifies the working principle of a bolometric interferometer and how it affects its sensitivity to the CMB angular power spectra. We obtain analytical expressions for the broadband visibilities measured by broadband heterodyne and bolometric interferometers. We investigate how the visibilities must be reconstructed in a broadband bolometric interferometer and show that this critically depends on hardware properties of the modulation phase shifters. Using an angular power spectrum estimator ...

  2. Optical conductivity measurements of GaTa4Se8 under high pressure: evidence of a bandwidth-controlled insulator-to-metal Mott transition.

    Science.gov (United States)

    Ta Phuoc, V; Vaju, C; Corraze, B; Sopracase, R; Perucchi, A; Marini, C; Postorino, P; Chligui, M; Lupi, S; Janod, E; Cario, L

    2013-01-18

    The optical properties of a GaTa(4)Se(8) single crystal are investigated under high pressure. At ambient pressure, the optical conductivity exhibits a charge gap of ≈0.12 eV and a broad midinfrared band at ≈0.55 eV. As pressure is increased, the low energy spectral weight is strongly enhanced and the optical gap is rapidly filled, pointing to an insulator to metal transition around 6 GPa. The overall evolution of the optical conductivity demonstrates that GaTa(4)Se(8) is a Mott insulator which undergoes a bandwidth-controlled Mott metal-insulator transition under pressure, in remarkably good agreement with theory. With the use of our optical data and ab initio band structure calculations, our results were successfully compared to the (U/D, T/D) phase diagram predicted by dynamical mean field theory for strongly correlated systems.

  3. Bandwidth Trading as Incentive

    Science.gov (United States)

    Eger, Kolja; Killat, Ulrich

    In P2P networks with multi-source download the file of interest is fragmented into pieces and peers exchange pieces with each other although they did not finish the download of the complete file. Peers can adopt different strategies to trade upload for download bandwidth. These trading schemes should give peers an incentive to contribute bandwidth to the P2P network. This chapter studies different trading schemes analytically and by simulations. A mathematical framework for bandwidth trading is introduced and two distributed algorithms, which are denoted as Resource Pricing and Reciprocal Rate Control, are derived. The algorithms are compared to the tit-for-tat principle in BitTorrent. Nash Equilibria and results from simulations of static and dynamic networks are presented. Additionally, we discuss how trading schemes can be combined with a piece selection algorithm to increase the availability of a full copy of the file. The chapter closes with an extension of the mathematical model which takes also the underlying IP network into account. This results in a TCP variant optimised for P2P content distribution.
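
    As a toy illustration of the reciprocity idea, and not the Resource Pricing or Reciprocal Rate Control algorithms derived in the chapter, the sketch below splits a peer's upload capacity among its neighbours in proportion to the download rate recently received from each of them; all peer names and rates are hypothetical.

    def allocate_upload(upload_capacity, download_from):
        """Split upload capacity in proportion to the download rate received from each peer."""
        total = sum(download_from.values())
        if total == 0:
            # No history yet: share equally (optimistic, unchoke-like behaviour).
            share = upload_capacity / len(download_from)
            return {peer: share for peer in download_from}
        return {peer: upload_capacity * rate / total
                for peer, rate in download_from.items()}

    # Peer A received 200, 50 and 0 kB/s from peers B, C and D respectively.
    print(allocate_upload(100.0, {"B": 200.0, "C": 50.0, "D": 0.0}))
    # -> {'B': 80.0, 'C': 20.0, 'D': 0.0}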

  4. Introduction to massively-parallel computing in high-energy physics

    CERN Document Server

    Smith, Mark

    1993-01-01

    Ever since computers were first used for scientific and numerical work, there has existed an "arms race" between the technical development of faster computing hardware and the desire of scientists to solve larger problems in shorter time-scales. However, the vast leaps in processor performance achieved through advances in semiconductor science have reached a hiatus as the technology comes up against the physical limits of the speed of light and quantum effects. This has led all high performance computer manufacturers to turn towards a parallel architecture for their new machines. In these lectures we will introduce the history and concepts behind parallel computing, and review the various parallel architectures and software environments currently available. We will then introduce programming methodologies that allow efficient exploitation of parallel machines, and present case studies of the parallelization of typical High Energy Physics codes for the two main classes of parallel computing architecture (S...

  5. Two-photon-excited fluorescence (TPEF) and fluorescence lifetime imaging (FLIM) with sub-nanosecond pulses and a high analog bandwidth signal detection

    Science.gov (United States)

    Eibl, Matthias; Karpf, Sebastian; Hakert, Hubertus; Weng, Daniel; Huber, Robert

    2017-02-01

    Two-photon excited fluorescence (TPEF) microscopy and fluorescence lifetime imaging (FLIM) are powerful imaging techniques in bio-molecular science. The need for elaborate light sources for TPEF and speed limitations for FLIM, however, hinder an even wider application. We present a way to overcome these limitations by combining a robust and inexpensive fiber laser for nonlinear excitation with a fast analog digitization method for rapid FLIM imaging. The sub-nanosecond pulsed laser source is synchronized to a high analog bandwidth signal detection for single-shot TPEF and single-shot FLIM imaging. The actively modulated pulses at 1064 nm from the fiber laser are adjustable from 50 ps to 5 ns with kW of peak power. At typically applied pulse lengths and repetition rates, the duty cycle is comparable to that of typically used femtosecond pulses, and thus the peak power is also comparable at the same cw power. Hence, both types of excitation should yield the same number of fluorescence photons per unit time on average when used for TPEF imaging. However, in the 100 ps configuration, a thousand times more fluorescence photons are generated per pulse. In this paper, we show that the higher number of fluorescence photons per pulse combined with a high analog bandwidth detection makes it possible not only to use a single pulse per pixel for TPEF imaging but also to resolve the exponential time decay for FLIM. To evaluate the performance of our system, we acquired FLIM images of a Convallaria sample at pixel rates of 1 MHz, where the lifetime information is directly measured with a fast real-time digitizer. With the presented results, we show that longer pulses in the many-10-ps to nanosecond regime can be readily applied for TPEF imaging and enable new imaging modalities like single-pulse FLIM.

  6. Radiation-hard/high-speed parallel optical links

    Energy Technology Data Exchange (ETDEWEB)

    Gan, K.K., E-mail: gan@mps.ohio-state.edu [Department of Physics, The Ohio State University, Columbus, OH 43210 (United States); Buchholz, P.; Heidbrink, S. [Fachbereich Physik, Universität Siegen, Siegen (Germany); Kagan, H.P.; Kass, R.D.; Moore, J.; Smith, D.S. [Department of Physics, The Ohio State University, Columbus, OH 43210 (United States); Vogt, M.; Ziolkowski, M. [Fachbereich Physik, Universität Siegen, Siegen (Germany)

    2016-09-21

    We have designed and fabricated a compact parallel optical engine for transmitting data at 5 Gb/s. The device consists of a 4-channel ASIC driving a VCSEL (Vertical Cavity Surface Emitting Laser) array in an optical package. The ASIC is designed using only core transistors in a 65 nm CMOS process to enhance the radiation-hardness. The ASIC contains an 8-bit DAC to control the bias and modulation currents of the individual channels in the VCSEL array. The performance of the optical engine at 5 Gb/s is satisfactory.

  7. Radiation-hard/high-speed parallel optical links

    Science.gov (United States)

    Gan, K. K.; Buchholz, P.; Heidbrink, S.; Kagan, H. P.; Kass, R. D.; Moore, J.; Smith, D. S.; Vogt, M.; Ziolkowski, M.

    2016-09-01

    We have designed and fabricated a compact parallel optical engine for transmitting data at 5 Gb/s. The device consists of a 4-channel ASIC driving a VCSEL (Vertical Cavity Surface Emitting Laser) array in an optical package. The ASIC is designed using only core transistors in a 65 nm CMOS process to enhance the radiation-hardness. The ASIC contains an 8-bit DAC to control the bias and modulation currents of the individual channels in the VCSEL array. The performance of the optical engine at 5 Gb/s is satisfactory.

  8. Polybinary modulation for bandwidth limited optical links

    DEFF Research Database (Denmark)

    Vegas Olmos, Juan José; Jurado-Navas, Antonio

    2015-01-01

    Optical links using traditional modulation formats are reaching a plateau in terms of capacity, mainly due to bandwidth limitations in the devices employed at the transmitter and receiver. Advanced modulation formats, which boost the spectral efficiency, provide a smooth migration path towards...... the recent results on polybinary modulation, comprising both binary and multilevel signals as seed signals. The results show how polybinary modulation effectively reduces the bandwidth requirements on optical links while providing high spectral efficiency....

  9. Parallel Libraries to support High-Level Programming

    DEFF Research Database (Denmark)

    Larsen, Morten Nørgaard

    model requires the programmer to think a bit differently, but at the same time the implemented algorithms will perform very well, as shown by the initial tests presented. In the second part of this thesis, I will change focus from the CELL-BE architecture to the more traditional x86 architecture...... of the more exotic though short-lived heterogeneous CELL Broadband Engine (CELL-BE) architecture added to this shift. Furthermore, the use of cluster computers made of commodity hardware and specialized supercomputers has greatly increased in both industry and the academic world. Finally...... as if they were a single machine. In between are a number of tools helping programmers handle communication, share data, run loops in parallel, handle algorithms mining huge amounts of data, etc. Even though most of them do a good job performance-wise, almost all of them require that the programmers learn...

  10. Spectrophotometer spectral bandwidth calibration with absorption bands crystal standard.

    Science.gov (United States)

    Soares, O D; Costa, J L

    1999-04-01

    A procedure for calibration of a spectral bandwidth standard for high-resolution spectrophotometers is described. Symmetrical absorption bands for a crystal standard are adopted. The method relies on spectral band shape fitting followed by a convolution with the slit function of the spectrophotometer. A reference spectrophotometer is used to calibrate the spectral bandwidth standard. Bandwidth calibration curves for a minimum spectral transmission factor relative to the spectral bandwidth of the reference spectrophotometer are derived for the absorption bands at the wavelength of the band absorption maximum. The family of these calibration curves characterizes the spectral bandwidth standard. We calibrate the spectral bandwidth of a spectrophotometer with respect to the reference spectrophotometer by determining the spectral transmission factor minimum at every calibrated absorption band of the bandwidth standard for the nominal instrument values of the spectral bandwidth. With reference to the standard spectral bandwidth calibration curves, the relation of the spectral bandwidth to the reference spectrophotometer is determined. We determine the discrepancy in the spectrophotometers' spectral bandwidths by averaging the spectral bandwidth discrepancies relative to the standard calibrated values found at the absorption bands considered. A weighted average of the uncertainties is taken.
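
    The convolution step at the heart of this procedure is easy to illustrate numerically. The sketch below convolves a fitted symmetric absorption band with a triangular slit function of varying width and records how the minimum spectral transmission factor rises with the spectral bandwidth; the Gaussian band shape, its depth and the triangular slit profile are hypothetical stand-ins for the crystal standard and instrument in the paper.

    import numpy as np

    step = 0.01                                        # wavelength grid, nm
    wl = np.arange(-10.0, 10.0, step)
    band = 1.0 - 0.8 * np.exp(-0.5 * (wl / 0.5) ** 2)  # symmetric transmission dip

    def slit(fwhm):
        """Triangular slit function of the given FWHM (base = 2*FWHM), unit area."""
        x = np.arange(-fwhm, fwhm + step, step)
        s = np.maximum(0.0, 1.0 - np.abs(x) / fwhm)
        return s / s.sum()

    for fwhm in (0.1, 0.5, 1.0, 2.0):                  # nominal spectral bandwidths, nm
        measured = np.convolve(band, slit(fwhm), mode="same")
        print(f"bandwidth {fwhm:4.1f} nm -> minimum transmission {measured.min():.3f}")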

  11. GROMACS 4.5: A high-throughput and highly parallel open source molecular simulation toolkit

    Energy Technology Data Exchange (ETDEWEB)

    Pronk, Sander [Science for Life Lab., Stockholm (Sweden); KTH Royal Institute of Technology, Stockholm (Sweden); Pall, Szilard [Science for Life Lab., Stockholm (Sweden); KTH Royal Institute of Technology, Stockholm (Sweden); Schulz, Roland [Univ. of Tennessee, Knoxville, TN (United States); Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Larsson, Per [Univ. of Virginia, Charlottesville, VA (United States); Bjelkmar, Par [Science for Life Lab., Stockholm (Sweden); Stockholm Univ., Stockholm (Sweden); Apostolov, Rossen [Science for Life Lab., Stockholm (Sweden); KTH Royal Institute of Technology, Stockholm (Sweden); Shirts, Michael R. [Univ. of Virginia, Charlottesville, VA (United States); Smith, Jeremy C. [Univ. of Tennessee, Knoxville, TN (United States); Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Kasson, Peter M. [Univ. of Virginia, Charlottesville, VA (United States); van der Spoel, David [Science for Life Lab., Stockholm (Sweden); Uppsala Univ., Uppsala (Sweden); Hess, Berk [Science for Life Lab., Stockholm (Sweden); KTH Royal Institute of Technology, Stockholm (Sweden); Lindahl, Erik [Science for Life Lab., Stockholm (Sweden); KTH Royal Institute of Technology, Stockholm (Sweden); Stockholm Univ., Stockholm (Sweden)

    2013-02-13

    Molecular simulation has historically been a low-throughput technique, but faster computers and increasing amounts of genomic and structural data are changing this by enabling large-scale automated simulation of, for instance, many conformers or mutants of biomolecules with or without a range of ligands. At the same time, advances in performance and scaling now make it possible to model complex biomolecular interaction and function in a manner directly testable by experiment. These applications share a need for fast and efficient software that can be deployed on a massive scale in clusters, web servers, distributed computing or cloud resources. As a result, we present a range of new simulation algorithms and features developed during the past 4 years, leading up to the GROMACS 4.5 software package. The software now automatically handles wide classes of biomolecules, such as proteins, nucleic acids and lipids, and comes with all commonly used force fields for these molecules built in. GROMACS supports several implicit solvent models, as well as new free-energy algorithms, and the software now uses multithreading for efficient parallelization even on low-end systems, including Windows-based workstations. Together with hand-tuned assembly kernels and state-of-the-art parallelization, this provides extremely high performance and cost efficiency for high-throughput as well as massively parallel simulations.

  12. Line filter design of parallel interleaved VSCs for high power wind energy conversion systems

    DEFF Research Database (Denmark)

    Gohil, Ghanshyamsinh Vijaysinh; Bede, Lorand; Teodorescu, Remus

    2015-01-01

    The Voltage Source Converters (VSCs) are often connected in parallel in a Wind Energy Conversion System (WECS) to match the high power rating of the modern wind turbines. The effect of the interleaved carriers on the harmonic performance of the parallel connected VSCs is analyzed in this paper...

  13. Topology optimization of photonic crystal structures: a high-bandwidth low-loss T-junction waveguide

    DEFF Research Database (Denmark)

    Jensen, Jakob Søndergaard; Sigmund, Ole

    2005-01-01

    A T-junction in a photonic crystal waveguide is designed with the topology-optimization method. The gradient-based optimization tool is used to modify the material distribution in the junction area so that the power transmission in the output ports is maximized. To obtain high transmission...

  14. The Paralleling of High Power High Frequency Amplifier Based on Synchronous and Asynchronous Control

    Institute of Scientific and Technical Information of China (English)

    程荣仓; 刘正之

    2004-01-01

    The vertical position of the plasma in the HT-7U Tokamak is inherently unstable. In order to realize active stabilization, the response rate of the high-power high-frequency amplifier feeding the active control coils must be fast enough. This paper analyzes the paralleling scheme of the power amplifier under two control modes: synchronous control and asynchronous control. The characteristics of the two modes are compared and given in the text. Finally, the analysis is verified by a low-power experiment.

  15. Health care using high-bandwidth communication to overcome distance and time barriers for the Department of Defense

    Science.gov (United States)

    Mun, Seong K.; Freedman, Matthew T.; Gelish, Anthony; de Treville, Robert E.; Sheehy, Monet R.; Hansen, Mark; Hill, Mac; Zacharia, Elisabeth; Sullivan, Michael J.; Sebera, C. Wayne

    1993-01-01

    An image management and communications (IMAC) network, also known as a picture archiving and communication system (PACS), consists of (1) digital image acquisition, (2) image review stations, (3) image storage devices and image reading workstations, and (4) communication capability. When these subsystems are integrated over a high-speed communication technology, the possibilities for improving the timeliness and quality of diagnostic services within a hospital or at remote clinical sites are numerous. A teleradiology system uses basically the same hardware configuration together with a long-distance communication capability. Functional characteristics of the components are highlighted. Many medical imaging systems are already in digital form. These digital images constitute approximately 30% of the total volume of images produced in a radiology department. The remaining 70% of images include conventional x-ray films of the chest, skeleton, abdomen, and GI tract. Unless one develops a method of handling these conventional film images, global improvement in productivity in image management and radiology services throughout a hospital cannot be achieved. Currently, there are two methods of producing digital information representing these conventional analog images for IMAC: film digitizers that scan the conventional films, and computed radiography (CR) that captures x-ray images using a storage phosphor plate that is subsequently scanned by a laser beam.

  16. Live Educational Outreach for Ocean Exploration: High-Bandwidth Ship-to-Shore Broadcasts Using Internet2

    Science.gov (United States)

    Coleman, D. F.; Ballard, R. D.

    2005-12-01

    During the past 3 field seasons, our group at the University of Rhode Island Graduate School of Oceanography, in partnership with the Institute for Exploration and a number of educational institutions, has conducted a series of ocean exploration expeditions with a significant focus on educational outreach through "telepresence" - utilizing live transmissions of video, audio, and data streams across the Internet and Internet2. Our educational partners include Immersion Presents, Boys and Girls Clubs of America, the Jason Foundation for Education, and the National Geographic Society, all who provided partial funding for the expeditions. The primary funding agency each year was NOAA's Office of Ocean Exploration and our outreach efforts were conducted in collaboration with them. During each expedition, remotely operated vehicle (ROV) systems were employed to examine interesting geological and archaeological sites on the seafloor. These expeditions include the investigation of ancient shipwrecks in the Black Sea in 2003, a survey of the Titanic shipwreck site in 2004, and a detailed sampling and mapping effort at the Lost City Hydrothermal Field in 2005. High-definition video cameras on the ROVs collected the footage that was then digitally encoded, IP-encapsulated, and streamed across a satellite link to a shore-based hub, where the streams were redistributed. During each expedition, live half-hour-long educational broadcasts were produced 4 times per day for 10 days. These shows were distributed using satellite and internet technologies to a variety of venues, including museums, aquariums, science centers, public schools, and universities. In addition to the live broadcasts, educational products were developed to enhance the learning experience. These include activity modules and curriculum-based material for teachers and informal educators. Each educational partner also maintained a web site that followed the expedition and provided additional background information

  17. Bandwidth Estimation For Mobile Ad hoc Network (MANET)

    Directory of Open Access Journals (Sweden)

    Rabia Ali

    2011-09-01

    Full Text Available In this paper we present a bandwidth estimation scheme for MANET which uses components of two bandwidth estimation methods: 'Hello Bandwidth Estimation' and 'Listen Bandwidth Estimation'. This paper also gives the advantages of the proposed method, which is based on a comparison of these two methods. Bandwidth estimation is an important issue in the Mobile Ad hoc Network (MANET) and is difficult because each host has imprecise knowledge of the network status and links change dynamically. Therefore, an effective bandwidth estimation scheme for MANET is highly desirable. Ad hoc networks present unique advanced challenges, including the design of protocols for mobility management, effective routing, data transport, security, power management, and quality-of-service (QoS) provisioning. Once these problems are solved, the practical use of MANETs will be realizable.

  18. High-accuracy interferometric measurements of flatness and parallelism of a step gauge

    CSIR Research Space (South Africa)

    Kruger, OA

    2001-01-01

    Full Text Available for the calibration of step gauges to a high accuracy. A system was also developed for interferometric measurements of the flatness and parallelism of gauge block faces for use in uncertainty calculations....

  19. High-speed parallel forward error correction for optical transport networks

    DEFF Research Database (Denmark)

    Rasmussen, Anders; Ruepp, Sarah Renée; Berger, Michael Stübert;

    2010-01-01

    This paper presents a highly parallelized hardware implementation of the standard OTN Reed-Solomon Forward Error Correction algorithm. The proposed circuit is designed to meet the immense throughput required by OTN4, using commercially available FPGA technology.

  20. Ultrahigh bandwidth signal processing

    DEFF Research Database (Denmark)

    Oxenløwe, Leif Katsuo

    2016-01-01

    Optical time lenses have proven to be very versatile for advanced optical signal processing. Based on a controlled interplay between dispersion and phase-modulation by e.g. four-wave mixing, the processing is phase-preserving, and hence useful for all types of data signals including coherent multi-level modulation formats. This has enabled processing of phase-modulated spectrally efficient data signals, such as orthogonal frequency division multiplexed (OFDM) signals. In that case, a spectral telescope system was used, using two time lenses with different focal lengths (chirp rates), yielding a spectral...... regeneration. These operations require a broad bandwidth nonlinear platform, and novel photonic integrated nonlinear platforms like aluminum gallium arsenide nano-waveguides used for 1.28 Tbaud optical signal processing will be described.

  1. Bandwidth enhancement and time-delay signature suppression of chaotic signal from an optical feedback semiconductor laser by using cross phase modulation in a highly nonlinear fiber loop mirror

    Science.gov (United States)

    Wang, Liang-Yan; Zhong, Zhu-Qiong; Wu, Zheng-Mao; Lu, Dong; Chen, Xi; Chen, Jun; Xia, Guang-Qiong

    2016-11-01

    Based on a nonlinear fiber loop mirror (NOLM) composed of a fiber coupler (FC) and a highly nonlinear fiber (HNLF), a scheme is proposed to simultaneously realize the bandwidth enhancement and the time-delay signature (TDS) suppression of a chaotic signal generated from an external cavity optical feedback semiconductor laser. The simulation results show that, after passing through the NOLM, the bandwidth of chaotic signal can be efficiently enhanced and the TDS can be well suppressed under suitable operation parameters. Furthermore, the influences of the power-splitting ratio of the FC, the averaged power of the chaotic signal entering into the FC and the length of the HNLF on the chaotic bandwidth and TDS are analyzed, and the optimized parameters are determined.

  2. Analysis of stress and natural frequencies of high-speed spatial parallel mechanism

    Institute of Scientific and Technical Information of China (English)

    陈修龙; 李文彬; 邓昱; 李云峰

    2013-01-01

    In order to grasp the dynamic behavior of the 4-UPS-UPU high-speed spatial parallel mechanism, the stress of the driving limbs and the natural frequencies of the parallel mechanism were investigated. Based on flexible multi-body dynamics theory, the dynamics model of the 4-UPS-UPU high-speed spatial parallel mechanism, without considering geometric nonlinearity, was derived. The stress of the driving limbs and the natural frequencies of the 4-UPS-UPU parallel mechanism with specific parameters were analyzed. The relationship between the basic parameters of the parallel mechanism and its dynamic behavior, such as the stress of the driving limbs and the natural frequencies of the mechanism, was discussed. The numerical simulation results show that the stress and natural frequencies are relatively sensitive to the section parameters of the driving limbs, the characteristic parameters of the material of the driving limbs, and the mass of the moving platform. This research provides an important theoretical basis for the analysis of dynamic behavior and the optimal design of high-speed spatial parallel mechanisms.

  3. A highly scalable massively parallel fast marching method for the Eikonal equation

    Science.gov (United States)

    Yang, Jianming; Stern, Frederick

    2017-03-01

    The fast marching method is a widely used numerical method for solving the Eikonal equation arising from a variety of scientific and engineering fields. It is long deemed inherently sequential and an efficient parallel algorithm applicable to large-scale practical applications is not available in the literature. In this study, we present a highly scalable massively parallel implementation of the fast marching method using a domain decomposition approach. Central to this algorithm is a novel restarted narrow band approach that coordinates the frequency of communications and the amount of computations extra to a sequential run for achieving an unprecedented parallel performance. Within each restart, the narrow band fast marching method is executed; simple synchronous local exchanges and global reductions are adopted for communicating updated data in the overlapping regions between neighboring subdomains and getting the latest front status, respectively. The independence of front characteristics is exploited through special data structures and augmented status tags to extract the masked parallelism within the fast marching method. The efficiency, flexibility, and applicability of the parallel algorithm are demonstrated through several examples. These problems are extensively tested on six grids with up to 1 billion points using different numbers of processes ranging from 1 to 65536. Remarkable parallel speedups are achieved using tens of thousands of processes. Detailed pseudo-codes for both the sequential and parallel algorithms are provided to illustrate the simplicity of the parallel implementation and its similarity to the sequential narrow band fast marching algorithm.
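
    For reference, the sketch below shows the sequential narrow-band fast marching kernel that, in the paper's scheme, each subdomain executes between restarts, here as a first-order 2D Eikonal solve with unit speed. The grid size and source location are arbitrary, and the domain decomposition, restarts and inter-subdomain communication described above are deliberately omitted.

    import heapq
    import numpy as np

    def fast_marching(shape, sources, h=1.0, speed=1.0):
        """First-order fast marching solve of |grad T| = 1/speed on a 2D grid."""
        T = np.full(shape, np.inf)
        accepted = np.zeros(shape, dtype=bool)
        heap = []                                     # the narrow band, as a min-heap
        for s in sources:
            T[s] = 0.0
            heapq.heappush(heap, (0.0, s))
        f = h / speed

        def axis_min(p1, p2):
            """Smallest accepted T among two opposite neighbours (inf if none)."""
            vals = [T[p] for p in (p1, p2)
                    if 0 <= p[0] < shape[0] and 0 <= p[1] < shape[1] and accepted[p]]
            return min(vals) if vals else np.inf

        while heap:
            t, (i, j) = heapq.heappop(heap)
            if accepted[i, j]:
                continue                              # stale entry left in the heap
            accepted[i, j] = True
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if not (0 <= ni < shape[0] and 0 <= nj < shape[1]) or accepted[ni, nj]:
                    continue
                a = axis_min((ni - 1, nj), (ni + 1, nj))
                b = axis_min((ni, nj - 1), (ni, nj + 1))
                lo, hi = min(a, b), max(a, b)
                if hi - lo >= f:                      # one-sided (1D) update
                    t_new = lo + f
                else:                                 # two-sided quadratic update
                    t_new = 0.5 * (lo + hi + np.sqrt(2 * f * f - (lo - hi) ** 2))
                if t_new < T[ni, nj]:
                    T[ni, nj] = t_new
                    heapq.heappush(heap, (t_new, (ni, nj)))
        return T

    T = fast_marching((200, 200), [(100, 100)])
    print(T[100, 150])   # roughly 50 grid units from the point source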

  4. High Volume Colour Image Processing with Massively Parallel Embedded Processors

    NARCIS (Netherlands)

    Jacobs, Jan W.M.; Bond, W.; Pouls, R.; Smit, Gerard J.M.; Joubert, G.R.; Peters, F.J.; Tirado, P.; Nagel, W.E.; Plata, O.; Zapata, E.

    2006-01-01

    Currently Oce uses FPGA technology for implementing colour image processing for their high volume colour printers. Although FPGA technology provides enough performance, it has a rather tedious development process. This paper describes the research conducted on an alternative implementation

  5. Ultrawide bandwidth 1.55-um lasers

    Science.gov (United States)

    Morton, Paul A.; Tanbun-Ek, Tawee; Logan, Ralph A.; Ackerman, David A.; Shtengel, Gleb E.; Yadvish, R. D.; Sergent, A. M.; Sciortino, Paul F., Jr.

    1996-04-01

    This paper describes the essential elements for creating a practical wide bandwidth directly modulated laser source. This includes considerations of the intrinsic limitations of the laser structure, due to the resonant frequency and damping of the laser output, together with carrier transport issues that allow carriers in the device active region to be efficiently modulated at high speeds. The use of a P-doped, compressively strained multiple-quantum-well active region to provide high intrinsic speed and remove transport limitations is described, together with record-setting results of a 25 GHz modulation bandwidth for a 1.55 micrometer Fabry-Perot laser and a 26 GHz bandwidth for a 1.55 micrometer DFB laser. The challenges of providing high bandwidth electrical connections to the laser on a suitable submount, together with fiber attachment and microwave packaging, are discussed. Results of fully packaged 1.55 micrometer DFB lasers with 25 GHz modulation bandwidth are shown. Digital modulation of the packaged 1.55 micrometer DFB laser, including impedance matching, is described, and the transient wavelength chirp is presented. This low chirp is reduced further using an optical filter, to provide a 10 Gbit/s source with chirp similar to that of an external electroabsorption modulator.

  6. Extending Automatic Parallelization to Optimize High-Level Abstractions for Multicore

    Energy Technology Data Exchange (ETDEWEB)

    Liao, C; Quinlan, D J; Willcock, J J; Panas, T

    2008-12-12

    Automatic introduction of OpenMP for sequential applications has attracted significant attention recently because of the proliferation of multicore processors and the simplicity of using OpenMP to express parallelism for shared-memory systems. However, most previous research has only focused on C and Fortran applications operating on primitive data types. C++ applications using high-level abstractions, such as STL containers and complex user-defined types, are largely ignored due to the lack of research compilers that are readily able to recognize high-level object-oriented abstractions and leverage their associated semantics. In this paper, we automatically parallelize C++ applications using ROSE, a multiple-language source-to-source compiler infrastructure which preserves the high-level abstractions and gives us access to their semantics. Several representative parallelization candidate kernels are used to explore semantic-aware parallelization strategies for high-level abstractions, combined with extended compiler analyses. Those kernels include an array-base computation loop, a loop with task-level parallelism, and a domain-specific tree traversal. Our work extends the applicability of automatic parallelization to modern applications using high-level abstractions and exposes more opportunities to take advantage of multicore processors.

  7. Performance analysis of high quality parallel preconditioners applied to 3D finite element structural analysis

    Energy Technology Data Exchange (ETDEWEB)

    Kolotilina, L.; Nikishin, A.; Yeremin, A. [and others

    1994-12-31

    The solution of large systems of linear equations is a crucial bottleneck when performing 3D finite element analysis of structures. Also, in many cases the reliability and robustness of iterative solution strategies, and their efficiency when exploiting hardware resources, fully determine the scope of industrial applications which can be solved on a particular computer platform. This is especially true for modern vector/parallel supercomputers with large vector length and for modern massively parallel supercomputers. Preconditioned iterative methods have been successfully applied to industrial-class finite element analysis of structures. The construction and application of high quality preconditioners constitute a high percentage of the total solution time. Parallel implementation of high quality preconditioners on such architectures is a formidable challenge. Two common types of existing preconditioners are implicit preconditioners and explicit preconditioners. Implicit preconditioners (e.g. incomplete factorizations of several types) are generally of high quality but require the solution of lower and upper triangular systems of equations per iteration, which are difficult to parallelize without deteriorating the convergence rate. Explicit preconditioners (e.g. polynomial preconditioners or Jacobi-like preconditioners) require sparse matrix-vector multiplications and can be parallelized, but their preconditioning qualities are less than desirable. The authors present results of numerical experiments with Factorized Sparse Approximate Inverses (FSAI) for symmetric positive definite linear systems. These are high quality preconditioners that possess a large resource of parallelism by construction without increasing the serial complexity.

  8. Bandwidth and Noise in Spatiotemporally Modulated Mueller Matrix Polarimeters

    Science.gov (United States)

    Vaughn, Israel Jacob

    Polarimetric systems design has seen recent utilization of linear systems theory for system descriptions. Although noise optimal systems have been shown, bandwidth performance has not been addressed in depth generally and is particularly lacking for Mueller matrix (active) polarimetric systems. Bandwidth must be considered in a systematic way for remote sensing polarimetric systems design. The systematic approach facilitates both understanding of fundamental constraints and design of higher bandwidth polarimetric systems. Fundamental bandwidth constraints result in production of polarimetric "artifacts" due to channel crosstalk upon Mueller matrix reconstruction. This dissertation analyzes bandwidth trade-offs in spatio-temporal channeled Mueller matrix polarimetric systems. Bandwidth is directly related to the geometric positioning of channels in the Fourier (channel) space, however channel positioning for polarimetric systems is constrained both physically and by design parameters like domain separability. We present the physical channel constraints and the constraints imposed when the carriers are separable between space and time. Polarimetric systems are also constrained by noise performance, and there is a trade-off between noise performance and bandwidth. I develop cost functions which account for the trade-off between noise and bandwidth for spatio-temporal polarimetric systems. The cost functions allow a systems designer to jointly optimize systems with good bandwidth and noise performance. Optimization is implemented for a candidate spatio-temporal system design, and high temporal bandwidth systems resulting from the optimization are presented. Systematic errors which impact the bandwidth performance and mitigation strategies for these systematic errors are also presented. Finally, a portable imaging Mueller matrix system is built and analyzed based on the theoretical bandwidth analysis and system bandwidth optimization. Temporal bandwidth performance is

  9. Development of gallium arsenide high-speed, low-power serial parallel interface modules: Executive summary

    Science.gov (United States)

    1988-01-01

    Final report to NASA LeRC on the development of gallium arsenide (GaAs) high-speed, low-power serial/parallel interface modules. The report discusses the development and test of a family of 16, 32 and 64 bit parallel-to-serial and serial-to-parallel integrated circuits using a self-aligned gate MESFET technology developed at the Honeywell Sensors and Signal Processing Laboratory. Lab testing demonstrated 1.3 GHz clock rates at a power of 300 mW. This work was accomplished under contract number NAS3-24676.

  10. 10th International Workshop on Parallel Tools for High Performance Computing

    CERN Document Server

    Gracia, José; Hilbrich, Tobias; Knüpfer, Andreas; Resch, Michael; Nagel, Wolfgang

    2017-01-01

    This book presents the proceedings of the 10th International Parallel Tools Workshop, held October 4-5, 2016 in Stuttgart, Germany – a forum to discuss the latest advances in parallel tools. High-performance computing plays an increasingly important role for numerical simulation and modelling in academic and industrial research. At the same time, using large-scale parallel systems efficiently is becoming more difficult. A number of tools addressing parallel program development and analysis have emerged from the high-performance computing community over the last decade, and what may have started as a collection of small helper scripts has now matured into production-grade frameworks. Powerful user interfaces and an extensive body of documentation allow easy usage by non-specialists.

  11. Ultrahigh bandwidth signal processing

    Science.gov (United States)

    Oxenløwe, Leif Katsuo

    2016-04-01

    Optical time lenses have proven to be very versatile for advanced optical signal processing. Based on a controlled interplay between dispersion and phase-modulation by e.g. four-wave mixing, the processing is phase-preserving, and hence useful for all types of data signals including coherent multi-level modulation formats. This has enabled processing of phase-modulated spectrally efficient data signals, such as orthogonal frequency division multiplexed (OFDM) signals. In that case, a spectral telescope system was used, using two time lenses with different focal lengths (chirp rates), yielding a spectral magnification of the OFDM signal. Utilising such telescopic arrangements, it has become possible to perform a number of interesting functionalities, which will be described in the presentation. This includes conversion from OFDM to Nyquist WDM, compression of WDM channels to a single Nyquist channel and WDM regeneration. These operations require a broad bandwidth nonlinear platform, and novel photonic integrated nonlinear platforms like aluminum gallium arsenide nano-waveguides used for 1.28 Tbaud optical signal processing will be described.

  12. Theoretical Calculation of MMF's Bandwidth

    Institute of Scientific and Technical Information of China (English)

    LI Xiao-fu; JIANG De-sheng; YU Hai-hu

    2004-01-01

    The difference between over-filled launch bandwidth (OFL BW) and restricted mode launch bandwidth (RML BW) is described. A theoretical model is established to calculate the OFL BW of graded-index multimode fiber (GI-MMF), and the result is useful for guiding the modification of the manufacturing method.

  13. Estimating Bottleneck Bandwidth using TCP

    Science.gov (United States)

    Allman, Mark

    1998-01-01

    Various issues associated with estimating bottleneck bandwidth using TCP are presented in viewgraph form. Specific topics include: 1) why it is desirable to use TCP to estimate the bottleneck bandwidth; 2) setting ssthresh to an appropriate value to reduce loss; 3) possible packet-pair solutions; and 4) preliminary results: ACTS and the Internet.
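
    To make the packet-pair idea (topic 3) concrete: two packets sent back-to-back are spread apart by the bottleneck link, so the bottleneck bandwidth can be estimated as packet size divided by their inter-arrival gap. The segment size and gap in the sketch below are invented for illustration.

    def packet_pair_estimate(packet_size_bytes, arrival_gap_s):
        """Bottleneck bandwidth estimate, in bits per second."""
        return 8 * packet_size_bytes / arrival_gap_s

    # 1448-byte TCP segments arriving 0.8 ms apart -> roughly 14.5 Mbit/s bottleneck.
    print(packet_pair_estimate(1448, 0.0008) / 1e6, "Mbit/s")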

  14. Bandwidth of Gaussian weighted Chirp

    DEFF Research Database (Denmark)

    Wilhjelm, Jens E.

    1993-01-01

    Four major time duration and bandwidth expressions are calculated for a linearly frequency modulated sinusoid with Gaussian shaped envelope. This includes a Gaussian tone pulse. The bandwidth is found to be a nonlinear function of nominal time duration and nominal frequency excursion of the chirp...
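
    As a numerical companion to this record, and not the paper's four closed-form expressions, the sketch below generates a linearly frequency modulated sinusoid under a Gaussian envelope and estimates its RMS bandwidth from the power spectrum; the sample rate, centre frequency, nominal frequency excursion and envelope width are all hypothetical.

    import numpy as np

    fs = 100e6                                    # sample rate, Hz
    t = np.arange(-5e-6, 5e-6, 1 / fs)            # time axis centred on the pulse
    f0, excursion, sigma = 5e6, 2e6, 1e-6         # centre frequency, sweep, envelope width
    k = excursion / (6 * sigma)                   # chirp rate: sweep spread over +/- 3 sigma
    phase = 2 * np.pi * (f0 * t + 0.5 * k * t ** 2)
    pulse = np.exp(-0.5 * (t / sigma) ** 2) * np.cos(phase)

    spectrum = np.abs(np.fft.rfft(pulse)) ** 2
    freq = np.fft.rfftfreq(len(pulse), 1 / fs)
    f_mean = np.sum(freq * spectrum) / np.sum(spectrum)
    rms_bw = np.sqrt(np.sum((freq - f_mean) ** 2 * spectrum) / np.sum(spectrum))
    print(f"centre {f_mean / 1e6:.2f} MHz, RMS bandwidth {rms_bw / 1e6:.2f} MHz")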

  15. Improved-Bandwidth Transimpedance Amplifier

    Science.gov (United States)

    Chapsky, Jacob

    2009-01-01

    The widest-bandwidth operational amplifier available, with the best voltage and current noise characteristics, is considered for transimpedance amplifier (TIA) applications where wide bandwidth is required to handle fast-rising input signals (as in time-of-flight measurement cases). The added amplifier inside the TIA feedback loop can be configured to have a voltage gain slightly lower than the bandwidth reduction factor.

  16. Design of High Speed Architecture of Parallel MAC Based On Radix-2 MBA

    Directory of Open Access Journals (Sweden)

    Syed Anwar Ahmed,

    2014-05-01

    Full Text Available The multiplier and the multiplier-and-accumulator (MAC) are essential elements of digital signal processing tasks such as filtering, convolution, transformations and inner products. The parallel MAC is frequently used in digital signal processing and video/graphics applications. Fast multipliers are essential parts of digital signal processing systems. The speed of the multiply operation is of great importance in digital signal processing as well as in general-purpose processors today, especially since media processing took off. The MAC provides high-speed multiplication and multiplication with accumulative addition. This paper presents a combined process of multiplication and accumulation based on radix-4 and radix-8 Booth encodings. In this paper, we investigate the method of implementing the parallel MAC with the smallest possible delay. Enhancing the speed of operation of the parallel MAC is a major design issue. This has been achieved by developing a CLA adder for the parallel MAC.
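
    To show what radix-4 (modified) Booth recoding does arithmetically, here is a behavioural Python sketch of the recoding and the resulting multiply-and-accumulate step. It models only the arithmetic identity behind the encoding, not the paper's hardware partial-product array or CLA adder, and the 16-bit word width is an arbitrary choice.

    def booth_radix4_digits(multiplier, bits=16):
        """Recode a signed multiplier into radix-4 Booth digits in {-2, -1, 0, 1, 2}."""
        m = multiplier & ((1 << bits) - 1)        # two's-complement bit pattern
        m <<= 1                                   # append the implicit 0 on the right
        table = {0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
                 0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0}
        # Scan overlapping 3-bit groups, advancing two bit positions at a time.
        return [table[(m >> i) & 0b111] for i in range(0, bits, 2)]

    def booth_multiply(a, b, bits=16):
        """Multiply via the recoded digits: sum of digit * a * 4**position."""
        return sum(d * a * 4 ** i for i, d in enumerate(booth_radix4_digits(b, bits)))

    def mac(acc, a, b):
        """One multiply-and-accumulate step: acc + a*b."""
        return acc + booth_multiply(a, b)

    print(booth_multiply(123, -45), 123 * -45)    # both print -5535
    print(mac(1000, 7, 9))                        # 1063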

  17. High-speed detection of emergent market clustering via an unsupervised parallel genetic algorithm

    Directory of Open Access Journals (Sweden)

    Dieter Hendricks

    2016-02-01

    Full Text Available We implement a master-slave parallel genetic algorithm with a bespoke log-likelihood fitness function to identify emergent clusters within price evolutions. We use graphics processing units (GPUs to implement a parallel genetic algorithm and visualise the results using disjoint minimal spanning trees. We demonstrate that our GPU parallel genetic algorithm, implemented on a commercially available general purpose GPU, is able to recover stock clusters in sub-second speed, based on a subset of stocks in the South African market. This approach represents a pragmatic choice for low-cost, scalable parallel computing and is significantly faster than a prototype serial implementation in an optimised C-based fourth-generation programming language, although the results are not directly comparable because of compiler differences. Combined with fast online intraday correlation matrix estimation from high frequency data for cluster identification, the proposed implementation offers cost-effective, near-real-time risk assessment for financial practitioners.
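
    As a minimal illustrative analogue of the master-slave structure described here, the sketch below keeps selection, crossover and mutation on the master process and farms fitness evaluations out to a pool of worker processes. It runs on CPUs via Python multiprocessing rather than on a GPU, and the toy quadratic fitness and target vector stand in for the paper's log-likelihood cluster objective.

    import random
    from multiprocessing import Pool

    TARGET = [3, 1, 4, 1, 5, 9, 2, 6]                 # hypothetical optimum

    def fitness(individual):
        """Higher is better; maximal when the individual matches TARGET."""
        return -sum((g - t) ** 2 for g, t in zip(individual, TARGET))

    def evolve(pop_size=64, genes=8, generations=200, workers=4):
        pop = [[random.randint(0, 9) for _ in range(genes)] for _ in range(pop_size)]
        with Pool(workers) as pool:                   # the "slaves"
            for _ in range(generations):
                scores = pool.map(fitness, pop)       # parallel fitness evaluation
                ranked = [ind for _, ind in sorted(zip(scores, pop), reverse=True)]
                parents = ranked[:pop_size // 2]
                children = []
                while len(children) < pop_size:
                    a, b = random.sample(parents, 2)
                    cut = random.randrange(1, genes)
                    child = a[:cut] + b[cut:]         # one-point crossover
                    if random.random() < 0.1:         # mutation
                        child[random.randrange(genes)] = random.randint(0, 9)
                    children.append(child)
                pop = children
        return max(pop, key=fitness)

    if __name__ == "__main__":
        print(evolve())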

  18. A highly scalable massively parallel fast marching method for the Eikonal equation

    CERN Document Server

    Yang, Jianming

    2015-01-01

    In this study, we present a highly scalable massively parallel implementation of the fast marching method using a domain decomposition approach. Central to this algorithm is a novel restarted narrow band approach that coordinates the frequency of communications and the amount of computations extra to a sequential run for achieving an unprecedented parallel performance. Within each restart, the narrow band fast marching method is executed; simple synchronous local exchanges and global reductions are adopted for communicating updated data in the overlapping regions between neighboring subdomains and getting the latest front status, respectively. The independence of front characteristics is exploited through special data structures and augmented status tags to extract the masked parallelism within the fast marching method. The efficiency, flexibility, and applicability of the parallel algorithm are demonstrated through several examples. These problems are extensively tested on grids with up to 1 billion points u...

  19. Compact antenna arrays with wide bandwidth and low sidelobe levels

    Science.gov (United States)

    Strassner, II, Bernd H.

    2014-09-09

    Highly efficient, low cost, easily manufactured SAR antenna arrays with lightweight low profiles, large instantaneous bandwidths and low SLL are disclosed. The array topology provides all necessary circuitry within the available antenna aperture space and between the layers of material that comprise the aperture. Bandwidths of 15.2 GHz to 18.2 GHz, with 30 dB SLLs azimuthally and elevationally, and radiation efficiencies above 40% may be achieved. Operation over much larger bandwidths is possible as well.

  20. DVS-SOFTWARE: An Effective Tool for Applying Highly Parallelized Hardware To Computational Geophysics

    Science.gov (United States)

    Herrera, I.; Herrera, G. S.

    2015-12-01

    Most geophysical systems are macroscopic physical systems. The behavior prediction of such systems is carried out by means of computational models whose basic mathematical models are partial differential equations (PDEs) [1]. Due to the enormous size of the discretized version of such PDEs, it is necessary to apply highly parallelized supercomputers. For them, at present, the most efficient software is based on non-overlapping domain decomposition methods (DDM). However, a limiting feature of the present state-of-the-art techniques is due to the kind of discretizations used in them. Recently, I. Herrera and co-workers, using 'non-overlapping discretizations', have produced the DVS-Software which overcomes this limitation [2]. The DVS-Software can be applied to a great variety of geophysical problems and achieves very high parallel efficiencies (90%, or so [3]). It is therefore very suitable for effectively applying the most advanced parallel supercomputers available at present. In a parallel talk at this AGU Fall Meeting, Graciela Herrera Z. will present how this software is being applied to advance MOD-FLOW. Key Words: Parallel Software for Geophysics, High Performance Computing, HPC, Parallel Computing, Domain Decomposition Methods (DDM). References: [1] Herrera Ismael and George F. Pinder, "Mathematical Modelling in Science and Engineering: An axiomatic approach", John Wiley, 243p., 2012. [2] Herrera, I., de la Cruz L.M. and Rosas-Medina A. "Non Overlapping Discretization Methods for Partial Differential Equations". NUMER METH PART D E, 30: 1427-1454, 2014, DOI 10.1002/num.21852. (Open source) [3] Herrera, I., & Contreras Iván "An Innovative Tool for Effectively Applying Highly Parallelized Software To Problems of Elasticity". Geofísica Internacional, 2015 (In press)

  1. Semantic-Aware Automatic Parallelization of Modern Applications Using High-Level Abstractions

    Energy Technology Data Exchange (ETDEWEB)

    Liao, C; Quinlan, D J; Willcock, J J; Panas, T

    2009-12-21

    Automatic introduction of OpenMP for sequential applications has attracted significant attention recently because of the proliferation of multicore processors and the simplicity of using OpenMP to express parallelism for shared-memory systems. However, most previous research has only focused on C and Fortran applications operating on primitive data types. Modern applications using high-level abstractions, such as C++ STL containers and complex user-defined class types, are largely ignored due to the lack of research compilers that are readily able to recognize high-level object-oriented abstractions and leverage their associated semantics. In this paper, we use a source-to-source compiler infrastructure, ROSE, to explore compiler techniques to recognize high-level abstractions and to exploit their semantics for automatic parallelization. Several representative parallelization candidate kernels are used to study semantic-aware parallelization strategies for high-level abstractions, combined with extended compiler analyses. Preliminary results have shown that semantics of abstractions can help extend the applicability of automatic parallelization to modern applications and expose more opportunities to take advantage of multicore processors.

  2. High-performance Scientific Computing using Parallel Computing to Improve Performance Optimization Problems

    Directory of Open Access Journals (Sweden)

    Florica Novăcescu

    2011-10-01

    Full Text Available HPC (High Performance Computing) has become essential for accelerating innovation and assisting companies in creating new inventions, better models and more reliable products, as well as obtaining processes and services at low cost. This paper focuses particularly on describing the field of high-performance scientific computing, parallel computing, scientific computing, parallel computers, and trends in the HPC field; the material presented here reveals important new directions toward the realization of a high-performance computational society. The practical part of the work is an example of using an HPC tool to accelerate the solution of an electrostatic optimization problem with the Parallel Computing Toolbox, which allows solving computational and data-intensive problems using MATLAB and Simulink on multicore and multiprocessor computers.

  3. Optimal resource allocation in random networks with transportation bandwidths

    Science.gov (United States)

    Yeung, C. H.; Wong, K. Y. Michael

    2009-03-01

    We apply statistical physics to study the task of resource allocation in random sparse networks with limited bandwidths for the transportation of resources along the links. Recursive relations from the Bethe approximation are converted into useful algorithms. Bottlenecks emerge when the bandwidths are small, causing an increase in the fraction of idle links. For a given total bandwidth per node, the efficiency of allocation increases with the network connectivity. In the high connectivity limit, we find a phase transition at a critical bandwidth, above which clusters of balanced nodes appear, characterized by a profile of homogenized resource allocation similar to the Maxwell construction.

  4. The Parallel Curriculum: A Design To Develop High Potential and Challenge High-Ability Learners.

    Science.gov (United States)

    Tomlinson, Carol Ann; Kaplan, Sandra N.; Renzulli, Joseph S.; Purcell, Jeanne; Leppien, Jann; Burns, Deborah

    This book presents a model of curriculum development for gifted students and offers four parallel approaches that focus on ascending intellectual demand as students develop expertise in learning. The parallel curriculum's four approaches include: (1) the core or basic curriculum; (2) the curriculum of connections, which expands on the core…

  5. High performance parallel computing of large eddy simulation of the flow in a curved duct with square cross section

    Institute of Scientific and Technical Information of China (English)

    樊洪明; 黄伟; 魏英杰

    2004-01-01

    Large eddy simulation (LES), combined with a high-performance parallel computing method, is applied in this paper to simulate the flow in a curved duct with a square cross section. The method consists of parallel domain decomposition of grids, creation of a virtual diagonal bordered matrix, assembly of the boundary matrix, parallel LDLT decomposition, parallel solution of the Poisson equation, parallel estimation of convergence, and so on. The parallel computing method can solve problems that are difficult to solve using traditional serial computing. Furthermore, existing microcomputers can be fully used to resolve some large-scale problems of complex turbulent flow.

  6. Plasmonics and the parallel programming problem

    Science.gov (United States)

    Vishkin, Uzi; Smolyaninov, Igor; Davis, Chris

    2007-02-01

    While many parallel computers have been built, it has generally been too difficult to program them. Now, all computers are effectively becoming parallel machines. Biannual doubling in the number of cores on a single chip, or faster, over the coming decade is planned by most computer vendors. Thus, the parallel programming problem is becoming more critical. The only known solution to the parallel programming problem in the theory of computer science is through a parallel algorithmic theory called PRAM. Unfortunately, some of the PRAM theory assumptions regarding the bandwidth between processors and memories did not properly reflect a parallel computer that could be built in previous decades. Reaching memories, or other processors in a multi-processor organization, required off-chip connections through pins on the boundary of each electric chip. Using the number of transistors that is becoming available on chip, on-chip architectures that adequately support the PRAM are becoming possible. However, the bandwidth of off-chip connections remains insufficient and the latency remains too high. This creates a bottleneck at the boundary of the chip for a PRAM-On-Chip architecture. This also prevents scalability to larger "supercomputing" organizations spanning across many processing chips that can handle massive amounts of data. Instead of connections through pins and wires, power-efficient CMOS-compatible on-chip conversion to plasmonic nanowaveguides is introduced for improved latency and bandwidth. Proper incorporation of our ideas offer exciting avenues to resolving the parallel programming problem, and an alternative way for building faster, more useable and much more compact supercomputers.

  7. Upgrade trigger: Bandwidth strategy proposal

    CERN Document Server

    Fitzpatrick, Conor; Meloni, Simone; Boettcher, Thomas Julian; Whitehead, Mark Peter; Dziurda, Agnieszka; Vesterinen, Mika Anton

    2017-01-01

    This document describes a selection strategy for the upgrade trigger using charm signals as a benchmark. The Upgrade trigger uses a 'Run 2-like' sequence consisting of a first and second stage, in between which the calibration and alignment is performed. The first stage, HLT1, uses an inclusive strategy to select beauty and charm decays, while the second stage uses offline-quality exclusive selections. A novel genetic algorithm-based bandwidth division is performed at the second stage to distribute the output bandwidth among different physics channels, maximising the efficiency for useful physics events. The performance is then studied as a function of the available output bandwidth.
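
    The genetic-algorithm-based bandwidth division mentioned above is not reproduced here, but the idea can be sketched with a toy optimiser that redistributes a fixed output bandwidth across a few channels so that a purely invented, saturating efficiency model is maximised; the channel names, bandwidth budget and efficiency curve below are assumptions, not LHCb figures.

      # Toy genetic-algorithm bandwidth division (illustrative only; the channel
      # names, budget and efficiency model below are invented, not LHCb numbers).
      import random

      channels = ["D0->Kpi", "D+->Kpipi", "Ds->KKpi"]      # hypothetical channels
      total_bw = 100.0                                     # arbitrary units

      def efficiency(split):
          # Assumed saturating yield curve: more bandwidth -> more signal,
          # with diminishing returns.  A stand-in for trigger simulation.
          return sum(b / (b + 10.0) for b in split)

      def random_split():
          cuts = sorted(random.uniform(0, total_bw) for _ in channels[:-1])
          edges = [0.0] + cuts + [total_bw]
          return [edges[i + 1] - edges[i] for i in range(len(channels))]

      def mutate(split):
          i, j = random.sample(range(len(split)), 2)
          delta = random.uniform(0, split[i])
          child = list(split)
          child[i] -= delta
          child[j] += delta
          return child

      population = [random_split() for _ in range(50)]
      for generation in range(200):
          population.sort(key=efficiency, reverse=True)
          parents = population[:10]                        # keep the fittest
          population = parents + [mutate(random.choice(parents)) for _ in range(40)]

      best = max(population, key=efficiency)
      print(dict(zip(channels, (round(b, 1) for b in best))))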

  8. 7th International Workshop on Parallel Tools for High Performance Computing

    CERN Document Server

    Gracia, José; Nagel, Wolfgang; Resch, Michael

    2014-01-01

    Current advances in High Performance Computing (HPC) increasingly impact efficient software development workflows. Programmers for HPC applications need to consider trends such as increased core counts, multiple levels of parallelism, reduced memory per core, and I/O system challenges in order to derive well performing and highly scalable codes. At the same time, the increasing complexity adds further sources of program defects. While novel programming paradigms and advanced system libraries provide solutions for some of these challenges, appropriate supporting tools are indispensable. Such tools aid application developers in debugging, performance analysis, or code optimization and therefore make a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools presented and discussed at the 7th International Parallel Tools Workshop, held in Dresden, Germany, September 3-4, 2013.  

  9. 8th International Workshop on Parallel Tools for High Performance Computing

    CERN Document Server

    Gracia, José; Knüpfer, Andreas; Resch, Michael; Nagel, Wolfgang

    2015-01-01

    Numerical simulation and modelling using High Performance Computing has evolved into an established technique in academic and industrial research. At the same time, the High Performance Computing infrastructure is becoming ever more complex. For instance, most of the current top systems around the world use thousands of nodes in which classical CPUs are combined with accelerator cards in order to enhance their compute power and energy efficiency. This complexity can only be mastered with adequate development and optimization tools. Key topics addressed by these tools include parallelization on heterogeneous systems, performance optimization for CPUs and accelerators, debugging of increasingly complex scientific applications, and optimization of energy usage in the spirit of green IT. This book represents the proceedings of the 8th International Parallel Tools Workshop, held October 1-2, 2014 in Stuttgart, Germany – which is a forum to discuss the latest advancements in the parallel tools.

  10. High-performance Parallel Solver for Integral Equations of Electromagnetics Based on Galerkin Method

    CERN Document Server

    Kruglyakov, Mikhail

    2015-01-01

    A new parallel solver for the volumetric integral equations (IE) of electrodynamics is presented. The solver is based on the Galerkin method, which ensures a convergent numerical solution. The main features include: 1) a twofold reduction in memory usage compared to analogous IE-based algorithms, without additional restrictions on the background media; 2) an accurate and stable method to compute the matrix coefficients corresponding to the IE; 3) a high degree of parallelism. The solver's computational efficiency is shown on a problem of magnetotelluric sounding of high conductivity contrast media. Good agreement with results obtained with a second-order finite element method is demonstrated. Due to an effective approach to parallelization and distributed data storage, the program exhibits perfect scalability on different hardware platforms.

  11. High-performance parallel image reconstruction for the New Vacuum Solar Telescope

    Science.gov (United States)

    Li, Xue-Bao; Liu, Zhong; Wang, Feng; Jin, Zhen-Yu; Xiang, Yong-Yuan; Zheng, Yan-Fang

    2015-06-01

    Many technologies have been developed to help improve spatial resolution of observational images for ground-based solar telescopes, such as adaptive optics (AO) systems and post-processing reconstruction. As any AO system correction is only partial, it is indispensable to use post-processing reconstruction techniques. In the New Vacuum Solar Telescope (NVST), a speckle-masking method is used to achieve the diffraction-limited resolution of the telescope. Although the method is very promising, the computation is quite intensive, and the amount of data is tremendous, requiring several months to reconstruct observational data of one day on a high-end computer. To accelerate image reconstruction, we parallelize the program package on a high-performance cluster. We describe parallel implementation details for several reconstruction procedures. The code is written in the C language using the Message Passing Interface (MPI) and is optimized for parallel processing in a multiprocessor environment. We show the excellent performance of parallel implementation, and the whole data processing speed is about 71 times faster than before. Finally, we analyze the scalability of the code to find possible bottlenecks, and propose several ways to further improve the parallel performance. We conclude that the presented program is capable of executing reconstruction applications in real-time at NVST.

  12. Simulation and Analysis of Router Buffer Requirements in High Bandwidth-Delay Networks

    Institute of Scientific and Technical Information of China (English)

    王建新; 李春泉; 黄家玮

    2009-01-01

    In order to meet the requirement for router buffer size in high bandwidth-delay networks, five typical buffer-sizing methods based on the TCP model are analyzed via the NS2 simulation, and the effects of various high-speed TCP protocols and active queue management (AQM) mechanisms on the buffer-sizing methods in high bandwidth-delay networks are discussed in detail. Simulated results show that: (1) the buffer-sizing methods based on different assumptions adapt to different network environments; (2) the validity of the existing cache mechanisms depends on the ratio of the bandwidth-delay product to the flow number; and (3) when high-speed TCP protocols and AQM mechanisms are used in high bandwidth-delay networks, the buffer size is greatly reduced.
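
    For orientation, the dependence on the ratio of the bandwidth-delay product to the flow number can be made concrete with the two classical sizing rules from the buffer-sizing literature, the rule-of-thumb B = C·RTT and the small-buffer rule B = C·RTT/√N; the link speed, RTT and flow count below are assumed values, not parameters from the paper.

      # Two classical buffer-sizing rules (numbers below are assumed, not taken
      # from the paper): the bandwidth-delay product rule B = C*RTT and the
      # small-buffer rule B = C*RTT/sqrt(N).
      from math import sqrt

      C   = 10e9      # link capacity in bit/s (assumed 10 Gbit/s)
      RTT = 0.1       # round-trip time in seconds (assumed 100 ms)
      N   = 10000     # number of concurrent long-lived TCP flows (assumed)

      bdp       = C * RTT                # bits of buffering, rule-of-thumb
      small_buf = C * RTT / sqrt(N)      # bits, small-buffer rule

      print(f"BDP rule    : {bdp / 8 / 1e6:8.2f} MB")
      print(f"BDP/sqrt(N) : {small_buf / 8 / 1e6:8.2f} MB")
      # With these assumptions the required buffer drops from 125 MB to 1.25 MB,
      # illustrating why the BDP-to-flow-count ratio matters in high
      # bandwidth-delay networks.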

  13. Note: Expanding the bandwidth of the ultra-low current amplifier using an artificial negative capacitor

    Energy Technology Data Exchange (ETDEWEB)

    Xie, Kai, E-mail: kaixie@mail.xidian.edu.cn; Liu, Yan; Li, XiaoPing [School of Aerospace Science and Technology, Xidian University, Xi’an 710071 (China); Guo, Lixin [School of Physics and Optoelectronic Engineering, Xidian University, Xi’an 710071 (China); Zhang, Hanlu [School of Communication & Information Engineering, Xi’an University of Posts & Telecommunication, Xi’an 710121 (China)

    2016-04-15

    The bandwidth and low-noise requirements of an ultra-low current amplifier often conflict, because an inevitable parasitic capacitance appears in parallel with the high-value feedback resistor. In order to expand the amplifier's bandwidth, a novel approach is proposed that introduces an artificial negative capacitor to cancel the parasitic capacitance. The theory of the negative capacitance and the performance of the improved amplifier circuit with the negative capacitor are presented in this manuscript. The test was conducted by modifying an ultra-low current amplifier with a trans-impedance gain of 50 GΩ. The results show that the maximum bandwidth was expanded from 18.7 Hz to 3.3 kHz, an increase of more than 150 times, when the parasitic capacitance (∼0.17 pF) was cancelled. Meanwhile, the rise time decreased from 18.7 ms to 0.26 ms with no overshoot. Any desired bandwidth or rise time within these ranges can be obtained by adjusting the degree of cancellation between the parasitic and negative capacitances. This approach is especially suitable for applications demanding rapid response to weak currents, such as transient ion-beam detectors, mass spectrometry, and fast scanning microscopy.
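
    The quoted numbers are consistent with the usual single-pole model of a transimpedance stage, in which the feedback resistor R_f and its parasitic capacitance C_p set the closed-loop bandwidth; the short check below treats the amplifier as ideal, which is an assumption.

      # Single-pole estimate of the uncompensated amplifier's bandwidth and rise
      # time, using the values quoted in the abstract (ideal-amplifier assumption).
      from math import pi

      R_f = 50e9       # feedback resistance, 50 GOhm
      C_p = 0.17e-12   # parasitic capacitance, ~0.17 pF

      f_3db = 1.0 / (2 * pi * R_f * C_p)   # ~18.7 Hz, matching the quoted value
      t_rise = 0.35 / f_3db                # first-order 10-90% rise time, ~18.7 ms

      print(f"f_3dB ~ {f_3db:.1f} Hz, rise time ~ {t_rise * 1e3:.1f} ms")
      # Cancelling C_p with an equal negative capacitance removes this pole, which
      # is how the reported bandwidth extension into the kHz range is obtained.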

  14. Design of the Trap Filter for the High Power Converters with Parallel Interleaved VSCs

    DEFF Research Database (Denmark)

    Gohil, Ghanshyamsinh Vijaysinh; Bede, Lorand; Teodorescu, Remus

    2014-01-01

    . Therefore, large filter components are often required in order to meet the stringent grid code requirements imposed by the utility. As a result, the size, weight and cost of the overall system increase. The use of interleaved carriers of the parallel connected VSCs, along with the high order line filter...

  15. High performance domain decomposition methods on massively parallel architectures with FreeFEM++

    OpenAIRE

    Jolivet, Pierre; Dolean, Victorita; Hecht, Frédéric; Nataf, Frédéric; Prud'Homme, Christophe; Spillane, Nicole

    2012-01-01

    International audience; In this document, we present a parallel implementation in Freefem++ of scalable two-level domain decomposition methods. Numerical studies with highly heterogeneous problems are then performed on large clusters in order to assert the performance of our code.

  16. High performance parallelism pearls 2 multicore and many-core programming approaches

    CERN Document Server

    Jeffers, Jim

    2015-01-01

    High Performance Parallelism Pearls Volume 2 offers another set of examples that demonstrate how to leverage parallelism. Similar to Volume 1, the techniques included here explain how to use processors and coprocessors with the same programming - illustrating the most effective ways to combine Xeon Phi coprocessors with Xeon and other multicore processors. The book includes examples of successful programming efforts, drawn from across industries and domains such as biomed, genetics, finance, manufacturing, imaging, and more. Each chapter in this edited work includes detailed explanations of t

  17. Computational performance of a parallelized high-order spectral and mortar element toolbox

    CERN Document Server

    Bouffanais, Roland; Gruber, Ralf; Deville, Michel O

    2007-01-01

    In this paper, a comprehensive performance review of a MPI-based high-order spectral and mortar element method C++ toolbox is presented. The focus is put on the performance evaluation of several aspects with a particular emphasis on the parallel efficiency. The performance evaluation is analyzed and compared to predictions given by a heuristic model, the so-called Gamma model. A tailor-made CFD computation benchmark case is introduced and used to carry out this review, stressing the particular interest for commodity clusters. Conclusions are drawn from this extensive series of analyses and modeling leading to specific recommendations concerning such toolbox development and parallel implementation.

  18. Data flow analysis of a highly parallel processor for a level 1 pixel trigger

    Energy Technology Data Exchange (ETDEWEB)

    Cancelo, G. [Fermi National Accelerator Laboratory (FNAL), Batavia, IL (United States); Gottschalk, Erik Edward [Fermi National Accelerator Laboratory (FNAL), Batavia, IL (United States); Pavlicek, V. [Fermi National Accelerator Laboratory (FNAL), Batavia, IL (United States); Wang, M. [Fermi National Accelerator Laboratory (FNAL), Batavia, IL (United States); Wu, J. [Fermi National Accelerator Laboratory (FNAL), Batavia, IL (United States)

    2003-01-01

    The present work describes the architecture and data flow analysis of a highly parallel processor for the Level 1 Pixel Trigger for the BTeV experiment at Fermilab. First the Level 1 Trigger system is described. Then the major components are analyzed by resorting to mathematical modeling. Also, behavioral simulations are used to confirm the models. Results from modeling and simulations are fed back into the system in order to improve the architecture, eliminate bottlenecks, allocate sufficient buffering between processes and obtain other important design parameters. An interesting feature of the current analysis is that the models can be extended to a large class of architectures and parallel systems.

  19. Programming Environment for a High-Performance Parallel Supercomputer with Intelligent Communication

    Directory of Open Access Journals (Sweden)

    A. Gunzinger

    1996-01-01

    Full Text Available At the Electronics Laboratory of the Swiss Federal Institute of Technology (ETH) in Zürich, the high-performance parallel supercomputer MUSIC (MUlti processor System with Intelligent Communication) has been developed. As applications like neural network simulation and molecular dynamics show, the Electronics Laboratory supercomputer is absolutely on par with those of conventional supercomputers, but electric power requirements are reduced by a factor of 1,000, weight is reduced by a factor of 400, and price is reduced by a factor of 100. Software development is a key issue of such parallel systems. This article focuses on the programming environment of the MUSIC system and on its applications.

  20. High performance word level sequential and parallel coding methods and architectures for bit plane coding

    Institute of Scientific and Technical Information of China (English)

    XIONG ChengYi; TIAN JinWen; LIU Jian

    2008-01-01

    This paper introduces a novel high-performance algorithm and VLSI architectures for bit plane coding (BPC) in word-level sequential and parallel modes. The proposed BPC algorithm adopts coding pass prediction together with parallel and pipelined processing to reduce the number of memory accesses and to increase the concurrency of the system, so that all the coefficient bits of a code block can be coded in a single scan. A new parallel bit plane architecture (PA) is proposed to achieve word-level sequential coding. Moreover, an efficient high-speed architecture (HA) is presented to achieve multi-word parallel coding. Compared to the state of the art, the proposed PA reduces hardware cost more efficiently, although the throughput remains one coefficient coded per clock. The proposed HA can code 4 coefficients belonging to a stripe column in one intra-clock cycle, so that coding an N×N code block can be completed in approximately N²/4 intra-clock cycles. Theoretical analysis and experimental results demonstrate that the proposed designs have a high throughput rate and a good speed-to-cost ratio, which makes them good alternatives for low-power applications.
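
    To make the bit-plane terminology concrete, the short fragment below decomposes a small block of quantized coefficient magnitudes into its bit planes, which is the object the coder above processes plane by plane; it shows only the decomposition, not the pass prediction or the word-level parallel architectures.

      # Concrete view of what a "bit plane" of a code block is: plane p holds
      # bit p of every coefficient magnitude, and the coder processes planes
      # from most to least significant.  Decomposition only, not the coder.
      import numpy as np

      block = np.array([[13, -7,  0, 22],
                        [ 5, -1, 18, -9],
                        [ 0,  3, -6, 12],
                        [ 7, 20, -2,  1]])

      magnitude = np.abs(block)
      num_planes = int(magnitude.max()).bit_length()       # 5 planes needed here

      for p in range(num_planes - 1, -1, -1):
          plane = (magnitude >> p) & 1
          print(f"bit plane {p}:\n{plane}")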

  1. High performance parallel computers for science: New developments at the Fermilab advanced computer program

    Energy Technology Data Exchange (ETDEWEB)

    Nash, T.; Areti, H.; Atac, R.; Biel, J.; Cook, A.; Deppe, J.; Edel, M.; Fischler, M.; Gaines, I.; Hance, R.

    1988-08-01

    Fermilab's Advanced Computer Program (ACP) has been developing highly cost effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor has been developed for theoretical physics. Each $4000 node is a FORTRAN or C programmable pipelined 20 MFlops (peak), 10 MByte single board computer. These are plugged into a 16 port crossbar switch crate which handles both inter and intra crate communication. The crates are connected in a hypercube. Site oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256-node, 5 GFlop system is under construction. 10 refs., 7 figs.

  2. Load balancing in highly parallel processing of Monte Carlo code for particle transport

    Energy Technology Data Exchange (ETDEWEB)

    Higuchi, Kenji; Takemiya, Hiroshi [Japan Atomic Energy Research Inst., Tokyo (Japan); Kawasaki, Takuji [Fuji Research Institute Corporation, Tokyo (Japan)

    2001-01-01

    In parallel processing of Monte Carlo (MC) codes for neutron, photon and electron transport problems, particle histories are assigned to processors by exploiting the independence of the calculation for each particle. Although the main part of an MC code can easily be parallelized by this method, it is necessary, yet practically difficult, to optimize the code with respect to load balancing in order to attain a high speedup ratio in highly parallel processing. In fact, the speedup ratio on 128 processors remains at only about one hundred on the test bed used for the performance evaluation. Through parallel processing of the MCNP code, which is widely used in the nuclear field, it is shown that it is difficult to attain high performance with static load balancing, especially in neutron transport problems, and that a load balancing method which dynamically changes the number of assigned particles to minimize the sum of the computational and communication costs overcomes this difficulty, resulting in nearly a fifteen percent reduction in execution time. (author)
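
    As a rough illustration of the dynamic strategy described above (this is a generic sketch, not the scheme actually implemented for MCNP), the following fragment reassigns particle histories in proportion to each processor's measured throughput after every batch:

      # Illustrative dynamic load balancing for Monte Carlo particle histories:
      # after each batch, histories are reassigned in proportion to the measured
      # processing rate of each processor.  A sketch of the general idea only.
      def rebalance(histories_per_batch, batch_times, previous_assignment):
          # batch_times[i]: wall time processor i needed for its last assignment.
          rates = [n / t for n, t in zip(previous_assignment, batch_times)]
          total_rate = sum(rates)
          assignment = [int(histories_per_batch * r / total_rate) for r in rates]
          assignment[0] += histories_per_batch - sum(assignment)  # fix rounding
          return assignment

      # Example: 4 processors, one of them twice as slow as the others.
      prev = [25000, 25000, 25000, 25000]
      times = [10.0, 10.0, 10.0, 20.0]
      print(rebalance(100000, times, prev))   # roughly [28571, 28571, 28571, 14285]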

  3. Highly accelerated cardiac cine parallel MRI using low-rank matrix completion and partial separability model

    Science.gov (United States)

    Lyu, Jingyuan; Nakarmi, Ukash; Zhang, Chaoyi; Ying, Leslie

    2016-05-01

    This paper presents a new approach to highly accelerated dynamic parallel MRI using low-rank matrix completion and the partial separability (PS) model. In data acquisition, the k-space data is moderately randomly undersampled at the central k-space navigator locations, but highly undersampled in the outer k-space for each temporal frame. In reconstruction, the navigator data is reconstructed from the undersampled data using structured low-rank matrix completion. After all the unacquired navigator data is estimated, the partial separability model is used to obtain partial k-t data. Then the parallel imaging method is used to obtain the entire dynamic image series from the highly undersampled data. The proposed method has been shown to achieve high quality reconstructions with reduction factors of up to 31 and a temporal resolution of 29 ms, where the conventional PS method fails.

  4. DESIGN OF HIGH-SPEED PARALLEL IN PARALLEL OUT SHIFT REGISTER USING ALGAAS/GAAS MODFET TECHNOLOGY

    Directory of Open Access Journals (Sweden)

    V. Ganesan

    2014-01-01

    Full Text Available This study presents the design and analysis of a Parallel In Parallel Out (PIPO) shift register using AlGaAs/GaAs MODFET D flip-flops. Transient and power analyses are obtained at an operating voltage of 1.3 V for the D flip-flop and the PIPO shift register using the PSpice tool. Integrating large numbers of transistors raises many issues, such as delay, power dissipation, and transistor scaling. To address these problems, AlGaAs/GaAs MODFETs, which have promising applications in the field of electronics, are considered. Simulations are performed, and the power consumption and delay are compared with a conventional MOSFET design; the comparison shows that the MODFET-based design achieves efficient power savings.

  5. High-Dimensional Data Visualization by Interactive Construction of Low-Dimensional Parallel Coordinate Plots

    OpenAIRE

    Itoh, Takayuki; Kumar, Ashnil; Klein, Karsten; Kim, Jinman

    2016-01-01

    Parallel coordinate plots (PCPs) are among the most useful techniques for the visualization and exploration of high-dimensional data spaces. They are especially useful for the representation of correlations among the dimensions, which identify relationships and interdependencies between variables. However, within these high-dimensional spaces, PCPs face difficulties in displaying the correlation between combinations of dimensions and generally require additional display space as the number of...
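
    For readers unfamiliar with the technique, a minimal, non-interactive parallel coordinate plot can be produced with pandas' built-in helper; the iris data set is used purely as an example and this snippet does not reproduce the interactive construction method proposed above.

      # Minimal (non-interactive) parallel coordinate plot, as a point of
      # reference for the technique discussed above.  The iris data set and the
      # colormap are arbitrary choices.
      import matplotlib.pyplot as plt
      from pandas.plotting import parallel_coordinates
      from sklearn.datasets import load_iris

      iris = load_iris(as_frame=True)
      df = iris.frame.rename(columns={"target": "species"})

      parallel_coordinates(df, class_column="species", colormap="viridis")
      plt.title("Parallel coordinate plot (static example)")
      plt.show()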

  6. Using Heterogeneous High Performance Computing Cluster for Supporting Fine-Grained Parallel Applications

    Science.gov (United States)

    2006-10-01

    Beowulf cluster made of COTS PCs (featuring dual-processor Xeons) interconnected via a Gigabit Ethernet network and a Myrinet network [Boden 1995]. AFRL-IF-RS-TR-2006-313, Final Technical Report, October 2006. Grant number: FA8750-05-1-0130.

  7. Design of variable loop bandwidth high sensitivity micro-satellite receiver

    Institute of Scientific and Technical Information of China (English)

    张朝杰; 金小军; 杨伟君; 金仲和

    2011-01-01

    The low transmit power and small antenna gain of micro-satellites demand high receiver sensitivity and a high dynamic range from onboard transponders. A variable loop bandwidth receiver architecture based on an all-digital carrier recovery loop and an I/Q sub-sampling technique is presented. Coherent automatic gain control (AGC) is used to control the loop bandwidth: the bandwidth is widened at high signal-to-noise ratio to achieve better tracking performance and narrowed at low signal-to-noise ratio to obtain higher receiver sensitivity. A receiver sensitivity of -144 dBm is achieved, and the dynamic range is better than 80 dB with a 250 Hz loop bandwidth.

  8. A digital calibration technique for an ultra high-speed wide-bandwidth folding and interpolating analog-to-digital converter in 0.18-μm CMOS technology*

    Institute of Scientific and Technical Information of China (English)

    Yu Jinshan; Zhang Ruitao; Zhang Zhengping; Wang Yonglu; Zhu Can; Zhang Lei; Yu Zhou; Han Yong

    2011-01-01

    A digital calibration technique for an ultra high-speed folding and interpolating analog-to-digital converter in 0.18-μm CMOS technology is presented. Similar digital calibration techniques are applied to the high 3-bit flash converter and the low 5-bit folding and interpolating converter, based on a well-designed calibration reference, a calibration DAC and comparators. SPICE simulation and measured results show that the ADC achieves 5.9 ENOB with calibration disabled and 7.2 ENOB with calibration enabled for a high-frequency, wide-bandwidth analog input.

  9. Black Holes, Bandwidths and Beethoven

    CERN Document Server

    Kempf, A

    2000-01-01

    It is usually believed that a function whose Fourier spectrum is bounded can vary at most as fast as its highest frequency component. This is in fact not the case, as Aharonov, Berry and others drastically demonstrated with explicit counter examples, so-called superoscillations. The claim is that even the recording of an entire Beethoven symphony can occur as part of a signal with 1Hz bandwidth. Superoscillations have been suggested to account e.g. for transplanckian frequencies of black hole radiation. Here, we give an exact proof for generic superoscillations. Namely, we show that for every fixed bandwidth there exist functions which pass through any finite number of arbitrarily prespecified points. Further, we show that the behavior of bandlimited functions can be reliably characterized through an uncertainty relation for the standard deviation of the signals' samples taken at the Nyquist rate. This uncertainty relation generalizes to time-varying bandwidths.
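
    The central claim, that a function of fixed bandwidth can be forced through any finite set of prescribed points, can be illustrated numerically by expanding the function in shifted sinc kernels and solving a small linear system; the band limit and the prescribed points below are arbitrary choices, and the construction is only an illustration, not the proof given in the paper.

      # Numerical illustration: a bandlimited function (band limit w_max) passing
      # through prescribed points that vary far faster than 1/w_max would suggest.
      # Illustration only; not the paper's proof.
      import numpy as np

      w_max = 2 * np.pi * 1.0                      # 1 Hz band limit (assumed)
      t_pts = np.array([0.0, 0.01, 0.02, 0.03])    # prescribed sample times (s)
      y_pts = np.array([0.0, 1.0, -1.0, 1.0])      # prescribed, rapidly varying values

      def kernel(t, tk):
          # shifted sinc kernel with band limit w_max; np.sinc(x) = sin(pi x)/(pi x)
          return np.sinc(w_max * (t - tk) / np.pi)

      A = kernel(t_pts[:, None], t_pts[None, :])   # Gram matrix of the kernel
      c = np.linalg.solve(A, y_pts)                # expansion coefficients

      f = lambda t: kernel(np.atleast_1d(t)[:, None], t_pts[None, :]) @ c
      print(f(t_pts))   # reproduces y_pts, i.e. order-one swings within 30 ms,
                        # although the band limit allows only ~1 cycle per second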

  10. A Parallel Multigrid Solver for High Frequency Electromagnetic Field Analyses with Small-scale PC Cluster

    Science.gov (United States)

    Yosui, Kuniaki; Iwashita, Takeshi; Mori, Michiya; Kobayashi, Eiichi

    Finite element analyses of electromagnetic fields are commonly used for designing various electronic devices. The scale of the analyses keeps growing; therefore, a fast linear solver is needed to solve the linear equations arising from the finite element method. Since a multigrid solver is the fastest linear solver for these problems, parallelization of a multigrid solver is a very useful approach. From the viewpoint of industrial applications, effective use of a small-scale PC cluster is important because of the initial cost of introducing parallel computers. In this paper, a distributed parallel multigrid solver for a small-scale PC cluster is developed. In high frequency electromagnetic field analyses, a special block Gauss-Seidel smoother is used in the multigrid solver instead of general smoothers such as the Gauss-Seidel or Jacobi smoother in order to improve the convergence rate. The block multicolor ordering technique is applied to parallelize the smoother. A numerical example shows that a 3.7-fold speed-up in computational time and a 3.0-fold increase in the scale of the analysis were attained when the number of CPUs was increased from one to five.
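
    The multicolor-ordering idea used to parallelize the smoother can be illustrated in its simplest two-colour (red-black) form for a scalar Laplace problem; the sketch below is not the special block smoother used for the high-frequency electromagnetic systems in the paper.

      # Two-colour (red-black) Gauss-Seidel sweep for a 2D Laplace problem.
      # All points of one colour depend only on points of the other colour, so
      # each colour can be updated fully in parallel.  Illustration only.
      import numpy as np

      def red_black_gauss_seidel(u, f, h, sweeps=1):
          # u: (n, n) array with boundary values fixed on the outer ring.
          for _ in range(sweeps):
              for colour in (0, 1):
                  for i in range(1, u.shape[0] - 1):
                      # every other interior point in this row has this colour
                      start = 1 + ((i + colour) % 2)
                      j = np.arange(start, u.shape[1] - 1, 2)
                      u[i, j] = 0.25 * (u[i - 1, j] + u[i + 1, j]
                                        + u[i, j - 1] + u[i, j + 1]
                                        + h * h * f[i, j])
          return u

      n = 64
      u = np.zeros((n, n))
      f = np.ones((n, n))
      u = red_black_gauss_seidel(u, f, 1.0 / (n - 1), sweeps=10)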

  11. High Selective Determination of Anionic Surfactant Using Its Parallel Catalytic Hydrogen Wave

    Institute of Scientific and Technical Information of China (English)

    过玮; 何盈盈; 宋俊峰

    2003-01-01

    A faradaic response of anionic surfactants (AS), such as linear alkylbenzene sulfonate (LAS), dodecyl benzene sulfonate and dodecyl sulfate, was observed in a weakly acidic medium. The faradaic response of AS includes (1) a catalytic hydrogen wave of AS in HAc/NaAc buffer that was attributed to the reduction of the proton associated with the sulfo-group of AS, and (2) a parallel catalytic hydrogen wave of AS in the presence of hydrogen peroxide, which was due to the catalysis of the catalytic hydrogen wave of AS by the hydroxyl radical OH electrogenerated in the reduction of hydrogen peroxide. The parallel catalytic hydrogen wave is about 50 times as sensitive as the catalytic hydrogen wave. Based on the parallel catalytic hydrogen wave, a highly selective method for the determination of AS was developed. In 0.1 mol/L HAc/NaAc (pH=6.2±0.1)/1.0×10⁻³ mol/L H2O2 supporting electrolyte, the second-order derivative peak current of the parallel catalytic hydrogen wave located at -1.33 V (vs. SCE) was linear with the AS concentration in the range of 3.0×10⁻⁶-2.5×10⁻⁴ mol/L, without interference from other surfactants. The proposed method was evaluated by quantitative analysis of AS in environmental wastewater.

  12. Parallel Gene Expression Differences between Low and High Latitude Populations of Drosophila melanogaster and D. simulans.

    Science.gov (United States)

    Zhao, Li; Wit, Janneke; Svetec, Nicolas; Begun, David J

    2015-05-01

    Gene expression variation within species is relatively common, however, the role of natural selection in the maintenance of this variation is poorly understood. Here we investigate low and high latitude populations of Drosophila melanogaster and its sister species, D. simulans, to determine whether the two species show similar patterns of population differentiation, consistent with a role for spatially varying selection in maintaining gene expression variation. We compared at two temperatures the whole male transcriptome of D. melanogaster and D. simulans sampled from Panama City (Panama) and Maine (USA). We observed a significant excess of genes exhibiting differential expression in both species, consistent with parallel adaptation to heterogeneous environments. Moreover, the majority of genes showing parallel expression differentiation showed the same direction of differential expression in the two species and the magnitudes of expression differences between high and low latitude populations were correlated across species, further bolstering the conclusion that parallelism for expression phenotypes results from spatially varying selection. However, the species also exhibited important differences in expression phenotypes. For example, the genomic extent of genotype × environment interaction was much more common in D. melanogaster. Highly differentiated SNPs between low and high latitudes were enriched in the 3' UTRs and CDS of the geographically differently expressed genes in both species, consistent with an important role for cis-acting variants in driving local adaptation for expression-related phenotypes.

  13. High-resolution MRI of spinal cords by compressive sensing parallel imaging.

    Science.gov (United States)

    Peng Li; Xiangdong Yu; Griffin, Jay; Levine, Jonathan M; Jim Ji

    2015-08-01

    Spinal Cord Injury (SCI) is a common injury resulting from disease or accidents. Noninvasive imaging methods play a critical role in diagnosing SCI and monitoring the response to therapy. Magnetic Resonance Imaging (MRI), by virtue of providing excellent soft tissue contrast, is the most promising imaging method for this application. However, the spinal cord has a very small cross-section, which requires high-resolution images for better visualization and diagnosis. Acquiring high-resolution spinal cord MRI images requires a long acquisition time due to physical and physiological constraints. Moreover, a long acquisition time makes MRI more susceptible to motion artifacts. In this paper, we study the application of compressive sensing (CS) and parallel imaging to achieve high-resolution imaging from sparsely sampled and reduced k-space data acquired by parallel receive arrays. In particular, the studies are limited to the effects of 2D Cartesian sampling with different subsampling schemes and reduction factors. The results show that compressive sensing parallel MRI has the potential to provide high-resolution images of the spinal cord in 1/3 of the acquisition time required by conventional methods.

  14. Upgrade trigger: Bandwidth strategy proposal

    CERN Document Server

    Boettcher, Thomas Julian; Meloni, Simone; Whitehead, Mark Peter; Williams, Mark Richard James

    2017-01-01

    This document describes a proposed selection strategy for the upgrade trigger using charm signals as a benchmark. The Upgrade trigger uses a 'Run2-like' sequence consisting of a first and second stage, in between which the calibration and alignment is performed. The first stage, HLT1, uses an inclusive strategy to select beauty and charm, while the second stage uses offline-quality exclusive selections. A novel genetic algorithm-based bandwidth division is performed at the second stage to maximise the output of useful physics events, and a range of possible signal efficiencies are presented as a function of the available bandwidth.

  15. Mining Industry Energy Bandwidth Study

    Energy Technology Data Exchange (ETDEWEB)

    none,

    2007-07-01

    The Industrial Technologies Program (ITP) relies on analytical studies to identify large energy reduction opportunities in energy-intensive industries and uses these results to guide its R&D portfolio. The energy bandwidth illustrates the total energy-saving opportunity that exists in the industry if the current processes are improved by implementing more energy-efficient practices and by using advanced technologies. This bandwidth analysis report was conducted to assist the ITP Mining R&D program in identifying energy-saving opportunities in coal, metals, and mineral mining. These opportunities were analyzed in key mining processes of blasting, dewatering, drilling, digging, ventilation, materials handling, crushing, grinding, and separations.

  16. Preconditioning Schur Complement Systems of Highly-Indefinite Linear Systems for a Parallel Hybrid Solve

    Institute of Scientific and Technical Information of China (English)

    I.Yamazaki; X.S.Li; E.G.Ng

    2010-01-01

    A parallel hybrid linear solver based on the Schur complement method has the potential to balance the robustness of direct solvers with the efficiency of preconditioned iterative solvers. However, when solving large-scale highly-indefinite linear systems, this hybrid solver often suffers from either slow convergence or large memory requirements to solve the Schur complement systems. To overcome this challenge, in this paper we discuss techniques to preprocess the Schur complement systems in parallel. Numerical results of solving large-scale highly-indefinite linear systems from various applications demonstrate that these techniques improve the reliability and performance of the hybrid solver and enable efficient solutions of these linear systems on hundreds of processors, which was previously infeasible using existing state-of-the-art solvers.
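
    For context, the Schur complement reduction that such hybrid solvers build on can be written down in a few lines for a dense toy system (interior unknowns eliminated by a direct solve, interface unknowns solved through S); the sketch below does not include the preprocessing techniques that are the subject of the paper.

      # Dense toy illustration of the Schur complement reduction: interior
      # unknowns are eliminated with a direct solve and only the smaller
      # interface system S is solved.  Sizes and matrix are arbitrary.
      import numpy as np

      rng = np.random.default_rng(0)
      n_i, n_b = 8, 3                       # interior / interface sizes (toy)
      dim = n_i + n_b
      A = rng.standard_normal((dim, dim)) + dim * np.eye(dim)
      b = rng.standard_normal(dim)

      A_II, A_IB = A[:n_i, :n_i], A[:n_i, n_i:]
      A_BI, A_BB = A[n_i:, :n_i], A[n_i:, n_i:]
      b_I, b_B = b[:n_i], b[n_i:]

      # Schur complement system for the interface unknowns x_B.
      S = A_BB - A_BI @ np.linalg.solve(A_II, A_IB)
      g = b_B - A_BI @ np.linalg.solve(A_II, b_I)
      x_B = np.linalg.solve(S, g)
      # Back-substitute for the interior unknowns x_I.
      x_I = np.linalg.solve(A_II, b_I - A_IB @ x_B)

      x = np.concatenate([x_I, x_B])
      print(np.allclose(A @ x, b))          # True: the split solve matches A x = b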

  17. Bandwidth Partitioning in Decentralized Wireless Networks

    CERN Document Server

    Jindal, Nihar; Weber, Steven

    2007-01-01

    This paper addresses the following question, which is of interest in the design of a multiuser decentralized network. Given a total system bandwidth of W Hz and a fixed data rate constraint of R bps for each transmission, how many frequency slots N of size W/N should the band be partitioned into in order to maximize the number of simultaneous links in the network? Dividing the available spectrum results in two competing effects. On the positive side, a larger N allows for more parallel, non-interfering communications to take place in the same area. On the negative side, a larger N increases the SINR requirement for each link because the same information rate must be achieved over less bandwidth, which in turn increases the area consumed by each transmission. Exploring this tradeoff and determining the optimum value of N in terms of the system parameters is the focus of the paper. Using stochastic geometry, the optimal SINR threshold - which directly corresponds to the optimal spectral efficiency - is derived ...
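
    The negative side of the tradeoff can be quantified directly from the Shannon rate constraint R = (W/N)·log2(1 + SINR): keeping the rate fixed over a 1/N-th slice of the band requires SINR ≥ 2^(RN/W) − 1, so the threshold grows exponentially in N. The rate and bandwidth in the short calculation below are assumed values, not taken from the paper.

      # Required SINR per link when the band W is split into N slots and each
      # link must still deliver R bit/s (Shannon rate constraint).  R and W are
      # assumed values, not the paper's parameters.
      import math

      W = 20e6      # total system bandwidth in Hz (assumed)
      R = 2e6       # per-link rate requirement in bit/s (assumed)

      for N in (1, 2, 4, 8, 16):
          sinr_req = 2 ** (R * N / W) - 1
          print(f"N = {N:2d}: required SINR = {sinr_req:8.2f} "
                f"= {10 * math.log10(sinr_req):6.2f} dB")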

  18. Electric-arc synthesis of soot with high content of higher fullerenes in parallel arc

    Science.gov (United States)

    Dutlov, A. E.; Nekrasov, V. M.; Sergeev, A. G.; Bubnov, V. P.; Kareev, I. E.

    2016-12-01

    Soot with a relatively high content of higher fullerenes (C76, C78, C80, C82, C84, C86, etc.) is synthesized in a parallel arc upon evaporation of pure carbon electrodes. The content of higher fullerenes in the soot extract amounts to 13.8 wt % when two electrodes are simultaneously burnt in the electric-arc reactor. This content is comparable with that obtained upon evaporation of composite graphite electrodes with a potassium carbonate impurity.

  19. Programming Environment for a High-Performance Parallel Supercomputer with Intelligent Communication

    OpenAIRE

    A. Gunzinger; Bäumle, B.; Frey, M.; Klebl, M.; Kocheisen, M.; Kohler, P.; Morel, R.; Müller, U.; Rosenthal, M

    1996-01-01

    At the Electronics Laboratory of the Swiss Federal Institute of Technology (ETH) in Zürich, the high-performance parallel supercomputer MUSIC (MUlti processor System with Intelligent Communication) has been developed. As applications like neural network simulation and molecular dynamics show, the Electronics Laboratory supercomputer is absolutely on par with those of conventional supercomputers, but electric power requirements are reduced by a factor of 1,000, weight is reduced by a factor of...

  20. Signal processing in high data rate environments. Design tradeoffs in the exploitation of parallel architectures and fast system clock rates. An overview

    Science.gov (United States)

    Gilbert, B. K.; Kinter, T. M.; Schwab, D. J.; Naused, B. A.; Krueger, L. M.; Rice, K. M.; Lee, F. S.

    1983-11-01

    This conference is exploring the methods by which the emerging very large scale integration (VLSI) technology, i.e., the ability to place more than 10,000 logic gates on a single integrated circuit, can be exploited for the solution of difficult signal processing problems. The following discussion will concentrate on a highly specialized subset of the total signal processing environment, i.e., that small minority of such problems in which a single unprocessed data stream appears at the input of a digital processor in real time and at very high data bandwidths. These high volume data streams must be processed by the front end of the signal processor at clock rates equal to or greater than the rates at which they are delivered; in later stages of processing, it may be possible to partition the single high-speed data stream into a series of lower speed substreams and to institute parallel processing on the substreams. We have been compelled to consider potential solutions to these high data rate problems, and to compare these problems with the capabilities of silicon VLSI, as well as other technologies, with which they may be addressed.

  1. High Performance Computing Based Parallel Hierarchical Modal Association Clustering (HPAR HMAC)

    Energy Technology Data Exchange (ETDEWEB)

    2017-01-12

    For many applications, clustering is a crucial step in order to gain insight into the makeup of a dataset. The best approach to a given problem often depends on a variety of factors, such as the size of the dataset, time restrictions, and soft clustering requirements. The HMAC algorithm seeks to combine the strengths of 2 particular clustering approaches: model-based and linkage-based clustering. One particular weakness of HMAC is its computational complexity. HMAC is not practical for mega-scale data clustering. For high-definition imagery, a user would have to wait months or years for a result; for a 16-megapixel image, the estimated runtime skyrockets to over a decade! To improve the execution time of HMAC, it is reasonable to consider a multi-core implementation that utilizes available system resources. An existing implementation (Ray and Cheng 2014) divides the dataset into N partitions - one for each thread - prior to executing the HMAC algorithm. This implementation benefits from 2 types of optimization: parallelization and divide-and-conquer. By running each partition in parallel, the program is able to accelerate computation by utilizing more system resources. Although the parallel implementation provides considerable improvement over the serial HMAC, it still suffers from poor computational complexity, O(N²). Once the maximum number of cores on a system is exhausted, the program exhibits slower behavior. We now consider a modification to HMAC that involves a recursive partitioning scheme. Our modification aims to exploit the divide-and-conquer benefits seen by the parallel HMAC implementation. At each level in the recursion tree, partitions are divided into 2 sub-partitions until a threshold size is reached. When a partition can no longer be divided without falling below the threshold size, the base HMAC algorithm is applied. This results in a significant speedup over the parallel HMAC.
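
    A minimal sketch of the recursive partitioning scheme described above is shown below; scikit-learn's KMeans merely stands in for the base HMAC step, which is not reproduced here, and the threshold, split heuristic and data are assumptions.

      # Sketch of recursive partitioning with a size threshold: split until a
      # further split would fall below the threshold, then run the base step.
      # KMeans is only a stand-in for the base HMAC algorithm.
      import numpy as np
      from sklearn.cluster import KMeans

      THRESHOLD = 10000          # minimum size a sub-partition is allowed to have

      def base_cluster(points):
          # Stand-in for the base HMAC algorithm on a small partition.
          k = min(4, len(points))
          return KMeans(n_clusters=k, n_init=10).fit_predict(points)

      def recursive_cluster(points, depth=0):
          if len(points) // 2 < THRESHOLD:
              return [(depth, base_cluster(points))]
          # Split along the dimension with the largest spread (simple heuristic).
          dim = np.argmax(points.std(axis=0))
          order = np.argsort(points[:, dim])
          half = len(points) // 2
          left, right = points[order[:half]], points[order[half:]]
          return recursive_cluster(left, depth + 1) + recursive_cluster(right, depth + 1)

      data = np.random.default_rng(1).standard_normal((100000, 16))
      results = recursive_cluster(data)
      print(len(results), "leaf partitions clustered")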

  2. Implementation of High Speed FIR Filter: Performance Comparison with Different Parallel Prefix Adders in FPGAs

    Directory of Open Access Journals (Sweden)

    R. Uma

    2014-04-01

    Full Text Available This study describes the design of a high-speed FIR filter using parallel prefix adders and a factorized multiplier. The fundamental components of any high-speed FIR filter are adders, multipliers and delay elements. To meet the constraints of high-speed performance and low power consumption, parallel prefix adders are well suited. This study focuses on the design of a new Parallel Prefix Adder (PPA) and a new multiplier cell, called the factorized multiplier, with a minimal-depth algorithm; their functional characteristics are compared with existing architectures in terms of delay and area. The performance of the proposed PPA and multiplier is evaluated for bit sizes of 8, 16, 32 and 64. The filter coefficients are obtained through a Hamming window using a MATLAB program. The proposed FIR filter using the new PPA and factorized multiplier has been prototyped on an XC3S1600EFG320 device on the Spartan-3E platform using the Integrated Synthesis Environment (ISE) for a 90 nm process. Nearly 14% slice utilization and a 34% speed improvement have been obtained for the FIR filter using the new PPA and factorized multiplier.
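
    The record above targets an FPGA datapath, but the filter it implements is the standard windowed-sinc FIR design; the NumPy sketch below (used here instead of MATLAB) reproduces only the Hamming-window coefficient generation and the y[n] = Σ h[k]·x[n−k] filtering step, with an arbitrary cutoff and tap count.

      # Software reference model of a Hamming-window FIR low-pass filter.
      # Cutoff frequency and tap count are arbitrary illustrative choices.
      import numpy as np

      def hamming_lowpass(num_taps, cutoff):          # cutoff as fraction of Nyquist
          n = np.arange(num_taps) - (num_taps - 1) / 2.0
          h = np.sinc(cutoff * n)                     # ideal low-pass impulse response
          h *= np.hamming(num_taps)                   # Hamming window
          return h / h.sum()                          # normalise DC gain to 1

      h = hamming_lowpass(num_taps=31, cutoff=0.25)

      # Filter a test signal: a slow tone plus a fast tone; the fast one is removed.
      fs = 1000.0
      t = np.arange(0, 1, 1 / fs)
      x = np.sin(2 * np.pi * 20 * t) + 0.5 * np.sin(2 * np.pi * 300 * t)
      y = np.convolve(x, h, mode="same")              # y[n] = sum_k h[k] * x[n-k]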

  3. High-Bandwidth AFM-Based Rheology Reveals that Cartilage is Most Sensitive to High Loading Rates at Early Stages of Impairment

    Science.gov (United States)

    Nia, Hadi Tavakoli; Bozchalooi, Iman S.; Li, Yang; Han, Lin; Hung, Han-Hwa; Frank, Eliot; Youcef-Toumi, Kamal; Ortiz, Christine; Grodzinsky, Alan

    2013-01-01

    Utilizing a newly developed atomic-force-microscopy-based wide-frequency rheology system, we measured the dynamic nanomechanical behavior of normal and glycosaminoglycan (GAG)-depleted cartilage, the latter representing matrix degradation that occurs at the earliest stages of osteoarthritis. We observed unique variations in the frequency-dependent stiffness and hydraulic permeability of cartilage in the 1 Hz-to-10 kHz range, a frequency range that is relevant to joint motions from normal ambulation to high-frequency impact loading. Measurement in this frequency range is well beyond the capabilities of typical commercial atomic force microscopes. We showed that the dynamic modulus of cartilage undergoes a dramatic alteration after GAG loss, even with the collagen network still intact: whereas the magnitude of the dynamic modulus decreased two- to threefold at higher frequencies, the peak frequency of the phase angle of the modulus (representing fluid-solid frictional dissipation) increased 15-fold from 55 Hz in normal cartilage to 800 Hz after GAG depletion. These results, based on a fibril-reinforced poroelastic finite-element model, demonstrated that GAG loss caused a dramatic increase in cartilage hydraulic permeability (up to 25-fold), suggesting that early osteoarthritic cartilage is more vulnerable to higher loading rates than to the conventionally studied “loading magnitude”. Thus, over the wide frequency range of joint motion during daily activities, hydraulic permeability appears the most sensitive marker of early tissue degradation. PMID:23561529

  4. All-optical bandwidth-tailorable radar

    CERN Document Server

    Zou, Weiwen; Long, Xin; Zhang, Siteng; Cui, Yuanjun; Chen, Jianping

    2015-01-01

    Radar is widely used in military, security, and rescue applications. Metamaterial cloaks are employed on stealth targets to evade radar detection. Hence modern radar should be reconfigurable over multiple bands for detecting stealth targets, which might be realized with microwave photonics. Here, we demonstrate an all-optical bandwidth-tailorable radar architecture. It is a coherent system utilizing one mode-locked laser for both signal generation and reception. Heterodyning of two individually filtered optical pulses that are pre-chirped via wavelength-to-time mapping generates a wideband linearly chirped radar signal. The working bands can be flexibly tailored to the desired bandwidth at a user-preferred carrier frequency. After being modulated onto the pre-chirped optical pulse, radar echoes are time-stretched and frequency-compressed by a factor of several. The digitization becomes much easier without loss of detection ability. We believe that this demonstration can renew radar architectures by providing ultra-high range resolution.
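
    The range-resolution claim rests on pulse compression of a linearly chirped waveform, which can be modelled digitally in a few lines: generate a linear chirp of bandwidth B, delay it, and compress it with a matched filter, giving a pulse of width about 1/B and hence a range resolution of about c/(2B). The photonic generation and reception are not modelled, and all numbers below are assumptions.

      # Toy pulse-compression model (all parameters assumed; optics not modelled).
      import numpy as np

      B = 500e6                      # chirp bandwidth in Hz (assumed)
      T = 2e-6                       # pulse duration in s (assumed)
      fs = 2 * B                     # complex-baseband sampling rate
      t = np.arange(0, T, 1 / fs)

      chirp = np.exp(1j * np.pi * (B / T) * t ** 2)   # baseband linear chirp
      echo = np.roll(chirp, 300)                      # delayed copy (one target)

      compressed = np.abs(np.correlate(echo, chirp, mode="same"))
      width = np.count_nonzero(compressed > 0.5 * compressed.max()) / fs
      print(f"compressed pulse width ~ {width * 1e9:.1f} ns (about 1/B = {1e9 / B:.1f} ns)")
      print(f"range resolution ~ c/(2B) = {3e8 / (2 * B):.2f} m")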

  5. Black holes, bandwidths and Beethoven

    Science.gov (United States)

    Kempf, Achim

    2000-04-01

    It is usually believed that a function φ(t) whose Fourier spectrum is bounded can vary at most as fast as its highest frequency component ω_max. This is, in fact, not the case, as Aharonov, Berry, and others drastically demonstrated with explicit counterexamples, so-called superoscillations. It has been claimed that even the recording of an entire Beethoven symphony can occur as part of a signal with a 1 Hz bandwidth. Bandlimited functions also occur as ultraviolet regularized fields. Their superoscillations have been suggested, for example, to resolve the trans-Planckian frequencies problem of black hole radiation. Here, we give an exact proof for generic superoscillations. Namely, we show that for every fixed bandwidth there exist functions that pass through any finite number of arbitrarily prespecified points. Further, we show that, in spite of the presence of superoscillations, the behavior of bandlimited functions can be characterized reliably, namely through an uncertainty relation: The standard deviation ΔT of samples φ(t_n) taken at the Nyquist rate obeys ΔT ≥ 1/(4ω_max). This uncertainty relation generalizes to variable bandwidths. For ultraviolet regularized fields we identify the bandwidth as the in general spatially variable finite local density of degrees of freedom.

  6. Performance evaluation of a high-resolution parallel-plate differential mobility analyzer

    Directory of Open Access Journals (Sweden)

    J. P. Santos

    2009-04-01

    Full Text Available A high-resolution differential mobility analyzer (DMA), specially designed for (i) the measurement of ion mobility spectra, and (ii) the generation of a continuous stream of monomobile ions, has been developed and tested. The apparatus consists of two parallel-plate electrodes between which an electric field is applied. The test ion stream flows into the instrument through a narrow rectangular slit made in one of the electrodes, and migrates toward the other electrode driven by the applied field, while being transported by a stream of clean air which flows parallel to the plates at Reynolds number between 2×10⁴ and 9×10⁴ in laminar flow conditions. The collector electrode contains also a narrow slit through which ions of the desired mobility are withdrawn out of DMA. The classified ion current is measured with a high-sensitivity electrometer having a noise level around 0.1 fA.

    The theory behind the DMA operation is first discussed, focusing on the special case of parallel-plate geometry. Some generic results showing the stability and repeatability of the measurements and the resolving power of the instrument are presented next. The last part of the paper deals with the application of the apparatus to the study of the effect of humidity and aging time on the mobility spectra of air ions generated by a low-activity ²⁴¹Am source.

  7. Performance evaluation of a high-resolution parallel-plate differential mobility analyzer

    Directory of Open Access Journals (Sweden)

    J. P. Santos

    2008-09-01

    Full Text Available A high-resolution differential mobility analyzer (DMA), specially designed for

    (i) the measurement of ion mobility spectra, and
    (ii) the generation of a continuous stream of monomobile ions,

    has been developed and tested. The apparatus consists of two parallel-plate electrodes between which an electric field is applied. The test ion stream flows into the instrument through a narrow rectangular slit made in one of the electrodes, and migrates toward the other electrode driven by the applied field, while being transported by a stream of clean air which flows parallel to the plates at Reynolds number between 2×10⁴ and 9×10⁴ in laminar flow conditions. The collector electrode contains also a narrow slit through which ions of the desired mobility are withdrawn out of DMA. The classified ion current is measured with a high-sensitivity electrometer having a noise level around 0.1 fA.

    The theory behind the DMA operation is first discussed, focusing on the special case of parallel-plate geometry. Some generic results showing the stability and repeatability of the measurements and the resolving power of the instrument are presented next. The last part of the paper deals with the application of the apparatus to the study of the effect of humidity and aging time on the mobility spectra of air ions generated by a low-activity ²⁴¹Am source.

  8. Specularly reflected He²⁺ at high Mach number quasi-parallel shocks

    Energy Technology Data Exchange (ETDEWEB)

    Fuselier, S.A.; Lennartsson, O.W. (Lockheed Palo Alto Research Lab., CA (United States)); Thomsen, M.F. (Los Alamos National Lab., NM (United States)); Russell, C.T. (Univ. of California, Los Angeles (United States))

    1990-04-01

    Upstream from the Earth's quasi-parallel bow shock, the Lockheed Plasma Composition Experiment on ISEE 1 often observes two types of suprathermal He²⁺ distributions. Always present to some degree is an energetic (several keV/e to 17.4 keV/e, the maximum energy of the detector) diffuse He²⁺ distribution. Sometimes, apparently when the Alfven Mach number, M_A, is high enough and the spacecraft is near the shock (within a few minutes of a crossing), a second type of suprathermal He²⁺ distribution is also observed. This nongyrotropic, gyrating He²⁺ distribution has velocity components parallel and perpendicular to the magnetic field that are consistent with near-specular reflection of a portion of the incident solar wind He²⁺ distribution off the shock. Specularly reflected and diffuse proton distributions are associated with these gyrating He²⁺ distributions. The presence of these gyrating He²⁺ distributions suggests that specular reflection is controlled primarily by magnetic forces in high Mach number quasi-parallel shocks and that these distributions may be a seed population for more energetic diffuse He²⁺ distributions.

  9. USING PENALIZED REGRESSION WITH PARALLEL COORDINATES FOR VISUALIZATION OF SIGNIFICANCE IN HIGH DIMENSIONAL DATA

    Directory of Open Access Journals (Sweden)

    Shengwen Wang

    2013-11-01

    Full Text Available In recent years, there has been an exponential increase in the amount of data being produced and disseminated by diverse applications, intensifying the need for the development of effective methods for the interactive visual and analytical exploration of large, high-dimensional datasets. In this paper, we describe the development of a novel tool for multivariate data visualization and exploration based on the integrated use of regression analysis and advanced parallel coordinates visualization. Conventional parallel-coordinates visualization is a classical method for presenting raw multivariate data on a 2D screen. However, current tools suffer from a variety of problems when applied to massively high-dimensional datasets. Our system tackles these issues through the combined use of regression analysis and a variety of enhancements to traditional parallel-coordinates display capabilities, including new techniques to handle visual clutter, and intuitive solutions for selecting, ordering, and grouping dimensions. We demonstrate the effectiveness of our system through two case-studies.

  10. Development of high performance casting analysis software by coupled parallel computation

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    Up to now, a great deal of casting analysis software has been developed to provide new ways of accessing real casting processes. These include melt flow analysis, heat transfer analysis for solidification calculation, mechanical property predictions and microstructure predictions. These efforts have been successful in obtaining results that agree well with real situations, so that CAE technologies have become indispensable for designing and developing new casting processes. But in manufacturing fields, CAE technologies are not used so frequently, because the software is difficult to use or computing performance is insufficient. To introduce CAE technologies to the manufacturing field, high performance analysis is essential to shorten the gap between product design time and prototyping time. Software code optimization can be helpful, but it is not enough, because the codes developed by software experts are already well optimized. As an alternative approach to high performance computation, parallel computation technologies are being applied to CAE technologies to shorten the analysis time. In this research, SMP (Shared Memory Processing) and MPI (Message Passing Interface) (1) methods for parallelization were applied to the commercial software "Z-Cast" to calculate the casting processes. In the code parallelization process, network stabilization and core optimization were also carried out under the Microsoft Windows platform, and their performance and results were compared with those of normal linear analysis codes.

  11. Development of high performance casting analysis software by coupled parallel computation

    Directory of Open Access Journals (Sweden)

    Sang Hyun CHO

    2007-08-01

    Full Text Available Up to now, a great deal of casting analysis software has been developed to provide new ways of accessing real casting processes. These include melt flow analysis, heat transfer analysis for solidification calculation, mechanical property predictions and microstructure predictions. These efforts have been successful in obtaining results that agree well with real situations, so that CAE technologies have become indispensable for designing and developing new casting processes. But in manufacturing fields, CAE technologies are not used so frequently, because the software is difficult to use or computing performance is insufficient. To introduce CAE technologies to the manufacturing field, high performance analysis is essential to shorten the gap between product design time and prototyping time. Software code optimization can be helpful, but it is not enough, because the codes developed by software experts are already well optimized. As an alternative approach to high performance computation, parallel computation technologies are being applied to CAE technologies to shorten the analysis time. In this research, SMP (Shared Memory Processing) and MPI (Message Passing Interface) (1) methods for parallelization were applied to the commercial software "Z-Cast" to calculate the casting processes. In the code parallelization process, network stabilization and core optimization were also carried out under the Microsoft Windows platform, and their performance and results were compared with those of normal linear analysis codes.

  12. Parallel optical control of spatiotemporal neuronal spike activity using high-frequency digital light processingtechnology

    Directory of Open Access Journals (Sweden)

    Jason eJerome

    2011-08-01

    Full Text Available Neurons in the mammalian neocortex receive inputs from and communicate back to thousands of other neurons, creating complex spatiotemporal activity patterns. The experimental investigation of these parallel dynamic interactions has been limited due to the technical challenges of monitoring or manipulating neuronal activity at that level of complexity. Here we describe a new massively parallel photostimulation system that can be used to control action potential firing in in vitro brain slices with high spatial and temporal resolution while performing extracellular or intracellular electrophysiological measurements. The system uses Digital-Light-Processing (DLP) technology to generate 2-dimensional (2D) stimulus patterns with >780,000 independently controlled photostimulation sites that operate at high spatial (5.4 µm) and temporal (>13 kHz) resolution. Light is projected through the quartz-glass bottom of the perfusion chamber providing access to a large area (2.76 x 2.07 mm²) of the slice preparation. This system has the unique capability to induce temporally precise action potential firing in large groups of neurons distributed over a wide area covering several cortical columns. Parallel photostimulation opens up new opportunities for the in vitro experimental investigation of spatiotemporal neuronal interactions at a broad range of anatomical scales.

  13. Parallel optical control of spatiotemporal neuronal spike activity using high-speed digital light processing.

    Science.gov (United States)

    Jerome, Jason; Foehring, Robert C; Armstrong, William E; Spain, William J; Heck, Detlef H

    2011-01-01

    Neurons in the mammalian neocortex receive inputs from and communicate back to thousands of other neurons, creating complex spatiotemporal activity patterns. The experimental investigation of these parallel dynamic interactions has been limited due to the technical challenges of monitoring or manipulating neuronal activity at that level of complexity. Here we describe a new massively parallel photostimulation system that can be used to control action potential firing in in vitro brain slices with high spatial and temporal resolution while performing extracellular or intracellular electrophysiological measurements. The system uses digital light processing technology to generate 2-dimensional (2D) stimulus patterns with >780,000 independently controlled photostimulation sites that operate at high spatial (5.4 μm) and temporal (>13 kHz) resolution. Light is projected through the quartz-glass bottom of the perfusion chamber providing access to a large area (2.76 mm × 2.07 mm) of the slice preparation. This system has the unique capability to induce temporally precise action potential firing in large groups of neurons distributed over a wide area covering several cortical columns. Parallel photostimulation opens up new opportunities for the in vitro experimental investigation of spatiotemporal neuronal interactions at a broad range of anatomical scales.

  14. Parallel plate chambers for monitoring the profiles of high-intensity pulsed antiproton beams

    CERN Document Server

    Hori, Masaki

    2004-01-01

    Two types of beam profile monitor with thin parallel-plate electrodes have been used in experiments carried out at the Low Energy Antiproton Ring (LEAR) and Antiproton Decelerator (AD) of CERN. The detectors were used to measure non-destructively the spatial profiles, absolute intensities, and time structures of 100-300-ns-long beam pulses containing between 10^7 and 10^9 antiprotons. The first of these monitors was a parallel plate ionization chamber operated at a gas pressure of P = 65 mbar. The other was a secondary electron emission detector, and was operated in the ultra-high vacuum of the AD. Both designs may be useful in medical and commercial applications. The position-sensitive electrodes in these detectors were manufactured by a novel method in which a laser trimmer was used to cut strip patterns on metallized polyester foils.

  15. Kemari: A Portable High Performance Fortran System for Distributed Memory Parallel Processors

    Directory of Open Access Journals (Sweden)

    T. Kamachi

    1997-01-01

    Full Text Available We have developed a compilation system which extends High Performance Fortran (HPF) in various aspects. We support the parallelization of well-structured problems with loop distribution and alignment directives similar to HPF's data distribution directives. Such directives give both additional control to the user and simplify the compilation process. For the support of unstructured problems, we provide directives for dynamic data distribution through user-defined mappings. The compiler also allows integration of message-passing interface (MPI) primitives. The system is part of a complete programming environment which also comprises a parallel debugger and a performance monitor and analyzer. After an overview of the compiler, we describe the language extensions and related compilation mechanisms in detail. Performance measurements demonstrate the compiler's applicability to a variety of application classes.

  16. 9th International Workshop on Parallel Tools for High Performance Computing

    CERN Document Server

    Hilbrich, Tobias; Niethammer, Christoph; Gracia, José; Nagel, Wolfgang; Resch, Michael

    2016-01-01

    High Performance Computing (HPC) remains a driver that offers huge potentials and benefits for science and society. However, a profound understanding of the computational matters and specialized software is needed to arrive at effective and efficient simulations. Dedicated software tools are important parts of the HPC software landscape, and support application developers. Even though a tool is by definition not a part of an application, but rather a supplemental piece of software, it can make a fundamental difference during the development of an application. Such tools aid application developers in the context of debugging, performance analysis, and code optimization, and therefore make a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools presented and discussed at the 9th International Parallel Tools Workshop held in Dresden, Germany, September 2-3, 2015, which offered an established forum for discussing the latest advances in paral...

  17. Efficient high-precision matrix algebra on parallel architectures for nonlinear combinatorial optimization

    KAUST Repository

    Gunnels, John

    2010-06-01

    We provide a first demonstration of the idea that matrix-based algorithms for nonlinear combinatorial optimization problems can be efficiently implemented. Such algorithms were mainly conceived by theoretical computer scientists for proving efficiency. We are able to demonstrate the practicality of our approach by developing an implementation on a massively parallel architecture, and exploiting scalable and efficient parallel implementations of algorithms for ultra high-precision linear algebra. Additionally, we have delineated and implemented the necessary algorithmic and coding changes required in order to address problems several orders of magnitude larger, dealing with the limits of scalability from memory footprint, computational efficiency, reliability, and interconnect perspectives. © Springer and Mathematical Programming Society 2010.

  18. PPM A highly efficient parallel particle mesh library for the simulation of continuum systems

    Science.gov (United States)

    Sbalzarini, I. F.; Walther, J. H.; Bergdorf, M.; Hieber, S. E.; Kotsalis, E. M.; Koumoutsakos, P.

    2006-07-01

    This paper presents a highly efficient parallel particle-mesh (PPM) library, based on a unifying particle formulation for the simulation of continuous systems. In this formulation, the grid-free character of particle methods is relaxed by the introduction of a mesh for the reinitialization of the particles, the computation of the field equations, and the discretization of differential operators. The present utilization of the mesh does not detract from the adaptivity, the efficient handling of complex geometries, the minimal dissipation, and the good stability properties of particle methods. The coexistence of meshes and particles allows for the development of a consistent and adaptive numerical method, but it presents a set of challenging parallelization issues that have in the past hindered the broader use of particle methods. The present library solves the key parallelization issues involving particle-mesh interpolations and the balancing of processor particle loading, using a novel adaptive tree for mixed domain decompositions along with a coloring scheme for the particle-mesh interpolation. The high parallel efficiency of the library is demonstrated in a series of benchmark tests on distributed memory and on a shared-memory vector architecture. The modularity of the method is shown by a range of simulations, from compressible vortex rings using a novel formulation of smooth particle hydrodynamics, to simulations of diffusion in real biological cell organelles. The present library enables large scale simulations of diverse physical problems using adaptive particle methods and provides a computational tool that is a viable alternative to mesh-based methods.
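
    The particle-to-mesh interpolation that such a library must distribute and load-balance can be illustrated with a small self-contained sketch (hypothetical, not the PPM API; function and variable names are invented): particle strengths are scattered onto a uniform 1-D grid with the standard cloud-in-cell (linear) kernel.

    # Hypothetical sketch (not the PPM library API): cloud-in-cell
    # particle-to-mesh interpolation on a uniform 1-D periodic grid.
    import numpy as np

    def particles_to_mesh(x, w, n_cells, length):
        """Scatter particle weights w at positions x onto a uniform mesh."""
        h = length / n_cells
        mesh = np.zeros(n_cells)
        cell = np.floor(x / h).astype(int) % n_cells     # left grid node of each particle
        frac = x / h - np.floor(x / h)                   # fractional offset inside the cell
        np.add.at(mesh, cell, w * (1.0 - frac))          # share assigned to the left node
        np.add.at(mesh, (cell + 1) % n_cells, w * frac)  # share assigned to the right node
        return mesh / h                                  # density per unit length

    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 1.0, 10_000)        # particle positions
    w = np.ones_like(x) / x.size             # unit total strength
    density = particles_to_mesh(x, w, n_cells=64, length=1.0)
    print(density.sum() / 64)                # integrates back to ~1.0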

  19. High-speed, digitally refocused retinal imaging with line-field parallel swept source OCT

    Science.gov (United States)

    Fechtig, Daniel J.; Kumar, Abhishek; Ginner, Laurin; Drexler, Wolfgang; Leitgeb, Rainer A.

    2015-03-01

    MHz OCT allows mitigating undesired influence of motion artifacts during retinal assessment, but comes in state-of-the-art point scanning OCT at the price of increased system complexity. By changing the paradigm from scanning to parallel OCT for in vivo retinal imaging the three-dimensional (3D) acquisition time is reduced without a trade-off between speed, sensitivity and technological requirements. Furthermore, the intrinsic phase stability allows for applying digital refocusing methods increasing the in-focus imaging depth range. Line field parallel interferometric imaging (LPSI) is utilizing a commercially available swept source, a single-axis galvo-scanner and a line scan camera for recording 3D data with up to 1 MHz A-scan rate. Besides line-focus illumination and parallel detection, we mitigate the necessity for high-speed sensor and laser technology by holographic full-range imaging, which allows for increasing the imaging speed by low sampling of the optical spectrum. High B-scan rates up to 1 kHz further allow for implementation of label-free optical angiography in 3D by calculating the inter B-scan speckle variance. We achieve a detection sensitivity of 93.5 (96.5) dB at an equivalent A-scan rate of 1 (0.6) MHz and present 3D in vivo retinal structural and functional imaging utilizing digital refocusing. Our results demonstrate for the first time competitive imaging sensitivity, resolution and speed with a parallel OCT modality. LPSI is in fact currently the fastest OCT device applied to retinal imaging and operating at a central wavelength window around 800 nm with a detection sensitivity of higher than 93.5 dB.

  20. Bandwidth Enhancement between Graphics Processing Units on the Peripheral Component Interconnect Bus

    Directory of Open Access Journals (Sweden)

    ANTON Alin

    2015-10-01

    Full Text Available General purpose computing on graphics processing units is a new trend in high performance computing. Present day applications require office and personal supercomputers which are mostly based on many core hardware accelerators communicating with the host system through the Peripheral Component Interconnect (PCI) bus. Parallel data compression is a difficult topic but compression has been used successfully to improve the communication between parallel message passing interface (MPI) processes on high performance computing clusters. In this paper we show that special purpose compression algorithms designed for scientific floating point data can be used to enhance the bandwidth between 2 graphics processing unit (GPU) devices on the PCI Express (PCIe) 3.0 x16 bus in a homebuilt personal supercomputer (PSC).
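
    The general idea, trading a little computation for scarce interconnect bandwidth, can be sketched in a few lines. The snippet below is only a stand-in: it uses zlib over a byte-shuffled array instead of the special-purpose floating-point compressors the abstract refers to, and it measures buffer sizes rather than performing an actual device-to-device transfer.

    # Stand-in sketch: compress smooth floating-point data before moving it across
    # a bandwidth-limited link (zlib + byte shuffle instead of a special-purpose codec).
    import zlib
    import numpy as np

    def shuffle_bytes(a):
        """Group bytes of equal significance together; smooth fields then compress better."""
        b = a.view(np.uint8).reshape(a.size, a.itemsize)
        return np.ascontiguousarray(b.T).tobytes()

    rng = np.random.default_rng(1)
    data = np.cumsum(rng.normal(size=1_000_000)).astype(np.float32)  # smooth synthetic signal

    raw = data.tobytes()
    packed = zlib.compress(shuffle_bytes(data), level=1)
    print(f"raw {len(raw) / 1e6:.1f} MB -> packed {len(packed) / 1e6:.1f} MB "
          f"({len(raw) / len(packed):.2f}x)")
    # 'packed' is what would travel over the PCIe link; the receiver decompresses
    # and un-shuffles the buffer before use.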

  1. GPU based cloud system for high-performance arrhythmia detection with parallel k-NN algorithm.

    Science.gov (United States)

    Tae Joon Jun; Hyun Ji Park; Hyuk Yoo; Young-Hak Kim; Daeyoung Kim

    2016-08-01

    In this paper, we propose a GPU-based cloud system for high-performance arrhythmia detection. The Pan-Tompkins algorithm is used for QRS detection, and we optimized the beat classification algorithm with K-Nearest Neighbor (K-NN). To support high-performance beat classification on the system, we parallelized the beat classification algorithm with CUDA to execute it on virtualized GPU devices on the cloud system. The MIT-BIH Arrhythmia database is used for validation of the algorithm. The system achieved a detection rate of about 93.5%, which is comparable to previous research, while our algorithm shows 2.5 times faster execution time compared to a CPU-only detection algorithm.
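
    As a plain CPU stand-in for the CUDA-parallelised classifier described above (illustrative only; the feature layout and class labels are invented), a brute-force k-NN beat classifier reduces to a vectorised distance computation followed by a majority vote:

    # Illustrative sketch: brute-force k-NN beat classification (vectorised NumPy
    # stand-in for the CUDA kernel described in the abstract).
    import numpy as np

    def knn_classify(train_x, train_y, query_x, k=3):
        """Label each query beat by majority vote among its k nearest training beats."""
        # squared Euclidean distances, shape (n_queries, n_training_beats)
        d2 = ((query_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
        nearest = np.argsort(d2, axis=1)[:, :k]
        votes = train_y[nearest]
        return np.array([np.bincount(v).argmax() for v in votes])

    rng = np.random.default_rng(0)
    train_x = rng.normal(size=(500, 16))      # 16 features per annotated beat (hypothetical)
    train_y = rng.integers(0, 2, size=500)    # 0 = normal, 1 = arrhythmic
    query_x = rng.normal(size=(10, 16))
    print(knn_classify(train_x, train_y, query_x))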

  2. Modeling of fatigue crack induced nonlinear ultrasonics using a highly parallelized explicit local interaction simulation approach

    Science.gov (United States)

    Shen, Yanfeng; Cesnik, Carlos E. S.

    2016-04-01

    This paper presents a parallelized modeling technique for the efficient simulation of nonlinear ultrasonics introduced by the wave interaction with fatigue cracks. The elastodynamic wave equations with contact effects are formulated using an explicit Local Interaction Simulation Approach (LISA). The LISA formulation is extended to capture the contact-impact phenomena during the wave damage interaction based on the penalty method. A Coulomb friction model is integrated into the computation procedure to capture the stick-slip contact shear motion. The LISA procedure is coded using the Compute Unified Device Architecture (CUDA), which enables highly parallelized supercomputing on powerful graphic cards. Both the explicit contact formulation and the parallel feature facilitate LISA's superb computational efficiency over the conventional finite element method (FEM). The theoretical formulation based on the penalty method is introduced and a guideline for the proper choice of the contact stiffness is given. The convergence behavior of the solution under various contact stiffness values is examined. A numerical benchmark problem is used to investigate the new LISA formulation and results are compared with a conventional contact finite element solution. Various nonlinear ultrasonic phenomena are successfully captured using this contact LISA formulation, including the generation of nonlinear higher harmonic responses. Nonlinear mode conversion of guided waves at fatigue cracks is also studied.

  3. High-Performance Parallel Analysis of Coupled Problems for Aircraft Propulsion

    Science.gov (United States)

    Felippa, C. A.; Farhat, C.; Park, K. C.; Gumaste, U.; Chen, P.-S.; Lesoinne, M.; Stern, P.

    1997-01-01

    Applications are described of high-performance computing methods to the numerical simulation of complete jet engines. The methodology focuses on the partitioned analysis of the interaction of the gas flow with a flexible structure and with the fluid mesh motion driven by structural displacements. The latter is treated by an ALE technique that models the fluid mesh motion as that of a fictitious mechanical network laid along the edges of near-field elements. New partitioned analysis procedures to treat this coupled three-component problem were developed. These procedures involved delayed corrections and subcycling, and have been successfully tested on several massively parallel computers, including the iPSC-860, Paragon XP/S and the IBM SP2. The NASA-sponsored ENG10 program was used for the global steady state analysis of the whole engine. This program uses a regular FV-multiblock-grid discretization in conjunction with circumferential averaging to include effects of blade forces, loss, combustor heat addition, blockage, bleeds and convective mixing. A load-balancing preprocessor for parallel versions of ENG10 was developed as well as the capability for the first full 3D aeroelastic simulation of a multirow engine stage. This capability was tested on the IBM SP2 parallel supercomputer at NASA Ames.

  4. Building A High Performance Parallel File System Using Grid Datafarm and ROOT I/O

    CERN Document Server

    Morita, Y; Watase, Y; Tatebe, Osamu; Sekiguchi, S; Matsuoka, S; Soda, N; Dell'Acqua, A

    2003-01-01

    The sheer amount of petabyte-scale data foreseen in the LHC experiments requires a careful consideration of the persistency design and the system design in world-wide distributed computing. Event parallelism of HENP data analysis enables us to take maximum advantage of high performance cluster computing and networking when we keep the parallelism in the data processing phase, in the data management phase, and in the data transfer phase. A modular architecture of FADS/Goofy, a versatile detector simulation framework for Geant4, enables an easy choice of plug-in facilities for persistency technologies such as Objectivity/DB and ROOT I/O. The framework is designed to work naturally with the parallel file system of Grid Datafarm (Gfarm). FADS/Goofy is proven to generate 10^6 Geant4-simulated Atlas Mockup events using a 512 CPU PC cluster. The data in ROOT I/O files is replicated using the Gfarm file system. The histogram information is collected from the distributed ROOT files. During the data replicatio...

  5. Efficient Parallel Global Optimization for High Resolution Hydrologic and Climate Impact Models

    Science.gov (United States)

    Shoemaker, C. A.; Mueller, J.; Pang, M.

    2013-12-01

    High resolution hydrologic models are typically computationally expensive, requiring many minutes or perhaps hours for one simulation. Optimization can be used with these models for parameter estimation or for analyzing management alternatives. However, optimization of these computationally expensive simulations requires algorithms that can obtain accurate answers with relatively few simulations to avoid infeasibly long computation times. We have developed a number of efficient parallel algorithms and software codes for optimization of expensive problems with multiple local minima. This is open source software we are distributing. It runs in Matlab and Python, and has been run on the Yellowstone supercomputer. The talk will quickly discuss the characteristics of the problem (e.g. the presence of integer as well as continuous variables, the number of dimensions, the availability of parallel/grid computing, the number of simulations that can be allowed to find a solution, etc.) that determine which algorithms are most appropriate for each type of problem. A major application of this optimization software is for parameter estimation for nonlinear hydrologic models, including contaminant transport in subsurface (e.g. for groundwater remediation or multi-phase flow for carbon sequestration), nutrient transport in watersheds, and climate models. We will present results for carbon sequestration plume monitoring (multi-phase, multi-constituent), for groundwater remediation, and for the CLM climate model. The carbon sequestration example is based on the Frio CO2 field site and the groundwater example is for a 50,000 acre remediation site (with model requiring about 1 hour per simulation). Parallel speed-ups are excellent in most cases, and our serial and parallel algorithms tend to outperform alternative methods on complex computationally expensive simulations that have multiple global minima.

  6. Tunable-Bandwidth Filter System

    Science.gov (United States)

    Aye, Tin; Yu, Kevin; Dimov, Fedor; Savant, Gajendra

    2006-01-01

    A tunable-bandwidth filter system (TBFS), now undergoing development, is intended to be part of a remote-sensing multispectral imaging system that will operate in the visible and near infrared spectral region (wavelengths from 400 to 900 nm). Attributes of the TBFS include rapid tunability of the pass band over a wide wavelength range and high transmission efficiency. The TBFS is based on a unique integration of two pairs of broadband Raman reflection holographic filters with two rotating spherical lenses. In experiments, a prototype of the TBFS was shown to be capable of spectral sampling of images in the visible range over a 200-nm spectral range with a spectral resolution of .30 nm. The figure depicts the optical layout of a prototype of the TBFS as part of a laboratory multispectral imaging system for the spectral sampling of color test images in two orthogonal polarizations. Each pair of broadband Raman reflection holographic filters is mounted at an equatorial plane between two halves of a spherical lens. The two filters in each pair are characterized by steep spectral slopes (equivalently, narrow spectral edges), no ripple or side lobes in their pass bands, and a few nanometers of non-overlapping wavelength range between their pass bands. Each spherical lens and thus the filter pair within it is rotated in order to rapidly tune its pass band. The rotations of the lenses are effected by electronically controlled, programmable, high-precision rotation stages. The rotations are coordinated by electronic circuits operating under overall supervision of a personal computer in order to obtain the desired variation of the overall pass bands with time. Embedding the filters inside the spherical lenses increases the range of the hologram incidence angles, making it possible to continuously tune the pass and stop bands of the filters over a wider wavelength range. In addition, each spherical lens also serves as part of the imaging optics: The telephoto lens focuses

  7. A highly parallel method for synthesizing DNA repeats enables the discovery of 'smart' protein polymers.

    Science.gov (United States)

    Amiram, Miriam; Quiroz, Felipe Garcia; Callahan, Daniel J; Chilkoti, Ashutosh

    2011-02-01

    Robust high-throughput synthesis methods are needed to expand the repertoire of repetitive protein-polymers for different applications. To address this need, we developed a new method, overlap extension rolling circle amplification (OERCA), for the highly parallel synthesis of genes encoding repetitive protein-polymers. OERCA involves a single PCR-type reaction for the rolling circle amplification of a circular DNA template and simultaneous overlap extension by thermal cycling. We characterized the variables that control OERCA and demonstrated its superiority over existing methods, its robustness, high-throughput and versatility by synthesizing variants of elastin-like polypeptides (ELPs) and protease-responsive polymers of glucagon-like peptide-1 analogues. Despite the GC-rich, highly repetitive sequences of ELPs, we synthesized remarkably large genes without recursive ligation. OERCA also enabled us to discover 'smart' biopolymers that exhibit fully reversible thermally responsive behaviour. This powerful strategy generates libraries of repetitive genes over a wide and tunable range of molecular weights in a 'one-pot' parallel format.

  8. Frequency response and bandwidth enhancement in Ge/Si avalanche photodiodes with over 840 GHz gain-bandwidth-product.

    Science.gov (United States)

    Zaoui, Wissem Sfar; Chen, Hui-Wen; Bowers, John E; Kang, Yimin; Morse, Mike; Paniccia, Mario J; Pauchard, Alexandre; Campbell, Joe C

    2009-07-20

    In this work we report a separate-absorption-charge-multiplication Ge/Si avalanche photodiode with an enhanced gain-bandwidth-product of 845 GHz at a wavelength of 1310 nm. The corresponding gain value is 65 and the electrical bandwidth is 13 GHz at an optical input power of -30 dBm. The unconventional high gain-bandwidth-product is investigated using device physical simulation and optical pulse response measurement. The analysis of the electric field distribution, electron and hole concentration and drift velocities in the device shows that the enhanced gain-bandwidth-product at high bias voltages is due to a decrease of the transit time and avalanche build-up time limitation at high fields.

  9. Black Holes, Bandwidths and Beethoven

    OpenAIRE

    Kempf, A.

    1999-01-01

    It is usually believed that a function whose Fourier spectrum is bounded can vary at most as fast as its highest frequency component. This is in fact not the case, as Aharonov, Berry and others drastically demonstrated with explicit counter examples, so-called superoscillations. It has been claimed that even the recording of an entire Beethoven symphony can occur as part of a signal with 1Hz bandwidth. Bandlimited functions also occur as ultraviolet regularized fields. Their superoscillations...

  10. Utility-based bandwidth allocation algorithm for heterogeneous wireless networks

    Institute of Scientific and Technical Information of China (English)

    CHAI Rong; WANG XiuJuan; CHEN QianBin; SVENSSON Tommy

    2013-01-01

    In a next generation wireless network (NGWN), mobile users are capable of connecting to the core network through various heterogeneous wireless access networks, such as cellular network, wireless metropolitan area network (WMAN), wireless local area network (WLAN), and ad hoc network. NGWN is expected to provide high-bandwidth connectivity with guaranteed quality-of-service to mobile users in a seamless manner; however, this desired function demands seamless coordination of the heterogeneous radio access network (RAN) technologies. In recent years, some research has been conducted to design radio resource management (RRM) architectures and algorithms for NGWN; however, few studies stress the problem of joint network performance optimization, which is an essential goal for a cooperative service providing scenario. Furthermore, while some authors consider the competition among the service providers, the QoS requirements of users and the resource competition within access networks are not fully considered. In this paper, we present an interworking integrated network architecture, which is responsible for monitoring the status information of different radio access technologies (RATs) and executing the resource allocation algorithm. Within this architecture, the problem of joint bandwidth allocation for heterogeneous integrated networks is formulated based on utility function theory and bankruptcy game theory. The proposed bandwidth allocation scheme comprises two successive stages, i.e., service bandwidth allocation and user bandwidth allocation. At the service bandwidth allocation stage, the optimal amount of bandwidth for different types of services in each network is allocated based on the criterion of joint utility maximization. At the user bandwidth allocation stage, the service bandwidth in each network is optimally allocated among users in the network according to bankruptcy game theory. Numerical results demonstrate the efficiency of
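
    The flavour of the first, service-level allocation stage can be illustrated with a much simpler stand-in than the utility/bankruptcy formulation above: a weighted proportional split of the capacity among service classes, capped at each class's demand (the weighting scheme and all numbers below are invented for illustration, not taken from the paper).

    # Illustrative stand-in (not the paper's algorithm): split a link's capacity
    # among service classes proportionally to weights, capped at each class's demand.
    import numpy as np

    def capped_proportional_allocation(capacity, weights, demands):
        weights = np.asarray(weights, dtype=float)
        demands = np.asarray(demands, dtype=float)
        alloc = np.zeros_like(weights)
        active = np.ones(len(weights), dtype=bool)
        remaining = float(capacity)
        while active.any() and remaining > 1e-9:
            share = remaining * weights[active] / weights[active].sum()
            grant = np.minimum(share, demands[active] - alloc[active])
            alloc[active] += grant
            remaining -= grant.sum()
            active &= alloc < demands - 1e-9   # classes at their demand drop out
        return alloc

    # capacity 100 Mbit/s, three classes weighted 3:2:1 with demands 70/50/10
    print(capped_proportional_allocation(100.0, [3, 2, 1], [70, 50, 10]))  # -> [54. 36. 10.]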

  11. Centrifugal micro-channel array droplet generation for highly parallel digital PCR.

    Science.gov (United States)

    Chen, Zitian; Liao, Peiyu; Zhang, Fangli; Jiang, Mengcheng; Zhu, Yusen; Huang, Yanyi

    2017-01-17

    Stable water-in-oil emulsion is essential to digital PCR and many other bioanalytical reactions that employ droplets as microreactors. We developed a novel technology to produce monodisperse emulsion droplets with high efficiency and high throughput using a bench-top centrifuge. Upon centrifugal spinning, the continuous aqueous phase is dispersed into monodisperse droplet jets in air through a micro-channel array (MiCA) and then submerged into oil as a stable emulsion. We performed dPCR reactions with a high dynamic range through the MiCA approach, and demonstrated that this cost-effective method not only eliminates the usage of complex microfluidic devices and control systems, but also greatly suppresses the loss of materials and cross-contamination. MiCA-enabled highly parallel emulsion generation combines the easiness and robustness of picoliter droplet production, and overcomes the technical challenges by using conventional lab equipment and supplies.

  12. High-speed 3D imaging using two-wavelength parallel-phase-shift interferometry.

    Science.gov (United States)

    Safrani, Avner; Abdulhalim, Ibrahim

    2015-10-15

    High-speed three-dimensional imaging based on two-wavelength parallel-phase-shift interferometry is presented. The technique is demonstrated using a high-resolution polarization-based Linnik interferometer operating with three high-speed phase-masked CCD cameras and two quasi-monochromatic modulated light sources. The two light sources allow for phase unwrapping the single source wrapped phase so that relatively high step profiles having heights as large as 3.7 μm can be imaged in video rate with ±2 nm accuracy and repeatability. The technique is validated using a certified very large scale integration (VLSI) step standard followed by a demonstration from the semiconductor industry showing an integrated chip with 2.75 μm height copper micro pillars at different packing densities.

  13. A parallel high-order accurate finite element nonlinear Stokes ice sheet model and benchmark experiments

    Energy Technology Data Exchange (ETDEWEB)

    Leng, Wei [Chinese Academy of Sciences; Ju, Lili [University of South Carolina; Gunzburger, Max [Florida State University; Price, Stephen [Los Alamos National Laboratory; Ringler, Todd [Los Alamos National Laboratory,

    2012-01-01

    The numerical modeling of glacier and ice sheet evolution is a subject of growing interest, in part because of the potential for models to inform estimates of global sea level change. This paper focuses on the development of a numerical model that determines the velocity and pressure fields within an ice sheet. Our numerical model features a high-fidelity mathematical model involving the nonlinear Stokes system and combinations of no-sliding and sliding basal boundary conditions, high-order accurate finite element discretizations based on variable resolution grids, and highly scalable parallel solution strategies, all of which contribute to a numerical model that can achieve accurate velocity and pressure approximations in a highly efficient manner. We demonstrate the accuracy and efficiency of our model by analytical solution tests, established ice sheet benchmark experiments, and comparisons with other well-established ice sheet models.

  14. A high efficient integrated planar transformer for primary-parallel isolated boost converters

    DEFF Research Database (Denmark)

    Sen, Gökhan; Ouyang, Ziwei; Thomsen, Ole Cornelius;

    2010-01-01

    A simple, easy to manufacture and high efficient integrated planar transformer design approach for primary parallel isolated boost converters is presented. Utilizing the same phase flux flow, transformers are integrated, reducing the total ferrite volume and core loss for the same peak flux density...... of transformer integration is further extended to multiple primary power stages using modular geometry of the planar core, further reducing the core loss and allowing a higher power density. To verify the validity of the design approach, a 4-kW prototype converter with two primary power stages is implemented...

  15. Parallel adaptive integration in high-performance functional Renormalization Group computations

    CERN Document Server

    Lichtenstein, Julian; de la Peña, David Sánchez; Vidović, Toni; Di Napoli, Edoardo

    2016-01-01

    The conceptual framework provided by the functional Renormalization Group (fRG) has become a formidable tool to study correlated electron systems on lattices which, in turn, provided great insights into our understanding of complex many-body phenomena, such as high-temperature superconductivity or topological states of matter. In this work we present one of the latest realizations of fRG which makes use of an adaptive numerical quadrature scheme specifically tailored to the described fRG scheme. The final result is an increase in performance thanks to improved parallelism and scalability.

  16. Development of high-resolution x-ray CT system using parallel beam geometry

    Energy Technology Data Exchange (ETDEWEB)

    Yoneyama, Akio, E-mail: akio.yoneyama.bu@hitachi.com; Baba, Rika [Central Research Laboratory, Hitachi Ltd., Hatoyama, Saitama (Japan); Hyodo, Kazuyuki [Institute of Materials Science, High Energy Accelerator Research Organization, Tsukuba, Ibaraki (Japan); Takeda, Tohoru [School of Allied Health Sciences, Kitasato University, Sagamihara, Kanagawa (Japan); Nakano, Haruhisa; Maki, Koutaro [Department of Orthodontics, School of Dentistry Showa University, Ota-ku, Tokyo (Japan); Sumitani, Kazushi; Hirai, Yasuharu [Kyushu Synchrotron Light Research Center, Tosu, Saga (Japan)

    2016-01-28

    For fine three-dimensional observations of large biomedical and organic material samples, we developed a high-resolution X-ray CT system. The system consists of a sample positioner, a 5-μm scintillator, microscopy lenses, and a water-cooled sCMOS detector. Parallel beam geometry was adopted to attain a field of view of a few mm square. A fine three-dimensional image of birch branch was obtained using a 9-keV X-ray at BL16XU of SPring-8 in Japan. The spatial resolution estimated from the line profile of a sectional image was about 3 μm.

  17. Rapid, automated, parallel quantitative immunoassays using highly integrated microfluidics and AlphaLISA

    Science.gov (United States)

    TakYu, Zeta; Guan, Huijiao; Ki Cheung, Mei; McHugh, Walker M.; Cornell, Timothy T.; Shanley, Thomas P.; Kurabayashi, Katsuo; Fu, Jianping

    2015-06-01

    Immunoassays represent one of the most popular analytical methods for detection and quantification of biomolecules. However, conventional immunoassays such as ELISA and flow cytometry, even though providing high sensitivity and specificity and multiplexing capability, can be labor-intensive and prone to human error, making them unsuitable for standardized clinical diagnoses. Using a commercialized no-wash, homogeneous immunoassay technology (‘AlphaLISA’) in conjunction with integrated microfluidics, herein we developed a microfluidic immunoassay chip capable of rapid, automated, parallel immunoassays of microliter quantities of samples. Operation of the microfluidic immunoassay chip entailed rapid mixing and conjugation of AlphaLISA components with target analytes before quantitative imaging for analyte detections in up to eight samples simultaneously. Aspects such as fluid handling and operation, surface passivation, imaging uniformity, and detection sensitivity of the microfluidic immunoassay chip using AlphaLISA were investigated. The microfluidic immunoassay chip could detect one target analyte simultaneously for up to eight samples in 45 min with a limit of detection down to 10 pg mL-1. The microfluidic immunoassay chip was further utilized for functional immunophenotyping to examine cytokine secretion from human immune cells stimulated ex vivo. Together, the microfluidic immunoassay chip provides a promising high-throughput, high-content platform for rapid, automated, parallel quantitative immunosensing applications.

  18. Challenges in Polybinary Modulation for Bandwidth Limited Optical Links

    DEFF Research Database (Denmark)

    Vegas Olmos, Juan José; Tafur Monroy, Idelfonso; Madsen, Peter

    2016-01-01

    Optical links using traditional modulation formats are reaching a plateau in terms of capacity, mainly due to bandwidth limitations in the devices employed at the transmitter and receivers. Advanced modulation formats, which boost the spectral efficiency, provide a smooth migration path towards...... of the current research status of the key building blocks in polybinary systems. The results clearly show how polybinary modulation effectively reduces the bandwidth requirements on optical links while providing high spectral efficiency....

  19. Parallel biocomputing

    Directory of Open Access Journals (Sweden)

    Witte John S

    2011-03-01

    Full Text Available Abstract Background With the advent of high throughput genomics and high-resolution imaging techniques, there is a growing necessity in biology and medicine for parallel computing, and with the low cost of computing, it is now cost-effective for even small labs or individuals to build their own personal computation cluster. Methods Here we briefly describe how to use commodity hardware to build a low-cost, high-performance compute cluster, and provide an in-depth example and sample code for parallel execution of R jobs using MOSIX, a mature extension of the Linux kernel for parallel computing. A similar process can be used with other cluster platform software. Results As a statistical genetics example, we use our cluster to run a simulated eQTL experiment. Because eQTL is computationally intensive, and is conceptually easy to parallelize, like many statistics/genetics applications, parallel execution with MOSIX gives a linear speedup in analysis time with little additional effort. Conclusions We have used MOSIX to run a wide variety of software programs in parallel with good results. The limitations and benefits of using MOSIX are discussed and compared to other platforms.
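
    The same embarrassingly parallel pattern, many independent replicates farmed out to whatever cores are available, can be sketched without MOSIX or R at all; the toy below uses Python's multiprocessing and an invented per-replicate statistic purely for illustration.

    # Illustrative sketch of the embarrassingly parallel pattern described above,
    # using Python's multiprocessing in place of MOSIX-managed R jobs.
    import multiprocessing as mp
    import random

    def one_replicate(seed):
        """One independent simulated analysis (a stand-in for, e.g., an eQTL replicate)."""
        rng = random.Random(seed)
        return sum(rng.gauss(0.0, 1.0) for _ in range(10_000)) / 10_000

    if __name__ == "__main__":
        with mp.Pool() as pool:                 # one worker per available core
            results = pool.map(one_replicate, range(100))
        print(min(results), max(results))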

  20. High-throughput parallel SPM for metrology, defect, and mask inspection

    Science.gov (United States)

    Sadeghian, H.; Herfst, R. W.; van den Dool, T. C.; Crowcombe, W. E.; Winters, J.; Kramer, G. F. I. J.

    2014-10-01

    Scanning probe microscopy (SPM) is a promising candidate for accurate assessment of metrology and defects on wafers and masks; however, it has traditionally been too slow for high-throughput applications, although recent developments have significantly pushed the speed of SPM [1,2]. In this paper we present new results obtained with our previously presented high-throughput parallel SPM system [3,4] that showcase two key advances that are required for a successful deployment of SPM in high-throughput metrology, defect and mask inspection. The first is very fast (up to 40 lines/s) image acquisition and a comparison of the image quality as a function of speed. Secondly, a fast approach method: measurements of the scan-head approaching the sample from 0.2 and 1.0 mm distance in under 1.4 and 6 seconds respectively.

  1. A new massively parallel version of CRYSTAL for large systems on high performance computing architectures.

    Science.gov (United States)

    Orlando, Roberto; Delle Piane, Massimo; Bush, Ian J; Ugliengo, Piero; Ferrabone, Matteo; Dovesi, Roberto

    2012-10-30

    Fully ab initio treatment of complex solid systems needs computational software which is able to efficiently take advantage of the growing power of high performance computing (HPC) architectures. Recent improvements in CRYSTAL, a periodic ab initio code that uses a Gaussian basis set, allow treatment of very large unit cells for crystalline systems on HPC architectures with high parallel efficiency in terms of running time and memory requirements. The latter is a crucial point, due to the trend toward architectures relying on a very high number of cores with associated relatively low memory availability. An exhaustive performance analysis shows that density functional calculations, based on a hybrid functional, of low-symmetry systems containing up to 100,000 atomic orbitals and 8000 atoms are feasible on the most advanced HPC architectures available to European researchers today, using thousands of processors.

  2. Parametric analysis of hollow conductor parallel and coaxial transmission lines for high frequency space power distribution

    Science.gov (United States)

    Jeffries, K. S.; Renz, D. D.

    1984-01-01

    A parametric analysis was performed of transmission cables for transmitting electrical power at high voltage (up to 1000 V) and high frequency (10 to 30 kHz) for high power (100 kW or more) space missions. Large diameter (5 to 30 mm) hollow conductors were considered in closely spaced coaxial configurations and in parallel lines. Formulas were derived to calculate inductance and resistance for these conductors. Curves of cable conductance, mass, inductance, capacitance, resistance, power loss, and temperature were plotted for various conductor diameters, conductor thicknesses, and alternating current frequencies. An example 5 mm diameter coaxial cable with 0.5 mm conductor thickness was calculated to transmit 100 kW at 1000 Vac over 50 m with a power loss of 1900 W, an inductance of 1.45 μH, and a capacitance of 0.07 μF. The computer programs written for this analysis are listed in the appendix.
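
    For orientation, the per-unit-length inductance and capacitance of a coaxial line follow the standard textbook relations below (stated here as general formulas, not reproduced from the report); a and b denote the inner- and outer-conductor radii and \varepsilon_r the relative permittivity of the spacing material:

    % Standard coaxial-line relations (textbook forms, not quoted from the report)
    L' = \frac{\mu_0}{2\pi}\,\ln\frac{b}{a},
    \qquad
    C' = \frac{2\pi\varepsilon_0\varepsilon_r}{\ln(b/a)},
    \qquad
    Z_0 = \sqrt{\frac{L'}{C'}}
        = \frac{1}{2\pi}\sqrt{\frac{\mu_0}{\varepsilon_0\varepsilon_r}}\,\ln\frac{b}{a}.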

  3. Algorithms and Requirements for Measuring Network Bandwidth

    Energy Technology Data Exchange (ETDEWEB)

    Jin, Guojun

    2002-12-08

    This report unveils new algorithms for actively measuring (not estimating) available bandwidths with very low intrusion and for computing cross traffic, thus estimating the physical bandwidth; it provides mathematical proof that the algorithms are accurate, and addresses conditions, requirements, and limitations for new and existing algorithms for measuring network bandwidths. The paper also discusses a number of important terminologies and issues for network bandwidth measurement, and introduces a fundamental parameter - Maximum Burst Size - that is critical for implementing algorithms based on multiple packets.
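
    The packet-train idea behind such active measurements can be sketched in a few lines (an illustrative toy, not the report's algorithm; the timestamps below are simulated rather than captured from a network):

    # Toy sketch (not the report's algorithm): infer path bandwidth from the
    # dispersion of a back-to-back packet train, the basic active-probing idea.
    def bandwidth_from_train(recv_times, packet_size_bytes):
        """Bits per second implied by the arrival spread of a packet train."""
        dispersion = recv_times[-1] - recv_times[0]            # seconds
        bits = 8 * packet_size_bytes * (len(recv_times) - 1)
        return bits / dispersion

    # simulate a 20-packet train of 1500-byte packets through a ~95 Mbit/s bottleneck
    link_bps = 95e6
    recv_times = [i * (8 * 1500) / link_bps for i in range(20)]
    print(f"estimated bandwidth: {bandwidth_from_train(recv_times, 1500) / 1e6:.1f} Mbit/s")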

  4. HIGH RESOLUTION REAL-TIME MEDICAL IMAGING BASED ON PARALLEL BEAMFORMING TECHNIQUE

    Institute of Scientific and Technical Information of China (English)

    Wang Lutao; Jin Gang; Xu Hongbing

    2011-01-01

    Improvement of frame rate is very important for high quality ultrasound imaging of fast-moving structures. It is also one of the key technologies of three-dimensional (3-D) real-time medical imaging. In this paper, we demonstrate a beamforming method which increases the imaging frame rate without sacrificing the quality of the medical images. By using wider and fewer transmit beams in combination with four narrower parallel receive beams, the imaging frame rate is potentially increased by a factor of four. Through employing the full transmit aperture, controlling the mainlobe width, and suppressing the sidelobes of the angular responses, the inherent gain loss of a normal parallel beamformer can be compensated to the maximal degree. Noise and interference signals can also be suppressed effectively. Finally, we show lateral resolution and contrast of ultrasound images comparable to those of a normal single-window weighting beamformer on simulated phantoms of point targets, a cyst, and a 12th-week fetus. As the computational cost is linear in the number of array elements and the same as that of Delay And Sum (DAS) beamformers, this method has great advantages for high frame-rate real-time applications.
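
    The core of such a scheme is ordinary delay-and-sum beamforming evaluated for several receive directions from a single transmit event; the sketch below is a deliberately simplified far-field version with invented array and sampling parameters, not the authors' beamformer.

    # Simplified sketch: delay-and-sum (DAS) beamforming of several parallel receive
    # beams from one transmit event (far-field steering delays, invented parameters).
    import numpy as np

    def das_parallel_beams(rf, element_pos, angles_deg, c=1540.0, fs=40e6):
        """Sum element signals after applying steering delays for each receive angle."""
        n_elem, n_samp = rf.shape
        t = np.arange(n_samp) / fs
        beams = np.zeros((len(angles_deg), n_samp))
        for b, ang in enumerate(np.deg2rad(angles_deg)):
            delays = element_pos * np.sin(ang) / c             # seconds, per element
            for e in range(n_elem):
                beams[b] += np.interp(t - delays[e], t, rf[e], left=0.0, right=0.0)
        return beams

    rng = np.random.default_rng(0)
    element_pos = (np.arange(64) - 31.5) * 0.3e-3              # 64 elements, 0.3 mm pitch
    rf = rng.normal(size=(64, 2048))                           # simulated channel data
    beams = das_parallel_beams(rf, element_pos, angles_deg=[-1.5, -0.5, 0.5, 1.5])
    print(beams.shape)                                         # (4, 2048): four beams per shot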

  5. Novel Highly Parallel and Systolic Architectures Using Quantum Dot-Based Hardware

    Science.gov (United States)

    Fijany, Amir; Toomarian, Benny N.; Spotnitz, Matthew

    1997-01-01

    VLSI technology has made possible the integration of a massive number of components (processors, memory, etc.) into a single chip. In VLSI design, memory and processing power are relatively cheap and the main emphasis of the design is on reducing the overall interconnection complexity since data routing costs dominate the power, time, and area required to implement a computation. Communication is costly because wires occupy the most space on a circuit and it can also degrade clock time. In fact, much of the complexity (and hence the cost) of VLSI design results from minimization of data routing. The main difficulty in VLSI routing is due to the fact that crossing of the lines carrying data, instruction, control, etc. is not possible in a plane. Thus, in order to meet this constraint, the VLSI design aims at keeping the architecture highly regular with local and short interconnection. As a result, while the high level of integration has opened the way for massively parallel computation, practical and full exploitation of such a capability in many applications of interest has been hindered by the constraints on interconnection pattern. More precisely, the use of only localized communication significantly simplifies the design of interconnection architecture but at the expense of a somewhat restricted class of applications. For example, there are currently commercially available products integrating hundreds of simple processor elements within a single chip. However, the lack of an adequate interconnection pattern among these processing elements makes them inefficient for exploiting a large degree of parallelism in many applications.

  6. A High-Efficiency Monolithic DC-DC PFM Boost Converter with Parallel Power MOS Technique

    Directory of Open Access Journals (Sweden)

    Hou-Ming Chen

    2013-01-01

    Full Text Available This paper presents a high-efficiency monolithic dc-dc PFM boost converter designed with a standard TSMC 3.3/5V 0.35 μm CMOS technology. The proposed boost converter combines the parallel power MOS technique with the pulse-frequency modulation (PFM) technique to achieve high efficiency over a wide load current range, extending battery life and reducing the cost for portable systems. The proposed parallel power MOS controller and load current detector exactly determine the size of the power MOS to increase power conversion efficiency at different loads. Post-layout simulation results of the designed circuit show a power conversion efficiency of 74.9–90.7% over a load range from 1 mA to 420 mA with a 1.5 V supply. Moreover, the proposed boost converter has a smaller area and lower cost than those of existing boost converter circuits.

  7. Improving the Bandwidth Selection in Kernel Equating

    Science.gov (United States)

    Andersson, Björn; von Davier, Alina A.

    2014-01-01

    We investigate the current bandwidth selection methods in kernel equating and propose a method based on Silverman's rule of thumb for selecting the bandwidth parameters. In kernel equating, the bandwidth parameters have previously been obtained by minimizing a penalty function. This minimization process has been criticized by practitioners…
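
    For reference, Silverman's rule of thumb mentioned above has a simple closed form, h = 0.9 * min(sd, IQR/1.34) * n^(-1/5); the snippet below computes this standard version (not necessarily the exact adaptation the authors propose for kernel equating).

    # Silverman's rule-of-thumb bandwidth (standard form; the kernel-equating
    # adaptation in the article may differ in detail).
    import numpy as np

    def silverman_bandwidth(x):
        x = np.asarray(x, dtype=float)
        n = x.size
        sigma = x.std(ddof=1)
        iqr = np.subtract(*np.percentile(x, [75, 25]))
        return 0.9 * min(sigma, iqr / 1.34) * n ** (-0.2)

    scores = np.random.default_rng(0).normal(loc=50.0, scale=10.0, size=2000)
    print(round(silverman_bandwidth(scores), 3))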

  8. 47 CFR 2.202 - Bandwidths.

    Science.gov (United States)

    2010-10-01

    ... RULES AND REGULATIONS Emissions § 2.202 Bandwidths. (a) Occupied bandwidth. The frequency bandwidth such.... Facsimile Analogue facsimile by sub-carrier frequency modulation of a single-sideband emission with reduced...: 1980 Hz=1.98 kHz 1K98F3C 5. Composite Emissions (See Table III-B) Radio-relay system,...

  9. Dynamic bandwidth allocation in GPON networks

    DEFF Research Database (Denmark)

    Ozimkiewiez, J.; Ruepp, Sarah Renée; Dittmann, Lars

    2009-01-01

    Two Dynamic Bandwidth Allocation algorithms used for coordination of the available bandwidth between end users in a GPON network have been simulated using OPNET to determine and compare the performance, scalability and efficiency of status reporting and non status reporting dynamic bandwidth allo...

  10. MulticoreBSP for C : A high-performance library for shared-memory parallel programming

    NARCIS (Netherlands)

    Yzelman, A. N.; Bisseling, R. H.; Roose, D.; Meerbergen, K.

    2014-01-01

    The bulk synchronous parallel (BSP) model, as well as parallel programming interfaces based on BSP, classically target distributed-memory parallel architectures. In earlier work, Yzelman and Bisseling designed a MulticoreBSP for Java library specifically for shared-memory architectures. In the prese

  11. The parallel reaction monitoring method contributes to a highly sensitive polyubiquitin chain quantification

    Energy Technology Data Exchange (ETDEWEB)

    Tsuchiya, Hikaru; Tanaka, Keiji, E-mail: tanaka-kj@igakuken.or.jp; Saeki, Yasushi, E-mail: saeki-ys@igakuken.or.jp

    2013-06-28

    Highlights: •The parallel reaction monitoring method was applied to ubiquitin quantification. •The ubiquitin PRM method is highly sensitive even in biological samples. •Using the method, we revealed that Ufd4 assembles the K29-linked ubiquitin chain. -- Abstract: Ubiquitylation is an essential posttranslational protein modification that is implicated in a diverse array of cellular functions. Although cells contain eight structurally distinct types of polyubiquitin chains, detailed function of several chain types including K29-linked chains has remained largely unclear. Current mass spectrometry (MS)-based quantification methods are highly inefficient for low abundant atypical chains, such as K29- and M1-linked chains, in complex mixtures that typically contain highly abundant proteins. In this study, we applied parallel reaction monitoring (PRM), a quantitative, high-resolution MS method, to quantify ubiquitin chains. The ubiquitin PRM method allows us to quantify 100 attomole amounts of all possible ubiquitin chains in cell extracts. Furthermore, we quantified ubiquitylation levels of ubiquitin-proline-β-galactosidase (Ub-P-βgal), a historically known model substrate of the ubiquitin fusion degradation (UFD) pathway. In wild-type cells, Ub-P-βgal is modified with ubiquitin chains consisting of 21% K29- and 78% K48-linked chains. In contrast, K29-linked chains are not detected in UFD4 knockout cells, suggesting that Ufd4 assembles the K29-linked ubiquitin chain(s) on Ub-P-βgal in vivo. Thus, the ubiquitin PRM is a novel, useful, quantitative method for analyzing the highly complicated ubiquitin system.

  12. Parallel Backprojection: A Case Study in High-Performance Reconfigurable Computing

    Directory of Open Access Journals (Sweden)

    Cordes Ben

    2009-01-01

    Full Text Available High-performance reconfigurable computing (HPRC) is a novel approach to provide large-scale computing power to modern scientific applications. Using both general-purpose processors and FPGAs allows application designers to exploit fine-grained and coarse-grained parallelism, achieving high degrees of speedup. One scientific application that benefits from this technique is backprojection, an image formation algorithm that can be used as part of a synthetic aperture radar (SAR) processing system. We present an implementation of backprojection for SAR on an HPRC system. Using simulated data taken at a variety of ranges, our implementation runs over 200 times faster than a similar software program, with an overall application speedup better than 50x. The backprojection application is easily parallelizable, achieving near-linear speedup when run on multiple nodes of a clustered HPRC system. The results presented can be applied to other systems and other algorithms with similar characteristics.
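
    The kernel being accelerated can be written down compactly: every image pixel accumulates each range-compressed pulse sampled at that pixel's round-trip range. The NumPy reference below is illustrative only (nearest-bin interpolation, invented geometry), not the FPGA/HPRC implementation evaluated in the article.

    # Illustrative NumPy reference for the backprojection kernel (not the
    # FPGA/HPRC implementation): accumulate each range-compressed pulse at the
    # range bin corresponding to every image pixel.
    import numpy as np

    def backproject(pulses, platform_xy, range_bins, grid_x, grid_y):
        """pulses[p, r] = range-compressed echo p sampled at range_bins[r] (metres)."""
        gx, gy = np.meshgrid(grid_x, grid_y)
        image = np.zeros(gx.shape, dtype=complex)
        for p in range(pulses.shape[0]):
            px, py = platform_xy[p]
            r = np.hypot(gx - px, gy - py)                      # pixel-to-platform range
            idx = np.clip(np.searchsorted(range_bins, r), 0, range_bins.size - 1)
            image += pulses[p, idx]                             # nearest-bin accumulation
        return image

    rng = np.random.default_rng(0)
    range_bins = np.linspace(900.0, 1100.0, 512)
    pulses = rng.normal(size=(64, 512)) + 1j * rng.normal(size=(64, 512))
    platform_xy = np.column_stack([np.linspace(-50.0, 50.0, 64), np.full(64, -1000.0)])
    img = backproject(pulses, platform_xy, range_bins,
                      np.linspace(-20.0, 20.0, 128), np.linspace(-20.0, 20.0, 128))
    print(img.shape)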

  13. Parallel Backprojection: A Case Study in High-Performance Reconfigurable Computing

    Directory of Open Access Journals (Sweden)

    2009-03-01

    Full Text Available High-performance reconfigurable computing (HPRC) is a novel approach to provide large-scale computing power to modern scientific applications. Using both general-purpose processors and FPGAs allows application designers to exploit fine-grained and coarse-grained parallelism, achieving high degrees of speedup. One scientific application that benefits from this technique is backprojection, an image formation algorithm that can be used as part of a synthetic aperture radar (SAR) processing system. We present an implementation of backprojection for SAR on an HPRC system. Using simulated data taken at a variety of ranges, our implementation runs over 200 times faster than a similar software program, with an overall application speedup better than 50x. The backprojection application is easily parallelizable, achieving near-linear speedup when run on multiple nodes of a clustered HPRC system. The results presented can be applied to other systems and other algorithms with similar characteristics.

  14. High Performance Computation of a Jet in Crossflow by Lattice Boltzmann Based Parallel Direct Numerical Simulation

    Directory of Open Access Journals (Sweden)

    Jiang Lei

    2015-01-01

    Full Text Available Direct numerical simulation (DNS) of a round jet in crossflow based on the lattice Boltzmann method (LBM) is carried out on a multi-GPU cluster. The data-parallel SIMT (single instruction, multiple thread) characteristic of the GPU matches the parallelism of the LBM well, which leads to the high efficiency of the GPU on the LBM solver. With the present GPU settings (6 Nvidia Tesla K20M), the present DNS simulation can be completed in several hours. A grid system of 1.5 × 10^8 is adopted and the largest jet Reynolds number reaches 3000. The jet-to-free-stream velocity ratio is set as 3.3. The jet is orthogonal to the mainstream flow direction. The validated code shows good agreement with experiments. Vortical structures of CRVP, shear-layer vortices and horseshoe vortices are presented and analyzed based on velocity fields and vorticity distributions. Turbulent statistical quantities of Reynolds stress are also displayed. Coherent structures are revealed in a very fine resolution based on the second invariant of the velocity gradients.

  15. High-performance SPAD array detectors for parallel photon timing applications

    Science.gov (United States)

    Rech, I.; Cuccato, A.; Antonioli, S.; Cammi, C.; Gulinatti, A.; Ghioni, M.

    2012-02-01

    Over the past few years there has been a growing interest in monolithic arrays of single photon avalanche diodes (SPADs) for spatially resolved detection of faint ultrafast optical signals. SPADs implemented in planar technologies offer the typical advantages of microelectronic devices (small size, ruggedness, low voltage, low power, etc.). Furthermore, they have inherently higher photon detection efficiency than PMTs and are able to provide, besides sensitivities down to single photons, very high acquisition speeds. In order to make SPAD arrays more competitive in time-resolved applications, it is necessary to face problems like electrical crosstalk between adjacent pixels; moreover, all the single-photon timing electronics with picosecond resolution has to be developed. In this paper we present a new instrument suitable for single-photon imaging applications and made up of 32 time-resolved parallel channels. The 32x1 pixel array that includes the SPAD detectors represents the system core, and an embedded data elaboration unit performs on-board data processing for single-photon counting applications. Photon-timing information is exported through a custom parallel cable that can be connected to an external multichannel TCSPC system.

  16. High-Frequency Replanning Under Uncertainty Using Parallel Sampling-Based Motion Planning

    Science.gov (United States)

    Sun, Wen; Patil, Sachin; Alterovitz, Ron

    2015-01-01

    As sampling-based motion planners become faster, they can be re-executed more frequently by a robot during task execution to react to uncertainty in robot motion, obstacle motion, sensing noise, and uncertainty in the robot’s kinematic model. We investigate and analyze high-frequency replanning (HFR), where, during each period, fast sampling-based motion planners are executed in parallel as the robot simultaneously executes the first action of the best motion plan from the previous period. We consider discrete-time systems with stochastic nonlinear (but linearizable) dynamics and observation models with noise drawn from zero mean Gaussian distributions. The objective is to maximize the probability of success (i.e., avoid collision with obstacles and reach the goal) or to minimize path length subject to a lower bound on the probability of success. We show that, as parallel computation power increases, HFR offers asymptotic optimality for these objectives during each period for goal-oriented problems. We then demonstrate the effectiveness of HFR for holonomic and nonholonomic robots including car-like vehicles and steerable medical needles. PMID:26279645

  17. Parallel and series FED microstrip array with high efficiency and low cross polarization

    Science.gov (United States)

    Huang, John (Inventor)

    1995-01-01

    A microstrip array antenna for vertically polarized fan beam (approximately 2 deg x 50 deg) for C-band SAR applications with a physical area of 1.7 m by 0.17 m comprises two rows of patch elements and employs a parallel feed to left- and right-half sections of the rows. Each section is divided into two segments that are fed in parallel with the elements in each segment fed in series through matched transmission lines for high efficiency. The inboard section has half the number of patch elements of the outboard section, and the outboard sections, which have tapered distribution with identical transmission line sections, terminated with half wavelength long open-circuit stubs so that the remaining energy is reflected and radiated in phase. The elements of the two inboard segments of the two left- and right-half sections are provided with tapered transmission lines from element to element for uniform power distribution over the central third of the entire array antenna. The two rows of array elements are excited at opposite patch feed locations with opposite (180 deg difference) phases for reduced cross-polarization.

  18. Highly parallel computational study of amphiphilic molecules using the Wang-Landau method

    Science.gov (United States)

    Vogel, Thomas; Landau, David

    2012-02-01

    The self-assembly process in amphiphilic solutions is a phenomenon of broad interest. Molecular dynamics simulations generally used to study micelle formation or lipid layer assembly in an explicit solvent are limited in time scale. Vast studies of structure formation processes via standard Markov-chain based Monte Carlo simulations are challenging, but the Wang-Landau method [1] provides a way to efficiently study such systems in a generalized thermodynamic ensemble. This makes it possible, for example, to get results over a broad temperature range from a single simulation. In an attempt to develop highly parallel applications using this method, we study the thermodynamic behavior of a generic coarse-grained model for amphiphilic molecules [2] as well as of a new coarse-grained lipid model specifically designed for dimyristoyl phosphatidylcholine (DMPC) [3]. Here, we focus on the design and the performance of our parallel Wang-Landau simulation on multi-CPU and GPU systems. [1] F. Wang and D.P. Landau, Phys. Rev. Lett. 86, 2050 (2001) [2] S. Fujiwara et al., J. Chem. Phys. 130, 144901 (2009) [3] W. Shinoda et al., J. Phys. Chem. B 114, 6836 (2010)
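
    The Wang-Landau iteration referenced above is compact enough to sketch for a toy system; the version below estimates the (flat) density of states of a 1-D random walker over discrete energy levels, purely to illustrate the flat-histogram logic rather than the amphiphile or lipid models themselves.

    # Toy Wang-Landau sketch: flat-histogram estimate of ln g(E) for a 1-D walker
    # over discrete levels (illustrates the method, not the coarse-grained models).
    import math
    import random

    n_levels = 16
    log_g = [0.0] * n_levels          # running estimate of ln g(E)
    hist = [0] * n_levels
    f = 1.0                           # ln of the modification factor
    state = 0
    rng = random.Random(0)

    while f > 1e-4:
        # propose a neighbouring level, accept with probability min(1, g(old)/g(new))
        new = max(0, min(n_levels - 1, state + rng.choice((-1, 1))))
        if rng.random() < math.exp(min(0.0, log_g[state] - log_g[new])):
            state = new
        log_g[state] += f
        hist[state] += 1
        # once the visit histogram is roughly flat, halve ln f and reset it
        if min(hist) > 0.8 * sum(hist) / n_levels:
            f *= 0.5
            hist = [0] * n_levels

    print([round(x - log_g[0], 2) for x in log_g])   # should be roughly flat (all ~0)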

  19. Characterization of Mn-modified Pb(Mg1/3Nb2/3)O3-PbZrO3-PbTiO3 single crystals for high power broad bandwidth transducers.

    Science.gov (United States)

    Zhang, Shujun; Lee, Sung-Min; Kim, Dong-Ho; Lee, Ho-Yong; Shrout, Thomas R

    2008-09-22

    The effect of MnO2 addition on the dielectric and piezoelectric properties of 0.4Pb(Mg1/3Nb2/3)O3-0.25PbZrO3-0.35PbTiO3 single crystals was investigated. Analogous to acceptor doping in "hard" Pb(Zr,Ti)O3-based polycrystalline materials, the Mn-doped crystals exhibited enhanced mechanical Q (approximately 1050) with low dielectric loss (approximately 0.2%), while maintaining the ultrahigh electromechanical coupling k33 > 90% inherent in domain-engineered single crystals. The effect of acceptor doping was also evident in the build-up of an internal bias (Ei approximately 1.6 kV/cm), shown by a horizontal offset in the polarization-field behavior. Together with the relatively high usage temperature (TR-T approximately 140 degrees C), the Mn-doped crystals are promising candidates for high power and broad bandwidth transducers.

  20. Membrane Transport Processes Analyzed by a Highly Parallel Nanopore Chip System at Single Protein Resolution.

    Science.gov (United States)

    Urban, Michael; Vor der Brüggen, Marc; Tampé, Robert

    2016-08-16

    Membrane protein transport on the single protein level still evades detailed analysis if the substrate translocated is non-electrogenic. Considerable efforts have been made in this field, but techniques enabling automated high-throughput transport analysis, in combination with the solvent-free lipid bilayer techniques required for the analysis of membrane transporters, are rare. This class of transporters, however, is crucial in cell homeostasis and therefore a key target in drug development, and methodologies to gain new insights are desperately needed. The manuscript presented here describes the establishment and handling of a novel biochip for the analysis of membrane protein mediated transport processes at single transporter resolution. The biochip is composed of microcavities enclosed by nanopores; it is highly parallel in its design and can be produced in industrial grade and quantity. Protein-harboring liposomes can be applied directly to the chip surface, forming self-assembled pore-spanning lipid bilayers using SSM techniques (solid supported lipid membranes). Pore-spanning parts of the membrane are freestanding, providing the interface for substrate translocation into or out of the cavity space, which can be followed by multi-spectral fluorescent readout in real time. The establishment of standard operating procedures (SOPs) allows the straightforward formation of protein-harboring lipid bilayers on the chip surface for virtually every membrane protein that can be reconstituted functionally. The sole prerequisite is the establishment of a fluorescent read-out system for non-electrogenic transport substrates. High-content screening applications are accomplishable by the use of automated inverted fluorescent microscopes recording multiple chips in parallel. Large data sets can be analyzed using the freely available custom-designed analysis software. Three-color multi-spectral fluorescent read-out furthermore allows for unbiased data discrimination into different

  1. Improvement of CBQ for bandwidth reclamation of RPR

    Science.gov (United States)

    Huang, Benxiong; Wang, Xiaoling; Xu, Ming; Shi, Lili

    2004-04-01

    The Resilient Packet Ring (RPR) IEEE 802.17 standard is under development as a new high-speed backbone technology for metropolitan area networks (MAN) [1]. Bandwidth reclamation has been addressed in the RPR specifications from draft 0.1 to draft 2.4. According to the specifications, allocated bandwidth can be reused, or reclaimed, by a lower priority service class whenever the reclamation does not affect the service guarantees of any equal or higher priority classes on the local station or on any other station on the ring [2]. The class-based queuing (CBQ) algorithm has been proposed to implement link-sharing [3]. A hierarchical link-sharing structure can be used to specify guidelines for the distribution of 'excess' bandwidth [4], and it can rate-limit all classes to their allocated bandwidth. There are clear similarities between the link-sharing of CBQ and the bandwidth reclamation of RPR. CBQ is a mature technology, while RPR is a new one. The focus of our work is to improve CBQ and make full use of it, so that its link-sharing ideas become suitable for bandwidth reclamation in RPR. In this paper, we present a solution to the reclamation problem, which simulation shows to be effective.
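
    The link-sharing idea that the paper adapts can be illustrated with a toy allocation routine: every class keeps its guaranteed allocation, and whatever an idle class leaves unused is handed to still-backlogged classes in priority order, so guarantees are never violated. This is only a static sketch of the reclamation principle, not the CBQ packet scheduler or the RPR MAC; the class table and capacities below are made up.

```python
# Toy illustration of the link-sharing/reclamation idea: every class keeps
# its guaranteed allocation, and bandwidth left unused by idle classes is
# redistributed to backlogged classes in priority order. This is only a
# static allocation sketch, not the CBQ packet scheduler or the RPR MAC.
def reclaim_bandwidth(classes, link_capacity):
    """classes: list of dicts with 'name', 'allocated', 'demand', 'priority'."""
    # First pass: each class gets min(demand, allocation) -- guarantees are safe.
    grants = {c['name']: min(c['demand'], c['allocated']) for c in classes}
    excess = link_capacity - sum(grants.values())

    # Second pass: hand the excess to still-backlogged classes, highest priority first.
    for c in sorted(classes, key=lambda c: c['priority']):
        if excess <= 0:
            break
        want = c['demand'] - grants[c['name']]
        extra = min(want, excess)
        grants[c['name']] += extra
        excess -= extra
    return grants

# Example: class B is idle, so A and C may reclaim its unused allocation.
classes = [
    {'name': 'A', 'allocated': 40, 'demand': 70, 'priority': 0},
    {'name': 'B', 'allocated': 30, 'demand': 0,  'priority': 1},
    {'name': 'C', 'allocated': 30, 'demand': 50, 'priority': 2},
]
print(reclaim_bandwidth(classes, link_capacity=100))  # {'A': 70, 'B': 0, 'C': 30}
```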

  2. Accurate high-throughput identification of parallel G-quadruplex topology by a new tetraaryl-substituted imidazole.

    Science.gov (United States)

    Hu, Ming-Hao; Chen, Shuo-Bin; Wang, Yu-Qing; Zeng, You-Mei; Ou, Tian-Miao; Li, Ding; Gu, Lian-Quan; Huang, Zhi-Shu; Tan, Jia-Heng

    2016-09-15

    G-quadruplex nucleic acids are four-stranded DNA or RNA secondary structures that are formed in guanine-rich sequences. These structures exhibit extensive structural polymorphism and play a pivotal role in the control of a variety of cellular processes. To date, diverse approaches for high-throughput identification of G-quadruplex structures have been successfully developed, but high-throughput methods for further characterization of their topologies are still lacking. In this study, we report a new tetra-arylimidazole probe psIZCM-1, which was found to display significant and distinctive changes in both the absorption and the fluorescence spectra in the presence of parallel G-quadruplexes but show insignificant changes upon interactions with anti-parallel G-quadruplexes or other non-quadruplex oligonucleotides. In view of this dual-output feature, we used psIZCM-1 to identify the parallel G-quadruplexes from a large set of 314 oligonucleotides (including 300 G-quadruplex-forming oligonucleotides and 14 non-quadruplex oligonucleotides) via a microplate reader and accordingly established a high-throughput method for the characterization of parallel G-quadruplex topologies. The accuracy of this method was greater than 95%, which was much higher than that of the commercial probe NMM. To make the approach more practical, we further combined psIZCM-1 with another G-quadruplex probe IZCM-7 to realize the high-throughput classification of parallel, anti-parallel G-quadruplexes and non-quadruplex structures.

  3. An Integrated Circuit Design of High Efficiency Parallel-SSHI Rectifier for Piezoelectric Energy Harvesting

    Science.gov (United States)

    Hsieh, Y. C.; Chen, J. J.; Chen, H. S.; Wu, W. J.

    2016-11-01

    This paper presents the design and implementation of a rectifier for piezoelectric energy harvesting based on the parallel-synchronized-switch harvesting-on-inductor (P-SSHI) technique, also known as the bias flip circuit [1]. The circuit is implemented in a 0.25 μm CMOS high voltage process with only 0.9648 mm2 of chip area. Post-layout simulation shows that the circuit extracts 336% more power compared with a full-bridge rectifier. The system's average control power loss is 26 μW while operating with a self-made MEMS piezoelectric transducer with an output current of 25 μA at 120 Hz and an internal capacitance of 6.45 nF. The output power is 43.42 μW under an optimal load of 1.5 MΩ.

  4. The high performance parallel algorithm for Unified Gas-Kinetic Scheme

    Science.gov (United States)

    Li, Shiyi; Li, Qibing; Fu, Song; Xu, Jinxiu

    2016-11-01

    A high performance parallel algorithm for UGKS is developed to simulate three-dimensional internal and external flows on arbitrary grid systems. The physical domain and the velocity domain are divided into different blocks and distributed according to a two-dimensional Cartesian topology, with intra-communicators in the physical domain for data exchange and other intra-communicators in the velocity domain for the sum reduction to moment integrals. Numerical results of three-dimensional cavity flow and the flow past a sphere agree well with results from existing studies and validate the applicability of the algorithm. The scalability of the algorithm is tested on both small (1-16) and large (729-5832) processor counts. The tested speed-up ratio is near linear and thus the efficiency is around 1, which reveals the good scalability of the present algorithm.
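
    The two-level domain decomposition described above can be mimicked with mpi4py: a 2D Cartesian process grid is split into sub-communicators, one direction standing for the physical-domain blocks (boundary data exchange) and the other for the velocity-domain blocks (sum reduction to moment integrals). The 4x3 grid shape and the dummy partial moment are placeholders, not the actual UGKS data layout.

```python
# Minimal mpi4py sketch of the two-level decomposition: a 2D Cartesian
# process grid is split into sub-communicators, one direction for the
# physical-domain blocks (boundary exchange) and the other for the
# velocity-domain blocks (sum reduction to moment integrals).
# The 4x3 grid and the dummy "moment" value are placeholders.
# Run with e.g.: mpiexec -n 12 python ugks_topology.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
cart = comm.Create_cart(dims=[4, 3], periods=[False, False], reorder=True)
phys_block, vel_block = cart.Get_coords(cart.Get_rank())

# Keep the physical direction: used for exchanging block boundary data.
phys_comm = cart.Sub([True, False])
# Keep the velocity direction: used for reducing partial moment integrals.
vel_comm = cart.Sub([False, True])

# Each rank holds a partial moment integral over its velocity-space block.
partial_moment = np.array([float(vel_block + 1)])
total_moment = np.zeros_like(partial_moment)
vel_comm.Allreduce(partial_moment, total_moment, op=MPI.SUM)

# Neighbouring physical blocks would exchange interface data via these ranks.
left, right = cart.Shift(direction=0, disp=1)

print(f"rank {comm.Get_rank()}: phys block {phys_block}, vel block {vel_block}, "
      f"moment sum = {total_moment[0]}")
```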

  5. Evaluation of emerging parallel optical link technology for high energy physics

    Science.gov (United States)

    Chramowicz, J.; Kwan, S.; Prosser, A.; Winchell, M.

    2012-01-01

    Modern particle detectors utilize optical fiber links to deliver event data to upstream trigger and data processing systems. Future detector systems can benefit from the development of dense arrangements of high speed optical links emerging from industry advancements in transceiver technology. Supporting data transfers of up to 120 Gbps in each direction, optical engines permit assembly of the optical transceivers in close proximity to ASICs and FPGAs. Test results of some of these parallel components will be presented, including the development of pluggable FPGA Mezzanine Cards equipped with optical engines to be provided to collaborators on the Versatile Link Common Project for the HL-LHC at CERN. This work was supported by Fermi Research Alliance, LLC under contract No. DE-AC02-07CH11359 with the United States Department of Energy.

  6. HPC-NMF: A High-Performance Parallel Algorithm for Nonnegative Matrix Factorization

    Energy Technology Data Exchange (ETDEWEB)

    2016-08-22

    NMF is a useful tool for many applications in different domains such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, there is a lack of efficient distributed algorithms to solve the problem for big data sets. We propose a high-performance distributed-memory parallel algorithm that computes the factorization by iteratively solving alternating non-negative least squares (NLS) subproblems for W and H. It maintains the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). As opposed to previous implementations, our algorithm is also flexible: it performs well for both dense and sparse matrices, and allows the user to choose any one of multiple algorithms for solving the updates to the low rank factors W and H within the alternating iterations.
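
    A single-node sketch of the alternating non-negative least squares (ANLS) iteration mentioned above is given below; the distributed-memory data layout and MPI communication that the paper contributes are omitted, and SciPy's dense NNLS solver is applied column by column purely for illustration.

```python
# Single-node sketch of the alternating non-negative least squares (ANLS)
# iteration for NMF, A ~= W @ H with W, H >= 0. The distributed-memory/MPI
# machinery described in the abstract is omitted; SciPy's dense NNLS solver
# is used column by column purely for illustration.
import numpy as np
from scipy.optimize import nnls

def nls(B, C):
    """Solve min_{X >= 0} ||B @ X - C||_F column by column."""
    X = np.zeros((B.shape[1], C.shape[1]))
    for j in range(C.shape[1]):
        X[:, j], _ = nnls(B, C[:, j])
    return X

def anls_nmf(A, rank, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, rank))
    for _ in range(n_iter):
        H = nls(W, A)          # fix W, solve for H >= 0
        W = nls(H.T, A.T).T    # fix H, solve for W >= 0
    return W, H

A = np.abs(np.random.default_rng(1).random((60, 40)))
W, H = anls_nmf(A, rank=5)
print("relative error:", np.linalg.norm(A - W @ H) / np.linalg.norm(A))
```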

  7. High performance parallel computing of flows in complex geometries: II. Applications

    Energy Technology Data Exchange (ETDEWEB)

    Gourdain, N; Gicquel, L; Staffelbach, G; Vermorel, O; Duchaine, F; Boussuge, J-F [Computational Fluid Dynamics Team, CERFACS, Toulouse, 31057 (France); Poinsot, T [Institut de Mecanique des Fluides de Toulouse, Toulouse, 31400 (France)], E-mail: Nicolas.gourdain@cerfacs.fr

    2009-01-01

    Present regulations in terms of pollutant emissions, noise and economic constraints require new approaches and designs in the fields of energy supply and transportation. It is now well established that the next breakthrough will come from a better understanding of unsteady flow effects and from considering the entire system rather than only isolated components. However, these aspects are still not well taken into account by numerical approaches, nor well understood, whatever the design stage considered. The main challenge is essentially due to the computational requirements implied by such complex systems if they are to be simulated on supercomputers. This paper shows how these new challenges can be addressed by using parallel computing platforms for distinct elements of more complex systems as encountered in aeronautical applications. Based on numerical simulations performed with modern aerodynamic and reactive flow solvers, this work underlines the interest of high-performance computing for solving flows in complex industrial configurations such as aircraft, combustion chambers and turbomachines. Performance indicators related to parallel computing efficiency are presented, showing that establishing fair criteria is a difficult task for complex industrial applications. Examples of numerical simulations performed in industrial systems are also described, with particular attention to the computational time and the potential design improvements obtained with high-fidelity and multi-physics computing methods. These simulations use either unsteady Reynolds-averaged Navier-Stokes methods or large eddy simulation and deal with turbulent unsteady flows, such as coupled flow phenomena (thermo-acoustic instabilities, buffet, etc.). Some examples of the difficulties with grid generation and data analysis are also presented when dealing with these complex industrial applications.

  8. A Highly Parallel Implementation of K-Means for Multithreaded Architecture

    Energy Technology Data Exchange (ETDEWEB)

    Mackey, Patrick S.; Feo, John T.; Wong, Pak C.; Chen, Yousu

    2011-04-06

    We present a parallel implementation of the popular k-means clustering algorithm for massively multithreaded computer systems, as well as a parallelized version of the KKZ seed selection algorithm. We demonstrate that as system size increases, sequential seed selection can become a bottleneck. We also present an early attempt at parallelizing k-means that highlights critical performance issues when programming massively multithreaded systems. For our case studies, we used data collected from electric power simulations and run on the Cray XMT.
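
    The two ingredients named in the abstract, KKZ seeding followed by Lloyd iterations, can be sketched compactly with NumPy; the massively multithreaded Cray XMT implementation itself is not reproduced here.

```python
# Compact sketch of KKZ seeding followed by Lloyd's k-means iterations,
# vectorized with NumPy. The massively multithreaded Cray XMT version
# described in the abstract is not reproduced here.
import numpy as np

def kkz_seeds(X, k):
    """KKZ: first seed is the point of largest norm; each next seed is the
    point farthest from its nearest already-chosen seed."""
    seeds = [X[np.argmax(np.linalg.norm(X, axis=1))]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - s, axis=1) for s in seeds], axis=0)
        seeds.append(X[np.argmax(d)])
    return np.array(seeds)

def kmeans(X, k, n_iter=100):
    centers = kkz_seeds(X, k)
    for _ in range(n_iter):
        # Assignment step: nearest center for every point.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # Update step: each center becomes the mean of its cluster.
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

X = np.random.default_rng(0).normal(size=(500, 2))
centers, labels = kmeans(X, k=4)
print(centers)
```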

  9. Modeling a high output marine steam generator feedwater control system which uses parallel turbine-driven feed pumps

    Institute of Scientific and Technical Information of China (English)

    QIU Zhi-qiang; ZOU Hai; SUN Jian-hua

    2008-01-01

    Parallel turbine-driven feedwater pumps are needed when ships travel at high speed. In order to study marine steam generator feedwater control systems which use parallel turbine-driven feed pumps, a mathematical model of marine steam generator feedwater control system was developed which includes mathematical models of two steam generators and parallel turbine-driven feed pumps as well as mathematical models of feedwater pipes and feed regulating valves. The operating condition points of the parallel turbine-driven feed pumps were calculated by the Chebyshev curve fit method. A water level controller for the steam generator and a rotary speed controller for the turbine-driven feed pumps were also included in the model. The accuracy of the mathematical models and their controllers was verified by comparing their results with those from a simulator.

  10. A High-Order Accurate Parallel Solver for Maxwell's Equations on Overlapping Grids

    Energy Technology Data Exchange (ETDEWEB)

    Henshaw, W D

    2005-09-23

    A scheme for the solution of the time dependent Maxwell's equations on composite overlapping grids is described. The method uses high-order accurate approximations in space and time for Maxwell's equations written as a second-order vector wave equation. High-order accurate symmetric difference approximations to the generalized Laplace operator are constructed for curvilinear component grids. The modified equation approach is used to develop high-order accurate approximations that only use three time levels and have the same time-stepping restriction as the second-order scheme. Discrete boundary conditions for perfect electrical conductors and for material interfaces are developed and analyzed. The implementation is optimized for component grids that are Cartesian, resulting in a fast and efficient method. The solver runs on parallel machines with each component grid distributed across one or more processors. Numerical results in two- and three-dimensions are presented for the fourth-order accurate version of the method. These results demonstrate the accuracy and efficiency of the approach.
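
    As a much-simplified stand-in for the time stepping described above, the sketch below advances the 1D scalar wave equation with the standard three-level, second-order update on a single Cartesian grid; the paper's fourth-order accuracy, overlapping curvilinear grids and conductor/interface boundary treatment are not represented.

```python
# Simplified stand-in for the time stepping described above: the standard
# three-level, second-order update for the 1D scalar wave equation
# u_tt = c^2 u_xx on a single Cartesian grid. The paper's method is
# fourth-order, works on composite overlapping curvilinear grids, and
# handles conductor/interface boundary conditions; none of that is shown.
import numpy as np

c, L, nx, nt = 1.0, 1.0, 201, 400
dx = L / (nx - 1)
dt = 0.5 * dx / c                          # CFL-stable time step
x = np.linspace(0.0, L, nx)

u_prev = np.exp(-200.0 * (x - 0.5) ** 2)   # initial pulse
u = u_prev.copy()                          # zero initial velocity
lam2 = (c * dt / dx) ** 2

for _ in range(nt):
    u_next = np.empty_like(u)
    # Three-level update: u^{n+1} = 2u^n - u^{n-1} + (c dt/dx)^2 * D+D- u^n
    u_next[1:-1] = (2.0 * u[1:-1] - u_prev[1:-1]
                    + lam2 * (u[2:] - 2.0 * u[1:-1] + u[:-2]))
    u_next[0] = u_next[-1] = 0.0           # Dirichlet (conductor-like) ends
    u_prev, u = u, u_next

print("max |u| after propagation:", np.abs(u).max())
```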

  12. Parallelized multi-graphics processing unit framework for high-speed Gabor-domain optical coherence microscopy.

    Science.gov (United States)

    Tankam, Patrice; Santhanam, Anand P; Lee, Kye-Sung; Won, Jungeun; Canavesi, Cristina; Rolland, Jannick P

    2014-07-01

    Gabor-domain optical coherence microscopy (GD-OCM) is a volumetric high-resolution technique capable of acquiring three-dimensional (3-D) skin images with histological resolution. Real-time image processing is needed to enable GD-OCM imaging in a clinical setting. We present a parallelized and scalable multi-graphics processing unit (GPU) computing framework for real-time GD-OCM image processing. A parallelized control mechanism was developed to individually assign computation tasks to each of the GPUs. For each GPU, the optimal number of amplitude-scans (A-scans) to be processed in parallel was selected to maximize GPU memory usage and core throughput. We investigated five computing architectures for computational speed-up in processing 1000×1000 A-scans. The proposed parallelized multi-GPU computing framework enables processing at a computational speed faster than the GD-OCM image acquisition, thereby facilitating high-speed GD-OCM imaging in a clinical setting. Using two parallelized GPUs, the image processing of a 1×1×0.6  mm3 skin sample was performed in about 13 s, and the performance was benchmarked at 6.5 s with four GPUs. This work thus demonstrates that 3-D GD-OCM data may be displayed in real-time to the examiner using parallelized GPU processing.

  13. Parallelized multi–graphics processing unit framework for high-speed Gabor-domain optical coherence microscopy

    Science.gov (United States)

    Tankam, Patrice; Santhanam, Anand P.; Lee, Kye-Sung; Won, Jungeun; Canavesi, Cristina; Rolland, Jannick P.

    2014-01-01

    Abstract. Gabor-domain optical coherence microscopy (GD-OCM) is a volumetric high-resolution technique capable of acquiring three-dimensional (3-D) skin images with histological resolution. Real-time image processing is needed to enable GD-OCM imaging in a clinical setting. We present a parallelized and scalable multi-graphics processing unit (GPU) computing framework for real-time GD-OCM image processing. A parallelized control mechanism was developed to individually assign computation tasks to each of the GPUs. For each GPU, the optimal number of amplitude-scans (A-scans) to be processed in parallel was selected to maximize GPU memory usage and core throughput. We investigated five computing architectures for computational speed-up in processing 1000×1000 A-scans. The proposed parallelized multi-GPU computing framework enables processing at a computational speed faster than the GD-OCM image acquisition, thereby facilitating high-speed GD-OCM imaging in a clinical setting. Using two parallelized GPUs, the image processing of a 1×1×0.6  mm3 skin sample was performed in about 13 s, and the performance was benchmarked at 6.5 s with four GPUs. This work thus demonstrates that 3-D GD-OCM data may be displayed in real-time to the examiner using parallelized GPU processing. PMID:24695868

  14. High Performance Scheduling in Parallel Heterogeneous Multiprocessor Systems Using Evolutionary Algorithms

    Directory of Open Access Journals (Sweden)

    Mohammad Sadeq Garshasbi

    2013-10-01

    Scheduling is the process of improving the performance of a parallel and distributed system. Parallel systems are part of distributed systems; they run parallel jobs that can execute simultaneously on several processors. Load balancing and scheduling are very important and complex problems in multiprocessor systems; indeed, they are NP-complete problems. In this paper, we introduce a method based on genetic algorithms for scheduling and load balancing in parallel heterogeneous multi-processor systems. The simulation results indicate that the genetic algorithm for scheduling in such systems is better than LPT, SPT and FIFO: it reduces total response time and also increases utilization.
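
    A minimal genetic-algorithm scheduler along the lines described above can be sketched as follows; the chromosome simply maps each task to a processor and the fitness is the makespan. The task lengths, processor speeds and GA operators are illustrative choices, not those of the paper.

```python
# Small sketch of a genetic algorithm for mapping independent tasks onto
# heterogeneous processors, minimizing makespan. The chromosome is simply
# task -> processor; task lengths and processor speeds are made up.
import random

TASKS = [random.randint(5, 50) for _ in range(40)]     # task lengths
SPEEDS = [1.0, 1.5, 2.0, 3.0]                          # processor speeds

def makespan(assign):
    loads = [0.0] * len(SPEEDS)
    for task, proc in zip(TASKS, assign):
        loads[proc] += task / SPEEDS[proc]
    return max(loads)

def ga_schedule(pop_size=60, generations=200, mutation=0.02):
    pop = [[random.randrange(len(SPEEDS)) for _ in TASKS] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=makespan)
        survivors = pop[:pop_size // 2]                # elitist selection
        children = []
        while len(children) < pop_size - len(survivors):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, len(TASKS))      # one-point crossover
            child = a[:cut] + b[cut:]
            child = [random.randrange(len(SPEEDS)) if random.random() < mutation else g
                     for g in child]                   # per-gene mutation
            children.append(child)
        pop = survivors + children
    return min(pop, key=makespan)

best = ga_schedule()
print("best makespan:", makespan(best))
```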

  15. Integration experiences and performance studies of A COTS parallel archive systems

    Energy Technology Data Exchange (ETDEWEB)

    Chen, Hsing-bung [Los Alamos National Laboratory; Scott, Cody [Los Alamos National Laboratory; Grider, Gary [Los Alamos National Laboratory; Torres, Aaron [Los Alamos National Laboratory; Turley, Milton [Los Alamos National Laboratory; Sanchez, Kathy [Los Alamos National Laboratory; Bremer, John [Los Alamos National Laboratory

    2010-01-01

    Current and future Archive Storage Systems have been asked to (a) scale to very high bandwidths, (b) scale in metadata performance, (c) support policy-based hierarchical storage management capability, (d) scale in supporting changing needs of very large data sets, (e) support standard interface, and (f) utilize commercial-off-the-shelf(COTS) hardware. Parallel file systems have been asked to do the same thing but at one or more orders of magnitude faster in performance. Archive systems continue to move closer to file systems in their design due to the need for speed and bandwidth, especially metadata searching speeds such as more caching and less robust semantics. Currently the number of extreme highly scalable parallel archive solutions is very small especially those that will move a single large striped parallel disk file onto many tapes in parallel. We believe that a hybrid storage approach of using COTS components and innovative software technology can bring new capabilities into a production environment for the HPC community much faster than the approach of creating and maintaining a complete end-to-end unique parallel archive software solution. In this paper, we relay our experience of integrating a global parallel file system and a standard backup/archive product with a very small amount of additional code to provide a scalable, parallel archive. Our solution has a high degree of overlap with current parallel archive products including (a) doing parallel movement to/from tape for a single large parallel file, (b) hierarchical storage management, (c) ILM features, (d) high volume (non-single parallel file) archives for backup/archive/content management, and (e) leveraging all free file movement tools in Linux such as copy, move, ls, tar, etc. We have successfully applied our working COTS Parallel Archive System to the current world's first petaflop/s computing system, LANL's Roadrunner, and demonstrated its capability to address requirements of

  16. Integration experiments and performance studies of a COTS parallel archive system

    Energy Technology Data Exchange (ETDEWEB)

    Chen, Hsing-bung [Los Alamos National Laboratory; Scott, Cody [Los Alamos National Laboratory; Grider, Gary [Los Alamos National Laboratory; Torres, Aaron [Los Alamos National Laboratory; Turley, Milton [Los Alamos National Laboratory; Sanchez, Kathy [Los Alamos National Laboratory; Bremer, John [Los Alamos National Laboratory

    2010-06-16

    Current and future Archive Storage Systems have been asked to (a) scale to very high bandwidths, (b) scale in metadata performance, (c) support policy-based hierarchical storage management capability, (d) scale in supporting changing needs of very large data sets, (e) support standard interface, and (f) utilize commercial-off-the-shelf (COTS) hardware. Parallel file systems have been asked to do the same thing but at one or more orders of magnitude faster in performance. Archive systems continue to move closer to file systems in their design due to the need for speed and bandwidth, especially metadata searching speeds such as more caching and less robust semantics. Currently the number of extreme highly scalable parallel archive solutions is very small especially those that will move a single large striped parallel disk file onto many tapes in parallel. We believe that a hybrid storage approach of using COTS components and innovative software technology can bring new capabilities into a production environment for the HPC community much faster than the approach of creating and maintaining a complete end-to-end unique parallel archive software solution. In this paper, we relay our experience of integrating a global parallel file system and a standard backup/archive product with a very small amount of additional code to provide a scalable, parallel archive. Our solution has a high degree of overlap with current parallel archive products including (a) doing parallel movement to/from tape for a single large parallel file, (b) hierarchical storage management, (c) ILM features, (d) high volume (non-single parallel file) archives for backup/archive/content management, and (e) leveraging all free file movement tools in Linux such as copy, move, ls, tar, etc. We have successfully applied our working COTS Parallel Archive System to the current world's first petaflop/s computing system, LANL's Roadrunner machine, and demonstrated its capability to address

  17. Ptychography with broad-bandwidth radiation

    Energy Technology Data Exchange (ETDEWEB)

    Enders, B., E-mail: bjoern.enders@ph.tum.de; Dierolf, M.; Stockmar, M.; Pfeiffer, F. [Lehrstuhl für Biomedizinische Physik, Physik-Department and Institut für Medizintechnik, Technische Universität München, 85747 Garching (Germany); Cloetens, P. [European Synchrotron Radiation Facility, 38043 Grenoble (France); Thibault, P. [Department of Physics and Astronomy, University College London, London (United Kingdom)

    2014-04-28

    Ptychography, a scanning Coherent Diffractive Imaging (CDI) technique, has quickly gained momentum as a robust method to deliver quantitative images of extended specimens. A current conundrum for the development of X-ray CDI is the conflict between a need for higher flux to reach higher resolutions and the requirement to strongly filter the incident beam to satisfy the tight coherence prerequisite of the technique. Latest developments in algorithmic treatment of ptychographic data indicate that the technique is more robust than initially assumed, so that some experimental limitations can be substantially relaxed. Here, we demonstrate that ptychography can be conducted in conditions that were up to now considered insufficient, using a broad-bandwidth X-ray beam and an integrating scintillator-based detector. Our work shows the wide applicability of ptychography and paves the way to high-throughput, high-flux diffractive imaging.

  18. Controlling Laser Plasma Instabilities Using Temporal Bandwidth

    Science.gov (United States)

    Tsung, Frank; Weaver, J.; Lehmberg, R.

    2016-10-01

    We are performing particle-in-cell simulations using the code OSIRIS to study the effects of laser plasma interactions in the presence of temporal bandwidth under conditions relevant to current and future experiments on the NIKE laser. Our simulations show that, for sufficiently large bandwidth (where the inverse bandwidth is comparable with the linear growth time), the saturation level and the distribution of hot electrons can be affected by the addition of temporal bandwidth (which can be accomplished in experiments using beam smoothing techniques such as ISI). We will quantify these effects and investigate higher dimensional effects such as laser speckles. This work is supported by DOE and NRL.

  19. An Array Consisting of 10 High-Speed Side-Illuminated Evanescently Coupled Waveguide Photodetectors Each with a Bandwidth of 20 GHz

    Science.gov (United States)

    Lv, Qian-Qian; Ye, Han; Yin, Dong-Dong; Yang, Xiao-Hong; Han, Qin

    2015-12-01

    Supported by the High-Tech Research and Development Program of China under Grant Nos 2013AA031401, 2015AA016902 and 2015AA016904, the National Natural Science Foundation of China under Grant Nos 61176053, 61274069 and 61435002, and the National Basic Research Program of China under Grant No 2012CB933503.

  20. ArchSim: A System-Level Parallel Simulation Platform for the Architecture Design of High Performance Computer

    Institute of Scientific and Technical Information of China (English)

    Yong-Qin Huang; Hong-Liang Li; Xiang-Hui Xie; Lei Qian; Zi-Yu Hao; Feng Guo; Kun Zhang

    2009-01-01

    A high performance computer (HPC) is a complex, huge system, whose architecture design meets increasing difficulties and risks. Traditional methods, such as theoretical analysis, component-level simulation and sequential simulation, are not applicable to system-level simulations of HPC systems. Even parallel simulation using large-scale parallel machines has many difficulties in scalability, reliability, generality, as well as efficiency. According to the current needs of HPC architecture design, this paper proposes a system-level parallel simulation platform: ArchSim. We first introduce the architecture of the ArchSim simulation platform, which is composed of a global server (GS), local server agents (LSA) and entities. Secondly, we emphasize some key techniques of ArchSim, including the synchronization protocol, the communication mechanism and the distributed checkpointing/restart mechanism. We then make a synthesized test of some main performance indices of ArchSim with the phold benchmark and analyze the extra overhead generated by ArchSim. Finally, based on ArchSim, we construct a parallel event-driven interconnection network simulator and a system-level simulator for a small-scale HPC system with 256 processors. The results of the performance test and the HPC system simulations demonstrate that ArchSim can achieve a high speedup ratio and high scalability on the parallel host machine and support system-level simulations for the architecture design of HPC systems.

  1. Achieving high performance in numerical computations on RISC workstations and parallel systems

    Energy Technology Data Exchange (ETDEWEB)

    Goedecker, S. [Max-Planck Inst. for Solid State Research, Stuttgart (Germany); Hoisie, A. [Los Alamos National Lab., NM (United States)

    1997-08-20

    The nominal peak speeds of both serial and parallel computers are rising rapidly. At the same time, however, it is becoming increasingly difficult to extract a significant fraction of this high peak speed from modern computer architectures. In this tutorial the authors give scientists and engineers involved in numerically demanding calculations and simulations the necessary basic knowledge to write reasonably efficient programs. The basic principles are rather simple and the possible rewards large. Writing a program by taking into account optimization techniques related to the computer architecture can significantly speed up your program, often by factors of 10-100. As such, optimizing a program can, for instance, be a much better solution than buying a faster computer. If a few basic optimization principles are applied during program development, the additional time needed for obtaining an efficient program is practically negligible. In-depth optimization is usually only needed for a few subroutines or kernels, and the effort involved is therefore also acceptable.

  2. A low-power column-parallel ADC for high-speed CMOS image sensor

    Science.gov (United States)

    Han, Ye; Li, Quanliang; Shi, Cong; Liu, Liyuan; Wu, Nanjian

    2013-08-01

    This paper presents a 10-bit low-power column-parallel cyclic analog-to-digital converter (ADC) used for a high-speed CMOS image sensor (CIS). An opamp sharing technique is used to save power and area. A correlated double sampling (CDS) circuit and a programmable gain amplifier (PGA) are integrated in the ADC, which avoids stand-alone circuit blocks. An offset cancellation technique is also introduced, which reduces the column fixed-pattern noise (FPN) effectively. A single-channel ADC with an area of less than 0.03 mm2 was implemented in a 0.18 μm 1P4M CMOS image sensor process. The resolution of the proposed ADC is 10 bits, and the conversion rate is 2 MS/s. The measured differential nonlinearity (DNL) and integral nonlinearity (INL) are 0.62 LSB and 2.1 LSB together with CDS, respectively. The power consumption from a 1.8 V supply is only 0.36 mW.

  3. Parallel sort with a ranged, partitioned key-value store in a high perfomance computing environment

    Science.gov (United States)

    Bent, John M.; Faibish, Sorin; Grider, Gary; Torres, Aaron; Poole, Stephen W.

    2016-01-26

    Improved sorting techniques are provided that perform a parallel sort using a ranged, partitioned key-value store in a high performance computing (HPC) environment. A plurality of input data files comprising unsorted key-value data in a partitioned key-value store are sorted. The partitioned key-value store comprises a range server for each of a plurality of ranges. Each input data file has an associated reader thread. Each reader thread reads the unsorted key-value data in the corresponding input data file and performs a local sort of the unsorted key-value data to generate sorted key-value data. A plurality of sorted, ranged subsets of each of the sorted key-value data are generated based on the plurality of ranges. Each sorted, ranged subset corresponds to a given one of the ranges and is provided to one of the range servers corresponding to the range of the sorted, ranged subset. Each range server sorts the received sorted, ranged subsets and provides a sorted range. A plurality of the sorted ranges are concatenated to obtain a globally sorted result.
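
    The pipeline described above (local sort by readers, partitioning into key ranges, merging by range servers, concatenation of the sorted ranges) can be sketched in shared memory with Python's multiprocessing standing in for the partitioned key-value store; the range boundaries and input data below are made up.

```python
# Shared-memory sketch of the described sort pipeline: each reader locally
# sorts its input and splits it into key ranges; each range server merges
# the sorted subsets for its range; concatenating the per-range outputs in
# range order gives a globally sorted result. Python's multiprocessing
# stands in for the partitioned key-value store of the HPC environment.
import bisect
import heapq
from multiprocessing import Pool

RANGE_BOUNDS = [250, 500, 750]          # 4 ranges: <250, <500, <750, rest

def read_and_partition(records):
    """Reader: local sort, then split into sorted, ranged subsets."""
    records = sorted(records)
    cuts = [bisect.bisect_left(records, b) for b in RANGE_BOUNDS]
    return [records[i:j] for i, j in zip([0] + cuts, cuts + [len(records)])]

def range_server(sorted_subsets):
    """Range server: merge the already-sorted subsets for one range."""
    return list(heapq.merge(*sorted_subsets))

if __name__ == "__main__":
    import random
    input_files = [[random.randrange(1000) for _ in range(100)] for _ in range(8)]
    with Pool() as pool:
        partitioned = pool.map(read_and_partition, input_files)   # readers
        per_range = list(zip(*partitioned))       # group subsets by range
        sorted_ranges = pool.map(range_server, per_range)         # servers
    globally_sorted = [k for rng in sorted_ranges for k in rng]   # concatenate
    assert globally_sorted == sorted(k for f in input_files for k in f)
```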

  4. Microdroplet-enabled highly parallel co-cultivation of microbial communities.

    Directory of Open Access Journals (Sweden)

    Jihyang Park

    Microbial interactions in natural microbiota are, in many cases, crucial for the sustenance of the communities, but the precise nature of these interactions remains largely unknown because of the inherent complexity and difficulties in laboratory cultivation. Conventional pure culture-oriented cultivation does not account for these interactions mediated by small molecules, which severely limits its utility in cultivating and studying "unculturable" microorganisms from synergistic communities. In this study, we developed a simple microfluidic device for highly parallel co-cultivation of symbiotic microbial communities and demonstrated its effectiveness in discovering synergistic interactions among microbes. Using aqueous micro-droplets dispersed in a continuous oil phase, the device could readily encapsulate and co-cultivate subsets of a community. A large number of droplets, up to ∼1,400 in a 10 mm × 5 mm chamber, were generated with a frequency of 500 droplets/sec. A synthetic model system consisting of cross-feeding E. coli mutants was used to mimic compositions of symbionts and other microbes in natural microbial communities. Our device was able to detect a pair-wise symbiotic relationship when one partner accounted for as low as 1% of the total population or each symbiont was about 3% of the artificial community.

  5. A fully parallel, high precision, N-body code running on hybrid computing platforms

    CERN Document Server

    Capuzzo-Dolcetta, R; Punzo, D

    2012-01-01

    We present a new implementation of the numerical integration of the classical, gravitational N-body problem based on a high order Hermite integration scheme with block time steps and a direct evaluation of the particle-particle forces. The main innovation of this code (called HiGPUs) is its full parallelization, exploiting both OpenMP and MPI for the multicore Central Processing Units as well as either Compute Unified Device Architecture (CUDA) or OpenCL for the hosted Graphics Processing Units. We tested both performance and accuracy of the code using up to 256 GPUs in the supercomputer IBM iDataPlex DX360M3 Linux Infiniband Cluster provided by the Italian supercomputing consortium CINECA, for values of N up to 8 million. We were able to follow the evolution of a system of 8 million bodies for a few crossing times, a task previously unreached by direct summation codes. The code is freely available to the scientific community.

  6. Parallelization and High-Performance Computing Enables Automated Statistical Inference of Multi-scale Models.

    Science.gov (United States)

    Jagiella, Nick; Rickert, Dennis; Theis, Fabian J; Hasenauer, Jan

    2017-02-22

    Mechanistic understanding of multi-scale biological processes, such as cell proliferation in a changing biological tissue, is readily facilitated by computational models. While tools exist to construct and simulate multi-scale models, the statistical inference of the unknown model parameters remains an open problem. Here, we present and benchmark a parallel approximate Bayesian computation sequential Monte Carlo (pABC SMC) algorithm, tailored for high-performance computing clusters. pABC SMC is fully automated and returns reliable parameter estimates and confidence intervals. By running the pABC SMC algorithm for ∼10^6 hr, we parameterize multi-scale models that accurately describe quantitative growth curves and histological data obtained in vivo from individual tumor spheroid growth in media droplets. The models capture the hybrid deterministic-stochastic behaviors of 10^5-10^6 cells growing in a 3D dynamically changing nutrient environment. The pABC SMC algorithm reliably converges to a consistent set of parameters. Our study demonstrates a proof of principle for robust, data-driven modeling of multi-scale biological systems and the feasibility of multi-scale model parameterization through statistical inference.
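
    A toy version of one parallel ABC generation conveys the core idea: draw parameters from the prior, simulate, and keep draws whose simulated summary lands within a tolerance of the observation. The full pABC SMC machinery (particle weights, perturbation kernels, shrinking tolerances) is only noted in the comments, and the growth model is a stand-in for the multi-scale tumor models.

```python
# Toy sketch of one parallel ABC generation: draw parameters from the prior,
# simulate, and accept draws whose simulated summary lies within a tolerance
# of the observed data. A full pABC SMC scheme would additionally reweight
# and perturb accepted particles and shrink the tolerance over generations;
# the growth model below is a stand-in for the multi-scale tumor models.
import numpy as np
from concurrent.futures import ProcessPoolExecutor

OBSERVED = 120.0          # observed summary statistic (e.g., final spheroid size)

def simulate(rate, seed):
    """Stand-in stochastic growth model: noisy multiplicative growth."""
    rng = np.random.default_rng(seed)
    size = 1.0
    for _ in range(50):
        size *= 1.0 + rate + rng.normal(0, 0.01)
    return size

def abc_trial(seed, tolerance=20.0):
    rng = np.random.default_rng(seed)
    rate = rng.uniform(0.0, 0.2)                 # draw from the prior
    distance = abs(simulate(rate, seed) - OBSERVED)
    return rate if distance < tolerance else None

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        draws = pool.map(abc_trial, range(20000))
    accepted = [r for r in draws if r is not None]
    print(f"accepted {len(accepted)} particles, "
          f"posterior mean rate ~ {np.mean(accepted):.3f}")
```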

  7. Microdroplet-enabled highly parallel co-cultivation of microbial communities.

    Science.gov (United States)

    Park, Jihyang; Kerner, Alissa; Burns, Mark A; Lin, Xiaoxia Nina

    2011-02-25

    Microbial interactions in natural microbiota are, in many cases, crucial for the sustenance of the communities, but the precise nature of these interactions remains largely unknown because of the inherent complexity and difficulties in laboratory cultivation. Conventional pure culture-oriented cultivation does not account for these interactions mediated by small molecules, which severely limits its utility in cultivating and studying "unculturable" microorganisms from synergistic communities. In this study, we developed a simple microfluidic device for highly parallel co-cultivation of symbiotic microbial communities and demonstrated its effectiveness in discovering synergistic interactions among microbes. Using aqueous micro-droplets dispersed in a continuous oil phase, the device could readily encapsulate and co-cultivate subsets of a community. A large number of droplets, up to ∼1,400 in a 10 mm × 5 mm chamber, were generated with a frequency of 500 droplets/sec. A synthetic model system consisting of cross-feeding E. coli mutants was used to mimic compositions of symbionts and other microbes in natural microbial communities. Our device was able to detect a pair-wise symbiotic relationship when one partner accounted for as low as 1% of the total population or each symbiont was about 3% of the artificial community.

  8. Improving the Bandwidth Utilization by Recycling the Unused Bandwidth in IEEE 802.16 Networks

    Directory of Open Access Journals (Sweden)

    Gowri T

    2012-03-01

    The physical and MAC layers have been specified in IEEE 802.16 networks. Quality of service (QoS) is ensured by bandwidth reservation: the subscriber station (SS) reserves bandwidth equal to or greater than its demand, but the reserved bandwidth is not fully utilized by the SS at all times. The unused bandwidth can therefore be reclaimed through a recycling process. The main objective of the proposed scheme is to utilize the unused bandwidth by recycling while maintaining QoS; recycling improves throughput, and the amount of reserved bandwidth is not changed during the recycling process. The proposed scheme can utilize the unused bandwidth up to 70% on average. Protocols and scheduling algorithms are used to improve the utilization and throughput.

  9. A scheme of optical interconnection for super high speed parallel computer

    Institute of Scientific and Technical Information of China (English)

    Youju Mao (毛幼菊); Yi Lü (吕翊); Jiang Liu (刘江); Mingrui Dang (党明瑞)

    2004-01-01

    An optical cross-connection network which adopts coarse wavelength division multiplexing (CWDM) and data packets is introduced. It can be used to realize communication between multiple CPUs and multiple memories in a parallel computing system. It provides an effective way to upgrade the capability of a parallel computer by combining optical wavelength division multiplexing (WDM) and data packet switching technology. CWDM used in network construction, optical cross connection (OXC) based on optical switch arrays, and the data packet format used in network construction are analyzed. We also present an optimization analysis of the number of optical switches needed for different scales of the network. The architecture of the optical interconnection for 8 wavelength channels and 128-bit parallel transmission has been investigated. Finally, a parallel transmission system with 4 nodes and 8 channels per node has been designed.

  10. Mixing subattolitre volumes in a quantitative and highly parallel manner with soft matter nanofluidics

    DEFF Research Database (Denmark)

    Christensen, Sune M.; Bolinger, Pierre-Yves; Hatzakis, Nikos;

    2012-01-01

    Handling and mixing ultrasmall volumes of reactants in parallel can increase the throughput and complexity of screening assays while simultaneously reducing reagent consumption. Microfabricated silicon and plastic can provide reliable fluidic devices, but cannot typically handle total volumes sma...

  11. Parallel Adaptive Mesh Refinement for High-Order Finite-Volume Schemes in Computational Fluid Dynamics

    Science.gov (United States)

    Schwing, Alan Michael

    For computational fluid dynamics, the governing equations are solved on a discretized domain of nodes, faces, and cells. The quality of the grid or mesh can be a driving source for error in the results. While refinement studies can help guide the creation of a mesh, grid quality is largely determined by user expertise and understanding of the flow physics. Adaptive mesh refinement is a technique for enriching the mesh during a simulation based on metrics for error, impact on important parameters, or location of important flow features. This can offload from the user some of the difficult and ambiguous decisions necessary when discretizing the domain. This work explores the implementation of adaptive mesh refinement in an implicit, unstructured, finite-volume solver. Consideration is made for applying modern computational techniques in the presence of hanging nodes and refined cells. The approach is developed to be independent of the flow solver in order to provide a path for augmenting existing codes. It is designed to be applicable for unsteady simulations and refinement and coarsening of the grid does not impact the conservatism of the underlying numerics. The effect on high-order numerical fluxes of fourth- and sixth-order are explored. Provided the criteria for refinement is appropriately selected, solutions obtained using adapted meshes have no additional error when compared to results obtained on traditional, unadapted meshes. In order to leverage large-scale computational resources common today, the methods are parallelized using MPI. Parallel performance is considered for several test problems in order to assess scalability of both adapted and unadapted grids. Dynamic repartitioning of the mesh during refinement is crucial for load balancing an evolving grid. Development of the methods outlined here depend on a dual-memory approach that is described in detail. Validation of the solver developed here against a number of motivating problems shows favorable

  12. High-bandwidth AFM-based rheology is a sensitive indicator of early cartilage aggrecan degradation relevant to mouse models of osteoarthritis.

    Science.gov (United States)

    Nia, Hadi T; Gauci, Stephanie J; Azadi, Mojtaba; Hung, Han-Hwa; Frank, Eliot; Fosang, Amanda J; Ortiz, Christine; Grodzinsky, Alan J

    2015-01-02

    Murine models of osteoarthritis (OA) and post-traumatic OA have been widely used to study the development and progression of these diseases using genetically engineered mouse strains along with surgical or biochemical interventions. However, due to the small size and thickness of murine cartilage, the relationship between mechanical properties, molecular structure and cartilage composition has not been well studied. We adapted a recently developed AFM-based nano-rheology system to probe the dynamic nanomechanical properties of murine cartilage over a wide frequency range of 1 Hz to 10 kHz, and studied the role of glycosaminoglycan (GAG) on the dynamic modulus and poroelastic properties of murine femoral cartilage. We showed that poroelastic properties, highlighting fluid-solid interactions, are more sensitive indicators of loss of mechanical function compared to equilibrium properties in which fluid flow is negligible. These fluid-flow-dependent properties include the hydraulic permeability (an indicator of the resistance of matrix to fluid flow) and the high frequency modulus, obtained at high rates of loading relevant to jumping and impact injury in vivo. Utilizing a fibril-reinforced finite element model, we estimated the poroelastic properties of mouse cartilage over a wide range of loading rates for the first time, and show that the hydraulic permeability increased by a factor of ~16, from k_normal = 7.80×10^-16 ± 1.3×10^-16 m^4/(N s) to k_GAG-depleted = 1.26×10^-14 ± 6.73×10^-15 m^4/(N s), after GAG depletion. The high-frequency modulus, which is related to fluid pressurization and the fibrillar network, decreased significantly after GAG depletion. In contrast, the equilibrium modulus, which is fluid-flow independent, did not show a statistically significant alteration following GAG depletion. Copyright © 2014 Elsevier Ltd. All rights reserved.

  13. Highly sensitive wide bandwidth photodetector based on internal photoemission in CVD grown p-type MoS2/graphene Schottky junction.

    Science.gov (United States)

    Vabbina, PhaniKiran; Choudhary, Nitin; Chowdhury, Al-Amin; Sinha, Raju; Karabiyik, Mustafa; Das, Santanu; Choi, Wonbong; Pala, Nezih

    2015-07-22

    Two dimensional (2D) Molybdenum disulfide (MoS2) has evolved as a promising material for next generation optoelectronic devices owing to its unique electrical and optical properties, such as band gap modulation, high optical absorption, and increased luminescence quantum yield. The 2D MoS2 photodetectors reported in the literature have presented low responsivity compared to silicon based photodetectors. In this study, we assembled atomically thin p-type MoS2 with graphene to form a MoS2/graphene Schottky photodetector where photo generated holes travel from graphene to MoS2 over the Schottky barrier under illumination. We found that the p-type MoS2 forms a Schottky junction with graphene with a barrier height of 139 meV, which results in high photocurrent and wide spectral range of detection with wavelength selectivity. The fabricated photodetector showed excellent photosensitivity with a maximum photo responsivity of 1.26 A/W and a noise equivalent power of 7.8 × 10^-12 W/√Hz at 1440 nm.

  14. High-throughput mass-directed parallel purification incorporating a multiplexed single quadrupole mass spectrometer.

    Science.gov (United States)

    Xu, Rongda; Wang, Tao; Isbell, John; Cai, Zhe; Sykes, Christopher; Brailsford, Andrew; Kassel, Daniel B

    2002-07-01

    We report on the development of a parallel HPLC/MS purification system incorporating an indexed (i.e., multiplexed) ion source. In the method described, each of the flow streams from a parallel array of HPLC columns is directed toward the multiplexed (MUX) ion source and sampled in a time-dependent, parallel manner. A visual basic application has been developed and monitors in real-time the extracted ion current from each sprayer channel. Mass-directed fraction collection is initiated into a parallel array of fraction collectors specific for each of the spray channels. In the first embodiment of this technique, we report on a four-column semipreparative parallel LC/MS system incorporating MUX detection. In this parallel LC/MS application (in which sample loads between 1 and 10 mg on-column are typically made), no cross talk was observed. Ion signals from each of the channels were found reproducible over 192 injections, with interchannel signal variations between 11 and 17%. The visual basic fraction collection application permits preset individual start collection and end collection thresholds for each channel, thereby compensating for the slight variation in signal between sprayers. By incorporating postfraction collector UV detection, we have been able to optimize the valve-triggering delay time with precut transfer tubing between the mass spectrometer and fraction collectors and achieve recoveries greater than 80%. Examples of the MUX-guided, mass-directed fraction purification of both standards and real library reaction mixtures are presented within.

  15. A Novel Dynamic Bandwidth Assignment Algorithm for Multi-Services EPONs

    Institute of Scientific and Technical Information of China (English)

    CHEN Xue; ZHANG Yang; HUANG Xiang; DENG Yu; SUN Shu-he

    2005-01-01

    In this paper we propose a novel Dynamic Bandwidth Assignment (DBA) algorithm for Ethernet-based Passive Optical Networks (EPON) offering multiple kinds of services. To satisfy the crucial Quality of Service (QoS) requirements of Time Division Multiplexing (TDM) service and simultaneously achieve fair and high bandwidth utilization, the algorithm integrates periodic granting for TDM service with polling granting for Ethernet service. Detailed simulation shows that the algorithm guarantees carrier-grade QoS for TDM service, high bandwidth utilization, and good fairness of bandwidth assignment among Optical Network Units (ONUs).

  16. Directing Traffic: Managing Internet Bandwidth Fairly

    Science.gov (United States)

    Paine, Thomas A.; Griggs, Tyler J.

    2008-01-01

    Educational institutions today face budgetary restraints and scarce resources, complicating the decision of how to allot bandwidth for campus network users. Additionally, campus concerns over peer-to-peer networking (specifically outbound Internet traffic) have increased because of bandwidth and copyright issues. In this article, the authors…

  17. 47 CFR 95.633 - Emission bandwidth.

    Science.gov (United States)

    2010-10-01

    ... SERVICES Technical Regulations Technical Standards § 95.633 Emission bandwidth. (a) The authorized... frequencies 151.820 MHz, 151.880 MHz, and 151.940 MHz are limited to 11.25 kHz. (2) Emissions on frequencies...

  18. Energy Bandwidth for Petroleum Refining Processes

    Energy Technology Data Exchange (ETDEWEB)

    none,

    2006-10-01

    The petroleum refining energy bandwidth report analyzes the most energy-intensive unit operations used in U.S. refineries: crude oil distillation, fluid catalytic cracking, catalytic hydrotreating, catalytic reforming, and alkylation. The "bandwidth" provides a snapshot of the energy losses that can potentially be recovered through best practices and technology R&D.

  19. Bandwidth engineering of photonic crystal waveguide bends

    DEFF Research Database (Denmark)

    Borel, Peter Ingo; Frandsen, Lars Hagedorn; Harpøth, Anders;

    2004-01-01

    An effective design principle has been applied to photonic crystal waveguide bends fabricated in silicon-on-insulator material using deep UV lithography resulting in a large increase in the low-loss bandwidth of the bends. Furthermore, it is experimentally demonstrated that the absolute bandwidth...

  20. Bimodal-sized quantum dots for broad spectral bandwidth emitter.

    Science.gov (United States)

    Zhou, Yinli; Zhang, Jian; Ning, Yongqiang; Zeng, Yugang; Zhang, Jianwei; Zhang, Xing; Qin, Li; Wang, Lijun

    2015-12-14

    In this work, a high-power and broadband superluminescent diode (SLD) is achieved utilizing bimodal-sized quantum dots (QDs) as the active material. The device exhibits a 3 dB bandwidth of 178.8 nm with an output power of 1.3 mW under continuous-wave (CW) conditions. A preliminary discussion attributes the spectral behavior of the device to carrier transfer between the small-dot ensemble and the large-dot ensemble. Our result provides a new possibility to further broaden the spectral bandwidth and improve the CW output power of QD-SLDs.

  1. Analysis of Starting Performance of High Power-Factor Induction Motor with Floating Winding in Parallel Connection with Capacitors

    Institute of Scientific and Technical Information of China (English)

    1999-01-01

    In this paper, by using a matrix technique, a dynamic model of a high power-factor induction motor with a floating winding in parallel connection with capacitors is established. Then, the starting performance of this motor is analyzed by computer simulation. By comparison of the tested and computed results, which are in good agreement, the dynamic model and simulation method are verified.

  2. Allocating Bandwidth in Datacenter Networks:A Survey

    Institute of Scientific and Technical Information of China (English)

    陈丽; 李葆春; 李波

    2014-01-01

    Datacenters have played an increasingly essential role as the underlying infrastructure in cloud computing. As implied by the essence of cloud computing, resources in these datacenters are shared by multiple competing entities, which can be either tenants that rent virtual machines (VMs) in a public cloud such as Amazon EC2, or applications that embrace data parallel frameworks like MapReduce in a private cloud maintained by Google. It has been generally observed that with traditional transport-layer protocols allocating link bandwidth in datacenters, network traffic from competing applications interferes with each other, resulting in a severe lack of predictability and fairness of application performance. Such a critical issue has drawn a substantial amount of recent research attention on bandwidth allocation in datacenter networks, with a number of new mechanisms proposed to efficiently and fairly share a datacenter network among competing entities. In this article, we present an extensive survey of existing bandwidth allocation mechanisms in the literature, covering the scenarios of both public and private clouds. We thoroughly investigate their underlying design principles, evaluate the trade-off involved in their design choices and summarize them in a unified design space, with the hope of conveying some meaningful insights for better designs in the future.

  3. A high performance data parallel tensor contraction framework: Application to coupled electro-mechanics

    Science.gov (United States)

    Poya, Roman; Gil, Antonio J.; Ortigosa, Rogelio

    2017-07-01

    The paper presents aspects of implementation of a new high performance tensor contraction framework for the numerical analysis of coupled and multi-physics problems on streaming architectures. In addition to explicit SIMD instructions and smart expression templates, the framework introduces domain specific constructs for the tensor cross product and its associated algebra recently rediscovered by Bonet et al. (2015, 2016) in the context of solid mechanics. The two key ingredients of the presented expression template engine are as follows. First, the capability to mathematically transform complex chains of operations to simpler equivalent expressions, while potentially avoiding routes with higher levels of computational complexity and, second, to perform a compile time depth-first or breadth-first search to find the optimal contraction indices of a large tensor network in order to minimise the number of floating point operations. For optimisations of tensor contraction such as loop transformation, loop fusion and data locality optimisations, the framework relies heavily on compile time technologies rather than source-to-source translation or JIT techniques. Every aspect of the framework is examined through relevant performance benchmarks, including the impact of data parallelism on the performance of isomorphic and nonisomorphic tensor products, the FLOP and memory I/O optimality in the evaluation of tensor networks, the compilation cost and memory footprint of the framework and the performance of tensor cross product kernels. The framework is then applied to finite element analysis of coupled electro-mechanical problems to assess the speed-ups achieved in kernel-based numerical integration of complex electroelastic energy functionals. In this context, domain-aware expression templates combined with SIMD instructions are shown to provide a significant speed-up over the classical low-level style programming techniques.
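
    The framework above is a C++ expression-template engine; purely as a language-agnostic illustration of one idea it describes, namely searching for the contraction order of a tensor network that minimises floating-point operations before evaluating it, the following Python sketch uses NumPy's einsum path search. The tensor shapes and index labels are arbitrary assumptions, not taken from the paper.

      import numpy as np

      # Hypothetical three-tensor network A_ij B_jk C_kl -> D_il with deliberately
      # skewed dimensions so that the contraction order matters.
      A = np.random.rand(8, 512)
      B = np.random.rand(512, 64)
      C = np.random.rand(64, 256)

      # Search for the cheapest pairwise contraction order ahead of evaluation and
      # report the estimated FLOP counts (naive vs. optimised).
      path, info = np.einsum_path('ij,jk,kl->il', A, B, C, optimize='optimal')
      print(info)

      # Evaluate the network re-using the precomputed contraction path.
      D = np.einsum('ij,jk,kl->il', A, B, C, optimize=path)
      print(D.shape)   # (8, 256)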

  4. A computationally efficient parallel Levenberg-Marquardt algorithm for highly parameterized inverse model analyses

    Science.gov (United States)

    Lin, Youzuo; O'Malley, Daniel; Vesselinov, Velimir V.

    2016-09-01

    Inverse modeling seeks model parameters given a set of observations. However, for practical problems, conventional inverse modeling methods can be computationally expensive because the number of measurements is often large and the model parameters are also numerous. We have developed a new, computationally efficient parallel Levenberg-Marquardt method for solving inverse modeling problems with a highly parameterized model space. Levenberg-Marquardt methods require the solution of a linear system of equations which can be prohibitively expensive to compute for moderate to large-scale problems. Our novel method projects the original linear problem down to a Krylov subspace such that the dimensionality of the problem can be significantly reduced. Furthermore, we store the Krylov subspace computed when using the first damping parameter and recycle the subspace for the subsequent damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved using these computational techniques. We apply this new inverse modeling method to invert for random transmissivity fields in 2-D and a random hydraulic conductivity field in 3-D. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) in the model domain. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). Compared with Levenberg-Marquardt methods using standard linear inversion techniques such as QR or SVD methods, our Levenberg-Marquardt method yields a speed-up ratio on the order of ~10^1 to ~10^2 in a multicore computational environment. Therefore, our new inverse modeling method is a powerful tool for characterizing subsurface heterogeneity for moderate to large-scale problems.
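
    As a hedged, single-core illustration of the central idea above (solving the damped Levenberg-Marquardt system in a Krylov subspace via LSQR rather than with a dense QR or SVD factorization), the following Python sketch uses a toy linear problem; the function names and sizes are assumptions, and the Krylov-subspace recycling across damping parameters is not reproduced.

      import numpy as np
      from scipy.sparse.linalg import lsqr

      def lm_step(residual, jacobian, x, lam):
          """One Levenberg-Marquardt update: approximately solve
          min ||J dx + r||^2 + lam * ||dx||^2 with the Krylov method LSQR
          instead of a direct factorisation of the normal equations."""
          r = residual(x)
          J = jacobian(x)
          dx = lsqr(J, -r, damp=np.sqrt(lam))[0]   # 'damp' implements the LM damping
          return x + dx

      # Toy linear "inverse problem", purely illustrative.
      A = np.random.rand(200, 50)
      x_true = np.random.rand(50)
      y = A @ x_true

      residual = lambda x: A @ x - y
      jacobian = lambda x: A                       # linear model: constant Jacobian

      x = np.zeros(50)
      for lam in (1.0, 0.1, 0.01):                 # decreasing damping parameters
          x = lm_step(residual, jacobian, x, lam)
      print(np.linalg.norm(residual(x)))           # should be close to zero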

  5. Turbulence Resolving Flow Simulations of a Francis Turbine in Part Load using Highly Parallel CFD Simulations

    Science.gov (United States)

    Krappel, Timo; Riedelbauch, Stefan; Jester-Zuerker, Roland; Jung, Alexander; Flurl, Benedikt; Unger, Friedeman; Galpin, Paul

    2016-11-01

    The operation of Francis turbines in part load conditions causes high fluctuations and dynamic loads in the turbine and especially in the draft tube. At the hub of the runner outlet a rotating vortex rope within a low pressure zone arises and propagates into the draft tube cone. The investigated part load operating point is at about 72% discharge of best efficiency. To reduce the possible influence of boundary conditions on the solution, a flow simulation of a complete Francis turbine is conducted consisting of spiral case, stay and guide vanes, runner and draft tube. As the flow has a strong swirling component for the chosen operating point, it is very challenging to accurately predict the flow and in particular the flow losses in the diffusor. The goal of this study is to reach significantly better numerical prediction of this flow type. This is achieved by an improved resolution of small turbulent structures. Therefore, the Scale Adaptive Simulation SAS-SST turbulence model - a scale resolving turbulence model - is applied and compared to the widely used RANS-SST turbulence model. The largest mesh contains 300 million elements, which achieves LES-like resolution throughout much of the computational domain. The simulations are evaluated in terms of the hydraulic losses in the machine, evaluation of the velocity field, pressure oscillations in the draft tube and visual comparisons of turbulent flow structures. A pre-release version of ANSYS CFX 17.0 is used in this paper, as this CFD solver has a parallel performance up to several thousands of cores for this application which includes a transient rotor-stator interface to support the relative motion between the runner and the stationary portions of the water turbine.

  6. High throughput whole rumen metagenome profiling using untargeted massively parallel sequencing

    Directory of Open Access Journals (Sweden)

    Ross Elizabeth M

    2012-07-01

    Full Text Available Abstract Background: Variation of microorganism communities in the rumen of cattle (Bos taurus) is of great interest because of possible links to economically or environmentally important traits, such as feed conversion efficiency or methane emission levels. The resolution of studies investigating this variation may be improved by utilizing untargeted massively parallel sequencing (MPS), that is, sequencing without targeted amplification of genes. The objective of this study was to develop a method which used MPS to generate "rumen metagenome profiles", and to investigate whether these profiles were repeatable among samples taken from the same cow. Given that faecal samples are much easier to obtain than rumen fluid samples, we also investigated whether rumen metagenome profiles were predictive of faecal metagenome profiles. Results: Rather than focusing on individual organisms within the rumen, our method used MPS data to generate quantitative rumen microbiome profiles, regardless of taxonomic classifications. The method requires a previously assembled reference metagenome. A number of such reference metagenomes were considered, including two rumen-derived metagenomes, a human faecal microflora metagenome and a reference metagenome made up of publicly available prokaryote sequences. Sequence reads from each test sample were aligned to these references. The "rumen metagenome profile" was generated from the number of the reads that aligned to each contig in the database. We used this method to test the hypothesis that rumen fluid microbial community profiles vary more between cows than within multiple samples from the same cow. Rumen fluid samples were taken from three cows, at three locations within the rumen. DNA from the samples was sequenced on the Illumina GAIIx. When the reads were aligned to a rumen metagenome reference, the rumen metagenome profiles were repeatable (P  Conclusions: We have presented a simple and high throughput method of
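
    As a schematic, hedged illustration of the profiling idea described above (counting reads aligned to each contig of a reference metagenome and comparing the resulting abundance vectors between samples), the Python sketch below uses made-up contig names and alignment lists; it is not the authors' pipeline.

      from collections import Counter
      from scipy.stats import spearmanr

      def metagenome_profile(aligned_contigs, contigs):
          """aligned_contigs: one reference contig ID per aligned read.
          Returns the relative read abundance per contig in a fixed contig order."""
          counts = Counter(aligned_contigs)
          total = sum(counts.values()) or 1
          return [counts.get(c, 0) / total for c in contigs]

      contigs = ['contig_%d' % i for i in range(5)]                  # hypothetical reference
      sample_a = ['contig_0'] * 40 + ['contig_1'] * 30 + ['contig_3'] * 30
      sample_b = ['contig_0'] * 35 + ['contig_1'] * 35 + ['contig_3'] * 30

      profile_a = metagenome_profile(sample_a, contigs)
      profile_b = metagenome_profile(sample_b, contigs)
      rho, p_value = spearmanr(profile_a, profile_b)                 # similarity of profiles
      print(profile_a, profile_b, round(rho, 3))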

  7. Convergent Evolution of Hemoglobin Function in High-Altitude Andean Waterfowl Involves Limited Parallelism at the Molecular Sequence Level.

    Directory of Open Access Journals (Sweden)

    Chandrasekhar Natarajan

    2015-12-01

    Full Text Available A fundamental question in evolutionary genetics concerns the extent to which adaptive phenotypic convergence is attributable to convergent or parallel changes at the molecular sequence level. Here we report a comparative analysis of hemoglobin (Hb) function in eight phylogenetically replicated pairs of high- and low-altitude waterfowl taxa to test for convergence in the oxygenation properties of Hb, and to assess the extent to which convergence in biochemical phenotype is attributable to repeated amino acid replacements. Functional experiments on native Hb variants and protein engineering experiments based on site-directed mutagenesis revealed the phenotypic effects of specific amino acid replacements that were responsible for convergent increases in Hb-O2 affinity in multiple high-altitude taxa. In six of the eight taxon pairs, high-altitude taxa evolved derived increases in Hb-O2 affinity that were caused by a combination of unique replacements, parallel replacements (involving identical-by-state variants with independent mutational origins in different lineages), and collateral replacements (involving shared, identical-by-descent variants derived via introgressive hybridization). In genome scans of nucleotide differentiation involving high- and low-altitude populations of three separate species, function-altering amino acid polymorphisms in the globin genes emerged as highly significant outliers, providing independent evidence for adaptive divergence in Hb function. The experimental results demonstrate that convergent changes in protein function can occur through multiple historical paths, and can involve multiple possible mutations. Most cases of convergence in Hb function did not involve parallel substitutions and most parallel substitutions did not affect Hb-O2 affinity, indicating that the repeatability of phenotypic evolution does not require parallelism at the molecular level.

  8. Convergent Evolution of Hemoglobin Function in High-Altitude Andean Waterfowl Involves Limited Parallelism at the Molecular Sequence Level.

    Science.gov (United States)

    Natarajan, Chandrasekhar; Projecto-Garcia, Joana; Moriyama, Hideaki; Weber, Roy E; Muñoz-Fuentes, Violeta; Green, Andy J; Kopuchian, Cecilia; Tubaro, Pablo L; Alza, Luis; Bulgarella, Mariana; Smith, Matthew M; Wilson, Robert E; Fago, Angela; McCracken, Kevin G; Storz, Jay F

    2015-12-01

    A fundamental question in evolutionary genetics concerns the extent to which adaptive phenotypic convergence is attributable to convergent or parallel changes at the molecular sequence level. Here we report a comparative analysis of hemoglobin (Hb) function in eight phylogenetically replicated pairs of high- and low-altitude waterfowl taxa to test for convergence in the oxygenation properties of Hb, and to assess the extent to which convergence in biochemical phenotype is attributable to repeated amino acid replacements. Functional experiments on native Hb variants and protein engineering experiments based on site-directed mutagenesis revealed the phenotypic effects of specific amino acid replacements that were responsible for convergent increases in Hb-O2 affinity in multiple high-altitude taxa. In six of the eight taxon pairs, high-altitude taxa evolved derived increases in Hb-O2 affinity that were caused by a combination of unique replacements, parallel replacements (involving identical-by-state variants with independent mutational origins in different lineages), and collateral replacements (involving shared, identical-by-descent variants derived via introgressive hybridization). In genome scans of nucleotide differentiation involving high- and low-altitude populations of three separate species, function-altering amino acid polymorphisms in the globin genes emerged as highly significant outliers, providing independent evidence for adaptive divergence in Hb function. The experimental results demonstrate that convergent changes in protein function can occur through multiple historical paths, and can involve multiple possible mutations. Most cases of convergence in Hb function did not involve parallel substitutions and most parallel substitutions did not affect Hb-O2 affinity, indicating that the repeatability of phenotypic evolution does not require parallelism at the molecular level.

  9. Low profile, highly configurable, current sharing paralleled wide band gap power device power module

    Science.gov (United States)

    McPherson, Brice; Killeen, Peter D.; Lostetter, Alex; Shaw, Robert; Passmore, Brandon; Hornberger, Jared; Berry, Tony M

    2016-08-23

    A power module with multiple equalized parallel power paths supports multiple parallel bare-die power devices, constructed with low-inductance equalized current paths for even current sharing and clean switching events. Wide, low-profile power contacts provide low inductance and short current paths, and a large conductor cross-section area provides high current-carrying capacity. An internal gate and source Kelvin interconnection substrate is provided with individual ballast resistors and simple bolted construction. Gate drive connectors are provided on either the left or right side of the module. The module is configurable as half-bridge, full-bridge, common-source, and common-drain topologies.

  10. Numerical investigation of power requirements for ultra-high-speed serial-to-parallel conversion

    DEFF Research Database (Denmark)

    Lillieholm, Mads; Mulvad, Hans Christian Hansen; Palushani, Evarist

    2012-01-01

    We present a numerical bit-error rate investigation of 160-640 Gbit/s serial-to-parallel conversion by four-wave mixing based time-domain optical Fourier transformation, showing an inverse scaling of the required pump energy per bit with the bit rate.

  11. A heterogeneous and parallel computing framework for high-resolution hydrodynamic simulations

    Science.gov (United States)

    Smith, Luke; Liang, Qiuhua

    2015-04-01

    Shock-capturing hydrodynamic models are now widely applied in the context of flood risk assessment and forecasting, accurately capturing the behaviour of surface water over ground and within rivers. Such models are generally explicit in their numerical basis, and can be computationally expensive; this has prohibited full use of high-resolution topographic data for complex urban environments, now easily obtainable through airborne altimetric surveys (LiDAR). As processor clock speed advances have stagnated in recent years, further computational performance gains are largely dependent on the use of parallel processing. Heterogeneous computing architectures (e.g. graphics processing units or compute accelerator cards) provide a cost-effective means of achieving high throughput in cases where the same calculation is performed with a large input dataset. In recent years this technique has been applied successfully for flood risk mapping, such as within the national surface water flood risk assessment for the United Kingdom. We present a flexible software framework for hydrodynamic simulations across multiple processors of different architectures, within multiple computer systems, enabled using OpenCL and Message Passing Interface (MPI) libraries. A finite-volume Godunov-type scheme is implemented using the HLLC approach to solving the Riemann problem, with optional extension to second-order accuracy in space and time using the MUSCL-Hancock approach. The framework is successfully applied on personal computers and a small cluster to provide considerable improvements in performance. The most significant performance gains were achieved across two servers, each containing four NVIDIA GPUs, with a mix of K20, M2075 and C2050 devices. Advantages are found with respect to decreased parametric sensitivity, and thus in reducing uncertainty, for a major fluvial flood within a large catchment during 2005 in Carlisle, England. Simulations for the three-day event could be performed

  12. Scalable parallel programming for high performance seismic simulation on petascale heterogeneous supercomputers

    Science.gov (United States)

    Zhou, Jun

    The 1994 Northridge earthquake in Los Angeles, California, killed 57 people, injured over 8,700 and caused an estimated $20 billion in damage. Petascale simulations are needed in California and elsewhere to provide society with a better understanding of the rupture and wave dynamics of the largest earthquakes at shaking frequencies required to engineer safe structures. As heterogeneous supercomputing infrastructures become more common, numerical developments in earthquake system research are particularly challenged by the dependence on accelerator elements to enable "the Big One" simulations with higher frequency and finer resolution. Reducing time to solution and power consumption are two primary focus areas today for the enabling technology of fault rupture dynamics and seismic wave propagation in realistic 3D models of the crust's heterogeneous structure. This dissertation presents scalable parallel programming techniques for high performance seismic simulation running on petascale heterogeneous supercomputers. A real world earthquake simulation code, AWP-ODC, one of the most advanced earthquake codes to date, was chosen as the base code in this research, and the testbed is based on Titan at Oak Ridge National Laboratory, the world's largest heterogeneous supercomputer. The research work is primarily related to architecture study, computation performance tuning and software system scalability. An earthquake simulation workflow has also been developed to support efficient production sets of simulations. The highlights of the technical development are an aggressive performance optimization focusing on data locality and a notable data communication model that hides the data communication latency. This development results in the optimal computation efficiency and throughput for the 13-point stencil code on heterogeneous systems, which can be extended to general high-order stencil codes. Started from scratch, the hybrid CPU/GPU version of AWP
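
    The dissertation's code is a CUDA+MPI stencil solver; the Python sketch below is only a minimal 1-D illustration, with assumed array sizes, of the communication-hiding pattern it highlights: post non-blocking halo exchanges, update the interior while the messages are in flight, then finish the boundary points.

      import numpy as np
      from mpi4py import MPI

      comm = MPI.COMM_WORLD
      rank, size = comm.Get_rank(), comm.Get_size()
      left, right = (rank - 1) % size, (rank + 1) % size

      n = 1024
      u = np.random.rand(n + 2)            # local field with one ghost cell on each side
      send_l, send_r = u[1:2].copy(), u[n:n + 1].copy()
      recv_l, recv_r = np.empty(1), np.empty(1)

      # 1. Post non-blocking halo exchanges with the neighbouring ranks.
      reqs = [comm.Isend(send_l, dest=left), comm.Isend(send_r, dest=right),
              comm.Irecv(recv_l, source=left), comm.Irecv(recv_r, source=right)]

      # 2. Update the interior points that do not depend on the ghost cells
      #    while the messages are in flight (this hides the communication latency).
      unew = u.copy()
      unew[2:n] = 0.5 * (u[1:n - 1] + u[3:n + 1])

      # 3. Complete the exchange, then update the two boundary points.
      MPI.Request.Waitall(reqs)
      u[0], u[n + 1] = recv_l[0], recv_r[0]
      unew[1] = 0.5 * (u[0] + u[2])
      unew[n] = 0.5 * (u[n - 1] + u[n + 1])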

  13. A System Theoretic Approach to Bandwidth Estimation

    OpenAIRE

    Liebeherr, Jorg; Fidler, Markus; Valaee, Shahrokh

    2008-01-01

    It is shown that bandwidth estimation in packet networks can be viewed in terms of min-plus linear system theory. The available bandwidth of a link or complete path is expressed in terms of a "service curve", which is a function that appears in the network calculus to express the service available to a traffic flow. The service curve is estimated based on measurements of a sequence of probing packets or passive measurements of a sample path of arrivals. It is shown that existing bandwidth...
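
    As a rough illustration of the min-plus system view (not the authors' estimator), the sketch below applies min-plus deconvolution to synthetic cumulative arrival and departure traces; in the network calculus, departures are lower-bounded by the min-plus convolution of arrivals with a service curve, so deconvolving departures by arrivals gives an estimate of the service offered to the probe. All traffic and link parameters are invented.

      import numpy as np

      def minplus_deconv(D, A):
          """(D deconv A)(t) = max over u of D(t+u) - A(u), truncated at the horizon T.
          If D is at least the min-plus convolution of A with a service curve S,
          then D deconv A is an estimate of (an upper bound on) that service curve."""
          T = len(A) - 1
          return np.array([max(D[min(t + u, T)] - A[u] for u in range(T + 1))
                           for t in range(T + 1)])

      # Synthetic trace in unit time slots: a constant-rate probe sending 10 units per
      # slot through a link serving 6 units per slot after a 2-slot latency.
      T = 30
      A = np.array([10.0 * t for t in range(T + 1)])                    # cumulative arrivals
      S_true = np.array([max(0.0, 6.0 * (t - 2)) for t in range(T + 1)])
      D = np.array([min(A[s] + S_true[t - s] for s in range(t + 1))     # greedy system:
                    for t in range(T + 1)])                             # D = A conv S

      S_est = minplus_deconv(D, A)
      print(np.round(S_est[:10], 1))   # recovers the rate-latency curve 6*(t-2)+ here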

  14. Low cost, highly effective parallel computing achieved through a Beowulf cluster.

    Science.gov (United States)

    Bitner, Marc; Skelton, Gordon

    2003-01-01

    A Beowulf cluster is a means of bringing together several computers and using software and network components to make this cluster of computers appear and function as one computer with multiple parallel processors. Such a cluster can provide computing power comparable to that usually found only in very expensive supercomputers or servers.

  15. High Performance Parallel Processing Project: Industrial computing initiative. Progress reports for fiscal year 1995

    Energy Technology Data Exchange (ETDEWEB)

    Koniges, A.

    1996-02-09

    This project is a package of 11 individual CRADAs plus hardware. This innovative project established a three-year multi-party collaboration that is significantly accelerating the availability of commercial massively parallel processing computing software technology to U.S. government, academic, and industrial end-users. This report contains individual presentations from nine principal investigators along with overall program information.

  16. Achieving High Performance in Parallel Applications via Kernel-Application Interaction

    Science.gov (United States)

    1996-04-01

  17. Highly efficient spatial data filtering in parallel using the opensource library CPPPO

    Science.gov (United States)

    Municchi, Federico; Goniva, Christoph; Radl, Stefan

    2016-10-01

    CPPPO is a compilation of parallel data processing routines developed with the aim of creating a library for "scale bridging" (i.e. connecting different scales by means of closure models) in a multi-scale approach. CPPPO features a number of parallel filtering algorithms designed for use with structured and unstructured Eulerian meshes, as well as Lagrangian data sets. In addition, data can be processed on the fly, allowing the collection of relevant statistics without saving individual snapshots of the simulation state. Our library is provided with an interface to the widely-used CFD solver OpenFOAM®, and can be easily connected to any other software package via interface modules. Also, we introduce a novel, extremely efficient approach to parallel data filtering, and show that our algorithms scale super-linearly on multi-core clusters. Furthermore, we provide a guideline for choosing the optimal Eulerian cell selection algorithm depending on the number of CPU cores used. Finally, we demonstrate the accuracy and the parallel scalability of CPPPO in a showcase focusing on heat and mass transfer from a dense bed of particles.

  18. Practical parallel computing

    CERN Document Server

    Morse, H Stephen

    1994-01-01

    Practical Parallel Computing provides information pertinent to the fundamental aspects of high-performance parallel processing. This book discusses the development of parallel applications on a variety of equipment.Organized into three parts encompassing 12 chapters, this book begins with an overview of the technology trends that converge to favor massively parallel hardware over traditional mainframes and vector machines. This text then gives a tutorial introduction to parallel hardware architectures. Other chapters provide worked-out examples of programs using several parallel languages. Thi

  19. HOTB: High precision parallel code for calculation of four-particle harmonic oscillator transformation brackets

    Science.gov (United States)

    Stepšys, A.; Mickevicius, S.; Germanas, D.; Kalinauskas, R. K.

    2014-11-01

    This new version of the HOTB program for calculation of the three- and four-particle harmonic oscillator transformation brackets provides some enhancements and corrections to the earlier version (Germanas et al., 2010) [1]. In particular, the new version allows calculations of harmonic oscillator transformation brackets to be performed in parallel using the MPI parallel communication standard. Moreover, intermediate calculations are carried out with higher precision using GNU quadruple precision and the arbitrary-precision library FMLib [2]. A package of Fortran code is presented. The calculation time of large matrices can be significantly reduced using the effective parallel code. Use of higher precision methods in intermediate calculations increases the stability of the algorithms and extends their validity for larger input values. Catalogue identifier: AEFQ_v4_0 Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEFQ_v4_0.html Program obtainable from: CPC Program Library, Queen’s University of Belfast, N. Ireland Licensing provisions: GNU General Public License, version 3 Number of lines in programs, including test data, etc.: 1711 Number of bytes in distributed programs, including test data, etc.: 11667 Distribution format: tar.gz Program language used: FORTRAN 90 with MPI extensions for parallelism Computer: Any computer with FORTRAN 90 compiler Operating system: Windows, Linux, FreeBSD, Tru64 Unix Has the code been vectorized or parallelized?: Yes, parallelism using MPI extensions. Number of CPUs used: up to 999 RAM (per CPU core): Depending on allocated binomial and trinomial matrices and use of precision; at least 500 MB Catalogue identifier of previous version: AEFQ_v1_0 Journal reference of previous version: Comput. Phys. Comm. 181, Issue 2, (2010) 420-425 Does the new version supersede the previous version? Yes Nature of problem: Calculation of matrices of three-particle harmonic oscillator brackets (3HOB) and four-particle harmonic oscillator brackets (4HOB) in a more

  20. Optical encryption of parallel quadrature phase shift keying signals based on nondegenerate four-wave mixing in highly nonlinear fiber

    Science.gov (United States)

    Cui, Yue; Zhang, Min; Zhan, Yueying; Wang, Danshi; Huang, Shanguo

    2016-08-01

    A scheme for optical parallel encryption/decryption of quadrature phase shift keying (QPSK) signals is proposed, in which three QPSK signals at 10 Gb/s are encrypted and decrypted simultaneously in the optical domain through nondegenerate four-wave mixing in a highly nonlinear fiber. The results of theoretical analysis and simulations show that the scheme can perform high-speed wiretapping against the encryption of parallel signals and receiver sensitivities of encrypted signal and the decrypted signal are -25.9 and -23.8 dBm, respectively, at the forward error correction threshold. The results are useful for designing high-speed encryption/decryption of advanced modulated signals and thus enhancing the physical layer security of optical networks.

  1. Average Bandwidth Allocation Model of WFQ

    Directory of Open Access Journals (Sweden)

    Tomáš Balogh

    2012-01-01

    Full Text Available We present a new iterative method for calculating the average bandwidth assigned to traffic flows by a WFQ scheduler in IP-based NGN networks. The bandwidth assignment calculation is based on the link speed, the assigned weights, and the arrival rate and average packet length (or input rate) of the traffic flows. We validate the model with examples and with simulation results obtained using the NS2 simulator.
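
    A much-reduced sketch of the underlying fair-share idea (progressive filling on a WFQ link, with invented weights and input rates; this is not the iterative formula of the paper) is:

      def wfq_average_share(capacity, weights, demands, tol=1e-9):
          """Progressive filling on a WFQ link: a flow whose input rate is below its
          weighted fair share keeps only its input rate, and the released capacity is
          redistributed among the remaining flows in proportion to their weights."""
          alloc = {}
          active = set(weights)
          cap = float(capacity)
          while active:
              total_w = sum(weights[f] for f in active)
              share = {f: cap * weights[f] / total_w for f in active}
              satisfied = {f for f in active if demands[f] <= share[f] + tol}
              if not satisfied:                    # every remaining flow is greedy
                  alloc.update(share)
                  break
              for f in satisfied:                  # satisfied flows release the surplus
                  alloc[f] = demands[f]
                  cap -= demands[f]
              active -= satisfied
          return alloc

      weights = {'voice': 4, 'video': 3, 'data': 1}                  # hypothetical weights
      demands = {'voice': 10.0, 'video': 60.0, 'data': 50.0}         # input rates, Mbit/s
      print(wfq_average_share(100.0, weights, demands))
      # voice keeps its 10; of the remaining 90, video's 3:1 share (67.5) exceeds its
      # 60 Mbit/s demand, so video is capped at 60 and data receives the final 30.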

  2. Application of parallel liquid chromatography/mass spectrometry for high throughput microsomal stability screening of compound libraries.

    Science.gov (United States)

    Xu, Rongda; Nemes, Csaba; Jenkins, Kelly M; Rourick, Robyn A; Kassel, Daniel B; Liu, Charles Z C

    2002-02-01

    Solution-phase and solid-phase parallel synthesis and high throughput screening have enabled biologically active and selective compounds to be identified at an unprecedented rate. The challenge has been to convert these hits into viable development candidates. To accelerate the conversion of these hits into lead development candidates, early assessment of the physicochemical and pharmacological properties of these compounds is being made. In particular, in vitro absorption, distribution, metabolism, and elimination (ADME) assays are being conducted at earlier and earlier stages of discovery with the goal of reducing the attrition rate of these potential drug candidates as they progress through development. In this report, we present an eight-channel parallel liquid chromatography/mass spectrometry (LC/MS) system in combination with custom Visual Basic and Applescript automated data processing applications for high throughput early ADME. The parallel LC/MS system was configured with one set of gradient LC pumps and an eight-channel multiple probe autosampler. The flow was split equivalently into eight streams before the multiple probe autosampler and recombined after the eight columns and just prior to the mass spectrometer ion source. The system was tested for column-to-column variation and for reproducibility over a 17 h period (approximately 500 injections per column). The variations in retention time and peak area were determined to be less than 2 and 10%, respectively, in both tests. The parallel LC/MS system described permits time-course microsomal incubations (t0, t5, t15, t30) to be measured in triplicate and enables estimations of t1/2 microsomal stability. The parallel LC/MS system is capable of analyzing up to 240 samples per hour and permits the complete profiling of up to two microtiter plates of compounds per day (i.e., 176 test substrate compounds + sixteen controls).

  3. Investigation of Diagonal Antenna-Chassis Mode in Mobile Terminal LTE MIMO Antennas for Bandwidth Enhancement

    DEFF Research Database (Denmark)

    Zhang, Shuai; Zhao, Kun; Ying, Zhinong

    2015-01-01

    A diagonal antenna-chassis mode is investigated in long-term evolution multiple-input-multiple-output (LTE MIMO) antennas. The MIMO bandwidth is defined in this paper as the overlap range of the low-envelope correlation coefficient, high total efficiency, and -6-dB impedance matching bandwidths...

  4. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers

    Directory of Open Access Journals (Sweden)

    Mark James Abraham

    2015-09-01

    Full Text Available GROMACS is one of the most widely used open-source and free software codes in chemistry, used primarily for dynamical simulations of biomolecules. It provides a rich set of calculation types, preparation and analysis tools. Several advanced techniques for free-energy calculations are supported. In version 5, it reaches new performance heights through several new and enhanced parallelization algorithms. These work on every level: SIMD registers inside cores, multithreading, heterogeneous CPU–GPU acceleration, state-of-the-art 3D domain decomposition, and ensemble-level parallelization through built-in replica exchange and the separate Copernicus framework. The latest best-in-class compressed trajectory storage format is supported.

  5. High-Performance Computation of Distributed-Memory Parallel 3D Voronoi and Delaunay Tessellation

    Energy Technology Data Exchange (ETDEWEB)

    Peterka, Tom; Morozov, Dmitriy; Phillips, Carolyn

    2014-11-14

    Computing a Voronoi or Delaunay tessellation from a set of points is a core part of the analysis of many simulated and measured datasets: N-body simulations, molecular dynamics codes, and LIDAR point clouds are just a few examples. Such computational geometry methods are common in data analysis and visualization; but as the scale of simulations and observations surpasses billions of particles, the existing serial and shared-memory algorithms no longer suffice. A distributed-memory scalable parallel algorithm is the only feasible approach. The primary contribution of this paper is a new parallel Delaunay and Voronoi tessellation algorithm that automatically determines which neighbor points need to be exchanged among the subdomains of a spatial decomposition. Other contributions include periodic and wall boundary conditions, comparison of our method using two popular serial libraries, and application to numerous science datasets.

  6. High Performance Discrete Cosine Transform Operator Using Multimedia Oriented Subword Parallelism

    Directory of Open Access Journals (Sweden)

    Shafqat Khan

    2015-01-01

    Full Text Available In this paper an efficient two-dimensional discrete cosine transform (DCT operator is proposed for multimedia applications. It is based on the DCT operator proposed in Kovac and Ranganathan, 1995. Speed-up is obtained by using multimedia oriented subword parallelism (SWP. Rather than operating on a single pixel, the SWP-based DCT operator performs parallel computations on multiple pixels packed in word size input registers so that the performance of the operator is increased. Special emphasis is made to increase the coordination between pixel sizes and subword sizes to maximize resource utilization rate. Rather than using classical subword sizes (8, 16, and 32 bits, multimedia oriented subword sizes (8, 10, 12, and 16 bits are used in the proposed DCT operator. The proposed SWP DCT operator unit can be used as a coprocessor for multimedia applications.
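
    The operator above works at the hardware level; as a language-neutral illustration of SIMD-within-a-register, the Python sketch below packs four 8-bit pixels into one 32-bit word and adds two such words lane-by-lane with masking so that carries cannot spill between lanes. Sizes and helper names are assumptions, not the paper's design.

      def pack(pixels):
          """Pack four 8-bit pixels (MSB-first) into one 32-bit word."""
          word = 0
          for p in pixels:
              word = (word << 8) | (p & 0xFF)
          return word

      def unpack(word):
          return [(word >> shift) & 0xFF for shift in (24, 16, 8, 0)]

      def swp_add(a, b):
          """Add four 8-bit lanes at once. The top bit of each lane is handled
          separately so that a carry cannot propagate into the neighbouring lane."""
          low_mask = 0x7F7F7F7F
          high_mask = 0x80808080
          partial = (a & low_mask) + (b & low_mask)    # adds the low 7 bits of each lane
          return partial ^ ((a ^ b) & high_mask)       # restores each lane's top bit

      x = pack([10, 200, 30, 100])
      y = pack([5, 60, 90, 28])
      print(unpack(swp_add(x, y)))   # [15, 4, 120, 128]; the 200+60 lane wraps modulo 256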

  7. PF-OLA: A High-Performance Framework for Parallel On-Line Aggregation

    CERN Document Server

    Qin, Chengjie

    2012-01-01

    On-line aggregation provides estimates to the final result of a computation during the actual processing. The user can stop the computation as soon as the estimate is accurate enough, typically early in the execution. This allows for the interactive data exploration of the largest datasets. In this paper we introduce the first framework for parallel on-line aggregation in which the estimation does not incur any overhead on top of the actual execution. We define a generic interface to express any estimation model that abstracts completely the execution details. We design a novel estimator specifically targeted at parallel on-line aggregation. When executed by the framework over an 8TB TPC-H instance, the estimator provides accurate confidence bounds early in the execution even when the cardinality of the final result is seven orders of magnitude smaller than the dataset size and without incurring any overhead.
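
    As a generic illustration of on-line aggregation (not PF-OLA's estimator or interface), the sketch below estimates a SUM over a synthetic table from a growing random sample and prints a CLT-based confidence interval that tightens as more tuples are processed.

      import math
      import random

      random.seed(0)
      N = 1_000_000
      table = [random.random() for _ in range(N)]        # synthetic fact table
      true_sum = sum(table)

      seen, s1, s2 = 0, 0.0, 0.0                         # running count, sum, sum of squares
      order = random.sample(range(N), N)                 # process tuples in random order
      for i, idx in enumerate(order, 1):
          v = table[idx]
          seen, s1, s2 = seen + 1, s1 + v, s2 + v * v
          if i % 100_000 == 0:
              mean = s1 / seen
              var = max(s2 / seen - mean * mean, 0.0)
              est = N * mean                             # scale the sample mean up to the table
              half = 1.96 * N * math.sqrt(var / seen)    # approximate 95% confidence half-width
              print(f"{seen:>8} tuples: SUM ~ {est:,.0f} +/- {half:,.0f} (true {true_sum:,.0f})")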

  8. Parallel STEPS: Large Scale Stochastic Spatial Reaction-Diffusion Simulation with High Performance Computers

    CERN Document Server

    Chen, Weiliang

    2016-01-01

    Stochastic, spatial reaction-diffusion simulations have been widely used in systems biology and computational neuroscience. However, the increasing scale and complexity of simulated models and morphologies have exceeded the capacity of any serial implementation. This led to development of parallel solutions that benefit from the boost in performance of modern large-scale supercomputers. In this paper, we describe an MPI-based, parallel Operator-Splitting implementation for stochastic spatial reaction-diffusion simulations with irregular tetrahedral meshes. The performance of our implementation is first examined and analyzed with simulations of a simple model. We then demonstrate its usage in real-world research by simulating the reaction-diffusion components of a published calcium burst model in both Purkinje neuron sub-branch and full dendrite morphologies. Simulation results indicate that our implementation is capable of achieving super-linear speedup for balanced loading simulations with reasonable molecul...

  9. Parallel-META 2.0: enhanced metagenomic data analysis with functional annotation, high performance computing and advanced visualization.

    Directory of Open Access Journals (Sweden)

    Xiaoquan Su

    Full Text Available The metagenomic method directly sequences and analyses genome information from microbial communities. The main computational tasks for metagenomic analyses include taxonomical and functional structure analysis for all genomes in a microbial community (also referred to as a metagenomic sample. With the advancement of Next Generation Sequencing (NGS techniques, the number of metagenomic samples and the data size for each sample are increasing rapidly. Current metagenomic analysis is both data- and computation- intensive, especially when there are many species in a metagenomic sample, and each has a large number of sequences. As such, metagenomic analyses require extensive computational power. The increasing analytical requirements further augment the challenges for computation analysis. In this work, we have proposed Parallel-META 2.0, a metagenomic analysis software package, to cope with such needs for efficient and fast analyses of taxonomical and functional structures for microbial communities. Parallel-META 2.0 is an extended and improved version of Parallel-META 1.0, which enhances the taxonomical analysis using multiple databases, improves computation efficiency by optimized parallel computing, and supports interactive visualization of results in multiple views. Furthermore, it enables functional analysis for metagenomic samples including short-reads assembly, gene prediction and functional annotation. Therefore, it could provide accurate taxonomical and functional analyses of the metagenomic samples in high-throughput manner and on large scale.

  10. Parallel-META 2.0: enhanced metagenomic data analysis with functional annotation, high performance computing and advanced visualization.

    Science.gov (United States)

    Su, Xiaoquan; Pan, Weihua; Song, Baoxing; Xu, Jian; Ning, Kang

    2014-01-01

    The metagenomic method directly sequences and analyses genome information from microbial communities. The main computational tasks for metagenomic analyses include taxonomical and functional structure analysis for all genomes in a microbial community (also referred to as a metagenomic sample). With the advancement of Next Generation Sequencing (NGS) techniques, the number of metagenomic samples and the data size for each sample are increasing rapidly. Current metagenomic analysis is both data- and computation- intensive, especially when there are many species in a metagenomic sample, and each has a large number of sequences. As such, metagenomic analyses require extensive computational power. The increasing analytical requirements further augment the challenges for computation analysis. In this work, we have proposed Parallel-META 2.0, a metagenomic analysis software package, to cope with such needs for efficient and fast analyses of taxonomical and functional structures for microbial communities. Parallel-META 2.0 is an extended and improved version of Parallel-META 1.0, which enhances the taxonomical analysis using multiple databases, improves computation efficiency by optimized parallel computing, and supports interactive visualization of results in multiple views. Furthermore, it enables functional analysis for metagenomic samples including short-reads assembly, gene prediction and functional annotation. Therefore, it could provide accurate taxonomical and functional analyses of the metagenomic samples in high-throughput manner and on large scale.

  11. Scalable High-Performance Parallel Design for Network Intrusion Detection Systems on Many-Core Processors

    OpenAIRE

    Jiang, Hayang; Xie, Gaogang; Salamatian, Kavé; Mathy, Laurent

    2013-01-01

    Network Intrusion Detection Systems (NIDSes) face significant challenges coming from the relentless network link speed growth and increasing complexity of threats. Both hardware accelerated and parallel software-based NIDS solutions, based on commodity multi-core and GPU processors, have been proposed to overcome these challenges.

  12. Parallel segmented outlet flow high performance liquid chromatography with multiplexed detection

    Energy Technology Data Exchange (ETDEWEB)

    Camenzuli, Michelle [Australian Centre for Research on Separation Science (ACROSS), School of Science and Health, University of Western Sydney (Parramatta), Sydney, NSW (Australia); Terry, Jessica M. [Centre for Chemistry and Biotechnology, School of Life and Environmental Sciences, Deakin University, Geelong, Victoria 3216 (Australia); Shalliker, R. Andrew, E-mail: r.shalliker@uws.edu.au [Australian Centre for Research on Separation Science (ACROSS), School of Science and Health, University of Western Sydney (Parramatta), Sydney, NSW (Australia); Conlan, Xavier A.; Barnett, Neil W. [Centre for Chemistry and Biotechnology, School of Life and Environmental Sciences, Deakin University, Geelong, Victoria 3216 (Australia); Francis, Paul S., E-mail: paul.francis@deakin.edu.au [Centre for Chemistry and Biotechnology, School of Life and Environmental Sciences, Deakin University, Geelong, Victoria 3216 (Australia)

    2013-11-25

    Highlights: • Multiplexed detection for liquid chromatography. • ‘Parallel segmented outlet flow’ distributes inner and outer portions of the analyte zone. • Three detectors were used simultaneously for the determination of opiate alkaloids. Abstract: We describe a new approach to multiplex detection for HPLC, exploiting parallel segmented outlet flow, a new column technology that provides pressure-regulated control of eluate flow through multiple outlet channels, which minimises the additional dead volume associated with conventional post-column flow splitting. Using three detectors, one UV-absorbance and two chemiluminescence systems (tris(2,2′-bipyridine)ruthenium(III) and permanganate), we examine the relative responses for six opium poppy (Papaver somniferum) alkaloids under conventional and multiplexed conditions, where approximately 30% of the eluate was distributed to each detector and the remaining solution directed to a collection vessel. The parallel segmented outlet flow mode of operation offers advantages in terms of solvent consumption, waste generation, total analysis time and solute band volume when applying multiple detectors to HPLC, but the manner in which each detection system is influenced by changes in solute concentration and solution flow rates must be carefully considered.

  13. Parallel high resolution imaging of diffuse objects in the Magellanic Clouds

    Science.gov (United States)

    Walsh, Jeremy

    1996-07-01

    The Magellanic Clouds, because of their well-determined distance and small extinction, allow an unprecedented opportunity to observe many ISM phenomena occurring in a whole galaxy. The HST resolution (0.1" = 0.025 pc) offers detail hitherto poorly studied in the extragalactic context on the morphology and spatial relationships in various ISM processes associated with the evolution of Population I and Population II systems. This long term (11 cycles) parallel program exploits these opportunities by obtaining WFPC2 images of appropriate targets that are accessible at the same time as primary pointings. The number of priority parallel observations per Cycle is estimated at 20, and our intent is to accumulate a significant archive of Magellanic Cloud direct images over the life of the program. The parallel targets, to be specified in crafting rules executed as part of the Phase II planning of each HST Cycle, will include (or search for) compact H II regions and young clusters, proto-stellar and maser regions, reflection nebulae, Herbig-Haro objects, stellar ejecta, SNR and wind-driven shells, planetary nebulae and Very Low Excitation nebulae. The observations will be primarily in the Balmer lines and the stronger forbidden lines, with supplemental continuum images.

  14. Demonstration of parallel scanning probe microscope for high throughput metrology and inspection

    Science.gov (United States)

    Sadeghian, Hamed; Dekker, Bert; Herfst, Rodolf; Winters, Jasper; Eigenraam, Alexander; Rijnbeek, Ramon; Nulkes, Nicole

    2015-03-01

    With the device dimensions moving towards the 1X node and below, the semiconductor industry is rapidly approaching the point where existing metrology, inspection and review tools face huge challenges in terms of resolution, the ability to resolve 3D and the throughput. Due to the advantages of sub-nanometer resolution and the ability of true 3D scanning, scanning probe microscope (SPM) and specifically atomic force microscope (AFM) are considered as alternative technologies for CD-metrology, defect inspection and review of 1X node and below. In order to meet the increasing demand for resolution and throughput of CD-metrology, defect inspection and review, TNO has previously introduced the parallel SPM concept, consisting of parallel operation of many miniaturized SPMs on a 300 and 450 mm wafer. In this paper we will present the proof of principle of the parallelization for metrology and inspection. To give an indication of the system's specifications, the throughput of scanning is 4500 sites per hour, each within an area of 1 μm2 and 1024 ×1024 pixels.

  15. Experience with highly-parallel software for the storage system of the ATLAS Experiment at CERN

    Science.gov (United States)

    Colombo, T.; Vandelli, W.

    2012-12-01

    The ATLAS experiment records proton-proton collisions delivered by the LHC accelerator. The ATLAS Trigger and Data Acquisition (TDAQ) system selects interesting events on-line in a three-level trigger system in order to store them at a budgeted rate of several hundred Hz. This paper focuses on the TDAQ data-logging system and in particular on the implementation and performance of a novel parallel software design. The main challenge presented by a parallel data-logging implementation is the conflict between the largely parallel nature of the event processing, especially the recently introduced event compression, and the constraint of sequential file writing and hash-sum evaluation. This is further complicated by the necessity of operating in a fully data-driven mode, to cope with continuously evolving trigger and detector configurations. In this paper we report on the design of the new ATLAS on-line storage software. In particular we will discuss our development experience using recent concurrency-oriented libraries. Finally we will show the new system performance with respect to the old, single-threaded software design.
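
    A highly simplified sketch (in Python rather than the experiment's C++) of the central pattern described above, compressing events concurrently while committing them to the output file strictly in sequence so that sequential writing and check-summing remain well defined:

      import heapq
      import hashlib
      import zlib
      from concurrent.futures import ThreadPoolExecutor, as_completed

      events = [bytes([i % 256]) * 10_000 for i in range(64)]   # dummy event payloads

      def compress(item):
          seq, payload = item
          return seq, zlib.compress(payload)                    # CPU work done in parallel

      checksum = hashlib.sha256()
      pending, next_seq = [], 0
      with open("events.dat", "wb") as out, ThreadPoolExecutor(max_workers=4) as pool:
          futures = [pool.submit(compress, (i, ev)) for i, ev in enumerate(events)]
          for fut in as_completed(futures):                     # completion order is arbitrary
              heapq.heappush(pending, fut.result())
              # Writer side: flush every block whose turn has come, strictly in sequence,
              # so the file layout and the running hash stay deterministic.
              while pending and pending[0][0] == next_seq:
                  _, blob = heapq.heappop(pending)
                  out.write(blob)
                  checksum.update(blob)
                  next_seq += 1
      print(next_seq, "blocks written, sha256", checksum.hexdigest()[:16], "...")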

  16. Schottky Heterodyne Receivers With Full Waveguide Bandwidth

    Science.gov (United States)

    Hesler, Jeffrey; Crowe, Thomas

    2011-01-01

    Compact THz receivers with broad bandwidth and low noise have been developed for the frequency range from 100 GHz to 1 THz. These receivers meet the requirements for high-resolution spectroscopic studies of planetary atmospheres (including the Earth's) from spacecraft, as well as airborne and balloon platforms. The ongoing research is significant not only for the development of Schottky mixers, but also for the creation of a receiver system, including the LO chain. The new receivers meet the goals of high sensitivity, compact size, low total power requirement, and operation across complete waveguide bands. The exceptional performance makes these receivers ideal for the broader range of scientific and commercial applications. These include the extension of sophisticated test and measurement equipment to 1 THz and the development of low-cost imaging systems for security applications and industrial process monitoring. As a particular example, a WR-1.9SHM (400-600 GHz) has been developed (see Figure 1), with state-of-the-art noise temperature ranging from 1,000-1,800 K (DSB) over the full waveguide band. Also, a Vector Network Analyzer extender has been developed (see Figure 2) for the WR1.5 waveguide band (500-750 GHz) with 100-dB dynamic range.

  17. High accuracy microwave frequency measurement based on single-drive dual-parallel Mach-Zehnder modulator

    DEFF Research Database (Denmark)

    Zhao, Ying; Pang, Xiaodan; Deng, Lei

    2011-01-01

    A novel approach for broadband microwave frequency measurement employing a single-drive dual-parallel Mach-Zehnder modulator is proposed and experimentally demonstrated. Based on bias manipulations of the modulator, the conventional frequency-to-power mapping technique is developed by performing a two-stage frequency measurement cooperating with digital signal processing. In the experiment, a 10 GHz measurement range is guaranteed and the average uncertainty of the estimated microwave frequency is 5.4 MHz, which verifies that the measurement accuracy is significantly improved by achieving an unprecedented 10^-3 relative error. This high accuracy frequency measurement technique is a promising candidate for high-speed electronic warfare and defense applications.

  18. Bandwidth utilization maximization of scientific RF communication systems

    Energy Technology Data Exchange (ETDEWEB)

    Rey, D. [Sandia National Lab., Albuquerque, NM (United States); Ryan, W. [New Mexico State Univ., Las Cruces, NM (United States); Ross, M.

    1997-01-01

    A method for more efficiently utilizing the frequency bandwidth allocated for data transmission is presented. Current space and range communication systems use modulation and coding schemes that transmit 0.5 to 1.0 bits per second per Hertz of radio frequency bandwidth. The goal in this LDRD project is to increase the bandwidth utilization by employing advanced digital communications techniques. This is done with little or no increase in the transmit power which is usually very limited on airborne systems. Teaming with New Mexico State University, an implementation of trellis coded modulation (TCM), a coding and modulation scheme pioneered by Ungerboeck, was developed for this application and simulated on a computer. TCM provides a means for reliably transmitting data while simultaneously increasing bandwidth efficiency. The penalty is increased receiver complexity. In particular, the trellis decoder requires high-speed, application-specific digital signal processing (DSP) chips. A system solution based on the QualComm Viterbi decoder and the Graychip DSP receiver chips is presented.

  19. VSDocker: a tool for parallel high-throughput virtual screening using AutoDock on Windows-based computer clusters.

    Science.gov (United States)

    Prakhov, Nikita D; Chernorudskiy, Alexander L; Gainullin, Murat R

    2010-05-15

    VSDocker is an original program that allows using AutoDock4 for optimized virtual ligand screening on computer clusters or multiprocessor workstations. This tool is the first implementation of parallel high-performance virtual screening of ligands for MS Windows-based computer systems. VSDocker 2.0 is freely available for non-commercial use at http://www.bio.nnov.ru/projects/vsdocker2/. Contact: nikita.prakhov@gmail.com. Supplementary data are available at Bioinformatics online.

  20. Parallel electron streaming in the high-latitude E region and its effect on the incoherent scatter spectrum

    Science.gov (United States)

    Bahcivan, H.; Cosgrove, R. B.; Tsunoda, R. T.

    2006-07-01

    This article investigates the combined electron heating and streaming effects of low-frequency parallel electric fields on the incoherent scatter measurements of the high-latitude E region. The electric fields distort the electron distribution function, inducing changes on the amplitude and frequency of the ion-acoustic line in the measured incoherent scatter spectrum. If one assumes Maxwellian electrons, the measurements of electron and ion temperatures and electron density are subject to significant percentage errors during geomagnetically active conditions.

  1. Parallel R

    CERN Document Server

    McCallum, Ethan

    2011-01-01

    It's tough to argue with R as a high-quality, cross-platform, open source statistical software product-unless you're in the business of crunching Big Data. This concise book introduces you to several strategies for using R to analyze large datasets. You'll learn the basics of Snow, Multicore, Parallel, and some Hadoop-related tools, including how to find them, how to use them, when they work well, and when they don't. With these packages, you can overcome R's single-threaded nature by spreading work across multiple CPUs, or offloading work to multiple machines to address R's memory barrier.

  2. Development of a novel parallel-spool pilot operated high-pressure solenoid valve with high flow rate and high speed

    Science.gov (United States)

    Dong, Dai; Li, Xiaoning

    2015-03-01

    High-pressure solenoid valve with high flow rate and high speed is a key component in an underwater driving system. However, traditional single spool pilot operated valve cannot meet the demands of both high flow rate and high speed simultaneously. A new structure for a high pressure solenoid valve is needed to meet the demand of the underwater driving system. A novel parallel-spool pilot operated high-pressure solenoid valve is proposed to overcome the drawback of the current single spool design. Mathematical models of the opening process and flow rate of the valve are established. Opening response time of the valve is subdivided into 4 parts to analyze the properties of the opening response. Corresponding formulas to solve 4 parts of the response time are derived. Key factors that influence the opening response time are analyzed. According to the mathematical model of the valve, a simulation of the opening process is carried out by MATLAB. Parameters are chosen based on theoretical analysis to design the test prototype of the new type of valve. Opening response time of the designed valve is tested by verifying response of the current in the coil and displacement of the main valve spool. The experimental results are in agreement with the simulated results, therefore the validity of the theoretical analysis is verified. Experimental opening response time of the valve is 48.3 ms at working pressure of 10 MPa. The flow capacity test shows that the largest effective area is 126 mm2 and the largest air flow rate is 2320 L/s. According to the result of the load driving test, the valve can meet the demands of the driving system. The proposed valve with parallel spools provides a new method for the design of a high-pressure valve with fast response and large flow rate.

  3. Development of a Novel Parallel-spool Pilot Operated High-pressure Solenoid Valve with High Flow Rate and High Speed

    Institute of Scientific and Technical Information of China (English)

    DONG Dai; LI Xiaoning

    2015-01-01

    High-pressure solenoid valve with high flow rate and high speed is a key component in an underwater driving system. However, traditional single spool pilot operated valve cannot meet the demands of both high flow rate and high speed simultaneously. A new structure for a high pressure solenoid valve is needed to meet the demand of the underwater driving system. A novel parallel-spool pilot operated high-pressure solenoid valve is proposed to overcome the drawback of the current single spool design. Mathematical models of the opening process and flow rate of the valve are established. Opening response time of the valve is subdivided into 4 parts to analyze the properties of the opening response. Corresponding formulas to solve 4 parts of the response time are derived. Key factors that influence the opening response time are analyzed. According to the mathematical model of the valve, a simulation of the opening process is carried out by MATLAB. Parameters are chosen based on theoretical analysis to design the test prototype of the new type of valve. Opening response time of the designed valve is tested by verifying response of the current in the coil and displacement of the main valve spool. The experimental results are in agreement with the simulated results, therefore the validity of the theoretical analysis is verified. Experimental opening response time of the valve is 48.3 ms at working pressure of 10 MPa. The flow capacity test shows that the largest effective area is 126 mm2 and the largest air flow rate is 2320 L/s. According to the result of the load driving test, the valve can meet the demands of the driving system. The proposed valve with parallel spools provides a new method for the design of a high-pressure valve with fast response and large flow rate.

  4. Parallel Processing of Numerical Tsunami Simulations on a High Performance Cluster based on the GDAL Library

    Science.gov (United States)

    Schroeder, Matthias; Jankowski, Cedric; Hammitzsch, Martin; Wächter, Joachim

    2014-05-01

    Thousands of numerical tsunami simulations allow the computation of inundation and run-up along the coast for vulnerable areas over time. A so-called Matching Scenario Database (MSDB) [1] contains this large number of simulations in text file format. In order to visualize these wave propagations, the scenarios have to be reprocessed automatically. In the TRIDEC project, funded by the Seventh Framework Programme of the European Union, a Virtual Scenario Database (VSDB) and a Matching Scenario Database (MSDB) were established, amongst others, by the working group of the University of Bologna (UniBo) [1]. One part of TRIDEC was the development of a new generation of Decision Support System (DSS) for tsunami Early Warning Systems (TEWS) [2]. A working group of the GFZ German Research Centre for Geosciences was responsible for developing the Command and Control User Interface (CCUI), the central software application which supports operator activities, incident management and message dissemination. For integration and visualization in the CCUI, the numerical tsunami simulations from the MSDB must be converted into the shapefile format. The use of shapefiles enables a much easier integration into standard Geographic Information Systems (GIS); the CCUI itself is based on two widely used open source products (the GeoTools library and uDig), which provide shapefile support a priori. In this case, several thousand tsunami variations were processed for an example area around the Western Iberian margin. Due to the mass of data, only a program-controlled process was conceivable. In order to optimize the computing effort and operating time, an existing GFZ High Performance Computing Cluster (HPC) was used. Thus, geospatial software capable of parallel processing was sought. The FOSS tool Geospatial Data Abstraction Library (GDAL/OGR) was used to match the coordinates with the wave heights and generates the
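
    As a hedged, minimal illustration of the conversion step described above (writing simulated wave heights into a point shapefile with the GDAL/OGR Python bindings), with invented coordinates, field name and file name:

      from osgeo import ogr, osr

      # A few hypothetical (lon, lat, max wave height) records from one scenario.
      records = [(-9.5, 38.7, 1.8), (-9.4, 38.6, 2.3), (-9.3, 38.5, 0.9)]

      driver = ogr.GetDriverByName("ESRI Shapefile")
      ds = driver.CreateDataSource("scenario_0001.shp")

      srs = osr.SpatialReference()
      srs.ImportFromEPSG(4326)                      # WGS84 lon/lat

      layer = ds.CreateLayer("wave_height", srs, ogr.wkbPoint)
      layer.CreateField(ogr.FieldDefn("height_m", ogr.OFTReal))

      for lon, lat, h in records:
          feat = ogr.Feature(layer.GetLayerDefn())
          point = ogr.Geometry(ogr.wkbPoint)
          point.AddPoint(lon, lat)
          feat.SetGeometry(point)
          feat.SetField("height_m", h)
          layer.CreateFeature(feat)
          feat = None                               # release the feature

      ds = None                                     # flush and close the datasource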

  5. Codebook Design and Hybrid Digital/Analog Coding for Parallel Rayleigh Fading Channels

    OpenAIRE

    Shi, Shuying; Larsson, Erik G.; Skoglund, Mikael

    2011-01-01

    Low-delay source-channel transmission over parallel fading channels is studied. In this scenario separate source and channel coding is highly suboptimal. A scheme based on hybrid digital/analog joint source-channel coding is therefore proposed, employing scalar quantization and polynomial-based analog bandwidth expansion. Simulations demonstrate substantial performance gains.

  6. Fairness analysis of FAST TCP and TCP Vegas over future high-bandwidth internet

    Institute of Scientific and Technical Information of China (English)

    朱小松

    2012-01-01

    FAST TCP is a modern end-to-end congestion control protocol that adopts queuing delay as its congestion measure. However, the lack of a precise measurement of queuing delay leads to a potential unfairness problem: in a persistent congestion scenario, FAST TCP flows may be discriminated against according to their starting times. TCP Vegas encounters a similar unfairness problem. The unfairness is quantitatively assessed by mathematical analysis and ns2 simulations, and FAST TCP is then compared with TCP Vegas. The results show that FAST TCP has a clear fairness advantage over TCP Vegas under future high bandwidth-delay product environments, which provides a valuable reference for further improvement of FAST TCP.
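
    For reference, the window dynamics that make queuing delay the congestion signal can be sketched with the commonly cited FAST TCP update rule (gamma and alpha below are generic tuning values, not figures from this article); an over-estimated base RTT inflates the equilibrium window, which is exactly the source of the unfairness discussed above.

      # FAST TCP-style window update (sketch): the window settles where the measured
      # queuing delay (rtt - base_rtt) corresponds to roughly `alpha` queued packets.
      def fast_tcp_update(w, rtt, base_rtt, alpha=200.0, gamma=0.5):
          target = (base_rtt / rtt) * w + alpha
          return min(2.0 * w, (1.0 - gamma) * w + gamma * target)

      w = 100.0
      for _ in range(200):
          w = fast_tcp_update(w, rtt=0.120, base_rtt=0.100)
      # equilibrium: w = alpha / (1 - base_rtt/rtt) = 1200 packets
      print(round(w, 1))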

  7. Passive Mobile Bandwidth Classification Using Short Lived TCP Connections

    OpenAIRE

    Michelinakis, Foivos; Kreitz, Gunnar; Petrocco, Riccardo; Zhang, Boxun; Widmer, Joerg

    2015-01-01

    Consumption of multimedia content is moving from a residential environment to mobile phones. Optimizing Quality of Experience—smooth, quick, and high quality playback—is more difficult in this setting, due to the highly dynamic nature of wireless links. A key requirement for achieving this goal is estimating the available bandwidth of mobile devices. Ideally, this should be done quickly and with low overhead. One challenge is that the majority of connections on mobiles are short-l...

  8. High-resolution Volumetric Display System with Multi-screen in Parallel Motion

    Institute of Scientific and Technical Information of China (English)

    李立新; 夏孙城; 沈海锋; 江玉刚

    2012-01-01

    A high-resolution volumetric display system based on scanning with multiple screens in parallel motion, together with a corresponding principle prototype, is described. Through a DMD, a 2D image sequence is projected onto imaging screens moving in circular translation, and the dynamically refreshed image sequence is perceived as a 3D image with physical depth because of the persistence of vision. The optical projection module keeps the optical path between the projector and the imaging screen constant, so a clear and stable image space is constructed. A high-speed data channel is designed on an FPGA chip, whose bandwidth reaches 18.75 Gbps; each volumetric frame contains 512 two-dimensional image slices and over 400 million voxels at a 12 Hz refresh frequency. The principle prototype was successfully manufactured, and the anticipated 3D display effect was achieved.

  9. A comparison of high-order explicit Runge–Kutta, extrapolation, and deferred correction methods in serial and parallel

    KAUST Repository

    Ketcheson, David I.

    2014-06-13

    We compare the three main types of high-order one-step initial value solvers: extrapolation, spectral deferred correction, and embedded Runge–Kutta pairs. We consider orders four through twelve, including both serial and parallel implementations. We cast extrapolation and deferred correction methods as fixed-order Runge–Kutta methods, providing a natural framework for the comparison. The stability and accuracy properties of the methods are analyzed by theoretical measures, and these are compared with the results of numerical tests. In serial, the eighth-order pair of Prince and Dormand (DOP8) is most efficient. But other high-order methods can be more efficient than DOP8 when implemented in parallel. This is demonstrated by comparing a parallelized version of the well-known ODEX code with the (serial) DOP853 code. For an N-body problem with N = 400, the experimental extrapolation code is as fast as the tuned Runge–Kutta pair at loose tolerances, and is up to two times as fast at tight tolerances.
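
    As a concrete serial reference point of the kind used in this comparison, the Dormand–Prince 8(5,3) pair is available off the shelf; the toy problem and tolerances below are illustrative only, not the paper's N = 400 N-body test case.

      # Integrate a small ODE with the embedded high-order Runge-Kutta pair DOP853.
      import numpy as np
      from scipy.integrate import solve_ivp

      def kepler(t, y):
          # planar two-body toy problem; y = [x, y, vx, vy]
          r3 = np.hypot(y[0], y[1]) ** 3
          return [y[2], y[3], -y[0] / r3, -y[1] / r3]

      y0 = [1.0, 0.0, 0.0, 1.0]
      for tol in (1e-6, 1e-12):                        # loose vs. tight tolerance
          sol = solve_ivp(kepler, (0.0, 100.0), y0, method="DOP853", rtol=tol, atol=tol)
          print(tol, sol.nfev, "right-hand-side evaluations")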

  10. PARALLEL FINITE ELEMENT ANALYSIS OF HIGH FREQUENCY VIBRATIONS OF QUARTZ CRYSTAL RESONATORS ON LINUX CLUSTER

    Institute of Scientific and Technical Information of China (English)

    Ji Wang; Yu Wang; Wenke Hu; Wenhua Zhao; Jianke Du; Dejin Huang

    2008-01-01

    Quartz crystal resonators are typical piezoelectric acoustic wave devices for frequency control applications, with mechanical vibration frequencies in the radio-frequency (RF) range. Precise analyses of the vibration and deformation are generally required in the resonator design and improvement process. The considerations include the presence of electrodes, mountings, and bias fields such as temperature, initial stresses, and acceleration. Naturally, the finite element method is the only effective tool for such a coupled problem with a multi-physics nature. The main challenge is the extremely large size of the resulting linear systems. For this reason, we have been employing the Mindlin plate equations to reduce the computational difficulty. In addition, we utilize parallel computing techniques on Linux clusters, which are widely available for academic and industrial applications nowadays, to improve the computing efficiency. The general principle of our research is to use open source software components and public domain technology to reduce cost for developers and users on a Linux cluster. We start with a mesh generator specifically for quartz crystal resonators of rectangular and circular types, and the Mindlin plate equations are implemented for the finite element analysis. Computing techniques such as parallel processing, sparse matrix handling, and the latest eigenvalue extraction packages are integrated into the program. It is clear from our computation that the combination of these algorithms and methods on a cluster can meet the memory requirement and reduce computing time significantly.
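
    The eigenvalue-extraction step mentioned above amounts to a large sparse generalized eigenproblem K u = omega^2 M u; a minimal serial sketch with SciPy is shown below using tiny random stand-in matrices (the cluster implementation described in the paper relies on parallel sparse solvers instead).

      # Sketch: lowest vibration frequencies from sparse stiffness/mass matrices.
      # K and M are small random stand-ins for the assembled FEM matrices.
      import numpy as np
      import scipy.sparse as sp
      from scipy.sparse.linalg import eigsh

      n = 200
      K = sp.random(n, n, density=0.02, random_state=0)
      K = (K + K.T) + sp.identity(n) * n               # symmetric, diagonally dominant
      M = sp.identity(n, format="csc")                 # lumped (diagonal) mass matrix

      # shift-invert around sigma = 0 targets the smallest eigenvalues omega^2
      vals, vecs = eigsh(K.tocsc(), k=6, M=M, sigma=0.0, which="LM")
      print(np.sqrt(np.abs(vals)) / (2.0 * np.pi))     # natural frequencies [Hz]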

  11. Scan-directed load balancing for highly-parallel mesh-connected computers. Technical report

    Energy Technology Data Exchange (ETDEWEB)

    Biagioni, E.S.; Prins, J.F.

    1991-07-01

    Scan Directed Load Balancing is a new, locality-preserving, dynamic load balancing algorithm for grid based computations on mesh connected parallel computers. Scans are used to efficiently determine what areas of the machine are heavily loaded and what areas are lightly loaded, and to organize the movement of data. Data is shifted along the mesh in a regular fashion to balance the load. The Locality Property of the algorithm guarantees that all the neighbors of a data point on the grid are stored either on the same processor, or on a processor that is directly connected to it. Scan Directed Load Balancing is applicable to both SIMD and MIMD mesh-connected parallel computers, and has been implemented on the MasPar MP-1. The authors present some theoretical bounds achieved by the algorithm as well as the algorithm's performance on a particular image processing problem, edge-directed diffusion. Their experiments show that the algorithm is effective in improving the load distribution for real problems, while the efficiency of the original grid-based computation is preserved by the locality property.
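
    The bookkeeping at the heart of the method can be illustrated with an ordinary prefix sum; the 1-D NumPy sketch below only shows how a scan of per-processor load determines the net number of items that must cross each processor boundary, and says nothing about the MasPar implementation or the locality-preserving data movement itself.

      # Scan-based load-balancing bookkeeping (1-D sketch with made-up loads).
      # Positive flow means items move right across that boundary, negative means left.
      import numpy as np

      load = np.array([9, 1, 4, 14, 2, 6])          # work items per processor (example)
      ideal = load.sum() / load.size                # balanced load per processor

      prefix_actual = np.cumsum(load)
      prefix_ideal = ideal * np.arange(1, load.size + 1)
      flow_right = prefix_actual - prefix_ideal     # net items crossing each right boundary

      print(flow_right[:-1])                        # e.g. [ 3. -2. -4.  4.  0.]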

  12. Experience with highly-parallel software for the storage system of the ATLAS Experiment at CERN

    CERN Document Server

    Colombo, T; The ATLAS collaboration

    2012-01-01

    The ATLAS experiment is observing proton-proton collisions delivered by the LHC accelerator. The ATLAS Trigger and Data Acquisition (TDAQ) system selects interesting events on-line in a three-level trigger system in order to store them at a budgeted rate of several hundred Hz. This paper focuses on the TDAQ data-logging system and in particular on the implementation and performance of a novel parallel software design. In this respect, the main challenge presented by the data-logging workload is the conflict between the largely parallel nature of the event processing, especially the recently introduced event compression, and the constraint of sequential file writing and checksum evaluation. This is further complicated by the necessity of operating in a fully data-driven mode, to cope with continuously evolving trigger and detector configurations. In this paper we report on the design of the new ATLAS on-line storage software. In particular we will discuss our development experience using recent concurrency-ori...

  13. High performance shallow water kernels for parallel overland flow simulations based on FullSWOF2D

    KAUST Repository

    Wittmann, Roland

    2017-01-25

    We describe code optimization and parallelization procedures applied to the sequential overland flow solver FullSWOF2D. Major difficulties when simulating overland flows include dealing with high-resolution datasets of large-scale areas, which cannot be computed on a single node either because of the limited amount of memory or because of the large number of (time step) iterations resulting from the CFL condition. We address these issues in terms of two major contributions. First, we demonstrate a generic step-by-step transformation of the second-order finite volume scheme in FullSWOF2D towards MPI parallelization. Second, the computational kernels are optimized by the use of templates and a portable vectorization approach. We discuss the load imbalance of the flux computation due to dry and wet cells and propose a solution using an efficient cell counting approach. Finally, scalability results are shown for different test scenarios along with a flood simulation benchmark using the Shaheen II supercomputer.

  14. Hybrid Adaptive Ray-Moment Method (HARM2): A highly parallel method for radiation hydrodynamics on adaptive grids

    Science.gov (United States)

    Rosen, A. L.; Krumholz, M. R.; Oishi, J. S.; Lee, A. T.; Klein, R. I.

    2017-02-01

    We present a highly-parallel multi-frequency hybrid radiation hydrodynamics algorithm that combines a spatially-adaptive long characteristics method for the radiation field from point sources with a moment method that handles the diffuse radiation field produced by a volume-filling fluid. Our Hybrid Adaptive Ray-Moment Method (HARM2) operates on patch-based adaptive grids, is compatible with asynchronous time stepping, and works with any moment method. In comparison to previous long characteristics methods, we have greatly improved the parallel performance of the adaptive long-characteristics method by developing a new completely asynchronous and non-blocking communication algorithm. As a result of this improvement, our implementation achieves near-perfect scaling up to O(10^3) processors on distributed memory machines. We present a series of tests to demonstrate the accuracy and performance of the method.

  15. Hybrid Adaptive Ray-Moment Method (HARM$^2$): A Highly Parallel Method for Radiation Hydrodynamics on Adaptive Grids

    CERN Document Server

    Rosen, Anna L; Oishi, Jeffrey S; Lee, Aaron T; Klein, Richard I

    2016-01-01

    We present a highly-parallel multi-frequency hybrid radiation hydrodynamics algorithm that combines a spatially-adaptive long characteristics method for the radiation field from point sources with a moment method that handles the diffuse radiation field produced by a volume-filling fluid. Our Hybrid Adaptive Ray-Moment Method (HARM$^2$) operates on patch-based adaptive grids, is compatible with asynchronous time stepping, and works with any moment method. In comparison to previous long characteristics methods, we have greatly improved the parallel performance of the adaptive long-characteristics method by developing a new completely asynchronous and non-blocking communication algorithm. As a result of this improvement, our implementation achieves near-perfect scaling up to $\mathcal{O}(10^3)$ processors on distributed memory machines. We present a series of tests to demonstrate the accuracy and performance of the method.

  16. Massively Parallel Rogue Cell Detection Using Serial Time-Encoded Amplified Microscopy of Inertially Ordered Cells in High-Throughput Flow

    Science.gov (United States)

    2012-08-01

    method. The temporal waveform in Fig. 1(c) indicates the repetitive pulses (corresponding to the line scans) detected by a single-pixel photodetector and ... time-encoded optical pulses are then captured by a high-speed photodetector with 10 GHz bandwidth and digitized by a real-time digitizer with 16 GHz ... higher in 2D) include identification of missiles and aircraft via light detection and ranging (LIDAR), non-destructive inspection of acoustic

  17. Large scale probabilistic available bandwidth estimation

    CERN Document Server

    Thouin, Frederic; Rabbat, Michael

    2010-01-01

    The common utilization-based definition of available bandwidth and many of the existing tools to estimate it suffer from several important weaknesses: i) most tools report a point estimate of average available bandwidth over a measurement interval and do not provide a confidence interval; ii) the commonly adopted models used to relate the available bandwidth metric to the measured data are invalid in almost all practical scenarios; iii) existing tools do not scale well and are not suited to the task of multi-path estimation in large-scale networks; iv) almost all tools use ad-hoc techniques to address measurement noise; and v) tools do not provide enough flexibility in terms of accuracy, overhead, latency and reliability to adapt to the requirements of various applications. In this paper we propose a new definition for available bandwidth and a novel framework that addresses these issues. We define probabilistic available bandwidth (PAB) as the largest input rate at which we can send a traffic flow along a pa...

  18. Extending molecular simulation time scales: Parallel in time integrations for high-level quantum chemistry and complex force representations

    Science.gov (United States)

    Bylaska, Eric J.; Weare, Jonathan Q.; Weare, John H.

    2013-08-01

    environment using very slow transmission control protocol/Internet protocol networks. Scripts written in Python that make calls to a precompiled quantum chemistry package (NWChem) are demonstrated to provide an actual speedup of 8.2 for a 2.5 ps AIMD simulation of HCl + 4H2O at the MP2/6-31G* level. Implemented in this way these algorithms can be used for long time high-level AIMD simulations at a modest cost using machines connected by very slow networks such as WiFi, or in different time zones connected by the Internet. The algorithms can also be used with programs that are already parallel. Using these algorithms, we are able to reduce the cost of a MP2/6-311++G(2d,2p) simulation that had reached its maximum possible speedup in the parallelization of the electronic structure calculation from 32 s/time step to 6.9 s/time step.
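
    A generic parareal-style iteration, one common member of the parallel-in-time family discussed here (the paper's own schemes for AIMD use different coarse and fine propagators), can be sketched as follows; the fine solves inside each iteration are the part that runs in parallel, written below as a plain loop for clarity.

      # Parareal sketch: a cheap coarse propagator G corrects an expensive fine propagator F.
      # In a real run the F evaluations in each iteration execute in parallel, one per slice.
      import numpy as np

      def F(y, t0, t1, steps=100):                 # fine propagator: many small steps
          dt = (t1 - t0) / steps
          for _ in range(steps):
              y = y + dt * (-y)                    # toy ODE y' = -y
          return y

      def G(y, t0, t1):                            # coarse propagator: one big step
          return y + (t1 - t0) * (-y)

      T, n_slices, y0 = 2.0, 8, 1.0
      t = np.linspace(0.0, T, n_slices + 1)

      U = np.empty(n_slices + 1); U[0] = y0
      for n in range(n_slices):                    # serial coarse sweep = initial guess
          U[n + 1] = G(U[n], t[n], t[n + 1])

      for k in range(3):                           # parareal correction iterations
          Fvals = [F(U[n], t[n], t[n + 1]) for n in range(n_slices)]   # parallel in principle
          Unew = np.empty_like(U); Unew[0] = y0
          for n in range(n_slices):
              Unew[n + 1] = G(Unew[n], t[n], t[n + 1]) + Fvals[n] - G(U[n], t[n], t[n + 1])
          U = Unew

      print(U[-1], np.exp(-T))                     # converges toward the fine (serial) solution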

  19. Extending molecular simulation time scales: Parallel in time integrations for high-level quantum chemistry and complex force representations

    Energy Technology Data Exchange (ETDEWEB)

    Bylaska, Eric J., E-mail: Eric.Bylaska@pnnl.gov [Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, P.O. Box 999, Richland, Washington 99352 (United States); Weare, Jonathan Q., E-mail: weare@uchicago.edu [Department of Mathematics, University of Chicago, Chicago, Illinois 60637 (United States); Weare, John H., E-mail: jweare@ucsd.edu [Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, California 92093 (United States)

    2013-08-21

    to 14.3. The parallel in time algorithms can be implemented in a distributed computing environment using very slow transmission control protocol/Internet protocol networks. Scripts written in Python that make calls to a precompiled quantum chemistry package (NWChem) are demonstrated to provide an actual speedup of 8.2 for a 2.5 ps AIMD simulation of HCl + 4H2O at the MP2/6-31G* level. Implemented in this way these algorithms can be used for long time high-level AIMD simulations at a modest cost using machines connected by very slow networks such as WiFi, or in different time zones connected by the Internet. The algorithms can also be used with programs that are already parallel. Using these algorithms, we are able to reduce the cost of a MP2/6-311++G(2d,2p) simulation that had reached its maximum possible speedup in the parallelization of the electronic structure calculation from 32 s/time step to 6.9 s/time step.

  20. Construction and experimental testing of the constant-bandwidth constant-temperature anemometer.

    Science.gov (United States)

    Ligeza, P

    2008-09-01

    A classical constant-temperature hot-wire anemometer enables the measurement of fast-changing flow velocity fluctuations, although its transmission bandwidth is a function of the measured velocity. This may be a source of significant dynamic errors. Incorporating an adaptive controller into the constant-temperature system results in a hot-wire anemometer operating with a constant transmission bandwidth. The construction, together with the results of experimental testing, of a constant-bandwidth hot-wire anemometer prototype is presented in this article. During the testing, an approximately constant transmission bandwidth of the anemometer was achieved. The constant-bandwidth hot-wire anemometer can be used in measurements of high-frequency variable flows characterized by a wide range of velocity changes.

  1. Reconstruction in Time-Bandwidth Compression Systems

    CERN Document Server

    Chan, Jacky; Asghari, Mohammad H; Jalali, Bahram

    2014-01-01

    Recently it has been shown that the intensity time-bandwidth product of optical signals can be engineered to match that of the data acquisition instrument. In particular, it is possible to slow down an ultrafast signal, resulting in compressed RF bandwidth - a similar benefit to that offered by the Time-Stretch Dispersive Fourier Transform (TS-DFT) - but with reduced temporal record length, leading to time-bandwidth compression. The compression is implemented using a warped group delay dispersion leading to non-uniform time stretching of the signal's intensity envelope. Decoding requires optical phase retrieval and reconstruction of the input temporal profile for the case where the information of interest resides in the complex field. In this paper, we present results on the general behavior of the reconstruction process and its dependence on the signal-to-noise ratio. We also discuss the role of chirp in the input signal.

  2. Large-bandwidth planar photonic crystal waveguides

    DEFF Research Database (Denmark)

    Søndergaard, Thomas; Lavrinenko, Andrei

    2002-01-01

    A general design principle is presented for making finite-height photonic crystal waveguides that support leakage-free guidance of light over large frequency intervals. The large bandwidth waveguides are designed by introducing line defects in photonic crystal slabs, where the material in the line defect has appropriate dispersion properties relative to the photonic crystal slab material surrounding the line defect. A three-dimensional theoretical analysis is given for large-bandwidth waveguide designs based on a silicon-air photonic crystal slab suspended in air. In one example, leakage-free single-mode guidance is found for a large frequency interval covering 60% of the photonic band-gap.

  3. Bandwidth Assessment for MultiRotor UAVs

    Directory of Open Access Journals (Sweden)

    Ferrarese Gastone

    2017-06-01

    Full Text Available This paper is a technical note on the theoretical evaluation of the bandwidth of multirotor helicopters. Starting from a linear mathematical model of the dynamics of a multirotor aircraft, the transfer functions of the state variables that most strongly affect the stability characteristics of the aircraft are obtained. From these transfer functions, the frequency response analysis of the system is performed and the bandwidth of the system is defined. This result is immediately utilized for the design of discrete PID controllers for hovering flight stabilization. Numerical simulations demonstrate that knowledge of the bandwidth is a valid aid in the design of flight control systems for these machines.
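
    The bandwidth read-off described above can be reproduced numerically from any transfer function; the second-order model below is a generic illustrative stand-in for a linearized attitude response, not the multirotor model derived in the paper.

      # Read the -3 dB bandwidth off a transfer function's magnitude response.
      import numpy as np
      from scipy import signal

      wn, zeta = 20.0, 0.7                     # assumed natural frequency [rad/s] and damping
      sys = signal.TransferFunction([wn**2], [1.0, 2.0 * zeta * wn, wn**2])

      w = np.logspace(-1, 3, 2000)             # rad/s
      w, mag, _ = signal.bode(sys, w)          # magnitude in dB

      idx = np.argmax(mag < mag[0] - 3.0)      # first frequency 3 dB below the DC gain
      print("bandwidth ~", round(w[idx], 1), "rad/s")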

  4. Improved space bandwidth product in image upconversion

    DEFF Research Database (Denmark)

    Dam, Jeppe Seidelin; Pedersen, Christian; Tidemand-Lichtenberg, Peter

    2012-01-01

    We present a technique for increasing the space bandwidth product of a nonlinear image upconversion process used for spectral imaging. The technique exploits the strong dependency of the phase-matching condition in sum frequency generation (SFG) on the angle of propagation of the interacting fields with respect to the optical axis. Appropriate scanning of the phase-match condition (Δk=0) while acquiring images allows us to perform monochromatic image reconstruction with a significantly increased space bandwidth product. We derive the theory for the image reconstruction process and demonstrate acquisition of images with a more than 10-fold increase in space bandwidth product, i.e. the number of pixel elements, when compared to upconversion of images using fixed phase-match conditions.

  5. Long-pulse-width narrow-bandwidth solid state laser

    Science.gov (United States)

    Dane, C.B.; Hackel, L.A.

    1997-11-18

    A long pulse laser system emits 500-1000 ns quasi-rectangular pulses at 527 nm with near diffraction-limited divergence and near transform-limited bandwidth. The system consists of one or more flashlamp-pumped Nd:glass zig-zag amplifiers, a very low threshold stimulated-Brillouin-scattering (SBS) phase conjugator system, and a free-running single frequency Nd:YLF master oscillator. Completely passive polarization switching provides eight amplifier gain passes. Multiple frequency output can be generated by using SBS cells having different pressures of a gaseous SBS medium or different SBS materials. This long pulse, low divergence, narrow-bandwidth, multi-frequency output laser system is ideally suited for use as an illuminator for long range speckle imaging applications. Because of its high average power and high beam quality, this system has application in any process which would benefit from a long pulse format, including material processing and medical applications. 5 figs.

  6. Exploiting material softening in hard PZTs for resonant bandwidth enhancement

    Science.gov (United States)

    Leadenham, S.; Moura, A.; Erturk, A.

    2016-04-01

    Intentionally designed nonlinearities have been employed by several research groups to enhance the frequency bandwidth of vibration energy harvesters. Another type of nonlinear resonance behavior emerges from the piezoelectric constitutive behavior for high excitation levels and is manifested in the form of softening stiffness. This material nonlinearity does not result in the jump phenomenon in soft piezoelectric ceramics, e.g. PZT-5A and PZT-5H, due to their large internal dissipation. This paper explores the potential for wideband energy harvesting using a hard (relatively high quality factor) PZT-8 bimorph by exploiting its material softening. A wide range of base excitation experiments conducted for a set of resistive electrical loads confirms the frequency bandwidth enhancement.

  7. High speed image space parallel processing for computer-generated integral imaging system.

    Science.gov (United States)

    Kwon, Ki-Chul; Park, Chan; Erdenebat, Munkh-Uchral; Jeong, Ji-Seong; Choi, Jeong-Hun; Kim, Nam; Park, Jae-Hyeung; Lim, Young-Tae; Yoo, Kwan-Hee

    2012-01-16

    In an integral imaging display, the computer-generated integral imaging method has been widely used to create the elemental images from given three-dimensional object data. Long processing time, however, has been problematic, especially when the three-dimensional object data set or the number of elemental lenses is large. In this paper, we propose an image space parallel processing method, implemented using the Open Computing Language (OpenCL), for rapid generation of elemental image sets from large three-dimensional volume data. Using the proposed technique, it is possible to realize a real-time interactive integral imaging display system for 3D volume data constructed from computed tomography (CT) or magnetic resonance imaging (MRI) data.

  8. Stochastic parallel gradient descent based adaptive optics used for high contrast imaging coronagraph

    CERN Document Server

    Dong, Bing; Zhang, Xi

    2011-01-01

    An adaptive optics (AO) system based on the stochastic parallel gradient descent (SPGD) algorithm is proposed to reduce the speckle noise in the optical system of a stellar coronagraph in order to further improve the contrast. The principle of the SPGD algorithm is described briefly and a metric suitable for point source imaging optimization is given. The feasibility and good performance of the SPGD algorithm are demonstrated with an experimental system featuring a 140-actuator deformable mirror (DM) and a Shack-Hartmann wavefront sensor. The SPGD-based AO is then applied to a liquid crystal array (LCA) based coronagraph. The LCA can modulate the incoming light to generate a pupil apodization mask of any pattern. A circular stepped pattern is used in our preliminary experiment, and the image contrast shows improvement from 10^-3 to 10^-4.5 at an angular distance of 2λ/D after correction by the SPGD-based AO.
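
    The SPGD update itself is compact: perturb all actuator commands at random, measure the resulting change in the metric, and step the commands in proportion to that change. The quadratic "metric" below is a stand-in for the measured image-plane quantity, and the 140-element control vector simply mirrors the actuator count mentioned above.

      # Stochastic parallel gradient descent (SPGD) sketch for a vector of actuator commands.
      # metric() is a placeholder for the measured image-plane metric (e.g. residual speckle
      # energy); in the real system it comes from the camera, not from a formula.
      import numpy as np

      rng = np.random.default_rng(1)
      target = rng.normal(size=140)                        # unknown "best" DM shape
      metric = lambda u: -np.sum((u - target) ** 2)        # quantity to be maximized

      u = np.zeros(140)                                    # actuator commands
      gain, amp = 0.3, 0.05
      for _ in range(2000):
          du = amp * rng.choice([-1.0, 1.0], size=u.size)  # Bernoulli perturbation
          dJ = metric(u + du) - metric(u - du)             # two-sided metric difference
          u += gain * dJ * du                              # SPGD update
      print(metric(u))                                     # rises toward 0 as u approaches target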

  9. Automated cantilever exchange and optical alignment for High-throughput, parallel atomic force microscopy

    CERN Document Server

    Bijnagte, Tom; Kramer, Lukas; Dekker, Bert; Herfst, Rodolf; Sadeghian, Hamed

    2016-01-01

    In atomic force microscopy (AFM), the exchange and alignment of the AFM cantilever with respect to the optical beam and position-sensitive detector (PSD) are often performed manually. This process is tedious and time-consuming and sometimes damages the cantilever or tip. To increase the throughput of AFM in industrial applications, the ability to automatically exchange and align the cantilever in a very short time with sufficient accuracy is required. In this paper, we present the development of an automated cantilever exchange and optical alignment instrument. We present an experimental proof of principle by exchanging various types of AFM cantilevers in 6 seconds with an accuracy better than 2 μm. The exchange and alignment unit is miniaturized to allow for integration in a parallel AFM. The reliability of the demonstrator has also been evaluated: ten thousand continuous exchange and alignment cycles were performed without failure. The automated exchange and alignment of the AFM cantilever overcome a large ...

  10. Highly parallel implementation of sub-pixel interpolation for AVS HDTV decoder

    Institute of Scientific and Technical Information of China (English)

    Wan-yi LI; Lu YU

    2008-01-01

    In this paper, we propose an effective VLSI architecture of sub-pixel interpolation for motion compensation in the AVS HDTV decoder. To utilize the similar arithmetical operations of the 15 luma sub-pixel positions, three types of interpolation filters are proposed. A simplified multiplier is presented due to the limited range of input in the chroma interpolation process. To improve the processing throughput, a parallel and pipelined computing architecture is adopted. The simulation results show that the proposed hardware implementation can satisfy the real-time constraint for the AVS HDTV (1920×1088) 30 fps decoder by operating at 108 MHz with 38.18k logic gates. Meanwhile, it costs only 216 cycles to process one macroblock, which means the B frame sub-pixel interpolation can be realized by using only one set of the proposed architecture under real-time constraints.

  11. Nonlinear Elastodynamic Behaviour Analysis of High-Speed Spatial Parallel Coordinate Measuring Machines

    Directory of Open Access Journals (Sweden)

    Xiulong Chen

    2012-10-01

    Full Text Available In order to study the elastodynamic behaviour of 4-UPS-UPU (universal joint-prismatic pair-spherical joint / universal joint-prismatic pair-universal joint) high-speed spatial PCMMs (parallel coordinate measuring machines), a nonlinear time-varying dynamics model, which comprehensively considers geometric nonlinearity and the rigid-flexible coupling effect, is derived by using Lagrange equations and finite element methods. Based on the Newmark method, the kinematic output response of 4-UPS-UPU PCMMs is illustrated through numerical simulation. The results of the simulation show that the flexibility of the links has a significant impact on the system dynamics response. This research provides an important theoretical basis for the optimization design and vibration control of 4-UPS-UPU PCMMs.

  12. High resolution carotid black-blood 3T MR with parallel imaging and dedicated 4-channel surface coils

    Directory of Open Access Journals (Sweden)

    Frey Ute

    2009-10-01

    Full Text Available Abstract Background Most of the carotid plaque MR studies have been performed using black-blood protocols at 1.5 T without parallel imaging techniques. The purpose of this study was to evaluate a multi-sequence, black-blood MR protocol using parallel imaging and a dedicated 4-channel surface coil for vessel wall imaging of the carotid arteries at 3 T. Materials and methods 14 healthy volunteers and 14 patients with intimal thickening as proven by duplex ultrasound had their carotid arteries imaged at 3 T using a multi-sequence protocol (time-of-flight MR angiography, pre-contrast T1w-, PDw- and T2w sequences in the volunteers, additional post-contrast T1w- and dynamic contrast enhanced sequences in patients. To assess intrascan reproducibility, 10 volunteers were scanned twice within 2 weeks. Results Intrascan reproducibility for quantitative measurements of lumen, wall and outer wall areas was excellent with Intraclass Correlation Coefficients >0.98 and measurement errors of 1.5%, 4.5% and 1.9%, respectively. Patients had larger wall areas than volunteers in both common carotid and internal carotid arteries and smaller lumen areas in internal carotid arteries (p Conclusion The findings of this study indicate that high resolution carotid black-blood 3 T MR with parallel imaging is a fast, reproducible and robust method to assess carotid atherosclerotic plaque in vivo and this method is ready to be used in clinical practice.

  13. Large-bandwidth planar photonic crystal waveguides

    DEFF Research Database (Denmark)

    Søndergaard, Thomas; Lavrinenko, Andrei

    2002-01-01

    A general design principle is presented for making finite-height photonic crystal waveguides that support leakage-free guidance of light over large frequency intervals. The large bandwidth waveguides are designed by introducing line defects in photonic crystal slabs, where the material in the line defect has appropriate dispersion properties relative to the photonic crystal slab material surrounding the line defect. A three-dimensional theoretical analysis is given for large-bandwidth waveguide designs based on a silicon-air photonic crystal slab suspended in air. In one example, leakage-free single-mode guidance is found for a large frequency interval covering 60% of the photonic band-gap.

  14. Real-Time Virtual Instruments for Remote Sensor Monitoring Using Low Bandwidth Wireless Networks

    Directory of Open Access Journals (Sweden)

    Biruk Gebre

    2008-06-01

    Full Text Available The development of a peer-to-peer virtual instrumentation system for remote acquisition, analysis and transmission of data on low bandwidth networks is described. The objective of this system is to collect high frequency/high bandwidth data from multiple sensors placed at remote locations and adaptively adjust the resolution of this data so that it can be transmitted on bandwidth-limited networks to a central monitoring and command center. This is achieved by adaptively re-sampling (decimating) the data from the sensors at the remote location before transmission. The decimation is adjusted to the available bandwidth of the communications network, which is characterized in real time. As a result, the system allows users at the remote command center to view high bandwidth data (at a lower resolution) with user-aware and minimized latency. This technique is applied to an eight-hydrophone data acquisition system that requires a 25.6 Mbps connection for the transmission of the full data set using a wireless connection with 1 – 3.5 Mbps variable bandwidth. This technique can be used for applications that require monitoring of high bandwidth data from remote sensors in research and education fields such as remote scientific instruments and visually driven control applications.
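
    As an illustration of the adaptive re-sampling step, the sketch below picks an integer decimation factor from the ratio of the raw sensor data rate to the currently measured link bandwidth, then low-pass filters and downsamples one channel before transmission; the sample rate, word length and link rate are placeholders, not the hydrophone system's actual figures.

      # Choose a decimation factor from the measured link bandwidth, then decimate one channel.
      # The rates below are illustrative placeholders, not the hydrophone system's values.
      import math
      import numpy as np
      from scipy import signal

      fs = 100_000            # raw sample rate per channel [Hz] (assumed)
      bits = 16               # bits per sample (assumed)
      channels = 8
      link_bps = 2_000_000    # currently measured usable bandwidth [bit/s]

      raw_bps = fs * bits * channels
      factor = max(1, math.ceil(raw_bps / link_bps))      # integer decimation factor

      t = np.arange(fs) / fs
      x = np.sin(2 * np.pi * 440.0 * t)                   # stand-in for one sensor channel
      x_low = signal.decimate(x, factor, ftype="fir")     # anti-alias filter + downsample

      print("decimation factor", factor, ":", len(x), "->", len(x_low), "samples")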

  16. An Adaptive Bandwidth Allocation for Energy Efficient Wireless Communication Systems

    Institute of Scientific and Technical Information of China (English)

    Yung-Fa Huang; Che-Hao Li; Chuan-Bi Lin; Chia-Chi Chang

    2015-01-01

    In this paper, an energy-efficient bandwidth allocation scheme is proposed for wireless communication systems. An optimal bandwidth expansion (OBE) scheme is proposed to assign the available system bandwidth to users. When the system bandwidth is not fully loaded, the remaining bandwidth can be assigned to the other users in an energy-efficient manner. Simulation results show that the energy efficiency of the proposed OBE scheme outperforms the traditional same bandwidth expansion (SBE) scheme. Thus, the proposed OBE can effectively assign the system bandwidth and improve energy efficiency.

  17. Extending molecular simulation time scales: Parallel in time integrations for high-level quantum chemistry and complex force representations.

    Science.gov (United States)

    Bylaska, Eric J; Weare, Jonathan Q; Weare, John H

    2013-08-21

    distributed computing environment using very slow transmission control protocol/Internet protocol networks. Scripts written in Python that make calls to a precompiled quantum chemistry package (NWChem) are demonstrated to provide an actual speedup of 8.2 for a 2.5 ps AIMD simulation of HCl + 4H2O at the MP2/6-31G* level. Implemented in this way these algorithms can be used for long time high-level AIMD simulations at a modest cost using machines connected by very slow networks such as WiFi, or in different time zones connected by the Internet. The algorithms can also be used with programs that are already parallel. Using these algorithms, we are able to reduce the cost of a MP2/6-311++G(2d,2p) simulation that had reached its maximum possible speedup in the parallelization of the electronic structure calculation from 32 s/time step to 6.9 s/time step.

  18. A highly parallel method for synthesizing DNA repeats enables the discovery of ‘smart’ protein polymers

    Science.gov (United States)

    Amiram, Miriam; Quiroz, Felipe Garcia; Callahan, Daniel J.; Chilkoti, Ashutosh

    2011-02-01

    Robust high-throughput synthesis methods are needed to expand the repertoire of repetitive protein-polymers for different applications. To address this need, we developed a new method, overlap extension rolling circle amplification (OERCA), for the highly parallel synthesis of genes encoding repetitive protein-polymers. OERCA involves a single PCR-type reaction for the rolling circle amplification of a circular DNA template and simultaneous overlap extension by thermal cycling. We characterized the variables that control OERCA and demonstrated its superiority over existing methods, its robustness, high-throughput and versatility by synthesizing variants of elastin-like polypeptides (ELPs) and protease-responsive polymers of glucagon-like peptide-1 analogues. Despite the GC-rich, highly repetitive sequences of ELPs, we synthesized remarkably large genes without recursive ligation. OERCA also enabled us to discover ‘smart’ biopolymers that exhibit fully reversible thermally responsive behaviour. This powerful strategy generates libraries of repetitive genes over a wide and tunable range of molecular weights in a ‘one-pot’ parallel format.

  19. Quartic scaling MP2 for solids: A highly parallelized algorithm in the plane-wave basis

    CERN Document Server

    Schäfer, Tobias; Kresse, Georg

    2016-01-01

    We present a low-complexity algorithm to calculate the correlation energy of periodic systems in second-order Møller-Plesset perturbation theory (MP2). In contrast to previous approximation-free MP2 codes, our implementation possesses a quartic scaling, $\mathcal{O}(N^4)$, with respect to the system size $N$ and offers an almost ideal parallelization efficiency. The general issue that the correlation energy converges slowly with the number of basis functions is solved by an internal basis set extrapolation. The key concept to reduce the scaling of the algorithm is to eliminate all summations over virtual bands, which can be elegantly achieved in the Laplace-transformed MP2 (LTMP2) formulation using plane-wave basis sets. Analogously, this approach could allow the calculation of second-order screened exchange (SOSEX) as well as particle-hole ladder diagrams with a similarly low complexity. Hence, the presented method can be considered a step towards systematically improved correlation energies.

  20. Cpl6: The New Extensible, High-Performance Parallel Coupler forthe Community Climate System Model

    Energy Technology Data Exchange (ETDEWEB)

    Craig, Anthony P.; Jacob, Robert L.; Kauffman, Brian; Bettge, Tom; Larson, Jay; Ong, Everest; Ding, Chris; He, Yun

    2005-03-24

    Coupled climate models are large, multiphysics applications designed to simulate the Earth's climate and predict the response of the climate to any changes in the forcing or boundary conditions. The Community Climate System Model (CCSM) is a widely used state-of-the-art climate model that has released several versions to the climate community over the past ten years. Like many climate models, CCSM employs a coupler, a functional unit that coordinates the exchange of data between parts of the climate system such as the atmosphere and ocean. This paper describes the new coupler, cpl6, contained in the latest version of CCSM, CCSM3. Cpl6 introduces distributed-memory parallelism to the coupler, a class library for important coupler functions, and a standardized interface for component models. Cpl6 is implemented entirely in Fortran90 and uses the Model Coupling Toolkit as the base for most of its classes. Cpl6 gives improved performance over previous versions and scales well on multiple platforms.

  1. Parallel Adaptive High-Order CFD Simulations Characterizing SOFIA Cavity Acoustics

    Science.gov (United States)

    Barad, Michael F.; Brehm, Christoph; Kiris, Cetin C.; Biswas, Rupak

    2016-01-01

    This paper presents large-scale MPI-parallel computational fluid dynamics simulations for the Stratospheric Observatory for Infrared Astronomy (SOFIA). SOFIA is an airborne, 2.5-meter infrared telescope mounted in an open cavity in the aft fuselage of a Boeing 747SP. These simulations focus on how the unsteady flow field inside and over the cavity interferes with the optical path and mounting structure of the telescope. A temporally fourth-order accurate Runge-Kutta and spatially fifth-order accurate WENO-5Z scheme was used to perform implicit large eddy simulations. An immersed boundary method provides automated gridding for complex geometries and natural coupling to a block-structured Cartesian adaptive mesh refinement framework. Strong scaling studies using NASA's Pleiades supercomputer with up to 32k CPU cores and 4 billion computational cells show excellent scaling. Dynamic load balancing based on execution time on individual AMR blocks addresses the irregular numerical cost associated with blocks containing boundaries. Limits to scaling beyond 32k cores are identified, and targeted code optimizations are discussed.

  2. Experience with highly-parallel software for the storage system of the ATLAS Experiment at CERN

    CERN Document Server

    Colombo, T; The ATLAS collaboration

    2012-01-01

    The ATLAS experiment is observing proton-proton collisions delivered by the LHC accelerator at a centre of mass energy of 7 TeV. The ATLAS Trigger and Data Acquisition (TDAQ) system selects interesting events on-line in a three-level trigger system in order to store them at a budgeted rate of several hundred Hz, for an average event size of ~1.2 MB. This paper focuses on the TDAQ data-logging system and in particular on the implementation and performance of a novel SW design, reporting on the effort of exploiting the full power of recently installed multi-core hardware. In this respect, the main challenge presented by the data-logging workload is the conflict between the largely parallel nature of the event processing, especially the recently introduced on-line event-compression, and the constraint of sequential file writing and checksum evaluation. This is further complicated by the necessity of operating in a fully data-driven mode, to cope with continuously evolving trigger and detector configurations. T...

  3. Coherent temporal imaging with analog time-bandwidth compression

    CERN Document Server

    Asghari, Mohammad H

    2013-01-01

    We introduce the concept of coherent temporal imaging and its combination with the anamorphic stretch transform. The new system can measure both the temporal profile of fast waveforms and their spectrum in real time and at high throughput. We show that the combination of coherent detection and warped time-frequency mapping also performs time-bandwidth compression. By reducing the temporal width without sacrificing spectral resolution, it addresses the Big Data problem in real-time instruments. The proposed method is the first application of the recently demonstrated Anamorphic Stretch Transform to temporal imaging. Using this method, narrow spectral features beyond the spectrometer resolution can be captured. At the same time, the output bandwidth and hence the record length is minimized. Coherent detection allows the temporal imaging and dispersive Fourier transform systems to operate in the traditional far-field as well as in near-field regimes.

  4. Compact silicon multimode waveguide spectrometer with enhanced bandwidth

    Science.gov (United States)

    Piels, Molly; Zibar, Darko

    2017-01-01

    Compact, broadband, and high-resolution spectrometers are appealing for sensing applications but difficult to fabricate. Here we show, using calibration data, a spectrometer based on a multimode waveguide with 2 GHz resolution, 250 GHz bandwidth, and a 1.6 mm × 2.1 mm footprint. Typically, such spectrometers have a bandwidth limited by the number of modes supported by the waveguide. In this case, an on-chip mode-exciting element is used to repeatably excite distinct collections of waveguide modes. This increases the number of independent spectral channels from the number of modes to this number squared, resulting in an extension of the usable range. PMID:28290537

  5. Dynamic resource management using bandwidth brokers

    Institute of Scientific and Technical Information of China (English)

    Yu Chengzhi; Song Hantao; Hou Xianjun; Pan Chengsheng

    2006-01-01

    The admission control issue in the design of a centralized bandwidth broker model for dynamic control and management of QoS provisioning is studied. A two-phase differentiated flow treatment based dynamic admission control scheme under the centralized bandwidth broker model is proposed. In the proposed scheme, the flow requests are classified into two classes and get differentiated treatment according to their QoS demands. We demonstrate that this admission control scheme can not only improve the resource utilization but also guarantee the flows' QoS. Furthermore, the admission control is divided into two phases: edge admission control and interior admission control. During the interior phase, the PoQ scheme is adopted, which enhances the call processing capability of the bandwidth broker. The simulation results show that the proposed scheme can result in lower flow blocking probability and higher resource utilization. And it also reduces the number of QoS state accesses/updates, thereby increasing the overall call processing capability of the bandwidth broker.

  6. A System Theoretic Approach to Bandwidth Estimation

    CERN Document Server

    Liebeherr, Jorg; Valaee, Shahrokh

    2008-01-01

    It is shown that bandwidth estimation in packet networks can be viewed in terms of min-plus linear system theory. The available bandwidth of a link or complete path is expressed in terms of a service curve, which is a function that appears in the network calculus to express the service available to a traffic flow. The service curve is estimated based on measurements of a sequence of probing packets or passive measurements of a sample path of arrivals. It is shown that existing bandwidth estimation methods can be derived in the min-plus algebra of the network calculus, thus providing further mathematical justification for these methods. Principal difficulties of estimating available bandwidth from measurement of network probes are related to potential non-linearities of the underlying network. When networks are viewed as systems that operate either in a linear or in a non-linear regime, it is argued that probing schemes extract the most information at a point when the network crosses from a linear to a n...
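
    In standard network-calculus notation (written here from general usage of the theory, not copied from this paper), a min-plus linear system relates the cumulative arrivals A and departures D through the service curve S, and the service curve can be estimated from measurements by min-plus deconvolution:

      D(t) = (A \otimes S)(t) = \inf_{0 \le s \le t} \{ A(s) + S(t-s) \},
      \qquad
      \hat{S}(t) = (D \oslash A)(t) = \sup_{u \ge 0} \{ D(t+u) - A(u) \} \le S(t),

    where the long-run slope of the estimated service curve \hat{S} serves as an estimate of the available bandwidth.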

  7. Hierarchical Parallelization of Gene Differential Association Analysis

    Directory of Open Access Journals (Sweden)

    Dwarkadas Sandhya

    2011-09-01

    Full Text Available Abstract Background Microarray gene differential expression analysis is a widely used technique that deals with high dimensional data and is computationally intensive for permutation-based procedures. Microarray gene differential association analysis is even more computationally demanding and must take advantage of multicore computing technology, which is the driving force behind increasing compute power in recent years. In this paper, we present a two-layer hierarchical parallel implementation of gene differential association analysis. It takes advantage of both fine- and coarse-grain (with granularity defined by the frequency of communication) parallelism in order to effectively leverage the non-uniform nature of parallel processing available in the cutting-edge systems of today. Results Our results show that this hierarchical strategy matches data sharing behavior to the properties of the underlying hardware, thereby reducing the memory and bandwidth needs of the application. The resulting improved efficiency reduces computation time and allows the gene differential association analysis code to scale its execution with the number of processors. The code and biological data used in this study are downloadable from http://www.urmc.rochester.edu/biostat/people/faculty/hu.cfm. Conclusions The performance sweet spot occurs when using a number of threads per MPI process that allows the working sets of the corresponding MPI processes running on the multicore to fit within the machine cache. Hence, we suggest that practitioners follow this principle in selecting the appropriate number of MPI processes and threads within each MPI process for their cluster configurations. We believe that the principles of this hierarchical approach to parallelization can be utilized in the parallelization of other computationally demanding kernels.

  8. Adaptive bandwidth measurements of importance functions for speech intelligibility prediction.

    Science.gov (United States)

    Whitmal, Nathaniel A; DeRoy, Kristina

    2011-12-01

    The Articulation Index (AI) and Speech Intelligibility Index (SII) predict intelligibility scores from measurements of speech and hearing parameters. One component in the prediction is the "importance function," a weighting function that characterizes contributions of particular spectral regions of speech to speech intelligibility. Previous work with SII predictions for hearing-impaired subjects suggests that prediction accuracy might improve if importance functions for individual subjects were available. Unfortunately, previous importance function measurements have required extensive intelligibility testing with groups of subjects, using speech processed by various fixed-bandwidth low-pass and high-pass filters. A more efficient approach appropriate to individual subjects is desired. The purpose of this study was to evaluate the feasibility of measuring importance functions for individual subjects with adaptive-bandwidth filters. In two experiments, ten subjects with normal hearing listened to vowel-consonant-vowel (VCV) nonsense words processed by low-pass and high-pass filters whose bandwidths were varied adaptively to produce specified performance levels in accordance with the transformed up-down rules of Levitt [(1971). J. Acoust. Soc. Am. 49, 467-477]. Local linear psychometric functions were fit to resulting data and used to generate an importance function for VCV words. Results indicate that the adaptive method is reliable and efficient, and produces importance function data consistent with that of the corresponding AI/SII importance function.
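
    The adaptive-bandwidth idea rests on a transformed up-down staircase of the kind introduced by Levitt (1971); the sketch below runs a generic 2-down/1-up rule, which tracks the 70.7%-correct point, against a simulated listener with an assumed psychometric function, purely to illustrate the adaptive track rather than the study's filtering or scoring details.

      # Generic 2-down/1-up adaptive staircase (after Levitt, 1971), fixed step size.
      # The "listener" is a logistic psychometric function over filter bandwidth [Hz];
      # in the study the responses come from real subjects, not a formula.
      import numpy as np

      rng = np.random.default_rng(0)
      p_correct = lambda bw: 1.0 / (1.0 + np.exp(-(bw - 2000.0) / 300.0))  # assumed shape

      bw, step = 4000.0, 400.0            # starting bandwidth and step size (assumed)
      consecutive, reversals, direction = 0, [], None
      while len(reversals) < 8:
          correct = rng.random() < p_correct(bw)
          if correct:
              consecutive += 1
              if consecutive == 2:        # two correct in a row -> make the task harder
                  new_dir = "down"; bw -= step; consecutive = 0
              else:
                  new_dir = direction
          else:                           # one error -> make the task easier
              new_dir = "up"; bw += step; consecutive = 0
          if direction and new_dir and new_dir != direction:
              reversals.append(bw)        # record (approximate) reversal level
          direction = new_dir or direction

      print("estimated 70.7%-correct bandwidth ~", np.mean(reversals[2:]), "Hz")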

  9. A parallel algorithm for error correction in high-throughput short-read data on CUDA-enabled graphics hardware.

    Science.gov (United States)

    Shi, Haixiang; Schmidt, Bertil; Liu, Weiguo; Müller-Wittig, Wolfgang

    2010-04-01

    Emerging DNA sequencing technologies open up exciting new opportunities for genome sequencing by generating read data with a massive throughput. However, produced reads are significantly shorter and more error-prone compared to the traditional Sanger shotgun sequencing method. This poses challenges for de novo DNA fragment assembly algorithms in terms of both accuracy (to deal with short, error-prone reads) and scalability (to deal with very large input data sets). In this article, we present a scalable parallel algorithm for correcting sequencing errors in high-throughput short-read data so that error-free reads can be available before DNA fragment assembly, which is of high importance to many graph-based short-read assembly tools. The algorithm is based on spectral alignment and uses the Compute Unified Device Architecture (CUDA) programming model. To gain efficiency we are taking advantage of the CUDA texture memory using a space-efficient Bloom filter data structure for spectrum membership queries. We have tested the runtime and accuracy of our algorithm using real and simulated Illumina data for different read lengths, error rates, input sizes, and algorithmic parameters. Using a CUDA-enabled mass-produced GPU (available for less than US$400 at any local computer outlet), this results in speedups of 12-84 times for the parallelized error correction, and speedups of 3-63 times for both sequential preprocessing and parallelized error correction compared to the publicly available Euler-SR program. Our implementation is freely available for download from http://cuda-ec.sourceforge.net .
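
    The space-efficient membership test mentioned above can be illustrated with a plain Bloom filter; the pure-Python sketch below only mirrors the data structure's behaviour (insert k-mers, query with a small false-positive rate and no false negatives) and none of the CUDA texture-memory details of the actual implementation.

      # Minimal Bloom filter for k-mer (spectrum) membership queries.
      # A CUDA version stores the bit array in texture memory; here it is a bytearray.
      import hashlib

      class BloomFilter:
          def __init__(self, n_bits=1 << 20, n_hashes=4):
              self.n_bits, self.n_hashes = n_bits, n_hashes
              self.bits = bytearray(n_bits // 8)

          def _positions(self, item):
              for i in range(self.n_hashes):
                  h = hashlib.sha256(f"{i}:{item}".encode()).digest()
                  yield int.from_bytes(h[:8], "little") % self.n_bits

          def add(self, item):
              for p in self._positions(item):
                  self.bits[p // 8] |= 1 << (p % 8)

          def __contains__(self, item):    # may give false positives, never false negatives
              return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

      spectrum = BloomFilter()
      for kmer in ("ACGTACGTACGTACGTACGTA", "TTGCATTGCATTGCATTGCAA"):
          spectrum.add(kmer)
      print("ACGTACGTACGTACGTACGTA" in spectrum)   # True
      print("AAAAAAAAAAAAAAAAAAAAA" in spectrum)   # almost certainly False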

  10. Kernel bandwidth estimation for non-parametric density estimation: a comparative study

    CSIR Research Space (South Africa)

    Van der Walt, CM

    2013-12-01

    Full Text Available We investigate the performance of conventional bandwidth estimators for non-parametric kernel density estimation on a number of representative pattern-recognition tasks, to gain a better understanding of the behaviour of these estimators in high...
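
    One of the conventional estimators usually included in such comparisons is Silverman's rule of thumb; the sketch below computes it for a one-dimensional toy sample and feeds it to a Gaussian KDE (a generic illustration, not the study's data or protocol).

      # Silverman's rule-of-thumb bandwidth for a 1-D Gaussian kernel density estimate.
      import numpy as np
      from scipy.stats import gaussian_kde

      rng = np.random.default_rng(0)
      x = np.concatenate([rng.normal(0, 1, 400), rng.normal(5, 0.5, 100)])  # toy sample

      n = x.size
      sigma = min(x.std(ddof=1), (np.percentile(x, 75) - np.percentile(x, 25)) / 1.34)
      h = 0.9 * sigma * n ** (-1 / 5)               # Silverman (1986) rule of thumb
      print("bandwidth h =", round(h, 3))

      # gaussian_kde expects a *factor* that multiplies the sample std, so rescale:
      kde = gaussian_kde(x, bw_method=h / x.std(ddof=1))
      print(kde(np.array([0.0, 5.0])))              # density near the two modes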

  11. High-performance parallel computing in the classroom using the public goods game as an example

    Science.gov (United States)

    Perc, Matjaž

    2017-07-01

    The use of computers in statistical physics is common because the sheer number of equations that describe the behaviour of an entire system particle by particle often makes it impossible to solve them exactly. Monte Carlo methods form a particularly important class of numerical methods for solving problems in statistical physics. Although these methods are simple in principle, their proper use requires a good command of statistical mechanics, as well as considerable computational resources. The aim of this paper is to demonstrate how the usage of widely accessible graphics cards on personal computers can elevate the computing power in Monte Carlo simulations by orders of magnitude, thus allowing live classroom demonstration of phenomena that would otherwise be out of reach. As an example, we use the public goods game on a square lattice where two strategies compete for common resources in a social dilemma situation. We show that the second-order phase transition to an absorbing phase in the system belongs to the directed percolation universality class, and we compare the time needed to arrive at this result by means of the main processor and by means of a suitable graphics card. Parallel computing on graphics processing units has been developed actively during the last decade, to the point where today the learning curve for entry is anything but steep for those familiar with programming. The subject is thus ripe for inclusion in graduate and advanced undergraduate curricula, and we hope that this paper will facilitate this process in the realm of physics education. To that end, we provide a documented source code for an easy reproduction of presented results and for further development of Monte Carlo simulations of similar systems.
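
    For readers who want a starting point, the lattice game itself fits in a short serial NumPy sketch using the customary Fermi imitation rule (parameter values below are illustrative); porting the elementary updates to a graphics card is then precisely the exercise the paper describes.

      # Public goods game on an L x L square lattice with the Fermi imitation rule.
      # Serial reference version; the GPU gain comes from updating many sites at once.
      import numpy as np

      L, r, K, mc_steps = 32, 3.8, 0.5, 100         # lattice size, synergy factor, noise, MC steps
      rng = np.random.default_rng(0)
      coop = rng.integers(0, 2, size=(L, L))        # 1 = cooperator, 0 = defector

      def payoff(s, x, y):
          # (x, y) plays in the 5 groups centred on itself and its 4 neighbours,
          # each group having 5 members; cooperators pay 1 into each group's pool.
          total = 0.0
          for gx, gy in ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)):
              pool = sum(s[(x + gx + dx) % L, (y + gy + dy) % L]
                         for dx, dy in ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)))
              total += r * pool / 5.0 - s[x, y]     # share of the pool minus own contribution
          return total

      for _ in range(mc_steps * L * L):             # one Monte Carlo step = L*L elementary updates
          x, y = rng.integers(0, L, 2)
          dx, dy = ((1, 0), (-1, 0), (0, 1), (0, -1))[rng.integers(0, 4)]
          nx, ny = (x + dx) % L, (y + dy) % L
          dp = payoff(coop, nx, ny) - payoff(coop, x, y)
          if rng.random() < 1.0 / (1.0 + np.exp(-dp / K)):
              coop[x, y] = coop[nx, ny]             # imitate the neighbour's strategy
      print("cooperator fraction:", coop.mean())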

  12. Parallel implementation of high-speed, phase diverse atmospheric turbulence compensation method on a neural network-based architecture

    Science.gov (United States)

    Arrasmith, William W.; Sullivan, Sean F.

    2008-04-01

    Phase diversity imaging methods work well in removing atmospheric turbulence and some system effects from predominantly near-field imaging systems. However, phase diversity approaches can be computationally intensive and slow. We present a recently adapted, high-speed phase diversity method using a conventional, software-based neural network paradigm. This phase-diversity method has the advantage of eliminating many time-consuming, computationally heavy calculations and directly estimates the optical transfer function from the entrance pupil phases or phase differences. Additionally, this method is more accurate than conventional Zernike-based phase diversity approaches and lends itself to implementation on parallel software or hardware architectures. We use computer simulation to demonstrate how this high-speed, phase diverse imaging method can be implemented on a parallel, high-speed, neural network-based architecture, specifically the Cellular Neural Network (CNN). The CNN architecture was chosen as a representative, neural network-based processing environment because 1) the CNN can be implemented in 2-D or 3-D processing schemes, 2) it can be implemented in hardware or software, 3) recent 2-D implementations of CNN technology have shown a three-orders-of-magnitude superiority in speed, area, or power over equivalent digital representations, and 4) a complete development environment exists. We also provide a short discussion on processing speed.
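
    For context on the claim that the optical transfer function can be obtained directly from entrance-pupil phases, the following numpy sketch evaluates the standard Fourier-optics relation (pupil function, then point-spread function, then normalized OTF). It is the textbook relation only, not the neural-network estimator of the paper; the circular aperture, random phase screen, and grid size are illustrative assumptions.

    # Textbook relation between pupil phase and OTF (not the paper's estimator).
    import numpy as np

    def otf_from_pupil_phase(phase, aperture):
        """phase, aperture: 2-D arrays; aperture is 1 inside the pupil, 0 outside."""
        pupil = aperture * np.exp(1j * phase)              # generalized pupil function
        psf = np.abs(np.fft.ifft2(pupil)) ** 2             # incoherent point-spread function
        otf = np.fft.fft2(psf)
        return otf / otf[0, 0]                             # normalize so OTF(0, 0) = 1

    if __name__ == "__main__":
        n = 128
        y, x = np.mgrid[-n // 2:n // 2, -n // 2:n // 2]
        aperture = (x ** 2 + y ** 2 < (n // 4) ** 2).astype(float)
        phase = 0.5 * np.random.randn(n, n) * aperture     # toy turbulence-like phase screen
        otf = otf_from_pupil_phase(np.fft.ifftshift(phase), np.fft.ifftshift(aperture))
        print(np.abs(otf[0, 0]))                           # 1.0 by construction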

  13. Concentric Parallel Combining Balun for Millimeter-Wave Power Amplifier in Low-Power CMOS with High-Power Density

    Science.gov (United States)

    Han, Jiang-An; Kong, Zhi-Hui; Ma, Kaixue; Yeo, Kiat Seng; Lim, Wei Meng

    2016-11-01

    This paper presents a novel balun for a millimeter-wave power amplifier (PA) design to achieve high power density in a 65-nm low-power (LP) CMOS process. By using a concentric winding technique, the proposed parallel combining balun with compact size accomplishes power combining and unbalance-balance conversion concurrently. To calculate its power-combining efficiency for wave components of varying amplitude and phase, a method based on S-parameters is derived. Based on the proposed parallel combining balun, a fabricated 60-GHz industrial, scientific, and medical (ISM) band PA with single-ended I/O achieves an 18.9-dB gain, an 8.8-dBm output power at 1-dB compression, and a 14.3-dBm saturated output power (Psat) at 62 GHz. This PA, occupying only a 0.10-mm2 core area, demonstrates a high power density of 269.15 mW/mm2 in 65-nm LP CMOS.

  14. The use of coded PCR primers enables high-throughput sequencing of multiple homolog amplification products by 454 parallel sequencing

    DEFF Research Database (Denmark)

    Binladen, Jonas; Gilbert, M Thomas P; Bollback, Jonathan P

    2007-01-01

    BACKGROUND: The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine template DNA from multiple individuals, as homologous sequences cannot be subsequently assigned to their original sources. METHODOLOGY: We use conventional PCR with 5'-nucleotide tagged primers to generate homologous DNA amplification products from multiple specimens, followed by sequencing through... This approach can be applied to PCR-based assays of multiple homologous PCR products. The new approach will be of value to a broad range of research areas, such as those of comparative genomics, complete mitochondrial analyses, population genetics, and phylogenetics.
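
    The following Python sketch illustrates the demultiplexing step such 5'-nucleotide tags make possible: reads are assigned back to their source specimens by their leading tag before any downstream analysis. The tag sequences, tag length, and sample names are illustrative assumptions, not those used in the study.

    # Hypothetical tag-to-specimen table; real tags are chosen for error tolerance.
    TAG_TO_SAMPLE = {"ACGT": "specimen_1", "TGCA": "specimen_2", "GATC": "specimen_3"}
    TAG_LENGTH = 4

    def demultiplex(reads):
        """Group reads by their leading tag; untagged reads are set aside."""
        by_sample = {name: [] for name in TAG_TO_SAMPLE.values()}
        unassigned = []
        for read in reads:
            sample = TAG_TO_SAMPLE.get(read[:TAG_LENGTH])
            if sample is None:
                unassigned.append(read)
            else:
                by_sample[sample].append(read[TAG_LENGTH:])  # strip the tag before analysis
        return by_sample, unassigned

    reads = ["ACGTTTGGAACC", "TGCAGGGTTTAA", "NNNNACGTACGT"]
    assigned, dropped = demultiplex(reads)
    print({k: len(v) for k, v in assigned.items()}, "unassigned:", len(dropped))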

  15. Churchill: an ultra-fast, deterministic, highly scalable and balanced parallelization strategy for the discovery of human genetic variation in clinical and population-scale genomics.

    Science.gov (United States)

    Kelly, Benjamin J; Fitch, James R; Hu, Yangqiu; Corsmeier, Donald J; Zhong, Huachun; Wetzel, Amy N; Nordquist, Russell D; Newsom, David L; White, Peter

    2015-01-20

    While advances in genome sequencing technology make population-scale genomics a possibility, current approaches for analysis of these data rely upon parallelization strategies that have limited scalability, complex implementation and lack reproducibility. Churchill, a balanced regional parallelization strategy, overcomes these challenges, fully automating the multiple steps required to go from raw sequencing reads to variant discovery. Through implementation of novel deterministic parallelization techniques, Churchill allows computationally efficient analysis of a high-depth whole genome sample in less than two hours. The method is highly scalable, enabling full analysis of the 1000 Genomes raw sequence dataset in a week using cloud resources. http://churchill.nchri.org/.
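
    A minimal Python sketch of the general idea behind balanced regional parallelization: split the genome into regions of near-equal size so that each worker receives a comparable share of work. The chromosome lengths and worker count are illustrative, and this is not the Churchill implementation, which additionally handles region boundaries and deterministic merging of results.

    # Evenly partition a genome into per-worker region lists (illustrative only).
    def balanced_regions(chrom_lengths, num_workers):
        """Return num_workers lists of (chrom, start, end) covering the genome evenly."""
        total = sum(chrom_lengths.values())
        target = total // num_workers + 1
        regions, current, used = [[] for _ in range(num_workers)], 0, 0
        for chrom, length in chrom_lengths.items():
            start = 0
            while start < length:
                take = min(target - used, length - start)
                regions[current].append((chrom, start, start + take))
                start += take
                used += take
                if used >= target and current < num_workers - 1:
                    current, used = current + 1, 0
        return regions

    chroms = {"chr1": 248_956_422, "chr2": 242_193_529, "chr3": 198_295_559}
    for worker_id, work in enumerate(balanced_regions(chroms, 4)):
        print(worker_id, sum(end - start for _, start, end in work), "bp")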

  16. Improved Radiation and Bandwidth of Triangular and Star Patch Antenna

    Directory of Open Access Journals (Sweden)

    M. Ramkumar Prabhu

    2012-06-01

    Full Text Available This study presents a hexagonal-shaped Defected Ground Structure (DGS) implemented on a two-element triangular patch microstrip antenna array. The radiation performance of the antenna is characterized by varying the geometry and dimensions of the DGS and by locating the DGS at specific positions, which were simulated. Simulation and measurement results have verified that the antenna with DGS outperforms the antenna without DGS. Measurement results of the hexagonal DGS show an axial ratio bandwidth enhancement of 10 MHz, a return loss improvement of 35%, a mutual coupling reduction of 3 dB, and a gain enhancement of 1 dB. A new wideband and small-size star-shaped patch antenna fed capacitively by a small diamond-shaped patch is also proposed. To enhance the impedance bandwidth, posts are incorporated under the patch antenna. The HFSS high-frequency simulator is employed to analyze the proposed antenna, and simulated results on the return loss, the E- and H-plane radiation patterns, and the gain of the proposed antenna are presented at various frequencies. The antenna achieves, in the range of 4-8.8 GHz, an impedance bandwidth of 81% for a return loss of less than -10 dB.

  17. Parallel implementation of inverse adding-doubling and Monte Carlo multi-layered programs for high performance computing systems with shared and distributed memory

    Science.gov (United States)

    Chugunov, Svyatoslav; Li, Changying

    2015-09-01

    Parallel implementation of two numerical tools popular in optical studies of biological materials - the Inverse Adding-Doubling (IAD) program and the Monte Carlo Multi-Layered (MCML) program - was developed and tested in this study. The implementation was based on the Message Passing Interface (MPI) and standard C language. Parallel versions of the IAD and MCML programs were compared to their sequential counterparts in validation and performance tests. Additionally, the portability of the programs was tested using a local high performance computing (HPC) cluster, a Penguin-On-Demand HPC cluster, and an Amazon EC2 cluster. Parallel IAD was tested with up to 150 parallel cores using 1223 input datasets. It demonstrated linear scalability, and the speedup was proportional to the number of parallel cores (up to 150x). Parallel MCML was tested with up to 1001 parallel cores using problem sizes of 10^4-10^9 photon packets. It demonstrated classical performance curves featuring communication overhead and a performance saturation point. An optimal performance curve was derived for parallel MCML as a function of problem size. The typical speedup achieved for parallel MCML (up to 326x) demonstrated a linear increase with problem size. The precision of MCML results was estimated in a series of tests: a problem size of 10^6 photon packets was found optimal for calculations of total optical response, and 10^8 photon packets for spatially-resolved results. The presented parallel versions of the MCML and IAD programs are portable on multiple computing platforms. The parallel programs could significantly speed up the simulation for scientists and be utilized to their full potential in computing systems that are readily available without additional costs.

  18. Optimization of Quantum-Dot Molecular Beam Epitaxy for Broad Spectral Bandwidth Devices

    KAUST Repository

    Majid, M. A.

    2012-12-01

    The optimization of the key growth parameters for broad spectral bandwidth devices based on quantum dots is reported. A combination of atomic force microscopy, photoluminescence of test samples, and optoelectronic characterization of superluminescent diodes (SLDs) is used to optimize the growth conditions to obtain high-quality devices with large spectral bandwidth, radiative efficiency (due to a reduced defective-dot density), and thus output power. The defective-dot density is highlighted as being responsible for the degradation of device performance. An SLD device with 160 nm of bandwidth centered at 1230 nm is demonstrated.

  19. A novel detection platform for parallel monitoring of DNA hybridization with high sensitivity and specificity

    DEFF Research Database (Denmark)

    Yi, Sun; Perch-Nielsen, Ivan R.; Wang, Zhenyu;

    We developed a highly sensitive platform to monitor multiple hybridization events in real time. By creating a micro-optical array in a polymeric chip, the system combines the excellent discriminative power of supercritical angle fluorescence (SAF) microscopy with high-throughput capabilities...

  20. Analysis and design of a parallel-connected single active bridge DC-DC converter for high-power wind farm applications

    DEFF Research Database (Denmark)

    Park, Kiwoo; Chen, Zhe

    2013-01-01

    This paper presents a parallel-connected Single Active Bridge (SAB) dc-dc converter for high-power applications. Paralleling lower-power converters can lower the current rating of each modular converter, and interleaving the outputs can significantly reduce the magnitudes of the input and output current...... requirements, this modular converter concept is expected to be highly beneficial, especially for offshore wind farm applications.

  1. Bandwidth-sharing in LHCONE, an analysis of the problem

    Science.gov (United States)

    Wildish, T.

    2015-12-01

    The LHC experiments have traditionally regarded the network as an unreliable resource, one which was expected to be a major source of errors and inefficiency at the time their original computing models were derived. Now, however, the network is seen as much more capable and reliable. Data are routinely transferred with high efficiency and low latency to wherever computing or storage resources are available to use or manage them. Although there was sufficient network bandwidth for the experiments’ needs during Run-1, they cannot rely on ever-increasing bandwidth as a solution to their data-transfer needs in the future. Sooner or later they need to consider the network as a finite resource that they interact with to manage their traffic, in much the same way as they manage their use of disk and CPU resources. There are several possible ways for the experiments to integrate management of the network in their software stacks, such as the use of virtual circuits with hard bandwidth guarantees or soft real-time flow-control, with somewhat less firm guarantees. Abstractly, these can all be considered as the users (the experiments, or groups of users within the experiment) expressing a request for a given bandwidth between two points for a given duration of time. The network fabric then grants some allocation to each user, dependent on the sum of all requests and the sum of available resources, and attempts to ensure the requirements are met (either deterministically or statistically). An unresolved question at this time is how to convert the users’ requests into an allocation. Simply put, how do we decide what fraction of a network's bandwidth to allocate to each user when the sum of requests exceeds the available bandwidth? The usual problems of any resource-scheduling system arise here, namely how to ensure the resource is used efficiently and fairly, while still satisfying the needs of the users. Simply fixing quotas on network paths for each user is likely to lead
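
    One classical answer to the allocation question posed above is max-min fair sharing, sketched below in Python for a single bottleneck link: users asking for less than an equal share are fully satisfied, and the spare capacity is redistributed among the rest. The requests and capacity are illustrative; the paper itself leaves the choice of policy open.

    # Max-min fair allocation on one link (one possible policy, not the paper's conclusion).
    def max_min_fair(requests, capacity):
        """Allocate so no user can gain without reducing someone's smaller allocation."""
        allocation = {user: 0.0 for user in requests}
        unsatisfied = dict(requests)
        remaining = float(capacity)
        while unsatisfied:
            share = remaining / len(unsatisfied)
            fully_served = {u: want for u, want in unsatisfied.items() if want <= share}
            if not fully_served:
                for user in unsatisfied:          # nobody fits: split what is left equally
                    allocation[user] = share
                break
            for user, want in fully_served.items():
                allocation[user] = want           # small requests are granted in full
                remaining -= want
                del unsatisfied[user]
        return allocation

    print(max_min_fair({"CMS": 40.0, "ATLAS": 25.0, "LHCb": 10.0}, capacity=60.0))
    # {'CMS': 25.0, 'ATLAS': 25.0, 'LHCb': 10.0}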

  2. BACH: A Bandwidth-Aware Hybrid Cache Hierarchy Design with Nonvolatile Memories

    Institute of Scientific and Technical Information of China (English)

    Jishen Zhao; Cong Xu; Tao Zhang; Yuan Xie

    2016-01-01

    Limited main memory bandwidth is becoming a fundamental performance bottleneck in chip-multiprocessor (CMP) design. Yet directly increasing the peak memory bandwidth can incur high cost and power consumption. In this paper, we address this problem by proposing a bandwidth-aware reconfigurable cache hierarchy, BACH, built with hybrid memory technologies. Components of our BACH design include a hybrid cache hierarchy, a reconfiguration mechanism, and a statistical prediction engine. Our hybrid cache hierarchy chooses different memory technologies with various bandwidth characteristics, such as spin-transfer torque memory (STT-MRAM), resistive memory (ReRAM), and embedded DRAM (eDRAM), to configure each level so that the peak bandwidth of the overall cache hierarchy is optimized. Our reconfiguration mechanism can dynamically adjust the cache capacity of each level based on the predicted bandwidth demands of running workloads. The bandwidth prediction is performed by our prediction engine. We evaluate the system performance gain obtained by the BACH design with a set of multithreaded and multiprogrammed workloads with and without a system power budget limitation. Compared with traditional SRAM-based cache design, BACH improves the system throughput by 58% and 14% with multithreaded and multiprogrammed workloads, respectively.

  3. NVRAM as Main Storage of Parallel File System

    Directory of Open Access Journals (Sweden)

    MALINOWSKI Artur

    2016-05-01

    Full Text Available The main limitation of modern cluster environments used to be the computational power provided by CPUs and GPUs, but recently they suffer more and more from insufficient performance of input and output operations. Apart from better network infrastructure and more sophisticated processing algorithms, many solutions are based on emerging memory technologies. This paper presents an evaluation of using non-volatile random-access memory as the main storage of a Parallel File System. The author justifies the feasibility of such a configuration and evaluates it with MPI I/O, OrangeFS as the file system, two popular cluster I/O benchmarks, and software memory simulation. The obtained results suggest that, with a Parallel File System highly optimized for block devices, small differences in access time and memory bandwidth do not influence system performance.

  4. An efficient algorithm for the parallel solution of high-dimensional differential equations

    CERN Document Server

    Klus, Stefan; Liu, Cong; Dellnitz, Michael

    2010-01-01

    The study of high-dimensional differential equations is challenging and difficult due to their analytical and computational intractability. Here, we significantly improve the speed of waveform relaxation (WR), a method to simulate high-dimensional differential-algebraic equations. This new method, termed adaptive waveform relaxation (AWR), is tested on a communication network example. Furthermore, we analyze different heuristics for computing graph partitions tailored to adaptive waveform relaxation.
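
    A minimal Python sketch of the (Gauss-Jacobi) waveform relaxation iteration the paper accelerates: the system is split into blocks, and each sweep integrates one block over the whole time window while taking the other block's waveform from the previous sweep. The test system, step size, and number of sweeps are illustrative assumptions; the adaptive variant (AWR) additionally adapts the windowing and partitioning.

    # Gauss-Jacobi waveform relaxation on a tiny two-block linear system x' = A x.
    import numpy as np

    A = np.array([[-2.0, 0.5],
                  [0.3, -1.0]])            # weakly coupled 2x2 test system
    x0 = np.array([1.0, -1.0])
    T, dt = 2.0, 0.001
    steps = int(T / dt)

    # initial guess: hold the initial condition constant over the whole time window
    waveform = np.tile(x0, (steps + 1, 1))

    for sweep in range(8):                       # WR sweeps
        new = np.empty_like(waveform)
        new[0] = x0
        for n in range(steps):                   # explicit Euler inside each sweep
            other = waveform[n]                  # coupling terms from the previous sweep
            new[n + 1, 0] = new[n, 0] + dt * (A[0, 0] * new[n, 0] + A[0, 1] * other[1])
            new[n + 1, 1] = new[n, 1] + dt * (A[1, 1] * new[n, 1] + A[1, 0] * other[0])
        waveform = new

    # reference: explicit Euler on the fully coupled system with the same step size
    x = x0.copy()
    for n in range(steps):
        x = x + dt * (A @ x)
    print("waveform relaxation:", waveform[-1], " fully coupled:", x)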

  5. Parallel Solvers for Finite-Difference Modeling of Large-Scale, High-Resolution Electromagnetic Problems in MRI

    Directory of Open Access Journals (Sweden)

    Hua Wang

    2008-01-01

    Full Text Available With the movement of magnetic resonance imaging (MRI) technology towards higher field (and therefore frequency) systems, the interaction of the fields generated by the system with patients, healthcare workers, and internally within the system is attracting more attention. Due to the complexity of the interactions, computational modeling plays an essential role in the analysis, design, and development of modern MRI systems. As a result of the large computational scale associated with most MRI models, numerical schemes that rely on a single computer processing unit often require a significant amount of memory and long computational times, which makes modeling of these problems quite inefficient. This paper presents dedicated message passing interface (MPI) and OpenMP parallel computing solvers for finite-difference time-domain (FDTD) and quasistatic finite-difference (QSFD) schemes. The FDTD and QSFD methods have been widely used to model and analyze the induction of electric fields and currents in voxel phantoms and MRI system components at high and low frequencies, respectively. The power of the optimized parallel computing architectures is illustrated by distinct, large-scale field calculation problems and shows significant computational advantages over conventional single processing platforms.
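
    As a small illustration of the kind of kernel these solvers parallelize, the following Python sketch advances a one-dimensional FDTD (Yee) grid; an MPI implementation would split such a grid between processes and exchange boundary cells each step. The grid size, source, and material constants are illustrative assumptions, not the voxel-phantom models of the paper.

    # One-dimensional free-space FDTD update (illustrative toy, not the paper's solver).
    import numpy as np

    nz, steps = 400, 800
    c0, dz = 3e8, 1e-3
    dt = dz / (2 * c0)                       # satisfies the 1-D Courant condition
    Ex = np.zeros(nz)
    Hy = np.zeros(nz)

    for n in range(steps):
        # update magnetic field (staggered half a cell from the electric field)
        Hy[:-1] += (dt / (4e-7 * np.pi * dz)) * (Ex[1:] - Ex[:-1])
        # update electric field
        Ex[1:] += (dt / (8.854e-12 * dz)) * (Hy[1:] - Hy[:-1])
        # soft sinusoidal source in the middle of the grid
        Ex[nz // 2] += np.sin(2 * np.pi * 1e9 * n * dt)

    print("peak |Ex| after", steps, "steps:", np.abs(Ex).max())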

  6. DBAS: A Deployable Bandwidth Aggregation System

    CERN Document Server

    Habak, Karim; Harras, Khaled A

    2012-01-01

    The explosive increase in data demand, coupled with the rapid deployment of various wireless access technologies, has led to an increase in the number of multi-homed or multi-interface enabled devices. Fully exploiting these interfaces has motivated researchers to propose numerous solutions that aggregate their available bandwidths to increase overall throughput and satisfy the end-user's growing data demand. These solutions, however, have faced a steep deployment barrier that we attempt to overcome in this paper. We propose a Deployable Bandwidth Aggregation System (DBAS) for multi-interface enabled devices. Our system does not introduce any intermediate hardware, modify current operating systems, modify socket implementations, nor require changes to current applications or legacy servers. The DBAS architecture is designed to automatically estimate the characteristics of applications and dynamically schedule various connections or packets to different interfaces. Since our main focus is deployability, we fully i...

  7. Parallelizing the Cellular Potts Model on graphics processing units

    Science.gov (United States)

    Tapia, José Juan; D'Souza, Roshan M.

    2011-04-01

    The Cellular Potts Model (CPM) is a lattice based modeling technique used for simulating cellular structures in computational biology. The computational complexity of the model means that current serial implementations restrict the size of simulation to a level well below biological relevance. Parallelization on computing clusters enables scaling the size of the simulation but marginally addresses computational speed due to the limited memory bandwidth between nodes. In this paper we present new data-parallel algorithms and data structures for simulating the Cellular Potts Model on graphics processing units. Our implementations handle most terms in the Hamiltonian, including cell-cell adhesion constraint, cell volume constraint, cell surface area constraint, and cell haptotaxis. We use fine level checkerboards with lock mechanisms using atomic operations to enable consistent updates while maintaining a high level of parallelism. A new data-parallel memory allocation algorithm has been developed to handle cell division. Tests show that our implementation enables simulations of >10 cells with lattice sizes of up to 256^3 on a single graphics card. Benchmarks show that our implementation runs ~80× faster than serial implementations, and ~5× faster than previous parallel implementations on computing clusters consisting of 25 nodes. The wide availability and economy of graphics cards mean that our techniques will enable simulation of realistically sized models at a fraction of the time and cost of previous implementations and are expected to greatly broaden the scope of CPM applications.
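
    A heavily simplified, serial Python sketch of a Cellular Potts Model sweep that visits the two checkerboard sub-lattices separately, the decomposition that (together with atomic updates) allows safe concurrent copy attempts on a GPU. Only the adhesion and volume terms of the Hamiltonian are included, and all parameters are illustrative assumptions rather than the GPU implementation described above.

    # Simplified CPM Metropolis sweep with a two-colour checkerboard visit order.
    import math
    import random

    L, T = 32, 10.0
    J = 16.0                    # adhesion penalty between unlike labels
    lam, V_target = 1.0, 64.0   # volume-constraint strength and target volume
    sigma = [[(x // 8) + (y // 8) * (L // 8) + 1 for x in range(L)] for y in range(L)]
    volume = {}
    for row in sigma:
        for s in row:
            volume[s] = volume.get(s, 0) + 1

    def neighbors(x, y):
        return [((x + 1) % L, y), ((x - 1) % L, y), (x, (y + 1) % L), (x, (y - 1) % L)]

    def delta_H(x, y, new):
        old = sigma[y][x]
        dH = sum(J * ((new != sigma[ny][nx]) - (old != sigma[ny][nx])) for nx, ny in neighbors(x, y))
        dH += lam * ((volume[old] - 1 - V_target) ** 2 - (volume[old] - V_target) ** 2)
        dH += lam * ((volume.get(new, 0) + 1 - V_target) ** 2 - (volume.get(new, 0) - V_target) ** 2)
        return dH

    def sweep():
        for color in (0, 1):                       # update one sub-lattice at a time
            for y in range(L):
                for x in range(L):
                    if (x + y) % 2 != color:
                        continue
                    nx, ny = random.choice(neighbors(x, y))
                    new, old = sigma[ny][nx], sigma[y][x]
                    if new == old:
                        continue
                    dH = delta_H(x, y, new)
                    if dH <= 0 or random.random() < math.exp(-dH / T):
                        sigma[y][x] = new          # copy the neighbour's label
                        volume[old] -= 1
                        volume[new] = volume.get(new, 0) + 1

    for _ in range(5):
        sweep()
    print("surviving cell labels:", len({s for row in sigma for s in row}))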

  8. Power-efficient high-speed parallel-sampling adcs for broadband multi-carrier systems

    CERN Document Server

    Lin, Yu; Doris, Kostas; van Roermund, Arthur H M

    2015-01-01

    This book addresses the challenges of designing high performance analog-to-digital converters (ADCs) based on the “smart data converters” concept, which implies context awareness, on-chip intelligence and adaptation. Readers will learn to exploit various information either a-priori or a-posteriori (obtained from devices, signals, applications or the ambient situations, etc.) for circuit and architecture optimization during the design phase or adaptation during operation, to enhance data converters performance, flexibility, robustness and power-efficiency. The authors focus on exploiting the a-priori knowledge of the system/application to develop enhancement techniques for ADCs, with particular emphasis on improving the power efficiency of high-speed and high-resolution ADCs for broadband multi-carrier systems.

  9. Digital demodulator for wide bandwidth SAR

    DEFF Research Database (Denmark)

    Jørgensen, Jørn Hjelm

    2000-01-01

    A novel approach to the design of efficient digital quadrature demodulators for wide bandwidth SAR systems is described. Efficiency is obtained by setting the intermediate frequency to 1/4 the ADC sampling frequency. One channel is made filter-free by synchronizing the local oscillator with the output decimator. The filter required by the other channel is optimized through global search using the system level performance metrics integrated sidelobe level ratio (ISLR) and peak sidelobe level ratio (PSLR).
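
    A small numpy sketch of the fs/4 trick mentioned above: with the intermediate frequency at one quarter of the sampling rate, quadrature mixing reduces to multiplication by the repeating sequences [1, 0, -1, 0] and [0, -1, 0, 1], so after decimation one channel needs no filter when the local oscillator is aligned with the decimator. The signal parameters are illustrative, not those of the SAR system.

    # fs/4 digital quadrature demodulation of a toy IF signal.
    import numpy as np

    fs = 100e6
    n = np.arange(4096)
    phi = 0.7                                            # phase carried by the IF signal
    envelope = np.cos(2 * np.pi * 1e6 * n / fs)          # toy modulating envelope
    signal = envelope * np.cos(2 * np.pi * (fs / 4) * n / fs + phi)

    i_mix = signal * np.cos(np.pi * n / 2)               # multiply by [1, 0, -1, 0, ...]
    q_mix = -signal * np.sin(np.pi * n / 2)              # multiply by [0, -1, 0, 1, ...]

    # Decimating by two, the in-phase samples fall on even indices (no filter needed
    # when the local oscillator is synchronized with the decimator); the quadrature
    # samples fall on odd indices and are filtered/interpolated onto the same grid.
    I = i_mix[0::2]
    Q = q_mix[1::2]
    print(I[:3], envelope[0:6:2] * np.cos(phi))          # the two rows agree
    print(Q[:3], envelope[1:7:2] * np.sin(phi))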

  10. Parallel Eclipse Project Checkout

    Science.gov (United States)

    Crockett, Thomas M.; Joswig, Joseph C.; Shams, Khawaja S.; Powell, Mark W.; Bachmann, Andrew G.

    2011-01-01

    Parallel Eclipse Project Checkout (PEPC) is a program written to leverage parallelism and to automate the checkout process of plug-ins created in Eclipse RCP (Rich Client Platform). Eclipse plug-ins can be aggregated in a feature project. This innovation digests a feature description (xml file) and automatically checks out all of the plug-ins listed in the feature. This resolves the issue of manually checking out each plug-in required to work on the project. To minimize the amount of time necessary to checkout the plug-ins, this program makes the plug-in checkouts parallel. After parsing the feature, a request to checkout for each plug-in in the feature has been inserted. These requests are handled by a thread pool with a configurable number of threads. By checking out the plug-ins in parallel, the checkout process is streamlined before getting started on the project. For instance, projects that took 30 minutes to checkout now take less than 5 minutes. The effect is especially clear on a Mac, which has a network monitor displaying the bandwidth use. When running the client from a developer's home, the checkout process now saturates the bandwidth in order to get all the plug-ins checked out as fast as possible. For comparison, a checkout process that ranged from 8-200 Kbps from a developer's home is now able to saturate a pipe of 1.3 Mbps, resulting in significantly faster checkouts. Eclipse IDE (integrated development environment) tries to build a project as soon as it is downloaded. As part of another optimization, this innovation programmatically tells Eclipse to stop building while checkouts are happening, which dramatically reduces lock contention and enables plug-ins to continue downloading until all of them finish. Furthermore, the software re-enables automatic building, and forces Eclipse to do a clean build once it finishes checking out all of the plug-ins. This software is fully generic and does not contain any NASA-specific code. It can be applied to any
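
    A minimal Python sketch of the idea behind PEPC: parse the plug-ins listed in a feature file and check them out concurrently with a bounded thread pool. The feature-file layout, repository URL, and checkout command are hypothetical placeholders, not NASA's implementation.

    # Parallel checkout of feature plug-ins via a thread pool (illustrative only).
    import subprocess
    import xml.etree.ElementTree as ET
    from concurrent.futures import ThreadPoolExecutor

    def plugins_in_feature(feature_xml_path):
        """Return the plug-in ids listed in an Eclipse feature.xml file."""
        root = ET.parse(feature_xml_path).getroot()
        return [p.get("id") for p in root.iter("plugin")]

    def checkout(plugin_id, repo_base="https://example.org/svn"):
        # hypothetical checkout command; replace with the VCS the project actually uses
        return subprocess.run(["svn", "checkout", f"{repo_base}/{plugin_id}", plugin_id],
                              capture_output=True).returncode

    def parallel_checkout(feature_xml_path, max_threads=8):
        plugin_ids = plugins_in_feature(feature_xml_path)
        with ThreadPoolExecutor(max_workers=max_threads) as pool:  # configurable pool size
            results = list(pool.map(checkout, plugin_ids))
        return dict(zip(plugin_ids, results))

    if __name__ == "__main__":
        print(parallel_checkout("feature.xml"))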

  11. A Novel Dynamic Bandwidth Allocation Algorithm with Correction-based the Multiple Traffic Prediction in EPON

    Directory of Open Access Journals (Sweden)

    Ziyi Fu

    2012-10-01

    Full Text Available For the upstream time-division multiplexing (TDM) in Ethernet passive optical network (EPON) systems, this paper proposes a novel dynamic bandwidth allocation algorithm that supports a correction-based multi-service traffic estimation mechanism. To improve the real-time performance of bandwidth allocation, the algorithm forecasts the traffic of high-priority services, and the bandwidth pre-allocated to the various priority services is then corrected according to Gaussian distribution characteristics, bringing the traffic prediction closer to the real traffic. The simulation results show that the proposed algorithm outperforms existing DBA algorithms: it meets the delay requirement of high-priority services while also controlling delay anomalies for low-priority services. In addition, the correction scheme clearly improves bandwidth utilization.

  12. A wide bandwidth analog front-end circuit for 60-GHz wireless communication receiver

    Science.gov (United States)

    Furuta, M.; Okuni, H.; Hosoya, M.; Sai, A.; Matsuno, J.; Saigusa, S.; Itakura, T.

    2014-03-01

    This paper presents an analog front-end circuit for a 60-GHz wireless communication receiver. The feature of the proposed analog front-end circuit is a bandwidth of more than 1 GHz. To expand the bandwidth of a low-pass filter and a voltage gain amplifier, a technique to reduce the parasitic capacitance of a transconductance amplifier is proposed. Since the bandwidth is also limited by the on-resistance of the ADC sampling switch, a switch separation technique for reducing the on-resistance is also proposed. In a high-speed ADC, the SNDR is limited by the sampling jitter. The developed high-resolution VCO auto-tuning effectively reduces the jitter of the PLL. The prototype is fabricated in 65-nm CMOS. The analog front-end circuit achieves over 1-GHz bandwidth and 27.2-dB SNDR with 224-mW power consumption.

  13. Gaussian entanglement distribution with GHz bandwidth

    CERN Document Server

    Ast, Stefan; Mehmet, Moritz; Schnabel, Roman

    2016-01-01

    The distribution of Gaussian entanglement can be used to generate a mathematically-proven secure key for quantum cryptography. The distributed secret key rate is limited by the bandwidth of the nonlinear resonators used for entanglement generation, which is less than 100 MHz for current state-of-the-art setups. The development of an entanglement source with a higher bandwidth promises an increased measurement speed and a linear boost in the secure data rate. Here, we present the experimental realization of a continuous-variable entanglement source with a bandwidth of more than 1.25 GHz. The measured entanglement spectrum was quantified via the inseparability criterion introduced by Duan and coworkers with a critical value of 4 below which entanglement is certified. The measurements yielded an inseparability value of about 1.8 at a frequency of 300 MHz to about 2.8 at 1.2 GHz extending further to about 3.1 at 1.48 GHz. In the experiment we used two 2.6 mm long monolithic PPKTP crystal resonators to generate tw...

  14. Parallel Störmer-Cowell methods for high-precision orbit computations

    NARCIS (Netherlands)

    Houwen, P.J. van der; Messina, E.; Swart, J.J.B. de

    1998-01-01

    Many orbit problems in celestial mechanics are described by (nonstiff) initial-value problems (IVPs) for second-order ordinary differential equations of the form $y'' = \mathbf{f}(y)$. The most successful integration methods are based on high-order Runge-Kutta-Nyström formulas. However, these methods wer
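
    A minimal, serial Python sketch of the classical Störmer (Störmer-Verlet) recurrence for this problem class, y'' = f(y), applied to a Kepler-like orbit; the force function and step size are illustrative assumptions, and the parallel Störmer-Cowell methods of the paper are higher-order multistep generalizations.

    # Two-step Störmer recurrence for y'' = f(y) (illustrative toy orbit).
    import numpy as np

    def f(y):
        """Acceleration of a unit-mass body around a unit-GM central mass."""
        r = np.linalg.norm(y)
        return -y / r ** 3

    def stormer_verlet(y0, v0, dt, steps):
        ys = [y0]
        y_prev = y0
        y_curr = y0 + dt * v0 + 0.5 * dt ** 2 * f(y0)            # one Taylor step to start
        for _ in range(steps):
            ys.append(y_curr)
            y_next = 2 * y_curr - y_prev + dt ** 2 * f(y_curr)    # Störmer two-step recurrence
            y_prev, y_curr = y_curr, y_next
        return np.array(ys)

    orbit = stormer_verlet(np.array([1.0, 0.0]), np.array([0.0, 1.0]), dt=0.01, steps=2000)
    print("radius drift over ~3 periods:", abs(np.linalg.norm(orbit[-1]) - 1.0))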

  15. High-throughput liquid-liquid extraction in 96-well format: Parallel artificial liquid membrane extraction

    DEFF Research Database (Denmark)

    Gjelstad, Astrid; Andresen, Alf Terje; Dahlgren, Anders

    2017-01-01

    , highly efficient sample cleanup, and direct compatibility with liquid chromatography–mass spectrometry (LC–MS). The consumption of hazardous organic solvents is also almost eliminated using PALME as the sample preparation technique. This article summarizes current experiences with PALME, based on work...

  16. High sensitivity and high Q-factor nanoslotted parallel quadrabeam photonic crystal cavity for real-time and label-free sensing

    Energy Technology Data Exchange (ETDEWEB)

    Yang, Daquan [Rowland Institute at Harvard University, Cambridge, Massachusetts 02142 (United States); State Key Laboratory of Information Photonics and Optical Communications, School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876 (China); School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138 (United States); Kita, Shota; Wang, Cheng; Lončar, Marko [School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138 (United States); Liang, Feng; Quan, Qimin [Rowland Institute at Harvard University, Cambridge, Massachusetts 02142 (United States); Tian, Huiping; Ji, Yuefeng [State Key Laboratory of Information Photonics and Optical Communications, School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876 (China)

    2014-08-11

    We experimentally demonstrate a label-free sensor based on a nanoslotted parallel quadrabeam photonic crystal cavity (NPQC). The NPQC possesses both high sensitivity and a high Q-factor. We achieved a sensitivity (S) of 451 nm/refractive index unit and a Q-factor >7000 in water in the telecom wavelength range, featuring a sensor figure of merit >2000, an order of magnitude improvement over previous photonic crystal sensors. In addition, we measured the streptavidin-biotin binding affinity and detected streptavidin at a concentration of 10 ag/mL in phosphate buffered saline solution.

  17. Parallel algorithms

    CERN Document Server

    Casanova, Henri; Robert, Yves

    2008-01-01

    ""…The authors of the present book, who have extensive credentials in both research and instruction in the area of parallelism, present a sound, principled treatment of parallel algorithms. … This book is very well written and extremely well designed from an instructional point of view. … The authors have created an instructive and fascinating text. The book will serve researchers as well as instructors who need a solid, readable text for a course on parallelism in computing. Indeed, for anyone who wants an understandable text from which to acquire a current, rigorous, and broad vi

  18. Highly Parallelized Pattern Matching Hardware for Fast Tracking at Hadron Colliders

    Science.gov (United States)

    Citraro, S.; Annovi, A.; Biesuz, N.; Giannetti, P.; Luciano, P.; Nasimi, H.; Piendibene, M.; Sotiropoulou, C.-L.; Volpi, G.

    2016-04-01

    A high-performance “pattern matching” implementation based on the Associative Memory (AM) system is presented. It is designed to solve the real-time hit-to-track association problem for particles produced in high-energy physics experiments at hadron colliders. The processing time of pattern recognition in CPU-based algorithms increases rapidly with the detector occupancy due to the limited computing power and input-output capacity of hardware available on the market. The AM system presented here solves the problem by being able to process even the most complex hadron collider events produced at a rate of 100 kHz with an average latency smaller than 10 μs. The board built for this goal is able to execute 12 petabyte comparisons per second, with peak power consumption below 250 W, uniformly distributed on the large area of the board.
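
    A minimal Python sketch of the hit-to-road association the Associative Memory performs: each stored pattern is a tuple of coarse detector bins, one per layer, and a pattern fires when every layer contains a hit in its bin. The pattern bank and event hits are illustrative; the AM hardware evaluates all stored patterns in parallel rather than looping over them.

    # Software analogue of AM pattern matching (the hardware does this in parallel).
    PATTERN_BANK = [
        (3, 7, 12, 18),   # each entry: one coarse bin per detector layer
        (3, 8, 13, 19),
        (5, 9, 14, 20),
    ]

    def matching_patterns(event_hits, bank=PATTERN_BANK):
        """event_hits: list of sets, one set of fired coarse bins per layer."""
        matches = []
        for pattern in bank:
            if all(bin_ in event_hits[layer] for layer, bin_ in enumerate(pattern)):
                matches.append(pattern)
        return matches

    event = [{3, 5}, {7, 9}, {2, 12, 14}, {18, 20}]
    print(matching_patterns(event))   # the first and third patterns fire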

  19. A Novel Bandwidth Efficient Transmit Diversity Scheme Based on Water-filling

    Institute of Scientific and Technical Information of China (English)

    SHEN Cong; DAI Lin; ZHOU Shidong; YAO Yan

    2004-01-01

    In this paper we propose a novel bandwidth-efficient transmit diversity scheme based on a layered space-time architecture, in which channel state information (CSI) is fully utilized to maximize channel capacity according to the water-filling principle. It is shown that, compared with V-BLAST, this new scheme can maintain the same high bandwidth efficiency but achieve much better performance, thanks to more effective transmission power allocation and diversity gain.
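
    A minimal Python sketch of the water-filling power allocation such a scheme relies on: given per-subchannel noise-to-gain ratios and a total power budget, the water level is found (here by bisection) and each subchannel receives the power above its own level. The channel values and budget are illustrative assumptions.

    # Water-filling power allocation over parallel subchannels.
    def water_filling(inverse_gains, total_power, tol=1e-9):
        """Return powers p_i = max(0, mu - inverse_gains[i]) that sum to total_power."""
        lo, hi = 0.0, max(inverse_gains) + total_power
        while hi - lo > tol:                       # bisection on the water level mu
            mu = 0.5 * (lo + hi)
            used = sum(max(0.0, mu - g) for g in inverse_gains)
            if used > total_power:
                hi = mu
            else:
                lo = mu
        return [max(0.0, lo - g) for g in inverse_gains]

    powers = water_filling([0.2, 0.5, 1.5, 3.0], total_power=2.0)
    print([round(p, 3) for p in powers], "sum =", round(sum(powers), 3))
    # weak subchannels (large noise-to-gain ratio) may receive no power at all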

  20. Transmission Bandwidth Tunability of a Liquid-Filled Photonic Bandgap Fiber

    Institute of Scientific and Technical Information of China (English)

    ZOU Bing; LIU Yan-Ge; DU Jiang-Sing; WANG Zhi; HAN Ting-Ting; XU Jian-Bo; LI Yuan; LIU Bo

    2009-01-01

    A temperature-tunable photonic bandgap fiber (PBGF) is demonstrated by filling an index-guiding photonic crystal fiber with a high-index liquid. The temperature-tunable characteristics of the fiber are experimentally and numerically investigated. Compression of the transmission bandwidth of the PBGF is demonstrated by changing the temperature of part of the fiber. A tunable transmission bandwidth with a range of 250 nm is achieved by changing the temperature from 30℃ to 90℃.