high-performance parallel coupler: Topics by WorldWideScience.org

Sample records for high-performance parallel coupler

Cpl6: The New Extensible, High-Performance Parallel Coupler for the Community Climate System Model

Energy Technology Data Exchange (ETDEWEB)

Craig, Anthony P.; Jacob, Robert L.; Kauffman, Brian; Bettge, Tom; Larson, Jay; Ong, Everest; Ding, Chris; He, Yun

2005-03-24

Coupled climate models are large, multiphysics applications designed to simulate the Earth's climate and predict the response of the climate to any changes in the forcing or boundary conditions. The Community Climate System Model (CCSM) is a widely used state-of-the-art climate model of which several versions have been released to the climate community over the past ten years. Like many climate models, CCSM employs a coupler, a functional unit that coordinates the exchange of data between parts of the climate system such as the atmosphere and ocean. This paper describes the new coupler, cpl6, contained in the latest version of CCSM, CCSM3. Cpl6 introduces distributed-memory parallelism to the coupler, a class library for important coupler functions, and a standardized interface for component models. Cpl6 is implemented entirely in Fortran90 and uses the Model Coupling Toolkit as the base for most of its classes. Cpl6 gives improved performance over previous versions and scales well on multiple platforms.
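
The coupler role described above (moving a field from one component's grid to another's) can be illustrated with a minimal sketch. This is not the cpl6 or Model Coupling Toolkit interface: the `Coupler` class, the `remap` helper, the grid sizes and the weights are all hypothetical, and the mapping is assumed to be a sparse set of (destination, source, weight) triples.

```python
def remap(field_src, weights, n_dst):
    """Apply sparse interpolation weights: dst[i] = sum_j w_ij * src[j]."""
    field_dst = [0.0] * n_dst
    for dst_idx, src_idx, w in weights:
        field_dst[dst_idx] += w * field_src[src_idx]
    return field_dst


class Coupler:
    """Toy coupler: stores one mapping per (source, destination) component pair."""

    def __init__(self):
        self.maps = {}  # (src_name, dst_name) -> (weights, n_dst)

    def register_map(self, src, dst, weights, n_dst):
        self.maps[(src, dst)] = (weights, n_dst)

    def exchange(self, src, dst, field):
        weights, n_dst = self.maps[(src, dst)]
        return remap(field, weights, n_dst)


# Hypothetical grids: 4 atmosphere cells averaged pairwise onto 2 ocean cells.
atm_to_ocn = [(0, 0, 0.5), (0, 1, 0.5), (1, 2, 0.5), (1, 3, 0.5)]
coupler = Coupler()
coupler.register_map("atm", "ocn", atm_to_ocn, n_dst=2)

heat_flux_atm = [100.0, 120.0, 80.0, 60.0]
heat_flux_ocn = coupler.exchange("atm", "ocn", heat_flux_atm)
print(heat_flux_ocn)  # [110.0, 70.0]
```

In a distributed-memory coupler each process would hold only its share of the weights and fields; the sketch keeps everything in one process for clarity.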
Refractive index engineering of high performance coupler for compact photonic integrated circuits

Science.gov (United States)

Liu, Lu; Zhou, Zhiping

2017-04-01

High performance couplers are highly desired in many applications, but their design is limited by the nearly unchangeable refractive index of the material. To tackle this issue, a refractive index engineering method realized with subwavelength gratings is investigated. Subwavelength gratings are periodic structures with pitches small enough to locally synthesize the refractive index of photonic waveguides, which allows direct control of the optical profile as well as an easier fabrication process. This review provides an introduction to the basics of subwavelength structures and pays special attention to the design strategies of some representative examples of subwavelength grating devices, including edge couplers, fiber-chip grating couplers, directional couplers and multimode interference couplers. Because subwavelength gratings can engineer the refractive index as well as birefringence and dispersion, these devices show better performance than their conventional counterparts.
High efficiency grating couplers based on shared process with CMOS MOSFETs

International Nuclear Information System (INIS)

Qiu Chao; Sheng Zhen; Wu Ai-Min; Wang Xi; Zou Shi-Chang; Gan Fu-Wan; Li Le; Albert Pang

2013-01-01

Grating couplers are widely investigated as coupling interfaces between silicon-on-insulator waveguides and optical fibers. In this work, a high-efficiency grating coupler compatible with the complementary metal-oxide-semiconductor (CMOS) process is proposed. The poly-Si layer used as a gate in the CMOS metal-oxide-semiconductor field effect transistor (MOSFET) is combined with a normal fully etched grating coupler, which greatly enhances its coupling efficiency. With optimal structure parameters, the coupling efficiency can reach as high as ∼70% at a wavelength of 1550 nm, as indicated by simulation. From the fabrication standpoint, all masks and etching steps are shared between MOSFETs and grating couplers, so the high performance grating couplers are easily integrated with CMOS circuits. Fabrication errors such as alignment shift are also simulated, showing that the device is quite tolerant of fabrication variations.
High performance parallel I/O

CERN Document Server

Prabhat

2014-01-01

Gain Critical Insight into the Parallel I/O Ecosystem. Parallel I/O is an integral component of modern high performance computing (HPC), especially in storing and processing very large datasets to facilitate scientific discovery. Revealing the state of the art in this field, High Performance Parallel I/O draws on insights from leading practitioners, researchers, software architects, developers, and scientists who shed light on the parallel I/O ecosystem. The first part of the book explains how large-scale HPC facilities scope, configure, and operate systems, with an emphasis on choices of I/O har
Measuring the performance of the coaxial HOM coupler on a 2-cell TESLA-shape copper cavity

International Nuclear Information System (INIS)

Wang Fang; Wang Erdong; Zhang Baocheng; Zhao Kui

2009-01-01

Coaxial High Order Mode (HOM) couplers have been fabricated at Peking University and their RF performance has been measured on a test device consisting of a coaxial transmission line and a 2-cell TESLA-shape copper cavity. The test results on the 2-cell TESLA-shape copper cavity with HOM couplers indicate that the coupler can cut off the fundamental mode TM010 and absorb HOMs effectively after a careful adjustment. The optimal angle of the HOM coupler with the beam tube is found. The initial test results of HOM couplers are presented in this paper. (authors)
Silica-on-silicon optical couplers and coupler based optical filters

DEFF Research Database (Denmark)

Leick, Lasse

2002-01-01

... is not an adequate description of the waveguides. A simple application for an optical coupler is as a 980/1550 nm multiplexer for erbium-doped waveguide amplifiers. A numerical analysis shows that a directional coupler has acceptable specifications, whereas a multimode interference coupler does not. The wavelength ... This work concerns modeling and characterization of non-amplifying silica-on-silicon optical components for wavelength division multiplexed networks. Emphasis is placed on optical couplers and how they can be used as building blocks for devices with a larger complexity. It has been investigated how ... to construct wavelength flattened and process tolerant couplers. A thorough comparison between directional couplers, multimode interference couplers and interferometer-based couplers has been performed. Numerically all these architectures have the ability to obtain similar wavelength-flatness, but the multi...
High-power tests of a single-cell copper accelerating cavity driven by two input couplers

International Nuclear Information System (INIS)

Horan, D.; Bromberek, D.; Meyer, D.; Waldschmidt, G.

2008-01-01

High-power tests were conducted on a 350-MHz, single-cell copper accelerating cavity driven simultaneously by two H-loop input couplers for the purpose of determining the reliability, performance, and power-handling capability of the cavity and related components, which have routinely operated at 100-kW power levels. The test was carried out utilizing the APS 350-MHz RF Test Stand, which was modified to split the input rf power into two half-power feeds, each supplying power to a separate H-loop coupler on the cavity. Electromagnetic simulations of the two-coupler feed system were used to determine coupler match, peak cavity fields, and the effect of phasing errors between the coupler feed lines. The test was conducted up to a maximum total rf input power of 164 kW CW. Test apparatus details and performance data will be presented.
rf coaxial couplers for high-intensity linear accelerators

International Nuclear Information System (INIS)

Manca, J.J.; Knapp, E.A.

1980-02-01

Two rf coaxial couplers that are particularly suitable for intertank connection of the disk-and-washer accelerating structure for use in high-intensity linear accelerators have been developed. These devices have very high coupling to the accelerating structure and very low rf power loss at the operating frequency, and they can be designed for any relative particle velocity β > 0.4. Focusing and monitoring devices can be located inside these couplers.
High performance parallel computers for science

International Nuclear Information System (INIS)

Nash, T.; Areti, H.; Atac, R.; Biel, J.; Cook, A.; Deppe, J.; Edel, M.; Fischler, M.; Gaines, I.; Hance, R.

1989-01-01

This paper reports that Fermilab's Advanced Computer Program (ACP) has been developing cost effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors, each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor has been developed for theoretical physics. Each $4000 node is a FORTRAN or C programmable, pipelined, 20 Mflops (peak), 10 MByte single board computer. These are plugged into a 16 port crossbar switch crate which handles both inter- and intra-crate communication. The crates are connected in a hypercube. Site oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256 node, 5 GFlop system is under construction.
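
The hypercube connection of crates and the quoted peak figure can be checked with a short sketch. The abstract does not state how many crates are used, so the hypercube dimension below (4, i.e. 16 crates) is purely an assumption for illustration; only the node count and the per-node peak rate come from the text.

```python
def hypercube_neighbors(node, dim):
    """Neighbors of a node in a dim-dimensional hypercube differ in exactly one bit."""
    return [node ^ (1 << bit) for bit in range(dim)]


def hops(a, b):
    """Minimum number of links between two nodes: the Hamming distance of their ids."""
    return bin(a ^ b).count("1")


CRATE_DIM = 4            # assumed: 2**4 = 16 crates (not stated in the abstract)
NODES = 256              # from the abstract
MFLOPS_PER_NODE = 20     # peak, from the abstract

print(hypercube_neighbors(5, CRATE_DIM))  # [4, 7, 1, 13]
print(hops(0, 15))                        # 4

peak_gflops = NODES * MFLOPS_PER_NODE / 1000
print(peak_gflops)  # 5.12, consistent with the quoted "5 GFlop" system
```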
High power coupler issues in normal conducting and superconducting accelerator applications

Energy Technology Data Exchange (ETDEWEB)

Matsumoto, H. [High Energy Accelerator Research Organization, Tsukuba, Ibaraki (Japan)

2001-02-01

The ceramic material (Al₂O₃) commonly used for the klystron output coupler in normal conducting accelerators, and for the input coupler to superconducting cavities, is one of the most troublesome parts in accelerator applications. However, its performance can be improved considerably by starting with high purity (>99.9%) alumina powder of controlled grain size (0.1-0.5 μm) and reducing the magnesium (Mg) sintering binder, which lowers the dielectric loss to the order of 10⁻⁴ at S-band frequencies. It has been confirmed that the new ceramic can withstand a peak S-band rf power of up to 300 MW at a 2.5 μs pulse width. (author)
Study of a power coupler for superconducting RF cavities used in high intensity proton accelerator

International Nuclear Information System (INIS)

Souli, M.

2007-07-01

The coaxial power coupler needed for the superconducting RF cavities used in the high energy section of the EUROTRANS driver should transmit 150 kW (CW operation) of RF power to the proton beam. The calculated RF and dielectric losses in the power coupler (inner and outer conductor, RF window) are relatively high. Consequently, the cooling circuits must be designed very carefully in order to remove the generated heat and to ensure stable and reliable operating conditions for the coupler-cavity system. After calculating all types of losses in the power coupler, we designed and validated the inner conductor cooling circuit using numerical simulation results. We also designed and optimized the outer conductor cooling circuit by establishing its hydraulic and thermal characteristics. Next, an experiment dedicated to studying the thermal interaction between the power coupler and the cavity was successfully performed at the CRYOHLAB test facility. The critical heat load Qc, at which a strong degradation of the cavity RF performance occurs, was measured to lie in the range 3 W to 5 W. The measured heat load will be considered as an upper limit of the residual heat flux at the outer conductor cold extremity. A dedicated test facility was developed and successfully operated for measuring the performance of the outer conductor heat exchanger using supercritical helium as coolant. The test cell reproduces the realistic thermal boundary conditions of the power coupler mounted on the cavity in the cryomodule. The first experimental results have confirmed the excellent performance of the tested heat exchanger. The maximum residual heat flux measured was 60 mW for a 127 W thermal load. As the RF losses in the coupler are proportional to the incident RF power, we can deduce that the outer conductor heat exchanger performance is maintained up to 800 kW of RF power. The heat exchanger thermal conductance has been identified using a 2D axisymmetric thermal model by comparing
Development of an optical resonator with high-efficient output coupler for the JAERI far-infrared free-electron laser

International Nuclear Information System (INIS)

Nagai, Ryoji; Hajima, Ryoichi; Nishimori, Nobuyuki; Sawamura, Masaru; Kikuzawa, Nobuhiro; Shizuma, Toshiyuki; Minehara, Eisuke

2001-01-01

An optical resonator with a high-efficiency output coupler was developed for the JAERI far-infrared free-electron laser. The optical resonator has a symmetrical near-concentric geometry with an insertable scraper output coupler. As a result of this development, the JAERI-FEL has successfully lased with an average power over 1 kW. The performance of the optical resonator with the output coupler was evaluated at an optical wavelength of 22 μm by using an optical mode calculation code. The output coupling and diffractive loss for the dominant eigenmode of the resonator were calculated using an iterative computation called the Fox-Li procedure. An efficiency factor, derived from the output coupling and diffractive loss of the optical resonator, was introduced to evaluate resonator performance. It was found that the optical resonator with the insertable scraper coupler was the most suitable for a high-power, high-efficiency far-infrared free-electron laser. (author)
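
The Fox-Li procedure mentioned above repeatedly propagates a field from one mirror to the other until its shape stops changing; the surviving power fraction per pass then gives the diffractive loss of the dominant mode. The sketch below is a simplified stand-in, not the code used in the paper: it assumes two identical plane strip mirrors (one transverse dimension) rather than the near-concentric resonator with a scraper, and the mirror half-width and spacing are hypothetical values chosen to give a Fresnel number near 1. Only the 22 μm wavelength comes from the abstract.

```python
import cmath
import math


def fox_li_strip(half_width, length, wavelength, n_points=64, n_iter=150):
    """Fox-Li iteration between two identical plane strip mirrors.

    Returns (power loss per pass, converged field samples on the mirror).
    """
    k = 2 * math.pi / wavelength
    dx = 2 * half_width / n_points
    xs = [-half_width + (i + 0.5) * dx for i in range(n_points)]
    # Sampled 1-D Fresnel propagator (the constant phase exp(-jkL) is dropped).
    pref = cmath.sqrt(1j / (wavelength * length)) * dx
    kernel = [
        [pref * cmath.exp(-1j * k * (x - xp) ** 2 / (2 * length)) for xp in xs]
        for x in xs
    ]
    field = [1.0 + 0.0j] * n_points  # start from a uniform field
    power_ratio = 1.0
    for _ in range(n_iter):
        new = [sum(kij * fj for kij, fj in zip(row, field)) for row in kernel]
        p_old = sum(abs(f) ** 2 for f in field)
        p_new = sum(abs(g) ** 2 for g in new)
        power_ratio = p_new / p_old      # fraction of power kept in one pass
        scale = math.sqrt(p_new)
        field = [g / scale for g in new]  # renormalize to avoid underflow
    return 1.0 - power_ratio, field


WAVELENGTH = 22e-6   # m, from the abstract
HALF_WIDTH = 0.01    # m, hypothetical
LENGTH = 4.5         # m, hypothetical

loss, mode = fox_li_strip(HALF_WIDTH, LENGTH, WAVELENGTH)
fresnel_number = HALF_WIDTH ** 2 / (WAVELENGTH * LENGTH)
print(f"Fresnel number {fresnel_number:.2f}, loss per pass {loss:.3f}")
```

Because the iteration is a power method on the round-trip operator, it converges to the lowest-loss mode that overlaps the starting field; a curved-mirror or scraper geometry would change only the kernel and the aperture mask.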
Low-crosstalk orbital angular momentum fiber coupler design.

Science.gov (United States)

Zhang, Zhishen; Gan, Jiulin; Heng, Xiaobo; Li, Muqiao; Li, Jiong; Xu, Shanhui; Yang, Zhongmin

2017-05-15

A fiber coupler serving as a low-crosstalk orbital angular momentum (OAM) mode beam splitter is proposed, built from two separate, parallel microfibers. By properly setting the center-to-center distance between the microfibers, the crosstalk is kept below -20 dB, which means that the purity of the desired OAM mode at the output port is higher than 99%. For a fixed overlapping length, high coupling efficiency (>97%) is achieved over 1545-1560 nm. The operating wavelength can be tuned across the whole C-band by using a thermosensitive liquid. The designed coupler can therefore achieve a tunable coupling ratio over the whole C-band, making it a promising component for future OAM fiber systems.
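
The step from a -20 dB crosstalk to a mode purity above 99% follows from the decibel definition. The sketch below assumes crosstalk is the power ratio of the unwanted mode to the wanted mode, so that purity is wanted power divided by total power; the function names are illustrative only.

```python
def db_to_ratio(db):
    """Convert a power level in dB to a linear power ratio."""
    return 10 ** (db / 10)


def mode_purity(crosstalk_db):
    """Purity = wanted / (wanted + unwanted), with unwanted/wanted given in dB."""
    leak = db_to_ratio(crosstalk_db)
    return 1 / (1 + leak)


print(db_to_ratio(-20))            # about 0.01: 1% of the wanted power leaks
print(f"{mode_purity(-20):.4f}")   # 0.9901, i.e. just above 99%
```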
Inexpensive 3dB coupler for POF communication by injection-molding production

Science.gov (United States)

Haupt, M.; Fischer, U. H. P.

2011-01-01

POFs (polymer optical fibers) are gradually replacing traditional communication media such as copper and glass in short distance communication systems, primarily because of their cost-effectiveness and easy handling. POFs are used in various fields of optical communication, e.g. the automotive sector or in-house communication. So far, however, only a few key components for a POF communication network are available. Even basic components, such as splices and couplers, are fabricated manually, which results in high costs and fluctuations in component performance. Available couplers have high insertion losses due to their manufacturing method, which can only be compensated by higher power budgets. New fabrication methods are therefore indispensable for producing couplers with higher performance. A cheap and effective way to produce couplers for POF communication systems is injection molding. The paper gives an overview of couplers available on the market, compares their performance, and shows a way to produce couplers by means of injection molding.
High Performance Parallel Multigrid Algorithms for Unstructured Grids

Science.gov (United States)

Frederickson, Paul O.

1996-01-01

We describe a high performance parallel multigrid algorithm for a rather general class of unstructured grid problems in two and three dimensions. The algorithm PUMG, for parallel unstructured multigrid, is related in structure to the parallel multigrid algorithm PSMG introduced by McBryan and Frederickson, for they both obtain a higher convergence rate through the use of multiple coarse grids. Another reason for the high convergence rate of PUMG is its smoother, an approximate inverse developed by Baumgardner and Frederickson.
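
For readers unfamiliar with multigrid, the basic cycle can be sketched on a much simpler problem. The code below is not PUMG or PSMG: it is a standard single-coarse-grid V-cycle with a weighted Jacobi smoother for the 1-D Poisson equation on a structured grid, whereas the algorithms in the abstract use multiple coarse grids, unstructured meshes and an approximate-inverse smoother. The test problem and all names are chosen for illustration.

```python
import math


def residual(u, f, h):
    """r = f - A u for the 1-D operator A u = (2u_i - u_{i-1} - u_{i+1}) / h^2."""
    r = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        r[i] = f[i] - (2 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return r


def smooth(u, f, h, sweeps=2, omega=2.0 / 3.0):
    """Weighted Jacobi sweeps (in place); damps high-frequency error."""
    for _ in range(sweeps):
        old = u[:]
        for i in range(1, len(u) - 1):
            jac = 0.5 * (old[i - 1] + old[i + 1] + h * h * f[i])
            u[i] = (1 - omega) * old[i] + omega * jac
    return u


def restrict(r):
    """Full-weighting restriction to the grid with half as many intervals."""
    nc = (len(r) - 1) // 2 + 1
    rc = [0.0] * nc
    for i in range(1, nc - 1):
        rc[i] = 0.25 * r[2 * i - 1] + 0.5 * r[2 * i] + 0.25 * r[2 * i + 1]
    return rc


def prolong(ec, n_fine):
    """Linear interpolation of a coarse-grid correction to the fine grid."""
    e = [0.0] * n_fine
    for i, value in enumerate(ec):
        e[2 * i] = value
    for i in range(1, n_fine, 2):
        e[i] = 0.5 * (e[i - 1] + e[i + 1])
    return e


def v_cycle(u, f, h):
    """One V-cycle; the grid must have 2**k + 1 points including boundaries."""
    n = len(u)
    if n <= 3:
        u[1] = 0.5 * (u[0] + u[2] + h * h * f[1])  # exact solve, one unknown
        return u
    smooth(u, f, h)
    rc = restrict(residual(u, f, h))
    ec = v_cycle([0.0] * len(rc), rc, 2 * h)
    e = prolong(ec, n)
    for i in range(1, n - 1):
        u[i] += e[i]
    return smooth(u, f, h)


# Test problem: -u'' = pi^2 sin(pi x) on [0, 1], u(0) = u(1) = 0, exact u = sin(pi x).
n = 65
h = 1.0 / (n - 1)
xs = [i * h for i in range(n)]
f = [math.pi ** 2 * math.sin(math.pi * x) for x in xs]
u = [0.0] * n
r_initial = max(abs(r) for r in residual(u, f, h))
for _ in range(10):
    u = v_cycle(u, f, h)
r_final = max(abs(r) for r in residual(u, f, h))
error = max(abs(ui - math.sin(math.pi * x)) for ui, x in zip(u, xs))
print(f"residual reduced by {r_initial / r_final:.1e}, max error {error:.1e}")
```

The remaining error after convergence is the discretization error of the second-order stencil, not an iteration error.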
CAMAC/PDP 11-45 coupler

International Nuclear Information System (INIS)

Pascual, Joseph; Raoul, J.-C.

1978-04-01

The complex experimental devices used in high energy physics require the use of minicomputers. The latter are coupled to the detectors using the CAMAC standard, which has been adopted by the majority of high energy physics laboratories and greatly eases international collaboration. Since the performance of industrially available interfaces proved inadequate, the DPhPE undertook the development of a multibranch CAMAC/PDP 11-45 coupler. This system can control up to 49 crates shared among 7 branches. It consists of a programmed channel and up to three high speed (556 Kwords/second) automatic channels. The four channels can work simultaneously through time sharing. The coupler includes a LAM handling system. The corresponding software has been developed simultaneously: the monitor is an extended version of the RT 11 system supplied by the manufacturer. This interface has been used so far in five experiments on the CERN PS and SPS. Besides this publication, intended to give a description of the coupler, a user's manual exists in English.
Substrate integrated waveguide (SIW) 3 dB coupler for K-Band applications

Directory of Open Access Journals (Sweden)

Khalid Nurehansafwanah

2017-01-01

This paper presents a coupler designed on Rogers RO4003C substrate with a thickness (h) of 0.508 mm and a relative permittivity (εr) of 3.55. The four-port coupler operates in the K-band (18-27 GHz) and is designed using the substrate integrated waveguide (SIW) method. The reflection coefficient and isolation coefficient of the proposed SIW coupler are below -10 dB, and the coupler is required to provide a 90° phase shift between the coupled port and the output port. SIWs are high performance broadband interconnects with excellent immunity to electromagnetic interference; they are suitable for use in microwave and communication electronics and for increasing system bandwidth. The coupler designs are investigated using the CST Microwave Studio simulation tool. The parameters of the proposed couplers are varied to cover the frequency range 21-24 GHz and to improve the scattering (S-parameter) performance.
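
The target behavior of a 3 dB, 90° coupler can be made concrete with the textbook scattering matrix of an ideal quadrature hybrid. This is an idealization, not the simulated SIW design: a real device has finite reflection and isolation (below -10 dB in the abstract) rather than exactly zero. The port numbering (1 input, 2 through, 3 coupled, 4 isolated) is an assumption of the sketch.

```python
import cmath
import math

A = 1 / math.sqrt(2)

# Ideal quadrature hybrid: S = -(1/sqrt(2)) * [[0, j, 1, 0], [j, 0, 0, 1],
#                                             [1, 0, 0, j], [0, 1, j, 0]]
HYBRID_S = [
    [0, -1j * A, -A, 0],
    [-1j * A, 0, 0, -A],
    [-A, 0, 0, -1j * A],
    [0, -A, -1j * A, 0],
]


def scattered_waves(s_matrix, incident):
    """Outgoing waves b = S a for a given vector of incident waves a."""
    return [
        sum(s_ij * a_j for s_ij, a_j in zip(row, incident)) for row in s_matrix
    ]


b = scattered_waves(HYBRID_S, [1, 0, 0, 0])   # unit wave into port 1
powers = [abs(wave) ** 2 for wave in b]
coupling_db = -10 * math.log10(powers[2])
phase_diff_deg = math.degrees(cmath.phase(b[1] / b[2]))

print(powers)           # half the power at ports 2 and 3, none at 1 and 4
print(coupling_db)      # about 3.01 dB
print(phase_diff_deg)   # 90 degrees between the two output ports
```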
Transverse emittance dilution due to coupler kicks in linear accelerators

Directory of Open Access Journals (Sweden)

Brandon Buckley

2007-11-01

One of the main concerns in the design of low emittance linear accelerators (linacs) is the preservation of beam emittance. Here we discuss one possible source of emittance dilution, the coupler kick, due to transverse electromagnetic fields in the accelerating cavities of the linac caused by the power coupler geometry. In addition to emittance growth, the coupler kick also produces orbit distortions. It is common wisdom that emittance growth from coupler kicks can be strongly reduced by using two couplers per cavity mounted opposite each other or by having the couplers of successive cavities alternate from above to below the beam pipe so as to cancel each individual kick. While this is correct, including two couplers per cavity or alternating the coupler location requires large technical changes and increased cost for superconducting cryomodules where cryogenic pipes are arranged parallel to a string of several cavities. We therefore analyze the consequences of alternate coupler placements. We show here that alternating the coupler location from above to below compensates the emittance growth as well as the orbit distortions. For sufficiently large Q values, alternating the coupler location from before to after the cavity leads to a cancellation of the orbit distortion but not of the emittance growth, whereas alternating the coupler location from before and above to behind and below the cavity cancels the emittance growth but not the orbit distortion. We show that cancellations hold for sufficiently large Q values. These compensations hold even when each cavity is individually detuned, e.g., by microphonics. Another effective method studied for reducing coupler kicks is the optimization of the phase of the coupler kick so as to minimize the effects on emittance from each coupler. This technique is independent of the coupler geometry but relies on operating on crest. A final technique studied is symmetrization of the cavity geometry in the
Ultra-High-Efficiency Apodized Grating Coupler Using a Fully Etched Photonic Crystal

DEFF Research Database (Denmark)

Ding, Yunhong; Peucheret, Christophe; Ou, Haiyan

2013-01-01

We demonstrate an apodized fiber-to-chip grating coupler using fully etched photonic crystal holes on the silicon-on-insulator platform. An ultra-high coupling efficiency of -1.65 dB (68%) with a 3 dB bandwidth of 60 nm is experimentally demonstrated.
HOM/LOM Coupler Study for the ILC Crab Cavity

International Nuclear Information System (INIS)

Xiao, L.; Li, Z.; Ko, K.

2007-01-01

The FNAL 9-cell 3.9 GHz deflecting mode cavity designed for the CKM experiment was chosen as the baseline design for the ILC BDS crab cavity. The full 9-cell CKM cavity including the coupler end-groups was simulated using the parallel eigensolver Omega3P and the scattering parameter solver S3P. It was found that both notch filters for the HOM/LOM couplers are very sensitive to the notch gap, with a sensitivity of about 1.6 MHz/micron, more than 10 times that of the TTF cavity. It was also found in the simulation that the unwanted vertical π-mode (SOM) is strongly coupled to the horizontal 7π/9 mode, which causes x-y coupling and reduces the effectiveness of the SOM damping. To meet the ILC requirements, the HOM/LOM couplers are redesigned to address these issues. With the new designs, the damping of the HOM/LOM modes is improved. The sensitivity of the notch filter for the HOM coupler is reduced by one order of magnitude. The notch filter for the LOM coupler is eliminated in the new design, which significantly simplifies the geometry. In this paper, we will present the simulation results of the original CKM cavity and the progress on the HOM/LOM coupler redesign and optimization.

Design and characterization of dielectric-loaded plasmonic directional couplers

DEFF Research Database (Denmark)

Stær, Tobias Holmgaard; Chen, Zhuo; Bozhevolnyi, Sergey

2009-01-01

Ultracompact directional couplers (DCs) based on dielectric-loaded surface plasmon-polariton waveguides (DLSPPWs) are analyzed using the effective index method (EIM), with the coupling, both in the parallel interaction region and in- and out-coupling regions, being taken into account. Near-field...... characterization of fabricated DCs performed with a scanning near-field optical microscope verifies the applicability of the EIM in the analysis and design of DLSPPW-based wavelength-selective DCs. The design approach applicable to a large variety of integrated optical waveguides is developed, enabling...
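
The coupling in the parallel interaction region of a directional coupler is often described by the beating of the even and odd supermodes. The sketch below uses that standard lossless relation for two identical waveguides; it ignores the propagation loss of plasmonic waveguides and the extra coupling in the in- and out-coupling bends that the abstract says must be taken into account. The effective indices and wavelength are hypothetical placeholders, not values from the paper.

```python
import math


def coupling_length(wavelength, n_even, n_odd):
    """Length for complete power transfer: L_c = wavelength / (2 * (n_even - n_odd))."""
    return wavelength / (2 * (n_even - n_odd))


def cross_power(z, l_c):
    """Fraction of power in the neighboring waveguide after a parallel length z."""
    return math.sin(math.pi * z / (2 * l_c)) ** 2


WAVELENGTH = 1.55e-6   # m, hypothetical
N_EVEN = 1.25          # hypothetical effective index of the even supermode
N_ODD = 1.20           # hypothetical effective index of the odd supermode

l_c = coupling_length(WAVELENGTH, N_EVEN, N_ODD)
print(f"coupling length: {l_c * 1e6:.1f} um")
print(f"cross power at L_c/2: {cross_power(l_c / 2, l_c):.2f}")
print(f"cross power at L_c:   {cross_power(l_c, l_c):.2f}")
```

Since the supermode index difference depends on wavelength, the same relation shows why a fixed-length coupler is wavelength selective.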
A modified lower hybrid coupler for TPX

International Nuclear Information System (INIS)

Bernabei, S.; Greenough, N.

1995-01-01

Efforts have concentrated on redesigning the configuration of the Lower Hybrid coupler for the TPX tokamak. Several concerns motivated this redesign: reducing the effect of thermal incompatibility between the coupler and the rf-window material, reducing weight, reducing the risk of window failure, addressing the problem of replaceability, increasing reliability by reducing the number of connections and, finally, reducing the total cost. The result is a highly compact, light and easily serviceable coupler which incorporates some of the simplicity of the multifunction coupler but preserves the spectral flexibility of a conventional coupler.
Fundamental Power Couplers for Superconducting Cavities

International Nuclear Information System (INIS)

Isidoro E. Campisi

2001-01-01

Fundamental power couplers (FPCs) for superconducting cavities must meet very strict requirements to perform at high power levels (hundreds of kilowatts) and in a variety of conditions (CW, pulsed, travelling wave, standing wave) without adversely affecting the performance of the cavities they are powering. Producing good coupler designs and achieving operational performance in accelerator environments are challenging tasks that have traditionally involved large resources from many laboratories. The designs involve state-of-the-art activities in RF, cryogenic and mechanical engineering, materials science, vacuum technology, and electromagnetic field modeling. Handling, assembly and conditioning procedures have been developed to achieve ever-increasing power levels and more reliable operation. In this paper, the technical issues associated with the design, construction, assembly, processing, and operation of FPCs will be reviewed, together with the progress in FPC activities in several laboratories during the past few years.
Development of fundamental power coupler for C-ADS superconducting elliptical cavities

Science.gov (United States)

Gu, Kui-Xiang; Bing, Feng; Pan, Wei-Min; Huang, Tong-Ming; Ma, Qiang; Meng, Fan-Bo

2017-06-01

5-cell elliptical cavities have been selected for the main linac of the China Accelerator Driven sub-critical System (C-ADS) in the medium energy section. According to the design, each cavity should be driven with radio frequency (RF) energy up to 150 kW by a fundamental power coupler (FPC). As the cavities work with high quality factor and high accelerating gradient, the coupler should keep the cavity from contamination in the assembly procedure. To fulfil the requirements, a single-window coaxial type coupler was designed with the capabilities of handling high RF power, class 10 clean room assembly, and heat load control. This paper presents the coupler design and gives details of RF design, heat load optimization and thermal analysis as well as multipacting simulations. In addition, a primary high power test has been performed and is described in this paper. Supported by China ADS Project (XDA03020000) and National Natural Science Foundation of China (11475203)
Development and performance of a new version of the OASIS coupler, OASIS3-MCT_3.0

Science.gov (United States)

Craig, Anthony; Valcke, Sophie; Coquart, Laure

2017-09-01

OASIS is coupling software developed primarily for use in the climate community. It provides the ability to couple different models with low implementation and performance overhead. OASIS3-MCT is the latest version of OASIS. It includes several improvements compared to OASIS3, including elimination of a separate hub coupler process, parallelization of the coupling communication and run-time grid interpolation, and the ability to easily reuse mapping weight files. OASIS3-MCT_3.0 is the latest release and includes the ability to couple between components running sequentially on the same set of tasks as well as to couple within a single component between different grids or decompositions such as physics, dynamics, and I/O. OASIS3-MCT has been tested with different configurations on up to 32 000 processes, with components running on high-resolution grids with up to 1.5 million grid cells, and with over 10 000 2-D coupling fields. Several new features will be available in OASIS3-MCT_4.0, and some of those are also described.
HF power couplers for pulsed superconducting cavity resonators; Coupleurs de puissance HF pour cavites supraconductrices en mode pulse

Energy Technology Data Exchange (ETDEWEB)

Jenhani, Hassen [Laboratoire de l'Accelerateur Lineaire, IN2P3-CNRS et Universite de Paris-Sud, BP 34, F-91898 Orsay Cedex (France)]

2006-11-15

Recent years have seen an impressive improvement in the accelerating gradients obtained in superconducting cavities. Consequently, such cavities have become attractive candidates for large superconducting linear accelerator projects such as the European XFEL and the International Linear Collider (ILC). As a result, there is a strong interest in reducing RF conditioning time and improving the performance of the input power couplers for these cavities. The so-called TTF-III input power couplers adopted for the XFEL superconducting RF cavities are complex components. In order to better understand their behavior, we performed a series of experiments on a number of such couplers. Initially, we developed a fully automated high power RF test stand for the coupler conditioning procedure. Following this, we performed a series of coupler conditioning tests, which allowed the coupler behavior during processing to be studied. A number of experiments were carried out to evaluate the effect of in-situ baking on the conditioning time. Some of the conditioned couplers were sent to DESY to be tested on 9-cell TESLA cavities under cryogenic conditions. These tests have shown that the couplers in no way limit the cavity performance, even up to gradients of 35 MV/m. The main objective of our coupler studies was the reduction of their conditioning time, which represents one of the most important criteria in the choice of coupler for high energy linacs. Excellent progress in reducing the conditioning time has been demonstrated by making appropriate modifications to the conditioning procedure. Furthermore, special attention was paid to electron generation processes in the couplers via multipacting. Simulations of this process were made on both the TTF-III coupler and a new coupler prototype, TTF-V. Experiments aimed at suppressing multipacting were also carried out successfully by applying a DC bias to the inner conductor of the coaxial coupler. (author)
UV written compact broadband optical couplers

DEFF Research Database (Denmark)

Olivero, Massimo; Svalgaard, Mikael

2005-01-01

In this paper the first demonstration of compact asymmetric directional couplers made by UV writing is presented. The combined performance in terms of bandwidth, loss and compactness exceeds that reported using other, more elaborate fabrication techniques.
Model coupler for coupling of atmospheric, oceanic, and terrestrial models

International Nuclear Information System (INIS)

Nagai, Haruyasu; Kobayashi, Takuya; Tsuduki, Katsunori; Kim, Keyong-Ok

2007-02-01

A numerical simulation system SPEEDI-MP, which is applicable to various environmental studies, consists of dynamical models and material transport models for the atmospheric, terrestrial, and oceanic environments, meteorological and geographical databases for model inputs, and system utilities for file management, visualization, analysis, etc., using graphical user interfaces (GUIs). As a numerical simulation tool, a model coupling program (model coupler) has been developed. It controls parallel calculations of several models and data exchanges among them to realize the dynamical coupling of the models. It is applicable to any model with a three-dimensional structured grid system, which is used by most environmental and hydrodynamic models. A coupled model system for water circulation has been constructed with atmosphere, ocean, wave, hydrology, and land-surface models using the model coupler. Performance tests of the coupled model system for water circulation were also carried out for the flood event in Saudi Arabia in January 2005 and the storm surge caused by Hurricane Katrina in August 2005. (author)
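The coupling pattern described in this abstract (several models advanced in parallel, with data exchanged among them at each step) can be sketched in a few lines of Python. This is a toy illustration only and not the SPEEDI-MP coupler: the `ToyModel` class, its relaxation rule, the `run_coupled` function, and all numbers are hypothetical, and each model carries a single scalar in place of a three-dimensional grid.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyModel:
    """Hypothetical stand-in for a component model (atmosphere, ocean, ...)."""

    def __init__(self, name, state, relax):
        self.name = name
        self.state = state   # one scalar instead of a 3-D structured grid
        self.relax = relax   # fraction of the gap to the forcing closed per step

    def step(self, forcing):
        # Toy physics: relax the state toward the value received from the partner.
        self.state += self.relax * (forcing - self.state)
        return self.state


def run_coupled(models, partners, n_steps):
    """Advance all models concurrently, exchanging states once per step.

    models   -- dict mapping name -> ToyModel
    partners -- dict mapping name -> name of the model that supplies its forcing
    """
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        for _ in range(n_steps):
            # Exchange phase: snapshot every state before any model advances.
            exports = {name: m.state for name, m in models.items()}
            # Compute phase: each model steps in its own worker thread.
            futures = [
                pool.submit(m.step, exports[partners[name]])
                for name, m in models.items()
            ]
            for f in futures:
                f.result()  # wait for every model and re-raise any error
    return {name: m.state for name, m in models.items()}


models = {
    "atmosphere": ToyModel("atmosphere", 20.0, 0.25),
    "ocean": ToyModel("ocean", 10.0, 0.25),
}
final = run_coupled(models, {"atmosphere": "ocean", "ocean": "atmosphere"}, 2)
print(final)
```

Taking the snapshot before the compute phase means every model sees its partner's state from the same step, so the result does not depend on thread scheduling. Threads are used here only to keep the sketch self-contained; a production coupler would more likely exchange data between separate processes.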
Structural effects on electromagnetic flow coupler performance

International Nuclear Information System (INIS)

Aoyama, Goro; Yokota, Norikatsu; Mine, Masao; Watanabe, Takashi; Takuma, Tadasu; Takenaka, Kiyoshi.

1992-01-01

A prototype electromagnetic flow coupler was tested using 300 °C liquid sodium to estimate the effect on performance of generator flow velocity, magnetic flux density, magnetic core length and bus bar length. Its performance was not affected by changes in fluid velocity and magnetic flux density up to 8.3 m/s and 0.51 T, respectively. Besides the experiments, a two-dimensional numerical analysis program based on Ohm's law and the current continuity equation was prepared to estimate the effects of magnetic core length and bus bar construction. The current transfer ratio, i.e. the fraction of current transferred from the generator to the pump, increased with magnetic core length, reaching a maximum of 0.706 for a 100 mm core and 0.764 for a 300 mm core. The numerical results showed that the presence of the bus bar in the outer region of the magnetic core gave inferior performance to that in its absence. (author)
Comparative Simulation Studies of Multipacting in Higher-Order-Mode Couplers of Superconducting RF Cavities

International Nuclear Information System (INIS)

Li, Y. M.; Liu, Kexin; Geng, Rongli

2014-01-01

Multipacting (MP) in higher-order-mode (HOM) couplers of the International Linear Collider (ILC) baseline cavity and the Continuous Electron Beam Accelerator Facility (CEBAF) 12 GeV upgrade cavity is studied by using the ACE3P suites, developed by the Advanced Computations Department at SLAC. For the ILC cavity HOM coupler, the simulation results show that resonant trajectories exist in three zones, corresponding to an accelerating gradient range of 0.6–1.6 MV/m, 21–34 MV/m, 32–35 MV/m, and > 40 MV/m, respectively. For the CEBAF 12 GeV upgrade cavity HOM coupler, resonant trajectories exist in one zone, corresponding to an accelerating gradient range of 6–13 MV/m. Potential implications of these MP barriers are discussed in the context of future high energy pulsed as well as medium energy continuous wave (CW) accelerators based on superconducting radio frequency cavities. Frequency scaling of MP's predicted in HOM couplers of the ILC, CEBAF upgrade, SNS and FLASH third harmonic cavity is given and found to be in good agreement with the analytical result based on the parallel plate model.
Comparative Simulation Studies of Multipacting in Higher-Order-Mode Couplers of Superconducting RF Cavities

Energy Technology Data Exchange (ETDEWEB)

Li, Y. M. [Peking University, Beijing (China); Thomas Jefferson National Accelerator Facility, Newport News, VA (United States); Liu, Kexin [Peking University, Beijing (China); Geng, Rongli [Thomas Jefferson National Accelerator Facility, Newport News, VA (United States)

2014-02-01

Multipacting (MP) in higher-order-mode (HOM) couplers of the International Linear Collider (ILC) baseline cavity and the Continuous Electron Beam Accelerator Facility (CEBAF) 12 GeV upgrade cavity is studied by using the ACE3P suites, developed by the Advanced Computations Department at SLAC. For the ILC cavity HOM coupler, the simulation results show that resonant trajectories exist in three zones, corresponding to an accelerating gradient range of 0.6-1.6 MV/m, 21-34 MV/m, 32-35 MV/m, and > 40 MV/m, respectively. For the CEBAF 12 GeV upgrade cavity HOM coupler, resonant trajectories exist in one zone, corresponding to an accelerating gradient range of 6-13 MV/m. Potential implications of these MP barriers are discussed in the context of future high energy pulsed as well as medium energy continuous wave (CW) accelerators based on superconducting radio frequency cavities. Frequency scaling of MP's predicted in HOM couplers of the ILC, CEBAF upgrade, SNS and FLASH third harmonic cavity is given and found to be in good agreement with the analytical result based on the parallel plate model.
Waveguide silicon nitride grating coupler

Science.gov (United States)

Litvik, Jan; Dolnak, Ivan; Dado, Milan

2016-12-01

Grating couplers are among the most widely used elements for coupling light between optical fibers and photonic integrated components. The silicon-on-insulator platform provides strong confinement of light and allows high integration. In this work, using simulations, we have designed a broadband silicon nitride surface grating coupler. The Fourier-eigenmode expansion and finite difference time domain methods are utilized in design optimization of the grating coupler structure. The fully etched, single etch step grating coupler is based on a standard silicon-on-insulator wafer with a 0.55 μm waveguide Si3N4 layer. The optimized structure at 1550 nm wavelength yields a peak coupling efficiency of -2.6635 dB (54.16%) with a 1-dB bandwidth up to 80 nm. It is a promising way toward low-cost fabrication using the complementary metal-oxide-semiconductor fabrication process.
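The pairing of -2.6635 dB with 54.16% in this abstract follows from the usual definition of a power ratio in decibels, 10·log10(P_out/P_in). The short Python check below is an illustrative sketch; the function names are our own and do not come from the paper.

```python
import math


def db_to_fraction(db):
    """Convert a power ratio in dB to a linear fraction."""
    return 10 ** (db / 10)


def fraction_to_db(fraction):
    """Convert a linear power fraction to dB."""
    return 10 * math.log10(fraction)


peak_db = -2.6635                      # value quoted in the abstract
efficiency = db_to_fraction(peak_db)   # linear coupling efficiency
print(f"{peak_db} dB corresponds to {efficiency:.2%}")
```

The same two helpers apply to the other coupling efficiencies and insertion losses quoted in dB elsewhere in this listing.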
Study and development of an input coupler for the future TESLA collider

International Nuclear Information System (INIS)

Dupery, C.

1996-01-01

The TESLA (TeV Superconducting Linear Accelerator) is operating with a high frequency cavity resonator input coupler. Some technical restraints (such as thermal, mechanical, electrical, vacuum, multipactor discharge phenomena) constrain the development of this coupler. In order to solve these problems, studies have been performed at the French Atomic Energy Commission (CEA) and are presented in this paper
High efficiency diffractive grating coupler based on transferred silicon nanomembrane overlay on photonic waveguide

Energy Technology Data Exchange (ETDEWEB)

Saha, Tapas Kumar; Zhou Weidong [University of Texas at Arlington, Department of Electrical Engineering, NanoFAB Center, Arlington, TX 76019-0072 (United States)

2009-04-21

We report here the design of a new type of high efficiency grating coupler, based on single crystalline Si nanomembrane overlay and stacking. Such high efficiency diffractive grating couplers are designed for the purpose of coupling light between single mode fibres and nanophotonic waveguides, and for the coupling between multiple photonic interconnect layers for compact three-dimensional vertical integration. Two-dimensional model simulation based on eigenmode expansion shows a diffractive power-up efficiency of 81% and a fibre coupling efficiency of 64%. With nanomembrane stacking, it is feasible to integrate the side-distributed Bragg reflector and bottom reflector, which can lead to the diffractive power-up efficiency and the fibre coupling efficiency of 97% and 73.5%, respectively. For a negatively detuned coupler, the bottom reflector is not needed, and the diffractive power-up efficiency can reach 98% over a large spectral range. The device is extremely tolerant to fabrication errors.
10th International Workshop on Parallel Tools for High Performance Computing

CERN Document Server

Gracia, José; Hilbrich, Tobias; Knüpfer, Andreas; Resch, Michael; Nagel, Wolfgang

2017-01-01

This book presents the proceedings of the 10th International Parallel Tools Workshop, held October 4-5, 2016 in Stuttgart, Germany – a forum to discuss the latest advances in parallel tools. High-performance computing plays an increasingly important role for numerical simulation and modelling in academic and industrial research. At the same time, using large-scale parallel systems efficiently is becoming more difficult. A number of tools addressing parallel program development and analysis have emerged from the high-performance computing community over the last decade, and what may have started as a collection of small helper scripts has now matured into production-grade frameworks. Powerful user interfaces and an extensive body of documentation allow easy usage by non-specialists.
Coupler induced monopole component and its minimization in deflecting cavities

Directory of Open Access Journals (Sweden)

P. K. Ambattu

2013-06-01

Deflecting cavities are used in particle accelerators for the manipulation of charged particles by deflecting or crabbing (rotating) them. For short deflectors, the effect of the power coupler on the deflecting field can become significant. The particular power coupler type can introduce multipole rf field components and coupler-specific wakefields. Coupler types that would normally be considered, such as the standard on-cell coupler, waveguide coupler, or mode-launcher coupler, could have one or two rf feeds. The major advantage of a dual-feed coupler is the absence of monopole and quadrupole rf field components in the deflecting structure. However, a dual-feed coupler is mechanically more complex than a typical single-feed coupler and needs a splitter. For most applications, deflecting structures are placed in regions where space is limited, hence reducing the size of the structure is very desirable. This paper investigates the multipole field components of the deflecting mode in single-feed couplers and ways to overcome the effect of the monopole component on the beam. Significant advances in performance have been demonstrated. Additionally, a novel coupler design is introduced which has no monopole field component to the deflecting mode and is more compact than the conventional dual-feed coupler.
Polarization converted coupler for plasma current drive experiments

International Nuclear Information System (INIS)

Arai, H.; Shimizu, S.; Goto, N.

1986-01-01

In this paper, the authors propose the polarization converted coupler which has narrow width shape and radiates electric field perpendicular to the main toroidal magnetic field. The advantages of the polarization converted coupler are as follows: (1) The rectangular waveguide as the transmission line has the high power capability. (2) The all metal design is not damaged by the fusion neutron. (3) The characteristic of this coupler is not changed widely, since the coupler has the matching section. For example, the VSWR of its input impedance is less than 2.0 for both water and air load. The authors present characteristics of the polarization converted coupler measured by the model experiments.
7th International Workshop on Parallel Tools for High Performance Computing

CERN Document Server

Gracia, José; Nagel, Wolfgang; Resch, Michael

2014-01-01

Current advances in High Performance Computing (HPC) increasingly impact efficient software development workflows. Programmers for HPC applications need to consider trends such as increased core counts, multiple levels of parallelism, reduced memory per core, and I/O system challenges in order to derive well performing and highly scalable codes. At the same time, the increasing complexity adds further sources of program defects. While novel programming paradigms and advanced system libraries provide solutions for some of these challenges, appropriate supporting tools are indispensable. Such tools aid application developers in debugging, performance analysis, or code optimization and therefore make a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools presented and discussed at the 7th International Parallel Tools Workshop, held in Dresden, Germany, September 3-4, 2013.
High-directionality fiber-chip grating coupler with interleaved trenches and subwavelength index-matching structure.

Science.gov (United States)

Benedikovic, Daniel; Alonso-Ramos, Carlos; Cheben, Pavel; Schmid, Jens H; Wang, Shurui; Xu, Dan-Xia; Lapointe, Jean; Janz, Siegfried; Halir, Robert; Ortega-Moñux, Alejandro; Wangüemert-Pérez, J Gonzalo; Molina-Fernández, Iñigo; Fédéli, Jean-Marc; Vivien, Laurent; Dado, Milan

2015-09-15

We present the first experimental demonstration of a new fiber-chip grating coupler concept that exploits the blazing effect by interleaving the standard full (220 nm) and shallow etch (70 nm) trenches in a 220 nm thick silicon layer. The high directionality is obtained by controlling the separation between the deep and shallow trenches to achieve constructive interference in the upward direction and destructive interference toward the silicon substrate. Utilizing this concept, the grating directionality can be maximized independent of the bottom oxide thickness. The coupler also includes a subwavelength-engineered index-matching region, designed to reduce the reflectivity at the interface between the injection waveguide and the grating. We report a measured fiber-chip coupling efficiency of -1.3 dB, the highest coupling efficiency achieved to date for a surface grating coupler in a 220 nm silicon-on-insulator platform fabricated in a conventional dual-etch process without high-index overlays or bottom mirrors.

RF Coupler Design for the TRIUMF ISAC-II Superconducting Quarter Wave Resonator

CERN Document Server

Poirier, R L; Harmer, P; Laxdal, R E; Mitra, A K; Sekatchev, I; Waraich, B; Zvyagintsev, V

2004-01-01

An RF Coupler for the ISAC-II medium beta (β=0.058 and 0.071) superconducting quarter wave resonators was designed and tested at TRIUMF. The main goal of this development was to achieve stable operation of superconducting cavities at high acceleration gradients and low thermal load to the helium refrigeration system. The cavities will operate at 6 MV/m acceleration gradient in overcoupled mode at a forward power of 200 W at 106 MHz. The overcoupling provides ±20 Hz cavity bandwidth, which improves the stability of the RF control system for fast helium pressure fluctuations, microphonics and environmental noise. Choice of materials, cooling with liquid nitrogen, aluminum nitride RF window and thermal shields ensure a small thermal load on the helium refrigeration system by the Coupler. An RF finger contact which caused μdust in the coupler housing was eliminated without any degradation of the coupler performance. RF and thermal calculations, design and test results on the coupler are p...
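The figures in this abstract allow a rough estimate of the loaded quality factor through the standard relation Q_L = f0 / Δf, where Δf is the full resonance bandwidth. The sketch below assumes that "±20 Hz" denotes the half-bandwidth, so that Δf = 40 Hz; that reading is our assumption, not a statement from the abstract, and the function name is hypothetical.

```python
def loaded_q(f0_hz, half_bandwidth_hz):
    """Loaded quality factor from centre frequency and half-bandwidth.

    Assumes the quoted +/- value is the half-bandwidth, so the full
    bandwidth used in Q_L = f0 / bandwidth is twice that number.
    """
    full_bandwidth_hz = 2.0 * half_bandwidth_hz
    return f0_hz / full_bandwidth_hz


q_l = loaded_q(106e6, 20.0)   # 106 MHz cavity, +/-20 Hz bandwidth
print(f"Q_L is roughly {q_l:.3g}")
```

If the ±20 Hz figure were instead meant as the full bandwidth, the estimate would double.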
Comparative simulation studies of multipacting in higher-order-mode couplers of superconducting rf cavities

Directory of Open Access Journals (Sweden)

Y. M. Li

2014-02-01

Multipacting (MP) in higher-order-mode (HOM) couplers of the International Linear Collider (ILC) baseline cavity and the Continuous Electron Beam Accelerator Facility (CEBAF) 12 GeV upgrade cavity is studied by using the ACE3P suites, developed by the Advanced Computations Department at SLAC. For the ILC cavity HOM coupler, the simulation results show that resonant trajectories exist in three zones, corresponding to an accelerating gradient range of 0.6–1.6 MV/m, 21–34 MV/m, 32–35 MV/m and >40 MV/m, respectively. For the CEBAF 12 GeV upgrade cavity HOM coupler, resonant trajectories exist in one zone, corresponding to an accelerating gradient range of 6–13 MV/m. Potential implications of these MP barriers are discussed in the context of future high-energy pulsed as well as medium-energy continuous wave accelerators based on superconducting radio frequency cavities. Frequency scaling of MP’s predicted in HOM couplers of the ILC, CEBAF upgrade, Spallation Neutron Source (SNS), and Free-Electron Laser in Hamburg (FLASH) third harmonic cavity is given and found to be in good agreement with the analytical result based on the parallel plate model.
8th International Workshop on Parallel Tools for High Performance Computing

CERN Document Server

Gracia, José; Knüpfer, Andreas; Resch, Michael; Nagel, Wolfgang

2015-01-01

Numerical simulation and modelling using High Performance Computing has evolved into an established technique in academic and industrial research. At the same time, the High Performance Computing infrastructure is becoming ever more complex. For instance, most of the current top systems around the world use thousands of nodes in which classical CPUs are combined with accelerator cards in order to enhance their compute power and energy efficiency. This complexity can only be mastered with adequate development and optimization tools. Key topics addressed by these tools include parallelization on heterogeneous systems, performance optimization for CPUs and accelerators, debugging of increasingly complex scientific applications, and optimization of energy usage in the spirit of green IT. This book represents the proceedings of the 8th International Parallel Tools Workshop, held October 1-2, 2014 in Stuttgart, Germany – which is a forum to discuss the latest advancements in the parallel tools.
Microfabrication of pre-aligned fiber bundle couplers using ultraviolet lithography of SU-8

OpenAIRE

Yang, Ren; Soper, Steven A.; Wang, Wanjun

2006-01-01

This paper describes the design, microfabrication and testing of a pre-aligned array of fiber couplers using direct UV-lithography of SU-8. The fiber coupler array includes an out-of-plane refractive microlens array and two fiberport collimator arrays. With the optical axis of the pixels parallel to the substrate, each pixel of the microlens array can be pre-aligned with the corresponding pixels of the fiberport collimator array as defined by the lithography mask design. This out-of-plane pol...
Development of the SCRF Power Coupler for the APT Accelerator

Energy Technology Data Exchange (ETDEWEB)

Schmierer, E.N.; Lujan, R.E.; Rusnak, B.; Smith, B.; Haynes, W.B.; Gautier, C.; Waynert, J.A.; Krawczyk, F.; Gioia, J.

1999-03-01

The team responsible for the design of the Accelerator Production of Tritium (APT) superconducting (SC) radio frequency (RF) power coupler has developed two 700-MHz, helium gas-cooled power couplers. One has a fixed inner conductor and the other has an adjustable inner conductor (gamma prototype and alpha prototype). The power couplers will be performance tested in the near future. This paper discusses the mechanical design and fabrication techniques employed in the development of each power coupler. This includes material selection, copper coating, assembly sequences, and metal joining procedures, as well as the engineering analyses performed to determine the dynamic response of the inner conductors due to environmental excitations. A bellows is used in both prototype inner conductors in the area near the ceramic RF window, to compensate for thermal expansion and mechanical tolerance build-up. In addition, a bellows is used near the tip of the inner conductor of the alpha prototype for running the power coupler after it is installed on the accelerator. Extensive analytical work has been performed to determine the static loads transmitted by the bellows due to thermally induced expansion on the inner conductor and on the RF window. This paper also discusses this analysis, as well as the mechanical analysis performed to determine the final geometric shape of the bellows. Finally, a discussion of the electromagnetic analysis used to optimize the performance of the power couplers is included.
All-optical switching using a new photonic crystal directional coupler

Directory of Open Access Journals (Sweden)

B. Vakili

2015-07-01

In this paper all-optical switching in a new photonic crystal directional coupler is performed. The structure of the switch consists of a directional coupler and a separate path for a control signal called “control waveguide”. In contrast to the former reported structures in which the directional couplers are made by removing a row of rods entirely, the directional coupler in our optical switch is constructed by two reduced-radius line-defect waveguides separated by the control waveguide. Furthermore, in our case the background material has the nonlinear Kerr property. Therefore, in the structure of this work, no frequency overlap occurs between the control waveguide mode and the directional coupler modes. It is shown that such a condition provides a very good isolation between the control and the probe signals at the output ports. In the control waveguide, nonlinear Kerr effect causes the required refractive index change by the presence of a high power control (pump) signal. Even and odd modes of the coupler are investigated by applying the distribution of the refractive index change in the nonlinear region of a super-cell so that a switching length of about 94 µm is obtained at the wavelength of 1.55 µm. Finally, all-optical switching of the 1.55 µm probe signal using a control signal at the wavelength of 1.3 µm, is simulated through the finite-difference time-domain method, where both signals are desirable in optical communication systems. A very high extinction ratio of 67 dB is achieved and the temporal characteristics of the switch are demonstrated.
CUDA/GPU Technology : Parallel Programming For High Performance Scientific Computing

OpenAIRE

YUHENDRA; KUZE, Hiroaki; JOSAPHAT, Tetuko Sri Sumantyo

2009-01-01

Graphics processing units (GPUs), originally designed for computer video cards, have emerged as the most powerful chip in a high-performance workstation. In high-performance computation, GPUs deliver much more powerful performance than conventional CPUs by means of parallel processing. In 2007, the birth of Compute Unified Device Architecture (CUDA) and CUDA-enabled GPUs by NVIDIA Corporation brought a revolution in the general purpose GPU a...
High performance statistical computing with parallel R: applications to biology and climate modelling

International Nuclear Information System (INIS)

Samatova, Nagiza F; Branstetter, Marcia; Ganguly, Auroop R; Hettich, Robert; Khan, Shiraj; Kora, Guruprasad; Li, Jiangtian; Ma, Xiaosong; Pan, Chongle; Shoshani, Arie; Yoginath, Srikanth

2006-01-01

Ultrascale computing and high-throughput experimental technologies have enabled the production of scientific data about complex natural phenomena. With this opportunity comes a new problem: the massive quantities of data so produced. Answers to fundamental questions about the nature of those phenomena remain largely hidden in the produced data. The goal of this work is to provide a scalable high performance statistical data analysis framework to help scientists perform interactive analyses of these raw data to extract knowledge. Towards this goal we have been developing an open source parallel statistical analysis package, called Parallel R, that lets scientists employ a wide range of statistical analysis routines on high performance shared and distributed memory architectures without having to deal with the intricacies of parallelizing these routines.
Coupler developments at CERN

International Nuclear Information System (INIS)

Cavallari, G.; Chiaveri, E.; Haebel, E.; Legendre, P.; Weingarten, W.

1988-01-01

This paper discusses the coupler developments that have taken place at CERN since the last RF superconductivity workshop. At that time beam tube coupling was just starting to be examined. It was found that by restricting the number of cells to four, with the correct amount of intercell coupling and with endcells compensated simultaneously for several modes, trapped modes can be avoided at least up to three and a half times the fundamental mode frequency. This result is regarded as a sufficiently safe basis to switch over to beam tube coupling with two higher order mode (HOM) couplers, one on each side and with 65 degrees in between them, and in addition one beam tube power coupler. The characteristics of the cavity and the machine determine the basic coupler specifications. Four designs are discussed for HOM couplers. 22 references, 15 figures, 2 tables
Implementation of a high performance parallel finite element micromagnetics package

International Nuclear Information System (INIS)

Scholz, W.; Suess, D.; Dittrich, R.; Schrefl, T.; Tsiantos, V.; Forster, H.; Fidler, J.

2004-01-01

A new high performance scalable parallel finite element micromagnetics package has been implemented. It includes solvers for static energy minimization, time integration of the Landau-Lifshitz-Gilbert equation, and the nudged elastic band method
RF Processing of the Couplers for the SNS Superconducting Cavities

International Nuclear Information System (INIS)

Y. Kang; I.E. Campisi; D. Stout; A. Vassioutchenko; M. Stirbet; M. Drury; T. Powers

2005-01-01

All eighty-one fundamental power couplers for the 805 MHz superconducting cavities of the SNS linac have been RF conditioned and installed in the cryomodules successfully. The couplers were RF processed at JLAB or at the SNS in ORNL: more than forty couplers have been RF conditioned in the SNS RF Test Facility (RFTF) after the first forty couplers were conditioned at JLAB. The couplers were conditioned up to 650 kW forward power at 8% duty cycle in traveling and standing waves. They were installed on the cavities in the cryomodules and then assembled with the airside waveguide transitions. The couplers have been high power RF tested with satisfactory accelerating field gradients in the cooled cavities
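For pulsed conditioning of this kind, the time-averaged forward power is the peak power multiplied by the duty cycle. The sketch below applies that relation to the figures quoted in the abstract; it assumes rectangular pulses at the full 650 kW, and the function name is illustrative only.

```python
def average_power_kw(peak_kw, duty_cycle):
    """Time-averaged power for rectangular pulses of a given duty cycle."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must lie between 0 and 1")
    return peak_kw * duty_cycle


avg_kw = average_power_kw(650.0, 0.08)   # 650 kW peak at 8% duty cycle
print(f"average forward power is about {avg_kw:.0f} kW")
```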
Very short intracavity directional coupler for high-speed communication

Science.gov (United States)

Griffel, Giora

1993-07-01

We propose a novel intracavity modulator/switch that consists of a directional-coupler located inside a Fabry-Perot cavity. The back mirror of the cavity has a unit reflectivity so that both input and output signals are at the same side. In this way we obtain a two-port, single side element, with coupling length of 83.5 μm, which is the shortest modulation coupler proposed so far. The upper frequency limit due to photon lifetime is 275 GHz, which is well over the bandwidth constraints of microwave lumped structures. A unified approach for the analysis of this device and other similar structures is presented and discussed.
Directional multimode coupler for planar magnonics: Side-coupled magnetic stripes

Energy Technology Data Exchange (ETDEWEB)

Sadovnikov, A. V., E-mail: sadovnikovav@gmail.com; Nikitov, S. A. [Laboratory “Metamaterials,” Saratov State University, Saratov 410012 (Russian Federation); Kotel' nikov Institute of Radioengineering and Electronics, Russian Academy of Sciences, Moscow 125009 (Russian Federation); Beginin, E. N.; Sheshukova, S. E.; Romanenko, D. V.; Sharaevskii, Yu. P. [Laboratory “Metamaterials,” Saratov State University, Saratov 410012 (Russian Federation)

2015-11-16

We experimentally demonstrate spin waves coupling in two laterally adjacent magnetic stripes. By the means of Brillouin light scattering spectroscopy, we show that the coupling efficiency depends both on the magnonic waveguides' geometry and the characteristics of spin-wave modes. In particular, the lateral confinement of coupled yttrium-iron-garnet stripes enables the possibility of control over the spin-wave propagation characteristics. Numerical simulations (in time domain and frequency domain) reveal the nature of intermodal coupling between two magnonic stripes. The proposed topology of multimode magnonic coupler can be utilized as a building block for fabrication of integrated parallel functional and logic devices such as the frequency selective directional coupler or tunable splitter, enabling a number of potential applications for planar magnonics.
2×2 polymeric electro-optic MZI switch using multimode interference couplers

Science.gov (United States)

Li, H. P.; Liao, J. K.; Tang, X. G.; Lu, R. G.; Liu, Y. Z.

2009-11-01

We present the design of a 2×2 photonic switch operating at 1.55-μm wavelength using electro-optic (EO) polymer waveguides. A Mach-Zehnder interferometer (MZI) is used to implement the proposed switch in which two identical 2×2 multimode interference (MMI) couplers are connected by two identical parallel single-mode waveguides (two MZI arms). These two single-mode waveguides with electrodes allow modulating the phase difference between the two MZI arms based on the EO effect. In the proposed switch, the EO polymer, IPC-E/polysulfone, is used for the core layer of optical waveguides. UV15 and NOA61 are employed for the lower and upper cladding layers, respectively. The single-mode waveguide structure and 2×2 MMI coupler have been designed and analyzed for the EO switch. Device performance has been simulated using the beam propagation method. It is found that the switch performance is most sensitive to the MMI width and less sensitive to the MMI length. An optimized structure has been obtained for the 2×2 polymeric EO switch, which has a crosstalk level better than -25 dB and insertion loss lower than -1.8 dB. This performance makes the switch a potential candidate for practical use in photonic systems.
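The switching principle in this abstract (two 2×2 couplers joined by two arms whose phase difference is modulated) can be illustrated with an idealized lossless model. The Python sketch below assumes each MMI coupler behaves as a perfect 3-dB coupler with a 90° phase shift on the cross path; the function name and port labels are our own, and real devices deviate from this ideal.

```python
import cmath
import math


def mzi_outputs(delta_phi):
    """Output powers (bar, cross) of an ideal 2x2 MZI for unit input power.

    delta_phi is the phase difference in radians applied to the upper arm.
    Each coupler is modelled as an ideal, lossless 3-dB coupler.
    """
    c = 1 / math.sqrt(2)

    def coupler(a, b):
        # Ideal 3-dB coupler: the cross path picks up a -90 degree phase.
        return c * (a - 1j * b), c * (-1j * a + b)

    upper, lower = coupler(1.0, 0.0)            # light enters the upper port
    upper = upper * cmath.exp(1j * delta_phi)   # electro-optic phase shift
    bar, cross = coupler(upper, lower)
    return abs(bar) ** 2, abs(cross) ** 2


for phi in (0.0, math.pi / 2, math.pi):
    p_bar, p_cross = mzi_outputs(phi)
    print(f"delta_phi = {phi:.3f} rad: bar = {p_bar:.3f}, cross = {p_cross:.3f}")
```

In this model the bar and cross powers follow sin²(Δφ/2) and cos²(Δφ/2), so a phase change of π moves the light from one output port to the other.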
A high performance parallel approach to medical imaging

International Nuclear Information System (INIS)

Frieder, G.; Frieder, O.; Stytz, M.R.

1988-01-01

Research into medical imaging using general purpose parallel processing architectures is described and a review of the performance of previous medical imaging machines is provided. Results demonstrating that general purpose parallel architectures can achieve performance comparable to other, specialized, medical imaging machine architectures are presented. A new back-to-front hidden-surface removal algorithm is described. Results demonstrating the computational savings obtained by using the modified back-to-front hidden-surface removal algorithm are presented. Performance figures for forming a full-scale medical image on a mesh interconnected multiprocessor are presented.
Design and Analysis of an Optical Coupler for Concentrated Solar Light Using Optical Fibers in Residential Buildings

Directory of Open Access Journals (Sweden)

Afshin Aslian

2016-01-01

Concentrated sunlight that is transmitted by fiber optics has been used for generating electricity, heat, and daylight. On the other hand, multijunction photovoltaic cells provide high efficiency for generating electricity from highly concentrated sunlight. This study deals with designing and simulating a high-efficiency coupler, employing a mathematical model to connect sunlight with fiber optics for multiple applications. The coupler concentrates and distributes irradiated light from a primary concentrator. In this study, a parabolic dish was used as the primary concentrator, and the coupler contains nine components, each called a compound truncated pyramid and cone (CTPC), all of which were mounted on a plate. The material of both the CTPC and the plate was BK7 optical glass. Fiber optics cables and multijunction photovoltaic cells were connected to the cylindrical part of the CTPC. The fibers would transmit the light to the building to provide heat and daylight, whereas multijunction photovoltaic cells generate electricity. Theoretical and simulation results showed high performance of the designed coupler. The efficiency of the coupler was as high as 92% when the rim angle of the dish was increased to an optimum angle. Distributed sunlight in the coupler increased the flexibility and simplicity of the design, resulting in a system that provided concentrated electricity, heat, and lighting for residential buildings.
A coupler for parasitic mode diagnosis in an X-band triaxial klystron amplifier

Directory of Open Access Journals (Sweden)

Wei Zhang

2017-10-01

The traditional methods of parasitic mode excitation diagnosis in an X-band triaxial klystron amplifier (TKA) face two difficulties: limited installation space and vacuum sealing. In order to solve these issues, a simple and compact coupler with good sealing performance, which can prevent air flow between the main and the auxiliary waveguides, is proposed and investigated experimentally. The coupler is designed with the aperture diffraction theory and the finite-difference time-domain (FDTD) method. The designed coupler consists of a main coaxial waveguide (for microwave transmission) and a rectangular auxiliary waveguide (for parasitic mode diagnosis). The entire coupler structure has been fabricated from a macromolecule polymer that is transparent to microwave signals in the X-band frequency range. A metal coating of about 200 microns has been applied by electroplating to ensure that the device operates well at high power. A small aperture is made in the metal coating. Hence, microwaves can couple through the hole and the wave-transparent medium, whereas air flow is blocked by the wave-transparent medium. The coupling coefficient is analyzed and simulated with CST software. The coupler model is also included in a particle-in-cell (PIC) simulation with CHIPIC software and the associated parasitic mode excitation is studied. A frequency component of 11.46 GHz is observed in the FFT of the electric field of the drift tube and its corresponding competition mode appears as TE61 mode according to the electric field distribution. Besides, a frequency component of 10.8 GHz is also observed in the FFT of the electric field. After optimization of TE61 mode suppression, an experiment of the TKA with the designed coupler is carried out and the parasitic mode excitation at 10.8 GHz is observed through the designed coupler.
A novel bridge coupler for SSC coupled cavity linac

International Nuclear Information System (INIS)

Yao, C.G.; Chang, C.R.; Funk, W.

1992-01-01

A novel magnetically coupled multi-cavity bridge coupler is proposed for SSC Coupled-Cavity-Linac (CCL). The bridge coupler is a five cell disc-loaded waveguide with a small central aperture used for measurement and two large curved coupling slots near the edge on each disc. The two coupling slots on the adjacent disc are rotated 90 degrees in orientation to reduce the direct coupling. This type of structure is capable of producing very large coupling (>10% in our longest bridge coupler). Also because of the small opening on the discs, the high-order-modes are very far (> 300 MHz) above the operating mode. Thus for long bridge couplers, the magnetic coupled structure should provide maximum coupling with minimum mode mixing problems. In this paper both physics and engineering issues of this new bridge coupler are presented. (Author) 5 refs., 2 tabs., 6 figs
Structural Analysis of Taper-Threaded Rebar Couplers

International Nuclear Information System (INIS)

Chu, Seok Jae; Kwon, Hyuk Mo; Seo, Sang Hwan

2014-01-01

A number of rebar couplers were developed by the leading companies. The information about the products is available from the company website. However, the theory on the taper-threaded coupler is not available. In this paper, the mechanics of the taper-thread was developed to understand the effect of the tightening torque. Structural analysis of our own newly developed rebar coupler was done to improve the strength of the coupler. The taper-threaded rebar coupler was analyzed. The tightening of the rebar into the coupler developed a circumferential stress in the coupler. The circumferential stress depends on the coefficient of friction as well as the tightening torque. The circumferential stress is less than the allowable stress of 20 kgf/mm² of the material for the coefficient of friction greater than 0.1. The tightening of the rebar into the coupler and the subsequent tensioning was simulated using CATIA. Linear elastic analysis considering contact was done. The tightening of the taper-threaded rebar developed a uniform stress distribution in both standard coupler and position coupler. On the other hand, the tightening of the nut in the axial direction developed a non-uniform stress distribution. Similarly the tensioning also developed a non-uniform stress distribution.
Direct UV-written broadband directional broadband planar waveguide couplers

DEFF Research Database (Denmark)

Olivero, Massimo; Svalgaard, Mikael

2005-01-01

We report the fabrication of broadband directional couplers by direct UV-writing. The fabrication process is shown to be beneficial, robust and flexible. The components are compact and show superior performance in terms of loss and broadband operation.

Demonstration of a High-Order Mode Input Coupler for a 220-GHz Confocal Gyrotron Traveling Wave Tube

Science.gov (United States)

Guan, Xiaotong; Fu, Wenjie; Yan, Yang

2018-02-01

A design of a high-order mode input coupler for a 220-GHz confocal gyrotron travelling wave tube is proposed, simulated, and demonstrated by experimental tests. This input coupler is designed to excite the confocal TE06 mode from the rectangular waveguide TE10 mode over a broadband frequency range. Simulation results predict that the optimized conversion loss is about 2.72 dB with a mode purity in excess of 99%. Considering the gyrotron interaction theory, an effective bandwidth of 5 GHz is obtained, in which the beam-wave coupling efficiency is higher than half of the maximum. The field pattern under low power demonstrates that the TE06 mode is successfully excited in the confocal waveguide at 220 GHz. Cold test results from the vector network analyzer show good agreement with simulation results. Both simulation and experimental results illustrate that the reflection at the input port, S11, is sensitive to the perpendicular separation of the two mirrors. This provides an engineering possibility for estimating the assembly precision.
Wakefield and RF Kicks Due to Coupler Asymmetry in TESLA-Type Accelerating Cavities

International Nuclear Information System (INIS)

Bane, K

2008-01-01

In a future linear collider, such as the International Linear Collider (ILC), trains of high current, low emittance bunches will be accelerated in a linac before colliding at the interaction point. Asymmetries in the accelerating cavities of the linac will generate fields that will kick the beam transversely and degrade the beam emittance and thus the collider performance. In the main linac of the ILC, which is filled with TESLA-type superconducting cavities, it is the fundamental (FM) and higher mode (HM) couplers that are asymmetric and thus the source of such kicks. The kicks are of two types: one, due to (the asymmetry in) the fundamental RF fields and the other, due to transverse wakefields that are generated by the beam even when it is on axis. In this report we calculate the strength of these kicks and estimate their effect on the ILC beam. The TESLA cavity comprises nine cells, one HM coupler in the upstream end, and one (identical, though rotated) HM coupler and one FM coupler in the downstream end (for their shapes and location see Figs. 1, 2) [1]. The cavity is 1.1 m long, the iris radius 35 mm, and the coupler beam pipe radius 39 mm. Note that the couplers reach closer to the axis than the irises, down to a distance of 30 mm
Coupler for nuclear reactor absorber rods

International Nuclear Information System (INIS)

Kerz, K.

1984-01-01

A coupler is described for absorber rods being suspended during operation of nuclear reactors which includes a plurality of actuating elements being movable for individually and jointly releasing the coupler, the movement of each of the actuating elements for releasing the coupler being independently controllable.
High performance parallel computers for science: New developments at the Fermilab advanced computer program

International Nuclear Information System (INIS)

Nash, T.; Areti, H.; Atac, R.

1988-08-01

Fermilab's Advanced Computer Program (ACP) has been developing highly cost effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor has been developed for theoretical physics. Each $4000 node is a FORTRAN or C programmable pipelined 20 MFlops (peak), 10 MByte single board computer. These are plugged into a 16 port crossbar switch crate which handles both inter and intra crate communication. The crates are connected in a hypercube. Site oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256 node, 5 GFlop, system is under construction. 10 refs., 7 figs
High performance parallelism pearls 2 multicore and many-core programming approaches

CERN Document Server

Jeffers, Jim

2015-01-01

High Performance Parallelism Pearls Volume 2 offers another set of examples that demonstrate how to leverage parallelism. Similar to Volume 1, the techniques included here explain how to use processors and coprocessors with the same programming - illustrating the most effective ways to combine Xeon Phi coprocessors with Xeon and other multicore processors. The book includes examples of successful programming efforts, drawn from across industries and domains such as biomed, genetics, finance, manufacturing, imaging, and more. Each chapter in this edited work includes detailed explanations of t
An analysis of multislot directional coupler

International Nuclear Information System (INIS)

Arai, Hiroyuki; Goto, Naohisa; Yamamoto, Takumi.

1986-03-01

This paper presents an analysis of multislot directional coupler for monitoring the gyrotron output. We solved the boundary value problem of the directional coupler to investigate the detailed effect of finite thickness slot and mutual coupling between slots. Numerical data of coupler design is presented for non-resonant a pair slot, and mode sensitivity in overmoded waveguide is also evaluated. (author)
Apodized grating coupler using fully-etched nanostructures

International Nuclear Information System (INIS)

Wu Hua; Li Chong; Guo Xia; Li Zhi-Yong

2016-01-01

A two-dimensional apodized grating coupler for interfacing between single-mode fiber and photonic circuit is demonstrated in order to bridge the mode gap between the grating coupler and optical fiber. The grating grooves of the grating couplers are realized by columns of fully etched nanostructures, which are utilized to digitally tailor the effective refractive index of each groove in order to obtain the Gaussian-like output diffractive mode and then enhance the coupling efficiency. Compared with that of the uniform grating coupler, the coupling efficiency of the apodized grating coupler is increased by 4.3% and 5.7%, respectively, for the nanoholes and nanorectangles used as the refractive index tuning layer. (paper)
Low Loss 1×2 Optical Coupler Based on Cosine S-bend with Segmented Waveguides

Science.gov (United States)

Yulianti, Ian; Sahmah, Abu; Supa'at, M.; Idrus, Sevia M.; Ridwanto, Muhammad; Al-hetar, Abdulaziz M.

2011-05-01

This paper presents an optimization of a 1×2 polymer Y-junction optical coupler. The optimized optical coupler comprises a straight polymer waveguide as the input waveguide, a tapered waveguide, a modified cosine S-bend and a linear waveguide. At the branching point, N short waveguides with small width are introduced to reduce the evanescent field. At an operating wavelength of 1550 nm the excess loss of the coupler is ~0.18 dB. In terms of polarization dependent loss (PDL), the proposed coupler also shows good performance, with a PDL value of less than 0.015 dB for the wavelength range of 1470 nm-1550 nm. The proposed coupler could reduce excess loss by more than 25% compared to a conventional Y-junction optical coupler.
Testing Procedures and Results of the Prototype Fundamental Power Coupler for the Spallation Neutron Source

International Nuclear Information System (INIS)

M. Stirbet; I.E. Campisi; E.F. Daly; G.K. Davis; M. Drury; P. Kneisel; G. Myneni; T. Powers; W.J. Schneider; K.M. Wilson; Y. Kang; K.A. Cummings; T. Hardek

2001-01-01

High-power RF testing with peak power in excess of 500 kW has been performed on prototype Fundamental Power Couplers (FPC) for the Spallation Neutron Source superconducting (SNS) cavities. The testing followed the development of procedures for cleaning, assembling and preparing the FPC for installation in the test stand. The qualification of the couplers has occurred for the time being only in a limited set of conditions (travelling wave, 20 pps) as the available RF system and control instrumentation are under improvement
CW all optical self switching in nonlinear chalcogenide nano plasmonic directional coupler

Science.gov (United States)

Motamed-Jahromi, Leila; Hatami, Mohsen

2018-04-01

In this paper we obtain the coupling coefficient of a plasmonic directional coupler (PDC) made up of two parallel monolayer waveguides filled with highly nonlinear chalcogenide material for the TM mode in the continuous wave (CW) regime. In addition, we assume each waveguide acts as a perturbation to the other waveguide. Four nonlinear-coupled equations are derived. Transfer distances are numerically calculated and used for deriving the length of the all-optical switch. The length of the designed switch is in the range of 10-1000 μm, and the switching power is in the range of 1-100 W/m. The obtained values are suitable for designing all-optical elements in integrated optical circuits.
Fiber-chip edge coupler with large mode size for silicon photonic wire waveguides.

Science.gov (United States)

Papes, Martin; Cheben, Pavel; Benedikovic, Daniel; Schmid, Jens H; Pond, James; Halir, Robert; Ortega-Moñux, Alejandro; Wangüemert-Pérez, Gonzalo; Ye, Winnie N; Xu, Dan-Xia; Janz, Siegfried; Dado, Milan; Vašinek, Vladimír

2016-03-07

Fiber-chip edge couplers are extensively used in integrated optics for coupling of light between planar waveguide circuits and optical fibers. In this work, we report on a new fiber-chip edge coupler concept with large mode size for silicon photonic wire waveguides. The coupler allows direct coupling with conventional cleaved optical fibers with large mode size while circumventing the need for lensed fibers. The coupler is designed for 220 nm silicon-on-insulator (SOI) platform. It exhibits an overall coupling efficiency exceeding 90%, as independently confirmed by 3D Finite-Difference Time-Domain (FDTD) and fully vectorial 3D Eigenmode Expansion (EME) calculations. We present two specific coupler designs, namely for a high numerical aperture single mode optical fiber with 6 µm mode field diameter (MFD) and a standard SMF-28 fiber with 10.4 µm MFD. An important advantage of our coupler concept is the ability to expand the mode at the chip edge without leading to high substrate leakage losses through buried oxide (BOX), which in our design is set to 3 µm. This remarkable feature is achieved by implementing in the SiO2 upper cladding thin high-index Si3N4 layers. The Si3N4 layers increase the effective refractive index of the upper cladding near the facet. The index is controlled along the taper by subwavelength refractive index engineering to facilitate adiabatic mode transformation to the silicon wire waveguide while the Si-wire waveguide is inversely tapered along the coupler. The mode overlap optimization at the chip facet is carried out with a full vectorial mode solver. The mode transformation along the coupler is studied using 3D-FDTD simulations and with fully-vectorial 3D-EME calculations. The couplers are optimized for operating with transverse electric (TE) polarization and the operating wavelength is centered at 1.55 µm.
L-shaped fiber-chip grating couplers with high directionality and low reflectivity fabricated with deep-UV lithography.

Science.gov (United States)

Benedikovic, Daniel; Alonso-Ramos, Carlos; Pérez-Galacho, Diego; Guerber, Sylvain; Vakarin, Vladyslav; Marcaud, Guillaume; Le Roux, Xavier; Cassan, Eric; Marris-Morini, Delphine; Cheben, Pavel; Boeuf, Frédéric; Baudot, Charles; Vivien, Laurent

2017-09-01

Grating couplers enable position-friendly interfacing of silicon chips by optical fibers. The conventional coupler designs call upon comparatively complex architectures to afford efficient light coupling to sub-micron silicon-on-insulator (SOI) waveguides. Conversely, the blazing effect in double-etched gratings provides high coupling efficiency with reduced fabrication intricacy. In this Letter, we demonstrate for the first time, to the best of our knowledge, the realization of an ultra-directional L-shaped grating coupler, seamlessly fabricated by using 193 nm deep-ultraviolet (deep-UV) lithography. We also include a subwavelength index engineered waveguide-to-grating transition that provides an eight-fold reduction of the grating reflectivity, down to 1% (-20 dB). A measured coupling efficiency of -2.7 dB (54%) is achieved, with a bandwidth of 62 nm. These results open promising prospects for the implementation of efficient, robust, and cost-effective coupling interfaces for sub-micrometric SOI waveguides, as desired for large-volume applications in silicon photonics.
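The record above quotes figures both in decibels and as percentages (-2.7 dB as 54%, -20 dB as 1%). The short sketch below shows the standard power-ratio conversion behind such pairs; the function names are illustrative choices for this example and do not come from the cited work.

```python
import math


def db_to_fraction(db):
    """Convert a power ratio in decibels to a linear fraction."""
    return 10 ** (db / 10)


def fraction_to_db(fraction):
    """Convert a linear power fraction to decibels."""
    return 10 * math.log10(fraction)


coupling_efficiency = db_to_fraction(-2.7)   # about 0.537, i.e. roughly 54%
grating_reflectivity = db_to_fraction(-20)   # 0.01, i.e. 1%
```

The same conversion applies to any insertion, excess, or conversion loss quoted in dB elsewhere in this listing.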
Kemari: A Portable High Performance Fortran System for Distributed Memory Parallel Processors

Directory of Open Access Journals (Sweden)

T. Kamachi

1997-01-01

We have developed a compilation system which extends High Performance Fortran (HPF) in various aspects. We support the parallelization of well-structured problems with loop distribution and alignment directives similar to HPF's data distribution directives. Such directives give both additional control to the user and simplify the compilation process. For the support of unstructured problems, we provide directives for dynamic data distribution through user-defined mappings. The compiler also allows integration of message-passing interface (MPI) primitives. The system is part of a complete programming environment which also comprises a parallel debugger and a performance monitor and analyzer. After an overview of the compiler, we describe the language extensions and related compilation mechanisms in detail. Performance measurements demonstrate the compiler's applicability to a variety of application classes.
Study of thermal interaction between a 150 kW CW power coupler and a superconducting 704 MHz elliptical cavity

Energy Technology Data Exchange (ETDEWEB)

Souli, M. [Institut de Physique Nucleaire d' Orsay, CNRS/IN2P3, Orsay (France)]. E-mail: souli@ipno.in2p3.fr; Fouaidy, M. [Institut de Physique Nucleaire d' Orsay, CNRS/IN2P3, Orsay (France); Saugnac, H. [Institut de Physique Nucleaire d' Orsay, CNRS/IN2P3, Orsay (France); Szott, P. [Institut de Physique Nucleaire d' Orsay, CNRS/IN2P3, Orsay (France); Gandolfo, N. [Institut de Physique Nucleaire d' Orsay, CNRS/IN2P3, Orsay (France); Bousson, S. [Institut de Physique Nucleaire d' Orsay, CNRS/IN2P3, Orsay (France); Braud, D. [CEA Saclay, DSM/DAPNIA/SACM, 91191 Gif sur Yvette (France); Charrier, J.P. [CEA Saclay, DSM/DAPNIA/SACM, 91191 Gif sur Yvette (France); Roudier, D. [CEA Saclay, DSM/DAPNIA/SACM, 91191 Gif sur Yvette (France); Sahuquet, P. [CEA Saclay, DSM/DAPNIA/SACM, 91191 Gif sur Yvette (France); Visentin, B. [CEA Saclay, DSM/DAPNIA/SACM, 91191 Gif sur Yvette (France)

2006-07-15

The power coupler needed for β = 0.65 SRF elliptical cavities dedicated to the driver of XADS (eXperimental Accelerator Driven System) should transmit a CW RF power of 150 kW to a 10 mA proton beam. The estimated average values of the RF losses in the coupler are 130 W (respectively 46 W) for the inner (respectively outer) conductor in SW mode. Due to such high values of the RF losses, it is necessary to very carefully design and optimize the cooling circuits of the coupler in order to efficiently remove the generated heat and to reduce the thermal load to the cavity operating at T = 2 K. An experiment simulating the thermal interaction between the power coupler and a 704 MHz SRF five-cell cavity was performed in the CRYHOLAB test facility in order to determine the critical heat load that can be sustained by the cavity without degradation of its RF performance. Experimental data are compared to numerical simulation results obtained with the Finite Element Method code COSMOS/M. These data allow us also to perform in situ measurements of the thermal parameters needed in the thermal model of the coupler (thermal conductivity, thermal contact resistance). These data are used to validate numerical simulations.
Raman probes based on optically-poled double-clad fiber and coupler

DEFF Research Database (Denmark)

Brunetti, Anna Chiara; Margulis, Walter; Rottwitt, Karsten

2012-01-01

Two fiber Raman probes are presented, one based on an optically-poled double-clad fiber and the second based on an optically-poled double-clad fiber coupler. Optical poling of the core of the fiber allows for the generation of enough 532 nm light to perform Raman spectroscopy of a sample of dimethyl sulfoxide (DMSO), when illuminating the waveguide with 1064 nm laser light. The Raman signal is collected in the inner cladding, from which it is retrieved with either a bulk dichroic mirror or a double-clad fiber coupler. The coupler allows for a substantial reduction of the fiber...
Cancellation of RF Coupler-Induced Emittance Due to Astigmatism

Energy Technology Data Exchange (ETDEWEB)

Dowell, David H.; /SLAC

2016-12-11

It is well known that the electron beam quality required for applications such as FELs and ultra-fast electron diffraction can be degraded by the asymmetric fields introduced by the RF couplers of superconducting linacs. This effect is especially troublesome in the injector where the low energy beam from the gun is captured into the first high gradient accelerator section. Unfortunately modifying the established cavity design is expensive and time consuming, especially considering that only one or two sections are needed for an injector. Instead, it is important to analyze the coupler fields to understand their characteristics and help find less costly solutions for their cancellation and mitigation. This paper finds the RF coupler-induced emittance for short bunches is mostly due to the transverse spatial sloping or tilt of the field, rather than the field's time-dependence. It is shown that the distorting effects of the coupler can be canceled with a static (DC) quadrupole lens rotated about the z-axis.
Analysis of the rectangular resonator with butterfly MMI coupler using SOI

Science.gov (United States)

Kim, Sun-Ho; Park, Jun-Hee; Kim, Eudum; Jeon, Su-Jin; Kim, Ji-Hoon; Choi, Young-Wan

2018-02-01

We propose a rectangular resonator sensor structure with a butterfly MMI coupler using SOI. It consists of the rectangular resonator, a total internal reflection (TIR) mirror, and the butterfly MMI coupler. The rectangular resonator is expected to be used in bio and chemical sensors because of the advantages of using an MMI coupler and the absence of bending loss, unlike ring resonators. The butterfly MMI coupler can miniaturize the device compared to a conventional MMI by using a linear butterfly shape instead of a square in the MMI part. The width, height, and slab height of the rib type waveguide are designed to be 1.5 μm, 1.5 μm, and 0.9 μm, respectively. This structure is designed as a single mode. When designing the TIR mirror, we considered the Goos-Hänchen shift and critical angle. We designed a 3:1 MMI coupler because the rectangular resonator has no bending loss. The width of the MMI is designed to be 4.5 μm and we optimize the length of the butterfly MMI coupler using the finite-difference time-domain (FDTD) method for a higher Q-factor. It has performance equal to that of a conventional MMI even though the length is reduced by 1/3. As a result of the simulation, a Q-factor of 7381 is obtained for the rectangular resonator.
Suppression of multipacting in high power RF couplers operating with superconducting cavities

Energy Technology Data Exchange (ETDEWEB)

Ostroumov, P.N., E-mail: ostroumov@frib.msu.edu [Facility for Rare Isotope Beams (FRIB), Michigan State University, East Lansing, MI 48824 (United States); Kazakov, S. [Fermi National Accelerator Laboratory, Batavia, IL 60510 (United States); Morris, D.; Larter, T.; Plastun, A.S.; Popielarski, J.; Wei, J.; Xu, T. [Facility for Rare Isotope Beams (FRIB), Michigan State University, East Lansing, MI 48824 (United States)

2017-06-01

Capacitive input couplers based on a 50 Ω coaxial transmission line are frequently used to transmit RF power to superconducting (SC) resonators operating in CW mode. It is well known that coaxial transmission lines are prone to the multipacting phenomenon over a wide range of RF power levels and operating frequencies. The Facility for Rare Isotope Beams (FRIB) being constructed at Michigan State University includes two types of quarter wave SC resonators (QWR) operating at 80.5 MHz and two types of half wave SC resonators (HWR) operating at 322 MHz. As was reported in ref. [1], a capacitive input coupler used with HWRs was experiencing strong multipacting that resulted in a long conditioning time prior to cavity testing at design levels of accelerating fields. We have developed an insert into the 50 Ω coaxial transmission line that provides the opportunity to bias the RF coupler antenna and protect the amplifier from the bias potential in the case of breakdown in DC isolation. Two such devices have been built and are currently used for the off-line testing of 8 HWRs installed in the cryomodule.
OBLIMAP 2.0: a fast climate model-ice sheet model coupler including online embeddable mapping routines

Science.gov (United States)

Reerink, Thomas J.; van de Berg, Willem Jan; van de Wal, Roderik S. W.

2016-11-01

This paper accompanies the second OBLIMAP open-source release. The package is developed to map climate fields between a general circulation model (GCM) and an ice sheet model (ISM) in both directions by using optimal aligned oblique projections, which minimize distortions. The curvature of the surfaces of the GCM and ISM grid differ, both grids may be irregularly spaced and the ratio of the grids is allowed to differ largely. OBLIMAP's stand-alone version is able to map data sets that differ in various aspects on the same ISM grid. Each grid may either coincide with the surface of a sphere, an ellipsoid or a flat plane, while the grid types might differ. Re-projection of, for example, ISM data sets is also facilitated. This is demonstrated by relevant applications concerning the major ice caps. As the stand-alone version also applies to the reverse mapping direction, it can be used as an offline coupler. Furthermore, OBLIMAP 2.0 is an embeddable GCM-ISM coupler, suited for high-frequency online coupled experiments. A new fast scan method is presented for structured grids as an alternative for the former time-consuming grid search strategy, realising a performance gain of several orders of magnitude and enabling the mapping of high-resolution data sets with a much larger number of grid nodes. Further, a highly flexible masked mapping option is added. The limitation of the fast scan method with respect to unstructured and adaptive grids is discussed together with a possible future parallel Message Passing Interface (MPI) implementation.
OBLIMAP 2.0: a fast climate model–ice sheet model coupler including online embeddable mapping routines

Directory of Open Access Journals (Sweden)

T. J. Reerink

2016-11-01

This paper accompanies the second OBLIMAP open-source release. The package is developed to map climate fields between a general circulation model (GCM) and an ice sheet model (ISM) in both directions by using optimal aligned oblique projections, which minimize distortions. The curvature of the surfaces of the GCM and ISM grid differ, both grids may be irregularly spaced and the ratio of the grids is allowed to differ largely. OBLIMAP's stand-alone version is able to map data sets that differ in various aspects on the same ISM grid. Each grid may either coincide with the surface of a sphere, an ellipsoid or a flat plane, while the grid types might differ. Re-projection of, for example, ISM data sets is also facilitated. This is demonstrated by relevant applications concerning the major ice caps. As the stand-alone version also applies to the reverse mapping direction, it can be used as an offline coupler. Furthermore, OBLIMAP 2.0 is an embeddable GCM–ISM coupler, suited for high-frequency online coupled experiments. A new fast scan method is presented for structured grids as an alternative for the former time-consuming grid search strategy, realising a performance gain of several orders of magnitude and enabling the mapping of high-resolution data sets with a much larger number of grid nodes. Further, a highly flexible masked mapping option is added. The limitation of the fast scan method with respect to unstructured and adaptive grids is discussed together with a possible future parallel Message Passing Interface (MPI) implementation.

Ultrashort hybrid metal-insulator plasmonic directional coupler.

Science.gov (United States)

Noghani, Mahmoud Talafi; Samiei, Mohammad Hashem Vadjed

2013-11-01

An ultrashort plasmonic directional coupler based on the hybrid metal-insulator slab waveguide is proposed and analyzed at the telecommunication wavelength of 1550 nm. It is first analyzed using the supermode theory based on mode analysis via the transfer matrix method in the interaction region. Then the 2D model of the coupler, including transition arms, is analyzed using a commercial finite-element method simulator. The hybrid slab waveguide is composed of a metallic layer of silver and two dielectric layers of silica (SiO2) and silicon (Si). The coupler is optimized to have a minimum coupling length and to transfer maximum power considering the layer thicknesses as optimization variables. The resulting coupling length in the submicrometer region along with a noticeable power transfer efficiency are advantages of the proposed coupler compared to previously reported plasmonic couplers.
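In supermode theory, the coupling length of a two-waveguide directional coupler follows from the beat length of the even and odd supermodes, L_c = λ / (2 |n_even - n_odd|). The sketch below evaluates that textbook relation and the ideal lossless power transfer sin²(πz / (2 L_c)). The effective indices used are hypothetical placeholders chosen only to give a submicrometer result; they are not values from the record above, and propagation loss, which matters in plasmonic structures, is ignored.

```python
import math


def coupling_length_nm(wavelength_nm, n_even, n_odd):
    """Coupling length from the even/odd supermode effective indices."""
    return wavelength_nm / (2 * abs(n_even - n_odd))


def transferred_fraction(z_nm, coupling_length):
    """Fraction of power in the second waveguide after distance z.

    Assumes an ideal, lossless, synchronous coupler.
    """
    return math.sin(math.pi * z_nm / (2 * coupling_length)) ** 2


# Hypothetical effective indices, for illustration only.
N_EVEN = 2.9
N_ODD = 1.9

l_c = coupling_length_nm(1550, N_EVEN, N_ODD)   # 775 nm for these inputs
full_transfer = transferred_fraction(l_c, l_c)  # 1.0 in the lossless ideal
```

A larger index difference between the two supermodes gives a shorter coupling length, which is why strongly confined hybrid modes can shrink the coupler.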
9th International Workshop on Parallel Tools for High Performance Computing

CERN Document Server

Hilbrich, Tobias; Niethammer, Christoph; Gracia, José; Nagel, Wolfgang; Resch, Michael

2016-01-01

High Performance Computing (HPC) remains a driver that offers huge potentials and benefits for science and society. However, a profound understanding of the computational matters and specialized software is needed to arrive at effective and efficient simulations. Dedicated software tools are important parts of the HPC software landscape, and support application developers. Even though a tool is by definition not a part of an application, but rather a supplemental piece of software, it can make a fundamental difference during the development of an application. Such tools aid application developers in the context of debugging, performance analysis, and code optimization, and therefore make a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools presented and discussed at the 9th International Parallel Tools Workshop held in Dresden, Germany, September 2-3, 2015, which offered an established forum for discussing the latest advances in paral...
Cryogenic cooler thermal coupler

International Nuclear Information System (INIS)

Green, K.E.; Talbourdet, J.A.

1984-01-01

A thermal coupler assembly mounted to the coldfinger of a cryogenic cooler provides improved thermal transfer between the coldfinger and the detector assembly mounted on the dewar endwell. The thermal coupler design comprises a stud and spring-loaded cap mounted on the coldfinger assembly. Thermal transfer is made primarily through the air space between the cap and coldwell walls along the radial surfaces. The cap is spring loaded to provide thermal contact between the cap and endwell end surfaces.
High performance parallel backprojection on FPGA

Energy Technology Data Exchange (ETDEWEB)

Pfanner, Florian; Knaup, Michael; Kachelriess, Marc [Erlangen-Nuernberg Univ., Erlangen (Germany). Inst. of Medical Physics (IMP)]

2011-07-01

Reconstruction of tomographic images, i.e., images from a Computed Tomography scanner, is a very time-consuming task. Most of the computing power is needed for the backprojection step. A closer inspection shows that the backprojection algorithm is easy to parallelize. FPGAs can execute many operations at the same time, so a highly parallel algorithm is a prerequisite for strong acceleration. To maximize the data flow rate, we implemented the backprojection as a pipelined structure with a data throughput of one clock cycle. Due to the hardware limitations of the FPGA, it is not possible to reconstruct the image as a whole, so the image must be split up and its parts reconstructed separately. Despite that, a reconstruction of 512 projections into a 512 × 512 image is calculated within 13 ms on a Virtex 5 FPGA. To save hardware resources, we use fixed-point arithmetic with an accuracy of 23 bit for the calculation. A comparison of the resulting image with an image calculated with floating-point arithmetic on a CPU shows no differences between the two. (orig.)
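The reason backprojection parallelizes well, and why an image can be reconstructed in separate parts, is that every pixel is accumulated independently of all others. The following is a minimal NumPy sketch of that idea, assuming unfiltered parallel-beam geometry with nearest-neighbour detector lookup. It is not the authors' FPGA pipeline and uses floating point rather than their fixed-point arithmetic; the function name and the tiling scheme are hypothetical.

```python
import numpy as np

def backproject_tile(sinogram, angles, n, rows, cols):
    """Unfiltered parallel-beam backprojection of one tile of an n x n image.

    sinogram: array of shape (num_angles, num_detectors)
    rows, cols: slice objects (explicit start and stop) selecting the tile
    """
    num_det = sinogram.shape[1]
    centre = (n - 1) / 2.0
    ys, xs = np.mgrid[rows, cols]
    x = xs - centre
    y = ys - centre
    tile = np.zeros(x.shape)
    for proj, theta in zip(sinogram, angles):
        # Detector coordinate hit by each pixel (nearest-neighbour lookup).
        t = x * np.cos(theta) + y * np.sin(theta) + (num_det - 1) / 2.0
        idx = np.clip(np.rint(t).astype(int), 0, num_det - 1)
        tile += proj[idx]
    return tile

# Reconstruct an 8 x 8 image as a whole and as two separate halves.
rng = np.random.default_rng(0)
sino = rng.random((6, 12))
angles = np.linspace(0.0, np.pi, 6, endpoint=False)
full = backproject_tile(sino, angles, 8, slice(0, 8), slice(0, 8))
top = backproject_tile(sino, angles, 8, slice(0, 4), slice(0, 8))
bottom = backproject_tile(sino, angles, 8, slice(4, 8), slice(0, 8))
print(np.allclose(np.vstack([top, bottom]), full))
```

Because the tiles share no state, they can be processed one after another on limited hardware or concurrently on parallel units without changing the result.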
Magnetic field sensor based on cascaded microfiber coupler with magnetic fluid

Energy Technology Data Exchange (ETDEWEB)

Mao, Lianmin; Su, Delong; Wang, Zhaofang [College of Science, University of Shanghai for Science and Technology, Shanghai 200093 (China)]; Pu, Shengli, E-mail: shlpu@usst.edu.cn [College of Science, University of Shanghai for Science and Technology, Shanghai 200093 (China); Shanghai Key Laboratory of Modern Optical System, University of Shanghai for Science and Technology, Shanghai 200093 (China)]; Zeng, Xianglong [The Key Lab of Specialty Fiber Optics and Optical Access Network, Shanghai University, Shanghai 200072 (China)]; Lahoubi, Mahieddine [Laboratory L.P.S., Department of Physics, Faculty of Sciences, Badji-Mokhtar Annaba University, Annaba 23000 (Algeria)]

2016-09-07

A magnetic field sensor based on a cascaded microfiber coupler with magnetic fluid is proposed and experimentally demonstrated. The magnetic fluid is used as the cladding of the fused regions of the cascaded microfiber coupler. Because the interference valley wavelength of the sensing structure is sensitive to changes in the surroundings, and the refractive index of the magnetic fluid depends on the magnetic field, the proposed structure can be employed for magnetic field sensing. The effective coupling length for each coupling region of the as-fabricated cascaded microfiber coupler is 6031 μm. The achieved sensitivity is 125 pm/Oe, which is about three times larger than that of a previously reported similar structure based on a single microfiber coupler. Experimental results indicate that the sensitivity can be easily improved by increasing the effective coupling length or cascading more microfiber couplers. The proposed magnetic field sensor is attractive due to its low cost, immunity to electromagnetic interference, and high sensitivity, and it also has potential in other tunable all-fiber photonic devices, such as filters.
High-performance parallel approaches for three-dimensional light detection and ranging point clouds gridding

Science.gov (United States)

Rizki, Permata Nur Miftahur; Lee, Heezin; Lee, Minsu; Oh, Sangyoon

2017-01-01

With the rapid advance of remote sensing technology, the amount of three-dimensional point-cloud data has increased extraordinarily, requiring faster processing in the construction of digital elevation models. There have been several attempts to accelerate the computation using parallel methods; however, little attention has been given to investigating different approaches for selecting the most suited parallel programming model for a given computing environment. We present our findings and insights identified by implementing three popular high-performance parallel approaches (message passing interface, MapReduce, and GPGPU) on time demanding but accurate kriging interpolation. The performances of the approaches are compared by varying the size of the grid and input data. In our empirical experiment, we demonstrate the significant acceleration by all three approaches compared to a C-implemented sequential-processing method. In addition, we also discuss the pros and cons of each method in terms of usability, complexity infrastructure, and platform limitation to give readers a better understanding of utilizing those parallel approaches for gridding purposes.
Design and optimization of mechanically down-doped terahertz fiber directional couplers

DEFF Research Database (Denmark)

Bao, Hualong; Nielsen, Kristian; Rasmussen, Henrik K.

2014-01-01

We present a thorough practical design optimization of broadband low loss, terahertz (THz) photonic crystal fiber directional couplers in which the two cores are mechanically down-doped with a triangular array of air holes. A figure of merit taking both the 3-dB bandwidth and loss of the coupler into account is used for optimization of the structure parameters, given by the diameter and pitch of the cladding (d and Λ) and of the core (dc and Λc) air-hole structure. The coupler with Λ = 498.7 μm, d = 324.2 μm, Λc = 74.8 μm, and dc = 32.5 μm is found to have the best performance at a center frequency of 1 THz, with a bandwidth of 0.25 THz and a total device loss of 9.2 dB. The robustness of the optimum coupler to structural changes is investigated. © 2014 Optical Society of America.
Computational Performance of a Parallelized Three-Dimensional High-Order Spectral Element Toolbox

Science.gov (United States)

Bosshard, Christoph; Bouffanais, Roland; Clémençon, Christian; Deville, Michel O.; Fiétier, Nicolas; Gruber, Ralf; Kehtari, Sohrab; Keller, Vincent; Latt, Jonas

In this paper, a comprehensive performance review of an MPI-based high-order three-dimensional spectral element method C++ toolbox is presented. The focus is put on the performance evaluation of several aspects with a particular emphasis on the parallel efficiency. The performance evaluation is analyzed with help of a time prediction model based on a parameterization of the application and the hardware resources. A tailor-made CFD computation benchmark case is introduced and used to carry out this review, stressing the particular interest for clusters with up to 8192 cores. Some problems in the parallel implementation have been detected and corrected. The theoretical complexities with respect to the number of elements, to the polynomial degree, and to communication needs are correctly reproduced. It is concluded that this type of code has a nearly perfect speed up on machines with thousands of cores, and is ready to make the step to next-generation petaflop machines.
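The parallel efficiency that this abstract emphasizes can be computed from wall-clock timings as in the sketch below. It assumes the common strong-scaling definitions, with speedup and efficiency taken relative to the smallest core count that was run. The function name and the timings are made up for illustration and are not taken from the paper.

```python
def scaling_table(timings):
    """Relative speedup and parallel efficiency for strong-scaling runs.

    timings: dict mapping core count -> wall-clock time (same problem size).
    Returns dict mapping core count -> (speedup, efficiency), both measured
    against the run with the fewest cores.
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    table = {}
    for cores in sorted(timings):
        speedup = base_time / timings[cores]
        ideal = cores / base_cores
        table[cores] = (speedup, speedup / ideal)
    return table

# Hypothetical timings in seconds (illustration only).
runs = {1024: 80.0, 2048: 41.0, 4096: 21.5, 8192: 12.5}
for cores, (s, e) in scaling_table(runs).items():
    print(f"{cores:5d} cores: speedup {s:5.2f}, efficiency {e:4.2f}")
```

An efficiency close to 1.0 at the largest core count corresponds to the "nearly perfect speed up" described in the text.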
Compact broadband polarization beam splitter using a symmetric directional coupler with sinusoidal bends.

Science.gov (United States)

Zhang, Fan; Yun, Han; Wang, Yun; Lu, Zeqin; Chrostowski, Lukas; Jaeger, Nicolas A F

2017-01-15

We design and demonstrate a compact broadband polarization beam splitter (PBS) using a symmetric directional coupler with sinusoidal bends on a silicon-on-insulator platform. The sinusoidal bends in our PBS suppress the power exchange between two parallel symmetric strip waveguides for the transverse-electric (TE) mode, while allowing for the maximum power transfer to the adjacent waveguide for the transverse-magnetic (TM) mode. Our PBS has a nominal coupler length of 8.55 μm, and it has an average extinction ratio (ER) of 12.0 dB for the TE mode, an average ER of 20.1 dB for the TM mode, an average polarization isolation (PI) of 20.6 dB for the through port, and an average PI of 11.5 dB for the cross port, all over a bandwidth of 100 nm.
High-power RF window and coupler development for the PEP-II B Factory

International Nuclear Information System (INIS)

Neubauer, M.; Fant, K.; Hodgson, J.; Judkins, J.; Schwarz, H.; Rimmer, R.A.

1995-05-01

We describe the fabrication and testing of the RF windows designed to transmit power to the PEP-II 476 MHz cavities. Design choices to maximize the reliability of the window are discussed. Fabrication technologies for the window are described and finite-element analysis of the assembly process is presented. Conditioning and high-power testing of the window are discussed. Design of the coupler assembly including the integration of the window and other components is reported.
Efficient waveguide coupler based on metal materials

Science.gov (United States)

Wu, Wenjun; Yang, Junbo; Chang, Shengli; Zhang, Jingjing; Lu, Huanyu

2015-10-01

Because of the diffraction limit of light, the size of optical elements remains on the order of the wavelength, so optical interfaces and nano-electronic components cannot be directly matched, and the development of photonics technology has hit a bottleneck. To solve the problem of coupling light into a subwavelength waveguide, this paper proposes a coupler model based on metal materials. By using surface plasmon polariton (SPP) waves, incident light can be efficiently coupled into a waveguide with a diameter of less than 100 nm. The paper focuses mainly on the near-infrared band, tests a variety of combinations of metal materials, and varies the structural parameters to obtain the maximum coupling efficiency. The structure splits plane incident light with a wavelength of 864 nm and a width of 600 nm into two uniform beams and couples them separately into the waveguide layer, whose width is only about 80 nm; the highest coupling efficiency can exceed 95%. Using SPP structures is an effective way to break through the diffraction limit and achieve high-performance miniaturization of photonic devices. With the metal coupler, light can be further compressed into small-scale fibers or waveguides, saving space to hold more fiber or waveguide layers and thereby greatly improving the capacity of optical communication. In addition, high-performance miniaturization of the optical transmission medium can improve the integration of optical devices and provide a feasible route for future research and development of photonic computers.
Geometric optimisation of an accurate cosine correcting optic fibre coupler for solar spectral measurement

Science.gov (United States)

Cahuantzi, Roberto; Buckley, Alastair

2017-09-01

Making accurate and reliable measurements of solar irradiance is important for understanding performance in the photovoltaic energy sector. In this paper, we present design details and performance of a number of fibre optic couplers for use in irradiance measurement systems employing remote light sensors applicable for either spectrally resolved or broadband measurement. The angular and spectral characteristics of different coupler designs are characterised and compared with existing state-of-the-art commercial technology. The new coupler designs are fabricated from polytetrafluoroethylene (PTFE) rods and operate through forward scattering of incident sunlight on the front surfaces of the structure into an optic fibre located in a cavity to the rear of the structure. The PTFE couplers exhibit up to 4.8% variation in scattered transmission intensity between 425 nm and 700 nm and show minimal specular reflection, making the designs accurate and reliable over the visible region. Through careful geometric optimization, a near-perfect cosine dependence of the angular response of the coupler can be achieved. The PTFE designs represent a significant improvement over the state of the art with less than 0.01% error compared with ideal cosine response for angles of incidence up to 50°.
Quantitative analysis of coupler tuning

International Nuclear Information System (INIS)

Zheng Shuxin; Cui Yupeng; Chen Huaibi; Xiao Liling

2001-01-01

On the basis of the equivalent-circuit model of a coupled-cavity chain, the author derives equations for the coupler frequency deviation Δf and the coupling coefficient β, rather than only giving the direction of adjustment during coupler matching. Using these equations, automatic measurement and quantitative display are realized on a measuring system. This contributes to the industrialization of traveling-wave accelerators for large container inspection systems.
High performance parallel computing of flows in complex geometries: I. Methods

International Nuclear Information System (INIS)

Gourdain, N; Gicquel, L; Montagnac, M; Vermorel, O; Staffelbach, G; Garcia, M; Boussuge, J-F; Gazaix, M; Poinsot, T

2009-01-01

Efficient numerical tools coupled with high-performance computers have become a key element of the design process in the fields of energy supply and transportation. However, flow phenomena that occur in complex systems such as gas turbines and aircraft are still not understood, mainly because of the models that are needed. In fact, most computational fluid dynamics (CFD) predictions as found today in industry focus on a reduced or simplified version of the real system (such as a periodic sector) and are usually solved with a steady-state assumption. This paper shows how to overcome such barriers and how such a new challenge can be addressed by developing flow solvers running on high-end computing platforms, using thousands of computing cores. Parallel strategies used by modern flow solvers are discussed with particular emphasis on mesh-partitioning, load balancing and communication. Two examples are used to illustrate these concepts: a multi-block structured code and an unstructured code. Parallel computing strategies used with both flow solvers are detailed and compared. This comparison indicates that mesh-partitioning and load balancing are more straightforward with unstructured grids than with multi-block structured meshes. However, the mesh-partitioning stage can be challenging for unstructured grids, mainly due to memory limitations of the newly developed massively parallel architectures. Finally, detailed investigations show that the impact of mesh-partitioning on the numerical CFD solutions, due to rounding errors and block splitting, may be of importance and should be accurately addressed before qualifying massively parallel CFD tools for routine industrial use.
Apodized grating coupler using fully-etched nanostructures

Science.gov (United States)

Wu, Hua; Li, Chong; Li, Zhi-Yong; Guo, Xia

2016-08-01

A two-dimensional apodized grating coupler for interfacing between single-mode fiber and photonic circuit is demonstrated in order to bridge the mode gap between the grating coupler and optical fiber. The grating grooves of the grating couplers are realized by columns of fully etched nanostructures, which are utilized to digitally tailor the effective refractive index of each groove in order to obtain a Gaussian-like output diffractive mode and thereby enhance the coupling efficiency. Compared with that of the uniform grating coupler, the coupling efficiency of the apodized grating coupler is increased by 4.3% and 5.7% when nanoholes and nanorectangles, respectively, are used as the refractive-index tuning layer. Project supported by the National Natural Science Foundation of China (Grant Nos. 61222501, 61335004, and 61505003), the Specialized Research Fund for the Doctoral Program of Higher Education of China (Grant No. 20111103110019), the Postdoctoral Science Foundation of Beijing Funded Project, China (Grant No. Q6002012201502), and the Science and Technology Research Project of Jiangxi Provincial Education Department, China (Grant No. GJJ150998).
Development of photonic-crystal-fiber-based optical coupler with a broad operating wavelength range of 800 nm

International Nuclear Information System (INIS)

Yoon, Min-Seok; Kwon, Oh-Jang; Kim, Hyun-Joo; Chu, Su-Ho; Kim, Gil-Hwan; Lee, Sang-Bae; Han, Young-Geun

2010-01-01

We developed a broadband optical coupler based on a photonic crystal fiber (PCF), which is very useful for applications to optical coherence tomography (OCT). The PCF-based coupler is fabricated by using a fused biconical tapering (FBT) method. The PCF has six hexagonally-stacked layers of air holes. The PCF-based coupler has a nearly-flat 50/50 coupling ratio in a broad bandwidth range of 800 nm, which is much wider than that previously reported for a PCF-based coupler and a singlemode-fiber-based coupler. The bandwidth and the bandedge wavelength of the broadband coupler are controlled by changing the elongation length. The fabricated broadband optical coupler has great potential for realizing a broadband interferogram with a high resolution in an OCT system.
rf coupler technology for fusion applications

International Nuclear Information System (INIS)

Hoffman, D.J.

1983-01-01

Radio frequency (rf) oscillations at critical frequencies have successfully provided a means to convey power to fusion plasmas due to the electrical-magnetic properties of the plasma. While large rf systems to couple power to the plasma have been designed, built, and tested, the main link to the plasma, the coupler, is still in an evolutionary stage of development. Design and fabrication of optimal antennas for fusion applications are complicated by incomplete characterizations of the harsh plasma environment and of coupling mechanisms. A brief description of rf coupler technology required for plasma conditions is presented along with an assessment of the status and goals of coupler development.
Performance studies of the parallel VIM code

International Nuclear Information System (INIS)

Shi, B.; Blomquist, R.N.

1996-01-01

In this paper, the authors evaluate the performance of the parallel version of the VIM Monte Carlo code on the IBM SPx at the High Performance Computing Research Facility at ANL. Three test problems with contrasting computational characteristics were used to assess effects in performance. A statistical method for estimating the inefficiencies due to load imbalance and communication is also introduced. VIM is a large scale continuous energy Monte Carlo radiation transport program and was parallelized using history partitioning, the master/worker approach, and the p4 message passing library. Dynamic load balancing is accomplished when the master processor assigns chunks of histories to workers that have completed a previously assigned task, accommodating variations in the lengths of histories, processor speeds, and worker loads. At the end of each batch (generation), the fission sites and tallies are sent from each worker to the master process, contributing to the parallel inefficiency. All communications are between master and workers, and are serial. The SPx is a scalable 128-node parallel supercomputer with high-performance Omega switches of 63 microsec latency and 35 MBytes/sec bandwidth. For uniform and reproducible performance, they used only the 120 identical regular processors (IBM RS/6000) and excluded the remaining eight planet nodes, which may be loaded by other users' jobs.
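The dynamic load balancing described in this abstract, where the master hands the next chunk of histories to whichever worker has finished its previous chunk, can be illustrated with a small simulation. This sketch does not use p4 or any message passing. It only models completion times, ignores communication cost, and compares the result with a static round-robin split. The function names, chunk costs, and worker speeds are hypothetical.

```python
import heapq

def dynamic_schedule(chunk_costs, worker_speeds):
    """Simulate a master giving each chunk to the worker that is free first.

    Returns (makespan, chunks_done_per_worker).
    """
    # Heap entries: (time at which the worker becomes free, worker id).
    free_at = [(0.0, w) for w in range(len(worker_speeds))]
    heapq.heapify(free_at)
    done = [0] * len(worker_speeds)
    makespan = 0.0
    for cost in chunk_costs:
        t, w = heapq.heappop(free_at)
        t_end = t + cost / worker_speeds[w]
        done[w] += 1
        makespan = max(makespan, t_end)
        heapq.heappush(free_at, (t_end, w))
    return makespan, done

def static_schedule(chunk_costs, worker_speeds):
    """Baseline: chunks are dealt out round-robin before the run starts."""
    busy = [0.0] * len(worker_speeds)
    for i, cost in enumerate(chunk_costs):
        w = i % len(worker_speeds)
        busy[w] += cost / worker_speeds[w]
    return max(busy)

# Twelve equal chunks, one worker twice as fast as the other.
costs = [1.0] * 12
speeds = [1.0, 2.0]
print(dynamic_schedule(costs, speeds))
print(static_schedule(costs, speeds))
```

In this toy case the faster worker ends up processing twice as many chunks under dynamic assignment, so neither worker sits idle while the other finishes.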
Parameters that affect parallel processing for computational electromagnetic simulation codes on high performance computing clusters

Science.gov (United States)

Moon, Hongsik

What is the impact of multicore and associated advanced technologies on computational software for science? Most researchers and students have multicore laptops or desktops for their research and they need computing power to run computational software packages. Computing power was initially derived from Central Processing Unit (CPU) clock speed. That changed when increases in clock speed became constrained by power requirements. Chip manufacturers turned to multicore CPU architectures and associated technological advancements to create the CPUs for the future. Most software applications benefited from the increased computing power the same way that increases in clock speed helped applications run faster. However, for Computational ElectroMagnetics (CEM) software developers, this change was not an obvious benefit - it appeared to be a detriment. Developers were challenged to find a way to correctly utilize the advancements in hardware so that their codes could benefit. The solution was parallelization and this dissertation details the investigation to address these challenges. Prior to multicore CPUs, advanced computer technologies were compared using benchmark software, and the metric was FLoating-point Operations Per Second (FLOPS), which indicates system performance for scientific applications that make heavy use of floating-point calculations. Is FLOPS an effective metric for parallelized CEM simulation tools on new multicore systems? Parallel CEM software needs to be benchmarked not only by FLOPS but also by the performance of other parameters related to type and utilization of the hardware, such as CPU, Random Access Memory (RAM), hard disk, network, etc. The codes need to be optimized for more than just FLOPS and new parameters must be included in benchmarking. In this dissertation, the parallel CEM software named High Order Basis Based Integral Equation Solver (HOBBIES) is introduced. This code was developed to address the needs of the

A novel highly efficient grating coupler with large filling factor used for optoelectronic integration

International Nuclear Information System (INIS)

Zhou Liang; Li Zhi-Yong; Zhu Yu; Li Yun-Tao; Yu Yu-De; Yu Jin-Zhong; Fan Zhong-Cao; Han Wei-Hua

2010-01-01

A novel highly efficient grating coupler with large filling factor and deep etching is proposed in silicon-on-insulator for near-vertical coupling between the rib waveguide and optical fibre. The deep slots, acting as highly efficient scattering centres, are analysed and optimized. A coupling efficiency as high as 60% at the telecom wavelength of 1550 nm and a 3-dB bandwidth of 61 nm are predicted by simulation. A peak coupling efficiency of 42.1% at a wavelength of 1546 nm and a 3-dB bandwidth of 37.6 nm are obtained experimentally. (classical areas of phenomenology)
Study of a power coupler for superconducting RF cavities used in high intensity proton accelerator; Etude et developpement d'un coupleur de puissance pour les cavites supraconductrices destinees aux accelerateurs de protons de haute intensite

Energy Technology Data Exchange (ETDEWEB)

Souli, M

2007-07-15

The coaxial power coupler needed for superconducting RF cavities used in the high energy section of the EUROTRANS driver should transmit 150 kW (CW operation) RF power to the proton beam. The calculated RF and dielectric losses in the power coupler (inner and outer conductor, RF window) are relatively high. Consequently, it is necessary to design the cooling circuits very carefully in order to remove the generated heat and to ensure stable and reliable operating conditions for the coupler cavity system. After calculating all types of losses in the power coupler, we designed and validated the inner conductor cooling circuit using numerical simulation results. We also designed and optimized the outer conductor cooling circuit by establishing its hydraulic and thermal characteristics. Next, an experiment dedicated to studying the thermal interaction between the power coupler and the cavity was successfully performed at the CRYOHLAB test facility. The critical heat load Qc at which a strong degradation of the cavity RF performance occurs was measured, giving Qc in the range 3 W-5 W. The measured heat load will be considered as an upper limit of the residual heat flux at the outer conductor cold extremity. A dedicated test facility was developed and successfully operated for measuring the performance of the outer conductor heat exchanger using supercritical helium as coolant. The test cell used reproduces the realistic thermal boundary conditions of the power coupler mounted on the cavity in the cryo-module. The first experimental results have confirmed the excellent performance of the tested heat exchanger. The maximum residual heat flux measured was 60 mW for a 127 W thermal load. As the RF losses in the coupler are proportional to the incident RF power, we can deduce that the outer conductor heat exchanger performance is maintained up to 800 kW RF power. Heat exchanger thermal conductance has been identified using a 2D axisymmetric thermal model by comparing
Grating-assisted surface acoustic wave directional couplers

Science.gov (United States)

Golan, G.; Griffel, G.; Seidman, A.; Croitoru, N.

1991-07-01

Physical properties of novel grating-assisted Y directional couplers are examined using the coupled-mode theory. A general formalism for the analysis of the lateral perturbed directional coupler properties is presented. Explicit expressions for waveguide key parameters such as coupling length, grating period, and other structural characterizations, are obtained. The influence of other physical properties such as time and frequency response or cutoff conditions are also analyzed. A plane grating-assisted directional coupler is presented and examined as a basic component in the integrated acoustic technology.
Development of higher order mode couplers at Cornell

International Nuclear Information System (INIS)

Amato, J.C.

1988-01-01

Higher order mode (HOM) couplers are integral parts of a superconducting accelerator cavity. The damping which the couplers must provide is dictated by the frequency and shunt impedance of the cavity modes as well as by the stability requirements of the accelerator incorporating the cavities. Cornell's 5-cell 1500 MHz elliptical cavity was designed for use in a 50 x 50 GeV electron-positron storage ring with a total beam current of 3.5 mA (CESR-II). HOM couplers for the Cornell cavity were designed and evaluated with this machine in mind. The development of these couplers is described in this paper. 8 references, 8 figures
Time-Domain Simulation of RF Couplers

International Nuclear Information System (INIS)

Smithe, David; Carlsson, Johan; Austin, Travis

2009-01-01

We have developed a finite-difference time-domain (FDTD) fluid-like approach to integrated plasma-and-coupler simulation [1], and show how it can be used to model LH and ICRF couplers in the MST and larger tokamaks.[2] This approach permits very accurate 3-D representation of coupler geometry, and easily includes non-axi-symmetry in vessel wall, magnetic equilibrium, and plasma density. The plasma is integrated with the FDTD Maxwell solver in an implicit solve that steps over electron time-scales, and permits tenuous plasma in the coupler itself, without any need to distinguish or interface between different regions of vacuum and/or plasma. The FDTD algorithm is also generalized to incorporate a time-domain sheath potential [3] on metal structures within the simulation, to look for situations where the sheath potential might generate local sputtering opportunities. Benchmarking of the time-domain sheath algorithm has been reported in the references. Finally, the time-domain software [4] permits the use of particles, either as field diagnostic (test particles) or to self-consistently compute plasma current from the applied RF power.
Detuning related coupler kick variation of a superconducting nine-cell 1.3 GHz cavity

Science.gov (United States)

Hellert, Thorsten; Dohlus, Martin

2018-04-01

Superconducting TESLA-type cavities are widely used to accelerate electrons in long bunch trains, such as in high repetition rate free electron lasers. The TESLA cavity is equipped with two higher order mode couplers and a fundamental power coupler (FPC), which break the axial symmetry of the cavity. The passing electrons therefore experience axially asymmetrical coupler kicks, which depend on the transverse beam position at the couplers and the rf phase. The resulting emittance dilution has been studied in detail in the literature. However, the kick induced by the FPC depends explicitly on the ratio of the forward to the backward traveling waves at the coupler, which has received little attention. The intention of this paper is to present the concept of discrete coupler kicks with a novel approach of separating the field disturbances related to the standing wave and a reflection dependent part. Particular attention is directed to the role of the penetration depth of the FPC antenna, which determines the loaded quality factor of the cavity. The developed beam transport model is compared to dedicated experiments at FLASH and European XFEL. Both the observed transverse coupling and detuning related coupler kick variations are in good agreement with the model. Finally, the expected trajectory variations due to coupler kick variations at European XFEL are investigated and results of numerical studies are presented.
High performance parallel computing of flows in complex geometries: II. Applications

International Nuclear Information System (INIS)

Gourdain, N; Gicquel, L; Staffelbach, G; Vermorel, O; Duchaine, F; Boussuge, J-F; Poinsot, T

2009-01-01

Present regulations in terms of pollutant emissions, noise and economic constraints require new approaches and designs in the fields of energy supply and transportation. It is now well established that the next breakthrough will come from a better understanding of unsteady flow effects and by considering the entire system and not only isolated components. However, these aspects are still not well taken into account by the numerical approaches or understood, whatever the design stage considered. The main challenge is essentially due to the computational requirements of such complex systems if they are to be simulated on supercomputers. This paper shows how new challenges can be addressed by using parallel computing platforms for distinct elements of more complex systems as encountered in aeronautical applications. Based on numerical simulations performed with modern aerodynamic and reactive flow solvers, this work underlines the interest of high-performance computing for solving flow in complex industrial configurations such as aircraft, combustion chambers and turbomachines. Performance indicators related to parallel computing efficiency are presented, showing that establishing fair criteria is a difficult task for complex industrial applications. Examples of numerical simulations performed in industrial systems are also described with a particular interest for the computational time and the potential design improvements obtained with high-fidelity and multi-physics computing methods. These simulations use either unsteady Reynolds-averaged Navier-Stokes methods or large eddy simulation and deal with turbulent unsteady flows, such as coupled flow phenomena (thermo-acoustic instabilities, buffet, etc). Some examples of the difficulties with grid generation and data analysis are also presented when dealing with these complex industrial applications.
The ongoing investigation of high performance parallel computing in HEP

CERN Document Server

Peach, Kenneth J; Böck, R K; Dobinson, Robert W; Hansroul, M; Norton, Alan Robert; Willers, Ian Malcolm; Baud, J P; Carminati, F; Gagliardi, F; McIntosh, E; Metcalf, M; Robertson, L; CERN. Geneva. Detector Research and Development Committee

1993-01-01

Past and current exploitation of parallel computing in High Energy Physics is summarized and a list of R & D projects in this area is presented. The applicability of new parallel hardware and software to physics problems is investigated, in the light of the requirements for computing power of LHC experiments and the current trends in the computer industry. Four main themes are discussed (possibilities for a finer grain of parallelism; fine-grain communication mechanism; usable parallel programming environment; different programming models and architectures, using standard commercial products). Parallel computing technology is potentially of interest for offline and vital for real time applications in LHC. A substantial investment in applications development and evaluation of state of the art hardware and software products is needed. A solid development environment is required at an early stage, before mainline LHC program development begins.
Simple, parallel, high-performance virtual machines for extreme computations

International Nuclear Information System (INIS)

Chokoufe Nejad, Bijan; Ohl, Thorsten; Reuter, Jurgen

2014-11-01

We introduce a high-performance virtual machine (VM) written in a numerically fast language like Fortran or C to evaluate very large expressions. We discuss the general concept of how to perform computations in terms of a VM and present specifically a VM that is able to compute tree-level cross sections for any number of external legs, given the corresponding byte code from the optimal matrix element generator, O'Mega. Furthermore, this approach allows one to formulate the parallel computation of a single phase space point in a simple and obvious way. We analyze the scaling behaviour with multiple threads as well as the benefits and drawbacks that are introduced with this method. Our implementation of a VM can run faster than the corresponding native, compiled code for certain processes and compilers, especially for very high multiplicities, and in general has runtimes of the same order of magnitude. By avoiding the tedious compile and link steps, which may fail for source code files of gigabyte sizes, new processes or complex higher order corrections that are currently out of reach could be evaluated with a VM given enough computing power.
Design of electro-absorption modulator with tapered-mode coupler on the GeSi layer

International Nuclear Information System (INIS)

Li, Ym; Cheng, Bw

2013-01-01

A tapered-mode coupler integrated GeSi electro-absorption (EA) modulator is investigated theoretically. To improve the parameter insensitivity and modulation efficiency of the GeSi EA modulator based on evanescent coupling, a tapered coupler on the GeSi layer is introduced in our design. The two coupling mechanisms in our suggested structure are compared. Both the beam propagation method (BPM) calculation and coupling mode theory show almost 100% power transfer from the bottom rib waveguide to the GeSi layer. After a series of designs of the tapered coupler, we get a modulator with the advantages of both evanescent-coupling modulators (Feng et al 2011 Opt. Express 19 7062–7, Feng et al 2012 Opt. Express 20 22224–32, Liu et al 2008 Nature Photon. 2 433–7, Liu et al 2007 Opt. Express 15 623–8) and butt-coupling modulators (Lim et al 2011 Opt. Express 19 5040–6), namely ease of fabrication, low coupling loss, performance stability and high modulation efficiency. (paper)
MAFIA simulation and cold model test of three types of bridge coupler

International Nuclear Information System (INIS)

Chang, C.R.; Yao, C.G.; Swenson, D.A.; Funk, L.W.; Raparia, D.

1992-01-01

In the new design of the SSC CCL, the total number of bridge couplers has increased from 50 to 63, and their maximum length increased from 37.2 to 46.1 cm. Choosing a bridge coupler that gives maximum coupling, minimum power flow, phase shift and fabrication cost becomes important. The conventional TM010 single cavity bridge coupler used in LAMPF and Fermilab will have a severe mode-mixing problem when the bridge length is over 30 cm, and the coupling is very weak. Three types of bridge coupler have been proposed: (1) TM012 single cavity bridge coupler; (2) electrically coupled multi-cavity bridge coupler and (3) magnetically coupled multi-cavity bridge coupler. This paper presents both MAFIA simulations and cold model test results. Each bridge coupler has its unique characteristics with advantages and disadvantages, but all three are superior to the conventional coupler. (Author) 6 figs., tab., 2 refs
A High-Performance Parallel FDTD Method Enhanced by Using SSE Instruction Set

Directory of Open Access Journals (Sweden)

Dau-Chyrh Chang

2012-01-01

We introduce a hardware acceleration technique for the parallel finite-difference time-domain (FDTD) method using the SSE (Streaming SIMD (single instruction, multiple data) Extensions) instruction set. Applying the SSE instruction set to the parallel FDTD method has achieved a significant improvement in simulation performance. The benchmarks of the SSE acceleration on both a multi-CPU workstation and a computer cluster have demonstrated the advantages of vector arithmetic logic unit (VALU) acceleration over GPU acceleration. Several engineering applications are employed to demonstrate the performance of the parallel FDTD method enhanced by the SSE instruction set.
A linear atomic quantum coupler

Energy Technology Data Exchange (ETDEWEB)

El-Orany, Faisal A A [Department of Mathematics and computer Science, Faculty of Science, Suez Canal University 41522, Ismailia (Egypt); Wahiddin, M R B, E-mail: el_orany@hotmail.co, E-mail: faisal.orany@mimos.m, E-mail: mridza@mimos.m [Cyberspace Security Laboratory, MIMOS Berhad, Technology Park Malaysia, 57000 Kuala Lumpur (Malaysia)

2010-04-28

In this paper we develop the notion of the linear atomic quantum coupler. This device consists of two modes propagating into two waveguides, each of which includes a localized atom. These waveguides are placed close enough to allow exchange of energy between them via evanescent waves. Each mode interacts with the atom in the same waveguide in the standard way as the Jaynes-Cummings model (JCM) and with the atom-mode system in the second waveguide via the evanescent wave. We present the Hamiltonian for this system and deduce its wavefunction. We investigate the atomic inversions and the second-order correlation function. In contrast to the conventional coupler the atomic quantum coupler is able to generate nonclassical effects. The atomic inversions can exhibit a long revival-collapse phenomenon as well as subsidiary revivals based on the competition among the switching mechanisms in the system. Finally, under certain conditions the system can yield the results of the two-mode JCM.
Venous coupler use for free-flap breast reconstructions: specific analyses of TMG and DIEP flaps.

Science.gov (United States)

Bodin, Frédéric; Brunetti, Stefania; Dissaux, Caroline; Erik, A Sauleau; Facca, Sybille; Bruant-Rodier, Catherine; Liverneaux, Philippe

2015-05-01

The purpose of this report was to present the results of comparisons of anastomotic data and flap complications in the use of venous coupler in breast reconstruction with the transverse musculocutaneous gracilis (TMG) flap and the deep inferior epigastric perforator (DIEP) flap. Over a three-year period, 95 patients suffering from breast cancer were treated with mastectomy and breast reconstruction using free flaps. We performed 121 mechanical venous anastomoses for 105 flap procedures (80 DIEP and 25 TMG). The coupler size, anastomotic duration, number of anastomoses and postoperative complications were assessed for the entire series. The coupling device was perfectly suitable for all end-to-end anastomoses between the vein(s) of the flap and the internal mammary vein(s). No venous thrombosis occurred. The mean anastomotic time did not significantly differ between the DIEP (330 seconds) and TMG flap procedures (352 seconds) (P = 0.069). Additionally, there were no differences in coupling time observed following a comparison of seven coupler sizes (P = 0.066). The mean coupler size used during the TMG flap procedure was smaller than that used with the DIEP (2.4 mm versus 2.8 mm) (P TMG flap (28%) than with the DIEP flap (11%). The coupler size used was smaller for the TMG procedure and when double venous anastomosis was performed. Additionally, anastomotic time was not affected by the flap type or coupler size used or by anastomosis number. © 2014 Wiley Periodicals, Inc.
Multi-layered dielectric cladding plasmonic microdisk resonator filter and coupler

International Nuclear Information System (INIS)

Han Cheng, Bo; Lan, Yung-Chiang

2013-01-01

This work develops the plasmonic microdisk filter/coupler, whose effectiveness is evaluated by finite-difference time-domain simulation and theoretical analyses. Multi-layer dielectric cladding is used to prevent the scattering of surface plasmons (SPs) from a silver microdisk. This method allows devices that efficiently perform filter/coupler functions to be developed. The resonant conditions and the effective refractive index of bounded SP modes on the microdisk are determined herein. The waveguide-to-microdisk distance barely influences the resonant wavelength but it is inversely related to the bandwidth. These findings are consistent with predictions made using the typical ring resonator model.
Wireless power transfer magnetic couplers

Science.gov (United States)

Wu, Hunter; Gilchrist, Aaron; Sealy, Kylee

2016-01-19

A magnetic coupler is disclosed for wireless power transfer systems. A ferrimagnetic component is capable of guiding a magnetic field. A wire coil is wrapped around at least a portion of the ferrimagnetic component. A screen is capable of blocking leakage magnetic fields. The screen may be positioned to cover at least one side of the ferrimagnetic component and the coil. A distance across the screen may be at least six times an air gap distance between the ferrimagnetic component and a receiving magnetic coupler.
High efficiency and broad bandwidth grating coupler between nanophotonic waveguide and fibre

International Nuclear Information System (INIS)

Yu, Zhu; Xue-Jun, Xu; Zhi-Yong, Li; Liang, Zhou; Yu-De, Yu; Jin-Zhong, Yu; Wei-Hua, Han; Zhong-Chao, Fan

2010-01-01

A high efficiency and broad bandwidth grating coupler between a silicon-on-insulator (SOI) nanophotonic waveguide and fibre is designed and fabricated. Coupling efficiencies of 46% and 25% at a wavelength of 1.55 μm are achieved by simulation and experiment, respectively. An optical 3 dB bandwidth of 45 nm from 1530 nm to 1575 nm is also obtained in experiment. Numerical calculation shows that a tolerance to fabrication error of 10 nm in etch depth is achievable. The measurement results indicate that the alignment error of ±2 μm results in less than 1 dB additional coupling loss. (classical areas of phenomenology)
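The efficiencies quoted in this record are given in percent, while the alignment tolerance is given in dB. As a reading aid, the following minimal Python sketch converts between the two using the standard power-ratio definition of the decibel. The function names are illustrative and do not come from the paper.

```python
import math

def efficiency_to_db(efficiency):
    """Convert a power coupling efficiency (0 < efficiency <= 1) to dB."""
    return 10.0 * math.log10(efficiency)

def db_to_efficiency(db):
    """Convert a power ratio in dB back to a linear efficiency."""
    return 10.0 ** (db / 10.0)

if __name__ == "__main__":
    # Figures quoted in the abstract: 46% (simulation) and 25% (experiment).
    for label, eff in (("simulation", 0.46), ("experiment", 0.25)):
        print(f"{label}: {eff:.0%} is {efficiency_to_db(eff):.2f} dB")
    # An additional 1 dB of loss on top of the measured 25% efficiency.
    degraded = 0.25 * db_to_efficiency(-1.0)
    print(f"25% with 1 dB extra loss: {degraded:.1%}")
```

On this definition, 46% corresponds to roughly -3.4 dB and 25% to roughly -6.0 dB, so a further 1 dB of misalignment loss would bring the measured efficiency down to about 20%.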
Very high coupling of TM polarised light in photonic crystal directional couplers

DEFF Research Database (Denmark)

Borel, Peter Ingo; Thorhauge, Morten; Frandsen, Lars Hagedorn

2003-01-01

The experimental and simulated spectra for TE and TM polarised light for the transmission through photonic crystal directional couplers are presented. The 3D FDTD simulations successfully explain all the major features of the experimental spectra as well as the actual transmission level. Especially...
Fiber-optic couplers as displacement sensors

Science.gov (United States)

Baruch, Martin C.; Gerdt, David W.; Adkins, Charles M.

2003-04-01

We introduce the novel concept of using a fiber-optic coupler as a versatile displacement sensor. Comparatively long fiber-optic couplers, with a coupling region of approximately 10 mm, are manufactured using standard communication SM fiber and placed in a looped-back configuration. The result is a displacement sensor, which is robust and highly sensitive over a wide dynamic range. This displacement sensor resolves 1-2 μm over distances of 1-1.5 mm and is characterized by the essential absence of a 'spring constant' plaguing other strain gauge-type sensors. Consequently, it is possible to couple to extremely weak vibrations, such as the skin displacement affected by arterial heart beat pulsations. Used as a wrist-worn heartbeat monitor, the fidelity of the arterial pulse signal has been shown to be so high that it is possible to not only determine heartbeat and breathing rates, but to implement a new single-point blood pressure measurement scheme which does not squeeze the arm. In an application as a floor vibration sensor for the non-intrusive monitoring of independently living elderly, the sensor has been shown to resolve the distinct vibration spectra of different persons and different events.
Performance of the Galley Parallel File System

Science.gov (United States)

Nieuwejaar, Nils; Kotz, David

1996-01-01

As the input/output (I/O) needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. This interface conceals the parallelism within the file system, which increases the ease of programmability, but makes it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. Furthermore, most current parallel file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic parallel workloads. Initial experiments, reported in this paper, indicate that Galley is capable of providing high-performance I/O to applications that access data in patterns that have been observed to be common.

HOM Coupler Optimisation for the Superconducting RF Cavities in ESS

CERN Document Server

Ainsworth, R; Calaga, R

2012-01-01

The European Spallation Source (ESS) will be the world’s most powerful next generation neutron source. It consists of a linear accelerator, target, and instruments for neutron experiments. The linac is designed to accelerate protons to a final energy of 2.5 GeV, with an average design beam power of 5 MW, for collision with a target used to produce a high neutron flux. A section of the linac will contain Superconducting RF (SCRF) cavities designed at 704 MHz. Beam induced HOMs in these cavities may drive the beam unstable and increase the cryogenic load, therefore HOM couplers are installed to provide sufficient damping. Previous studies have shown that these couplers are susceptible to multipacting, a resonant process which can absorb RF power and lead to heating effects. This paper will show how a coupler suffering from multipacting has been redesigned to limit this effect. Optimisation of the RF damping is also discussed.
Design of high-performance parallelized gene predictors in MATLAB.

Science.gov (United States)

Rivard, Sylvain Robert; Mailloux, Jean-Gabriel; Beguenane, Rachid; Bui, Hung Tien

2012-04-10

This paper proposes a method of implementing parallel gene prediction algorithms in MATLAB. The proposed designs are based on either Goertzel's algorithm or on FFTs and have been implemented using varying amounts of parallelism on a central processing unit (CPU) and on a graphics processing unit (GPU). Results show that an implementation using a straightforward approach can require over 4.5 h to process 15 million base pairs (bps) whereas a properly designed one could perform the same task in less than five minutes. In the best case, a GPU implementation can yield these results in 57 s. The present work shows how parallelism can be used in MATLAB for gene prediction in very large DNA sequences to produce results that are over 270 times faster than a conventional approach. This is significant as MATLAB is typically overlooked due to its apparent slow processing time even though it offers a convenient environment for bioinformatics. From a practical standpoint, this work proposes two strategies for accelerating genome data processing which rely on different parallelization mechanisms. Using a CPU, the work shows that direct access to the MEX function increases execution speed and that the PARFOR construct should be used in order to take full advantage of the parallelizable Goertzel implementation. When the target is a GPU, the work shows that data needs to be segmented into manageable sizes within the GFOR construct before processing in order to minimize execution time.
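The abstract names Goertzel's algorithm but does not show it. The following is a minimal single-threaded Python sketch of the Goertzel recurrence, which evaluates the power of one DFT bin without computing a full FFT. How it is applied to DNA here is an assumption: the sketch follows the common spectral approach of summing the power at bin N/3 over the four base-indicator sequences (the period-3 signal often associated with coding regions), which the abstract does not spell out. The function names are illustrative and are not taken from the paper's MATLAB code.

```python
import math

def goertzel_power(samples, k):
    """Return |X[k]|^2 for the length-N DFT of samples via Goertzel's recurrence."""
    n = len(samples)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        # Second-order recurrence: s[n] = x[n] + coeff * s[n-1] - s[n-2]
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2

def period3_power(dna):
    """Sum the bin-N/3 power over the A, C, G, T indicator sequences.

    Assumption: this is the usual period-3 measure, not necessarily the
    exact quantity computed in the paper.
    """
    n = len(dna) - len(dna) % 3  # trim so that N/3 is an integer bin index
    dna = dna[:n]
    total = 0.0
    for base in "ACGT":
        indicator = [1.0 if c == base else 0.0 for c in dna]
        total += goertzel_power(indicator, n // 3)
    return total

if __name__ == "__main__":
    print(period3_power("ATG" * 10))  # strongly period-3 sequence
    print(period3_power("A" * 30))    # no period-3 component
```

Because each window or indicator sequence is processed independently, calls like these are the kind of work that could be distributed across CPU workers or GPU threads, which is the parallelism the abstract describes in MATLAB terms.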
Parallel Backprojection: A Case Study in High-Performance Reconfigurable Computing

Directory of Open Access Journals (Sweden)

Cordes Ben

2009-01-01

High-performance reconfigurable computing (HPRC) is a novel approach to provide large-scale computing power to modern scientific applications. Using both general-purpose processors and FPGAs allows application designers to exploit fine-grained and coarse-grained parallelism, achieving high degrees of speedup. One scientific application that benefits from this technique is backprojection, an image formation algorithm that can be used as part of a synthetic aperture radar (SAR) processing system. We present an implementation of backprojection for SAR on an HPRC system. Using simulated data taken at a variety of ranges, our implementation runs over 200 times faster than a similar software program, with an overall application speedup better than 50x. The backprojection application is easily parallelizable, achieving near-linear speedup when run on multiple nodes of a clustered HPRC system. The results presented can be applied to other systems and other algorithms with similar characteristics.
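The gap between the 200x and 50x figures can be read through Amdahl's law. The Python sketch below is only an illustration under two assumptions that the abstract does not state: that the 200x figure applies to the accelerated kernel alone, and that the rest of the application runs at its original speed. The function names are illustrative.

```python
def overall_speedup(fraction, kernel_speedup):
    """Amdahl's law: speedup of the whole run when `fraction` of the
    original runtime is accelerated by `kernel_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / kernel_speedup)

def accelerated_fraction(kernel_speedup, overall):
    """Invert Amdahl's law to find the fraction of original runtime
    that must have been accelerated to reach `overall`."""
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / kernel_speedup)

if __name__ == "__main__":
    p = accelerated_fraction(200.0, 50.0)
    print(f"accelerated fraction of original runtime: {p:.4f}")
    print(f"check, overall speedup: {overall_speedup(p, 200.0):.1f}x")
```

Under these assumptions about 98.5% of the original runtime would have to lie in the accelerated kernel, so even a small unaccelerated remainder (I/O, host-side setup) would be enough to limit the overall figure.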
Coupling Characteristics of Fused Optical Fiber Coupler Formed with Single-Mode Fiber and Photonic Crystal Fiber Having Air Hole Collapsed Taper

Directory of Open Access Journals (Sweden)

Hirohisa Yokota

2016-01-01

A fused coupler formed with a single-mode fiber (SMF) and a photonic crystal fiber (PCF) is one of the solutions for optical coupling from a light source to a PCF. In this paper, we presented coupling characteristics of a fused fiber coupler formed with an ordinary SMF and a PCF having air hole collapsed taper. A prototype of SMF-PCF coupler with air hole collapsed taper was fabricated using CO2 laser irradiation. The coupling efficiency from SMF to PCF was −6.2 dB at 1554 nm wavelength in the fabricated coupler. The structure of the SMF-PCF coupler to obtain high coupling efficiency was theoretically clarified by beam propagation analysis using an equivalent model of the coupler with simplification. It was clarified that appropriately choosing the prestretched or etched SMF diameter and the length of air hole collapsed region was effective to obtain high coupling efficiency that was a result of high extinction ratio at cross port and low excess loss. We also demonstrated that the diameter of prestretched SMF to obtain high coupling efficiency was insensitive to the air hole diameter ratio to pitch of the PCF in the air hole collapsed SMF-PCF coupler.
Development of Industrial High-Speed Transfer Parallel Robot

International Nuclear Information System (INIS)

Kim, Byung In; Kyung, Jin Ho; Do, Hyun Min; Jo, Sang Hyun

2013-01-01

Parallel robots used in industry require high stiffness or high speed because of their structural characteristics. Nowadays, the importance of rapid transportation has increased in the distribution industry. In this light, an industrial parallel robot has been developed for high-speed transfer. The developed parallel robot can handle a maximum payload of 3 kg. For a payload of 0.1 kg, the trajectory cycle time is 0.3 s (come and go), and the maximum velocity is 4.5 m/s (pick and place work, Adept cycle). In this motion, its maximum acceleration is very high and reaches approximately 13g. In this paper, the design, analysis, and performance test results of the developed parallel robot system are introduced.
Development of three-dimensional neoclassical transport simulation code with high performance Fortran on a vector-parallel computer

International Nuclear Information System (INIS)

Satake, Shinsuke; Okamoto, Masao; Nakajima, Noriyoshi; Takamaru, Hisanori

2005-11-01

A neoclassical transport simulation code (FORTEC-3D) applicable to three-dimensional configurations has been developed using High Performance Fortran (HPF). Adoption of computing techniques for parallelization and a hybrid simulation model to the δf Monte-Carlo method transport simulation, including non-local transport effects in three-dimensional configurations, makes it possible to simulate the dynamism of global, non-local transport phenomena with a self-consistent radial electric field within a reasonable computation time. In this paper, development of the transport code using HPF is reported. Optimization techniques in order to achieve both high vectorization and parallelization efficiency, adoption of a parallel random number generator, and also benchmark results, are shown. (author)
Overview of Parallel Platforms for Common High Performance Computing

Directory of Open Access Journals (Sweden)

T. Fryza

2012-04-01

The paper deals with various parallel platforms used for high performance computing in the signal processing domain. More precisely, the methods exploiting multicore central processing units, such as message passing interface and OpenMP, are taken into account. The properties of the programming methods are experimentally proved in the application of a fast Fourier transform and a discrete cosine transform and they are compared with the possibilities of MATLAB's built-in functions and Texas Instruments digital signal processors with very long instruction word architectures. New FFT and DCT implementations were proposed and tested. The implementation phase was compared with CPU based computing methods and with possibilities of the Texas Instruments digital signal processing library on C6747 floating-point DSPs. The optimal combination of computing methods in the signal processing domain and new, fast routines' implementation is proposed as well.
Progress on H5Part: A Portable High Performance Parallel Data Interface for Electromagnetics Simulations

International Nuclear Information System (INIS)

Adelmann, Andreas; Gsell, Achim; Oswald, Benedikt; Schietinger, Thomas; Bethel, Wes; Shalf, John; Siegerist, Cristina; Stockinger, Kurt

2007-01-01

Significant problems facing all experimental and computational sciences arise from growing data size and complexity. Common to all these problems is the need to perform efficient data I/O on diverse computer architectures. In our scientific application, the largest parallel particle simulations generate vast quantities of six-dimensional data. Such a simulation run produces data for an aggregate data size up to several TB per run. Motivated by the need to address data I/O and access challenges, we have implemented H5Part, an open source data I/O API that simplifies the use of the Hierarchical Data Format v5 library (HDF5). HDF5 is an industry standard for high performance, cross-platform data storage and retrieval that runs on all contemporary architectures from large parallel supercomputers to laptops. H5Part, which is oriented to the needs of the particle physics and cosmology communities, provides support for parallel storage and retrieval of particles, structured and in the future unstructured meshes. In this paper, we describe recent work focusing on I/O support for particles and structured meshes and provide data showing performance on modern supercomputer architectures like the IBM POWER 5
Current-drive on the Versator-II tokamak with a slotted-waveguide fast-wave coupler

International Nuclear Information System (INIS)

Colborn, J.A.

1987-11-01

A slotted-waveguide fast-wave coupler has been constructed, without dielectric, and used to drive current on the Versator-II tokamak. Up to 35 kW of net microwave power at 2.45 GHz has been radiated into plasmas with $2 \times 10^{12}\,\mathrm{cm^{-3}} \le \bar{n}_e \le 1.2 \times 10^{13}\,\mathrm{cm^{-3}}$ and $B_{\mathrm{tor}} \cong 1.0$ T. The launched spectrum had a peak near $N_\parallel = -2.0$ and a larger peak near $N_\parallel = 0.7$. Radiating efficiency of the antenna was roughly independent of antenna position except when the antenna was at least 0.2 cm outside the limiter, in which case the radiating efficiency slightly improved as the antenna was moved farther outside. When the coupler was inside the limiter, radiating efficiency improved moderately with increased $\bar{n}_e$. Current-drive efficiency was comparable to that of the slow wave and was not affected when the antenna spectrum was reversed; however, no current was driven for $\bar{n}_e \le 2 \times 10^{12}\,\mathrm{cm^{-3}}$. These results indicate the fast wave was launched, but a substantial part of the power may have been mode-converted to the slow wave, possibly via a downshift in $N_\parallel$, and these slow waves may have been responsible for most of the driven current. Relevant theory for waves in plasma, current-drive efficiency, and coupling of the slotted-waveguide is discussed, the antenna design method is explained, and future work, including the construction of a much-improved probe-fed antenna, is described. 42 refs., 45 figs
Parallel file system performances in fusion data storage

International Nuclear Information System (INIS)

Iannone, F.; Podda, S.; Bracco, G.; Manduchi, G.; Maslennikov, A.; Migliori, S.; Wolkersdorfer, K.

2012-01-01

High I/O flow rates, up to 10 GB/s, are required in large fusion Tokamak experiments like ITER where hundreds of nodes simultaneously store large amounts of data acquired during the plasma discharges. Typical network topologies such as linear arrays (systolic), rings, meshes (2-D arrays), tori (3-D arrays), trees, butterfly, hypercube in combination with high speed data transports like Infiniband or 10G-Ethernet, are the main areas in which the effort to overcome the so-called parallel I/O bottlenecks is most focused. The high I/O flow rates were modelled in an emulated testbed based on parallel file systems such as Lustre and GPFS, commonly used in High Performance Computing. The test runs on High Performance Computing–For Fusion (8640 cores) and ENEA CRESCO (3392 cores) supercomputers. Message Passing Interface based applications were developed to emulate parallel I/O on Lustre and GPFS using data archival and access solutions like MDSPLUS and Universal Access Layer. These methods of data storage organization are widely used in nuclear fusion experiments and are being developed within the EFDA Integrated Tokamak Modelling – Task Force; the authors tried to evaluate their behaviour in a realistic emulation setup.
Parallel file system performances in fusion data storage

Energy Technology Data Exchange (ETDEWEB)

Iannone, F., E-mail: francesco.iannone@enea.it [Associazione EURATOM-ENEA sulla Fusione, C.R.ENEA Frascati, via E.Fermi, 45 - 00044 Frascati, Rome (Italy); Podda, S.; Bracco, G. [ENEA Information Communication Tecnologies, Lungotevere Thaon di Revel, 76 - 00196 Rome (Italy); Manduchi, G. [Associazione EURATOM-ENEA sulla Fusione, Consorzio RFX, Corso Stati Uniti, 4 - 35127 Padua (Italy); Maslennikov, A. [CASPUR Inter-University Consortium for the Application of Super-Computing for Research, via dei Tizii, 6b - 00185 Rome (Italy); Migliori, S. [ENEA Information Communication Tecnologies, Lungotevere Thaon di Revel, 76 - 00196 Rome (Italy); Wolkersdorfer, K. [Juelich Supercomputing Centre-FZJ, D-52425 Juelich (Germany)

2012-12-15

High I/O flow rates, up to 10 GB/s, are required in large fusion Tokamak experiments like ITER where hundreds of nodes simultaneously store large amounts of data acquired during the plasma discharges. Typical network topologies such as linear arrays (systolic), rings, meshes (2-D arrays), tori (3-D arrays), trees, butterfly, hypercube in combination with high speed data transports like Infiniband or 10G-Ethernet, are the main areas in which the effort to overcome the so-called parallel I/O bottlenecks is most focused. The high I/O flow rates were modelled in an emulated testbed based on parallel file systems such as Lustre and GPFS, commonly used in High Performance Computing. The test runs on High Performance Computing-For Fusion (8640 cores) and ENEA CRESCO (3392 cores) supercomputers. Message Passing Interface based applications were developed to emulate parallel I/O on Lustre and GPFS using data archival and access solutions like MDSPLUS and Universal Access Layer. These methods of data storage organization are widely used in nuclear fusion experiments and are being developed within the EFDA Integrated Tokamak Modelling - Task Force; the authors tried to evaluate their behaviour in a realistic emulation setup.
Introduction to massively-parallel computing in high-energy physics

CERN Document Server

AUTHOR|(CDS)2083520

1993-01-01

Ever since computers were first used for scientific and numerical work, there has existed an "arms race" between the technical development of faster computing hardware, and the desires of scientists to solve larger problems in shorter time-scales. However, the vast leaps in processor performance achieved through advances in semi-conductor science have reached a hiatus as the technology comes up against the physical limits of the speed of light and quantum effects. This has led all high performance computer manufacturers to turn towards a parallel architecture for their new machines. In these lectures we will introduce the history and concepts behind parallel computing, and review the various parallel architectures and software environments currently available. We will then introduce programming methodologies that allow efficient exploitation of parallel machines, and present case studies of the parallelization of typical High Energy Physics codes for the two main classes of parallel computing architecture (S...
Optimized Ultrawideband and Uniplanar Minkowski Fractal Branch Line Coupler

Directory of Open Access Journals (Sweden)

Mohammad Jahanbakht

2012-01-01

The non-Euclidean Minkowski fractal geometry is used in the design, optimization, and fabrication of an ultrawideband (UWB) branch line coupler. Self-similarities of the fractal geometries make them act like an infinite length in a finite area. This property creates a smaller design with broader bandwidth. The designed 3 dB microstrip coupler has a single-layer, uniplanar platform with a fairly easy fabrication process. This optimized 180° coupler also shows a perfect isolation and insertion loss over the UWB frequency range of 3.1–10.6 GHz.
Ultra-compact silicon nitride grating coupler for microscopy systems

OpenAIRE

Zhu, Yunpeng; Wang, Jie; Xie, Weiqiang; Tian, Bin; Li, Yanlu; Brainis, Edouard; Jiao, Yuqing; Van Thourhout, Dries

2017-01-01

Grating couplers have been widely used for coupling light between photonic chips and optical fibers. For various quantum-optics and bio-optics experiments, on the other hand, there is a need to achieve good light coupling between photonic chips and microscopy systems. Here, we propose an ultra-compact silicon nitride (SiN) grating coupler optimized for coupling light from a waveguide to a microscopy system. The grating coupler is about 4 by 2 μm² in size and a 116 nm 1 dB bandwidth can be...
Comparison of sound transmission in human ears and coupler loaded by audiometric earphones

DEFF Research Database (Denmark)

Ciric, Dejan; Hammershøi, Dorte

2005-01-01

The thresholds of hearing are usually determined using audiometric earphones. They are calibrated by means of a standardized acoustical coupler. In order to have determined thresholds independent of the earphone type, the coupler should approximate the average human ear closely. Nevertheless, the differences among earphones as well as between human ears and the coupler affect the results of audiometric measurements, inducing uncertainty. The influence of these differences is examined by investigating the sound transmission in both human ears and a standardized coupler loaded by different audiometric... in the coupler, but since the "ear canal entrance" is not well-defined for the coupler, the mentioned measurements were done at different depths in the coupler. The sound transmission and coupling were described in terms of the pressure division at the entrance of the ear canal and the transmissions in human...
Language interoperability for high-performance parallel scientific components

International Nuclear Information System (INIS)

Elliot, N; Kohn, S; Smolinski, B

1999-01-01

With the increasing complexity and interdisciplinary nature of scientific applications, code reuse is becoming increasingly important in scientific computing. One method for facilitating code reuse is the use of component technologies, which have been used widely in industry. However, components have only recently worked their way into scientific computing. Language interoperability is an important underlying technology for these component architectures. In this paper, we present an approach to language interoperability for a high-performance, parallel component architecture being developed by the Common Component Architecture (CCA) group. Our approach is based on Interface Definition Language (IDL) techniques. We have developed a Scientific Interface Definition Language (SIDL), as well as bindings to C and Fortran. We have also developed a SIDL compiler and run-time library support for reference counting, reflection, object management, and exception handling (Babel). Results from using Babel to call a standard numerical solver library (written in C) from C and Fortran show that the cost of using Babel is minimal, whereas the savings in development time and the benefits of object-oriented development support for C and Fortran far outweigh the costs.
High-energy physics software parallelization using database techniques

International Nuclear Information System (INIS)

Argante, E.; Van der Stok, P.D.V.; Willers, I.

1997-01-01

A programming model for software parallelization, called CoCa, is introduced that copes with problems caused by typical features of high-energy physics software. By basing CoCa on the database transaction paradigm, the complexity induced by the parallelization is for a large part transparent to the programmer, resulting in a higher level of abstraction than the native message passing software. CoCa is implemented on a Meiko CS-2 and on a SUN SPARCcenter 2000 parallel computer. On the CS-2, the performance is comparable with the performance of native PVM and MPI. (orig.)
A Third Generation Lower Hybrid Coupler

International Nuclear Information System (INIS)

Bernabei, S.; Hosea, J.; Kung, C.; Loesser, D.; Rushinski, J.; Wilson, J.R.; Parker, R.

2001-01-01

The Princeton Plasma Physics Laboratory (PPPL) and the Massachusetts Institute of Technology (MIT) are preparing an experiment of current profile control using lower-hybrid waves in order to produce and sustain advanced tokamak regimes in steady-state conditions in Alcator C-Mod. Unlike the couplers of JET, Tore Supra and JT60, the C-Mod lower-hybrid coupler does not employ the now conventional multijunction design, but will have similar characteristics, compactness, and internal power division while retaining full control of the antenna element phasing. This is achieved by using 3 dB vertical power splitters and a stack of laminated plates with the waveguides milled in them. Construction is simplified and allows easy control and maintenance of all parts. Many precautions are taken to avoid arcing. Special care is also taken to avoid the recycling of reflected power, which could affect the coupling and the launched n_|| spectrum. The results from C-Mod should allow further simplification in the designs of the couplers planned for KSTAR (Korea Superconducting Tokamak Advanced Research) and ITER (International Thermonuclear Experimental Reactor).
The Prototype Fundamental Power Coupler For The Spallation Neutron Source Superconducting Cavities: Design And Initial Test Results

International Nuclear Information System (INIS)

K. M. Wilson; I. E. Campisi; E. F. Daly; G. K. Davis; M. Drury; J. E. Henry; P. Kneisel; G. Myneni; T. Powers; W. J. Schneider; M. Stirbet; Y. Kang; K. Cummings; T. Hardek

2001-01-01

Each of the 805 MHz superconducting cavities of the Spallation Neutron Source (SNS) is powered via a coaxial Fundamental Power Coupler (FPC) with a 50 Ω impedance and a warm planar alumina window. The design is derived from the experience of other laboratories; in particular, a number of details are based on the coupler developed for the KEK B-Factory superconducting cavities. However, other design features have been modified to account for the fact that the SNS FPC will transfer a considerably lower average power than the KEK-B coupler. Four prototypes have been manufactured so far, and preliminary tests performed on two of them at Los Alamos National Laboratory (LANL). During these tests, peak powers of over 500 kW were transferred through the couplers in the test stand designed and built for this purpose. This paper gives details of the coupler design and of the results obtained from the RF tests on the test stand during the last few months. A more comprehensive set of tests is planned for the near future.

Surface acoustic waves voltage controlled directional coupler

Science.gov (United States)

Golan, G.; Griffel, G.; Yanilov, E.; Ruschin, S.; Seidman, A.; Croitoru, N.

1988-10-01

An important condition for the development of surface wave integrated-acoustic devices is the ability to guide and control the propagation of the acoustic energy. This can be implemented by deposition of metallic "loading" channels on an anisotropic piezoelectric substrate. Deposition of two such parallel channels causes an effective coupling of acoustic energy from one channel to the other. A basic requirement for this coupling effect is the existence of the two basic modes: a symmetrical and a nonsymmetrical one. A mode map that shows the number of sustained modes as a function of the device parameters (i.e., channel width, distance between channels, material velocity, and acoustical exciting frequency) is presented. This kind of map can help significantly in the design process of such a device. In this paper we devise an advanced acoustical "Y" coupler with the ability to control its effective coupling by an externally applied voltage, thereby causing modulation of the output intensities of the signals.
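As a rough illustration of why the two basic modes matter, the standard two-mode picture of a symmetric, lossless coupler can be written out. This is a textbook sketch, not the authors' model; the propagation constants $\beta_s$ and $\beta_a$ of the symmetrical and nonsymmetrical modes, and the assumption that all energy starts in channel 1, are introduced here for illustration only.

```latex
% Assumption: identical channels, no loss, unit input in channel 1.
\begin{align}
  a_1(z) &= \tfrac{1}{2}\left(e^{-i\beta_s z} + e^{-i\beta_a z}\right), &
  a_2(z) &= \tfrac{1}{2}\left(e^{-i\beta_s z} - e^{-i\beta_a z}\right), \\
  P_1(z) &= |a_1(z)|^2 = \cos^2\!\left(\frac{(\beta_s-\beta_a)\,z}{2}\right), &
  P_2(z) &= |a_2(z)|^2 = \sin^2\!\left(\frac{(\beta_s-\beta_a)\,z}{2}\right), \\
  L_c &= \frac{\pi}{|\beta_s-\beta_a|}. &&
\end{align}
```

Under these assumptions, energy moves completely from one channel to the other over the length $L_c$, so anything that shifts $\beta_s - \beta_a$ (for example an applied voltage, as the abstract describes) changes the output split.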
A novel six-degrees-of-freedom series-parallel manipulator

Energy Technology Data Exchange (ETDEWEB)

Gallardo-Alvarado, J.; Rodriguez-Castro, R.; Aguilar-Najera, C. R.; Perez-Gonzalez, L. [Instituto Tecnologico de Celaya, Celaya (Mexico)

2012-06-15

This paper addresses the description and kinematic analyses of a new non-redundant series-parallel manipulator. The primary feature of the robot is a decoupled topology consisting of a lower parallel manipulator, for controlling the orientation of the coupler platform, assembled in series connection with an upper parallel manipulator, for controlling the position of the output platform, capable of providing arbitrary poses to the output platform with respect to the fixed platform. The forward displacement analysis is carried out in semi-closed form solutions by resorting to simple closure equations. On the other hand, the velocity, acceleration and singularity analyses of the manipulator are approached by means of the theory of screws. Simple and compact expressions are derived here for solving the infinitesimal kinematics by taking advantage of the concept of reciprocal screws. Furthermore, the analysis of the Jacobians of the robot shows that the lower parallel manipulator is practically free of singularities. In order to illustrate the performance of the manipulator, a numerical example which consists of solving the inverse/forward kinematics of the series-parallel manipulator as well as its singular configurations is provided.
Zeno effect and switching of solitons in nonlinear couplers

DEFF Research Database (Denmark)

Abdullaev, F Kh; Konotop, V V; Ögren, Magnus

2011-01-01

The Zeno effect is investigated for soliton type pulses in a nonlinear directional coupler with dissipation. The effect consists in increase of the coupler transparency with increase of the dissipative losses in one of the arms. It is shown that localized dissipation can lead to switching...
Directional couplers using long-range surface plasmon polariton waveguides

DEFF Research Database (Denmark)

Boltasseva, Alexandra; Bozhevolnyi, Sergey I.

2006-01-01

14-nm-thick stripes and a wavelength of 1550 nm, LR-SPP propagation loss is determined for the stripe widths varying from 2 to 12 μm and is found to be ~7 and 5 dB/cm for 10- and 4-μm-wide stripes, respectively. For the directional couplers based on 14-nm-thick and 8-μm-wide gold stripes and a wavelength of 1570 nm, the coupling lengths of 4.1, 1.9, and 0.8 mm are found for the respective waveguide separations of 8, 4, and 0 μm. We model the LR-SPP-based directional couplers using the effective-refractive-index method and obtain a good agreement with the experimental results. The transmission spectra of LR-SPP-based directional couplers are presented demonstrating an efficient (~30 dB) separation of different telecom wavelength bands. Various possibilities for dynamic control of wavelength division/multiplexing with LR-SPP-based directional couplers that utilize the thermo...
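To make the reported coupling lengths concrete, the following minimal sketch evaluates the cross-arm power of an idealized directional coupler. It assumes that "coupling length" means the length for complete power transfer and that the coupler is lossless and synchronous; neither assumption is stated in the abstract, and the reported propagation loss is ignored. The function name `cross_power_fraction` is hypothetical.

```python
import math

# Coupling lengths from the abstract (mm), keyed by waveguide separation (micrometres).
COUPLING_LENGTH_MM = {8: 4.1, 4: 1.9, 0: 0.8}


def cross_power_fraction(z_mm, coupling_length_mm):
    """Fraction of input power found in the cross arm after a length z.

    Assumption: ideal lossless two-mode coupler, with the coupling length
    defined as the distance for complete power transfer.
    """
    return math.sin(math.pi * z_mm / (2.0 * coupling_length_mm)) ** 2


if __name__ == "__main__":
    for separation_um, lc_mm in COUPLING_LENGTH_MM.items():
        half_split = cross_power_fraction(lc_mm / 2.0, lc_mm)
        print(f"separation {separation_um} um: "
              f"cross fraction at {lc_mm / 2.0:.2f} mm = {half_split:.2f}")
```

In this idealized model a device cut to half the coupling length acts as an equal power splitter, and one cut to the full coupling length transfers all power to the neighbouring waveguide.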
Broadband photonic crystal fiber coupler with polarization selection of coupling ratio

Science.gov (United States)

Jaroszewicz, Leszek R.; Stasiewicz, Karol A.; Marć, Paweł; Szymański, Michał

2010-09-01

In this paper a new broadband photonic crystal fiber coupler is presented. The biconical taper technology has been applied to manufacture the coupler without air-hole collapse in LMA10 fiber (NKT Photonics Crystal). This coupler operates in the weak-coupling regime and maintains coupling over the range from 900 nm to 1700 nm. The coupling ratio between the output arms depends on wavelength and can be tuned by selecting the proper input state of polarization. This makes the broadband crystal fiber coupler suitable for the many applications in which the coupling between output arms must be tuned during the measurement.
Unconsumed precursors and couplers after formation of oxidative hair dyes

DEFF Research Database (Denmark)

Rastogi, Suresh Chandra; Søsted, Heidi; Johansen, Jeanne Duus

2006-01-01

Contact allergy to hair dye ingredients, especially precursors and couplers, is a well-known entity among consumers having hair colouring done at home or at a hairdresser. The aim of the present investigation was to estimate consumer exposure to some selected precursors (p-phenylenediamine, toluene-2,5-diamine) and couplers (3-aminophenol, 4-aminophenol, resorcinol) of oxidative hair dyes during and after hair dyeing. Concentrations of unconsumed precursors and couplers in 8 hair dye formulations for non-professional use were investigated, under the conditions reflecting hair dyeing. Oxidative hair dye formation in the absence of hair was investigated using 6 products, and 2 products were used for experimental hair dyeing. In both presence and absence of hair, significant amounts of unconsumed precursors and couplers remained in the hair dye formulations after final colour development. Thus...
Parallel-fed planar dipole antenna arrays for low-observable platforms

CERN Document Server

Singh, Hema; Jha, Rakesh Mohan

2016-01-01

This book focuses on determining the scattering of parallel-fed planar dipole arrays in terms of reflection and transmission coefficients at different levels of the array system. In aerospace vehicles, the phased arrays are often in planar configuration. The radar cross section (RCS) of the vehicle is mainly due to its structure and the antennas mounted on it. There can be situations in which the signatures due to antennas dominate over the structural RCS of the platform. This necessitates the study of the reduction and control of antenna/array RCS. The planar dipole array is considered as a stacked linear dipole array. A systematic, step-by-step approach is used to determine the RCS pattern, including the finite dimensions of dipole antenna elements. The mutual impedance between the dipole elements for planar configuration is determined. Scattering up to the second level of couplers in the parallel feed network is taken into account. The phase shifters are modelled as delay lines. All the couplers in the feed n...
Parallelization of an existing high energy physics event reconstruction software package

International Nuclear Information System (INIS)

Schiefer, R.; Francis, D.

1996-01-01

Software parallelization allows an efficient use of available computing power to increase the performance of applications. In a case study the authors have investigated the parallelization of high energy physics event reconstruction software in terms of costs (effort, computing resource requirements), benefits (performance increase) and the feasibility of a systematic parallelization approach. Guidelines facilitating a parallel implementation are proposed for future software development
Ultra-low loss nano-taper coupler for Silicon-on-Insulator ridge waveguide

DEFF Research Database (Denmark)

Pu, Minhao; Liu, Liu; Ou, Haiyan

2010-01-01

A nano-taper coupler is optimized specially for the transverse-magnetic mode for interfacing light between a silicon-on-insulator ridge waveguide and a single-mode fiber. An ultra-low coupling loss of ~0.36dB is achieved for the nano-taper coupler.
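For readers more used to linear units, the quoted coupling loss can be converted to a transmitted power fraction. The sketch below uses only the standard decibel definition; the helper name `db_loss_to_transmission` is hypothetical and the code is not part of the cited work.

```python
def db_loss_to_transmission(loss_db):
    """Convert an insertion loss in dB to the transmitted power fraction."""
    return 10.0 ** (-loss_db / 10.0)


if __name__ == "__main__":
    fraction = db_loss_to_transmission(0.36)
    # A 0.36 dB loss corresponds to roughly 92% of the power being coupled.
    print(f"0.36 dB loss -> {fraction:.3f} of input power transmitted")
```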
InGaN directional coupler made with a one-step etching technique

Science.gov (United States)

Gao, Xumin; Yuan, Jialei; Yang, Yongchao; Zhang, Shuai; Shi, Zheng; Li, Xin; Wang, Yongjin

2017-06-01

We propose, fabricate and characterize an on-chip integration of light source, InGaN waveguide, directional coupler and photodiode, in which AlGaN layers are used as top and bottom optical claddings to form an InGaN waveguide for guiding the in-plane emitted light from the InGaN/GaN multiple-quantum-well light-emitting diode (MQW-LED). The difference in etch rate caused by different exposure windows leads to an etching depth discrepancy using the one-step etching technique, which forms the InGaN directional coupler with the overlapped underlying slab. Light propagation results directly confirm effective light coupling in the InGaN directional coupler, which is achieved through high-order guided modes. The InGaN waveguide couples the modulated light from the InGaN/GaN MQW-LED and transfers part of light to the coupled waveguide via the InGaN directional coupler. The in-plane InGaN/GaN MQW-photodiode absorbs the guided light by the coupled InGaN waveguide and induces the photocurrent. The on-chip InGaN photonic integration experimentally demonstrates an in-plane light communication with a data transmission of 50 Mbps.
HPC-NMF: A High-Performance Parallel Algorithm for Nonnegative Matrix Factorization

Energy Technology Data Exchange (ETDEWEB)

2016-08-22

NMF is a useful tool for many applications in different domains such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, there is a lack of efficient distributed algorithms to solve the problem for big data sets. We propose a high-performance distributed-memory parallel algorithm that computes the factorization by iteratively solving alternating non-negative least squares (NLS) subproblems for $W$ and $H$. It maintains the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). Unlike previous implementations, our algorithm is also flexible: it performs well for both dense and sparse matrices, and allows the user to choose any one of the multiple algorithms for solving the updates to low rank factors $W$ and $H$ within the alternating iterations.
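The alternating structure described above can be sketched serially in a few lines. This is a minimal single-process illustration in pure Python, not the authors' distributed MPI algorithm: each NLS subproblem is solved only approximately by cyclic coordinate descent (an inner solver chosen here for brevity), and all function names are hypothetical.

```python
import random


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    Yt = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]


def nls_coordinate_descent(W, A, H, sweeps=20):
    """Approximately solve min_{H >= 0} ||A - W H||_F for fixed W.

    Each scalar update is the exact constrained minimizer along one
    coordinate, so the objective never increases.
    """
    Wt = transpose(W)
    G = matmul(Wt, W)   # k x k Gram matrix
    B = matmul(Wt, A)   # k x n right-hand sides
    k, n = len(H), len(H[0])
    for _ in range(sweeps):
        for j in range(n):
            for r in range(k):
                if G[r][r] <= 0.0:
                    continue  # zero column in W: this coordinate has no effect
                grad = sum(G[r][s] * H[s][j] for s in range(k)) - B[r][j]
                H[r][j] = max(0.0, H[r][j] - grad / G[r][r])
    return H


def frob_error(A, W, H):
    WH = matmul(W, H)
    return sum((a - b) ** 2 for ra, rb in zip(A, WH) for a, b in zip(ra, rb)) ** 0.5


def nmf_anls(A, k, iters=50, seed=0):
    """Alternate between the NLS subproblem for H and the one for W."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    W = [[rng.random() for _ in range(k)] for _ in range(m)]
    H = [[rng.random() for _ in range(n)] for _ in range(k)]
    At = transpose(A)
    for _ in range(iters):
        H = nls_coordinate_descent(W, A, H)
        # The W update is the same problem on the transposed system.
        W = transpose(nls_coordinate_descent(transpose(H), At, transpose(W)))
    return W, H


if __name__ == "__main__":
    A = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 0.0, 1.0]]
    W, H = nmf_anls(A, k=2)
    print("residual:", frob_error(A, W, H))
```

In a distributed setting the expensive parts are the products that form the Gram matrices and right-hand sides, which is where the communication-minimizing layout mentioned in the abstract would come into play; the sketch leaves that aspect out entirely.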
High-Performance Parallel and Stream Processing of X-ray Microdiffraction Data on Multicores

International Nuclear Information System (INIS)

Bauer, Michael A; McIntyre, Stewart; Xie Yuzhen; Biem, Alain; Tamura, Nobumichi

2012-01-01

We present the design and implementation of a high-performance system for processing synchrotron X-ray microdiffraction (XRD) data in IBM InfoSphere Streams on multicore processors. We report on the parallel and stream processing techniques that we use to harvest the power of clusters of multicores to analyze hundreds of gigabytes of synchrotron XRD data in order to reveal the microtexture of polycrystalline materials. The timing to process one XRD image using one pipeline is about ten times faster than the best C program at present. With the support of InfoSphere Streams platform, our software is able to be scaled up to operate on clusters of multi-cores for processing multiple images concurrently. This system provides a high-performance processing kernel to achieve near real-time data analysis of image data from synchrotron experiments.
Microwave tomography global optimization, parallelization and performance evaluation

CERN Document Server

Noghanian, Sima; Desell, Travis; Ashtari, Ali

2014-01-01

This book provides a detailed overview on the use of global optimization and parallel computing in microwave tomography techniques. The book focuses on techniques that are based on global optimization and electromagnetic numerical methods. The authors provide parallelization techniques on homogeneous and heterogeneous computing architectures on high performance and general purpose futuristic computers. The book also discusses the multi-level optimization technique, hybrid genetic algorithm and its application in breast cancer imaging.
Multipacting Simulations of Tuner-adjustable waveguide coupler (TaCo) with CST

CERN Document Server

Shafqat, Nuaman; Wegner, Rolf

2015-01-01

Tuner-adjustable waveguide couplers (TaCo) are used to feed microwave power to different RF structures of LINAC4. This paper studies the multipacting phenomenon for TaCo using PIC solver of CST PS. Simulations are performed for complete field sweeps and results are analysed.
Experimental test of a supercritical helium heat exchanger dedicated to EUROTRANS 150 kW CW power coupler

Science.gov (United States)

Souli, M.; Fouaidy, M.; Hammoudi, N.

2010-05-01

The coaxial power coupler needed for beta = 0.65 superconducting RF cavities used in the high energy section of the EUROTRANS driver should transmit 150 kW (CW operation) RF power to the proton beam. The estimated RF losses on the power coupler outer conductor in standing wave mode operation are 46 W. To remove these heat loads, a full scale copper coil heat exchanger brazed around the outer conductor was designed and tested using supercritical helium at T = 6 K as a coolant. Our main objective was to minimise the heat loads to cold extremity of SRF cavity maintained at 2 K or 4.2 K. A dedicated test facility named SUPERCRYLOOP was developed and successfully operated in order to measure the performance of the cold heat exchanger. The test cell used reproduces the realistic thermal boundary conditions of the power coupler mounted on the cavity in the cryomodule. After a short introduction, a brief discussion about the problem of power coupler cooling systems in different machines is made. After that, we describe the experimental set-up and test apparatus. Then, a heat exchanger thermal model will be developed with FEM code COSMOS/M to estimate the different heat transfer coefficients by comparison between numerical simulation results and experimental data in order to validate the design. Finally, thermo-hydraulic behavior of supercritical helium has been investigated as function of different parameters (inlet pressure, flow rate, heat loads).
High-performance parallel processors based on star-coupled wavelength division multiplexing optical interconnects

Science.gov (United States)

Deri, Robert J.; DeGroot, Anthony J.; Haigh, Ronald E.

2002-01-01

As the performance of individual elements within parallel processing systems increases, increased communication capability between distributed processor and memory elements is required. There is great interest in using fiber optics to improve interconnect communication beyond that attainable using electronic technology. Several groups have considered WDM, star-coupled optical interconnects. The invention uses a fiber optic transceiver to provide low latency, high bandwidth channels for such interconnects using a robust multimode fiber technology. Instruction-level simulation is used to quantify the bandwidth, latency, and concurrency required for such interconnects to scale to 256 nodes, each operating at 1 GFLOPS performance. Performance has been shown to scale to ≈100 GFLOPS for scientific application kernels using a small number of wavelengths (8 to 32), only one wavelength received per node, and achievable optoelectronic bandwidth and latency.
Magnetic Shielding Design for Coupler of Wireless Electric Vehicle Charging Using Finite Element Analysis

Science.gov (United States)

Zhao, W. N.; Yang, X. J.; Yao, C.; Ma, D. G.; Tang, H. J.

2017-10-01

Inductive power transfer (IPT) is a practical and preferable method for wireless electric vehicle (EV) charging, which has proved to be safe, convenient and reliable. Due to the air gap between the two sides of the magnetic coupler, the magnetic field coupling decreases and the magnetic leakage increases significantly compared to a traditional transformer, and this may cause the magnetic flux density around the coupler to exceed the safety limit for humans. Magnetic shielding should therefore be added to the winding made from litz wire to enhance the magnetic field coupling in the working area and reduce the magnetic field strength in the non-working area. Magnetic shielding can be achieved by adding high-permeability material or high-conductivity material. The magnetic reluctance of a high-permeability material is much lower than that of the surrounding air, so most of the magnetic flux lines pass through the material rather than the surrounding air. In a high-conductivity material, the eddy currents produce a reverse magnetic field that achieves the shielding. This paper studies the effect of the two types of shielding material on a coupler for wireless EV charging and designs a combined shield made from high-permeability and high-conductivity materials. The investigation is carried out with the help of finite element analysis.
Performance assessment of the SIMFAP parallel cluster at IFIN-HH Bucharest

International Nuclear Information System (INIS)

Adam, Gh.; Adam, S.; Ayriyan, A.; Dushanov, E.; Hayryan, E.; Korenkov, V.; Lutsenko, A.; Mitsyn, V.; Sapozhnikova, T.; Sapozhnikov, A; Streltsova, O.; Buzatu, F.; Dulea, M.; Vasile, I.; Sima, A.; Visan, C.; Busa, J.; Pokorny, I.

2008-01-01

Performance assessment and case study outputs of the parallel SIMFAP cluster at IFIN-HH Bucharest point to its effective and reliable operation. A comparison with results on the supercomputing system in LIT-JINR Dubna adds insight on resource allocation for problem solving by parallel computing. The solution of models asking for very large numbers of knots in the discretization mesh needs the migration to high performance computing based on parallel cluster architectures. The acquisition of ready-to-use parallel computing facilities being beyond limited budgetary resources, the solution at IFIN-HH was to buy the hardware and the inter-processor network, and to implement by own efforts the open software concerning both the operating system and the parallel computing standard. The present paper provides a report demonstrating the successful solution of these tasks. The implementation of the well-known HPL (High Performance LINPACK) Benchmark points to the effective and reliable operation of the cluster. The comparison of HPL outputs obtained on parallel clusters of different magnitudes shows that there is an optimum range of the order N of the linear algebraic system over which a given parallel cluster provides optimum parallel solutions. For the SIMFAP cluster, this range can be inferred to correspond to about 1 to 2 × 10^4 linear algebraic equations. For an algorithm of polynomial complexity N^α, the task sharing among p processors within a parallel solution mainly follows an (N/p)^α behaviour under peak performance achievement. Thus, while the problem complexity remains the same, a substantial decrease of the coefficient of the leading order of the polynomial complexity is achieved. (authors)
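The closing claim about the leading-order coefficient follows directly from the scaling stated in the abstract. The short derivation below is a sketch under that stated model only; the constant $c$ and the notation $T_1$, $T_p$ for serial and parallel cost are introduced here for illustration.

```latex
% Assumption: serial cost T_1(N) = c N^alpha with a machine-dependent constant c,
% and the (N/p)^alpha task-sharing behaviour quoted in the abstract.
\begin{align}
  T_1(N) &= c\,N^{\alpha}, \\
  T_p(N) &\approx c\left(\frac{N}{p}\right)^{\alpha}
          = \frac{c}{p^{\alpha}}\,N^{\alpha}, \\
  \frac{T_p(N)}{T_1(N)} &\approx p^{-\alpha}.
\end{align}
```

The exponent $\alpha$ is unchanged, so the complexity class is the same, while the leading coefficient drops from $c$ to $c/p^{\alpha}$ within this idealized model.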
Optimization of the buffer layer of a side polished fiber slab coupler based on 3 D ADI beam propagation method

International Nuclear Information System (INIS)

Lee, Cherl Hee; Kim, Cheol; Park, Jae Hee

2008-01-01

A side polished fiber slab coupler has been widely applied as a sensor, with the advantages of short response time, a simple manufacturing process, and reusability, as well as serving as an in-line fiber component. A new type of side polished fiber sensor providing remote sensing with improved performance was also recently developed. The side polished fiber slab coupler is modeled as a fiber-to-planar-waveguide coupler with four layers: the fiber cladding, a buffer layer, the planar waveguide, and the overlay material. The coupling effects of the buffer layer of a side polished fiber slab coupler are analyzed using the 3-dimensional alternating direction implicit (ADI) beam propagation method, where the refractive index and thickness of the buffer layer were tuned for efficient light coupling. The coupling is easily tuned, and can be increased, through the refractive index and thickness of the buffer layer. This study tried to optimize the buffer layer parameters for achieving the desired light coupling and power transfer performance.
Optimized Parallel Discrete Event Simulation (PDES) for High Performance Computing (HPC) Clusters

National Research Council Canada - National Science Library

Abu-Ghazaleh, Nael

2005-01-01

The aim of this project was to study the communication subsystem performance of state of the art optimistic simulator Synchronous Parallel Environment for Emulation and Discrete-Event Simulation (SPEEDES...

Nonclassical properties of a contradirectional nonlinear optical coupler

Energy Technology Data Exchange (ETDEWEB)

Thapliyal, Kishore [Jaypee Institute of Information Technology, A-10, Sector-62, Noida, UP-201307 (India); Pathak, Anirban, E-mail: anirban.pathak@gmail.com [Jaypee Institute of Information Technology, A-10, Sector-62, Noida, UP-201307 (India); RCPTM, Joint Laboratory of Optics of Palacky University and Institute of Physics of Academy of Science of the Czech Republic, Faculty of Science, Palacky University, 17. listopadu 12, 771 46 Olomouc (Czech Republic); Sen, Biswajit [Department of Physics, Vidyasagar Teachers' Training College, Midnapore 721101 (India); Perřina, Jan [RCPTM, Joint Laboratory of Optics of Palacky University and Institute of Physics of Academy of Science of the Czech Republic, Faculty of Science, Palacky University, 17. listopadu 12, 771 46 Olomouc (Czech Republic); Department of Optics, Palacky University, 17. listopadu 12, 771 46 Olomouc (Czech Republic)

2014-10-24

We investigate the nonclassical properties of output fields propagated through a contradirectional asymmetric nonlinear optical coupler consisting of a linear waveguide and a nonlinear (quadratic) waveguide operated by second harmonic generation. In contrast to the earlier results, all the initial fields are considered weak and a completely quantum-mechanical model is used here to describe the system. Perturbative solutions of Heisenberg's equations of motion for various field modes are obtained using the Sen–Mandal technique. The obtained solutions are subsequently used to show the existence of single-mode and intermodal squeezing, single-mode and intermodal antibunching, and two-mode and multi-mode entanglement in the output of the contradirectional asymmetric nonlinear optical coupler. Further, the existence of higher order nonclassicality is also established by showing the existence of higher order antibunching, higher order squeezing and higher order entanglement. Variation of observed nonclassical characters with different coupling constants and phase mismatch is discussed. - Highlights: • Nonclassicalities in fields propagating through a directional coupler are studied. • Completely quantum-mechanical description of the coupler is provided. • Analytic solutions of Heisenberg equations of motion for various modes are obtained. • Existence of lower order and higher order entanglement is shown. • Variation of nonclassicalities with phase-mismatch and coupling constants is studied.
Implementation and performance of parallelized elegant

International Nuclear Information System (INIS)

Wang, Y.; Borland, M.

2008-01-01

The program elegant is widely used for design and modeling of linacs for free-electron lasers and energy recovery linacs, as well as storage rings and other applications. As part of a multi-year effort, we have parallelized many aspects of the code, including single-particle dynamics, wakefields, and coherent synchrotron radiation. We report on the approach used for gradual parallelization, which proved very beneficial in getting parallel features into the hands of users quickly. We also report details of parallelization of collective effects. Finally, we discuss performance of the parallelized code in various applications.
Structural and dynamic analysis of an ultra short intracavity directional coupler

Science.gov (United States)

Gravé, Ilan; Griffel, Giora; Daou, Youssef; Golan, Gadi

1997-01-01

A recently proposed intracavity directional coupler is analysed. Exact analytic expressions for important parameters such as the transmission ratio, the coupling length, and the photon lifetime are given. We show that by controlling the mirror reflectivities of the cavity, it is theoretically possible to reduce the coupling length to a zero limit. The photon lifetime, which governs the dynamic properties of the structure, sets an upper frequency limit of a few hundreds of GHz, which is well over the bandwidth limitation of microwave lumped or travelling wave electrodes. This novel family of intracavity couplers has important applications in the realization of integrated optics circuits for high-speed computing, data processing, and communication.
Integration experiences and performance studies of A COTS parallel archive systems

Energy Technology Data Exchange (ETDEWEB)

Chen, Hsing-bung [Los Alamos National Laboratory; Scott, Cody [Los Alamos National Laboratory; Grider, Gary [Los Alamos National Laboratory; Torres, Aaron [Los Alamos National Laboratory; Turley, Milton [Los Alamos National Laboratory; Sanchez, Kathy [Los Alamos National Laboratory; Bremer, John [Los Alamos National Laboratory

2010-01-01

Current and future Archive Storage Systems have been asked to (a) scale to very high bandwidths, (b) scale in metadata performance, (c) support policy-based hierarchical storage management capability, (d) scale in supporting changing needs of very large data sets, (e) support standard interface, and (f) utilize commercial-off-the-shelf(COTS) hardware. Parallel file systems have been asked to do the same thing but at one or more orders of magnitude faster in performance. Archive systems continue to move closer to file systems in their design due to the need for speed and bandwidth, especially metadata searching speeds such as more caching and less robust semantics. Currently the number of extreme highly scalable parallel archive solutions is very small especially those that will move a single large striped parallel disk file onto many tapes in parallel. We believe that a hybrid storage approach of using COTS components and innovative software technology can bring new capabilities into a production environment for the HPC community much faster than the approach of creating and maintaining a complete end-to-end unique parallel archive software solution. In this paper, we relay our experience of integrating a global parallel file system and a standard backup/archive product with a very small amount of additional code to provide a scalable, parallel archive. Our solution has a high degree of overlap with current parallel archive products including (a) doing parallel movement to/from tape for a single large parallel file, (b) hierarchical storage management, (c) ILM features, (d) high volume (non-single parallel file) archives for backup/archive/content management, and (e) leveraging all free file movement tools in Linux such as copy, move, ls, tar, etc. We have successfully applied our working COTS Parallel Archive System to the current world's first petaflop/s computing system, LANL's Roadrunner, and demonstrated its capability to address requirements of
Integration experiments and performance studies of a COTS parallel archive system

Energy Technology Data Exchange (ETDEWEB)

Chen, Hsing-bung [Los Alamos National Laboratory; Scott, Cody [Los Alamos National Laboratory; Grider, Gary [Los Alamos National Laboratory; Torres, Aaron [Los Alamos National Laboratory; Turley, Milton [Los Alamos National Laboratory; Sanchez, Kathy [Los Alamos National Laboratory; Bremer, John [Los Alamos National Laboratory

2010-06-16

Current and future Archive Storage Systems have been asked to (a) scale to very high bandwidths, (b) scale in metadata performance, (c) support policy-based hierarchical storage management capability, (d) scale in supporting changing needs of very large data sets, (e) support standard interface, and (f) utilize commercial-off-the-shelf (COTS) hardware. Parallel file systems have been asked to do the same thing but at one or more orders of magnitude faster in performance. Archive systems continue to move closer to file systems in their design due to the need for speed and bandwidth, especially metadata searching speeds such as more caching and less robust semantics. Currently the number of extreme highly scalable parallel archive solutions is very small especially those that will move a single large striped parallel disk file onto many tapes in parallel. We believe that a hybrid storage approach of using COTS components and innovative software technology can bring new capabilities into a production environment for the HPC community much faster than the approach of creating and maintaining a complete end-to-end unique parallel archive software solution. In this paper, we relay our experience of integrating a global parallel file system and a standard backup/archive product with a very small amount of additional code to provide a scalable, parallel archive. Our solution has a high degree of overlap with current parallel archive products including (a) doing parallel movement to/from tape for a single large parallel file, (b) hierarchical storage management, (c) ILM features, (d) high volume (non-single parallel file) archives for backup/archive/content management, and (e) leveraging all free file movement tools in Linux such as copy, move, ls, tar, etc. We have successfully applied our working COTS Parallel Archive System to the current world's first petaflop/s computing system, LANL's Roadrunner machine, and demonstrated its capability to address
High-Performance Psychometrics: The Parallel-E Parallel-M Algorithm for Generalized Latent Variable Models. Research Report. ETS RR-16-34

Science.gov (United States)

von Davier, Matthias

2016-01-01

This report presents results on a parallel implementation of the expectation-maximization (EM) algorithm for multidimensional latent variable models. The developments presented here are based on code that parallelizes both the E step and the M step of the parallel-E parallel-M algorithm. Examples presented in this report include item response…
Ultra-low-loss inverted taper coupler for silicon-on-insulator ridge waveguide

DEFF Research Database (Denmark)

Pu, Minhao; Liu, Liu; Ou, Haiyan

2010-01-01

An ultra-low-loss coupler for interfacing a silicon-on-insulator ridge waveguide and a single-mode fiber in both polarizations is presented. The inverted taper coupler, embedded in a polymer waveguide, is optimized for both the transverse-magnetic and transverse-electric modes through tapering the width of the silicon-on-insulator waveguide from 450 nm down to less than 15 nm applying a thermal oxidation process. Two inverted taper couplers are integrated with a 3-mm long silicon-on-insulator ridge waveguide in the fabricated sample. The measured coupling losses of the inverted taper coupler for transverse-magnetic and transverse-electric modes are ~0.36 dB and ~0.66 dB per connection, respectively.
High-performance computing — an overview

Science.gov (United States)

Marksteiner, Peter

1996-08-01

An overview of high-performance computing (HPC) is given. Different types of computer architectures used in HPC are discussed: vector supercomputers, high-performance RISC processors, various parallel computers like symmetric multiprocessors, workstation clusters, massively parallel processors. Software tools and programming techniques used in HPC are reviewed: vectorizing compilers, optimization and vector tuning, optimization for RISC processors; parallel programming techniques like shared-memory parallelism, message passing and data parallelism; and numerical libraries.
Ultralow loss, high Q, four port resonant couplers for quantum optics and photonics.

Science.gov (United States)

Rokhsari, H; Vahala, K J

2004-06-25

We demonstrate a low-loss, optical four port resonant coupler (add-drop geometry), using ultrahigh Q (>10^8) toroidal microcavities. Different regimes of operation are investigated by variation of coupling between resonator and fiber taper waveguides. As a result, waveguide-to-waveguide power transfer efficiency of 93% (0.3 dB loss) and nonresonant insertion loss of 0.02% (photonic networks.
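The efficiency and loss figures quoted in this abstract are related by the standard decibel conversion. The following is a minimal Python sketch of that conversion; the function names are illustrative assumptions and do not come from the paper.

```python
import math

def efficiency_to_loss_db(efficiency: float) -> float:
    """Convert a power transfer efficiency (0 < efficiency <= 1) to a loss in dB."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return -10.0 * math.log10(efficiency)

def loss_db_to_efficiency(loss_db: float) -> float:
    """Convert a loss in dB back to a power transfer efficiency."""
    return 10.0 ** (-loss_db / 10.0)

# 93% waveguide-to-waveguide transfer, as quoted in the abstract
print(round(efficiency_to_loss_db(0.93), 2))
```

The printed value is close to the 0.3 dB stated for the 93% transfer efficiency.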
Multitasking TORT Under UNICOS: Parallel Performance Models and Measurements

International Nuclear Information System (INIS)

Azmy, Y.Y.; Barnett, D.A.

1999-01-01

The existing parallel algorithms in the TORT discrete ordinates code were updated to function in a UNICOS environment. A performance model for the parallel overhead was derived for the existing algorithms. The largest contributors to the parallel overhead were identified and a new algorithm was developed. A parallel overhead model was also derived for the new algorithm. The results of the comparison of parallel performance models were compared to applications of the code to two TORT standard test problems and a large production problem. The parallel performance models agree well with the measured parallel overhead.
Multitasking TORT under UNICOS: Parallel performance models and measurements

International Nuclear Information System (INIS)

Barnett, A.; Azmy, Y.Y.

1999-01-01

The existing parallel algorithms in the TORT discrete ordinates code were updated to function in a UNICOS environment. A performance model for the parallel overhead was derived for the existing algorithms. The largest contributors to the parallel overhead were identified and a new algorithm was developed. A parallel overhead model was also derived for the new algorithm. The results of the comparison of parallel performance models were compared to applications of the code to two TORT standard test problems and a large production problem. The parallel performance models agree well with the measured parallel overhead
Cost-Effectiveness Comparison of Coupler Designs of Wireless Power Transfer for Electric Vehicle Dynamic Charging

Directory of Open Access Journals (Sweden)

Weitong Chen

2016-11-01

This paper presents a cost-effectiveness comparison of coupler designs for wireless power transfer (WPT), meant for electric vehicle (EV) dynamic charging. The design comparison of three common types of couplers is first based on the raw material cost, output power, transfer efficiency, tolerance of horizontal offset, and flux density. Then, the optimal cost-effectiveness combination is selected for EV dynamic charging. The corresponding performances of the proposed charging system are compared and analyzed by both simulation and experimentation. The results verify the validity of the proposed dynamic charging system for EVs.
Design of a compact polarization beam splitter based on a deformed photonic crystal directional coupler

International Nuclear Information System (INIS)

Ren Gang; Zheng Wanhua; Wang Ke; Du Xiaoyu; Xing Mingxin; Chen Lianghui

2008-01-01

In this paper a compact polarization beam splitter based on a deformed photonic crystal directional coupler is designed and simulated. The transverse-electric (TE) guided mode and transverse-magnetic (TM) guided mode are split due to different guiding mechanisms. The effect of the shape deformation of the air holes on the coupler is studied. It is found that the coupling strength of the coupled waveguides is strongly enhanced by introducing elliptical air holes, which reduce the device length to less than 18.5 μm. A finite-difference time-domain simulation is performed to evaluate the performance of the device, and the extinction ratios for both TE and TM polarized light are higher than 20 dB. (classical areas of phenomenology)
Large-Area Binary Blazed Grating Coupler between Nanophotonic Waveguide and LED

Directory of Open Access Journals (Sweden)

Hongqiang Li

2014-01-01

A large-area binary blazed grating coupler for the arrayed waveguide grating (AWG) demodulation integrated microsystem on silicon-on-insulator (SOI) was designed for the first time. Through the coupler, light can be coupled into the SOI waveguide from the InP-based C-band LED for the AWG demodulation integrated microsystem to function. Both the length and width of the grating coupler are 360 μm, as large as the InP-based C-band LED light emitting area in the system. The coupler was designed and optimized based on the finite difference time domain method. When the incident angle of the light source is 0°, the coupling efficiency of the binary blazed grating is 40.92%, and the 3 dB bandwidth is 72 nm at a wavelength of 1550 nm.
Studying quick coupler efficiency in working attachment system of single-bucket power shovel

Science.gov (United States)

Duganova, E. V.; Zagorodniy, N. A.; Solodovnikov, D. N.; Korneyev, A. S.

2018-03-01

A prototype of a quick-disconnect connector (quick coupler) with an unloaded retention mechanism was developed from the analysis of typical quick couplers used as intermediate elements for power shovels of different manufacturers. A method is presented, allowing building a simulation model of the quick coupler prototype as an alternative to physical modeling for further studies.
Low insertion loss SOI microring resonator integrated with nano-taper couplers

DEFF Research Database (Denmark)

Pu, Minhao; Frandsen, Lars Hagedorn; Ou, Haiyan

2009-01-01

We demonstrate a microring resonator working at TM mode integrated with nano-taper couplers with 3.6 dB total insertion loss. The measured insertion loss of the nano-taper coupler was only 1.3 dB for TM mode.
Homemade Buckeye-Pi: A Learning Many-Node Platform for High-Performance Parallel Computing

Science.gov (United States)

Amooie, M. A.; Moortgat, J.

2017-12-01

We report on the "Buckeye-Pi" cluster, the supercomputer developed in The Ohio State University School of Earth Sciences from 128 inexpensive Raspberry Pi (RPi) 3 Model B single-board computers. Each RPi is equipped with a fast quad-core 1.2 GHz ARMv8 64-bit processor, 1 GB of RAM, and a 32 GB microSD card for local storage. Therefore, the cluster has a total RAM of 128 GB that is distributed on the individual nodes and a flash capacity of 4 TB with 512 processors, while it benefits from low power consumption, easy portability, and low total cost. The cluster uses the Message Passing Interface protocol to manage the communications between the nodes. These features render our platform the most powerful RPi supercomputer to date and suitable for educational applications in high-performance computing (HPC) and handling of large datasets. In particular, we use the Buckeye-Pi to implement optimized parallel codes in our in-house simulator for subsurface media flows with the goal of achieving a massively parallelized scalable code. We present benchmarking results for the computational performance across various numbers of RPi nodes. We believe our project could inspire scientists and students to consider the proposed unconventional cluster architecture as a mainstream and feasible learning platform for challenging engineering and scientific problems.
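The aggregate figures in this abstract follow directly from the per-node specification. A short Python sketch of the arithmetic is given below; the helper name `cluster_totals` is hypothetical, and the conversion assumes 1 TB = 1024 GB.

```python
# Per-node figures quoted in the abstract
NODES = 128
CORES_PER_NODE = 4        # quad-core ARMv8
RAM_GB_PER_NODE = 1
FLASH_GB_PER_NODE = 32    # microSD card

def cluster_totals(nodes, cores, ram_gb, flash_gb):
    """Return aggregate (cores, RAM in GB, flash in TB) for identical nodes.

    Assumes 1 TB = 1024 GB.
    """
    return nodes * cores, nodes * ram_gb, nodes * flash_gb / 1024

total_cores, total_ram_gb, total_flash_tb = cluster_totals(
    NODES, CORES_PER_NODE, RAM_GB_PER_NODE, FLASH_GB_PER_NODE
)
print(total_cores, total_ram_gb, total_flash_tb)  # 512 128 4.0
```

The result matches the 512 processors, 128 GB of RAM, and 4 TB of flash stated for the cluster.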
All-optical switching in lithium niobate directional couplers with cascaded nonlinearity

NARCIS (Netherlands)

Schiek, R.; Baek, Y.; Krijnen, Gijsbertus J.M.; Stegeman, G.I.; Baumann, I.; Sohler, W.

1996-01-01

We report on intensity-dependent switching in lithium niobate directional couplers. Large nonlinear phase shifts that are due to cascading detune the coupling between the coupler branches, which makes all-optical switching possible. Depending on the input intensity, the output could be switched
A high-speed linear algebra library with automatic parallelism

Science.gov (United States)

Boucher, Michael L.

1994-01-01

Parallel or distributed processing is key to getting the highest performance from workstations. However, designing and implementing efficient parallel algorithms is difficult and error-prone. It is even more difficult to write code that is both portable to and efficient on many different computers. Finally, it is harder still to satisfy the above requirements and include the reliability and ease of use required of commercial software intended for use in a production environment. As a result, the application of parallel processing technology to commercial software has been extremely small even though there are numerous computationally demanding programs that would significantly benefit from application of parallel processing. This paper describes DSSLIB, which is a library of subroutines that perform many of the time-consuming computations in engineering and scientific software. DSSLIB combines the high efficiency and speed of parallel computation with a serial programming model that eliminates many undesirable side-effects of typical parallel code. The result is a simple way to incorporate the power of parallel processing into commercial software without compromising maintainability, reliability, or ease of use. This gives significant advantages over less powerful non-parallel entries in the market.
Event parallelism: Distributed memory parallel computing for high energy physics experiments

International Nuclear Information System (INIS)

Nash, T.

1989-05-01

This paper describes the present and expected future development of distributed memory parallel computers for high energy physics experiments. It covers the use of event parallel microprocessor farms, particularly at Fermilab, including both ACP multiprocessors and farms of MicroVAXES. These systems have proven very cost effective in the past. A case is made for moving to the more open environment of UNIX and RISC processors. The 2nd Generation ACP Multiprocessor System, which is based on powerful RISC systems, is described. Given the promise of still more extraordinary increases in processor performance, a new emphasis on point to point, rather than bussed, communication will be required. Developments in this direction are described. 6 figs
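Event parallelism relies on each event being processed independently of all others, so events can be farmed out to workers with no communication between them. The Python sketch below illustrates only that structure. The `reconstruct` function and the toy event data are hypothetical, and a thread pool stands in for the separate processors of a real farm (Python threads do not speed up CPU-bound work).

```python
from concurrent.futures import ThreadPoolExecutor

def reconstruct(event):
    """Hypothetical per-event work: sum the hit energies of one event."""
    return sum(event["hits"])

def process_events(events, workers=4):
    """Farm independent events out to a pool of workers, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(reconstruct, events))

# Toy data: each event is independent of every other event
events = [{"id": i, "hits": [i, i + 1, i + 2]} for i in range(8)]
print(process_events(events))
```

Because no worker needs data from another, the only shared steps are distributing the events and collecting the results.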

Event parallelism: Distributed memory parallel computing for high energy physics experiments

International Nuclear Information System (INIS)

Nash, T.

1989-01-01

This paper describes the present and expected future development of distributed memory parallel computers for high energy physics experiments. It covers the use of event parallel microprocessor farms, particularly at Fermilab, including both ACP multiprocessors and farms of MicroVAXES. These systems have proven very cost effective in the past. A case is made for moving to the more open environment of UNIX and RISC processors. The 2nd Generation ACP Multiprocessor System, which is based on powerful RISC systems, is described. Given the promise of still more extraordinary increases in processor performance, a new emphasis on point to point, rather than bussed, communication will be required. Developments in this direction are described. (orig.)
Event parallelism: Distributed memory parallel computing for high energy physics experiments

Science.gov (United States)

Nash, Thomas

1989-12-01

This paper describes the present and expected future development of distributed memory parallel computers for high energy physics experiments. It covers the use of event parallel microprocessor farms, particularly at Fermilab, including both ACP multiprocessors and farms of MicroVAXES. These systems have proven very cost effective in the past. A case is made for moving to the more open environment of UNIX and RISC processors. The 2nd Generation ACP Multiprocessor System, which is based on powerful RISC systems, is described. Given the promise of still more extraordinary increases in processor performance, a new emphasis on point to point, rather than bussed, communication will be required. Developments in this direction are described.
Analysis of parallel computing performance of the code MCNP

International Nuclear Information System (INIS)

Wang Lei; Wang Kan; Yu Ganglin

2006-01-01

Parallel computing can effectively reduce the running time of the code MCNP. With MPI message-passing software, MCNP5 can run in parallel on a PC cluster with the Windows operating system. The parallel computing performance of MCNP is influenced by factors such as the type, complexity level, and parameter configuration of the computing problem. This paper analyzes the parallel computing performance of MCNP with regard to these factors and gives measures to improve it. (authors)
Multipacting in a coaxial coupler with bias voltage for SRF operation with a large beam current

Science.gov (United States)

Liu, Z.-K.; Wang, Ch.; Chang, F.-Y.; Chang, L.-H.; Chang, M.-H.; Chen, L.-J.; Chung, F.-T.; Lin, M.-C.; Lo, C.-H.; Tsai, C.-L.; Tsai, M.-H.; Yeh, M.-S.; Yu, T.-C.

2016-09-01

A superconducting radio-frequency (SRF) module is commonly used for a high-energy accelerator; its purpose is to provide energy to the particle beam. Because of the low power dissipation and smaller impedance of a higher-order mode for this module, it can provide more power to the particle beam with better stability by reducing the coupled-bunch instability. An RF coupler is necessary to transfer the high power from an RF generator to the cavity. A coupler of coaxial type is a common choice. With high-power operation, it might suffer from multipacting, which is a resonance phenomenon due to re-emission of secondary electrons. Applying a bias voltage between inner and outer conductors of the coaxial coupler might increase or decrease the strength of the multipacting effect. We studied the effect of a bias voltage on multipacting using numerical simulation to track the motion of the electrons. The simulation results and an application for SRF operation with a large beam current are presented in this paper.
High Power Tm3+-Doped Fiber Lasers Tuned by a Variable Reflective Output Coupler

Directory of Open Access Journals (Sweden)

Yulong Tang

2008-01-01

Wide wavelength tuning by a variable reflective output coupler is demonstrated in high-power double-clad Tm3+-doped silica fiber lasers diode-pumped at ∼790 nm. Varying the output coupling from 96% to 5%, the laser wavelength is tuned over a range of 106 nm from 1949 to 2055 nm. The output power exceeds 20 W over 90-nm range and the maximum output power is 32 W at 1949 nm for 51-W launched pump power, corresponding to a slope efficiency of ∼70%. Assisted with different fiber lengths, the tuning range is expanded to 240 nm from 1866 to 2107 nm with the output power larger than 10 W.
The Performance of an Object-Oriented, Parallel Operating System

Directory of Open Access Journals (Sweden)

David R. Kohr, Jr.

1994-01-01

The nascent and rapidly evolving state of parallel systems often leaves parallel application developers at the mercy of inefficient, inflexible operating system software. Given the relatively primitive state of parallel systems software, maximizing the performance of parallel applications not only requires judicious tuning of the application software, but occasionally, the replacement of specific system software modules with others that can more readily respond to the imposed pattern of resource demands. To assess the feasibility of application and performance tuning via malleable system software and to understand the performance penalties for detailed operating system performance data capture, we describe a set of performance instrumentation techniques for parallel, object-oriented operating systems and a set of performance experiments with Choices, an experimental, object-oriented operating system designed for use with parallel systems. These performance experiments show that (a) the performance overhead for operating system data capture is modest, (b) the penalty for malleable, object-oriented operating systems is negligible, but (c) techniques are needed to strictly enforce adherence of implementation to design if operating system modules are to be replaced.
Broadband polymer microstructured THz fiber coupler with downdoped cores

DEFF Research Database (Denmark)

Nielsen, Kristian; Rasmussen, Henrik K.; Bang, Ole

2010-01-01

We demonstrate a broadband THz directional coupler based on a dual core photonic crystal fiber (PCF) design with mechanically down-doped core regions. For a center frequency of 1.3 THz we demonstrate a bandwidth of 0.65 THz.
Design of the new couplers for C-ADS RFQ

Science.gov (United States)

Shi, Ai-Min; Sun, Lie-Peng; Zhang, Zhou-Li; Xu, Xian-Bo; Shi, Long-Bo; Li, Chen-Xing; Wang, Wen-Bin

2015-04-01

A new special coupler with a bowl-shaped ceramic window has been simulated and constructed for a proton linear accelerator, the Chinese Accelerator Driven System (C-ADS), at the Institute of Modern Physics (IMP), and continuous wave (CW) beam commissioning through a four-meter long radio frequency quadrupole (RFQ) was completed by the end of July 2014. During the conditioning and beam experiments, some problems, such as sparking and thermal issues, gradually emerged. Finally, two new couplers were tested successfully with almost 110 kW of CW power and 120 kW in pulsed mode, respectively. The 10 mA intensity beam experiments have now been completed, and the couplers had no thermal or electro-magnetic problems during operation. The detailed design and results are presented in the paper. Supported by Strategic Priority Research Program of Chinese Academy of Sciences (XDA03020500)
Distributed grating-assisted coupler for optical all-dielectric electron accelerator

Directory of Open Access Journals (Sweden)

Zhiyu Zhang

2005-07-01

A Bragg waveguide consisting of multiple dielectric layers with alternating index of refraction is an excellent option for forming an electron accelerating structure powered by high-power laser sources. It provides confinement of a synchronous speed-of-light mode with extremely low loss. However, the laser field cannot be coupled into the structure collinearly with the electron beam. There are three requirements in designing an input coupler for a Bragg electron accelerator: side coupling, selective mode excitation, and high coupling efficiency. We present a side-coupling scheme using a distributed grating-assisted coupler to inject the laser power into the waveguide. Side coupling is achieved by a grating with a period on the order of an optical wavelength. The phase matching condition results in resonance coupling, thus providing selective mode excitation capability. The coupling efficiency is limited by profile matching between the outgoing beam and the incoming beam, which normally has a Gaussian profile. We demonstrate a nonuniform distributed grating structure generating an outgoing beam with a Gaussian profile, thereby increasing the coupling efficiency.
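The phase matching mentioned in this abstract can be illustrated with the textbook grating-coupler relation. This is a generic sketch and not necessarily the formulation used in the paper; the symbols $\beta$ (propagation constant of the guided mode), $n_c$ (index of the medium the laser arrives through), $\theta$ (incidence angle from the grating normal), $m$ (diffraction order), and $\Lambda$ (grating period) are assumptions introduced here.

```latex
% Generic grating phase-matching condition (assumed textbook form)
\beta = k_0 n_c \sin\theta + \frac{2\pi m}{\Lambda},
\qquad k_0 = \frac{2\pi}{\lambda}

% For a synchronous speed-of-light mode, \beta = k_0, so
\Lambda = \frac{m\,\lambda}{1 - n_c \sin\theta}

% At normal incidence (\theta = 0) and first order (m = 1):
\Lambda = \lambda
```

Under these assumptions the required period is comparable to the optical wavelength, in line with the abstract's statement, and only a mode whose $\beta$ satisfies the condition is excited resonantly.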
Comparison of coaxial higher order mode couplers for the CERN Superconducting Proton Linac study

Directory of Open Access Journals (Sweden)

K. Papke

2017-06-01

Higher order modes (HOMs) may affect beam stability and refrigeration requirements of superconducting proton linacs such as the Superconducting Proton Linac, which is studied at CERN. Under certain conditions beam-induced HOMs can accumulate sufficient energy to destabilize the beam or quench the superconducting cavities. In order to limit these effects, CERN considers the use of coaxial HOM couplers on the cutoff tubes of the 5-cell superconducting cavities. These couplers consist of resonant antennas shaped as loops or probes, which are designed to couple to potentially dangerous modes while sufficiently rejecting the fundamental mode. In this paper, the design process is presented and a comparison is made between various designs for the high-beta SPL cavities, which operate at 704.4 MHz. The rf and thermal behavior as well as mechanical aspects are discussed. In order to verify the designs, a rapid prototype for the favored coupler was fabricated and characterized on a low-power test-stand.
Comparison of coaxial higher order mode couplers for the CERN Superconducting Proton Linac study

CERN Document Server

AUTHOR|(CDS)2085329; Gerigk, Frank; Van Rienen, Ursula

2017-01-01

Higher order modes (HOMs) may affect beam stability and refrigeration requirements of superconducting proton linacs such as the Superconducting Proton Linac, which is studied at CERN. Under certain conditions beam-induced HOMs can accumulate sufficient energy to destabilize the beam or quench the superconducting cavities. In order to limit these effects, CERN considers the use of coaxial HOM couplers on the cutoff tubes of the 5-cell superconducting cavities. These couplers consist of resonant antennas shaped as loops or probes, which are designed to couple to potentially dangerous modes while sufficiently rejecting the fundamental mode. In this paper, the design process is presented and a comparison is made between various designs for the high-beta SPL cavities, which operate at 704.4 MHz. The rf and thermal behavior as well as mechanical aspects are discussed. In order to verify the designs, a rapid prototype for the favored coupler was fabricated and characterized on a low-power test-stand.
Fiber pigtailed thin wall capillary coupler for excitation of microsphere WGM resonator.

Science.gov (United States)

Wang, Hanzheng; Lan, Xinwei; Huang, Jie; Yuan, Lei; Kim, Cheol-Woon; Xiao, Hai

2013-07-01

In this paper, we demonstrate a fiber pigtailed thin wall capillary coupler for excitation of Whispering Gallery Modes (WGMs) of microsphere resonators. The coupler is made by fusion-splicing an optical fiber with a capillary tube and consequently etching the capillary wall to a thickness of a few microns. Light is coupled through the peripheral contact between inserted microsphere and the etched capillary wall. The coupling efficiency as a function of the wall thickness was studied experimentally. WGM resonance with a Q-factor of 1.14 × 10(4) was observed using a borosilicate glass microsphere with a diameter of 71 μm. The coupler operates in the reflection mode and provides a robust mechanical support to the microsphere resonator. It is expected that the new coupler may find broad applications in sensors, optical filters and lasers.
An Overview of High-performance Parallel Big Data transfers over multiple network channels with Transport Layer Security (TLS) and TLS plus Perfect Forward Secrecy (PFS)

Energy Technology Data Exchange (ETDEWEB)

Fang, Chin [SLAC National Accelerator Lab., Menlo Park, CA (United States); Cottrell, R. A. [SLAC National Accelerator Lab., Menlo Park, CA (United States)

2015-05-06

This Technical Note provides an overview of high-performance parallel Big Data transfers with and without encryption for data in-transit over multiple network channels. It shows that with the parallel approach, it is feasible to carry out high-performance parallel "encrypted" Big Data transfers without serious impact to throughput. But other impacts, e.g. the energy-consumption part should be investigated. It also explains our rationales of using a statistics-based approach for gaining understanding from test results and for improving the system. The presentation is of high-level nature. Nevertheless, at the end we will pose some questions and identify potentially fruitful directions for future work.
A study of polaritonic transparency in couplers made from excitonic materials

Energy Technology Data Exchange (ETDEWEB)

Singh, Mahi R.; Racknor, Chris [Department of Physics and Astronomy, Western University, London, Ontario N6A 3K7 (Canada)

2015-03-14

We have studied light-matter interaction in quantum dot and exciton-polaritonic coupler hybrid systems. The coupler is made by embedding two slabs of an excitonic material (CdS) into a host excitonic material (ZnO). An ensemble of non-interacting quantum dots is doped in the coupler. The bound exciton-polariton states are calculated in the coupler using the transfer matrix method in the presence of the coupling between the external light (photons) and excitons. These bound exciton-polaritons interact with the excitons present in the quantum dots, and the coupler acts as a reservoir. The Schrödinger equation method has been used to calculate the absorption coefficient in quantum dots. It is found that when the distance between the two slabs (CdS) is greater than the decay length of evanescent waves, the absorption spectrum has two peaks and one minimum. The minimum corresponds to a transparent state in the system. However, when the distance between the slabs is smaller than the decay length of evanescent waves, the absorption spectrum has three peaks and two transparent states. In other words, one transparent state can be switched to two transparent states when the distance between the two layers is modified. This could be achieved by applying stress and strain fields. It is also found that transparent states can be switched on and off by applying an external control laser field.
LiNbO3:Ti directional-coupler modulators for high-bandwidth, single-shot instrumentation systems operating at 800 nm

International Nuclear Information System (INIS)

Lowry, M.; Jander, D.; Lancaster, G.; Kwiat, P.; McWright, G.; Peterson, R.T.; Tindall, W.; Roeske, F.

1987-01-01

The authors update their work on optical directional-coupler modulators (ODCMs) for single-shot, analog instrumentation systems operating at ~800 nm. They can now fabricate directional-coupler devices that have one input and two output pigtails with insertion losses of 8.9 dB on average. Data for the ODCMs indicate an impulse response of less than 40 ps. They have implemented these devices in an ultrafast, x-ray measurement system. They discuss their data from this implementation and its implications for continued ODCM development.
Parallel phase model : a programming model for high-end parallel machines with manycores.

Energy Technology Data Exchange (ETDEWEB)

Wu, Junfeng (Syracuse University, Syracuse, NY); Wen, Zhaofang; Heroux, Michael Allen; Brightwell, Ronald Brian

2009-04-01

This paper presents a parallel programming model, Parallel Phase Model (PPM), for next-generation high-end parallel machines based on a distributed memory architecture consisting of a networked cluster of nodes with a large number of cores on each node. PPM has a unified high-level programming abstraction that facilitates the design and implementation of parallel algorithms to exploit both the parallelism of the many cores and the parallelism at the cluster level. The programming abstraction will be suitable for expressing both fine-grained and coarse-grained parallelism. It includes a few high-level parallel programming language constructs that can be added as an extension to an existing (sequential or parallel) programming language such as C; and the implementation of PPM also includes a light-weight runtime library that runs on top of an existing network communication software layer (e.g. MPI). Design philosophy of PPM and details of the programming abstraction are also presented. Several unstructured applications that inherently require high-volume random fine-grained data accesses have been implemented in PPM with very promising results.
Flat-Passband 3 × 3 Interleaving Filter Designed With Optical Directional Couplers in Lattice Structure

Science.gov (United States)

Wang, Qi Jie; Zhang, Ying; Soh, Yeng Chai

2005-12-01

This paper presents a novel lattice optical delay-line circuit using 3 × 3 directional couplers to implement three-port optical interleaving filters. It is shown that the proposed circuit can deliver three channels of 2π/3 phase-shifted interleaving transmission spectra if the coupling ratios of the last two directional couplers are selected appropriately. The other performance requirements of an optical interleaver can be achieved by designing the remaining part of the lattice circuit. A recursive synthesis design algorithm is developed to calculate the design parameters of the lattice circuit that will yield the desired filter response. As illustrative examples, interleavers with maximally flat-top passband transmission and with given transmission performance on passband ripples and passband bandwidth, respectively, are designed to verify the effectiveness of the proposed design scheme.
Investigations and Simulations of All optical Switches in linear state Based on Photonic Crystal Directional Coupler

Directory of Open Access Journals (Sweden)

S. Maktoobi

2014-10-01

Switching is a principal process in digital computers and signal processing systems. The growth of optical signal processing systems draws particular attention to the design of ultra-fast optical switches. In this paper, all-optical switches in the linear state based on a photonic crystal directional coupler are analyzed and simulated. Among different methods, the finite difference time domain (FDTD) method is preferable and is used. We have studied the application of photonic crystal lattices, the physics of optical switching and the photonic crystal directional coupler. The electric field intensity and the output power, two factors for improving the switching performance and the device efficiency, are investigated and simulated. All simulations are performed with COMSOL software.
Design and fabrication of multimode interference couplers based on digital micro-mirror system

Science.gov (United States)

Wu, Sumei; He, Xingdao; Shen, Chenbo

2008-03-01

Multimode interference (MMI) couplers, based on the self-imaging effect (SIE), are accepted popularly in integrated optics. According to the importance of MMI devices, in this paper, we present a novel method to design and fabricate MMI couplers. A technology of maskless lithography to make MMI couplers based on a smart digital micro-mirror device (DMD) system is proposed. A 1×4 MMI device is designed as an example, which shows the present method is efficient and cost-effective.
A high performance image processing platform based on CPU-GPU heterogeneous cluster with parallel image reconstructions for micro-CT

International Nuclear Information System (INIS)

Ding Yu; Qi Yujin; Zhang Xuezhu; Zhao Cuilan

2011-01-01

In this paper, we report the development of a high-performance image processing platform based on a CPU-GPU heterogeneous cluster. Currently, it consists of a Dell Precision T7500 and an HP XW8600 workstation with a parallel programming and runtime environment that uses the message-passing interface (MPI) and CUDA (Compute Unified Device Architecture). We succeeded in developing parallel image processing techniques for 3D image reconstruction in X-ray micro-CT imaging. The results show that a GPU is about 194 times faster than a single CPU, and the CPU-GPU cluster is about 46 times faster than the CPU cluster. These results meet the requirements of rapid 3D image reconstruction and real-time image display. In conclusion, a CPU-GPU heterogeneous cluster is an effective way to build a high-performance image processing platform. (authors)

Development of 20 kW input power coupler for 1.3 GHz ERL main linac. Component test at 30 kW IOT test stand

International Nuclear Information System (INIS)

Sakai, Hiroshi; Umemori, Kensei; Sakanaka, Shogo; Takahashi, Takeshi; Furuya, Takaaki; Shinoe, Kenji; Ishii, Atsushi; Nakamura, Norio; Sawamura, Masaru

2009-01-01

We started to develop an input coupler for a 1.3 GHz ERL superconducting cavity. The required input power is about 20 kW for a cavity acceleration field of 20 MV/m and a beam current of 100 mA in energy recovery operation. The input coupler is based on the STF-BL input coupler, with some modifications applied to the design for CW 20 kW power operation. We fabricated input coupler components such as ceramic windows and bellows and carried out high-power tests of the components using a 30 kW IOT power source and a test stand constructed for the high-power test. In this report, we mainly describe the results of the high-power test of the ceramic window and bellows. (author)
Free electron laser variable bridge coupler

International Nuclear Information System (INIS)

Spalek, G.; Billen, J.H.; Garcia, J.A.; McMurry, D.E.; Harnsborough, L.D.; Giles, P.M.; Stevens, S.B.

1985-01-01

The Los Alamos free-electron laser (FEL) is being modified to test a scheme for recovering most of the power in the residual 20-MeV electron beam by decelerating the microbunches in a linear standing-wave accelerator and using the recovered energy to accelerate new beam. A variable-coupler low-power model that resonantly couples the accelerator and decelerator structures has been built and tested. By mixing the TE101 and TE102 modes, this device permits continuous variation of the decelerator fields relative to the accelerator fields through a range of 1:1 to 1:2.5. Phase differences between the two structures are kept below 1° and are independent of power-flow direction. The rf power is also fed to the two structures through this coupling device. Measurements were also made on a three-post-loaded variable coupler that is a promising candidate for the same task.
Parallel Libraries to support High-Level Programming

DEFF Research Database (Denmark)

Larsen, Morten Nørgaard

and the Microsoft .NET framework. Normally, one would not directly think of the .NET framework when talking about scientific applications, but Microsoft has in the last couple of versions of .NET introduced a number of tools for writing parallel and high performance code. The first section examines how programmers can...
Proton damage in linear and digital opto-couplers; Effets des protons sur des optocoupleurs lineaires et numeriques

Energy Technology Data Exchange (ETDEWEB)

Johnston, A.; Rax, B.G. [California Institute of Technology, Jet Propulsion Laboratory, Pasadena (United States)

1999-07-01

This paper discusses proton degradation of linear and digital opto-couplers. One obvious way to harden opto-coupler technologies is to select LEDs (light emitting diodes) that are more resistant to displacement damage. The degradation of a commercial linear opto-coupler from one manufacturer is compared directly with that of a modified version of the same device that uses a different LED technology. Other factors, including degradation of optical photoresponse and transistor gain, are also discussed, along with basic comparisons of digital and analog opto-couplers. The experimental work was carried out with 50 MeV protons. Three underlying factors contribute to opto-coupler degradation. The most important factor is LED degradation; it is possible to select opto-couplers with double-heterojunction LEDs that are inherently more resistant to displacement damage. The second factor is gain degradation, which is particularly important for opto-couplers with sensitive LEDs because the light output decreases so much at low radiation levels. The third factor, optical photoresponse, is the largest contribution to CTR (current transfer ratio) degradation for opto-couplers with improved LED hardness. Photoresponse degradation depends on wavelength because the absorption coefficient is wavelength dependent. (A.C.)
Improving estimations of greenhouse gas transfer velocities by atmosphere-ocean couplers in Earth-System and regional models

Science.gov (United States)

Vieira, V. M. N. C. S.; Sahlée, E.; Jurus, P.; Clementi, E.; Pettersson, H.; Mateus, M.

2015-09-01

Earth-System and regional models, forecasting climate change and its impacts, simulate atmosphere-ocean gas exchanges using classical yet overly simple generalizations that rely on wind speed as the sole mediator while neglecting factors such as sea-surface agitation, atmospheric stability, current drag with the bottom, rain and surfactants. These factors have proved fundamental for accurate estimates, particularly in the coastal ocean, where a significant part of the atmosphere-ocean greenhouse gas exchange occurs. We include several of these factors in a customizable algorithm proposed as the basis of novel couplers of the atmospheric and oceanographic model components. We tested its performance with measured and simulated data from the European coastal ocean and found that our algorithm forecasts greenhouse gas exchanges largely different from those forecast by the generalization currently in use. Our algorithm allows vectorized calculation and parallel processing, improving computational speed roughly 12× on a single CPU core, an essential feature for Earth-System model applications.
Design of the 1.5 MW, 30-96 MHz ultra-wideband 3 dB high power hybrid coupler for Ion Cyclotron Resonance Frequency (ICRF) heating in fusion grade reactor

Energy Technology Data Exchange (ETDEWEB)

Yadav, Rana Pratap, E-mail: ranayadav97@gmail.com; Kumar, Sunil; Kulkarni, S. V. [Thapar University, Patiala, Punjab 147004, India and Institute for Plasma Research, Gandhinagar 382428 (India)

2016-01-15

The design and development procedure of a strip-line based 1.5 MW, 30-96 MHz, ultra-wideband high power 3 dB hybrid coupler is presented and its applicability to ion cyclotron resonance heating (ICRH) in tokamaks is discussed. For high power handling capability, the spacing between the conductors and ground needs to be very large. Hence other structural parameters such as strip width, strip thickness, coupling gap, and junction also become large; they can be increased only up to an optimum limit, beyond which constraints such as fabrication tolerance, discontinuities, and excitation of higher TE and TM modes become prominent and significantly deteriorate the desired parameters of the coupled-line system. In the designed hybrid coupler, two 8.34 dB coupled lines are connected in tandem to obtain the desired coupling of 3 dB, and air is used as the dielectric. The spacing between ground and conductors is taken as 0.164 m for 1.5 MW power handling capability. To achieve the desired spacing, each of the 8.34 dB segments is designed with inner dimensions of 3.6 × 1.0 × 40 cm, where the constraints have been realized, compensated, and applied in the design of the 1.5 MW hybrid coupler presented in the paper.
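The tandem arithmetic quoted in this abstract (two 8.34 dB sections giving 3 dB) can be checked with a short calculation. The sketch below assumes ideal, lossless, matched quadrature couplers, for which two identical sections with voltage coupling factor k = sin θ connected in tandem behave like one coupler with k = sin 2θ. The function names are illustrative and are not taken from the paper.

```python
import math

def db_to_voltage_coupling(coupling_db):
    """Convert a coupling level in dB (positive number) to a voltage ratio k."""
    return 10 ** (-coupling_db / 20)

def tandem_coupling_db(section_db):
    """Overall coupling (dB) of two identical ideal couplers in tandem.

    Assumption: ideal lossless quadrature sections, so k = sin(theta)
    for one section and k_total = sin(2 * theta) for the tandem pair.
    """
    k = db_to_voltage_coupling(section_db)
    theta = math.asin(k)
    k_total = math.sin(2 * theta)
    return -20 * math.log10(k_total)

# Two 8.34 dB sections in tandem come out very close to 3 dB.
print(f"{tandem_coupling_db(8.34):.2f} dB")
```

Under these idealized assumptions, losses, discontinuities and the frequency dependence of the coupled lines over the 30-96 MHz band are ignored; the calculation only confirms the mid-band coupling value.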
A calculation method for RF couplers design based on numerical simulation by microwave studio

International Nuclear Information System (INIS)

Wang Rong; Pei Yuanji; Jin Kai

2006-01-01

A numerical simulation method for coupler design is proposed. It is based on the matching procedure for the 2π/3 structure given by Dr. R.L. Kyhl. The Microwave Studio EigenMode Solver is used for the numerical simulation. The simulation of a coupler has been completed with this method and the simulation data are compared with experimental measurements. The results show that this numerical simulation method is feasible for coupler design. (authors)
Fabrication of LD-3 Polymer Directional Couplers

National Research Council Canada - National Science Library

Chen, Ray T

1998-01-01

.... LD-3 polymer directional couplers are designed and fabricated to operate at 1.3 microns. Waveguide propagation losses, device characterization, demonstration of cross coupling and packaged device pictures are presented in this final report.
ANSYS modeling of thermal contraction of SPL HOM couplers during cool-down

CERN Document Server

Papke, K

2016-01-01

During the cool-down, the HOM coupler as well as the cavity inside the cryomodule experience thermal contraction. For most materials between room temperature and liquid helium temperatures, the changes in dimension are on the order of a few tenths of a percent change in volume. This paper presents the effect of thermal contraction on the RF transmission behavior of HOM couplers, and in particular the influence on the notch filter. Furthermore, the simulation process with APDL is explained in detail. Conclusions are drawn about the necessary tuning range of the notch filter, which is especially a concern for couplers with only a notch filter.
Preventing distortion of quick couplers of hoses of central pipe lines--a cheap and simple method.

Directory of Open Access Journals (Sweden)

Kamath S

1995-01-01

A cheap and practical approach of steel chains attached to the station outlet quick couplers helps in maintaining the shape of the quick couplers and ensures their effective functioning over a long period of time and avoids mishap of connection due to damage of these couplers.
A parallel solution for high resolution histological image analysis.

Science.gov (United States)

Bueno, G; González, R; Déniz, O; García-Rojo, M; González-García, J; Fernández-Carrobles, M M; Vállez, N; Salido, J

2012-10-01

This paper describes a general methodology for developing parallel image processing algorithms based on message passing for high resolution images (on the order of several Gigabytes). These algorithms have been applied to histological images and must be executed on massively parallel processing architectures. Advances in new technologies for complete slide digitalization in pathology have been combined with developments in biomedical informatics. However, the efficient use of these digital slide systems is still a challenge. The image processing that these slides are subject to is still limited both in terms of data processed and processing methods. The work presented here focuses on the need to design and develop parallel image processing tools capable of obtaining and analyzing the entire gamut of information included in digital slides. Tools have been developed to assist pathologists in image analysis and diagnosis, and they cover low and high-level image processing methods applied to histological images. Code portability, reusability and scalability have been tested by using the following parallel computing architectures: distributed memory with massive parallel processors and two networks, INFINIBAND and Myrinet, composed of 17 and 1024 nodes respectively. The parallel framework proposed is a flexible, high-performance solution, and it shows that the efficient processing of digital microscopic images is possible and may offer important benefits to pathology laboratories. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers

Directory of Open Access Journals (Sweden)

Mark James Abraham

2015-09-01

GROMACS is one of the most widely used open-source and free software codes in chemistry, used primarily for dynamical simulations of biomolecules. It provides a rich set of calculation types, preparation and analysis tools. Several advanced techniques for free-energy calculations are supported. In version 5, it reaches new performance heights through several new and enhanced parallelization algorithms. These work on every level: SIMD registers inside cores, multithreading, heterogeneous CPU–GPU acceleration, state-of-the-art 3D domain decomposition, and ensemble-level parallelization through built-in replica exchange and the separate Copernicus framework. The latest best-in-class compressed trajectory storage format is supported.
Performance modeling of parallel algorithms for solving neutron diffusion problems

International Nuclear Information System (INIS)

Azmy, Y.Y.; Kirk, B.L.

1995-01-01

Neutron diffusion calculations are the most common computational methods used in the design, analysis, and operation of nuclear reactors and related activities. Here, mathematical performance models are developed for the parallel algorithm used to solve the neutron diffusion equation on message passing and shared memory multiprocessors represented by the Intel iPSC/860 and the Sequent Balance 8000, respectively. The performance models are validated through several test problems, and these models are used to estimate the performance of each of the two considered architectures in situations typical of practical applications, such as fine meshes and a large number of participating processors. While message passing computers are capable of producing speedup, the parallel efficiency deteriorates rapidly as the number of processors increases. Furthermore, the speedup fails to improve appreciably for massively parallel computers, so that only small- to medium-sized message passing multiprocessors offer a reasonable platform for this algorithm. In contrast, the performance model for the shared memory architecture predicts very high efficiency over a wide range of number of processors reasonable for this architecture. Furthermore, the model efficiency of the Sequent remains superior to that of the hypercube if its model parameters are adjusted to make its processors as fast as those of the iPSC/860. It is concluded that shared memory computers are better suited for this parallel algorithm than message passing computers.
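The terms speedup and parallel efficiency used in this abstract can be illustrated with a toy model. The sketch below is not the authors' performance model: it assumes a hypothetical run time T(p) = T_serial/p + c·log2(p), where the second term stands in for communication overhead on a message passing machine, and applies the usual definitions S(p) = T(1)/T(p) and E(p) = S(p)/p. All parameter values are invented for illustration.

```python
import math

def run_time(p, t_serial=100.0, comm_cost=2.0):
    """Hypothetical run time on p processors: perfectly divisible work
    plus a communication term that grows with log2(p)."""
    return t_serial / p + comm_cost * math.log2(p)

def speedup(p, **params):
    """Speedup relative to one processor, S(p) = T(1) / T(p)."""
    return run_time(1, **params) / run_time(p, **params)

def efficiency(p, **params):
    """Parallel efficiency, E(p) = S(p) / p."""
    return speedup(p, **params) / p

for p in (1, 4, 16, 64, 256):
    print(f"p={p:4d}  speedup={speedup(p):6.2f}  efficiency={efficiency(p):5.2f}")
```

With these made-up parameters the efficiency falls steadily as p grows and the speedup eventually stops improving, which is the qualitative behavior the abstract reports for the message passing case.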
P3T+: A Performance Estimator for Distributed and Parallel Programs

Directory of Open Access Journals (Sweden)

T. Fahringer

2000-01-01

Developing distributed and parallel programs on today's multiprocessor architectures is still a challenging task. Particularly distressing is the lack of effective performance tools that support the programmer in evaluating changes in code, problem and machine sizes, and target architectures. In this paper we introduce P3T+, a performance estimator for mostly regular HPF (High Performance Fortran) programs that also partially covers message passing programs (MPI). P3T+ is unique in modeling programs, compiler code transformations, and parallel and distributed architectures. It computes at compile-time a variety of performance parameters including work distribution, number of transfers, amount of data transferred, transfer times, computation times, and number of cache misses. Several novel technologies are employed to compute these parameters: loop iteration spaces, array access patterns, and data distributions are modeled by employing highly effective symbolic analysis. Communication is estimated by simulating the behavior of a communication library used by the underlying compiler. Computation times are predicted through pre-measured kernels on every target architecture of interest. We carefully model the most critical architecture-specific factors such as cache line sizes, number of cache lines available, startup times, message transfer time per byte, etc. P3T+ has been implemented and is closely integrated with the Vienna High Performance Compiler (VFC) to support programmers in developing parallel and distributed applications. Experimental results for realistic kernel codes taken from real-world applications are presented to demonstrate both the accuracy and usefulness of P3T+.
Enhancement of entanglement in the nonlinear optical coupler by homodyne-mediated feedback

Energy Technology Data Exchange (ETDEWEB)

Ke Shasha [Department of Physics, Huazhong Normal University, Wuhan 430079 (China); Cheng Guiping [Department of Physics, Huazhong Normal University, Wuhan 430079 (China); Zhang Lihui [Department of Physics, Jianghan University, Wuhan 430056 (China); Li, Gao-xiang [Department of Physics, Huazhong Normal University, Wuhan 430079 (China)

2007-07-28

The enhancement of the intracavity entanglement of a nonlinear coupler via homodyne-mediated quantum feedback is investigated. It is found that the feedback can effectively enhance the squeezing, entanglement and purity of a two-mode field in the nonlinear coupler by appropriately choosing the quadrature angle at which the quantum feedback is introduced.
A Tandem Coupler for Terahertz Integrated Circuits

Science.gov (United States)

Reck, Theodore J.; Deal, William; Chattopadhyay, Goutam

2013-01-01

A coplanar waveguide 3 dB quadrature coupler operating from 500 to 700 GHz is designed, fabricated and measured. On-wafer measurements demonstrate an amplitude balance of +/-2 dB and phase balance of +/-20 deg.
Multipacting Simulations of Tuner-adjustable waveguide coupler (TaCo) with CST Particle Studio®

CERN Document Server

Shafqat, N; Wegner, R

2014-01-01

Tuner-adjustable waveguide couplers (TaCo) are used to feed microwave power to different RF structures of LINAC4. This paper studies the multipacting phenomenon for TaCo using the PIC solver of CST PS. Simulations are performed for complete field sweeps and results are analysed.
Pthreads vs MPI Parallel Performance of Angular-Domain Decomposed S

International Nuclear Information System (INIS)

Azmy, Y.Y.; Barnett, D.A.

2000-01-01

Two programming models for parallelizing the Angular Domain Decomposition (ADD) of the discrete ordinates (S_n) approximation of the neutron transport equation are examined. These are the shared memory model based on the POSIX threads (Pthreads) standard, and the message passing model based on the Message Passing Interface (MPI) standard. These standard libraries are available on most multiprocessor platforms, thus making the resulting parallel codes widely portable. The question is: on a fixed platform, and for a particular code solving a given test problem, which of the two programming models delivers better parallel performance? Such comparison is possible on Symmetric Multi-Processor (SMP) architectures in which several CPUs physically share a common memory, and in addition are capable of emulating message passing functionality. Implementation of the two-dimensional S_n Arbitrarily High Order Transport (AHOT) code for solving neutron transport problems using these two parallelization models is described. Measured parallel performance of each model on the COMPAQ AlphaServer 8400 and the SGI Origin 2000 platforms is described, and comparison of the observed speedup for the two programming models is reported. For the case presented in this paper it appears that the MPI implementation scales better than the Pthreads implementation on both platforms.
On the Performance of the Python Programming Language for Serial and Parallel Scientific Computations

Directory of Open Access Journals (Sweden)

Xing Cai

2005-01-01

This article addresses the performance of scientific applications that use the Python programming language. First, we investigate several techniques for improving the computational efficiency of serial Python codes. Then, we discuss the basic programming techniques in Python for parallelizing serial scientific applications. It is shown that an efficient implementation of the array-related operations is essential for achieving good parallel performance, as for the serial case. Once the array-related operations are efficiently implemented, probably using a mixed-language implementation, good serial and parallel performance become achievable. This is confirmed by a set of numerical experiments. Python is also shown to be well suited for writing high-level parallel programs.
Performance Analysis of Parallel Mathematical Subroutine library PARCEL

International Nuclear Information System (INIS)

Yamada, Susumu; Shimizu, Futoshi; Kobayashi, Kenichi; Kaburaki, Hideo; Kishida, Norio

2000-01-01

The parallel mathematical subroutine library PARCEL (Parallel Computing Elements) has been developed by the Japan Atomic Energy Research Institute for easy use of typical parallelized mathematical codes in any application problems on distributed parallel computers. PARCEL includes routines for linear equations, eigenvalue problems, pseudo-random number generation, and fast Fourier transforms. It is shown that the performance results for the linear equation routines exhibit good parallelization efficiency on vector, as well as scalar, parallel computers. A comparison of the efficiency results with the PETSc (Portable, Extensible Toolkit for Scientific Computation) library is reported. (author)

A parallel calibration utility for WRF-Hydro on high performance computers

Science.gov (United States)

Wang, J.; Wang, C.; Kotamarthi, V. R.

2017-12-01

Successful modeling of complex hydrological processes comprises establishing an integrated hydrological model that simulates the hydrological processes in each water regime, calibrating and validating the model performance based on observation data, and estimating the uncertainties from different sources, especially those associated with parameters. Such a model system requires large computing resources and often has to be run on High Performance Computers (HPC). The recently developed WRF-Hydro modeling system provides a significant advancement in the capability to simulate regional water cycles more completely. The WRF-Hydro model has a large range of parameters, such as those in the input table files (GENPARM.TBL, SOILPARM.TBL and CHANPARM.TBL) and several distributed scaling factors such as OVROUGHRTFAC. These parameters affect the behavior and outputs of the model and thus may need to be calibrated against the observations in order to obtain good modeling performance. A parameter calibration tool designed specifically for automated calibration and uncertainty estimation of the WRF-Hydro model can provide significant convenience for the modeling community. In this study, we developed a customized tool using the parallel version of the model-independent parameter estimation and uncertainty analysis tool, PEST, to enable it to run on HPC with the PBS and SLURM workload managers and job schedulers. We also developed a series of PEST input file templates specifically for WRF-Hydro model calibration and uncertainty analysis. Here we present a case study of a flood that occurred in April 2013 over the Midwest. The sensitivity and uncertainties are analyzed using the customized PEST tool we developed.
Near-field probing of photonic crystal directional couplers

DEFF Research Database (Denmark)

Volkov, V. S.; Bozhevolnyi, S. I.; Borel, Peter Ingo

2006-01-01

We report the design, fabrication and characterization of a photonic crystal directional coupler with a size of ~20 x 20 mm2 fabricated in silicon-on-insulator material. Using a scanning near-field optical microscope we demonstrate a high coupling efficiency for TM polarized light at telecom wavelengths. By comparing the near-field optical images recorded in and after the directional coupler area, the features of light distribution are analyzed. Finally, the scanning near-field optical microscope observations are found to be in agreement with the transmission measurements conducted with the same sample.
Design, construction and tuning of S-band coupler for electron linear accelerator of institute for research in fundamental sciences (IPM E-linac)

International Nuclear Information System (INIS)

Ghasemi, F.; Abbasi Davani, F.; Lamehi Rachti, M.; Shaker, H.; Ahmadiannamin, S.

2015-01-01

Design and construction of an electron linear accelerator by the Institute for Research in Fundamental Sciences (IPM) is considered Iran’s first attempt to construct such an accelerator. In order to design a linear accelerating tube, after defining the accelerating tube and buncher geometries, the RF input and output couplers must be designed. In this article, a brief report on the specifications of an S-band electron linear accelerator in progress at the school of particles and accelerators is presented first, and then the design process and construction of the couplers required for this accelerator are described. Through the necessary calculations and tuning of the coupling factor and resonant frequency, couplers with the desired specifications have been fabricated by the shrinking method. The final coupling factor and resonant frequency obtained are 1.05 and 2997 MHz for the first coupler, and 0.98 and 2996.9 MHz for the second one, respectively, which are close to the calculated results.
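For context, the coupling factors reported in this abstract can be related to the reflection seen at the coupler port. The sketch below assumes the standard relation for a resonator coupled through a single port, |Γ| = |β − 1| / (β + 1) at resonance. The abstract does not state this relation, so its applicability to these couplers is an assumption; the function names are illustrative.

```python
import math

def reflection_magnitude(beta):
    """|Gamma| at resonance for coupling factor beta (assumed relation)."""
    return abs(beta - 1.0) / (beta + 1.0)

def return_loss_db(beta):
    """Return loss in dB; infinite for critical coupling (beta == 1)."""
    gamma = reflection_magnitude(beta)
    if gamma == 0.0:
        return math.inf
    return -20 * math.log10(gamma)

# Coupling factors quoted in the abstract for the two couplers.
for label, beta in (("first coupler", 1.05), ("second coupler", 0.98)):
    print(f"{label}: beta={beta}, |Gamma|={reflection_magnitude(beta):.4f}, "
          f"return loss={return_loss_db(beta):.1f} dB")
```

Under this assumption both couplers sit close to critical coupling, with a return loss above 30 dB in each case.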
RF characterization and testing of ridge waveguide transitions for RF power couplers

Energy Technology Data Exchange (ETDEWEB)

Kumar, Rajesh; Jose, Mentes; Singh, G.N. [Ion Accelerator Development Division, Bhabha Atomic Research Centre, Mumbai 400085 (India)]; Kumar, Girish [Department of Electrical Engineering, IIT Bombay, Mumbai 400076 (India)]; Bhagwat, P.V. [Ion Accelerator Development Division, Bhabha Atomic Research Centre, Mumbai 400085 (India)]

2016-12-01

RF characterization of rectangular to ridge waveguide transitions for RF power couplers has been carried out by connecting them back to back. Rectangular waveguide to N type adapters are first calibrated by the TRL method and then used for RF measurements. Detailed information about their RF behavior is obtained from measurements and full wave simulations. It is shown that the two transitions can be characterized and tuned for the required return loss at the design frequency of 352.2 MHz. This opens the possibility of testing and conditioning two transitions together on a test bench. Finally, an RF coupler based on these transitions is coupled to an accelerator cavity. The power coupler is successfully tested up to 200 kW at 352.2 MHz with 0.2% duty cycle.
Theoretical and experimental analysis of a linear accelerator endowed with single feed coupler with movable short-circuit.

Science.gov (United States)

Dal Forno, Massimo; Craievich, Paolo; Penco, Giuseppe; Vescovo, Roberto

2013-11-01

The front-end injection systems of the FERMI@Elettra linac produce high brightness electron beams that define the performance of the Free Electron Laser. The photoinjector mainly consists of the radiofrequency (rf) gun and of two S-band rf structures which accelerate the beam. Accelerating structures endowed with a single feed coupler cause deflection and degradation of the electron beam properties, due to the asymmetry of the electromagnetic field. In this paper, a new type of single feed structure with movable short-circuit is proposed. It has the advantage of having only one waveguide input, but we propose a novel design where the dipolar component is reduced. Moreover, the racetrack geometry allows the quadrupolar component to be reduced. This paper presents the microwave design and the analysis of the particle motion inside the linac. A prototype has been machined at the Elettra facility to verify the new coupler design and the rf field has been measured by adopting the bead-pull method. The results are presented here, showing good agreement with the expectations.
Theoretical and experimental analysis of a linear accelerator endowed with single feed coupler with movable short-circuit

International Nuclear Information System (INIS)

Forno, Massimo Dal; Craievich, Paolo; Penco, Giuseppe; Vescovo, Roberto

2013-01-01

The front-end injection systems of the FERMI@Elettra linac produce high brightness electron beams that define the performance of the Free Electron Laser. The photoinjector mainly consists of the radiofrequency (rf) gun and of two S-band rf structures which accelerate the beam. Accelerating structures endowed with a single feed coupler cause deflection and degradation of the electron beam properties, due to the asymmetry of the electromagnetic field. In this paper, a new type of single feed structure with movable short-circuit is proposed. It has the advantage of having only one waveguide input, but we propose a novel design where the dipolar component is reduced. Moreover, the racetrack geometry allows the quadrupolar component to be reduced. This paper presents the microwave design and the analysis of the particle motion inside the linac. A prototype has been machined at the Elettra facility to verify the new coupler design and the rf field has been measured by adopting the bead-pull method. The results are presented here, showing good agreement with the expectations.
Ultrahigh-efficiency apodized grating coupler using fully etched photonic crystals

DEFF Research Database (Denmark)

Ding, Yunhong; Ou, Haiyan; Peucheret, Christophe

2013-01-01

We present an efficient method to design apodized grating couplers with Gaussian output profiles for efficient coupling between standard single mode fibers and silicon chips. An apodized grating coupler using fully etched photonic crystal holes on the silicon-on-insulator platform is designed, and fabricated in a single step of lithography and etching. An ultralow coupling loss of −1.74 dB (67% coupling efficiency) with a 3 dB bandwidth of 60 nm is experimentally measured.
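
The pairing of −1.74 dB with 67% coupling efficiency in the abstract above follows from the standard decibel definition for power ratios. A minimal sketch of that conversion is given below; the function name `db_to_fraction` is chosen here for illustration and does not come from the paper.

```python
def db_to_fraction(db: float) -> float:
    """Convert a power ratio expressed in dB to a linear fraction."""
    return 10 ** (db / 10)


# Coupling loss quoted in the abstract, written as a negative dB value.
efficiency = db_to_fraction(-1.74)
print(f"coupling efficiency: {efficiency:.1%}")

# The 3 dB bandwidth is the wavelength span over which the coupled power
# stays within roughly half of its peak value.
half_power = db_to_fraction(-3.0)
print(f"3 dB point relative to peak: {half_power:.3f}")
```

The same conversion applies to the other grating-coupler records in this list that quote losses in dB.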
Mode conversion in hybrid optical fiber coupler

Science.gov (United States)

Stasiewicz, Karol A.; Marc, P.; Jaroszewicz, Leszek R.

2012-04-01

Designing all in-line fiber optic systems with a supercontinuum light source raises some issues. Using a standard single mode fiber (SMF) as an input does not ensure single mode transmission over the full wavelength range. In this paper, experimental results for a hybrid fiber optic coupler are presented. It was manufactured by fusing a standard single mode fiber (SMF28) and a photonic crystal fiber (PCF). The fabrication process is based on the standard fused biconical taper technique. Two types of large mode area fibers (LMA8 and LMA10, NKT Photonics) with different air hole arrangements were used as the photonic crystal fiber. Spectral characteristics within the range of 800 nm - 1700 nm are presented. The whole process was optimized to obtain mode conversion between the SMF and the PCF and to reach single mode transmission at the PCF output of the coupler.
Study on High Performance of MPI-Based Parallel FDTD from WorkStation to Super Computer Platform

Directory of Open Access Journals (Sweden)

Z. L. He

2012-01-01

The parallel FDTD method is applied to analyze electromagnetic problems of electrically large targets on a supercomputer. It is well known that the more processors are used, the less computing time is consumed. Nevertheless, with the same number of processors, computing efficiency is affected by the scheme of the MPI virtual topology. The influence of different virtual topology schemes on the parallel performance of parallel FDTD is therefore studied in detail. General rules are presented on how to obtain the highest efficiency of the parallel FDTD algorithm by optimizing the MPI virtual topology. To show the validity of the presented method, several numerical results are given in the later part. Various comparisons are made and some useful conclusions are summarized.
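
One common reason a virtual topology affects efficiency is the amount of boundary data each process exchanges with its neighbours. The sketch below illustrates that idea only; it is not the rule set derived in the paper. It assumes a uniform 3D grid, communication cost proportional to the exchanged face area, and two neighbours along every axis that is cut. The function names are hypothetical.

```python
def factorizations(p):
    """All ordered triples (px, py, pz) with px * py * pz == p."""
    triples = []
    for px in range(1, p + 1):
        if p % px:
            continue
        rest = p // px
        for py in range(1, rest + 1):
            if rest % py == 0:
                triples.append((px, py, rest // py))
    return triples


def halo_cells(grid, topo):
    """Face cells one interior subdomain exchanges per time step.

    Assumption: an axis that is cut contributes two faces (one per
    neighbour); an axis that is not cut contributes nothing.
    """
    nx, ny, nz = grid
    px, py, pz = topo
    sx, sy, sz = nx / px, ny / py, nz / pz
    total = 0.0
    if px > 1:
        total += 2 * sy * sz
    if py > 1:
        total += 2 * sx * sz
    if pz > 1:
        total += 2 * sx * sy
    return total


def best_topology(grid, p):
    """Process-grid shape with the smallest per-process halo."""
    return min(factorizations(p), key=lambda topo: halo_cells(grid, topo))


# A cubic grid favours a cubic process grid; an elongated grid does not.
print(best_topology((256, 256, 256), 64))
print(best_topology((1024, 64, 64), 16))
```

Under these assumptions the preferred topology depends on the grid shape, which is consistent with the abstract's point that the same processor count can give different efficiencies.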
Designing a High Performance Parallel Personal Cluster

OpenAIRE

Kapanova, K. G.; Sellier, J. M.

2016-01-01

Today, many scientific and engineering areas require high performance computing to perform computationally intensive experiments. For example, many advances in transport phenomena, thermodynamics, material properties, computational chemistry and physics are possible only because of the availability of such large scale computing infrastructures. Yet many challenges are still open. The cost of energy consumption, cooling, competition for resources have been some of the reasons why the scientifi...
Parallel performance of the angular versus spatial domain decomposition for discrete ordinates transport methods

International Nuclear Information System (INIS)

Fischer, J.W.; Azmy, Y.Y.

2003-01-01

A previously reported parallel performance model for Angular Domain Decomposition (ADD) of the Discrete Ordinates method for solving multidimensional neutron transport problems is revisited for further validation. Three communication schemes: native MPI, the bucket algorithm, and the distributed bucket algorithm, are included in the validation exercise that is successfully conducted on a Beowulf cluster. The parallel performance model is comprised of three components: serial, parallel, and communication. The serial component is largely independent of the number of participating processors, P, while the parallel component decreases like 1/P. These two components are independent of the communication scheme, in contrast with the communication component that typically increases with P in a manner highly dependent on the global reduced algorithm. Correct trends for each component and each communication scheme were measured for the Arbitrarily High Order Transport (AHOT) code, thus validating the performance models. Furthermore, extensive experiments illustrate the superiority of the bucket algorithm. The primary question addressed in this research is: for a given problem size, which domain decomposition method, angular or spatial, is best suited to parallelize Discrete Ordinates methods on a specific computational platform? We address this question for three-dimensional applications via parallel performance models that include parameters specifying the problem size and system performance: the above-mentioned ADD, and a previously constructed and validated Spatial Domain Decomposition (SDD) model. We conclude that for large problems the parallel component dwarfs the communication component even on moderately large numbers of processors. The main advantages of SDD are: (a) scalability to higher numbers of processors of the order of the number of computational cells; (b) smaller memory requirement; (c) better performance than ADD on high-end platforms and large number of
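
The three-component structure described above (a serial term independent of P, a parallel term that falls like 1/P, and a communication term that grows with P) can be written down directly. The sketch below is only an illustration of that structure: the functional forms chosen for the communication term and all coefficients are assumptions made here, not values or formulas from the paper.

```python
import math


def predicted_time(p, t_serial, t_parallel, comm):
    """Execution time on p processors: serial + parallel / p + comm(p)."""
    return t_serial + t_parallel / p + comm(p)


def linear_comm(p, c=0.01):
    """Hypothetical communication cost growing linearly with p."""
    return c * (p - 1)


def log_comm(p, c=0.05):
    """Hypothetical communication cost growing like log2(p)."""
    return c * math.log2(p)


if __name__ == "__main__":
    # Illustrative numbers only: 1 s serial work, 100 s parallelizable work.
    for p in (1, 4, 16, 64, 256):
        t_lin = predicted_time(p, 1.0, 100.0, linear_comm)
        t_log = predicted_time(p, 1.0, 100.0, log_comm)
        print(f"P={p:4d}  linear comm: {t_lin:8.3f} s  log comm: {t_log:8.3f} s")
```

With the linear form the predicted time eventually rises again as P grows, while the logarithmic form keeps improving for longer; this is the kind of dependence on the communication scheme that the abstract refers to.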
PERFORMANCE EVALUATION OF OR1200 PROCESSOR WITH EVOLUTIONARY PARALLEL HPRC USING GEP

Directory of Open Access Journals (Sweden)

R. Maheswari

2012-04-01

In this fast computing era, most embedded systems require more computing power to complete complex functions/tasks in less time. One way to achieve this is by boosting processor performance, which allows the processor core to run faster. This paper presents a novel technique for increasing performance by parallel HPRC (High Performance Reconfigurable Computing) in the CPU/DSP (Digital Signal Processor) unit of the OR1200 (Open Reduced Instruction Set Computer (RISC) 1200) using Gene Expression Programming (GEP), an evolutionary programming model. OR1200 is a soft-core RISC processor of the Intellectual Property cores that can efficiently run any modern operating system. In the manufacturing process of OR1200, a parallel HPRC is placed internally in the Integer Execution Pipeline unit of the CPU/DSP core to increase the performance. The GEP parallel HPRC is activated/deactivated by triggering the signals (i) HPRC_Gene_Start and (ii) HPRC_Gene_End. Verilog HDL (Hardware Description Language) functional code for the Gene Expression Programming parallel HPRC is developed and synthesised using XILINX ISE in the former part of the work, and a CoreMark processor core benchmark is used to test the performance of the OR1200 soft core in the latter part of the work. The result of the implementation shows that the overall speed-up increased to 20.59% with the GEP-based parallel HPRC in the execution unit of OR1200.
Flexibility and Performance of Parallel File Systems

Science.gov (United States)

Kotz, David; Nieuwejaar, Nils

1996-01-01

As we gain experience with parallel file systems, it becomes increasingly clear that a single solution does not suit all applications. For example, it appears to be impossible to find a single appropriate interface, caching policy, file structure, or disk-management strategy. Furthermore, the proliferation of file-system interfaces and abstractions makes applications difficult to port. We propose that the traditional functionality of parallel file systems be separated into two components: a fixed core that is standard on all platforms, encapsulating only primitive abstractions and interfaces, and a set of high-level libraries to provide a variety of abstractions and application-programmer interfaces (APIs). We present our current and next-generation file systems as examples of this structure. Their features, such as a three-dimensional file structure, strided read and write interfaces, and I/O-node programs, are specifically designed with the flexibility and performance necessary to support a wide range of applications.
Load balancing in highly parallel processing of Monte Carlo code for particle transport

International Nuclear Information System (INIS)

Higuchi, Kenji; Takemiya, Hiroshi; Kawasaki, Takuji

1998-01-01

In parallel processing of Monte Carlo (MC) codes for neutron, photon and electron transport problems, particle histories are assigned to processors, making use of the independence of the calculation for each particle. Although the main part of an MC code can easily be parallelized by this method, optimizing the code with respect to load balancing is both necessary and practically difficult if a high speedup ratio is to be attained in highly parallel processing. In fact, the speedup ratio with 128 processors remains at nearly one hundred when using the test bed for the performance evaluation. Through the parallel processing of the MCNP code, which is widely used in the nuclear field, it is shown that it is difficult to attain high performance by static load balancing, especially in neutron transport problems, and that a load balancing method which dynamically changes the number of assigned particles, minimizing the sum of the computational and communication costs, overcomes the difficulty, resulting in a reduction of nearly fifteen percent in execution time. (author)
Performance of MPI parallel processing implemented by MCNP5/ MCNPX for criticality benchmark problems

International Nuclear Information System (INIS)

Mark Dennis Usang; Mohd Hairie Rabir; Mohd Amin Sharifuldin Salleh; Mohamad Puad Abu

2012-01-01

MPI parallelism is implemented on a SUN Workstation for running MCNPX and on the High Performance Computing Facility (HPC) for running MCNP5. 23 input files obtained from the MCNP Criticality Validation Suite are used to evaluate the speed-up achievable with the parallel capabilities of MPI. More importantly, we study the economics of using more processors and the types of problem where the performance gain is obvious. This is important to enable better practices of resource sharing, especially for the HPC facility's processing time. Future endeavours in this direction might even reveal clues for best MCNP5/MCNPX coding practices for optimum performance of MPI parallelism. (author)
Conceptual design of a sapphire loaded coupler for superconducting radio-frequency 1.3 GHz cavities

Science.gov (United States)

Xu, Chen; Tantawi, Sami

2016-02-01

This paper explores a hybrid mode rf structure that served as a superconducting radio-frequency coupler. This application achieves a reflection S(1,1) varying from 0 to -30 dB and delivers cw power at 7 kW. The coupler has good thermal isolation between the 2 and 300 K sections due to vacuum separation. Only one single hybrid mode can propagate through each section, and no higher order mode is coupled. The analytical and numerical analysis for this coupler is given and the design is optimized. The coupling mechanism to the cavity is also discussed.
Multiplexing of adjacent vortex modes with the forked grating coupler

Science.gov (United States)

Nadovich, Christopher T.; Kosciolek, Derek J.; Crouse, David T.; Jemison, William D.

2017-08-01

For vortex fiber multiplexing to reach practical commercial viability, simple silicon photonic interfaces with vortex fiber will be required. These interfaces must support multiplexing. Toward this goal, an efficient single-fed multimode Forked Grating Coupler (FGC) for coupling two different optical vortex OAM charges to or from the TE0 and TE1 rectangular waveguide modes has been developed. A simple, apodized device implemented with e-beam lithography and conventional dual-etch processing on an SOI wafer exhibits low crosstalk and reasonable mode match. Advanced designs using this concept are expected to further improve performance.
Performance of a parallel plate volume calorimeter prototype

International Nuclear Information System (INIS)

Arefiev, A.; Bencze, Gy.L.; Bizzeti, A.; Choumilov, E.; Civinini, C.; D'Alessandro, R.; Ferrando, A.; Fouz, M.C.; Iglesias, A.; Ivochkin, V.; Josa, M.I.; Malinin, A.; Meschini, M.; Misyura, S.; Pojidaev, V.; Salicio, J.M.; Sikler, F.

1995-01-01

An iron/gas parallel plate volume calorimeter prototype, working in the avalanche mode, has been tested using electrons of 20 to 150 GeV/c momentum with high voltages varying from 5400 to 5600 V (electric fields ranging from 36 to 37 kV/cm), and a gas mixture of CF4/CO2 (80/20%). The collected charge was measured as a function of the high voltage and of the electron energy. The energy resolution was also measured. Comparisons are made with Monte-Carlo predictions. Agreement between data and simulation allows the calculation of the expected performance of a full size calorimeter. (Author)
Performance of a parallel plate volume calorimeter prototype

International Nuclear Information System (INIS)

Arefiev, A.; Bencze, G.L.; Bizzeti, A.

1995-09-01

An iron/gas parallel plate volume calorimeter prototype, working in the avalanche mode, has been tested using electrons of 20 to 150 GeV/c momentum with high voltages varying from 5400 to 5600 V (electric fields ranging from 36 to 37 kV/cm), and a gas mixture of CF4/CO2 (80/20%). The collected charge was measured as a function of the high voltage and of the electron energy. The energy resolution was also measured. Comparisons are made with Monte-Carlo predictions. Agreement between data and simulation allows the calculation of the expected performance of a full size calorimeter.
High performance data transfer

Science.gov (United States)

Cottrell, R.; Fang, C.; Hanushevsky, A.; Kreuger, W.; Yang, W.

2017-10-01

The exponentially increasing need for high speed data transfer is driven by big data and cloud computing, together with the needs of data intensive science, High Performance Computing (HPC), defense, the oil and gas industry, etc. We report on the Zettar ZX software. This has been developed since 2013 to meet these growing needs by providing high performance data transfer and encryption in a scalable, balanced, easy to deploy and use way while minimizing power and space utilization. In collaboration with several commercial vendors, Proofs of Concept (PoC) consisting of clusters have been put together using off-the-shelf components to test the ZX scalability and ability to balance services using multiple cores and links. The PoCs are based on SSD flash storage that is managed by a parallel file system. Each cluster occupies 4 rack units. Using the PoCs, between clusters we have achieved almost 200 Gbps memory to memory over two 100 Gbps links, and 70 Gbps parallel file to parallel file with encryption over a 5000 mile 100 Gbps link.

Tuning HDF5 subfiling performance on parallel file systems

Energy Technology Data Exchange (ETDEWEB)

Byna, Suren [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Chaarawi, Mohamad [Intel Corp. (United States); Koziol, Quincey [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Mainzer, John [The HDF Group (United States); Willmore, Frank [The HDF Group (United States)

2017-05-12

Subfiling is a technique used on parallel file systems to reduce locking and contention issues when multiple compute nodes interact with the same storage target node. Subfiling provides a compromise between the single shared file approach, which instigates the lock contention problems on parallel file systems, and having one file per process, which results in a massive and unmanageable number of files. In this paper, we evaluate and tune the performance of the recently implemented subfiling feature in HDF5. Specifically, we explain the implementation strategy of the subfiling feature in HDF5, provide examples of using the feature, and evaluate and tune the parallel I/O performance of this feature on the parallel file systems of the Cray XC40 system at NERSC (Cori), which include burst buffer storage and Lustre disk-based storage. We also evaluate I/O performance on the Cray XC30 system, Edison, at NERSC. Our results show a performance advantage of 1.2X to 6X with subfiling compared to writing a single shared HDF5 file. We present our exploration of configurations, such as the number of subfiles and the number of Lustre storage targets used for storing files, as optimization parameters to obtain superior I/O performance. Based on this exploration, we discuss recommendations for achieving good I/O performance as well as limitations of using the subfiling feature.
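
The compromise described above, between one shared file and one file per process, can be pictured as a mapping from process ranks to a configurable number of subfiles. The sketch below shows one simple such mapping using contiguous blocks of ranks. It is an assumption made for illustration and does not reproduce the HDF5 subfiling implementation or its API; `subfile_for_rank` and `writers_per_subfile` are hypothetical names.

```python
from collections import Counter


def subfile_for_rank(rank, n_ranks, n_subfiles):
    """Map a rank to a subfile index using contiguous blocks of ranks."""
    if not 0 <= rank < n_ranks:
        raise ValueError("rank out of range")
    if not 1 <= n_subfiles <= n_ranks:
        raise ValueError("n_subfiles must be between 1 and n_ranks")
    return rank * n_subfiles // n_ranks


def writers_per_subfile(n_ranks, n_subfiles):
    """Count how many ranks write to each subfile."""
    return Counter(subfile_for_rank(r, n_ranks, n_subfiles) for r in range(n_ranks))


# n_subfiles = 1 is the single shared file; n_subfiles = n_ranks is one
# file per process; values in between trade contention against file count.
for n_subfiles in (1, 2, 8):
    print(n_subfiles, dict(writers_per_subfile(8, n_subfiles)))
```

In this picture the number of subfiles is the tuning knob: fewer subfiles means more writers contending for each file, more subfiles means more files to manage.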
Overview of SNS Cryomodule Performance

CERN Document Server

Drury, Michael A; Davis, Kirk; Delayen, Jean R; Grenoble, Christiana; Hicks, William R; King, Larry; Plawski, Tomasz; Powers, Tom; Preble, Joseph P; Wang, Haipeng; Wiseman, Mark

2005-01-01

Thomas Jefferson National Accelerator Facility (Jefferson Lab) has completed production of 24 Superconducting Radio Frequency (SRF) cryomodules for the Spallation Neutron Source (SNS) superconducting linac. This includes one medium-beta (0.61) prototype, eleven medium-beta and twelve high-beta (0.81) production cryomodules. Ten medium-beta cryomodules as well as two high-beta cryomodules have undergone complete operational performance testing in the Cryomodule Test Facility at Jefferson Lab. The set of tests includes measurements of maximum gradient, unloaded Q (Q0), microphonics, and response to Lorentz forces. The Qext's of the various couplers are measured and the behavior of the higher order mode couplers is examined. The mechanical and piezo tuners are also characterized. The results of these performance tests will be discussed in this paper.
H5Part A Portable High Performance Parallel Data Interface for Particle Simulations

CERN Document Server

Adelmann, Andreas; Shalf, John M; Siegerist, Cristina

2005-01-01

The largest parallel particle simulations in six-dimensional phase space generate vast amounts of data. It is also desirable to share data and data analysis tools such as ParViT (Particle Visualization Toolkit) among other groups who are working on particle-based accelerator simulations. We define a very simple file schema built on top of HDF5 (Hierarchical Data Format version 5) as well as an API that simplifies the reading/writing of the data to the HDF5 file format. HDF5 offers a self-describing machine-independent binary file format that supports scalable parallel I/O performance for MPI codes on a variety of supercomputing systems and works equally well on laptop computers. The API is available for C, C++, and Fortran codes. The file format will enable disparate research groups with very different simulation implementations to share data transparently and share data analysis tools. For instance, the common file format will enable groups that depend on completely different simulation implementations to share c...
Parallel integer sorting with medium and fine-scale parallelism

Science.gov (United States)

Dagum, Leonardo

1993-01-01

Two new parallel integer sorting algorithms, queue-sort and barrel-sort, are presented and analyzed in detail. These algorithms do not have optimal parallel complexity, yet they show very good performance in practice. Queue-sort is designed for fine-scale parallel architectures which allow the queueing of multiple messages to the same destination. Barrel-sort is designed for medium-scale parallel architectures with a high message passing overhead. The performance results from the implementation of queue-sort on a Connection Machine CM-2 and barrel-sort on a 128 processor iPSC/860 are given. The two implementations are found to be comparable in performance but not as good as a fully vectorized bucket sort on the Cray YMP.
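
The abstract does not describe queue-sort or barrel-sort in enough detail to reproduce them, but the baseline it compares against, a bucket sort on integer keys, has a well-known serial form. The sketch below is that generic textbook version with one bucket per key value (a counting sort), written serially; it is not the vectorized Cray implementation and not either of the paper's algorithms.

```python
def bucket_sort(keys, max_key):
    """Sort non-negative integer keys in the range 0..max_key."""
    # Pass 1: histogram of key values.
    counts = [0] * (max_key + 1)
    for k in keys:
        counts[k] += 1

    # Pass 2: exclusive prefix sum gives each bucket's start offset.
    offsets = [0] * (max_key + 1)
    running = 0
    for value in range(max_key + 1):
        offsets[value] = running
        running += counts[value]

    # Pass 3: scatter each key into its bucket's next free slot.
    out = [0] * len(keys)
    for k in keys:
        out[offsets[k]] = k
        offsets[k] += 1
    return out


print(bucket_sort([3, 1, 2, 3, 0], max_key=3))
```

The three passes (histogram, prefix sum, scatter) are the parts a parallel or vectorized variant has to distribute, which is where message queueing and message-passing overhead become relevant.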
Suspended mid-infrared fiber-to-chip grating couplers for SiGe waveguides

Science.gov (United States)

Favreau, Julien; Durantin, Cédric; Fédéli, Jean-Marc; Boutami, Salim; Duan, Guang-Hua

2016-03-01

Silicon photonics has gained great importance owing to its applications in optical communications, ranging from short reach to long haul. Originally dedicated to telecom wavelengths, silicon photonics is heading toward circuits handling a broader spectrum, especially in the short and mid-infrared (MIR) range. This trend is due to potential applications in chemical sensing, spectroscopy and defense in the 2-10 μm range. We previously reported the development of a MIR photonic platform based on buried SiGe/Si waveguides with propagation losses between 1 and 2 dB/cm. However, the low index contrast of the platform makes the design of efficient grating couplers very challenging. In order to achieve a high fiber-to-chip efficiency, we propose a novel grating coupler structure in which the grating is locally suspended in air. The grating has been designed with FDTD software. To achieve high efficiency, the suspended structure thicknesses have been jointly optimized with the grating parameters, namely the fill factor, the period and the grating etch depth. Using the Efficient Global Optimization (EGO) method, we obtained a configuration where the fiber-to-waveguide efficiency is above 57 %. Moreover, the optical transition between the suspended and the buried SiGe waveguide has been carefully designed using Eigenmode Expansion software. A transition efficiency as high as 86 % is achieved.
Conceptual design of a sapphire loaded coupler for superconducting radio-frequency 1.3 GHz cavities

Directory of Open Access Journals (Sweden)

Chen Xu

2016-02-01

This paper explores a hybrid mode rf structure that served as a superconducting radio-frequency coupler. This application achieves a reflection S(1,1) varying from 0 to −30 dB and delivers cw power at 7 kW. The coupler has good thermal isolation between the 2 and 300 K sections due to vacuum separation. Only one single hybrid mode can propagate through each section, and no higher order mode is coupled. The analytical and numerical analysis for this coupler is given and the design is optimized. The coupling mechanism to the cavity is also discussed.
MulticoreBSP for C : A high-performance library for shared-memory parallel programming

NARCIS (Netherlands)

Yzelman, A. N.; Bisseling, R. H.; Roose, D.; Meerbergen, K.

2014-01-01

The bulk synchronous parallel (BSP) model, as well as parallel programming interfaces based on BSP, classically target distributed-memory parallel architectures. In earlier work, Yzelman and Bisseling designed a MulticoreBSP for Java library specifically for shared-memory architectures. In the
Overview of SNS Cryomodule Performance

International Nuclear Information System (INIS)

Michael Drury; Edward Daly; Christiana Grenoble; William Hicks; Lawrence King; Tomasz Plawski; Thomas Powers; Joseph Preble; Haipeng Wang; Mark Wiseman; G. Davis; Jean Delayen

2005-01-01

Thomas Jefferson National Accelerator Facility (Jefferson Lab) has completed production of 24 Superconducting Radio Frequency (SRF) cryomodules for the Spallation Neutron Source (SNS) superconducting linac. This includes one medium-β (0.61) prototype, eleven medium-β and twelve high-β (0.81) production cryomodules. Nine medium-β cryomodules as well as two high-β cryomodules have undergone complete operational performance testing in the Cryomodule Test Facility at Jefferson Lab. The set of tests includes measurements of maximum gradient, unloaded Q (Q0), microphonics, and response to Lorentz forces. The Qext's of the various couplers are measured and the behavior of the higher order mode couplers is examined. The mechanical and piezo tuners are also characterized. The results of these performance tests will be discussed in this paper.
Fully-etched apodized fiber-to-chip grating coupler on the SOI platform with -0.78 dB coupling efficiency using photonic crystals and bonded Al mirror

DEFF Research Database (Denmark)

Ding, Yunhong; Ou, Haiyan; Peucheret, Christophe

2014-01-01

We design and fabricate an ultra-high coupling efficiency fully-etched apodized grating coupler on the SOI platform using photonic crystals and a bonded aluminum mirror. An ultra-high coupling efficiency of -0.78 dB with a 3 dB bandwidth of 74 nm is demonstrated.
Contribution to the study of accelerating structure for electrons and respective radiofrequency couplers

International Nuclear Information System (INIS)

Franco, M.A.R.

1991-01-01

In this work, the experimental results pertaining to the construction and evaluation of a constant gradient accelerating structure and of the radiofrequency couplers are presented. The theoretical methods to determine the initial dimensions of the iris-loaded accelerating structure are presented. The final dimensions have been experimentally determined using four three-cavity sections representing the 4th, 12th, 20th and 27th cavities of the final structure. The diameters of the cavities were corrected for variations of temperature, pressure and humidity. A v_p = c, constant gradient, twelve-cavity prototype of the accelerating structure has been constructed and its principal parameters were experimentally determined according to methods also described in this work. Two prototypes of door-knob type radiofrequency couplers have been constructed and experimental procedures to match and tune the couplers and the accelerating structure were implemented. (author)
A high performance data parallel tensor contraction framework: Application to coupled electro-mechanics

Science.gov (United States)

Poya, Roman; Gil, Antonio J.; Ortigosa, Rogelio

2017-07-01

The paper presents aspects of implementation of a new high performance tensor contraction framework for the numerical analysis of coupled and multi-physics problems on streaming architectures. In addition to explicit SIMD instructions and smart expression templates, the framework introduces domain specific constructs for the tensor cross product and its associated algebra recently rediscovered by Bonet et al. (2015, 2016) in the context of solid mechanics. The two key ingredients of the presented expression template engine are as follows. First, the capability to mathematically transform complex chains of operations to simpler equivalent expressions, while potentially avoiding routes with higher levels of computational complexity and, second, to perform a compile time depth-first or breadth-first search to find the optimal contraction indices of a large tensor network in order to minimise the number of floating point operations. For optimisations of tensor contraction such as loop transformation, loop fusion and data locality optimisations, the framework relies heavily on compile time technologies rather than source-to-source translation or JIT techniques. Every aspect of the framework is examined through relevant performance benchmarks, including the impact of data parallelism on the performance of isomorphic and nonisomorphic tensor products, the FLOP and memory I/O optimality in the evaluation of tensor networks, the compilation cost and memory footprint of the framework and the performance of tensor cross product kernels. The framework is then applied to finite element analysis of coupled electro-mechanical problems to assess the speed-ups achieved in kernel-based numerical integration of complex electroelastic energy functionals. In this context, domain-aware expression templates combined with SIMD instructions are shown to provide a significant speed-up over the classical low-level style programming techniques.
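
The second ingredient named above, searching for the contraction order of a tensor network that minimises floating point operations, can be illustrated at run time in a few lines. The sketch below is only that illustration: the framework in the abstract performs its search at compile time with expression templates, whereas this version is an exhaustive depth-first search written for clarity. The cost estimate (one multiply-add per point in the union of the two operands' index ranges) and all function names are assumptions made here.

```python
from itertools import combinations
from math import prod


def pair_cost(a, b, dims):
    """Estimated multiply-adds for contracting tensors with index tuples a, b."""
    return prod(dims[i] for i in set(a) | set(b))


def result_indices(a, b, others, output):
    """Indices kept after contracting a with b.

    A shared index is summed away unless another tensor or the final
    output still needs it.
    """
    needed = set(output).union(*others)
    shared = set(a) & set(b)
    merged = dict.fromkeys(a + b)  # ordered, without duplicates
    return tuple(i for i in merged if i not in shared or i in needed)


def best_order(tensors, dims, output):
    """Depth-first search over pairwise orders; returns (cost, [(a, b), ...])."""
    if len(tensors) == 1:
        return 0, []
    best = (float("inf"), [])
    for i, j in combinations(range(len(tensors)), 2):
        a, b = tensors[i], tensors[j]
        others = [t for k, t in enumerate(tensors) if k not in (i, j)]
        step = pair_cost(a, b, dims)
        merged = result_indices(a, b, others, output)
        sub_cost, sub_order = best_order(others + [merged], dims, output)
        if step + sub_cost < best[0]:
            best = (step + sub_cost, [(a, b)] + sub_order)
    return best


# Chain A_ij B_jk C_kl with deliberately unbalanced dimensions.
dims = {"i": 10, "j": 1000, "k": 10, "l": 1000}
chain = [("i", "j"), ("j", "k"), ("k", "l")]
cost, order = best_order(chain, dims, output=("i", "l"))
print(cost, order)
```

For this chain, contracting A with B first costs two orders of magnitude fewer operations than contracting B with C first, which is the kind of saving such a search is meant to find.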
Automatic performance tuning of parallel and accelerated seismic imaging kernels

KAUST Repository

Haberdar, Hakan

2014-01-01

With the increased complexity and diversity of mainstream high performance computing systems, significant effort is required to tune parallel applications in order to achieve the best possible performance for each particular platform. This task is becoming increasingly challenging and requires a larger set of skills. Automatic performance tuning is becoming a must for optimizing applications such as Reverse Time Migration (RTM), widely used in seismic imaging for oil and gas exploration. An empirical search-based auto-tuning approach is applied to the MPI communication operations of the parallel isotropic and tilted transverse isotropic kernels. The application of auto-tuning using the Abstract Data and Communication Library improved the performance of the MPI communications as well as developer productivity by providing a higher level of abstraction. Keeping productivity in mind, we opted for pragma-based programming for accelerated computation on the latest accelerated architectures such as GPUs, using the fairly new OpenACC standard. The same auto-tuning approach is also applied to the OpenACC-accelerated seismic code to optimize the compute-intensive kernel of the Reverse Time Migration application. Applying this technique improved the performance of the original code and its ability to adapt to different execution environments.
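An empirical search-based auto-tuner of the kind described here can be sketched in a few lines: run the kernel under every candidate configuration, time it, and keep the fastest. The sketch below is a generic exhaustive search, not the thesis's tuner or the Abstract Data and Communication Library; the function `autotune`, its parameters, and the example search space are all hypothetical.

```python
import itertools
import time


def autotune(kernel, search_space, repeats=3, timer=time.perf_counter):
    """Exhaustively time `kernel` over a parameter grid.

    search_space maps a parameter name to its candidate values.
    Each configuration is run `repeats` times and the fastest run
    is kept, which reduces the influence of timing noise.
    Returns (best_configuration, best_time_in_seconds).
    """
    names = list(search_space)
    best_cfg, best_time = None, float("inf")
    for values in itertools.product(*(search_space[n] for n in names)):
        cfg = dict(zip(names, values))
        elapsed = float("inf")
        for _ in range(repeats):
            start = timer()
            kernel(**cfg)
            elapsed = min(elapsed, timer() - start)
        if elapsed < best_time:
            best_cfg, best_time = cfg, elapsed
    return best_cfg, best_time


def toy_kernel(block):
    # Hypothetical stand-in for a compute kernel: sums a range in chunks.
    total = 0
    for start in range(0, 20000, block):
        total += sum(range(start, min(start + block, 20000)))
    return total


best_cfg, best_time = autotune(toy_kernel, {"block": [16, 256, 4096]})
print(best_cfg, best_time)
```

The `timer` argument is injectable so that the search logic can be exercised with a deterministic clock; a real tuner would typically prune the grid rather than enumerate it when the search space is large.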
Ultra-low coupling loss fully-etched apodized grating coupler with bonded metal mirror

DEFF Research Database (Denmark)

Ding, Yunhong; Peucheret, Christophe; Ou, Haiyan

2014-01-01

A fully etched apodized grating coupler with bonded metal mirror is designed and demonstrated on the silicon-on-insulator platform, showing an ultra-low coupling loss of only 1.25 dB with 3 dB bandwidth of 69 nm.
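For readers who prefer linear units, the quoted coupling loss can be converted to a power coupling efficiency with the standard decibel relation. The symbols η and L_dB are introduced here for illustration and do not appear in the abstract.

```latex
\eta = 10^{-L_{\mathrm{dB}}/10} = 10^{-1.25/10} \approx 0.75
```

A 1.25 dB loss therefore corresponds to roughly 75% of the optical power being coupled.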
Study of a superconducting spoke-type cavity and of its associated power coupler

International Nuclear Information System (INIS)

Mielot, Ch.

2004-12-01

This work deals with the study of a spoke-type cavity and its associated power coupler. The results of this study are used in the framework of the high power proton linear accelerator of the experimental accelerator-driven system project (XADS). The cavity (F=352 MHz, β=0.35) was tested at 4 K and 2 K. The results at 4 K gave good margins with respect to the XADS requirements, which increases the reliability of a spoke-based driver. At 2 K the accelerating field reached is the highest in the world for spoke cavities: 16 MV/m. The position and diameter of the coupling have been optimized in order to decrease the HF losses and avoid multipactor risk. In order to decrease HF losses (taking into account the 20 kW power fed into the cavity) the electric coupling mode has been chosen. Different types of ceramic windows have been studied in order to make this critical point of the coupler reliable: coaxial disk with or without chokes, or empty coaxial cylinder. The optimization process focused on the reflected power, the losses in the ceramic and the surface electric field. The risk with chokes has been modeled and studied with the propagation lines theory. A systematic study of the different windows has been done regarding the geometrical parameters. The disk without chokes seems to be a good solution for our application. The power source will be a solid state amplifier (for reliability and modularity reasons). An all over coaxial coupler can be designed and will be fabricated and tested soon. (author)
Systematic approach for deriving feasible mappings of parallel algorithms to parallel computing platforms

NARCIS (Netherlands)

Arkin, Ethem; Tekinerdogan, Bedir; Imre, Kayhan M.

2017-01-01

The need for high-performance computing together with the increasing trend from single processor to parallel computer architectures has leveraged the adoption of parallel computing. To benefit from parallel computing power, usually parallel algorithms are defined that can be mapped and executed
Balanced PIN-TIA photoreceiver with integrated 3 dB fiber coupler for distributed fiber optic sensors

Science.gov (United States)

Datta, Shubhashish; Rajagopalan, Sruti; Lemke, Shaun; Joshi, Abhay

2014-06-01

We report a balanced PIN-TIA photoreceiver integrated with a 3 dB fiber coupler for distributed fiber optic sensors. This detector demonstrates -3 dB bandwidth >15 GHz and coupled conversion gain >65 V/W per photodiode through either input port of the 3 dB coupler, and can be operated at local oscillator power of +17 dBm. The combined common mode rejection of the balanced photoreceiver and the integrated 3 dB coupler is >20 dB. We also present measurement results with various optical stimuli, namely impulses, sinusoids, and pseudo-random sequences, which are relevant for time domain reflectometry, frequency domain reflectometry, and code correlation sensors, respectively.
Thermal-Mechanical Study of 3.9 GHz CW Coupler and Cavity for LCLS-II Project

Energy Technology Data Exchange (ETDEWEB)

Gonin, Ivan [Fermilab; Harms, Elvin [Fermilab; Khabiboulline, Timergali [Fermilab; Solyak, Nikolay [Fermilab; Yakovlev, Vyacheslav [Fermilab

2017-05-01

The third harmonic system was originally developed by Fermilab for the FLASH facility at DESY and was then adopted and modified by INFN for the XFEL project [1-3]. In contrast to the XFEL project, all cryomodules in the LCLS-II project will operate in CW regime with higher average RF power for the 1.3 GHz and 3.9 GHz cavities and couplers. The design of the cavity and fundamental power coupler has been modified to satisfy LCLS-II requirements. In this paper we discuss the results of COMSOL thermal and mechanical analysis of the 3.9 GHz coupler and cavity to verify the proposed modification of the design. For the dressed cavity we present simulations of Lorentz force detuning, helium pressure sensitivity df/dP and major mechanical resonances.
Practical parallel computing

CERN Document Server

Morse, H Stephen

1994-01-01

Practical Parallel Computing provides information pertinent to the fundamental aspects of high-performance parallel processing. This book discusses the development of parallel applications on a variety of equipment. Organized into three parts encompassing 12 chapters, this book begins with an overview of the technology trends that converge to favor massively parallel hardware over traditional mainframes and vector machines. This text then gives a tutorial introduction to parallel hardware architectures. Other chapters provide worked-out examples of programs using several parallel languages. Thi
An RF input coupler system for the CEBAF energy upgrade cryomodule

International Nuclear Information System (INIS)

J.R. Delayen; L.R. Doolittle; T. Hiatt; J. Hogan; J. Mammosser; L. Phillips; J. Preble; W.J. Schneider; G. Wu

1999-01-01

Long term plans for CEBAF at Jefferson Lab call for achieving 12 GeV in the middle of the next decade and 24 GeV after 2010. Thus an upgraded cryomodule to more than double the present voltage is under development. A new waveguide coupler system has been designed and prototypes are currently being developed. This coupler, unlike the original, has a nominal Q_ext of 2.1 × 10^7, reduced sensitivity of Q_ext to mechanical deformation, reduced field asymmetry within the beam envelope, freedom from window arcing with a single window at 300 K, and is capable of transmitting 6 kW CW both in traveling wave and in full reflection.
Pulsed x-ray induced attenuation measurements of single mode optical fibers and coupler materials

International Nuclear Information System (INIS)

Johan, A.; Charre, P.

1994-01-01

Pulsed X-ray induced transient radiation attenuation measurements of single mode optical fibers have been performed versus total dose, light wavelength, optical power and fiber coil diameter in order to determine the behavior of parameters sensitive to ionizing radiation. The results did not show any photobleaching phenomenon and the attenuation was found to be independent of the spool diameter. As expected, transient attenuation was lower for longer wavelengths. The recovery took place in the millisecond range and was independent of total dose, light wavelength and optical power. In optical modules and devices a large range of behaviors was observed according to coupler material: for example, the Corning coupler showed a small peak attenuation that remained more than one day later, whereas the LiTaO3 material experienced an order of magnitude higher peak attenuation and a recovery in the millisecond range. For applications with optical fibers and integrated optics devices the authors showed that in many cases the optical fiber (length above 100 m) is the most sensitive device in a transient ionizing radiation field.

Study on the coaxial couplers for disk and washer loaded accelerating structures

International Nuclear Information System (INIS)

Dajkovskij, A.G.; Paramonov, V.V.; Portugalov, Yu.I.; Ryabov, A.D.; Ryabova, T.D.

1983-01-01

The paper describes the dispersion and energy properties of the coaxial coupler (CC), which is a promising component for an accelerating system with the disk and washer (DAW) structure. Resonators consisting of DAW structure sections and a CC are shown to retain the main advantage of the DAW structure, i.e. high stability of the accelerating field distribution. In addition, RF power losses are small. The presence of nonsymmetric modes in the neighbourhood of the operating mode is noted.
GROMACS 4.5: A high-throughput and highly parallel open source molecular simulation toolkit

Energy Technology Data Exchange (ETDEWEB)

Pronk, Sander [Science for Life Lab., Stockholm (Sweden); KTH Royal Institute of Technology, Stockholm (Sweden); Pall, Szilard [Science for Life Lab., Stockholm (Sweden); KTH Royal Institute of Technology, Stockholm (Sweden); Schulz, Roland [Univ. of Tennessee, Knoxville, TN (United States); Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Larsson, Per [Univ. of Virginia, Charlottesville, VA (United States); Bjelkmar, Par [Science for Life Lab., Stockholm (Sweden); Stockholm Univ., Stockholm (Sweden); Apostolov, Rossen [Science for Life Lab., Stockholm (Sweden); KTH Royal Institute of Technology, Stockholm (Sweden); Shirts, Michael R. [Univ. of Virginia, Charlottesville, VA (United States); Smith, Jeremy C. [Univ. of Tennessee, Knoxville, TN (United States); Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Kasson, Peter M. [Univ. of Virginia, Charlottesville, VA (United States); van der Spoel, David [Science for Life Lab., Stockholm (Sweden); Uppsala Univ., Uppsala (Sweden); Hess, Berk [Science for Life Lab., Stockholm (Sweden); KTH Royal Institute of Technology, Stockholm (Sweden); Lindahl, Erik [Science for Life Lab., Stockholm (Sweden); KTH Royal Institute of Technology, Stockholm (Sweden); Stockholm Univ., Stockholm (Sweden)

2013-02-13

Molecular simulation has historically been a low-throughput technique, but faster computers and increasing amounts of genomic and structural data are changing this by enabling large-scale automated simulation of, for instance, many conformers or mutants of biomolecules with or without a range of ligands. At the same time, advances in performance and scaling now make it possible to model complex biomolecular interaction and function in a manner directly testable by experiment. These applications share a need for fast and efficient software that can be deployed on massive scale in clusters, web servers, distributed computing or cloud resources. Here, we present a range of new simulation algorithms and features developed during the past 4 years, leading up to the GROMACS 4.5 software package. The software now automatically handles wide classes of biomolecules, such as proteins, nucleic acids and lipids, and comes with all commonly used force fields for these molecules built-in. GROMACS supports several implicit solvent models, as well as new free-energy algorithms, and the software now uses multithreading for efficient parallelization even on low-end systems, including Windows-based workstations. Together with hand-tuned assembly kernels and state-of-the-art parallelization, this provides extremely high performance and cost efficiency for high-throughput as well as massively parallel simulations.
Thermal behaviour analysis of SRF cavities and superconducting HOM couplers

International Nuclear Information System (INIS)

Fouaidy, M.; Junquera, T.

1993-01-01

Two individual papers appear in this report, titled Thermal model calculations in superconducting RF cavities, and Thermal study of HOM couplers for superconducting RF cavities. Both were indexed separately for the INIS database. (R.P.)
Highly scalable parallel processing of extracellular recordings of Multielectrode Arrays.

Science.gov (United States)

Gehring, Tiago V; Vasilaki, Eleni; Giugliano, Michele

2015-01-01

Technological advances of Multielectrode Arrays (MEAs) used for multisite, parallel electrophysiological recordings, lead to an ever increasing amount of raw data being generated. Arrays with hundreds up to a few thousands of electrodes are slowly seeing widespread use and the expectation is that more sophisticated arrays will become available in the near future. In order to process the large data volumes resulting from MEA recordings there is a pressing need for new software tools able to process many data channels in parallel. Here we present a new tool for processing MEA data recordings that makes use of new programming paradigms and recent technology developments to unleash the power of modern highly parallel hardware, such as multi-core CPUs with vector instruction sets or GPGPUs. Our tool builds on and complements existing MEA data analysis packages. It shows high scalability and can be used to speed up some performance critical pre-processing steps such as data filtering and spike detection, helping to make the analysis of larger data sets tractable.
Implementation of a Monte Carlo simulation environment for fully 3D PET on a high-performance parallel platform

CERN Document Server

Zaidi, H; Morel, Christian

1998-01-01

This paper describes the implementation of the Eidolon Monte Carlo program, designed to simulate fully three-dimensional (3D) cylindrical positron tomographs, on a MIMD parallel architecture. The original code was written in Objective-C and developed under the NeXTSTEP development environment. The steps involved in porting the software to a parallel architecture based on PowerPC 604 processors running under AIX 4.1 are presented. Basic aspects and strategies of running Monte Carlo calculations on parallel computers are described. The computing time decreased linearly with the number of computing nodes. The improved time performance resulting from parallelisation of the Monte Carlo calculations makes it an attractive tool for modelling photon transport in 3D positron tomography. The parallelisation paradigm used in this work is independent of the chosen parallel architecture.
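Monte Carlo calculations parallelise well because independent histories can be split across workers and combined at the end. The sketch below shows that general pattern only; it is not Eidolon and does not model photon transport. It estimates π as a stand-in workload, and the function names, the per-worker seeding scheme, and the `map_fn` parameter are assumptions made for this example.

```python
import random


def count_hits(task):
    """Run one worker's share of samples and return its tally."""
    seed, n_samples = task
    rng = random.Random(seed)  # private generator, one seed per worker
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:  # point falls inside the quarter circle
            hits += 1
    return hits


def estimate_pi(total_samples, n_workers, map_fn=map):
    """Split the samples evenly over workers and merge the tallies.

    map_fn defaults to the built-in serial map; any function with the
    same (function, iterable) calling convention can be substituted.
    """
    per_worker = total_samples // n_workers
    tasks = [(seed, per_worker) for seed in range(n_workers)]
    hits = sum(map_fn(count_hits, tasks))
    return 4.0 * hits / (per_worker * n_workers)


print(estimate_pi(40000, 4))
```

To distribute the work over processes, one could pass the `map` method of a `multiprocessing.Pool` as `map_fn`. Distinct seeds give distinct random streams here, but production codes usually rely on a generator designed for parallel streams rather than on simple seed offsets.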
Coaxial TW window for power couplers and multipactor considerations

International Nuclear Information System (INIS)

Hanus, X.; Mosnier, A.

1996-01-01

A Traveling Wave coaxial window has been studied for power coupler purposes. Its main features, a reduced electric field in the ceramic and a multipacting-free shape, are presented. Multipacting simulation results for other window geometries, using a conical or a cylindrical ceramic, are also shown. (author)
Circuit mismatch influence on performance of paralleling silicon carbide MOSFETs

DEFF Research Database (Denmark)

Li, Helong; Munk-Nielsen, Stig; Pham, Cam

2014-01-01

This paper focuses on the influence of circuit mismatch on the performance of paralleled SiC MOSFETs. Power circuit mismatch and gate driver mismatch influences are analyzed in detail. Simulation and experiment results show the influence of circuit mismatch and verify the analysis. This paper aims to give suggestions on paralleling discrete SiC MOSFETs and on designing the layout of power modules with paralleled SiC MOSFET dies.
Design of Dual-Band Two-Branch-Line Couplers with Arbitrary Coupling Coefficients in Bands

Directory of Open Access Journals (Sweden)

I. Prudyus

2014-12-01

A new approach to designing dual-band two-branch couplers with arbitrary coupling coefficients at two operating frequency bands is proposed in this article. The method is based on the equivalent subcircuit input reactances of the even-mode and odd-mode excitations. Exact design formulas are derived for three options of the dual-band coupler with different locations and numbers of stubs. These formulas make it possible to obtain different variants of each structure in order to select a physically realizable solution, and they can be used over a broad range of frequency ratios and power division ratios. For verification, three different dual-band couplers operating at 2.4/3.9 GHz with different coupling coefficients (one with 3/6 dB and two others with 10/3 dB) are designed, simulated, fabricated and tested. The measured results are in good agreement with the simulated ones.
Finite element analysis and frequency shift studies for the bridge coupler of the coupled cavity linear accelerator of the spallation neutron source.

Energy Technology Data Exchange (ETDEWEB)

Chen, Z. (Zukun)

2001-01-01

The Spallation Neutron Source (SNS) is an accelerator-based neutron scattering research facility. The linear accelerator (linac) is the principal accelerating structure and is divided into a room-temperature linac and a superconducting linac. The normal-conducting linac system, which consists of a Drift Tube Linac (DTL) and a Coupled Cavity Linac (CCL), is to be built by Los Alamos National Laboratory. The CCL structure is 55.36 meters long. It accelerates the H- beam from 86.8 MeV to 185.6 MeV at an operating frequency of 805 MHz. This side-coupled cavity structure has 8 cells per segment, 12 segments and 11 bridge couplers per module, and 4 modules in total. A 5-MW klystron powers each module. The number 3 and number 9 bridge couplers of each module are connected to the 5-MW RF power supply. The bridge coupler, with a length of 2.5 βγ, is a three-cell structure located between the segments that allows power to flow through the module. The center cell of each bridge coupler is excited during normal operation. To obtain a uniform electromagnetic field and meet the resonant frequency shift requirement, the RF-induced heat must be removed. Thus, thermal deformation and frequency shift studies are performed via numerical simulations in order to arrive at an appropriate cooling design and predict the frequency shift under operation. The center cell of the bridge coupler also contains a large 4-inch slug tuner and a tuning post, which are used to provide bulk frequency adjustment and field intensity adjustment so as to produce the proper total field distribution in the module assembly.
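The structure counts and the average energy gain implied by the figures in this abstract can be worked out directly. The calculation below assumes that the quoted 55.36 m is the total length of all four modules; the symbols are introduced here for illustration only.

```latex
\begin{aligned}
N_{\text{segments}} &= 4 \times 12 = 48, \\
N_{\text{cells}} &= 48 \times 8 = 384, \\
N_{\text{bridge couplers}} &= 4 \times 11 = 44, \\
\Delta E &= 185.6\ \text{MeV} - 86.8\ \text{MeV} = 98.8\ \text{MeV}, \\
\frac{\Delta E}{L} &= \frac{98.8\ \text{MeV}}{55.36\ \text{m}} \approx 1.78\ \text{MeV/m}.
\end{aligned}
```

The last line is an average over the whole structure length, including bridge couplers, so it understates the gradient inside the accelerating cells.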
Enhancing Application Performance Using Mini-Apps: Comparison of Hybrid Parallel Programming Paradigms

Science.gov (United States)

Lawson, Gary; Sosonkina, Masha; Baurle, Robert; Hammond, Dana

2017-01-01

In many fields, real-world applications for High Performance Computing have already been developed. For these applications to stay up-to-date, new parallel strategies must be explored to yield the best performance; however, restructuring or modifying a real-world application may be daunting depending on the size of the code. In this case, a mini-app may be employed to quickly explore such options without modifying the entire code. In this work, several mini-apps have been created to enhance a real-world application performance, namely the VULCAN code for complex flow analysis developed at the NASA Langley Research Center. These mini-apps explore hybrid parallel programming paradigms with Message Passing Interface (MPI) for distributed memory access and either Shared MPI (SMPI) or OpenMP for shared memory accesses. Performance testing shows that MPI+SMPI yields the best execution performance, while requiring the largest number of code changes. A maximum speedup of 23 was measured for MPI+SMPI, but only 11 was measured for MPI+OpenMP.
Analysis for Parallel Execution without Performing Hardware/Software Co-simulation

OpenAIRE

Muhammad Rashid

2014-01-01

Hardware/software co-simulation improves the performance of embedded applications by executing the applications on a virtual platform before the actual hardware is available in silicon. However, the virtual platform of the target architecture is often not available during early stages of the embedded design flow. Consequently, analysis for parallel execution without performing hardware/software co-simulation is required. This article presents an analysis methodology for parallel execution of ...
USING THE SCRAP TIRES TO PRODUCE A FLEXIBLE COUPLER

Directory of Open Access Journals (Sweden)

Tahsean A. Hussain

2018-01-01

Scrap tires are a problematic source of waste: old rubber tires cause a major environmental problem that requires considerable money and effort to dispose of safely. In Iraq, more than two million used tires are discarded into the environment annually. One tire recycling method is to use tire layers to produce new rubber parts for engineering and industrial purposes, such as bridge and machine dampers; this recycling route has not received sufficient attention compared with other uses. Many studies in this field suggest methods for managing the huge number of scrap tires. The current paper aims to use old rubber tires for engineering purposes, in particular as a coupler that joins a motor or engine to other equipment such as an electric dynamo or a pump. The study focuses on the mechanical properties of a strip cut from a used tire, compares it with one prepared in the laboratory, and suggests a new method of using such strips as engineering parts (for example, the coupler between the IC engine and the dynamo of an electric generator). The experiments show no significant difference between the mechanical properties of the old and the new strip: in the tensile test, the breaking force of the laboratory tensile specimen is 137 N, whereas that of the old-tire specimen is 113.27 N, and the two have the same elongation. A computational example is suggested for estimating the dimensions of a flexible coupler made from old tire pieces.
Straightforward and accurate technique for post-coupler stabilization in drift tube linac structures

CERN Document Server

Khalvati, Mohammad Reza

2016-01-01

The axial electric field of Alvarez drift tube linacs (DTLs) is known to be susceptible to variations due to static and dynamic effects like manufacturing tolerances and beam loading. Post-couplers are used to stabilize the accelerating fields of DTLs against tuning errors. Tilt sensitivity and its slope have been introduced as measures for the stability right from the invention of post-couplers but since then the actual stabilization has mostly been done by tedious iteration. In the present article, the local tilt-sensitivity slope TS′_n is established as the principal measure for stabilization instead of tilt sensitivity or some visual slope, and its significance is developed on the basis of an equivalent-circuit diagram of the DTL. Experimental and 3D simulation results are used to analyze its behavior and to define a technique for stabilization that allows finding the best post-coupler settings with just four tilt-sensitivity measurements. CERN's Linac4 DTL Tank 2 and Tank 3 have been stabilized succ...
Coupler Development and Gap Field Analysis for the 352 MHz Superconducting CH-Cavity

CERN Document Server

Liebermann, H; Ratzinger, U; Sauer, A C

2004-01-01

The cross-bar H-type (CH) cavity is a multi-gap drift tube structure based on the H-210 mode currently under development at IAP Frankfurt and in collaboration with GSI. Numerical simulations and rf model measurements showed that the CH-type cavity is an excellent candidate to realize s.c. multi-cell structures ranging from the RFQ exit energy up to the injection energy into elliptical multi-cell cavities. The reasonable frequency range is from about 150 MHz up to 800 MHz. A 19-cell, β=0.1, 352 MHz, bulk niobium prototype cavity is under development at the ACCEL-Company, Bergisch-Gladbach. This paper will present detailed MicroWave Studio simulations and measurements for the coupler development of the 352 MHz superconducting CH cavity. It will describe possibilities for coupling into the superconducting CH-Cavity. The development of the coupler is supported by measurement on a room temperature CH-copper model. We will present the first results of the measurements of different couplers, e.g. capacitiv...
An Introduction to High Performance Fortran

Directory of Open Access Journals (Sweden)

John Merlin

1995-01-01

High Performance Fortran (HPF) is an informal standard for extensions to Fortran 90 to assist its implementation on parallel architectures, particularly for data-parallel computation. Among other things, it includes directives for specifying data distribution across multiple memories, and concurrent execution features. This article provides a tutorial introduction to the main features of HPF.
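The data distribution directives mentioned here assign array elements to processors according to a named pattern, the two most common being BLOCK and CYCLIC. The sketch below models those two mappings in Python rather than HPF, using zero-based indices for convenience (HPF itself is one-based); the function names are chosen for this example and are not part of HPF.

```python
def block_owner(index, n_elements, n_procs):
    """Owner of element `index` under a BLOCK distribution.

    Elements are dealt out in contiguous blocks of size
    ceil(n_elements / n_procs); the last processor may hold fewer.
    """
    block_size = -(-n_elements // n_procs)  # ceiling division
    return index // block_size


def cyclic_owner(index, n_procs):
    """Owner of element `index` under a CYCLIC distribution.

    Elements are dealt out round-robin, one at a time.
    """
    return index % n_procs


n, p = 10, 4
print([block_owner(i, n, p) for i in range(n)])
print([cyclic_owner(i, p) for i in range(n)])
```

With 10 elements on 4 processors, BLOCK gives the first three processors three contiguous elements each and the last processor one, while CYCLIC interleaves elements across all four. BLOCK tends to suit nearest-neighbour computations, and CYCLIC helps balance load when the work per element varies along the array.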
HPTA: High-Performance Text Analytics

OpenAIRE

Vandierendonck, Hans; Murphy, Karen; Arif, Mahwish; Nikolopoulos, Dimitrios S.

2017-01-01

One of the main targets of data analytics is unstructured data, which primarily involves textual data. High-performance processing of textual data is non-trivial. We present the HPTA library for high-performance text analytics. The library helps programmers to map textual data to a dense numeric representation, which can be handled more efficiently. HPTA encapsulates three performance optimizations: (i) efficient memory management for textual data, (ii) parallel computation on associative dat...
Highly sensitive magnetic field sensor based on microfiber coupler with magnetic fluid

International Nuclear Information System (INIS)

Luo, Longfeng; Pu, Shengli; Tang, Jiali; Zeng, Xianglong; Lahoubi, Mahieddine

2015-01-01

A magnetic field sensor using a microfiber coupler (MFC) surrounded with magnetic fluid (MF) is proposed and experimentally demonstrated. Because the MFC is strongly sensitive to the surrounding refractive index (RI) and the RI of the MF is sensitive to magnetic field, the proposed structure can sense magnetic fields. The magnetic field strength is interrogated by measuring the dip wavelength shift and the transmission loss change of the transmission spectrum. The experimental results show that the sensitivity of the sensor is wavelength-dependent. A maximum sensitivity of 191.8 pm/Oe is achieved at a wavelength of around 1537 nm in this work. In addition, a sensitivity of −0.037 dB/Oe is achieved by monitoring the variation of the fringe visibility. These results suggest potential applications of the proposed structure in tunable all-in-fiber photonic devices such as magneto-optical modulators and filters, and in sensing.
Highly sensitive magnetic field sensor based on microfiber coupler with magnetic fluid

Energy Technology Data Exchange (ETDEWEB)

Luo, Longfeng; Pu, Shengli, E-mail: shlpu@usst.edu.cn; Tang, Jiali [College of Science, University of Shanghai for Science and Technology, Shanghai 200093 (China); Zeng, Xianglong [Key Laboratory of Specialty Fiber Optics and Optical Access Network, Shanghai University, Shanghai 200072 (China); Lahoubi, Mahieddine [Department of Physics, Faculty of Sciences, Laboratory L.P.S., Badji Mokhtar-Annaba University, P. O. Box 12, 23000 Annaba (Algeria)

2015-05-11

A magnetic field sensor using a microfiber coupler (MFC) surrounded with magnetic fluid (MF) is proposed and experimentally demonstrated. Because the MFC is strongly sensitive to the surrounding refractive index (RI) and the RI of the MF is sensitive to magnetic field, the proposed structure can sense magnetic fields. The magnetic field strength is interrogated by measuring the dip wavelength shift and the transmission loss change of the transmission spectrum. The experimental results show that the sensitivity of the sensor is wavelength-dependent. A maximum sensitivity of 191.8 pm/Oe is achieved at a wavelength of around 1537 nm in this work. In addition, a sensitivity of −0.037 dB/Oe is achieved by monitoring the variation of the fringe visibility. These results suggest potential applications of the proposed structure in tunable all-in-fiber photonic devices such as magneto-optical modulators and filters, and in sensing.
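The two reported sensitivities can be turned into a rough field read-out by dividing a measured change by the corresponding sensitivity. The sketch below assumes a linear response over the range of interest, which the abstract does not state, and uses the maximum wavelength sensitivity reported near 1537 nm; the function and constant names are hypothetical.

```python
# Sensitivities quoted in the abstract.
WAVELENGTH_SENSITIVITY_PM_PER_OE = 191.8  # maximum, near 1537 nm
LOSS_SENSITIVITY_DB_PER_OE = -0.037       # from fringe-visibility monitoring


def field_from_wavelength_shift(shift_pm,
                                sensitivity=WAVELENGTH_SENSITIVITY_PM_PER_OE):
    """Estimate field strength (Oe) from a dip wavelength shift (pm).

    Assumes the shift is proportional to the field (linear response).
    """
    return shift_pm / sensitivity


def field_from_loss_change(change_db,
                           sensitivity=LOSS_SENSITIVITY_DB_PER_OE):
    """Estimate field strength (Oe) from a transmission change (dB).

    Same linearity assumption; the sensitivity is negative, so a
    decrease in transmission maps to a positive field.
    """
    return change_db / sensitivity


print(field_from_wavelength_shift(959.0))
print(field_from_loss_change(-0.185))
```

Under this linear assumption, both example readings correspond to a field of about 5 Oe. Because the abstract notes that the sensitivity is wavelength-dependent, a practical read-out would need a calibration curve for the dip being tracked.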
Analysis and design of arrayed waveguide gratings with MMI couplers.

Science.gov (United States)

Munoz, P; Pastor, D; Capmany, J

2001-09-24

We present an extension of the AWG model and design procedure described in [1] to incorporate multimode interference, MMI, couplers. For the first time to our knowledge, a closed formula for the passing bands bandwidth and crosstalk estimation plots are derived.
GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit.

Science.gov (United States)

Pronk, Sander; Páll, Szilárd; Schulz, Roland; Larsson, Per; Bjelkmar, Pär; Apostolov, Rossen; Shirts, Michael R; Smith, Jeremy C; Kasson, Peter M; van der Spoel, David; Hess, Berk; Lindahl, Erik

2013-04-01

Molecular simulation has historically been a low-throughput technique, but faster computers and increasing amounts of genomic and structural data are changing this by enabling large-scale automated simulation of, for instance, many conformers or mutants of biomolecules with or without a range of ligands. At the same time, advances in performance and scaling now make it possible to model complex biomolecular interaction and function in a manner directly testable by experiment. These applications share a need for fast and efficient software that can be deployed on massive scale in clusters, web servers, distributed computing or cloud resources. Here, we present a range of new simulation algorithms and features developed during the past 4 years, leading up to the GROMACS 4.5 software package. The software now automatically handles wide classes of biomolecules, such as proteins, nucleic acids and lipids, and comes with all commonly used force fields for these molecules built-in. GROMACS supports several implicit solvent models, as well as new free-energy algorithms, and the software now uses multithreading for efficient parallelization even on low-end systems, including Windows-based workstations. Together with hand-tuned assembly kernels and state-of-the-art parallelization, this provides extremely high performance and cost efficiency for high-throughput as well as massively parallel simulations. GROMACS is an open source and free software available from http://www.gromacs.org. Supplementary data are available at Bioinformatics online.

Parallel segmented outlet flow high performance liquid chromatography with multiplexed detection

International Nuclear Information System (INIS)

Camenzuli, Michelle; Terry, Jessica M.; Shalliker, R. Andrew; Conlan, Xavier A.; Barnett, Neil W.; Francis, Paul S.

2013-01-01

Highlights: • Multiplexed detection for liquid chromatography. • ‘Parallel segmented outlet flow’ distributes inner and outer portions of the analyte zone. • Three detectors were used simultaneously for the determination of opiate alkaloids. Abstract: We describe a new approach to multiplex detection for HPLC, exploiting parallel segmented outlet flow – a new column technology that provides pressure-regulated control of eluate flow through multiple outlet channels, which minimises the additional dead volume associated with conventional post-column flow splitting. Using three detectors: one UV-absorbance and two chemiluminescence systems (tris(2,2′-bipyridine)ruthenium(III) and permanganate), we examine the relative responses for six opium poppy (Papaver somniferum) alkaloids under conventional and multiplexed conditions, where approximately 30% of the eluate was distributed to each detector and the remaining solution directed to a collection vessel. The parallel segmented outlet flow mode of operation offers advantages in terms of solvent consumption, waste generation, total analysis time and solute band volume when applying multiple detectors to HPLC, but the manner in which each detection system is influenced by changes in solute concentration and solution flow rates must be carefully considered.
Parallel segmented outlet flow high performance liquid chromatography with multiplexed detection

Energy Technology Data Exchange (ETDEWEB)

Camenzuli, Michelle [Australian Centre for Research on Separation Science (ACROSS), School of Science and Health, University of Western Sydney (Parramatta), Sydney, NSW (Australia); Terry, Jessica M. [Centre for Chemistry and Biotechnology, School of Life and Environmental Sciences, Deakin University, Geelong, Victoria 3216 (Australia); Shalliker, R. Andrew, E-mail: r.shalliker@uws.edu.au [Australian Centre for Research on Separation Science (ACROSS), School of Science and Health, University of Western Sydney (Parramatta), Sydney, NSW (Australia); Conlan, Xavier A.; Barnett, Neil W. [Centre for Chemistry and Biotechnology, School of Life and Environmental Sciences, Deakin University, Geelong, Victoria 3216 (Australia); Francis, Paul S., E-mail: paul.francis@deakin.edu.au [Centre for Chemistry and Biotechnology, School of Life and Environmental Sciences, Deakin University, Geelong, Victoria 3216 (Australia)

2013-11-25

Highlights: • Multiplexed detection for liquid chromatography. • ‘Parallel segmented outlet flow’ distributes inner and outer portions of the analyte zone. • Three detectors were used simultaneously for the determination of opiate alkaloids. Abstract: We describe a new approach to multiplex detection for HPLC, exploiting parallel segmented outlet flow – a new column technology that provides pressure-regulated control of eluate flow through multiple outlet channels, which minimises the additional dead volume associated with conventional post-column flow splitting. Using three detectors: one UV-absorbance and two chemiluminescence systems (tris(2,2′-bipyridine)ruthenium(III) and permanganate), we examine the relative responses for six opium poppy (Papaver somniferum) alkaloids under conventional and multiplexed conditions, where approximately 30% of the eluate was distributed to each detector and the remaining solution directed to a collection vessel. The parallel segmented outlet flow mode of operation offers advantages in terms of solvent consumption, waste generation, total analysis time and solute band volume when applying multiple detectors to HPLC, but the manner in which each detection system is influenced by changes in solute concentration and solution flow rates must be carefully considered.
Design of parallel dual-energy X-ray beam and its performance for security radiography

International Nuclear Information System (INIS)

Kim, Kwang Hyun; Myoung, Sung Min; Chung, Yong Hyun

2011-01-01

A new concept of dual-energy X-ray beam generation and acquisition of dual-energy security radiography is proposed. Erbium (Er) and rhodium (Rh) with a copper filter were positioned in front of X-ray tube to generate low- and high-energy X-ray spectra. Low- and high-energy X-rays were guided to separately enter into two parallel detectors. Monte Carlo code of MCNPX was used to derive an optimum thickness of each filter for improved dual X-ray image quality. It was desired to provide separation ability between organic and inorganic matters for the condition of 140 kVp/0.8 mA as used in the security application. Acquired dual-energy X-ray beams were evaluated by the dual-energy Z-map yielding enhanced performance compared with a commercial dual-energy detector. A collimator for the parallel dual-energy X-ray beam was designed to minimize X-ray beam interference between low- and high-energy parallel beams for 500 mm source-to-detector distance.
High performance computing of density matrix renormalization group method for 2-dimensional model. Parallelization strategy toward peta computing

International Nuclear Information System (INIS)

Yamada, Susumu; Igarashi, Ryo; Machida, Masahiko; Imamura, Toshiyuki; Okumura, Masahiko; Onishi, Hiroaki

2010-01-01

We parallelize the density matrix renormalization group (DMRG) method, which is a ground-state solver for one-dimensional quantum lattice systems. The parallelization allows us to extend the applicable range of the DMRG to n-leg ladders, i.e., quasi-two-dimensional cases. Such an extension is expected to bring about several breakthroughs in, e.g., quantum physics, chemistry, and nano-engineering. However, a straightforward parallelization requires all-to-all communications between all processes, which are unsuitable for multi-core systems, the mainstream of current parallel computers. Therefore, we optimize the all-to-all communications in the following two steps. The first is the elimination of the communications between all processes by only rearranging the data distribution while keeping the amount of communicated data unchanged. The second is the avoidance of communication conflicts by rescheduling the calculation and the communication. We evaluate the performance of the DMRG method on multi-core supercomputers and confirm that our two-step tuning is quite effective. (author)
HPDC ´12 : proceedings of the 21st ACM symposium on high-performance parallel and distributed computing, June 18-22, 2012, Delft, The Netherlands

NARCIS (Netherlands)

Epema, D.H.J.; Kielmann, T.; Ripeanu, M.

2012-01-01

Welcome to ACM HPDC 2012! This is the twenty-first year of HPDC and we are pleased to report that our community continues to grow in size, quality and reputation. The program consists of three days packed with presentations on the latest developments in high-performance parallel and distributed
All silicon waveguide spherical microcavity coupler device.

Science.gov (United States)

Xifré-Pérez, E; Domenech, J D; Fenollosa, R; Muñoz, P; Capmany, J; Meseguer, F

2011-02-14

A coupler based on silicon spherical microcavities coupled to silicon waveguides for telecom wavelengths is presented. The light scattered by the microcavity is detected and analyzed as a function of the wavelength. The transmittance signal through the waveguide is strongly attenuated (up to 25 dB) at wavelengths corresponding to the Mie resonances of the microcavity. The coupling between the microcavity and the waveguide is experimentally demonstrated and theoretically modeled with the help of FDTD calculations.
Performance evaluation for compressible flow calculations on five parallel computers of different architectures

International Nuclear Information System (INIS)

Kimura, Toshiya.

1997-03-01

A two-dimensional explicit Euler solver has been implemented for five MIMD parallel computers of different machine architectures at the Center for Promotion of Computational Science and Engineering of the Japan Atomic Energy Research Institute. These parallel computers are Fujitsu VPP300, NEC SX-4, CRAY T94, IBM SP2, and Hitachi SR2201. The code was parallelized by several parallelization methods, and a typical compressible flow problem has been calculated for different grid sizes while changing the number of processors. Their effective performance in parallel calculations, such as calculation speed, speed-up ratio and parallel efficiency, has been investigated and evaluated. The communication time among processors has also been measured and evaluated. As a result, the differences in performance and characteristics between vector-parallel and scalar-parallel computers can be pointed out, and the results provide basic data for the efficient use of parallel computers and for large-scale CFD simulations on parallel computers. (author)
Operational mesoscale atmospheric dispersion prediction using high performance parallel computing cluster for emergency response

International Nuclear Information System (INIS)

Srinivas, C.V.; Venkatesan, R.; Muralidharan, N.V.; Das, Someshwar; Dass, Hari; Eswara Kumar, P.

2005-08-01

An operational atmospheric dispersion prediction system is implemented on a cluster super computer for 'Online Emergency Response' for Kalpakkam nuclear site. The numerical system constitutes a parallel version of a nested grid meso-scale meteorological model MM5 coupled to a random walk particle dispersion model FLEXPART. The system provides 48 hour forecast of the local weather and radioactive plume dispersion due to hypothetical air borne releases in a range of 100 km around the site. The parallel code was implemented on different cluster configurations like distributed and shared memory systems. Results of MM5 run time performance for 1-day prediction are reported on all the machines available for testing. A reduction of 5 times in runtime is achieved using 9 dual Xeon nodes (18 physical/36 logical processors) compared to a single node sequential run. Based on the above run time results a cluster computer facility with 9-node Dual Xeon is commissioned at IGCAR for model operation. The run time of a triple nested domain MM5 is about 4 h for 24 h forecast. The system has been operated continuously for a few months and results were ported on the IMSc home page. Initial and periodic boundary condition data for MM5 are provided by NCMRWF, New Delhi. An alternative source is found to be NCEP, USA. These two sources provide the input data to the operational models at different spatial and temporal resolutions and using different assimilation methods. A comparative study on the results of forecast is presented using these two data sources for present operational use. Slight improvement is noticed in rainfall, winds, geopotential heights and the vertical atmospheric structure while using NCEP data probably because of its high spatial and temporal resolution. (author)
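The reported fivefold runtime reduction on 9 dual-Xeon nodes can be put into the usual speed-up and parallel-efficiency terms. The sketch below is only an illustration: the function names are hypothetical, and the choice of dividing by 9 nodes or by 18 physical processors is an assumption about how one wishes to count workers, since the abstract gives only the ratio against a single-node sequential run.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speed-up S = T_serial / T_parallel."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_serial / t_parallel


def parallel_efficiency(s: float, n_workers: int) -> float:
    """Parallel efficiency E = S / p for p workers."""
    if n_workers <= 0:
        raise ValueError("worker count must be positive")
    return s / n_workers


# Figures taken from the abstract: a 5x runtime reduction on 9 dual-Xeon
# nodes (18 physical processors) relative to a single-node sequential run.
REPORTED_SPEEDUP = 5.0
NODES = 9
PHYSICAL_PROCESSORS = 18

# Assumption: efficiency is shown per node and per physical processor,
# because the abstract does not say which count it regards as "p".
eff_per_node = parallel_efficiency(REPORTED_SPEEDUP, NODES)
eff_per_processor = parallel_efficiency(REPORTED_SPEEDUP, PHYSICAL_PROCESSORS)

print(f"efficiency per node:      {eff_per_node:.3f}")
print(f"efficiency per processor: {eff_per_processor:.3f}")
```

Under these assumptions the efficiency is 5/9 when counted per node and 5/18 when counted per physical processor, which is one way to read why adding nodes beyond this size may not have been attractive.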
DVS-SOFTWARE: An Effective Tool for Applying Highly Parallelized Hardware To Computational Geophysics

Science.gov (United States)

Herrera, I.; Herrera, G. S.

2015-12-01

Most geophysical systems are macroscopic physical systems. The behavior prediction of such systems is carried out by means of computational models whose basic models are partial differential equations (PDEs) [1]. Due to the enormous size of the discretized version of such PDEs it is necessary to apply highly parallelized super-computers. For them, at present, the most efficient software is based on non-overlapping domain decomposition methods (DDM). However, a limiting feature of the present state-of-the-art techniques is due to the kind of discretizations used in them. Recently, I. Herrera and co-workers using 'non-overlapping discretizations' have produced the DVS-Software which overcomes this limitation [2]. The DVS-software can be applied to a great variety of geophysical problems and achieves very high parallel efficiencies (90%, or so [3]). It is therefore very suitable for effectively applying the most advanced parallel supercomputers available at present. In a parallel talk, in this AGU Fall Meeting, Graciela Herrera Z. will present how this software is being applied to advance MOD-FLOW. Key Words: Parallel Software for Geophysics, High Performance Computing, HPC, Parallel Computing, Domain Decomposition Methods (DDM). REFERENCES: [1] Herrera, Ismael and George F. Pinder, "Mathematical Modelling in Science and Engineering: An axiomatic approach", John Wiley, 243p., 2012. [2] Herrera, I., de la Cruz L.M. and Rosas-Medina A., "Non Overlapping Discretization Methods for Partial Differential Equations", NUMER METH PART D E, 30: 1427-1454, 2014, DOI 10.1002/num.21852. (Open source) [3] Herrera, I. and Contreras Iván, "An Innovative Tool for Effectively Applying Highly Parallelized Software To Problems of Elasticity", Geofísica Internacional, 2015 (In press)
Parallel computing works

Energy Technology Data Exchange (ETDEWEB)

1991-10-23

An account of the Caltech Concurrent Computation Program (C³P), a five-year project that focused on answering the question: "Can parallel computers be used to do large-scale scientific computations?" As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C³P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C³P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.
Visualization and Data Analysis for High-Performance Computing

Energy Technology Data Exchange (ETDEWEB)

Sewell, Christopher Meyer [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

2016-09-27

This is a set of slides from a guest lecture for a class at the University of Texas, El Paso on visualization and data analysis for high-performance computing. The topics covered are the following: trends in high-performance computing; scientific visualization, such as OpenGL, ray tracing and volume rendering, VTK, and ParaView; data science at scale, such as in-situ visualization, image databases, distributed memory parallelism, shared memory parallelism, VTK-m, "big data", and then an analysis example.
Advanced parallel processing with supercomputer architectures

International Nuclear Information System (INIS)

Hwang, K.

1987-01-01

This paper investigates advanced parallel processing techniques and innovative hardware/software architectures that can be applied to boost the performance of supercomputers. Critical issues on architectural choices, parallel languages, compiling techniques, resource management, concurrency control, programming environment, parallel algorithms, and performance enhancement methods are examined and the best answers are presented. The authors cover advanced processing techniques suitable for supercomputers, high-end mainframes, minisupers, and array processors. The coverage emphasizes vectorization, multitasking, multiprocessing, and distributed computing. In order to achieve these operation modes, parallel languages, smart compilers, synchronization mechanisms, load balancing methods, mapping of parallel algorithms, operating system functions, application libraries, and multidiscipline interactions are investigated to ensure high performance. At the end, they assess the potential of optical and neural technologies for developing future supercomputers.
A Novel Multimode Waveguide Coupler for Accurate Power Measurement of Traveling Wave Tube Harmonic Frequencies

Science.gov (United States)

Wintucky, Edwin G.; Simons, Rainee N.

2014-01-01

This paper presents the design, fabrication and test results for a novel waveguide multimode directional coupler (MDC). The coupler fabricated from two dissimilar waveguides is capable of isolating the power at the second harmonic frequency from the fundamental power at the output port of a traveling-wave tube (TWT). In addition to accurate power measurements at harmonic frequencies, a potential application of the MDC is in the design of a beacon source for atmospheric propagation studies at millimeter-wave frequencies.
Evaluation of the power consumption of a high-speed parallel robot

Science.gov (United States)

Han, Gang; Xie, Fugui; Liu, Xin-Jun

2018-06-01

An inverse dynamic model of a high-speed parallel robot is established based on the virtual work principle. With this dynamic model, a new evaluation method is proposed to measure the power consumption of the robot during pick-and-place tasks. The power vector is extended in this method and used to represent the collinear velocity and acceleration of the moving platform. Afterward, several dynamic performance indices, which are homogeneous and possess obvious physical meanings, are proposed. These indices can evaluate the power input and output transmissibility of the robot in a workspace. The distributions of the power input and output transmissibility of the high-speed parallel robot are derived with these indices and clearly illustrated in atlases. Furthermore, a low-power-consumption workspace is selected for the robot.
Flexural behavior of concrete beam with mechanical splices of reinforcement subjected to cyclic loading

International Nuclear Information System (INIS)

Nab, H. S.; Kim, W. B.

2008-01-01

In nuclear power plant structures, the mechanical rebar splices are designated and constructed on the basis of the ACI and ASME codes. Regardless of good performance on mechanical rebar splices, these splicing methods that did not be registered on ASME code have not restricted to apply to construction site. In this study, the main candidate splice is the cold roll formed parallel threaded splice. This was newly registered in ASME Section III Division 2 CC 4333 'Mechanical Splices' in 2004. To compare the traditional rebar splice with mechanical rebar splices, concrete beams were made to evaluate the ductility of spliced reinforcing bars. Based on the experimental results, it was found that mechanical rebar splices using parallel threaded couplers had a better accumulated energy dissipation capacity to resist seismic behavior than traditional lap splices. Concrete specimens with D36 reinforcing bar couplers showed 1.8 times better performance, and concrete specimens with D22 reinforcing bar couplers showed 2.8 times better performance. (authors)
Distributed and parallel approach for handle and perform huge datasets

Science.gov (United States)

Konopko, Joanna

2015-12-01

Big Data refers to the dynamic, large and disparate volumes of data that come from many different sources (tools, machines, sensors, mobile devices) and are uncorrelated with each other. It requires new, innovative and scalable technology to collect, host and analytically process the vast amount of data. A proper architecture for systems that process huge data sets is needed. In this paper, a comparison of distributed and parallel system architectures is presented using the examples of the MapReduce (MR) Hadoop platform and a parallel database platform (DBMS). This paper also analyzes the problem of processing and handling valuable information from petabytes of data. Both paradigms, MapReduce and parallel DBMS, are described and compared. A hybrid architecture approach is also proposed that could be used to solve the analyzed problem of storing and processing Big Data.
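As a rough illustration of the MapReduce paradigm mentioned in this abstract, the following single-process sketch walks through the map, shuffle and reduce phases for a word count. It is not Hadoop code and not the paper's architecture; all function names are hypothetical, and a real platform would distribute each phase across many machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(record: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine the values of one key into a single count."""
    return key, sum(values)


def run_job(records: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence inside one process."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(run_job(["big data", "Big systems"]))
```

A parallel DBMS would typically express the same aggregation declaratively (for example as a grouped count) and let the query planner decide how to partition the work, which is the contrast the abstract draws between the two paradigms.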
Parallel Architectures and Parallel Algorithms for Integrated Vision Systems. Ph.D. Thesis

Science.gov (United States)

Choudhary, Alok Nidhi

1989-01-01

Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is a system that uses vision algorithms from all levels of processing to perform for a high level application (e.g., object recognition). An IVS normally involves algorithms from low level, intermediate level, and high level vision. Designing parallel architectures for vision systems is of tremendous interest to researchers. Several issues are addressed in parallel architectures and parallel algorithms for integrated vision systems.
Fully etched apodized grating coupler on the SOI platform with −0.58 dB coupling efficiency

DEFF Research Database (Denmark)

Ding, Yunhong; Peucheret, Christophe; Ou, Haiyan

2014-01-01

We design and fabricate an ultrahigh coupling efficiency (CE) fully etched apodized grating coupler on the silicon-on-insulator (SOI) platform using subwavelength photonic crystals and bonded aluminum mirror. Fabrication error sensitivity and coupling angle dependence are experimentally investiga...
NETRA: A parallel architecture for integrated vision systems 2: Algorithms and performance evaluation

Science.gov (United States)

Choudhary, Alok N.; Patel, Janak H.; Ahuja, Narendra

1989-01-01

In part 1 architecture of NETRA is presented. A performance evaluation of NETRA using several common vision algorithms is also presented. Performance of algorithms when they are mapped on one cluster is described. It is shown that SIMD, MIMD, and systolic algorithms can be easily mapped onto processor clusters, and almost linear speedups are possible. For some algorithms, analytical performance results are compared with implementation performance results. It is observed that the analysis is very accurate. Performance analysis of parallel algorithms when mapped across clusters is presented. Mappings across clusters illustrate the importance and use of shared as well as distributed memory in achieving high performance. The parameters for evaluation are derived from the characteristics of the parallel algorithms, and these parameters are used to evaluate the alternative communication strategies in NETRA. Furthermore, the effect of communication interference from other processors in the system on the execution of an algorithm is studied. Using the analysis, performance of many algorithms with different characteristics is presented. It is observed that if communication speeds are matched with the computation speeds, good speedups are possible when algorithms are mapped across clusters.
The Permanent Magnet Operating Mechanism of Double Coil Parallel Driven at a High Speed

Directory of Open Access Journals (Sweden)

WEI Xau-Lao

2017-02-01

Abstract: The operating mechanism is the main part of a breaker, and the quality of the breaker directly influences the safe operation of the power system. Because the requirements on switches are continuously increasing, and in order to make the actuator close faster and more powerfully, this paper proposes a double coil parallel driven permanent magnet actuator operating at high speed. This paper expounds the working principle of single and double coil parallel driven permanent magnet actuators. It builds a model in Ansoft and compares the simulation with test results. In practice, we designed and produced single and double coil parallel driven permanent magnet actuators for experimental study. The simulation and experiment results show that the double coil parallel driven permanent magnet actuator, compared with the single coil parallel driven permanent magnet actuator, has a better and faster action performance. Thus, the double coil parallel driven permanent magnet actuator achieves a kind of optimization.

Parallelization of the AliRoot event reconstruction by performing a semi- automatic source-code transformation

CERN Multimedia

CERN. Geneva

2012-01-01

side bus or processor interconnections. Parallelism can only result in performance gain, if the memory usage is optimized, memory locality improved and the communication between threads is minimized. But the domain of concurrent programming has become a field for highly skilled experts, as the implementation of multithreading is difficult, error prone and labor intensive. A full re-implementation for parallel execution of existing offline frameworks, like AliRoot in ALICE, is thus unaffordable. An alternative method, is to use a semi-automatic source-to-source transformation for getting a simple parallel design, with almost no interference between threads. This reduces the need of rewriting the develop...
A grating coupler with a trapezoidal hole array for perfectly vertical light coupling between optical fibers and waveguides

Science.gov (United States)

Mizutani, Akio; Eto, Yohei; Kikuta, Hisao

2017-12-01

A grating coupler with a trapezoidal hole array was designed and fabricated for perfectly vertical light coupling between a single-mode optical fiber and a silicon waveguide on a silicon-on-insulator (SOI) substrate. The grating coupler with an efficiency of 53% was computationally designed for a 1.1-µm-thick buried oxide (BOX) layer. The grating coupler and silicon waveguide were fabricated on the SOI substrate with a 3.0-µm-thick BOX layer by a single full-etch process. The measured coupling efficiency was 24% for TE-polarized light at 1528 nm wavelength, which was 0.69 times the calculated coupling efficiency for the 3.0-µm-thick BOX layer.
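Coupling efficiencies in this listing are quoted both as percentages and in decibels, so a small conversion sketch may help when comparing records. The helper names below are hypothetical; the 53%, 24% and 0.69 figures are taken from the abstract above, and the "implied" calculated efficiency is simply the measured value divided by the stated ratio, not a number reported by the authors.

```python
import math


def fraction_to_db(fraction: float) -> float:
    """Convert a power coupling efficiency (0 < fraction <= 1) to dB."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    return 10.0 * math.log10(fraction)


def db_to_fraction(db: float) -> float:
    """Convert a coupling efficiency in dB back to a power fraction."""
    return 10.0 ** (db / 10.0)


DESIGNED = 0.53   # simulated efficiency for the 1.1-um BOX design
MEASURED = 0.24   # measured efficiency on the 3.0-um BOX sample
RATIO = 0.69      # measured / calculated for the 3.0-um BOX layer

# Derived here for illustration only: calculated efficiency implied by
# the two figures above for the 3.0-um BOX layer.
implied_calculated = MEASURED / RATIO

for label, value in [
    ("designed", DESIGNED),
    ("measured", MEASURED),
    ("implied calculated", implied_calculated),
]:
    print(f"{label:>18}: {value:.3f} = {fraction_to_db(value):.2f} dB")
```

The same helpers work in the other direction for records that quote efficiency in dB, by passing the dB value to `db_to_fraction`.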
Evaluation of high-performance computing software

Energy Technology Data Exchange (ETDEWEB)

Browne, S.; Dongarra, J. [Univ. of Tennessee, Knoxville, TN (United States); Rowan, T. [Oak Ridge National Lab., TN (United States)

1996-12-31

The absence of unbiased and up-to-date comparative evaluations of high-performance computing software complicates a user's search for the appropriate software package. The National HPCC Software Exchange (NHSE) is attacking this problem using an approach that includes independent evaluations of software, incorporation of author and user feedback into the evaluations, and Web access to the evaluations. We are applying this approach to the Parallel Tools Library (PTLIB), a new software repository for parallel systems software and tools, and HPC-Netlib, a high performance branch of the Netlib mathematical software repository. Updating the evaluations with feedback and making them available via the Web helps ensure accuracy and timeliness, and using independent reviewers produces unbiased comparative evaluations that are difficult to find elsewhere.
Analysis and comparison between electric and magnetic power couplers for accelerators in Free Electron Lasers (FEL)

Science.gov (United States)

Serpico, C.; Grudiev, A.; Vescovo, R.

2016-10-01

Free-electron lasers represent a new and exciting class of coherent optical sources possessing broad wavelength tunability and excellent optical-beam quality. The FERMI seeded free-electron laser (FEL), located at the Elettra laboratory in Trieste, is driven by a 200 m long, S-band linac: the high energy part of the linac is equipped with 6 m long backward traveling wave (BTW) structures. The structures have a small iris radius and a nose cone geometry which allows for high gradient operation. Development of new high-gradient, S-band accelerating structures for the replacement of the existing BTWs is under consideration. This paper investigates two possible solutions for RF power couplers suitable for a linac-driven FEL, which requires reduced wakefield effects, a high operating gradient and very high reliability. The first part of the manuscript focuses on the reduction of residual field asymmetries, while the second analyzes RF performance, the peak surface fields and the expected breakdown rate. In the conclusion, the two solutions are compared and their pros and cons are highlighted.
Analysis and comparison between electric and magnetic power couplers for accelerators in Free Electron Lasers (FEL)

Energy Technology Data Exchange (ETDEWEB)

Serpico, C., E-mail: claudio.serpico@elettra.eu [Elettra - Sincrotrone Trieste, Trieste (Italy); Grudiev, A. [CERN, Geneva (Switzerland); Vescovo, R. [University of Trieste, Trieste (Italy)

2016-10-11

Free-electron lasers represent a new and exciting class of coherent optical sources possessing broad wavelength tunability and excellent optical-beam quality. The FERMI seeded free-electron laser (FEL), located at the Elettra laboratory in Trieste, is driven by a 200 m long, S-band linac: the high energy part of the linac is equipped with 6 m long backward traveling wave (BTW) structures. The structures have a small iris radius and a nose cone geometry which allows for high gradient operation. Development of new high-gradient, S-band accelerating structures for the replacement of the existing BTWs is under consideration. This paper investigates two possible solutions for RF power couplers suitable for a linac-driven FEL, which requires reduced wakefield effects, a high operating gradient and very high reliability. The first part of the manuscript focuses on the reduction of residual field asymmetries, while the second analyzes RF performance, the peak surface fields and the expected breakdown rate. In the conclusion, the two solutions are compared and their pros and cons are highlighted.
High-speed parallel counter

International Nuclear Information System (INIS)

Gus'kov, B.N.; Kalinnikov, V.A.; Krastev, V.R.; Maksimov, A.N.; Nikityuk, N.M.

1985-01-01

This paper describes a high-speed parallel counter that contains 31 inputs and 15 outputs and is implemented with integrated circuits of series 500. The counter is designed for fast sampling of events according to the number of particles that pass simultaneously through the hodoscopic plane of the detector. The minimum delay of the output signals relative to the input is 43 nsec. The duration of the output signals can be varied from 75 to 120 nsec.
Performance of Air Pollution Models on Massively Parallel Computers

DEFF Research Database (Denmark)

Brown, John; Hansen, Per Christian; Wasniewski, Jerzy

1996-01-01

To compare the performance and use of three massively parallel SIMD computers, we implemented a large air pollution model on the computers. Using a realistic large-scale model, we gain detailed insight about the performance of the three computers when used to solve large-scale scientific problems...
Coupler tuning for constant gradient travelling wave accelerating structures

International Nuclear Information System (INIS)

Guo Xingkun; Ma Yanyun; Wang Xiulong

2013-01-01

The method of the coupler tuning for the constant gradient traveling wave accelerating structure was described and the formula of coupling coefficient p was deduced on the basis of analyzing the existing methods for the constant impedance traveling wave accelerating structures and coupling-cavity chain equivalent circuits. The method and formula were validated by the simulation result by CST and experiment data. (authors)
Line filter design of parallel interleaved VSCs for high power wind energy conversion systems

DEFF Research Database (Denmark)

Gohil, Ghanshyamsinh Vijaysinh; Bede, Lorand; Teodorescu, Remus

2015-01-01

The Voltage Source Converters (VSCs) are often connected in parallel in a Wind Energy Conversion System (WECS) to match the high power rating of the modern wind turbines. The effect of the interleaved carriers on the harmonic performance of the parallel connected VSCs is analyzed in this paper...... limit. In order to achieve the desired filter performance with optimal values of the filter parameters, the use of a LC trap branch with the conventional LCL filter is proposed. The expressions for the resonant frequencies of the proposed line filter are derived and used in the design to selectively...
Strategies and Experiences Using High Performance Fortran

National Research Council Canada - National Science Library

Shires, Dale

2001-01-01

.... High Performance Fortran (HPF) is a relatively new addition to the Fortran dialect. It is an attempt to provide an efficient high-level Fortran parallel programming language for the latest generation of been debatable...
Enabling high performance computational science through combinatorial algorithms

International Nuclear Information System (INIS)

Boman, Erik G; Bozdag, Doruk; Catalyurek, Umit V; Devine, Karen D; Gebremedhin, Assefaw H; Hovland, Paul D; Pothen, Alex; Strout, Michelle Mills

2007-01-01

The Combinatorial Scientific Computing and Petascale Simulations (CSCAPES) Institute is developing algorithms and software for combinatorial problems that play an enabling role in scientific and engineering computations. Discrete algorithms will be increasingly critical for achieving high performance for irregular problems on petascale architectures. This paper describes recent contributions by researchers at the CSCAPES Institute in the areas of load balancing, parallel graph coloring, performance improvement, and parallel automatic differentiation.
Enabling high performance computational science through combinatorial algorithms

Energy Technology Data Exchange (ETDEWEB)

Boman, Erik G [Discrete Algorithms and Math Department, Sandia National Laboratories (United States); Bozdag, Doruk [Biomedical Informatics, and Electrical and Computer Engineering, Ohio State University (United States); Catalyurek, Umit V [Biomedical Informatics, and Electrical and Computer Engineering, Ohio State University (United States); Devine, Karen D [Discrete Algorithms and Math Department, Sandia National Laboratories (United States); Gebremedhin, Assefaw H [Computer Science and Center for Computational Science, Old Dominion University (United States); Hovland, Paul D [Mathematics and Computer Science Division, Argonne National Laboratory (United States); Pothen, Alex [Computer Science and Center for Computational Science, Old Dominion University (United States); Strout, Michelle Mills [Computer Science, Colorado State University (United States)

2007-07-15

The Combinatorial Scientific Computing and Petascale Simulations (CSCAPES) Institute is developing algorithms and software for combinatorial problems that play an enabling role in scientific and engineering computations. Discrete algorithms will be increasingly critical for achieving high performance for irregular problems on petascale architectures. This paper describes recent contributions by researchers at the CSCAPES Institute in the areas of load balancing, parallel graph coloring, performance improvement, and parallel automatic differentiation.
Parallel computing for event reconstruction in high-energy physics

International Nuclear Information System (INIS)

Wolbers, S.

1993-01-01

Parallel computing has been recognized as a solution to large computing problems. In High Energy Physics, offline event reconstruction of detector data is a very large computing problem that has been solved with parallel computing techniques. A review is given of the parallel programming package CPS (Cooperative Processes Software), developed and used at Fermilab for offline reconstruction of terabytes of data requiring the delivery of hundreds of Vax-Years per experiment. The Fermilab UNIX farms, consisting of 180 Silicon Graphics workstations and 144 IBM RS6000 workstations, are used to provide the computing power for the experiments. Fermilab has had a long history of providing production parallel computing, starting with the ACP (Advanced Computer Project) Farms in 1986. The Fermilab UNIX Farms have been in production for over 2 years with 24 hour/day service to experimental user groups. Additional tools for management, control and monitoring of these large systems will be described. Possible future directions for parallel computing in High Energy Physics will be given.
Straightforward and accurate technique for post-coupler stabilization in drift tube linac structures

Directory of Open Access Journals (Sweden)

Mohammad Reza Khalvati

2016-04-01

The axial electric field of Alvarez drift tube linacs (DTLs) is known to be susceptible to variations due to static and dynamic effects like manufacturing tolerances and beam loading. Post-couplers are used to stabilize the accelerating fields of DTLs against tuning errors. Tilt sensitivity and its slope have been introduced as measures for the stability right from the invention of post-couplers but since then the actual stabilization has mostly been done by tedious iteration. In the present article, the local tilt-sensitivity slope TS_{n}^{′} is established as the principal measure for stabilization instead of tilt sensitivity or some visual slope, and its significance is developed on the basis of an equivalent-circuit diagram of the DTL. Experimental and 3D simulation results are used to analyze its behavior and to define a technique for stabilization that allows finding the best post-coupler settings with just four tilt-sensitivity measurements. CERN’s Linac4 DTL Tank 2 and Tank 3 have been stabilized successfully using this technique. The final tilt-sensitivity error has been reduced from ±100%/MHz down to ±3%/MHz for Tank 2 and down to ±1%/MHz for Tank 3. Finally, an accurate procedure for tuning the structure using slug tuners is discussed.
Straightforward and accurate technique for post-coupler stabilization in drift tube linac structures

Science.gov (United States)

Khalvati, Mohammad Reza; Ramberger, Suitbert

2016-04-01

The axial electric field of Alvarez drift tube linacs (DTLs) is known to be susceptible to variations due to static and dynamic effects like manufacturing tolerances and beam loading. Post-couplers are used to stabilize the accelerating fields of DTLs against tuning errors. Tilt sensitivity and its slope have been introduced as measures for the stability right from the invention of post-couplers but since then the actual stabilization has mostly been done by tedious iteration. In the present article, the local tilt-sensitivity slope TSn' is established as the principal measure for stabilization instead of tilt sensitivity or some visual slope, and its significance is developed on the basis of an equivalent-circuit diagram of the DTL. Experimental and 3D simulation results are used to analyze its behavior and to define a technique for stabilization that allows finding the best post-coupler settings with just four tilt-sensitivity measurements. CERN's Linac4 DTL Tank 2 and Tank 3 have been stabilized successfully using this technique. The final tilt-sensitivity error has been reduced from ±100 %/MHz down to ±3 %/MHz for Tank 2 and down to ±1 %/MHz for Tank 3. Finally, an accurate procedure for tuning the structure using slug tuners is discussed.
A parallelization study of the general purpose Monte Carlo code MCNP4 on a distributed memory highly parallel computer

International Nuclear Information System (INIS)

Yamazaki, Takao; Fujisaki, Masahide; Okuda, Motoi; Takano, Makoto; Masukawa, Fumihiro; Naito, Yoshitaka

1993-01-01

The general purpose Monte Carlo code MCNP4 has been implemented on the Fujitsu AP1000 distributed memory highly parallel computer. Parallelization techniques developed and studied are reported. A shielding analysis function of the MCNP4 code is parallelized in this study. A technique was applied that maps each history to a processor dynamically and maps the control process to a fixed processor. The efficiency of the parallelized code is up to 80% for a typical practical problem with 512 processors. These results demonstrate the advantages of a highly parallel computer over conventional computers in the field of shielding analysis by the Monte Carlo method. (orig.)
A highly scalable massively parallel fast marching method for the Eikonal equation

Science.gov (United States)

Yang, Jianming; Stern, Frederick

2017-03-01

The fast marching method is a widely used numerical method for solving the Eikonal equation arising from a variety of scientific and engineering fields. It has long been deemed inherently sequential, and an efficient parallel algorithm applicable to large-scale practical applications is not available in the literature. In this study, we present a highly scalable massively parallel implementation of the fast marching method using a domain decomposition approach. Central to this algorithm is a novel restarted narrow band approach that coordinates the frequency of communications and the amount of computation beyond that of a sequential run to achieve an unprecedented parallel performance. Within each restart, the narrow band fast marching method is executed; simple synchronous local exchanges and global reductions are adopted for communicating updated data in the overlapping regions between neighboring subdomains and getting the latest front status, respectively. The independence of front characteristics is exploited through special data structures and augmented status tags to extract the masked parallelism within the fast marching method. The efficiency, flexibility, and applicability of the parallel algorithm are demonstrated through several examples. These problems are extensively tested on six grids with up to 1 billion points using different numbers of processes ranging from 1 to 65536. Remarkable parallel speedups are achieved using tens of thousands of processes. Detailed pseudo-codes for both the sequential and parallel algorithms are provided to illustrate the simplicity of the parallel implementation and its similarity to the sequential narrow band fast marching algorithm.
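For readers unfamiliar with the sequential method that the abstract above starts from, the following is a minimal sketch of a first-order fast marching solver on a uniform 2-D grid. It is not the authors' pseudo-code and does not include their restarted narrow band or domain decomposition; the function name `fast_marching`, the list-of-lists grid layout, and the unit grid spacing default are assumptions made for illustration.

```python
import heapq
import math

def fast_marching(speed, sources, h=1.0):
    """Solve |grad T| = 1/speed on a 2-D grid; T = 0 at the source cells."""
    ny, nx = len(speed), len(speed[0])
    T = [[math.inf] * nx for _ in range(ny)]
    accepted = [[False] * nx for _ in range(ny)]
    heap = []  # the narrow band, ordered by tentative arrival time
    for i, j in sources:
        T[i][j] = 0.0
        heapq.heappush(heap, (0.0, i, j))

    def known(i, j):
        # Only accepted (final) values may act as upwind neighbors.
        inside = 0 <= i < ny and 0 <= j < nx
        return T[i][j] if inside and accepted[i][j] else math.inf

    def update(i, j):
        # First-order upwind discretization of the Eikonal equation.
        a = min(known(i - 1, j), known(i + 1, j))
        b = min(known(i, j - 1), known(i, j + 1))
        f = h / speed[i][j]
        if abs(a - b) >= f:          # only one axis contributes
            return min(a, b) + f
        return 0.5 * (a + b + math.sqrt(2.0 * f * f - (a - b) ** 2))

    while heap:
        _, i, j = heapq.heappop(heap)
        if accepted[i][j]:
            continue                 # stale heap entry
        accepted[i][j] = True
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < ny and 0 <= nj < nx and not accepted[ni][nj]:
                t_new = update(ni, nj)
                if t_new < T[ni][nj]:
                    T[ni][nj] = t_new
                    heapq.heappush(heap, (t_new, ni, nj))
    return T

# Unit speed, single source in the corner of a 5 x 5 grid.
grid = [[1.0] * 5 for _ in range(5)]
times = fast_marching(grid, [(0, 0)])
```

The strict ordering imposed by the heap is what makes the method look inherently sequential: each accepted cell depends on all cells with smaller arrival times.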
Parallel performance of TORT on the CRAY J90: Model and measurement

International Nuclear Information System (INIS)

Barnett, A.; Azmy, Y.Y.

1997-10-01

A limitation on the parallel performance of TORT on the CRAY J90 is the amount of extra work introduced by the multitasking algorithm itself. The extra work beyond that of the serial version of the code, called overhead, arises from the synchronization of the parallel tasks and the accumulation of results by the master task. The goal of recent updates to TORT was to reduce the time consumed by these activities. To help understand which components of the multitasking algorithm contribute significantly to the overhead, a parallel performance model was constructed and compared to measurements of actual timings of the code.
Massively parallel mathematical sieves

Energy Technology Data Exchange (ETDEWEB)

Montry, G.R.

1989-01-01

The Sieve of Eratosthenes is a well-known algorithm for finding all prime numbers in a given subset of integers. A parallel version of the Sieve is described that produces computational speedups over 800 on a hypercube with 1,024 processing elements for problems of fixed size. Computational speedups as high as 980 are achieved when the problem size per processor is fixed. The method of parallelization generalizes to other sieves and will be efficient on any ensemble architecture. We investigate two highly parallel sieves using scattered decomposition and compare their performance on a hypercube multiprocessor. A comparison of different parallelization techniques for the sieve illustrates the trade-offs necessary in the design and implementation of massively parallel algorithms for large ensemble computers.
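As an illustration of how a sieve can be split across processing elements, here is a small sketch. It uses a contiguous block decomposition, which is a simpler stand-in and not the scattered decomposition studied in the report, and it runs the blocks one after another in a single process. The names `base_primes`, `sieve_block`, `block_sieve` and `parallel_efficiency` are hypothetical.

```python
import math

def base_primes(limit):
    """Serial Sieve of Eratosthenes: all primes <= limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for q in range(2, math.isqrt(limit) + 1):
        if is_prime[q]:
            for m in range(q * q, limit + 1, q):
                is_prime[m] = False
    return [k for k, flag in enumerate(is_prime) if flag]

def sieve_block(lo, hi, primes):
    """Primes in [lo, hi), given every prime q with q * q < hi."""
    flags = [True] * (hi - lo)
    for q in primes:
        if q * q >= hi:
            break
        # First multiple of q inside the block, never below q * q.
        start = max(q * q, ((lo + q - 1) // q) * q)
        for m in range(start, hi, q):
            flags[m - lo] = False
    return [lo + k for k, flag in enumerate(flags) if flag and lo + k >= 2]

def block_sieve(n, workers):
    """Primes <= n, computed block by block as separate workers would."""
    primes = base_primes(math.isqrt(n))   # small shared table
    size = -(-(n + 1) // workers)         # ceiling division
    found = []
    for w in range(workers):              # each iteration is independent work
        lo, hi = w * size, min((w + 1) * size, n + 1)
        if lo < hi:
            found.extend(sieve_block(lo, hi, primes))
    return found

def parallel_efficiency(speedup, processors):
    """Speedup divided by the number of processors."""
    return speedup / processors

primes_to_100 = block_sieve(100, 4)
fixed_size_efficiency = parallel_efficiency(800, 1024)
scaled_size_efficiency = parallel_efficiency(980, 1024)
```

With the speedups quoted in the abstract, the efficiency is about 0.78 for the fixed-size problem and about 0.96 when the problem size per processor is fixed.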
Using the Eclipse Parallel Tools Platform to Assist Earth Science Model Development and Optimization on High Performance Computers

Science.gov (United States)

Alameda, J. C.

2011-12-01

Development and optimization of computational science models, particularly on high performance computers, and with the advent of ubiquitous multicore processor systems, practically on every system, has been accomplished with basic software tools, typically command-line based compilers, debuggers, and performance tools that have not changed substantially from the days of serial and early vector computers. However, model complexity, including the complexity added by modern message passing libraries such as MPI, and the need for hybrid code models (such as OpenMP and MPI) to be able to take full advantage of high performance computers with an increasing core count per shared memory node, have made development and optimization of such codes an increasingly arduous task. Additional architectural developments, such as many-core processors, only complicate the situation further. In this paper, we describe how our NSF-funded project, "SI2-SSI: A Productive and Accessible Development Workbench for HPC Applications Using the Eclipse Parallel Tools Platform" (WHPC) seeks to improve the Eclipse Parallel Tools Platform, an environment designed to support scientific code development targeted at a diverse set of high performance computing systems. Our WHPC project to improve Eclipse PTP takes an application-centric view to improve PTP. We are using a set of scientific applications, each with a variety of challenges, and using PTP to drive further improvements to both the scientific application, as well as to understand shortcomings in Eclipse PTP from an application developer perspective, to drive our list of improvements we seek to make. We are also partnering with performance tool providers, to drive higher quality performance tool integration. We have partnered with the Cactus group at Louisiana State University to improve Eclipse's ability to work with computational frameworks and extremely complex build systems, as well as to develop educational materials to incorporate into

Data access performance through parallelization and vectored access. Some results

International Nuclear Information System (INIS)

Furano, F; Hanushevsky, A

2008-01-01

High Energy Physics data processing and analysis applications typically deal with the problem of accessing and processing data at high speed. Recent studies, development and test work have shown that the latencies due to data access can often be hidden by parallelizing them with the data processing, thus making it possible to have applications which process remote data with a high level of efficiency. Techniques and algorithms able to reach this result have been implemented in the client side of the Scalla/xrootd system, and in this contribution we describe the results of some tests done in order to compare their performance and characteristics. These techniques, if used together with multiple-stream data access, can also allow applications to deal efficiently and transparently with data repositories accessible via a Wide Area Network.
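The idea of hiding access latency behind processing can be sketched with a one-chunk read-ahead. This is only an illustration using Python's standard thread pool; it is not the Scalla/xrootd client API, and the names `process_with_prefetch`, `fetch` and `process` are hypothetical placeholders for a remote read and an analysis step.

```python
from concurrent.futures import ThreadPoolExecutor

def process_with_prefetch(fetch, process, chunk_ids):
    """Process chunks in order while the next chunk is fetched in the background."""
    ids = list(chunk_ids)
    results = []
    if not ids:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fetch, ids[0])      # start the first read
        for next_id in ids[1:]:
            data = pending.result()               # wait for the current chunk
            pending = pool.submit(fetch, next_id) # request the next one now
            results.append(process(data))         # overlaps with that request
        results.append(process(pending.result()))
    return results

# Stand-ins for a remote read and an analysis step.
totals = process_with_prefetch(lambda k: list(range(k)), sum, [1, 2, 3, 4])
```

When the fetch time and the processing time are comparable, this pattern hides most of the access latency; a deeper read-ahead queue would be the natural extension when latency dominates.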
Parallel Computing: Some Activities in High Energy Physics

Science.gov (United States)

Willers, Ian

This paper examines some activities in High Energy Physics that utilise parallel computing. The topic includes all computing from the proposed SIMD front end detectors, the farming applications, high-powered RISC processors and the large machines in the computer centers. We start by looking at the motivation behind using parallelism for general purpose computing. The developments around farming are then described from its simplest form to the more complex system in Fermilab. Finally, there is a list of some developments that are happening close to the experiments.
Parallelization of 2-D lattice Boltzmann codes

International Nuclear Information System (INIS)

Suzuki, Soichiro; Kaburaki, Hideo; Yokokawa, Mitsuo.

1996-03-01

Lattice Boltzmann (LB) codes to simulate two dimensional fluid flow are developed on the vector parallel computer Fujitsu VPP500 and the scalar parallel computer Intel Paragon XP/S. While a 2-D domain decomposition method is used for the scalar parallel LB code, a 1-D domain decomposition method is used for the vector parallel LB code so that it can be vectorized along the axis perpendicular to the direction of the decomposition. A high parallel efficiency of 95.1% is obtained by the vector parallel calculation on 16 processors with a 1152x1152 grid, and 88.6% by the scalar parallel calculation on 100 processors with an 800x800 grid. Performance models are developed to analyze the performance of the LB codes. It is shown by our performance models that the execution speed of the vector parallel code is about one hundred times faster than that of the scalar parallel code with the same number of processors up to 100 processors. We also analyze the scalability in keeping the available memory size of one processor element at maximum. Our performance model predicts that the execution time of the vector parallel code increases by about 3% on 500 processors. Although the 1-D domain decomposition method has in general a drawback in the interprocessor communication, the vector parallel LB code is still suitable for large scale and/or high resolution simulations. (author)
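The communication drawback of a 1-D decomposition mentioned in the abstract can be made concrete with a simple count of boundary cells. The sketch below is not the authors' performance model; it assumes a square n x n grid, a one-cell-wide halo, and an interior subdomain with neighbors on every side, and the function names are hypothetical.

```python
def halo_cells_1d(n):
    """Boundary cells an interior slab exchanges per step: two edges of length n."""
    return 2 * n

def halo_cells_2d(n, px, py):
    """Boundary cells an interior block exchanges per step: its four edges."""
    return 2 * (n // px + n // py)

def comm_to_compute_1d(n, p):
    """Exchanged cells per owned cell for p slabs of an n x n grid."""
    return halo_cells_1d(n) / (n * n / p)

def comm_to_compute_2d(n, px, py):
    """Exchanged cells per owned cell for a px x py block layout."""
    return halo_cells_2d(n, px, py) / (n * n / (px * py))

# Grid sizes and processor counts quoted in the abstract.
slab_ratio_800 = comm_to_compute_1d(800, 100)        # 0.25
block_ratio_800 = comm_to_compute_2d(800, 10, 10)    # 0.05
slab_ratio_1152 = comm_to_compute_1d(1152, 16)       # about 0.028
```

Under these assumptions, 100 slabs of an 800x800 grid exchange five times more data per owned cell than a 10 x 10 block layout, while 16 slabs of a 1152x1152 grid keep the ratio small, which is consistent with the 1-D choice remaining practical for the vector machine.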
Parallelization of 2-D lattice Boltzmann codes

Energy Technology Data Exchange (ETDEWEB)

Suzuki, Soichiro; Kaburaki, Hideo; Yokokawa, Mitsuo

1996-03-01

Lattice Boltzmann (LB) codes to simulate two dimensional fluid flow are developed on the vector parallel computer Fujitsu VPP500 and the scalar parallel computer Intel Paragon XP/S. While a 2-D domain decomposition method is used for the scalar parallel LB code, a 1-D domain decomposition method is used for the vector parallel LB code so that it can be vectorized along the axis perpendicular to the direction of the decomposition. A high parallel efficiency of 95.1% is obtained by the vector parallel calculation on 16 processors with a 1152x1152 grid, and 88.6% by the scalar parallel calculation on 100 processors with an 800x800 grid. Performance models are developed to analyze the performance of the LB codes. It is shown by our performance models that the execution speed of the vector parallel code is about one hundred times faster than that of the scalar parallel code with the same number of processors up to 100 processors. We also analyze the scalability in keeping the available memory size of one processor element at maximum. Our performance model predicts that the execution time of the vector parallel code increases by about 3% on 500 processors. Although the 1-D domain decomposition method has in general a drawback in the interprocessor communication, the vector parallel LB code is still suitable for large scale and/or high resolution simulations. (author).
Parallel microscope-based fluorescence, absorbance and time-of-flight mass spectrometry detection for high performance liquid chromatography and determination of glucosamine in urine.

Science.gov (United States)

Xiong, Bo; Wang, Ling-Ling; Li, Qiong; Nie, Yu-Ting; Cheng, Shuang-Shuang; Zhang, Hui; Sun, Ren-Qiang; Wang, Yu-Jiao; Zhou, Hong-Bin

2015-11-01

A parallel microscope-based laser-induced fluorescence (LIF), ultraviolet-visible absorbance (UV) and time-of-flight mass spectrometry (TOF-MS) detection for high performance liquid chromatography (HPLC) was achieved and used to determine glucosamine in urine. First, a reliable and convenient LIF detection was developed based on an inverted microscope and corresponding modulations. Parallel HPLC-LIF/UV/TOF-MS detection was developed by combining the preceding microscope-based LIF detection with HPLC coupled with UV and TOF-MS. The proposed setup, due to its parallel scheme, was free of the influence of photobleaching in LIF detection. Rhodamine B, glutamic acid and glucosamine were determined to evaluate its performance. Moreover, the proposed strategy was used to determine glucosamine in urine, and the results suggested that glucosamine, which is widely used in the prevention of bone arthritis, was metabolized into urine within 4 h. Furthermore, its concentration in urine decreased to 5.4 mM at 12 h. Efficient glucosamine detection was achieved based on a sensitive quantification (LIF), a universal detection (UV) and structural characterizations (TOF-MS). This application indicated that the proposed strategy was sensitive, universal and versatile, and that it was capable of improved analysis, especially for analytes with low concentrations in complex samples, compared with conventional HPLC-UV/TOF-MS. Copyright © 2015 Elsevier B.V. All rights reserved.
Open | SpeedShop: An Open Source Infrastructure for Parallel Performance Analysis

Directory of Open Access Journals (Sweden)

Martin Schulz

2008-01-01

Over the last decades a large number of performance tools have been developed to analyze and optimize high performance applications. Their acceptance by end users, however, has been slow: each tool alone is often limited in scope and comes with widely varying interfaces and workflow constraints, requiring different changes in the often complex build and execution infrastructure of the target application. We started the Open | SpeedShop project about 3 years ago to overcome these limitations and provide efficient, easy to apply, and integrated performance analysis for parallel systems. Open | SpeedShop has two different faces: it provides an interoperable tool set covering the most common analysis steps as well as a comprehensive plugin infrastructure for building new tools. In both cases, the tools can be deployed to large scale parallel applications using DPCL/Dyninst for distributed binary instrumentation. Further, all tools developed within or on top of Open | SpeedShop are accessible through multiple fully equivalent interfaces including an easy-to-use GUI as well as an interactive command line interface reducing the usage threshold for those tools.
Compression, splitting and switching of bright and dark solitons in nonlinear directional coupler

International Nuclear Information System (INIS)

Mandal, Basanti; Chowdhury, A. Roy

2006-01-01

A detailed numerical simulation of the switching, compression and splitting characteristics of various solitary pulses (bright, grey and dark) is carried out by a direct solution of the associated coupled NLS equation. Important physical parameters of the outgoing pulse, such as intensity distribution, root mean square spatial and temporal width, and chirp, are calculated. Both symmetric and asymmetric couplers are considered. The important phenomenon of periodic power transfer from one channel to the other unfolds. The compression varies with the type of pulse launched in the initial channel. It is observed that the chirping of the initial pulse has an optimum value, which varies quite noticeably with the character of the pulse and of the coupler (symmetric or asymmetric).
Air-side performance of a parallel-flow parallel-fin (PF²) heat exchanger in sequential frosting

Energy Technology Data Exchange (ETDEWEB)

Zhang, Ping [Zhejiang Vocational College of Commerce, Hangzhou, Binwen Road 470 (China); Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign, 1206 West Green Street, Urbana, IL 61801 (United States); Hrnjak, P.S. [Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign, 1206 West Green Street, Urbana, IL 61801 (United States)

2010-09-15

The thermal-hydraulic performance in periodic frosting conditions is experimentally studied for the parallel-flow parallel-fin heat exchanger, henceforth referred to as a PF² heat exchanger, a new style of heat exchanger that uses louvered bent fins on flat tubes to enhance water drainage when the flat tubes are horizontal. Typically, it takes a few frosting/defrosting cycles to come to repeatable conditions. The criterion for the initiation of defrost and a sufficiently long defrost period are determined for the test PF² heat exchanger and test condition. The effects of blower operation on the pressure drop, frost accumulation, water retention, and capacity in time are compared under the conditions of 15 sequential frosting cycles. Pressure drop across the heat exchanger and overall heat transfer coefficient are quantified under frost conditions as functions of the air humidity and air face velocity. The performances of two types of flat-tube heat exchangers, PF² heat exchanger and conventional parallel-flow serpentine-fin (PFSF) heat exchanger, are compared and the results obtained are presented. (author)
Broadband Silicon-On-Insulator directional couplers using a combination of straight and curved waveguide sections.

Science.gov (United States)

Chen, George F R; Ong, Jun Rong; Ang, Thomas Y L; Lim, Soon Thor; Png, Ching Eng; Tan, Dawn T H

2017-08-03

Broadband Silicon-On-Insulator (SOI) directional couplers are designed based on a combination of curved and straight coupled waveguide sections. A design methodology based on the transfer matrix method (TMM) is used to determine the required coupler section lengths, radii, and waveguide cross-sections. A 50/50 power splitter with a measured bandwidth of 88 nm is designed and fabricated, with a device footprint of 20 μm × 3 μm. In addition, a balanced Mach-Zehnder interferometer is fabricated showing an extinction ratio of >16 dB over 100 nm of bandwidth.
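To show what a transfer-matrix description of a coupler looks like, the sketch below cascades 2 x 2 matrices for lossless, symmetric straight sections from standard coupled-mode theory. It does not model the curved (phase-mismatched) sections that give the published device its bandwidth, and the coupling coefficient used is a hypothetical value chosen only to make the arithmetic easy to follow.

```python
import math

def coupler_matrix(kappa, length):
    """Transfer matrix of a lossless symmetric coupler section.

    kappa is the coupling coefficient (rad per unit length).
    """
    c = math.cos(kappa * length)
    s = math.sin(kappa * length)
    return ((complex(c), -1j * s), (-1j * s, complex(c)))

def cascade(first, second):
    """Matrix for light passing through `first` and then `second`."""
    (a, b), (c, d) = second
    (e, f), (g, h) = first
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def output_powers(matrix, amplitudes=(1.0, 0.0)):
    """Power in the through and cross ports for the given input amplitudes."""
    a0, a1 = amplitudes
    through = matrix[0][0] * a0 + matrix[0][1] * a1
    cross = matrix[1][0] * a0 + matrix[1][1] * a1
    return abs(through) ** 2, abs(cross) ** 2

KAPPA = 0.05                                # hypothetical, rad/um
half_length = math.pi / (4 * KAPPA)         # kappa * L = pi/4 gives 50/50
half = coupler_matrix(KAPPA, half_length)
split = output_powers(half)                 # close to (0.5, 0.5)
full = output_powers(cascade(half, half))   # close to (0.0, 1.0)
```

Because kappa depends on wavelength, a single straight section only splits 50/50 near one wavelength; combining sections with different dispersion, as the paper does with curved and straight sections, is one way to flatten that response.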
De Novo Ultrascale Atomistic Simulations On High-End Parallel Supercomputers

Energy Technology Data Exchange (ETDEWEB)

Nakano, A; Kalia, R K; Nomura, K; Sharma, A; Vashishta, P; Shimojo, F; van Duin, A; Goddard, III, W A; Biswas, R; Srivastava, D; Yang, L H

2006-09-04

We present a de novo hierarchical simulation framework for first-principles based predictive simulations of materials and their validation on high-end parallel supercomputers and geographically distributed clusters. In this framework, high-end chemically reactive and non-reactive molecular dynamics (MD) simulations explore a wide solution space to discover microscopic mechanisms that govern macroscopic material properties, into which highly accurate quantum mechanical (QM) simulations are embedded to validate the discovered mechanisms and quantify the uncertainty of the solution. The framework includes an embedded divide-and-conquer (EDC) algorithmic framework for the design of linear-scaling simulation algorithms with minimal bandwidth complexity and tight error control. The EDC framework also enables adaptive hierarchical simulation with automated model transitioning assisted by graph-based event tracking. A tunable hierarchical cellular decomposition parallelization framework then maps the O(N) EDC algorithms onto Petaflops computers, while achieving performance tunability through a hierarchy of parameterized cell data/computation structures, as well as its implementation using hybrid Grid remote procedure call + message passing + threads programming. High-end computing platforms such as IBM BlueGene/L, SGI Altix 3000 and the NSF TeraGrid provide excellent test grounds for the framework. On these platforms, we have achieved unprecedented scales of quantum-mechanically accurate and well validated, chemically reactive atomistic simulations--1.06 billion-atom fast reactive force-field MD and 11.8 million-atom (1.04 trillion grid points) quantum-mechanical MD in the framework of the EDC density functional theory on adaptive multigrids--in addition to 134 billion-atom non-reactive space-time multiresolution MD, with the parallel efficiency as high as 0.998 on 65,536 dual-processor BlueGene/L nodes. We have also achieved an automated execution of hierarchical QM
Toward an ultra-high resolution community climate system model for the BlueGene platform

International Nuclear Information System (INIS)

Dennis, John M; Jacob, Robert; Vertenstein, Mariana; Craig, Tony; Loy, Raymond

2007-01-01

Global climate models need to simulate several small, regional-scale processes which affect the global circulation in order to accurately simulate the climate. This is particularly important in the ocean, where small scale features such as oceanic eddies are currently represented with ad hoc parameterizations. There is also a need for higher resolution to provide climate predictions at small, regional scales. New high-performance computing platforms such as the IBM BlueGene can provide the necessary computational power to perform ultra-high resolution climate model integrations. We have begun to investigate the scaling of the individual components of the Community Climate System Model to prepare it for integrations on BlueGene and similar platforms. Our investigations show that it is possible to successfully utilize O(32K) processors. We describe the scalability of five models: the Parallel Ocean Program (POP), the Community Ice CodE (CICE), the Community Land Model (CLM), and the new CCSM sequential coupler (CPL7), which are components of the next generation Community Climate System Model (CCSM); as well as the High-Order Method Modeling Environment (HOMME), which is a dynamical core currently being evaluated within the Community Atmospheric Model. For our studies we concentrate on 1/10° resolution for the CICE, POP, and CLM models and 1/4° resolution for HOMME. The ability to simulate high resolutions on the massively parallel petascale systems that will dominate high-performance computing for the foreseeable future is essential to the advancement of climate science.
Evaluating the performance of the particle finite element method in parallel architectures

Science.gov (United States)

Gimenez, Juan M.; Nigro, Norberto M.; Idelsohn, Sergio R.

2014-05-01

This paper presents a high performance implementation of the particle-mesh based method called particle finite element method two (PFEM-2). It consists of a material derivative based formulation of the equations with a hybrid spatial discretization which uses an Eulerian mesh and Lagrangian particles. The main aim of PFEM-2 is to solve transport equations as fast as possible while keeping some level of accuracy. The method was found to be competitive with classical Eulerian alternatives for these targets, even in their range of optimal application. To evaluate the method on large simulations, it is imperative to use parallel environments. Parallel strategies for the Finite Element Method have been widely studied and many libraries can be used to solve the Eulerian stages of PFEM-2. However, Lagrangian stages, such as streamline integration, must be developed considering the parallel strategy selected. The main drawback of PFEM-2 is the large amount of memory needed, which limits its application to large problems on a single computer. Therefore, a distributed-memory implementation is urgently needed. Unlike in a shared-memory approach, with domain decomposition the memory is automatically isolated, thus avoiding race conditions; however, new issues appear due to data distribution over the processes. Thus, a domain decomposition strategy for both particles and mesh is adopted, which minimizes the communication between processes. Finally, performance analyses on multicore and multinode architectures are presented. The Courant-Friedrichs-Lewy number used influences the efficiency of the parallelization and, in some cases, a weighted partitioning can be used to improve the speed-up. However, the total CPU time for the cases presented is lower than that obtained when using classical Eulerian strategies.
In-cylinder diesel spray combustion simulations using parallel computation: A performance benchmarking study

International Nuclear Information System (INIS)

Pang, Kar Mun; Ng, Hoon Kiat; Gan, Suyin

2012-01-01

Highlights: ► A performance benchmarking exercise is conducted for diesel combustion simulations. ► The reduced chemical mechanism shows its advantages over base and skeletal models. ► High efficiency and great reduction of CPU runtime are achieved through 4-node solver. ► Increasing ISAT memory from 0.1 to 2 GB reduces the CPU runtime by almost 35%. ► Combustion and soot processes are predicted well with minimal computational cost. - Abstract: In the present study, in-cylinder diesel combustion simulation was performed with parallel processing on an Intel Xeon Quad-Core platform to allow both fluid dynamics and chemical kinetics of the surrogate diesel fuel model to be solved simultaneously on multiple processors. Here, Cartesian Z-Coordinate was selected as the most appropriate partitioning algorithm since it computationally bisects the domain such that the dynamic load associated with fuel particle tracking was evenly distributed during parallel computations. Other variables examined included number of compute nodes, chemistry sizes and in situ adaptive tabulation (ISAT) parameters. Based on the performance benchmarking test conducted, parallel configuration of 4-compute node was found to reduce the computational runtime most efficiently whereby a parallel efficiency of up to 75.4% was achieved. The simulation results also indicated that accuracy level was insensitive to the number of partitions or the partitioning algorithms. The effect of reducing the number of species on computational runtime was observed to be more significant than reducing the number of reactions. Besides, the study showed that an increase in the ISAT maximum storage of up to 2 GB reduced the computational runtime by 50%. Also, the ISAT error tolerance of 10⁻³ was chosen to strike a balance between results accuracy and computational runtime. The optimised parameters in parallel processing and ISAT, as well as the use of the in-house reduced chemistry model allowed accurate
Multi-petascale highly efficient parallel supercomputer

Science.gov (United States)

Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.; Blumrich, Matthias A.; Boyle, Peter; Brunheroto, Jose R.; Chen, Dong; Cher, Chen-Yong; Chiu, George L.; Christ, Norman; Coteus, Paul W.; Davis, Kristan D.; Dozsa, Gabor J.; Eichenberger, Alexandre E.; Eisley, Noel A.; Ellavsky, Matthew R.; Evans, Kahn C.; Fleischer, Bruce M.; Fox, Thomas W.; Gara, Alan; Giampapa, Mark E.; Gooding, Thomas M.; Gschwind, Michael K.; Gunnels, John A.; Hall, Shawn A.; Haring, Rudolf A.; Heidelberger, Philip; Inglett, Todd A.; Knudson, Brant L.; Kopcsay, Gerard V.; Kumar, Sameer; Mamidala, Amith R.; Marcella, James A.; Megerian, Mark G.; Miller, Douglas R.; Miller, Samuel J.; Muff, Adam J.; Mundy, Michael B.; O'Brien, John K.; O'Brien, Kathryn M.; Ohmacht, Martin; Parker, Jeffrey J.; Poole, Ruth J.; Ratterman, Joseph D.; Salapura, Valentina; Satterfield, David L.; Senger, Robert M.; Steinmacher-Burow, Burkhard; Stockdell, William M.; Stunkel, Craig B.; Sugavanam, Krishnan; Sugawara, Yutaka; Takken, Todd E.; Trager, Barry M.; Van Oosten, James L.; Wait, Charles D.; Walkup, Robert E.; Watson, Alfred T.; Wisniewski, Robert W.; Wu, Peng

2018-05-15

A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five-dimensional torus network that maximizes the throughput of packet communications between nodes and minimizes latency. The network implements a collective network and a global asynchronous network that provides global barrier and notification functions. The node design also integrates a list-based prefetcher. The memory system implements transactional memory, thread-level speculation, and a multiversioning cache that improves the soft error rate, and it supports DMA functionality allowing for parallel-processing message passing.
A task-based parallelism and vectorized approach to 3D Method of Characteristics (MOC) reactor simulation for high performance computing architectures

Science.gov (United States)

Tramm, John R.; Gunow, Geoffrey; He, Tim; Smith, Kord S.; Forget, Benoit; Siegel, Andrew R.

2016-05-01

In this study we present and analyze a formulation of the 3D Method of Characteristics (MOC) technique applied to the simulation of full core nuclear reactors. Key features of the algorithm include a task-based parallelism model that allows independent MOC tracks to be assigned to threads dynamically, ensuring load balancing, and a wide vectorizable inner loop that takes advantage of modern SIMD computer architectures. The algorithm is implemented in a set of highly optimized proxy applications in order to investigate its performance characteristics on CPU, GPU, and Intel Xeon Phi architectures. Speed, power, and hardware cost efficiencies are compared. Additionally, performance bottlenecks are identified for each architecture in order to determine the prospects for continued scalability of the algorithm on next generation HPC architectures.
Highly parallel line-based image coding for many cores.

Science.gov (United States)

Peng, Xiulian; Xu, Jizheng; Zhou, You; Wu, Feng

2012-01-01

Computers are following a new trend, moving from dual-core and quad-core processors to ones with tens or even hundreds of cores. Multimedia, one of the most important computer applications, urgently needs parallel coding algorithms for compression. Taking intraframe/image coding as a starting point, this paper proposes a pure line-by-line coding scheme (LBLC) to meet the need. In LBLC, an input image is processed line by line sequentially, and each line is divided into small fixed-length segments. The compression of all segments, from prediction to entropy coding, is completely independent and runs concurrently across many cores. Results on a general-purpose computer show that our scheme achieves a 13.9 times speedup with 15 cores at the encoder and a 10.3 times speedup at the decoder. Ideally, this near-linear relation between speedup and the number of cores can be maintained for more than 100 cores. In addition to the high parallelism, the proposed scheme can perform comparably to or even better than the H.264 high profile above middle bit rates. At near-lossless coding, it outperforms H.264 by more than 10 dB. At lossless coding, up to 14% bit-rate reduction is observed compared with H.264 lossless coding at the high 4:4:4 profile.
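The speedups quoted in this abstract can be read as parallel efficiencies, that is, speedup divided by core count. The following is only a minimal sketch of that arithmetic: the function name is illustrative, and the use of 15 cores for the decoder figure is an assumption, since the abstract states the core count only for the encoder.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Return speedup per core; 1.0 would be ideal linear scaling."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


# Figures taken from the abstract: 13.9x (encoder) and 10.3x (decoder).
encoder_eff = parallel_efficiency(13.9, 15)
# Assumption: the decoder speedup was also measured with 15 cores.
decoder_eff = parallel_efficiency(10.3, 15)

print(f"encoder: {encoder_eff:.1%}, decoder: {decoder_eff:.1%}")
```

Under that assumption the encoder runs at roughly 93% efficiency and the decoder at roughly 69%.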
Highly parallel machines and future of scientific computing

International Nuclear Information System (INIS)

Singh, G.S.

1992-01-01

The computing requirements of large-scale scientific computing have always been ahead of what state-of-the-art hardware could supply in the form of the supercomputers of the day. For any single-processor system, the limit to increases in computing power was recognized a few years ago. Now, with the advent of parallel computing systems, the availability of machines with the required computing power seems a reality. In this paper the author tries to visualize the future of large-scale scientific computing in the penultimate decade of the present century. The author summarizes trends in parallel computers and emphasizes the need for a better programming environment and software tools for optimal performance. The author concludes the paper with a critique of parallel architectures, software tools and algorithms. (author). 10 refs., 2 tabs
Towards a streaming model for nested data parallelism

DEFF Research Database (Denmark)

Madsen, Frederik Meisner; Filinski, Andrzej

2013-01-01

The language-integrated cost semantics for nested data parallelism pioneered by NESL provides an intuitive, high-level model for predicting performance and scalability of parallel algorithms with reasonable accuracy. However, this predictability, obtained through a uniform, parallelism-flattening......-processable in a streaming fashion. This semantics is directly compatible with previously proposed piecewise execution models for nested data parallelism, but allows the expected space usage to be reasoned about directly at the source-language level. The language definition and implementation are still very much work...
Arbitrary-ratio power splitter based on nonlinear multimode interference coupler

International Nuclear Information System (INIS)

Tajaldini, Mehdi; Jafri, Mohd Zubir Mat

2015-01-01

We propose an ultra-compact multimode interference (MMI) power splitter based on nonlinear effects, from simulations using nonlinear modal propagation analysis (NMPA) in cooperation with the finite difference method (FDM), to access a free choice of splitting ratio. Conventional multimode interference power splitters can only obtain a few discrete ratios. The power splitting ratio may be adjusted continuously while the input set power is varied by a tunable laser. In fact, using an ultra-compact MMI with a simple structure that is launched by a tunable nonlinear input solves the problem of arbitrary ratio in integrated photonics circuits. Silicon on insulator (SOI) is used as the offered material due to its high refractive index contrast and centrosymmetric properties. The high-resolution images at the end of the multimode waveguide in the simulated power splitter have a high power balance, whereas access to a free choice of splitting ratio is not possible under the linear regime in the proposed length range except by changing the dimensions for each ratio. The compact dimensions and ideal performance of the device are established according to optimized parameters. The proposed regime can be extended to the design of M×N arbitrary-ratio power splitters for programmable logic devices in all-optical digital signal processing. The results of this study indicate that nonlinear modal propagation analysis solves the miniaturization problem for all-optical devices based on MMI couplers to achieve multiple functions in a compact planar integrated circuit and also overcomes the limitations of previously proposed methods for nonlinear MMI
Arbitrary-ratio power splitter based on nonlinear multimode interference coupler

Energy Technology Data Exchange (ETDEWEB)

Tajaldini, Mehdi [School of Physics, Universiti Sains Malaysia, 11800 Pulau Pinang (Malaysia); Young Researchers and Elite Club, Baft Branch, Islamic Azad University, Baft (Iran, Islamic Republic of); Jafri, Mohd Zubir Mat [School of Physics, Universiti Sains Malaysia, 11800 Pulau Pinang (Malaysia)

2015-04-24

We propose an ultra-compact multimode interference (MMI) power splitter based on nonlinear effects, from simulations using nonlinear modal propagation analysis (NMPA) in cooperation with the finite difference method (FDM), to access a free choice of splitting ratio. Conventional multimode interference power splitters can only obtain a few discrete ratios. The power splitting ratio may be adjusted continuously while the input set power is varied by a tunable laser. In fact, using an ultra-compact MMI with a simple structure that is launched by a tunable nonlinear input solves the problem of arbitrary ratio in integrated photonics circuits. Silicon on insulator (SOI) is used as the offered material due to its high refractive index contrast and centrosymmetric properties. The high-resolution images at the end of the multimode waveguide in the simulated power splitter have a high power balance, whereas access to a free choice of splitting ratio is not possible under the linear regime in the proposed length range except by changing the dimensions for each ratio. The compact dimensions and ideal performance of the device are established according to optimized parameters. The proposed regime can be extended to the design of M×N arbitrary-ratio power splitters for programmable logic devices in all-optical digital signal processing. The results of this study indicate that nonlinear modal propagation analysis solves the miniaturization problem for all-optical devices based on MMI couplers to achieve multiple functions in a compact planar integrated circuit and also overcomes the limitations of previously proposed methods for nonlinear MMI.

Broad-band anti-reflection coupler for a-Si thin-film solar cell

International Nuclear Information System (INIS)

Lo, S.-S.; Chen, C.-C.; Garwe, Frank; Pertch, Thomas

2007-01-01

This work numerically demonstrates a new anti-reflection coupler (ARC) with high coupling efficiency in a Si substrate solar cell. The ARC in which the grating is integrated on a glass encapsulation and a three-layer impedance match layer is proposed. A coupling efficiency of 90% is obtained at wavelengths between 350 and 1200 nm in the TE and TM modes when the incident angle is less than 30°. In comparison with a 1 μm absorber layer, the integrated absorption of an a-Si thin-film solar cell without a new ARC is doubled, at long wavelengths (750 nm ≤ λ ≤ 1200 nm), as calculated by FDTD method
Parallel Microcracks-based Ultrasensitive and Highly Stretchable Strain Sensors.

Science.gov (United States)

Amjadi, Morteza; Turan, Mehmet; Clementson, Cameron P; Sitti, Metin

2016-03-02

There is an increasing demand for flexible, skin-attachable, and wearable strain sensors due to their various potential applications. However, achieving strain sensors with both high sensitivity and high stretchability is still a grand challenge. Here, we propose highly sensitive and stretchable strain sensors based on the reversible microcrack formation in composite thin films. Controllable parallel microcracks are generated in graphite thin films coated on elastomer films. Sensors made of graphite thin films with short microcracks possess high gauge factors (maximum value of 522.6) and stretchability (ε ≥ 50%), whereas sensors with long microcracks show ultrahigh sensitivity (maximum value of 11,344) with limited stretchability (ε ≤ 50%). We demonstrate the high performance strain sensing of our sensors in both small and large strain sensing applications such as human physiological activity recognition, human body large motion capturing, vibration detection, pressure sensing, and soft robotics.
Development Of A Parallel Performance Model For The THOR Neutral Particle Transport Code

Energy Technology Data Exchange (ETDEWEB)

Yessayan, Raffi; Azmy, Yousry; Schunert, Sebastian

2017-02-01

The THOR neutral particle transport code enables simulation of complex geometries for various problems from reactor simulations to nuclear non-proliferation. It is undergoing a thorough V&V requiring computational efficiency. This has motivated various improvements including angular parallelization, outer iteration acceleration, and development of peripheral tools. For guiding future improvements to the code’s efficiency, better characterization of its parallel performance is useful. A parallel performance model (PPM) can be used to evaluate the benefits of modifications and to identify performance bottlenecks. Using INL’s Falcon HPC, the PPM development incorporates an evaluation of network communication behavior over heterogeneous links and a functional characterization of the per-cell/angle/group runtime of each major code component. After evaluating several possible sources of variability, this resulted in a communication model and a parallel portion model. The former’s accuracy is bounded by the variability of communication on Falcon while the latter has an error on the order of 1%.
Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

Science.gov (United States)

Sun, Xian-He

1997-01-01

Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2000, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is to 1) develop highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, and 3) incorporate the newly developed algorithms into actual simulation packages. The work plan has been well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) adopting a mathematical geometry which has a better capacity to describe the fluid, and (2) using a compact scheme to gain high-order accuracy in numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm
Baking Arithmetic and Error Analyses for PEFP Fundamental Power Couplers

International Nuclear Information System (INIS)

Zhang, Liping; An, Sun; Tang, Yazhe; Cho, Yong Sub

2009-01-01

The Proton Engineering Frontier Project (PEFP) is considering developing and using SRF technology to accelerate a proton beam at 700 MHz in its present project and its extended project (PEP). The first section of the PEFP SRF linac (SCL) is composed of low-beta cryomodules. Each cryomodule has three 5-cell cavities and each cavity has one fundamental power coupler (FPC). Before the high-power RF processing, each FPC needs to be baked out for 24 hours at 200 °C. The whole control system is described in reference; in this system, the temperature in the baking box needs to be changed according to three straight lines with different slopes. This paper describes how the temperature of the baking box can be made to follow the required values
Baking Arithmetic and Error Analyses for PEFP Fundamental Power Couplers

Energy Technology Data Exchange (ETDEWEB)

Zhang, Liping; An, Sun; Tang, Yazhe; Cho, Yong Sub [Korea Atomic Energy Research Institute, Daejeon (Korea, Republic of)

2009-05-15

The Proton Engineering Frontier Project (PEFP) is considering developing and using SRF technology to accelerate a proton beam at 700 MHz in its present project and its extended project (PEP). The first section of the PEFP SRF linac (SCL) is composed of low-beta cryomodules. Each cryomodule has three 5-cell cavities and each cavity has one fundamental power coupler (FPC). Before the high-power RF processing, each FPC needs to be baked out for 24 hours at 200 °C. The whole control system is described in reference; in this system, the temperature in the baking box needs to be changed according to three straight lines with different slopes. This paper describes how the temperature of the baking box can be made to follow the required values.
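A temperature setpoint that follows three straight lines can be sketched as a piecewise-linear function of time. The example below is only an illustration under stated assumptions: the ramp-up, hold, ramp-down shape, the ramp durations, and the 25 °C start and end temperature are hypothetical. Only the 24 hour hold at 200 °C comes from the abstract, and the name `setpoint` is not from the paper.

```python
from bisect import bisect_right

# Hypothetical breakpoints as (hours, degrees Celsius).
# Only the 24 h hold at 200 degC is taken from the abstract; the ramp
# durations and the 25 degC ambient value are made up for illustration.
PROFILE = [(0.0, 25.0), (8.0, 200.0), (32.0, 200.0), (44.0, 25.0)]


def setpoint(t_hours: float) -> float:
    """Linearly interpolate the target temperature at time t_hours."""
    times = [t for t, _ in PROFILE]
    if t_hours <= times[0]:
        return PROFILE[0][1]
    if t_hours >= times[-1]:
        return PROFILE[-1][1]
    # Index of the segment whose start time is <= t_hours.
    i = bisect_right(times, t_hours) - 1
    (t0, y0), (t1, y1) = PROFILE[i], PROFILE[i + 1]
    return y0 + (y1 - y0) * (t_hours - t0) / (t1 - t0)


for t in (0, 4, 8, 20, 38, 44):
    print(t, setpoint(t))
```

A controller would compare the measured baking-box temperature against `setpoint(t)` at each step; the error analysis mentioned in the title is not modeled here.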
Topology optimization of grating couplers for the efficient excitation of surface plasmons

DEFF Research Database (Denmark)

Andkjær, Jacob Anders; Sigmund, Ole; Nishiwaki, Shinji

2010-01-01

We propose a methodology for a systematic design of grating couplers for efficient excitation of surface plasmons at metal-dielectric interfaces. The methodology is based on a two-dimensional topology optimization formulation based on the H-polarized scalar Helmholtz equation and finite-element m...
Structural Directed Growth of Ultrathin Parallel Birnessite on β-MnO2 for High-Performance Asymmetric Supercapacitors.

Science.gov (United States)

Zhu, Shijin; Li, Li; Liu, Jiabin; Wang, Hongtao; Wang, Tian; Zhang, Yuxin; Zhang, Lili; Ruoff, Rodney S; Dong, Fan

2018-02-27

Two-dimensional birnessite has attracted attention for electrochemical energy storage because of the presence of redox active Mn⁴⁺/Mn³⁺ ions and spacious interlayer channels available for ions diffusion. However, current strategies are largely limited to enhancing the electrical conductivity of birnessite. One key limitation affecting the electrochemical properties of birnessite is the poor utilization of the MnO6 unit. Here, we assemble β-MnO2/birnessite core-shell structure that exploits the exposed crystal face of β-MnO2 as the core and ultrathin birnessite sheets that have the structure advantage to enhance the utilization efficiency of the Mn from the bulk. Our birnessite that has sheets parallel to each other is found to have unusual crystal structure with interlayer spacing, Mn(III)/Mn(IV) ratio and the content of the balancing cations differing from that of the common birnessite. The substrate directed growth mechanism is carefully investigated. The as-prepared core-shell nanostructures enhance the exposed surface area of birnessite and achieve high electrochemical performances (for example, 657 F g⁻¹ in 1 M Na2SO4 electrolyte based on the weight of parallel birnessite) and excellent rate capability over a potential window of up to 1.2 V. This strategy opens avenues for fundamental studies of birnessite and its properties and suggests the possibility of its use in energy storage and other applications. The potential window of an asymmetric supercapacitor that was assembled with this material can be enlarged to 2.2 V (in aqueous electrolyte) with a good cycling ability.
Design and length optimization of an adiabatic coupler for on-chip vertical integration of rare-earth-doped double tungstate waveguide amplifiers

NARCIS (Netherlands)

Mu, Jinfeng; Sefünç, Mustafa; García Blanco, Sonia Maria

2014-01-01

The integration of rare-earth doped double tungstate waveguide amplifiers onto passive technology platforms enables the on-chip amplification of very high bit rate signals. In this work, a methodology for the optimized design of vertical adiabatic couplers between a passive Si3N4 waveguide and the
Design and fabrication of three-dimensional polymer mode multiplexer based on asymmetric waveguide couplers

Science.gov (United States)

He, Guobing; Gao, Yang; Xu, Yan; Ji, Lanting; Sun, Xiaoqiang; Wang, Xibin; Yi, Yunji; Chen, Changming; Wang, Fei; Zhang, Daming; Wu, Yuanda

2018-05-01

A polymer mode multiplexer based on asymmetric couplers is theoretically designed and experimentally demonstrated. The proposed X-junction coupler is formed by waveguides overlapped with different crossing angles in the vertical direction. A beam propagation method is adopted to optimize the dimensional parameters of the mode multiplexer to convert the LP01 mode of the two lower waveguides to the LP11a and LP21a modes of the upper waveguide. Ultraviolet lithography and wet chemical etching are used in the fabrication process. A conversion ratio of over 98% for both the LP11a and LP21a modes in the wavelength range from 1530 to 1570 nm is experimentally demonstrated. This mode multiplexer has potential in broadband mode-division multiplexing transmission systems.
Teaching RLC Parallel Circuits in High-School Physics Class

Science.gov (United States)

Simon, Alpár

2015-01-01

This paper will try to give an alternative treatment of the subject "parallel RLC circuits" and "resonance in parallel RLC circuits" from the Physics curricula for the XIth grade from Romanian high-schools, with an emphasis on practical type circuits and their possible applications, and intends to be an aid for both Physics…
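As a point of reference for the resonance topic named in this abstract, the textbook ideal parallel RLC circuit has admittance Y = 1/R + j(ωC − 1/(ωL)), which is smallest in magnitude at f₀ = 1/(2π√(LC)). The sketch below assumes that ideal model (lossless inductor and capacitor); the component values and function names are hypothetical and not taken from the paper, which emphasizes practical circuits.

```python
import math


def resonant_frequency_hz(inductance_h: float, capacitance_f: float) -> float:
    """Resonant frequency of an ideal parallel RLC circuit."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def admittance_magnitude(r_ohm: float, inductance_h: float,
                         capacitance_f: float, freq_hz: float) -> float:
    """|Y| of ideal R, L and C in parallel at the given frequency."""
    omega = 2.0 * math.pi * freq_hz
    conductance = 1.0 / r_ohm
    susceptance = omega * capacitance_f - 1.0 / (omega * inductance_h)
    return math.hypot(conductance, susceptance)


# Hypothetical component values: 1 kOhm, 10 mH, 100 nF.
R, L, C = 1_000.0, 10e-3, 100e-9
f0 = resonant_frequency_hz(L, C)
print(f"f0 = {f0:.1f} Hz")
print(f"|Y(f0)| = {admittance_magnitude(R, L, C, f0):.6f} S")
print(f"|Y(2*f0)| = {admittance_magnitude(R, L, C, 2 * f0):.6f} S")
```

At resonance the inductive and capacitive currents cancel, so the source sees only the resistor and the admittance magnitude falls to 1/R.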
The FORCE: A highly portable parallel programming language

Science.gov (United States)

Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

1989-01-01

Here, it is explained why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low-level machine dependencies and to build machine-independent high-level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared-memory multiprocessor executing them.
The FORCE - A highly portable parallel programming language

Science.gov (United States)

Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

1989-01-01

This paper explains why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low-level machine dependencies and to build machine-independent high-level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared-memory multiprocessor executing them.
Parallel processing for fluid dynamics applications

International Nuclear Information System (INIS)

Johnson, G.M.

1989-01-01

The impact of parallel processing on computational science and, in particular, on computational fluid dynamics is growing rapidly. In this paper, particular emphasis is given to developments which have occurred within the past two years. Parallel processing is defined and the reasons for its importance in high-performance computing are reviewed. Parallel computer architectures are classified according to the number and power of their processing units, their memory, and the nature of their connection scheme. Architectures which show promise for fluid dynamics applications are emphasized. Fluid dynamics problems are examined for parallelism inherent at the physical level. CFD algorithms and their mappings onto parallel architectures are discussed. Several examples are presented to document the performance of fluid dynamics applications on present-generation parallel processing devices
Achieving high performance in numerical computations on RISC workstations and parallel systems

Energy Technology Data Exchange (ETDEWEB)

Goedecker, S. [Max-Planck Inst. for Solid State Research, Stuttgart (Germany); Hoisie, A. [Los Alamos National Lab., NM (United States)

1997-08-20

The nominal peak speeds of both serial and parallel computers are rising rapidly. At the same time, however, it is becoming increasingly difficult to extract a significant fraction of this high peak speed from modern computer architectures. In this tutorial the authors give the scientists and engineers involved in numerically demanding calculations and simulations the necessary basic knowledge to write reasonably efficient programs. The basic principles are rather simple and the possible rewards large. Writing a program by taking into account optimization techniques related to the computer architecture can significantly speed up your program, often by factors of 10 to 100. As such, optimizing a program can for instance be a much better solution than buying a faster computer. If a few basic optimization principles are applied during program development, the additional time needed for obtaining an efficient program is practically negligible. In-depth optimization is usually only needed for a few subroutines or kernels and the effort involved is therefore also acceptable.
Miniature mechanical transfer optical coupler

Science.gov (United States)

Abel, Philip [Overland Park, KS; Watterson, Carl [Kansas City, MO

2011-02-15

A miniature mechanical transfer (MT) optical coupler ("MMTOC") for optically connecting a first plurality of optical fibers with at least one other plurality of optical fibers. The MMTOC may comprise a beam splitting element, a plurality of collimating lenses, and a plurality of alignment elements. The MMTOC may optically couple a first plurality of fibers disposed in a plurality of ferrules of a first MT connector with a second plurality of fibers disposed in a plurality of ferrules of a second MT connector and a third plurality of fibers disposed in a plurality of ferrules of a third MT connector. The beam splitting element may allow a portion of each beam of light from the first plurality of fibers to pass through to the second plurality of fibers and simultaneously reflect another portion of each beam of light from the first plurality of fibers to the third plurality of fibers.
On the photonic implementation of universal quantum gates, bell states preparation circuit and quantum LDPC encoders and decoders based on directional couplers and HNLF.

Science.gov (United States)

Djordjevic, Ivan B

2010-04-12

The Bell states preparation circuit is a basic circuit required in quantum teleportation. We describe how to implement it in all-fiber technology. The basic building blocks for its implementation are directional couplers and highly nonlinear optical fiber (HNLF). Because the quantum information processing is based on delicate superposition states, it is sensitive to quantum errors. In order to enable fault-tolerant quantum computing the use of quantum error correction is unavoidable. We show how to implement in all-fiber technology encoders and decoders for sparse-graph quantum codes, and provide an illustrative example to demonstrate this implementation. We also show that arbitrary set of universal quantum gates can be implemented based on directional couplers and HNLFs.
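In the abstract gate model, the standard Bell state preparation circuit is a Hadamard gate on the first qubit followed by a CNOT. The sketch below simulates only that gate-level circuit with plain Python lists; it does not model the directional couplers or HNLF used for the all-fiber implementation, and the helper name `matvec` is illustrative.

```python
import math


def matvec(matrix, vector):
    """Multiply a square matrix (list of rows) by a column vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


s = 1.0 / math.sqrt(2.0)

# Basis order: |00>, |01>, |10>, |11> (first qubit is the left bit).
# Hadamard on the first qubit, identity on the second (H tensor I).
H_I = [
    [s, 0, s, 0],
    [0, s, 0, s],
    [s, 0, -s, 0],
    [0, s, 0, -s],
]

# CNOT with the first qubit as control and the second as target.
CNOT = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]

state = [1.0, 0.0, 0.0, 0.0]  # start in |00>
bell = matvec(CNOT, matvec(H_I, state))
print(bell)  # amplitudes of |00>, |01>, |10>, |11>
```

Starting from |00⟩, the result has equal amplitudes 1/√2 on |00⟩ and |11⟩ and zero elsewhere, which is the Bell state (|00⟩ + |11⟩)/√2.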
Bi-directional triplexer with butterfly MMI coupler using SU-8 polymer waveguides

Science.gov (United States)

Mareš, David; Jeřábek, Vítězslav; Prajzler, Václav

2015-01-01

We report the design of a bi-directional planar optical multiplex/demultiplex filter (triplexer) for the optical part of a planar hybrid WDM bi-directional transceiver in fiber-to-the-home (FTTH) PON applications. The triplex lightwave circuit is based on Epoxy Novolak Resin SU-8 waveguides on a silica-on-silicon substrate with a Polymethylmethacrylate cladding layer. The triplexer comprises a linear butterfly concept of multimode interference (MMI) coupler separating downstream optical signals of 1490 nm and 1550 nm. For the upstream channel of 1310 nm, an additional directional coupler (DC) is used to add the optical signal of 1310 nm propagating in the opposite direction. The optical triplexer was designed and optimized using the beam propagation method. The insertion losses, crosstalk attenuation, and extinction ratio for all three inputs/outputs were investigated. The intended triplexer was designed using the parameters of the separated DC and MMI filter to approximate the idealized direct connection of both devices.
A highly efficient parallel algorithm for solving the neutron diffusion nodal equations on shared-memory computers

International Nuclear Information System (INIS)

Azmy, Y.Y.; Kirk, B.L.

1990-01-01

Modern parallel computer architectures offer an enormous potential for reducing CPU and wall-clock execution times of large-scale computations commonly performed in various applications in science and engineering. Recently, several authors have reported their efforts in developing and implementing parallel algorithms for solving the neutron diffusion equation on a variety of shared- and distributed-memory parallel computers. Testing of these algorithms for a variety of two- and three-dimensional meshes showed significant speedup of the computation. Even for very large problems (i.e., three-dimensional fine meshes) executed concurrently on a few nodes in serial (nonvector) mode, however, the measured computational efficiency is very low (40 to 86%). In this paper, the authors present a highly efficient (∼85 to 99.9%) algorithm for solving the two-dimensional nodal diffusion equations on the Sequent Balance 8000 parallel computer. Also presented is a model for the performance, represented by the efficiency, as a function of problem size and the number of participating processors. The model is validated through several tests and then extrapolated to larger problems and more processors to predict the performance of the algorithm in more computationally demanding situations
Design of a hybrid silicon-plasmonic co-propagating coupler operating close to coherent perfect absorption

Energy Technology Data Exchange (ETDEWEB)

Zanotto, Simone; Melloni, Andrea [Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133 Milano (Italy)

2016-04-28

By hybrid integration of plasmonic and dielectric waveguide concepts, it is shown that nearly perfect coherent absorption can be achieved in a co-propagating coupler geometry. First, the operating principle of the proposed device is detailed in the context of a more general 2 × 2 lossy coupler formalism. Then, it is shown how to tune the device in a wide region of possible working points, its broadband operation, and the tolerance to fabrication uncertainties. Finally, a complete picture of the electromagnetic modes inside the hybrid structure is analyzed, shining light onto the potentials which the proposed device holds in view of classical and quantum signal processing, nonlinear optics, polarization control, and sensing.

Ultra-Broadband Silicon-Wire Polarization Beam Combiner/Splitter Based on a Wavelength Insensitive Coupler With a Point-Symmetrical Configuration

OpenAIRE

Uematsu, Takui; Kitayama, Tetsuya; Ishizaka, Yuhei; Saitoh, Kunimasa

2014-01-01

An ultrabroadband silicon wire polarization beam combiner/splitter (PBCS) based on a wavelength-insensitive coupler is proposed. The proposed PBCS consists of three identical directional couplers and two identical delay lines. We design the PBCS using the 3-D finite element method. Numerical simulations show that the proposed PBCS can achieve the transmittance of more than 90% over a wide wavelength range from 1450 to 1650 nm for both TE and TM polarized modes.
On-chip grating coupler array on the SOI platform for fan-in/fan-out of MCFs with low insertion loss and crosstalk

DEFF Research Database (Denmark)

Ding, Yunhong; Ye, Feihong; Peucheret, Christophe

2015-01-01

We report the design and fabrication of a compact multi-core fiber fan-in/fan-out using a grating coupler array on the SOI platform. The grating couplers are fully-etched, enabling the whole circuit to be fabricated in a single lithography and etching step. Thanks to the apodized design...... for the grating couplers and the introduction of an aluminum reflective mirror, a highest coupling efficiency of -3.8 dB with 3 dB coupling bandwidth of 48 nm and 1.5 dB bandwidth covering the whole C band, together with crosstalk lower than -32 dB are demonstrated. (C)2015 Optical Society of America...
Adapting high-level language programs for parallel processing using data flow

Science.gov (United States)

Standley, Hilda M.

1988-01-01

EASY-FLOW, a very high-level data flow language, is introduced for the purpose of adapting programs written in a conventional high-level language to a parallel environment. The level of parallelism provided is of the large-grained variety in which parallel activities take place between subprograms or processes. A program written in EASY-FLOW is a set of subprogram calls as units, structured by iteration, branching, and distribution constructs. A data flow graph may be deduced from an EASY-FLOW program.
High performance shallow water kernels for parallel overland flow simulations based on FullSWOF2D

KAUST Repository

Wittmann, Roland

2017-01-25

We describe code optimization and parallelization procedures applied to the sequential overland flow solver FullSWOF2D. Major difficulties when simulating overland flows comprise dealing with high resolution datasets of large scale areas which cannot be computed on a single node, either due to a limited amount of memory or due to too many (time step) iterations resulting from the CFL condition. We address these issues in terms of two major contributions. First, we demonstrate a generic step-by-step transformation of the second order finite volume scheme in FullSWOF2D towards MPI parallelization. Second, the computational kernels are optimized by the use of templates and a portable vectorization approach. We discuss the load imbalance of the flux computation due to dry and wet cells and propose a solution using an efficient cell counting approach. Finally, scalability results are shown for different test scenarios along with a flood simulation benchmark using the Shaheen II supercomputer.
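The CFL restriction mentioned above ties the time step to the fastest wave speed on the mesh, which is why fine grids lead to many iterations. A minimal one-dimensional sketch follows, assuming the usual shallow-water wave speed |u| + sqrt(g h). The function name, the dry-cell threshold and the CFL number are illustrative assumptions, not values taken from FullSWOF2D.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def cfl_time_step(h, u, dx, cfl=0.5, h_dry=1e-6):
    """Largest time step allowed by the CFL condition on a 1D grid.

    h: water depths per cell, u: velocities per cell, dx: cell width.
    Cells with depth <= h_dry are treated as dry and skipped.
    """
    max_speed = 0.0
    for depth, vel in zip(h, u):
        if depth <= h_dry:
            continue  # dry cell: no wave can propagate from it
        max_speed = max(max_speed, abs(vel) + math.sqrt(G * depth))
    if max_speed == 0.0:
        return math.inf  # everything is dry and at rest
    return cfl * dx / max_speed
```

Skipping dry cells in the loop also hints at the load imbalance the abstract describes: the work per subdomain depends on how many of its cells are wet.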
Parallel processing method for high-speed real time digital pulse processing for gamma-ray spectroscopy

International Nuclear Information System (INIS)

Fernandes, A.M.; Pereira, R.C.; Sousa, J.; Neto, A.; Carvalho, P.; Batista, A.J.N.; Carvalho, B.B.; Varandas, C.A.F.; Tardocchi, M.; Gorini, G.

2010-01-01

A new data acquisition (DAQ) system was developed to fulfil the requirements of the gamma-ray spectrometer (GRS) JET-EP2 (Joint European Torus enhancement project 2), providing high-resolution spectroscopy at very high count rates (up to a few MHz). The system is based on the Advanced Telecommunications Computing Architecture™ (ATCA™) and includes a transient record (TR) module with 8 channels of 14-bit resolution at a 400 MSamples/s (MSPS) sampling rate, 4 GB of local memory, and 2 field programmable gate arrays (FPGAs) able to run real-time algorithms for data reduction and digital pulse processing. Although at 400 MSPS only fast programmable devices such as FPGAs can be used for both data processing and data transfer, FPGA resources also present speed limitations for some specific tasks, leading to unavoidable data loss when demanding algorithms are applied. To overcome this problem, and foreseeing an increase in algorithm complexity, a new digital parallel filter was developed, aiming to perform real-time pulse processing in the FPGAs of the TR module at the stated sampling rate. The filter is based on the conventional digital time-invariant trapezoidal shaper operating with parallelized data while performing pulse height analysis (PHA) and pile-up rejection (PUR). The incoming sampled data is successively parallelized and fed into the processing algorithm block at one fourth of the sampling rate. The subsequent data processing and data transfer are also performed at one fourth of the sampling rate. The algorithm based on the data parallelization technique was implemented and tested at the JET facilities, where a spectrum was obtained. Given the observed results, the PHA algorithm will be improved by implementing pulse pile-up discrimination.
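For orientation, a time-invariant trapezoidal shaper for step-like pulses can be written as a running sum of four delayed copies of the input. The sketch below is a serial software illustration of that general technique, with assumed parameter names `rise` and `flat`. It does not reproduce the FPGA design described above, its four-way data parallelization, or any decay-time correction.

```python
def trapezoidal_shaper(samples, rise, flat):
    """Shape step-like pulses into trapezoids.

    rise: number of samples on the rising (and falling) edge.
    flat: delay added between the two edges, which sets the flat-top length.
    The output is normalised so the flat top equals the step amplitude.
    """
    k = rise
    l = rise + flat

    def v(i):
        # Input is taken as zero before the first sample.
        return samples[i] if i >= 0 else 0.0

    out = []
    acc = 0.0
    for n in range(len(samples)):
        # Difference of two moving windows, accumulated recursively.
        acc += v(n) - v(n - k) - v(n - l) + v(n - k - l)
        out.append(acc / k)
    return out
```

Applied to a step of amplitude 2.0, the output ramps up over `rise` samples, stays at 2.0 on the flat top, and returns to zero, so the pulse height can be read from the flat top.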
Parallel pic plasma simulation through particle decomposition techniques

International Nuclear Information System (INIS)

Briguglio, S.; Vlad, G.; Di Martino, B.; Naples, Univ. 'Federico II'

1998-02-01

Particle-in-cell (PIC) codes are among the major candidates to yield a satisfactory description of the detail of kinetic effects, such as the resonant wave-particle interaction, relevant in determining the transport mechanism in magnetically confined plasmas. A significant improvement of the simulation performance of such codes can be expected from parallelization, e.g., by distributing the particle population among several parallel processors. Parallelization of a hybrid magnetohydrodynamic-gyrokinetic code has been accomplished within the High Performance Fortran (HPF) framework, and tested on the IBM SP2 parallel system, using a 'particle decomposition' technique. The adopted technique requires a moderate effort in porting the code to parallel form and results in intrinsic load balancing and modest interprocessor communication. The performance tests obtained confirm the hypothesis of high effectiveness of the strategy, if targeted towards moderately parallel architectures. Optimal use of resources is also discussed with reference to a specific physics problem.
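The idea of particle decomposition can be illustrated in a few lines: every worker holds a share of the particles and a full copy of the grid, deposits its own particles, and the partial grids are then summed. The sketch below is a serial Python illustration under assumed names, with nearest-grid-point deposition on a periodic 1D grid. The code described above is written in HPF and is not reproduced here.

```python
def deposit(positions, n_cells, length):
    """Nearest-grid-point density deposition on a periodic 1D grid."""
    grid = [0.0] * n_cells
    dx = length / n_cells
    for x in positions:
        grid[int(x / dx) % n_cells] += 1.0
    return grid


def particle_decomposition(positions, n_workers, n_cells, length):
    """Split particles round-robin, deposit per worker, then sum the grids."""
    chunks = [positions[w::n_workers] for w in range(n_workers)]
    # Each deposit call is independent and could run on its own processor.
    partial = [deposit(chunk, n_cells, length) for chunk in chunks]
    # The only communication step: an element-wise sum (a reduction).
    return [sum(column) for column in zip(*partial)]
```

Because the round-robin split gives every worker almost the same number of particles, the load is balanced by construction, and the only data exchanged is the grid-sized reduction. That matches the "intrinsic load balancing and modest interprocessor communication" noted in the abstract.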
High temporal resolution functional MRI using parallel echo volumar imaging

International Nuclear Information System (INIS)

Rabrait, C.; Ciuciu, P.; Ribes, A.; Poupon, C.; Dehaine-Lambertz, G.; LeBihan, D.; Lethimonnier, F.; Le Roux, P.

2008-01-01

Purpose: To combine parallel imaging with 3D single-shot acquisition (echo volumar imaging, EVI) in order to acquire high temporal resolution volumar functional MRI (fMRI) data. Materials and Methods: An improved EVI sequence was associated with parallel acquisition and field of view reduction in order to acquire a large brain volume in 200 msec. Temporal stability and functional sensitivity were increased through optimization of all imaging parameters and Tikhonov regularization of parallel reconstruction. Two human volunteers were scanned with parallel EVI in a 1.5 T whole-body MR system, while submitted to a slow event-related auditory paradigm. Results: Thanks to parallel acquisition, the EVI volumes display a low level of geometric distortions and signal losses. After removal of low-frequency drifts and physiological artifacts, activations were detected in the temporal lobes of both volunteers and voxel-wise hemodynamic response functions (HRF) could be computed. On these HRF different habituation behaviors in response to sentence repetition could be identified. Conclusion: This work demonstrates the feasibility of high temporal resolution 3D fMRI with parallel EVI. Combined with advanced estimation tools, this acquisition method should prove useful to measure neural activity timing differences or study the nonlinearities and non-stationarities of the BOLD response. (authors)
Integrated computer network high-speed parallel interface

International Nuclear Information System (INIS)

Frank, R.B.

1979-03-01

As the number and variety of computers within Los Alamos Scientific Laboratory's Central Computer Facility grows, the need for a standard, high-speed intercomputer interface has become more apparent. This report details the development of a High-Speed Parallel Interface from conceptual through implementation stages to meet current and future needs for large-scale network computing within the Integrated Computer Network. 4 figures
Routing performance analysis and optimization within a massively parallel computer

Science.gov (United States)

Archer, Charles Jens; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen

2013-04-16

An apparatus, program product and method optimize the operation of a massively parallel computer system by, in part, receiving actual performance data concerning an application executed by the plurality of interconnected nodes, and analyzing the actual performance data to identify an actual performance pattern. A desired performance pattern may be determined for the application, and an algorithm may be selected from among a plurality of algorithms stored within a memory, the algorithm being configured to achieve the desired performance pattern based on the actual performance data.
Parallel plasma fluid turbulence calculations

International Nuclear Information System (INIS)

Leboeuf, J.N.; Carreras, B.A.; Charlton, L.A.; Drake, J.B.; Lynch, V.E.; Newman, D.E.; Sidikman, K.L.; Spong, D.A.

1994-01-01

The study of plasma turbulence and transport is a complex problem of critical importance for fusion-relevant plasmas. To this day, the fluid treatment of plasma dynamics is the best approach to realistic physics at the high resolution required for certain experimentally relevant calculations. Core and edge turbulence in a magnetic fusion device have been modeled using state-of-the-art, nonlinear, three-dimensional, initial-value fluid and gyrofluid codes. Parallel implementation of these models on diverse platforms--vector parallel (National Energy Research Supercomputer Center's CRAY Y-MP C90), massively parallel (Intel Paragon XP/S 35), and serial parallel (clusters of high-performance workstations using the Parallel Virtual Machine protocol)--offers a variety of paths to high resolution and significant improvements in real-time efficiency, each with its own advantages. The largest and most efficient calculations have been performed at the 200 Mword memory limit on the C90 in dedicated mode, where an overlap of 12 to 13 out of a maximum of 16 processors has been achieved with a gyrofluid model of core fluctuations. The richness of the physics captured by these calculations is commensurate with the increased resolution and efficiency and is limited only by the ingenuity brought to the analysis of the massive amounts of data generated
High spatial resolution CT image reconstruction using parallel computing

International Nuclear Information System (INIS)

Yin Yin; Liu Li; Sun Gongxing

2003-01-01

Using a PC cluster system with 16 dual-CPU nodes, we accelerate the FBP and OR-OSEM reconstruction of high spatial resolution images (2048 x 2048). Based on the number of projections, we rewrite the reconstruction algorithms into parallel form and dispatch the tasks to each CPU. With parallel computing, the speedup factor is roughly equal to the number of CPUs, and reaches about 25 when 25 CPUs are used. This technique is very suitable for real-time high spatial resolution CT image reconstruction. (authors)
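Dividing the work by projections is possible because backprojection is a sum over projection angles, so each CPU can backproject its own subset and the partial images can be added at the end. The sketch below shows only the task-splitting step, under assumed names. The projection count used in the note is hypothetical and the reconstruction itself is not shown.

```python
def split_projections(n_projections, n_cpus):
    """Assign contiguous, near-equal ranges of projection indices to CPUs."""
    base, extra = divmod(n_projections, n_cpus)
    ranges = []
    start = 0
    for cpu in range(n_cpus):
        # The first `extra` CPUs take one additional projection each.
        count = base + (1 if cpu < extra else 0)
        ranges.append(range(start, start + count))
        start += count
    return ranges
```

For a hypothetical scan of 360 projections on 25 CPUs, each CPU would receive 14 or 15 projections. A near-equal split of this kind is what makes a speedup close to the number of CPUs plausible.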
Performance study of a cluster calculation; parallelization and application under geant4

International Nuclear Information System (INIS)

Trabelsi, Abir

2007-01-01

This work is the final-year project for a degree in computer science engineering, carried out at the National Center of Nuclear Sciences and Technology. The project consists of studying the performance of a set of machines in order to determine the best architecture for assembling them into a cluster, as well as studying parallelism and the parallel implementation of GEANT4 as a simulation tool. The project comprises: 1) programming in C++ and executing the two benchmarks PMV and PMM on each station; 2) interpreting the results in order to identify the best cluster architecture; 3) parallelizing the two benchmarks with TOP-C; 4) executing the two TOP-C versions on the cluster; 5) generalizing these results; 6) parallelizing GEANT4 and executing the parallel version. (Author). 14 refs
Optical interconnection networks for high-performance computing systems

International Nuclear Information System (INIS)

Biberman, Aleksandr; Bergman, Keren

2012-01-01

Enabled by silicon photonic technology, optical interconnection networks have the potential to be a key disruptive technology in computing and communication industries. The enduring pursuit of performance gains in computing, combined with stringent power constraints, has fostered the ever-growing computational parallelism associated with chip multiprocessors, memory systems, high-performance computing systems and data centers. Sustaining these parallelism growths introduces unique challenges for on- and off-chip communications, shifting the focus toward novel and fundamentally different communication approaches. Chip-scale photonic interconnection networks, enabled by high-performance silicon photonic devices, offer unprecedented bandwidth scalability with reduced power consumption. We demonstrate that the silicon photonic platforms have already produced all the high-performance photonic devices required to realize these types of networks. Through extensive empirical characterization in much of our work, we demonstrate such feasibility of waveguides, modulators, switches and photodetectors. We also demonstrate systems that simultaneously combine many functionalities to achieve more complex building blocks. We propose novel silicon photonic devices, subsystems, network topologies and architectures to enable unprecedented performance of these photonic interconnection networks. Furthermore, the advantages of photonic interconnection networks extend far beyond the chip, offering advanced communication environments for memory systems, high-performance computing systems, and data centers. (review article)
Beam dynamics simulations using a parallel version of PARMILA

International Nuclear Information System (INIS)

Ryne, R.D.

1996-01-01

The computer code PARMILA has been the primary tool for the design of proton and ion linacs in the United States for nearly three decades. Previously it was sufficient to perform simulations with of order 10000 particles, but recently the need to perform high resolution halo studies for next-generation, high intensity linacs has made it necessary to perform simulations with of order 100 million particles. With the advent of massively parallel computers such simulations are now within reach. Parallel computers already make it possible, for example, to perform beam dynamics calculations with tens of millions of particles, requiring over 10 GByte of core memory, in just a few hours. Also, parallel computers are becoming easier to use thanks to the availability of mature, Fortran-like languages such as Connection Machine Fortran and High Performance Fortran. We will describe our experience developing a parallel version of PARMILA and the performance of the new code
Improving matrix-vector product performance and multi-level preconditioning for the parallel PCG package

Energy Technology Data Exchange (ETDEWEB)

McLay, R.T.; Carey, G.F.

1996-12-31

In this study we consider parallel solution of sparse linear systems arising from discretized PDEs. As part of our continuing work on our parallel PCG Solver package, we have made improvements in two areas. The first is improving the performance of the matrix-vector product. Here, on regular finite-difference grids, we are able to use the cache memory more efficiently for smaller domains or where there are multiple degrees of freedom. The second problem of interest in the present work is the construction of preconditioners in the context of the parallel PCG solver we are developing. Here the problem is partitioned over a set of processor subdomains and the matrix-vector product for PCG is carried out in parallel for overlapping grid subblocks. For problems of scaled speedup, the actual rate of convergence of the unpreconditioned system deteriorates as the mesh is refined. Multigrid and subdomain strategies provide a logical approach to resolving the problem. We consider the parallel trade-offs between communication and computation and provide a complexity analysis of a representative algorithm. Some preliminary calculations using the parallel package and comparisons with other preconditioners are provided together with parallel performance results.
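The structure of a preconditioned conjugate gradient (PCG) iteration shows where the matrix-vector product and the preconditioner enter. The sketch below is a serial, dense-matrix illustration with a simple Jacobi (diagonal) preconditioner, chosen here only for brevity. It is not the multilevel preconditioner or the parallel package described above, and all names are assumptions.

```python
def matvec(a, x):
    """Dense matrix-vector product (the kernel a parallel code distributes)."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def pcg(a, b, tol=1e-10, max_iter=1000):
    """Solve a x = b for symmetric positive definite a, Jacobi-preconditioned."""
    inv_diag = [1.0 / a[i][i] for i in range(len(b))]
    x = [0.0] * len(b)
    r = list(b)                                  # residual b - a x for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]   # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        ap = matvec(a, p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x
```

Each iteration needs one matrix-vector product, one preconditioner application and two inner products. In a distributed setting the inner products are global reductions, which is one source of the communication versus computation trade-off mentioned in the abstract.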
High accuracy microwave frequency measurement based on single-drive dual-parallel Mach-Zehnder modulator

DEFF Research Database (Denmark)

Zhao, Ying; Pang, Xiaodan; Deng, Lei

2011-01-01

A novel approach for broadband microwave frequency measurement by employing a single-drive dual-parallel Mach-Zehnder modulator is proposed and experimentally demonstrated. Based on bias manipulations of the modulator, conventional frequency-to-power mapping technique is developed by performing a...... 10−3 relative error. This high accuracy frequency measurement technique is a promising candidate for high-speed electronic warfare and defense applications....
6th International Parallel Tools Workshop

CERN Document Server

Brinkmann, Steffen; Gracia, José; Resch, Michael; Nagel, Wolfgang

2013-01-01

The latest advances in the High Performance Computing hardware have significantly raised the level of available compute performance. At the same time, the growing hardware capabilities of modern supercomputing architectures have caused an increasing complexity of the parallel application development. Despite numerous efforts to improve and simplify parallel programming, there is still a lot of manual debugging and tuning work required. This process is supported by special software tools, facilitating debugging, performance analysis, and optimization and thus making a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools, which were presented and discussed at the 6th International Parallel Tools Workshop, held in Stuttgart, Germany, 25-26 September 2012.
The Design of Polymer Planar Optical Triplexer with MMI Filter and Directional Coupler

Directory of Open Access Journals (Sweden)

V. Jerabek

2013-12-01

An optical bidirectional WDM transceiver is a key component of the Passive Optical Network of the Fiber to the Home topology. Essential parts of such transceivers are filters that combine the multiplexing and demultiplexing functions of the optical signal (triplexing filters). In this paper we report on the design of a new planar optical multi-wavelength selective system triplexing filter, which combines a multimode interference filter with a directional coupler based on the epoxy polymer SU-8 on a Si/SiO2 substrate. The optical triplexing filter was designed using the Beam Propagation Method. The aim of this project was to optimize the optical parameters of the triplexing filter and to minimize the dimensions of the planar optical wavelength selective system. The multimode interference filter was used for separation of the downstream optical signal in the designed optoelectronic integrated WDM transceiver. The directional coupler was used for adding the upstream optical signal.
Eddy current loss calculation and thermal analysis of axial-flux permanent magnet couplers

Directory of Open Access Journals (Sweden)

Di Zheng

2017-02-01

A three-dimensional magnetic field analytical model of axial-flux permanent magnet couplers is presented to calculate the eddy current loss, and the copper plate temperature under various loads is predicted. The magnetic field distribution is calculated, and then the eddy current loss is obtained, with the magnetic field analytical model established in cylindrical coordinates. The influence of various loads on eddy current loss is analyzed. Furthermore, a thermal model of axial-flux permanent magnet couplers is established by taking the eddy current loss as the heat source, using the electromagnetic-thermal coupled method. With the help of the thermal model, the influence of various loads on copper plate temperature rise is also analyzed. The calculated results are compared with finite element results and measurements. The comparison confirms the validity of the magnetic field analytical model and the thermal model.

Toward an ultra-high resolution community climate system model for the BlueGene platform

Energy Technology Data Exchange (ETDEWEB)

Dennis, John M [Computer Science Section, National Center for Atmospheric Research, Boulder, CO (United States); Jacob, Robert [Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL (United States); Vertenstein, Mariana [Climate and Global Dynamics Division, National Center for Atmospheric Research, Boulder, CO (United States); Craig, Tony [Climate and Global Dynamics Division, National Center for Atmospheric Research, Boulder, CO (United States); Loy, Raymond [Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL (United States)

2007-07-15

Global climate models need to simulate several small, regional-scale processes which affect the global circulation in order to accurately simulate the climate. This is particularly important in the ocean where small scale features such as oceanic eddies are currently represented with ad hoc parameterizations. There is also a need for higher resolution to provide climate predictions at small, regional scales. New high-performance computing platforms such as the IBM BlueGene can provide the necessary computational power to perform ultra-high resolution climate model integrations. We have begun to investigate the scaling of the individual components of the Community Climate System Model to prepare it for integrations on BlueGene and similar platforms. Our investigations show that it is possible to successfully utilize O(32K) processors. We describe the scalability of five models: the Parallel Ocean Program (POP), the Community Ice CodE (CICE), the Community Land Model (CLM), and the new CCSM sequential coupler (CPL7) which are components of the next generation Community Climate System Model (CCSM); as well as the High-Order Method Modeling Environment (HOMME) which is a dynamical core currently being evaluated within the Community Atmospheric Model. For our studies we concentrate on 1/10° resolution for CICE, POP, and CLM models and 1/4° resolution for HOMME. The ability to simulate high resolutions on the massively parallel petascale systems that will dominate high-performance computing for the foreseeable future is essential to the advancement of climate science.
Highly parallel algorithm for high pT physics at FAIR-CBM

International Nuclear Information System (INIS)

Fueloep, A; Vesztergombi, G

2010-01-01

The limitations of presently available data on the pT range are discussed and planned future upgrades are outlined. Special attention is given to the FAIR-CBM experiment as a unique high luminosity facility for future continuation of the measurements at very high pT, with emphasis on the so-called mosaic trigger system, which uses the highly parallel online algorithm.
High-Performance Control of Paralleled Three-Phase Inverters for Residential Microgrid Architectures Based on Online Uninterruptable Power Systems

DEFF Research Database (Denmark)

Zhang, Chi; Guerrero, Josep M.; Vasquez, Juan Carlos

2015-01-01

In this paper, a control strategy for the parallel operation of three-phase inverters forming an online uninterruptible power system (UPS) is presented. The UPS system consists of a cluster of paralleled inverters with LC filters directly connected to an AC critical bus and an AC/DC forming a DC...... bus. The proposed control scheme is performed on two layers: (i) a local layer that contains a “reactive power vs phase” in order to synchronize the phase angle of each inverter and a virtual resistance loop that guarantees equal power sharing among inverters; (ii) a central controller that guarantees...... synchronization with an external real/fictitious utility, and critical bus voltage restoration. Constant transient and steady-state frequency, active, reactive and harmonic power sharing, and global phase-locked loop resynchronization capability are achieved. Detailed system topology and control architecture...
A Highly Parallel and Scalable Motion Estimation Algorithm with GPU for HEVC

Directory of Open Access Journals (Sweden)

Yun-gang Xue

2017-01-01

We propose a highly parallel and scalable motion estimation algorithm, named multilevel resolution motion estimation (MLRME for short), by combining the advantages of local full search and downsampling. By subsampling a video frame, a large amount of computation is saved. By using the local full-search method, it can exploit massive parallelism and make full use of powerful modern many-core accelerators, such as GPUs and Intel Xeon Phi. We integrated the proposed MLRME into HM12.0, and the experimental results showed that the encoding quality of the MLRME method is close to that of the fast motion estimation in HEVC, declining by less than 1.5%. We also implemented the MLRME with CUDA, which obtained a 30–60x speed-up compared to the serial algorithm on a single CPU. Specifically, the parallel implementation of MLRME on a GTX 460 GPU can meet the real-time coding requirement with about 25 fps for the 2560×1600 video format, while, for 832×480, the performance is more than 100 fps.
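The combination of downsampling and local full search can be sketched as a two-level block-matching search: a full search on half-resolution frames gives a coarse motion vector, which is then refined by a small full search at full resolution. This is a simplified serial illustration with assumed names and a sum-of-absolute-differences cost, not the MLRME algorithm or its CUDA implementation. It assumes even block origins and block sizes.

```python
def downsample(frame):
    """Halve resolution by averaging each 2x2 pixel group."""
    return [[(frame[y][x] + frame[y][x + 1]
              + frame[y + 1][x] + frame[y + 1][x + 1]) / 4.0
             for x in range(0, len(frame[0]) - 1, 2)]
            for y in range(0, len(frame) - 1, 2)]


def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block and its displaced match."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(size) for x in range(size))


def full_search(cur, ref, bx, by, size, radius, center=(0, 0)):
    """Test every displacement within `radius` of `center`; return the best."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            inside = (0 <= by + dy and by + dy + size <= h
                      and 0 <= bx + dx and bx + dx + size <= w)
            if not inside:
                continue  # candidate block would leave the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]


def two_level_search(cur, ref, bx, by, size, radius):
    """Coarse full search at half resolution, then a +/-1 pixel refinement."""
    cdx, cdy = full_search(downsample(cur), downsample(ref),
                           bx // 2, by // 2, size // 2, radius)
    return full_search(cur, ref, bx, by, size, 1, center=(2 * cdx, 2 * cdy))
```

Every candidate displacement in `full_search` is evaluated independently of the others. This independence is the property that maps well onto many-core accelerators, while the coarse level keeps the number of candidates small.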
Quantum superchemistry in an output coupler of coherent matter waves

International Nuclear Information System (INIS)

Jing, H.; Cheng, J.

2006-01-01

We investigate the quantum superchemistry or Bose-enhanced atom-molecule conversions in a coherent output coupler of matter waves, as a simple generalization of the two-color photoassociation. The stimulated effects of molecular output step and atomic revivals are exhibited by steering the rf output couplings. The quantum noise-induced molecular damping occurs near a total conversion in a levitation trap. This suggests a feasible two-trap scheme to make a stable coherent molecular beam
Kinematic Analysis and Performance Evaluation of Novel PRS Parallel Mechanism

Science.gov (United States)

Balaji, K.; Khan, B. Shahul Hamid

2018-02-01

In this paper, a novel 3 DoF (Degree of Freedom) PRS (Prismatic-Revolute-Spherical) type parallel mechanism has been designed and presented. The combination of straight and arc type linkages for a 3 DoF parallel mechanism is introduced for the first time. The performance of the mechanisms is evaluated based on indices such as Minimum Singular Value (MSV), Condition Number (CN), Local Conditioning Index (LCI), Kinematic Configuration Index (KCI) and Global Conditioning Index (GCI). The overall reachable workspace of all mechanisms is presented. The kinematic measure, dexterity measure and workspace analysis for all the mechanisms have been evaluated and compared.
Template based parallel checkpointing in a massively parallel computer system

Science.gov (United States)

Archer, Charles Jens [Rochester, MN]; Inglett, Todd Alan [Rochester, MN]

2009-01-13

A method and apparatus for a template based parallel checkpoint save for a massively parallel super computer system using a parallel variation of the rsync protocol, and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in the storage and that was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional non-lossy data compression algorithms to further reduce the overall checkpoint size.
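The core idea of comparing a node's checkpoint against a template and saving only the blocks that differ can be sketched with per-block checksums. The example below is a simplified illustration under assumed names. It compares blocks at fixed offsets, whereas rsync proper uses rolling checksums to find matches at arbitrary offsets, and it does not model the network broadcast described above. The tiny block size is for demonstration only.

```python
import hashlib
import zlib

BLOCK = 4  # bytes per block; unrealistically small, for demonstration


def blocks(data, size):
    return [data[i:i + size] for i in range(0, len(data), size)]


def make_delta(node_data, template, size=BLOCK):
    """Return {block index: (checksum, compressed block)} for changed blocks."""
    template_sums = [hashlib.sha256(b).digest() for b in blocks(template, size)]
    delta = {}
    for idx, blk in enumerate(blocks(node_data, size)):
        digest = hashlib.sha256(blk).digest()
        if idx >= len(template_sums) or digest != template_sums[idx]:
            # Block differs from the template: store it, losslessly compressed.
            delta[idx] = (digest, zlib.compress(blk))
    return delta


def restore(template, delta, length, size=BLOCK):
    """Rebuild a node's checkpoint of `length` bytes from template + delta."""
    n_blocks = (length + size - 1) // size
    out = blocks(template, size)[:n_blocks]
    out += [b""] * (n_blocks - len(out))
    for idx, (_digest, compressed) in delta.items():
        out[idx] = zlib.decompress(compressed)
    return b"".join(out)[:length]
```

When most nodes hold nearly the same state as the template, the delta contains only a few blocks per node, which is where the reduction in transmitted and stored data would come from.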
PGHPF – An Optimizing High Performance Fortran Compiler for Distributed Memory Machines

Directory of Open Access Journals (Sweden)

Zeki Bozkus

1997-01-01

High Performance Fortran (HPF) is the first widely supported, efficient, and portable parallel programming language for shared and distributed memory systems. HPF is realized through a set of directive-based extensions to Fortran 90. It enables application developers and Fortran end-users to write compact, portable, and efficient software that will compile and execute on workstations, shared memory servers, clusters, traditional supercomputers, or massively parallel processors. This article describes a production-quality HPF compiler for a set of parallel machines. Compilation techniques such as data and computation distribution, communication generation, run-time support, and optimization issues are elaborated as the basis for an HPF compiler implementation on distributed memory machines. The performance of this compiler on benchmark programs demonstrates that high efficiency can be achieved executing HPF code on parallel architectures.
Z-buffer image assembly processing in high parallel visualization processing

International Nuclear Information System (INIS)

Kaneko, Isamu; Muramatsu, Kazuhiro

2000-03-01

On parallel computers with many processors, the domain decomposition method is a popular means of parallel processing. Now that simulations are becoming much larger and take a long time, visualization performed simultaneously with the actual computation is increasingly needed, and for real-time visualization in particular the domain decomposition technique is indispensable. In parallel rendering, the rendered results must be gathered on one processor to compose the integrated picture in the last stage. This integration is usually done using Z-buffer values. This process, however, causes the serious problems of much slower processing and local memory shortage when more than several tens of processors are used. In this report, two new solutions are proposed. One is the adoption of a special operator (Reduce operator) in the parallelization process, and the other is buffer compression by deleting the background information. This report includes performance results for these new techniques, measured on the parallel computer Paragon. (author)
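Both proposed ideas can be illustrated compactly. Z-buffer compositing keeps, for each pixel, the fragment closest to the viewer. Because that per-pixel choice is associative, partial images can be merged pairwise by a reduction rather than gathered on one processor. Background pixels carry no information and can be dropped before transfer. The sketch below is a serial illustration with assumed names and a (depth, colour) pixel layout, not the report's Paragon implementation.

```python
from functools import reduce

INF = float("inf")
BACKGROUND = (INF, None)  # (depth, colour) of a pixel no object covers


def composite(img_a, img_b):
    """Merge two partial images: per pixel, keep the nearer fragment."""
    return [pa if pa[0] <= pb[0] else pb for pa, pb in zip(img_a, img_b)]


def composite_all(images):
    """Reduction over all partial images using the associative operator."""
    return reduce(composite, images)


def compress(image):
    """Keep only non-background pixels, keyed by pixel index."""
    return {i: px for i, px in enumerate(image) if px[0] != INF}


def decompress(sparse, n_pixels):
    """Rebuild a full image, filling missing pixels with background."""
    return [sparse.get(i, BACKGROUND) for i in range(n_pixels)]
```

In a message-passing setting the same `composite` function would play the role of a user-defined reduction operator, so the merge cost grows with the logarithm of the processor count instead of linearly.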
Tolerance study for the components of the probe-type and hook-type Higher Order Mode couplers for the HL-LHC 800 MHz harmonic system

CERN Document Server

Blanco, Esteban

2016-01-01

A superconducting 800 MHz second harmonic RF system is one of the considered options as a Landau damping mechanism for HiLumi LHC. The Higher Order Mode (HOM) coupler designs require tight manufacturing tolerances in order to operate at the design specifications. The project consists of defining the mechanical tolerances for the different components of both the probe-type and hook-type HOM coupler. With the use of electromagnetic field simulation software it is possible to identify the critical components of the HOM coupler and to quantify their respective tolerances. The obtained results are discussed in this paper.
High power tests of dressed superconducting 1.3 GHz RF cavities

Energy Technology Data Exchange (ETDEWEB)

Hocker, A.; Harms, E.R.; Lunin, A.; Sukhanov, A.; /Fermilab

2011-03-01

A single-cavity test cryostat is used to conduct pulsed high power RF tests of superconducting 1.3 GHz RF cavities at 2 K. The cavities under test are welded inside individual helium vessels and are outfitted ('dressed') with a fundamental power coupler, higher-order mode couplers, magnetic shielding, a blade tuner, and piezoelectric tuners. The cavity performance is evaluated in terms of accelerating gradient, unloaded quality factor, and field emission, and the functionality of the auxiliary components is verified. Test results from the first set of dressed cavities are presented here.
A numerical method for determining the radial wave motion correction in plane wave couplers

DEFF Research Database (Denmark)

Cutanda Henriquez, Vicente; Barrera Figueroa, Salvador; Torras Rosell, Antoni

2016-01-01

Microphones are used for realising the unit of sound pressure level, the pascal (Pa). Electro-acoustic reciprocity is the preferred method for the absolute determination of the sensitivity. This method can be applied in different sound fields: uniform pressure, free field or diffuse field. Pressure...... solution is an analytical expression that estimates the difference between the ideal plane wave sound field and a more complex lossless sound field created by a non-planar movement of the microphone’s membranes. Alternatively, a correction may be calculated numerically by introducing a full model...... of the microphone-coupler system in a Boundary Element formulation. In order to obtain a realistic representation of the sound field, viscous losses must be introduced in the model. This paper presents such a model, and the results of the simulations for different combinations of microphones and couplers...
High Performance Computation of a Jet in Crossflow by Lattice Boltzmann Based Parallel Direct Numerical Simulation

Directory of Open Access Journals (Sweden)

Jiang Lei

2015-01-01

Full Text Available Direct numerical simulation (DNS) of a round jet in crossflow based on lattice Boltzmann method (LBM) is carried out on multi-GPU cluster. Data parallel SIMT (single instruction multiple thread) characteristic of GPU matches the parallelism of LBM well, which leads to the high efficiency of GPU on the LBM solver. With present GPU settings (6 Nvidia Tesla K20M), the present DNS simulation can be completed in several hours. A grid system of 1.5 × 10^8 is adopted and largest jet Reynolds number reaches 3000. The jet-to-free-stream velocity ratio is set as 3.3. The jet is orthogonal to the mainstream flow direction. The validated code shows good agreement with experiments. Vortical structures of CRVP, shear-layer vortices and horseshoe vortices, are presented and analyzed based on velocity fields and vorticity distributions. Turbulent statistical quantities of Reynolds stress are also displayed. Coherent structures are revealed in a very fine resolution based on the second invariant of the velocity gradients.
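
To illustrate why the lattice Boltzmann method maps well onto data-parallel hardware, the following Python sketch performs the collision step at a single lattice node. The abstract does not state the lattice or the collision model, so the two-dimensional D2Q9 lattice and the BGK relaxation used here are assumptions chosen for brevity; a three-dimensional solver would use a 3D lattice.

```python
# D2Q9 lattice: discrete velocities and their weights (standard values).
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def equilibrium(rho, ux, uy):
    """Second-order equilibrium distribution for density rho and velocity (ux, uy)."""
    usq = ux * ux + uy * uy
    feq = []
    for (cx, cy), w in zip(C, W):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1 + 3 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq


def moments(f):
    """Density and velocity recovered from the nine populations at one node."""
    rho = sum(f)
    ux = sum(fi * cx for fi, (cx, _) in zip(f, C)) / rho
    uy = sum(fi * cy for fi, (_, cy) in zip(f, C)) / rho
    return rho, ux, uy


def bgk_collide(f, tau):
    """Relax the populations toward equilibrium with relaxation time tau."""
    rho, ux, uy = moments(f)
    feq = equilibrium(rho, ux, uy)
    return [fi - (fi - fe) / tau for fi, fe in zip(f, feq)]


# Hypothetical node state: slightly perturbed populations.
f_node = [0.40, 0.12, 0.11, 0.10, 0.11, 0.03, 0.028, 0.027, 0.029]
f_after = bgk_collide(f_node, tau=0.8)
print(moments(f_node), moments(f_after))
```

The collision reads and writes only the populations of one node, so every node can be handled by its own thread; only the streaming step, not shown here, touches neighbouring nodes.
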
Switching management in couplers with biharmonic longitudinal modulation of refractive index

OpenAIRE

Kartashov, Yaroslav V.; Vysloukh, Victor A.

2009-01-01

We address light propagation in couplers with longitudinal biharmonic modulation of refractive index in neighboring channels. While simplest single-frequency out-of-phase modulation allows suppression of coupling for strictly defined set of resonant frequencies, the addition of modulation on multiple frequency dramatically modifies the structure of resonances. Thus, additional modulation on double frequency may suppress primary resonance, while modulation on triple frequency causes fusion of ...
New design of a triplexer using ring resonator integrated with directional coupler based on photonic crystals

Science.gov (United States)

Wu, Yaw-Dong; Shih, Tien-Tsorng; Lee, Jian-Jang

2009-11-01

In this paper, we propose the design of a directional coupler integrated with a ring resonator based on two-dimensional photonic crystals (2D PCs) to develop a triplexer filter. It can be widely used as a fiber access network element for wavelength-selective multiplexing and demultiplexing in fiber-to-the-home (FTTH) communication systems. The directional coupler is chosen to separate the wavelengths of 1490nm and 1310nm. The ring resonator separates the wavelength of 1550nm. The transmission efficiency is larger than 90%. Besides, the total size of the proposed triplexer is only 19μm×12μm. We present simulation results using the finite-difference time-domain (FDTD) method for the proposed structure.
High-Bandwidth, High-Efficiency Envelope Tracking Power Supply for 40W RF Power Amplifier Using Paralleled Bandpass Current Sources

DEFF Research Database (Denmark)

Høyerby, Mikkel Christian Wendelboe; Andersen, Michael Andreas E.

2005-01-01

This paper presents a high-performance power conversion scheme for power supply applications that require very high output voltage slew rates (dV/dt). The concept is to parallel 2 switching bandpass current sources, each optimized for its passband frequency space and the expected load current....... The principle is demonstrated with a power supply, designed for supplying a 40 W linear RF power amplifier for efficient amplification of a 16-QAM modulated data stream...
High Efficiency EBCOT with Parallel Coding Architecture for JPEG2000

Directory of Open Access Journals (Sweden)

Chiang Jen-Shiun

2006-01-01

Full Text Available This work presents a parallel context-modeling coding architecture and a matching arithmetic coder (MQ-coder) for the embedded block coding (EBCOT) unit of the JPEG2000 encoder. Tier-1 of the EBCOT consumes most of the computation time in a JPEG2000 encoding system. The proposed parallel architecture can increase the throughput rate of the context modeling. To match the high throughput rate of the parallel context-modeling architecture, an efficient pipelined architecture for context-based adaptive arithmetic encoder is proposed. This encoder of JPEG2000 can work at 180 MHz to encode one symbol each cycle. Compared with the previous context-modeling architectures, our parallel architectures can improve the throughput rate up to 25%.
High performance deformable image registration algorithms for manycore processors

CERN Document Server

Shackleford, James; Sharp, Gregory

2013-01-01

High Performance Deformable Image Registration Algorithms for Manycore Processors develops highly data-parallel image registration algorithms suitable for use on modern multi-core architectures, including graphics processing units (GPUs). Focusing on deformable registration, we show how to develop data-parallel versions of the registration algorithm suitable for execution on the GPU. Image registration is the process of aligning two or more images into a common coordinate frame and is a fundamental step to be able to compare or fuse data obtained from different sensor measurements. E
Integrated optical isolators based on two-mode interference couplers

International Nuclear Information System (INIS)

Sun, Yiling; Zhou, Haifeng; Jiang, Xiaoqing; Hao, Yinlei; Yang, Jianyi; Wang, Minghua

2010-01-01

This paper presents an optical waveguide isolator based on two-mode interference (TMI) couplers, by utilizing the magneto-optical nonreciprocal phase shift (NPS). The operating principle of this device is to utilize the difference between the nonreciprocal phase shifts of the two lowest-order modes. A two-dimensional (2D) semi-vectorial finite difference method is used to calculate the difference between the nonreciprocal phase shifts of the two lowest-order modes and optimize the parameters. The proposed device may play an important role in integrated optical devices and optical communication systems
Brain inspired high performance electronics on flexible silicon

KAUST Repository

Sevilla, Galo T.; Rojas, Jhonathan Prieto; Hussain, Muhammad Mustafa

2014-01-01

The brain's stunning speed, energy efficiency and massive parallelism make it the role model for upcoming high performance computation systems. Although human brain components are a million times slower than state of the art silicon industry components

High-speed parallel solution of the neutron diffusion equation with the hierarchical domain decomposition boundary element method incorporating parallel communications

International Nuclear Information System (INIS)

Tsuji, Masashi; Chiba, Gou

2000-01-01

A hierarchical domain decomposition boundary element method (HDD-BEM) for solving the multiregion neutron diffusion equation (NDE) has been fully parallelized, both for numerical computations and for data communications, to accomplish a high parallel efficiency on distributed memory message passing parallel computers. Data exchanges between node processors that are repeated during iteration processes of HDD-BEM are implemented, without any intervention of the host processor that was used to supervise parallel processing in the conventional parallelized HDD-BEM (P-HDD-BEM). Thus, the parallel processing can be executed with only cooperative operations of node processors. The communication overhead was even the dominant time consuming part in the conventional P-HDD-BEM, and the parallelization efficiency decreased steeply with the increase of the number of processors. With the parallel data communication, the efficiency is affected only by the number of boundary elements assigned to decomposed subregions, and the communication overhead can be drastically reduced. This feature can be particularly advantageous in the analysis of three-dimensional problems where a large number of processors are required. The proposed P-HDD-BEM offers a promising solution to the deterioration problem of parallel efficiency and opens a new path to parallel computations of NDEs on distributed memory message passing parallel computers. (author)
Modular high-temperature gas-cooled reactor simulation using parallel processors

International Nuclear Information System (INIS)

Ball, S.J.; Conklin, J.C.

1989-01-01

The MHPP (Modular HTGR Parallel Processor) code has been developed to simulate modular high-temperature gas-cooled reactor (MHTGR) transients and accidents. MHPP incorporates a very detailed model for predicting the dynamics of the reactor core, vessel, and cooling systems over a wide variety of scenarios ranging from expected transients to very-low-probability severe accidents. The simulation routines, which had originally been developed entirely as serial code, were readily adapted to parallel processing Fortran. The resulting parallelized simulation speed was enhanced significantly. Workstation interfaces are being developed to provide for user (operator) interaction. In this paper the benefits realized by adapting previous MHTGR codes to run on a parallel processor are discussed, along with results of typical accident analyses
Advanced Output Coupling for High Power Gyrotrons

Energy Technology Data Exchange (ETDEWEB)

Read, Michael [Calabazas Creek Research, Inc., San Mateo, CA (United States); Ives, Robert Lawrence [Calabazas Creek Research, Inc., San Mateo, CA (United States); Marsden, David [Calabazas Creek Research, Inc., San Mateo, CA (United States); Collins, George [Calabazas Creek Research, Inc., San Mateo, CA (United States); Temkin, Richard [Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States); Guss, William [Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States); Lohr, John [General Atomics, La Jolla, CA (United States); Neilson, Jeffrey [Lexam Research, Redwood City, CA (United States); Bui, Thuc [Calabazas Creek Research, Inc., San Mateo, CA (United States)

2016-11-28

The Phase II program developed an internal RF coupler that transforms the whispering gallery RF mode produced in gyrotron cavities to an HE11 waveguide mode propagating in corrugated waveguide. This power is extracted from the vacuum using a broadband, chemical vapor deposited (CVD) diamond, Brewster angle window capable of transmitting more than 1.5 MW CW of RF power over a broad range of frequencies. This coupling system eliminates the Mirror Optical Units now required to externally couple Gaussian output power into corrugated waveguide, significantly reducing system cost and increasing efficiency. The program simulated the performance using a broad range of advanced computer codes to optimize the design. Both a direct coupler and Brewster angle window were built and tested at low and high power. Test results confirmed the performance of both devices and demonstrated they are capable of achieving the required performance for scientific, defense, industrial, and medical applications.
High-Efficient Parallel CAVLC Encoders on Heterogeneous Multicore Architectures

Directory of Open Access Journals (Sweden)

H. Y. Su

2012-04-01

Full Text Available This article presents two high-efficiency parallel realizations of context-based adaptive variable length coding (CAVLC) on heterogeneous multicore processors. By optimizing the architecture of the CAVLC encoder, three kinds of dependences are eliminated or weakened: the context-based data dependence, the memory accessing dependence and the control dependence. The CAVLC pipeline is divided into three stages (two scans, coding, and lag packing) and is implemented on two typical heterogeneous multicore architectures. One is a block-based SIMD parallel CAVLC encoder on the multicore stream processor STORM. The other is a component-oriented SIMT parallel encoder on the massively parallel GPU architecture. Both exploit rich data-level parallelism. Experimental results show that, compared with the CPU version, a speedup of more than 70 times can be obtained for STORM and over 50 times for GPU. The encoder implementation on STORM achieves real-time processing for 1080p @30fps, and the GPU-based version satisfies the requirements for 720p real-time encoding. The throughput of the presented CAVLC encoders is more than 10 times higher than that of published software encoders on DSP and multicore platforms.
Measurement of S Parameters of an Accelerating Structure with Double-Feed Couplers

CERN Document Server

Fandos, R; Wuensch, W

2006-01-01

A method for measuring the transmission and reflection coefficients of an accelerating structure with double-feed input and output couplers using a 2 port network analyzer is presented. This method avoids the use of magic Ts and hybrids, whose symmetry is not obvious. The procedure is extended to devices with n symmetrical input and m symmetrical output ports. The method to make bead pull measurements for such devices is described.
Numerical simulation of waveguide input/output couplers for a planar mm-wave linac cavity

International Nuclear Information System (INIS)

Kang, Y.W.

1994-01-01

A double-sided planar mm-wave linear accelerating cavity structure has been studied. The input/output couplers for the accelerating cavity structure have been designed using the Hewlett-Packard High Frequency Structure Simulator (HFSS). The program is a frequency domain finite element 3-D field solver and can include matched port boundary conditions. The power transmission property of the structure is calculated in the frequency domain. The dimensions of the coupling cavities and the irises at the input/output ports are adjusted to have the structure matched to rectangular waveguides. The field distributions in the accelerating structure for the 2π/3-mode traveling wave are shown
PDDP, A Data Parallel Programming Model

Directory of Open Access Journals (Sweden)

Karen H. Warren

1996-01-01

Full Text Available PDDP, the parallel data distribution preprocessor, is a data parallel programming model for distributed memory parallel computers. PDDP implements high-performance Fortran-compatible data distribution directives and parallelism expressed by the use of Fortran 90 array syntax, the FORALL statement, and the WHERE construct. Distributed data objects belong to a global name space; other data objects are treated as local and replicated on each processor. PDDP allows the user to program in a shared memory style and generates codes that are portable to a variety of parallel machines. For interprocessor communication, PDDP uses the fastest communication primitives on each platform.
Proposal for fabrication-tolerant SOI polarization splitter-rotator based on cascaded MMI couplers and an assisted bi-level taper.

Science.gov (United States)

Wang, Jing; Qi, Minghao; Xuan, Yi; Huang, Haiyang; Li, You; Li, Ming; Chen, Xin; Jia, Qi; Sheng, Zhen; Wu, Aimin; Li, Wei; Wang, Xi; Zou, Shichang; Gan, Fuwan

2014-11-17

A novel silicon-on-insulator (SOI) polarization splitter-rotator (PSR) with a large fabrication tolerance is proposed based on cascaded multimode interference (MMI) couplers and an assisted mode-evolution taper. The tapers are designed to adiabatically convert the input TM(0) mode into the TE(1) mode, which will output as the TE(0) mode after being processed by the subsequent MMI mode converter, 90-degree phase shifter (PS) and MMI 3 dB coupler. The numerical simulation results show that the proposed device has a silicon photonics technology.
Performance Improvement of Shunt Active Power Filter With Dual Parallel Topology

DEFF Research Database (Denmark)

Asiminoaei, Lucian; Lascu, Cristian; Blaabjerg, Frede

2007-01-01

This paper describes the control and parallel operation of two active power filters (APFs). Possible parallel operation situations of two APFs are investigated, and then the proposed topology is analyzed. The filters are coupled in a combined topology in which one filter is connected in a feedback loop and the other is in a feedforward loop for harmonic compensation. Thus, both active power filters bring their own characteristic advantages, i.e., the feedback filter improves the steady-state performance of the harmonic mitigation and the feedforward filter improves the dynamic response. Another......
Design of a 300 GHz Broadband TWT Coupler and RF-Structure

CERN Document Server

Krawczyk, F L

2004-01-01

Recent LANL activities in millimeter wave structures focus on 94 and 300 GHz structures. They aim at power generation from low power (100–2000 W) with a round electron beam (120 kV, 0.1–1.0 A) to high power (2–100 kW) with a sheet beam structure (120 kV, 20 A). Applications cover basic research, radar and secure communications and remote sensing of biological and chemical agents. In this presentation the design and cold-test measurements of a 300 GHz RF-structure with a broadband (>6% bandwidth) power coupler are presented. The design choice of two input/output waveguides, a special coupling region and the structure parameters themselves are presented. As a benchmark also a scaled up version at 10 GHz was designed and measured. These results will also be presented.
Fabrication of low temperature cofired ceramic (LTCC) chip couplers for high frequencies : I. Effect of binder burnout process on the formation of electrode line

Energy Technology Data Exchange (ETDEWEB)

Cho, N.T.; Shim, K.B.; Lee, S.W. [Hanyang University, Seoul (Korea); Koo, K.D. [K-Cera Inc., Yongin (Korea)

1999-06-01

In the fabrication of ceramic chip couplers for high frequency applications such as the mobile communication equipment, the formation of electrode lines and Ag diffusion were investigated with heat treatment conditions for removing organic binders. The deformation and densification of the electrode line greatly depended on the binder burnout process due to the overlapped temperature zone near 400 °C of the binder dissociation and the solid phase sintering of the silver electrode. Ag ions were diffused into the glass ceramic substrate. The Ag diffusion was led by the glassy phase containing Pb ions rather than by the crystalline phase containing Ca ions. The fact suggests that the Ag diffusion could be controlled by managing the composition of the glass ceramic substrate. 9 refs., 10 figs., 1 tab.
A concurrent visualization system for large-scale unsteady simulations. Parallel vector performance on an NEC SX-4

International Nuclear Information System (INIS)

Takei, Toshifumi; Doi, Shun; Matsumoto, Hideki; Muramatsu, Kazuhiro

2000-01-01

We have developed a concurrent visualization system RVSLIB (Real-time Visual Simulation Library). This paper shows the effectiveness of the system when it is applied to large-scale unsteady simulations, for which the conventional post-processing approach may no longer work, on high-performance parallel vector supercomputers. The system performs almost all of the visualization tasks on a computation server and uses compressed visualized image data for efficient communication between the server and the user terminal. We have introduced several techniques, including vectorization and parallelization, into the system to minimize the computational costs of the visualization tools. The performance of RVSLIB was evaluated by using an actual CFD code on an NEC SX-4. The computational time increase due to the concurrent visualization was at most 3% for a smaller (1.6 million) grid and less than 1% for a larger (6.2 million) one. (author)
Aspects of computation on asynchronous parallel processors

International Nuclear Information System (INIS)

Wright, M.

1989-01-01

The increasing availability of asynchronous parallel processors has provided opportunities for original and useful work in scientific computing. However, the field of parallel computing is still in a highly volatile state, and researchers display a wide range of opinion about many fundamental questions such as models of parallelism, approaches for detecting and analyzing parallelism of algorithms, and tools that allow software developers and users to make effective use of diverse forms of complex hardware. This volume collects the work of researchers specializing in different aspects of parallel computing, who met to discuss the framework and the mechanics of numerical computing. The far-reaching impact of high-performance asynchronous systems is reflected in the wide variety of topics, which include scientific applications (e.g. linear algebra, lattice gauge simulation, ordinary and partial differential equations), models of parallelism, parallel language features, task scheduling, automatic parallelization techniques, tools for algorithm development in parallel environments, and system design issues
PARALLEL IMPLEMENTATION OF CROSS-LAYER OPTIMIZATION - A PERFORMANCE EVALUATION BASED ON SWARM INTELLIGENCE

Directory of Open Access Journals (Sweden)

Vanaja Gokul

2012-01-01

Full Text Available In distributed systems real time optimizations need to be performed dynamically for better utilization of the network resources. Real time optimizations can be performed effectively by using Cross Layer Optimization (CLO) within the network operating system. This paper presents the performance evaluation of Cross Layer Optimization (CLO) in comparison with the traditional approach of Single-Layer Optimization (SLO). In the parallel implementation of the approaches the experimental study carried out indicates that the CLO results in a significant improvement in network utilization when compared to SLO. A variant of the Particle Swarm Optimization technique that utilizes Digital Pheromones (PSODP) for better performance has been used here. A significantly higher speed up in performance was observed from the parallel implementation of CLO that used PSODP on a cluster of nodes.
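
For readers unfamiliar with the underlying optimizer, here is a minimal Python sketch of plain particle swarm optimization. It is the standard textbook form, not the digital-pheromone variant (PSODP) used in the paper, and the objective function, parameter values, and names are assumptions chosen for illustration.

```python
import random


def pso(objective, dim, n_particles=20, iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective over dim dimensions with a basic particle swarm."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # best position seen by the swarm
    history = [gbest_val]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


def sphere(x):
    """Hypothetical test objective with its minimum at the origin."""
    return sum(v * v for v in x)


best, best_val, history = pso(sphere, dim=2)
print(best, best_val)
```

In a parallel setting the objective evaluations of different particles are the natural unit to spread over cluster nodes, since they are independent apart from the shared best position.
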
Tunable negative-tap photonic microwave filter based on a cladding-mode coupler and an optically injected laser of large detuning.

Science.gov (United States)

Chan, Sze-Chun; Liu, Qing; Wang, Zhu; Chiang, Kin Seng

2011-06-20

A tunable negative-tap photonic microwave filter using a cladding-mode coupler together with optical injection locking of large wavelength detuning is demonstrated. Continuous and precise tunability of the filter is realized by physically sliding a pair of bare fibers inside the cladding-mode coupler. Signal inversion for the negative tap is achieved by optical injection locking of a single-mode semiconductor laser. To couple light into and out of the cladding-mode coupler, a pair of matching long-period fiber gratings is employed. The large bandwidth of the gratings requires injection locking of an exceptionally large wavelength detuning that has never been demonstrated before. Experimentally, injection locking with wavelength detuning as large as 27 nm was achieved, which corresponded to locking the 36-th side mode. Microwave filtering with a free-spectral range tunable from 88.6 MHz to 1.57 GHz and a notch depth larger than 35 dB was obtained.
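
The relation between tap delay and free-spectral range can be sketched in Python for an idealized two-tap delay-line filter with one negative tap, H(f) = 1 − exp(−j2πfτ). The two-tap, equal-amplitude structure is an assumption made here for illustration; the abstract does not give the filter's transfer function, and the function names are hypothetical.

```python
import cmath
import math


def delay_for_fsr(fsr_hz):
    """Tap delay (seconds) that gives the requested free-spectral range."""
    return 1.0 / fsr_hz


def response(f_hz, tau_s, tap=-1.0):
    """Magnitude of H(f) = 1 + tap * exp(-j 2 pi f tau)."""
    return abs(1 + tap * cmath.exp(-2j * math.pi * f_hz * tau_s))


# Delays implied by the two ends of the reported tuning range.
tau_low_fsr = delay_for_fsr(88.6e6)    # about 11.3 ns
tau_high_fsr = delay_for_fsr(1.57e9)   # about 0.637 ns

# With a negative tap the notches sit at DC and at multiples of the FSR.
print(response(0.0, tau_low_fsr), response(88.6e6 / 2, tau_low_fsr))
```

With a positive second tap the same structure would instead pass DC and place its notches halfway between multiples of the FSR, which is why a negative tap matters for building filters without a baseband passband. A finite notch depth, such as the 35 dB reported, corresponds to taps that are not perfectly equal in amplitude.
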
HVI Ballistic Performance Characterization of Non-Parallel Walls

Science.gov (United States)

Bohl, William; Miller, Joshua; Christiansen, Eric

2012-01-01

The Double-Wall, "Whipple" Shield [1] has been the subject of many hypervelocity impact studies and has proven to be an effective shield system for Micro-Meteoroid and Orbital Debris (MMOD) impacts for spacecraft. The US modules of the International Space Station (ISS), with their "bumper shields" offset from their pressure holding rear walls provide good examples of effective on-orbit use of the double wall shield. The concentric cylinder shield configuration with its large radius of curvature relative to separation distance is easily and effectively represented for testing and analysis as a system of two parallel plates. The parallel plate double wall configuration has been heavily tested and characterized for shield performance for normal and oblique impacts for the ISS and other programs. The double wall shield and principally similar Stuffed Whipple Shield are very common shield types for MMOD protection. However, in some locations with many spacecraft designs, the rear wall cannot be modeled as being parallel or concentric with the outer bumper wall. As represented in Figure 1, there is an included angle between the two walls. And, with a cylindrical outer wall, the effective included angle constantly changes. This complicates assessment of critical spacecraft components located within outer spacecraft walls when using software tools such as NASA's BumperII. In addition, the validity of the risk assessment comes into question when using the standard double wall shield equations, especially since verification testing of every set of double wall included angles is impossible.
Design of a highly parallel board-level-interconnection with 320 Gbps capacity

Science.gov (United States)

Lohmann, U.; Jahns, J.; Limmer, S.; Fey, D.; Bauer, H.

2012-01-01

A parallel board-level interconnection design is presented consisting of 32 channels, each operating at 10 Gbps. The hardware uses available optoelectronic components (VCSEL, TIA, pin-diodes) and a combination of planar-integrated free-space optics, fiber-bundles and available MEMS-components, like the DMD™ from Texas Instruments. As a specific feature, we present a new modular inter-board interconnect, realized by 3D fiber-matrix connectors. The performance of the interconnect is evaluated with regard to optical properties and power consumption. Finally, we discuss the application of the interconnect for strongly distributed system architectures, as, for example, in high performance embedded computing systems and data centers.
Parallel-In-Time For Moving Meshes

Energy Technology Data Exchange (ETDEWEB)

Falgout, R. D. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Manteuffel, T. A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Southworth, B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Schroder, J. B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

2016-02-04

With steadily growing computational resources available, scientists must develop effective ways to utilize the increased resources. High performance, highly parallel software has become a standard. However until recent years parallelism has focused primarily on the spatial domain. When solving a space-time partial differential equation (PDE), this leads to a sequential bottleneck in the temporal dimension, particularly when taking a large number of time steps. The XBraid parallel-in-time library was developed as a practical way to add temporal parallelism to existing sequential codes with only minor modifications. In this work, a rezoning-type moving mesh is applied to a diffusion problem and formulated in a parallel-in-time framework. Tests and scaling studies are run using XBraid and demonstrate excellent results for the simple model problem considered herein.
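
The idea of removing the sequential bottleneck in time can be sketched in Python with Parareal, a simpler scheme related to the multigrid-in-time approach of XBraid; it is not XBraid's own algorithm or API. The scalar test equation dy/dt = λy, the explicit Euler propagators, and all names and parameter values are assumptions chosen for illustration.

```python
def euler(y, t0, t1, steps, lam):
    """Explicit Euler for dy/dt = lam * y from t0 to t1 in the given number of steps."""
    h = (t1 - t0) / steps
    for _ in range(steps):
        y = y + h * lam * y
    return y


def parareal(y0, t_end, n_slices, iters, lam=-1.0, fine_steps=100):
    """Parareal iteration with a one-step coarse and a many-step fine propagator."""
    ts = [t_end * n / n_slices for n in range(n_slices + 1)]

    def coarse(y, n):
        return euler(y, ts[n], ts[n + 1], 1, lam)

    def fine(y, n):
        return euler(y, ts[n], ts[n + 1], fine_steps, lam)

    # Initial guess: one cheap serial sweep with the coarse propagator.
    u = [y0]
    for n in range(n_slices):
        u.append(coarse(u[n], n))

    for _ in range(iters):
        # These fine solves are independent per time slice (the parallel part).
        f = [fine(u[n], n) for n in range(n_slices)]
        g_old = [coarse(u[n], n) for n in range(n_slices)]
        # Serial correction sweep using only the cheap coarse propagator.
        new = [y0]
        for n in range(n_slices):
            new.append(coarse(new[n], n) + f[n] - g_old[n])
        u = new
    return u


def serial_fine(y0, t_end, n_slices, lam=-1.0, fine_steps=100):
    """Reference: the fine propagator applied slice after slice."""
    u = [y0]
    for n in range(n_slices):
        t0, t1 = t_end * n / n_slices, t_end * (n + 1) / n_slices
        u.append(euler(u[n], t0, t1, fine_steps, lam))
    return u


approx = parareal(1.0, 2.0, n_slices=8, iters=8)
reference = serial_fine(1.0, 2.0, n_slices=8)
print(max(abs(a - b) for a, b in zip(approx, reference)))
```

After as many iterations as there are time slices the result matches the serial fine solution up to rounding; the method pays off only when far fewer iterations are enough, since each iteration costs one parallel fine solve plus a serial coarse sweep.
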
Acoustic Coupler for the Acquisition of Coronary Artery Murmurs

DEFF Research Database (Denmark)

Zimmermann, Niels Henrik; Schmidt, Samuel; Hansen, John

...... and the microphone used was evaluated through a large study, where the coupler was used for recording the murmur sound from 464 heart patients. The power spectrum of the diastolic heart sounds was analyzed to determine the characteristics of the frequency spectrum. The preliminary results show that it was possible to record heart sound in the diastolic period with a sound pressure level approximately 30 dB above the noise floor of the microphone and recording system in the frequency range from 200-700 Hz. The capability of the sensor to record diastolic heart sound in the relevant frequency range indicates......
Impact of temperature on performance of series and parallel connected mono-crystalline silicon solar cells

Directory of Open Access Journals (Sweden)

Subhash Chander

2015-11-01

Full Text Available This paper presents a study on the impact of temperature on the performance of series and parallel connected mono-crystalline silicon (mono-Si) solar cells employing a solar simulator. The experiment was carried out at a constant light intensity of 550 W/m² with cell temperature in the range 25–60 °C for single, series and parallel connected mono-Si solar cells. The performance parameters like open circuit voltage, maximum power, fill factor and efficiency are found to decrease with cell temperature while the short circuit current is observed to increase. The experimental results reveal that silicon solar cells connected in series and parallel combinations follow Kirchhoff's laws and that temperature has a significant effect on the performance parameters of the solar cell.
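
The performance parameters named in this abstract follow from a few standard definitions, sketched here in Python. The cell values are hypothetical, and the series and parallel helpers assume identical, ideal cells, which is a simplification rather than the paper's measured behaviour.

```python
def fill_factor(p_max, v_oc, i_sc):
    """Fill factor: maximum power divided by the product V_oc * I_sc."""
    return p_max / (v_oc * i_sc)


def efficiency(p_max, irradiance, area):
    """Conversion efficiency: maximum electrical power over incident light power."""
    return p_max / (irradiance * area)


def series(v_oc, i_sc, n):
    """n identical cells in series: voltages add, the current is shared."""
    return n * v_oc, i_sc


def parallel(v_oc, i_sc, n):
    """n identical cells in parallel: currents add, the voltage is shared."""
    return v_oc, n * i_sc


# Hypothetical single cell at 550 W/m^2 (values are illustrative only).
v_oc, i_sc, p_max, area = 0.6, 3.0, 1.35, 0.0156
print(fill_factor(p_max, v_oc, i_sc))
print(efficiency(p_max, 550.0, area))
print(series(v_oc, i_sc, 2), parallel(v_oc, i_sc, 2))
```

The series and parallel rules are Kirchhoff's voltage and current laws applied to identical cells; real cells differ slightly, so a measured string is limited by its weakest member.
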

A Generic High-performance GPU-based Library for PDE solvers

DEFF Research Database (Denmark)

Glimberg, Stefan Lemvig; Engsig-Karup, Allan Peter

, the privilege of high-performance parallel computing is now in principle accessible for many scientific users, no matter their economic resources. Though being highly effective units, GPUs and parallel architectures in general, pose challenges for software developers to utilize their efficiency. Sequential...... legacy codes are not always easily parallelized and the time spent on conversion might not pay off in the end. We present a highly generic C++ library for fast assembling of partial differential equation (PDE) solvers, aiming at utilizing the computational resources of GPUs. The library requires a minimum...... of GPU computing knowledge, while still offering the possibility to customize user-specific solvers at kernel level if desired. Spatial differential operators are based on matrix free flexible order finite difference approximations. These matrix free operators minimize both memory consumption and main memory access...
High-Performance Modeling of Carbon Dioxide Sequestration by Coupling Reservoir Simulation and Molecular Dynamics

KAUST Repository

Bao, Kai; Yan, Mi; Allen, Rebecca; Salama, Amgad; Lu, Ligang; Jordan, Kirk E.; Sun, Shuyu; Keyes, David E.

2015-01-01

The present work describes a parallel computational framework for carbon dioxide (CO2) sequestration simulation by coupling reservoir simulation and molecular dynamics (MD) on massively parallel high-performance-computing (HPC) systems
NINJA: Java for High Performance Numerical Computing

Directory of Open Access Journals (Sweden)

José E. Moreira

2002-01-01

Full Text Available When Java was first introduced, there was a perception that its many benefits came at a significant performance cost. In the particularly performance-sensitive field of numerical computing, initial measurements indicated a hundred-fold performance disadvantage between Java and more established languages such as Fortran and C. Although much progress has been made, and Java now can be competitive with C/C++ in many important situations, significant performance challenges remain. Existing Java virtual machines are not yet capable of performing the advanced loop transformations and automatic parallelization that are now common in state-of-the-art Fortran compilers. Java also has difficulties in implementing complex arithmetic efficiently. These performance deficiencies can be attacked with a combination of class libraries (packages, in Java) that implement truly multidimensional arrays and complex numbers, and new compiler techniques that exploit the properties of these class libraries to enable other, more conventional, optimizations. Two compiler techniques, versioning and semantic expansion, can be leveraged to allow fully automatic optimization and parallelization of Java code. Our measurements with the NINJA prototype Java environment show that Java can be competitive in performance with highly optimized and tuned Fortran code.
High power test of RF window and coaxial line in vacuum

International Nuclear Information System (INIS)

Sun, D.; Champion, M.; Gormley, M.; Kerns, Q.; Koepke, K.; Moretti, A.

1993-01-01

Primary rf input couplers for the superconducting accelerating cavities of the TESLA electron linear accelerator test to be performed at DESY, Hamburg, Germany are under development at both DESY and Fermilab. The input couplers consist of a WR650 waveguide to coaxial line transition with an integral ceramic window, a coaxial connection to the superconducting accelerating cavity with a second ceramic window located at the liquid nitrogen heat intercept location, and bellows on both sides of the cold window to allow for cavity motion during cooldown, coupling adjustments and easier assembly. To permit in situ high peak power processing of the TESLA superconducting accelerating cavities, the input couplers are designed to transmit nominally 1 ms long, 2 MW peak, 1.3 GHz rf pulses from the WR650 waveguide at room temperature to the cavities at 1.8 K. The coaxial part of the Fermilab TESLA input coupler design has been tested up to 1.7 MW using the prototype 805 MHz rf source located at the A0 service building of the Tevatron. The rf source, the testing system and the test results are described
Visualizing Network Traffic to Understand the Performance of Massively Parallel Simulations

KAUST Repository

Landge, A. G.

2012-12-01

The performance of massively parallel applications is often heavily impacted by the cost of communication among compute nodes. However, determining how to best use the network is a formidable task, made challenging by the ever increasing size and complexity of modern supercomputers. This paper applies visualization techniques to aid parallel application developers in understanding the network activity by enabling a detailed exploration of the flow of packets through the hardware interconnect. In order to visualize this large and complex data, we employ two linked views of the hardware network. The first is a 2D view that represents the network structure as one of several simplified planar projections. This view is designed to allow a user to easily identify trends and patterns in the network traffic. The second is a 3D view that augments the 2D view by preserving the physical network topology and providing a context that is familiar to the application developers. Using the massively parallel multi-physics code pF3D as a case study, we demonstrate that our tool provides valuable insight that we use to explain and optimize pF3D's performance on an IBM Blue Gene/P system.
Aspects of magnetohydrodynamic duct flow at high magnetic Reynolds number

International Nuclear Information System (INIS)

Turner, R.B.

1973-07-01

The thesis is concerned with the performance of a flow coupler, which consists of an MHD generator coupled to an MHD pump so that one stream of fluid is induced to move by the motion of another. The flow coupler investigations include: the effects caused by eddy currents on the applied magnetic field and electric potential distribution, the velocity perturbation which occurs as a liquid flows through a magnetic field, devices in which large currents flow through a moving conductor and through an external circuit, and the movement of two conductors through the gap of a magnet. The expected performance of a flow coupler is calculated. (U.K.)
High Performance with Prescriptive Optimization and Debugging

DEFF Research Database (Denmark)

Jensen, Nicklas Bo

parallelization and automatic vectorization is attractive as it transparently optimizes programs. The thesis contributes an improved dependence analysis for explicitly parallel programs. These improvements lead to more loops being vectorized; on average we achieve a speedup of 1.46 over the existing dependence...... analysis and vectorizer in GCC. Automatic optimizations often fail for theoretical and practical reasons. When they fail we argue that a hybrid approach can be effective. Using compiler feedback, we propose to use the programmer’s intuition and insight to achieve high performance. Compiler feedback...... tells the programmer why a given optimization was not applied, and suggests how to change the source code to make it more amenable to optimizations. We show how this can yield significant speedups and achieve 2.4 times faster execution on a real industrial use case. To aid in parallel debugging we propose...
Customizable Memory Schemes for Data Parallel Architectures

NARCIS (Netherlands)

Gou, C.

2011-01-01

Memory system efficiency is crucial for any processor to achieve high performance, especially in the case of data parallel machines. Processing capabilities of parallel lanes will be wasted, when data requests are not accomplished in a sustainable and timely manner. Irregular vector memory accesses
Experiences in Data-Parallel Programming

Directory of Open Access Journals (Sweden)

Terry W. Clark

1997-01-01

Full Text Available To efficiently parallelize a scientific application with a data-parallel compiler requires certain structural properties in the source program, and conversely, the absence of others. A recent parallelization effort of ours reinforced this observation and motivated this correspondence. Specifically, we have transformed a Fortran 77 version of GROMOS, a popular dusty-deck program for molecular dynamics, into Fortran D, a data-parallel dialect of Fortran. During this transformation we have encountered a number of difficulties that probably are neither limited to this particular application nor do they seem likely to be addressed by improved compiler technology in the near future. Our experience with GROMOS suggests a number of points to keep in mind when developing software that may at some time in its life cycle be parallelized with a data-parallel compiler. This note presents some guidelines for engineering data-parallel applications that are compatible with Fortran D or High Performance Fortran compilers.
The Parallel Algorithm Based on Genetic Algorithm for Improving the Performance of Cognitive Radio

Directory of Open Access Journals (Sweden)

Liu Miao

2018-01-01

Full Text Available The intercarrier interference (ICI) problem of cognitive radio (CR) is severe. In this paper, the machine learning algorithm is used to obtain the optimal interference subcarriers of an unlicensed user (un-LU). Masking the optimal interference subcarriers can suppress the ICI of CR. Moreover, the parallel ICI suppression algorithm is designed to improve the calculation speed and meet the practical requirement of CR. Simulation results show that the data transmission rate threshold of un-LU can be set, the data transmission quality of un-LU can be ensured, the ICI of a licensed user (LU) is suppressed, and the bit error rate (BER) performance of LU is improved by implementing the parallel suppression algorithm. The ICI problem of CR is solved well by the new machine learning algorithm. The computing performance of the algorithm is improved by designing a new parallel structure and the communication performance of CR is enhanced.
Computational performance of a smoothed particle hydrodynamics simulation for shared-memory parallel computing

Science.gov (United States)

Nishiura, Daisuke; Furuichi, Mikito; Sakaguchi, Hide

2015-09-01

The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code.
Design and numerical optimization of a mode multiplexer based on few-mode fiber couplers

International Nuclear Information System (INIS)

Xie, Yiwei; Fu, Songnian; Liu, Hai; Zhang, Hailiang; Tang, Ming; Liu, Deming; Shum, P

2013-01-01

Mode division multiplexing (MDM) transmission based on few-mode fibers (FMFs) appears to be an alternative solution for overcoming the capacity limit of single-mode fibers (SMFs). A FMF coupler-based mode division multiplexer/demultiplexer (MMUX/DeMMUX) is proposed and theoretically investigated after the fabricated FMF is characterized. MMUXs/DeMMUXs with a mode contrast ratio (MCR) of more than 20 dB can be obtained for two-mode multiplexing and three-mode multiplexing over a wavelength span of 60 and 10 nm, respectively. We numerically verify the proposed MMUX/DeMMUX which has the advantages of high MCR, easy fabrication and maintenance, and low wavelength dependence. (paper)
High Performance, Three-Dimensional Bilateral Filtering

International Nuclear Information System (INIS)

Bethel, E. Wes

2008-01-01

Image smoothing is a fundamental operation in computer vision and image processing. This work has two main thrusts: (1) implementation of a bilateral filter suitable for use in smoothing, or denoising, 3D volumetric data; (2) implementation of the 3D bilateral filter in three different parallelization models, along with parallel performance studies on two modern HPC architectures. Our bilateral filter formulation is based upon the work of Tomasi [11], but extended to 3D for use on volumetric data. Our three parallel implementations use POSIX threads, the Message Passing Interface (MPI), and Unified Parallel C (UPC), a Partitioned Global Address Space (PGAS) language. Our parallel performance studies, which were conducted on a Cray XT4 supercomputer and a quad-socket, quad-core Opteron workstation, show our algorithm to have near-perfect scalability up to 120 processors. Parallel algorithms, such as the one we present here, will have an increasingly important role for use in production visual analysis systems as the underlying computational platforms transition from single- to multi-core architectures in the future.
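As a rough illustration of the kind of filter the abstract describes, the following is a minimal serial sketch of a 3D bilateral filter. It is not the author's implementation: it assumes the common Gaussian form of the bilateral weight (a spatial Gaussian multiplied by a range Gaussian on the intensity difference), and the function name `bilateral_filter_3d` and the parameters `radius`, `sigma_s` and `sigma_r` are hypothetical names chosen for this example.

```python
import math


def bilateral_filter_3d(vol, radius=1, sigma_s=1.0, sigma_r=0.1):
    """Smooth a 3D volume (nested lists indexed vol[z][y][x]).

    Assumed weight: spatial Gaussian (sigma_s) times range Gaussian (sigma_r).
    Neighbours that fall outside the volume are simply skipped.
    """
    nz, ny, nx = len(vol), len(vol[0]), len(vol[0][0])

    # Precompute the spatial weight of every offset in the cubic window.
    offsets = []
    for dz in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                d2 = dz * dz + dy * dy + dx * dx
                offsets.append((dz, dy, dx, math.exp(-d2 / (2.0 * sigma_s ** 2))))

    out = [[[0.0] * nx for _ in range(ny)] for _ in range(nz)]
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                center = vol[z][y][x]
                num = 0.0
                den = 0.0
                for dz, dy, dx, w_spatial in offsets:
                    zz, yy, xx = z + dz, y + dy, x + dx
                    if 0 <= zz < nz and 0 <= yy < ny and 0 <= xx < nx:
                        v = vol[zz][yy][xx]
                        # Range weight: small when intensities differ strongly,
                        # which is what preserves edges.
                        w_range = math.exp(-((v - center) ** 2) / (2.0 * sigma_r ** 2))
                        w = w_spatial * w_range
                        num += w * v
                        den += w
                # den >= 1 because the centre voxel always has weight 1.
                out[z][y][x] = num / den
    return out
```

Each output voxel depends only on the input volume, so the three outer loops can be split among workers without synchronisation. This independence is what makes a filter of this form a natural fit for thread, MPI or PGAS decompositions such as those the abstract compares.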
High Performance, Three-Dimensional Bilateral Filtering

Energy Technology Data Exchange (ETDEWEB)

Bethel, E. Wes

2008-06-05

Image smoothing is a fundamental operation in computer vision and image processing. This work has two main thrusts: (1) implementation of a bilateral filter suitable for use in smoothing, or denoising, 3D volumetric data; (2) implementation of the 3D bilateral filter in three different parallelization models, along with parallel performance studies on two modern HPC architectures. Our bilateral filter formulation is based upon the work of Tomasi [11], but extended to 3D for use on volumetric data. Our three parallel implementations use POSIX threads, the Message Passing Interface (MPI), and Unified Parallel C (UPC), a Partitioned Global Address Space (PGAS) language. Our parallel performance studies, which were conducted on a Cray XT4 supercomputer and a quad-socket, quad-core Opteron workstation, show our algorithm to have near-perfect scalability up to 120 processors. Parallel algorithms, such as the one we present here, will have an increasingly important role for use in production visual analysis systems as the underlying computational platforms transition from single- to multi-core architectures in the future.
PERFORMANCE ANALYSIS BETWEEN EXPLICIT SCHEDULING AND IMPLICIT SCHEDULING OF PARALLEL ARRAY-BASED DOMAIN DECOMPOSITION USING OPENMP

Directory of Open Access Journals (Sweden)

MOHAMMED FAIZ ABOALMAALY

2014-10-01

Full Text Available With the continuous revolution of multicore architecture, several parallel programming platforms have been introduced in order to pave the way for fast and efficient development of parallel algorithms. In terms of its categories, parallel computing can take two forms: Data-Level Parallelism (DLP) or Task-Level Parallelism (TLP). The former distributes data among the available processing elements, while the latter executes independent tasks concurrently. Most parallel programming platforms have built-in techniques for distributing data among processors; these techniques are technically known as automatic distribution (scheduling). However, due to their wide range of purposes, variation of data types, amount of distributed data, possibility of extra computational overhead and other hardware-dependent factors, manual distribution could achieve better outcomes in terms of performance when compared to the automatic distribution. In this paper, this assumption is investigated by conducting a comparison between automatic and our newly proposed manual distribution of data among threads in parallel. Empirical results of matrix addition and matrix multiplication show a considerable performance gain when manual distribution is applied against automatic distribution.
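The idea of manual (explicit) distribution can be sketched as follows. This is only an illustration of the partitioning step, not the scheme proposed in the paper, which is written for OpenMP: the rows of a matrix are split into contiguous blocks, one per worker, and each worker adds its own block. The names `block_bounds`, `add_rows` and `matrix_add_manual` are hypothetical, and Python threads are used purely to show the structure (in CPython they do not run pure-Python arithmetic in parallel, so no speedup should be expected from this sketch).

```python
from concurrent.futures import ThreadPoolExecutor


def block_bounds(n_rows, n_workers):
    """Split range(n_rows) into n_workers contiguous [start, stop) blocks.

    The first (n_rows % n_workers) blocks receive one extra row, so block
    sizes differ by at most one.
    """
    base, extra = divmod(n_rows, n_workers)
    bounds = []
    start = 0
    for w in range(n_workers):
        stop = start + base + (1 if w < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def add_rows(a, b, out, start, stop):
    """Add rows start..stop-1 of a and b into out (each worker owns its rows)."""
    for i in range(start, stop):
        out[i] = [x + y for x, y in zip(a[i], b[i])]


def matrix_add_manual(a, b, n_workers=4):
    """Matrix addition with an explicit, programmer-chosen row partition."""
    out = [None] * len(a)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [
            pool.submit(add_rows, a, b, out, start, stop)
            for start, stop in block_bounds(len(a), n_workers)
        ]
        for f in futures:
            f.result()  # re-raise any exception from a worker
    return out
```

Because every block writes to disjoint rows of `out`, no locking is needed. In an OpenMP setting the same bounds would typically be computed from the thread index inside a parallel region, instead of leaving the split to a loop scheduling clause.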
High speed ultra-broadband amplitude modulators with ultrahigh extinction >65 dB.

Science.gov (United States)

Liu, S; Cai, H; DeRose, C T; Davids, P; Pomerene, A; Starbuck, A L; Trotter, D C; Camacho, R; Urayama, J; Lentine, A

2017-05-15

We experimentally demonstrate ultrahigh extinction ratio (>65 dB) amplitude modulators (AMs) that can be electrically tuned to operate across a broad spectral range of 160 nm from 1480 to 1640 nm and 95 nm from 1280 to 1375 nm. Our on-chip AMs employ one extra coupler compared with conventional Mach-Zehnder interferometers (MZI), thus forming a cascaded MZI (CMZI) structure. Either directional or adiabatic couplers are used to compose the CMZI AMs and experimental comparisons are made between these two different structures. We investigate the performance of CMZI AMs under extreme conditions such as using 95:5 split ratio couplers and unbalanced waveguide losses. Electro-optic phase shifters are also integrated in the CMZI AMs for high-speed operation. Finally, we investigate the output optical phase when the amplitude is modulated, which provides valuable information when both amplitude and phase are to be controlled. Our demonstration not only paves the road to applications such as quantum information processing that require high extinction ratio AMs but also significantly alleviates the tight fabrication tolerance needed for large-scale integrated photonics.
Performance Assessment in a Heat Exchanger Tube with Opposite/Parallel Wing Twisted Tapes

Directory of Open Access Journals (Sweden)

S. Eiamsa-ard

2015-02-01

Full Text Available The thermohydraulic performance in a tube containing a modified twisted tape with alternate-axes and wing arrangements is reported. This work aims to investigate the effects of wing arrangements (opposite (O) and parallel (P) wings) at different wing shapes (triangle (Tri), rectangular (Rec), and trapezoidal (Tra) wings) on the thermohydraulic performance characteristics. The obtained results show that wing twisted tapes with all wing shape arrangements (O-Tri/O-Rec/O-Tra/P-Tri/P-Rec/P-Tra) give superior thermohydraulic performance and heat transfer rate to the typical twisted tape. In addition, the tapes with opposite wing arrangement of O-Tra, O-Rec, and O-Tri give superior thermohydraulic performances to those with parallel wing arrangement of P-Tra, P-Rec, and P-Tri by around 2.7%, 3.5%, and 3.2%, respectively.
High-performance computational fluid dynamics: a custom-code approach

International Nuclear Information System (INIS)

Fannon, James; Náraigh, Lennon Ó; Loiseau, Jean-Christophe; Valluri, Prashant; Bethune, Iain

2016-01-01

We introduce a modified and simplified version of the pre-existing fully parallelized three-dimensional Navier–Stokes flow solver known as TPLS. We demonstrate how the simplified version can be used as a pedagogical tool for the study of computational fluid dynamics (CFDs) and parallel computing. TPLS is at its heart a two-phase flow solver, and uses calls to a range of external libraries to accelerate its performance. However, in the present context we narrow the focus of the study to basic hydrodynamics and parallel computing techniques, and the code is therefore simplified and modified to simulate pressure-driven single-phase flow in a channel, using only relatively simple Fortran 90 code with MPI parallelization, but no calls to any other external libraries. The modified code is analysed in order to both validate its accuracy and investigate its scalability up to 1000 CPU cores. Simulations are performed for several benchmark cases in pressure-driven channel flow, including a turbulent simulation, wherein the turbulence is incorporated via the large-eddy simulation technique. The work may be of use to advanced undergraduate and graduate students as an introductory study in CFDs, while also providing insight for those interested in more general aspects of high-performance computing. (paper)
High-performance computational fluid dynamics: a custom-code approach

Science.gov (United States)

Fannon, James; Loiseau, Jean-Christophe; Valluri, Prashant; Bethune, Iain; Náraigh, Lennon Ó.

2016-07-01

We introduce a modified and simplified version of the pre-existing fully parallelized three-dimensional Navier-Stokes flow solver known as TPLS. We demonstrate how the simplified version can be used as a pedagogical tool for the study of computational fluid dynamics (CFDs) and parallel computing. TPLS is at its heart a two-phase flow solver, and uses calls to a range of external libraries to accelerate its performance. However, in the present context we narrow the focus of the study to basic hydrodynamics and parallel computing techniques, and the code is therefore simplified and modified to simulate pressure-driven single-phase flow in a channel, using only relatively simple Fortran 90 code with MPI parallelization, but no calls to any other external libraries. The modified code is analysed in order to both validate its accuracy and investigate its scalability up to 1000 CPU cores. Simulations are performed for several benchmark cases in pressure-driven channel flow, including a turbulent simulation, wherein the turbulence is incorporated via the large-eddy simulation technique. The work may be of use to advanced undergraduate and graduate students as an introductory study in CFDs, while also providing insight for those interested in more general aspects of high-performance computing.
High Performance Proactive Digital Forensics

International Nuclear Information System (INIS)

Alharbi, Soltan; Traore, Issa; Moa, Belaid; Weber-Jahnke, Jens

2012-01-01

With the increase in the number of digital crimes and in their sophistication, High Performance Computing (HPC) is becoming a must in Digital Forensics (DF). According to the FBI annual report, the size of data processed during the 2010 fiscal year reached 3,086 TB (compared to 2,334 TB in 2009) and the number of agencies that requested Regional Computer Forensics Laboratory assistance increased from 689 in 2009 to 722 in 2010. Since most investigation tools are both I/O and CPU bound, the next-generation DF tools are required to be distributed and offer HPC capabilities. The need for HPC is even more evident in investigating crimes on clouds or when proactive DF analysis and on-site investigation, requiring semi-real time processing, are performed. Although overcoming the performance challenge is a major goal in DF, as far as we know, there is almost no research on HPC-DF except for a few papers. As such, in this work, we extend our work on the need for a proactive system and present a high performance automated proactive digital forensic system. The most expensive phase of the system, namely proactive analysis and detection, uses a parallel extension of the iterative z algorithm. It also implements new parallel information-based outlier detection algorithms to proactively and forensically handle suspicious activities. To analyse a large number of targets and events and continuously do so (to capture the dynamics of the system), we rely on a multi-resolution approach to explore the digital forensic space. A data set from the Honeynet Forensic Challenge in 2001 is used to evaluate the system from DF and HPC perspectives.

High Performance Molecular Visualization: In-Situ and Parallel Rendering with EGL

Science.gov (United States)

Stone, John E.; Messmer, Peter; Sisneros, Robert; Schulten, Klaus

2016-01-01

Large scale molecular dynamics simulations produce terabytes of data that is impractical to transfer to remote facilities. It is therefore necessary to perform visualization tasks in-situ as the data are generated, or by running interactive remote visualization sessions and batch analyses co-located with direct access to high performance storage systems. A significant challenge for deploying visualization software within clouds, clusters, and supercomputers involves the operating system software required to initialize and manage graphics acceleration hardware. Recently, it has become possible for applications to use the Embedded-system Graphics Library (EGL) to eliminate the requirement for windowing system software on compute nodes, thereby eliminating a significant obstacle to broader use of high performance visualization applications. We outline the potential benefits of this approach in the context of visualization applications used in the cloud, on commodity clusters, and supercomputers. We discuss the implementation of EGL support in VMD, a widely used molecular visualization application, and we outline benefits of the approach for molecular visualization tasks on petascale computers, clouds, and remote visualization servers. We then provide a brief evaluation of the use of EGL in VMD, with tests using developmental graphics drivers on conventional workstations and on Amazon EC2 G2 GPU-accelerated cloud instance types. We expect that the techniques described here will be of broad benefit to many other visualization applications. PMID:27747137
Parallel optical control of spatiotemporal neuronal spike activity using high-frequency digital light processing technology

Directory of Open Access Journals (Sweden)

Jason eJerome

2011-08-01

Full Text Available Neurons in the mammalian neocortex receive inputs from and communicate back to thousands of other neurons, creating complex spatiotemporal activity patterns. The experimental investigation of these parallel dynamic interactions has been limited due to the technical challenges of monitoring or manipulating neuronal activity at that level of complexity. Here we describe a new massively parallel photostimulation system that can be used to control action potential firing in in vitro brain slices with high spatial and temporal resolution while performing extracellular or intracellular electrophysiological measurements. The system uses Digital-Light-Processing (DLP) technology to generate 2-dimensional (2D) stimulus patterns with >780,000 independently controlled photostimulation sites that operate at high spatial (5.4 µm) and temporal (>13 kHz) resolution. Light is projected through the quartz-glass bottom of the perfusion chamber, providing access to a large area (2.76 × 2.07 mm²) of the slice preparation. This system has the unique capability to induce temporally precise action potential firing in large groups of neurons distributed over a wide area covering several cortical columns. Parallel photostimulation opens up new opportunities for the in vitro experimental investigation of spatiotemporal neuronal interactions at a broad range of anatomical scales.
Design Procedure and Fabrication of Reproducible Silicon Vernier Devices for High-Performance Refractive Index Sensing.

Science.gov (United States)

Troia, Benedetto; Khokhar, Ali Z; Nedeljkovic, Milos; Reynolds, Scott A; Hu, Youfang; Mashanovich, Goran Z; Passaro, Vittorio M N

2015-06-10

In this paper, we propose a generalized procedure for the design of integrated Vernier devices for high performance chemical and biochemical sensing. In particular, we demonstrate the accurate control of the most critical design and fabrication parameters of silicon-on-insulator cascade-coupled racetrack resonators operating in the second regime of the Vernier effect, around 1.55 μm. The experimental implementation of our design strategies has allowed a rigorous and reliable investigation of the influence of racetrack resonator and directional coupler dimensions as well as of waveguide process variability on the operation of Vernier devices. Figures of merit of our Vernier architectures have been measured experimentally, evidencing a high reproducibility and a very good agreement with the theoretical predictions, as also confirmed by relative errors even lower than 1%. Finally, a Vernier gain as high as 30.3, average insertion loss of 2.1 dB and extinction ratio up to 30 dB have been achieved.
Incorporating Parallel Computing into the Goddard Earth Observing System Data Assimilation System (GEOS DAS)

Science.gov (United States)

Larson, Jay W.

1998-01-01

Atmospheric data assimilation is a method of combining actual observations with model forecasts to produce a more accurate description of the earth system than the observations or forecast alone can provide. The output of data assimilation, sometimes called the analysis, consists of regular, gridded datasets of observed and unobserved variables. Analysis plays a key role in numerical weather prediction and is becoming increasingly important for climate research. These applications, and the need for timely validation of scientific enhancements to the data assimilation system, pose computational demands that are best met by distributed parallel software. The mission of the NASA Data Assimilation Office (DAO) is to provide datasets for climate research and to support NASA satellite and aircraft missions. The system used to create these datasets is the Goddard Earth Observing System Data Assimilation System (GEOS DAS). The core components of the GEOS DAS are: the GEOS General Circulation Model (GCM), the Physical-space Statistical Analysis System (PSAS), the Observer, the on-line Quality Control (QC) system, the Coupler (which feeds analysis increments back to the GCM), and an I/O package for processing the large amounts of data the system produces (which will be described in another presentation in this session). The discussion will center on the following issues: the computational complexity for the whole GEOS DAS, assessment of the performance of the individual elements of GEOS DAS, and parallelization strategy for some of the components of the system.
Conference on High Performance Software for Nonlinear Optimization

CERN Document Server

Murli, Almerico; Pardalos, Panos; Toraldo, Gerardo

1998-01-01

This book contains a selection of papers presented at the conference on High Performance Software for Nonlinear Optimization (HPSNO97) which was held in Ischia, Italy, in June 1997. The rapid progress of computer technologies, including new parallel architectures, has stimulated a large amount of research devoted to building software environments and defining algorithms able to fully exploit this new computational power. In some sense, numerical analysis has to conform itself to the new tools. The impact of parallel computing in nonlinear optimization, which had a slow start at the beginning, seems now to increase at a fast rate, and it is reasonable to expect an even greater acceleration in the future. As with the first HPSNO conference, the goal of the HPSNO97 conference was to supply a broad overview of the more recent developments and trends in nonlinear optimization, emphasizing the algorithmic and high performance software aspects. Bringing together new computational methodologies with theoretical...
Shared Variable Oriented Parallel Precompiler for SPMD Model

Institute of Scientific and Technical Information of China (English)

None

1995-01-01

At present, commercial parallel computer systems with distributed memory architecture are usually provided with parallel FORTRAN or parallel C compilers, which are just traditional sequential FORTRAN or C compilers extended with communication statements. Programmers struggle to write parallel programs with communication statements. The Shared Variable Oriented Parallel Precompiler (SVOPP) proposed in this paper can automatically generate appropriate communication statements based on shared variables for the SPMD (Single Program Multiple Data) computation model and greatly eases parallel programming while retaining high communication efficiency. The core function of the parallel C precompiler has been successfully verified on a transputer-based parallel computer. Its prominent performance shows that SVOPP is probably a breakthrough in parallel programming technique.
The path toward HEP High Performance Computing

International Nuclear Information System (INIS)

Apostolakis, John; Brun, René; Gheata, Andrei; Wenzel, Sandro; Carminati, Federico

2014-01-01

High Energy Physics code has been known for making poor use of high performance computing architectures. Efforts in optimising HEP code on vector and RISC architectures have yielded limited results and recent studies have shown that, on modern architectures, it achieves a performance between 10% and 50% of the peak one. Although several successful attempts have been made to port selected codes on GPUs, no major HEP code suite has a 'High Performance' implementation. With LHC undergoing a major upgrade and a number of challenging experiments on the drawing board, HEP cannot any longer neglect the less-than-optimal performance of its code and it has to try making the best usage of the hardware. This activity is one of the foci of the SFT group at CERN, which hosts, among others, the Root and Geant4 projects. The activity of the experiments is shared and coordinated via a Concurrency Forum, where the experience in optimising HEP code is presented and discussed. Another activity is the Geant-V project, centred on the development of a high-performance prototype for particle transport. Achieving a good concurrency level on the emerging parallel architectures without a complete redesign of the framework can only be done by parallelizing at event level, or with a much larger effort at track level. Apart from the shareable data structures, this typically implies a multiplication factor in terms of memory consumption compared to the single threaded version, together with sub-optimal handling of event processing tails. Besides this, the low level instruction pipelining of modern processors cannot be used efficiently to speed up the program. We have implemented a framework that allows scheduling vectors of particles to an arbitrary number of computing resources in a fine grain parallel approach. The talk will review the current optimisation activities within the SFT group with a particular emphasis on the development perspectives towards a simulation framework able to profit
Parallel processing of structural integrity analysis codes

International Nuclear Information System (INIS)

Swami Prasad, P.; Dutta, B.K.; Kushwaha, H.S.

1996-01-01

Structural integrity analysis plays an important role in assessing and demonstrating the safety of nuclear reactor components. This analysis is performed using analytical tools such as the Finite Element Method (FEM) with the help of digital computers. The complexity of the problems involved in nuclear engineering demands high speed computation facilities to obtain solutions in a reasonable amount of time. Parallel processing systems such as ANUPAM provide an efficient platform for realising the high speed computation. The development and implementation of software on parallel processing systems is an interesting and challenging task. The data and algorithm structure of the codes plays an important role in exploiting the parallel processing system capabilities. Structural analysis codes based on FEM can be divided into two categories with respect to their implementation on parallel processing systems. Codes in the first category, such as those used for harmonic analysis and mechanistic fuel performance codes, do not require the parallelisation of individual modules of the codes. Codes in the second category, such as conventional FEM codes, require parallelisation of individual modules. In this category, parallelisation of the equation solution module poses major difficulties. Different solution schemes such as the domain decomposition method (DDM), the parallel active column solver and the substructuring method are currently used on parallel processing systems. Two codes, FAIR and TABS, belonging to each of these categories have been implemented on ANUPAM. The implementation details of these codes and the performance of different equation solvers are highlighted. (author). 5 refs., 12 figs., 1 tab
Analytical modeling and analysis of magnetic field and torque for novel axial flux eddy current couplers with PM excitation

Science.gov (United States)

Li, Zhao; Wang, Dazhi; Zheng, Di; Yu, Linxin

2017-10-01

Rotational permanent magnet eddy current couplers are promising devices for torque and speed transmission without any mechanical contact. In this study, flux-concentration disk-type permanent magnet eddy current couplers with double conductor rotor are investigated. Given the drawback of the accurate three-dimensional finite element method, this paper proposes a mixed two-dimensional analytical modeling approach. Based on this approach, the closed-form expressions of magnetic field, eddy current, electromagnetic force and torque for such devices are obtained. Finally, a three-dimensional finite element method is employed to validate the analytical results. Besides, a prototype is manufactured and tested for the torque-speed characteristic.
An investigation into the accuracy, stability and parallel performance of a highly stable explicit technique for stiff reaction-transport PDEs

Energy Technology Data Exchange (ETDEWEB)

Franz, A., LLNL

1998-02-17

The numerical simulation of chemically reacting flows is a topic that has attracted a great deal of current research. At the heart of numerical reactive flow simulations are large sets of coupled, nonlinear Partial Differential Equations (PDEs). Due to the stiffness that is usually present, explicit time differencing schemes are not used despite their inherent simplicity and efficiency on parallel and vector machines, since these schemes require prohibitively small numerical stepsizes. Implicit time differencing schemes, although possessing good stability characteristics, introduce a great deal of computational overhead necessary to solve the simultaneous algebraic system at each timestep. This thesis examines an algorithm based on a preconditioned time differencing scheme. The algorithm is explicit and permits a large stable time step. An investigation of the algorithm's accuracy, stability and performance on a parallel architecture is presented.
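The step-size restriction mentioned in this abstract can be illustrated on the simplest stiff test problem, y' = -lam * y. The sketch below is not the preconditioned scheme examined in the thesis; it only contrasts textbook forward (explicit) and backward (implicit) Euler. The function names and the value of `lam` are assumptions chosen for illustration.

```python
def forward_euler(lam, h, steps, y0=1.0):
    """Explicit Euler for y' = -lam * y: each step multiplies y by (1 - h*lam)."""
    y = y0
    for _ in range(steps):
        y = y + h * (-lam * y)
    return y


def backward_euler(lam, h, steps, y0=1.0):
    """Implicit Euler: solve y_new = y + h * (-lam * y_new) for y_new at each step."""
    y = y0
    for _ in range(steps):
        y = y / (1.0 + h * lam)
    return y


lam = 1000.0                 # hypothetical stiff decay rate
stable_limit = 2.0 / lam     # forward Euler is stable only for h < 2 / lam

explicit_small = forward_euler(lam, 0.0005, 200)  # h below the limit: decays
explicit_large = forward_euler(lam, 0.01, 20)     # h above the limit: blows up
implicit_large = backward_euler(lam, 0.01, 20)    # same large h: still decays

print(stable_limit, explicit_small, explicit_large, implicit_large)
```

For this scalar problem the implicit update is a single division; for a coupled PDE system it becomes a simultaneous algebraic solve at every time step, which is the overhead the abstract refers to.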
Modern industrial simulation tools: Kernel-level integration of high performance parallel processing, object-oriented numerics, and adaptive finite element analysis. Final report, July 16, 1993--September 30, 1997

Energy Technology Data Exchange (ETDEWEB)

Deb, M.K.; Kennon, S.R.

1998-04-01

A cooperative R&D effort between industry and the US government, this project, under the HPPP (High Performance Parallel Processing) initiative of the Dept. of Energy, started the investigations into parallel object-oriented (OO) numerics. The basic goal was to research and utilize the emerging technologies to create a physics-independent computational kernel for applications using the adaptive finite element method. The industrial team included Computational Mechanics Co., Inc. (COMCO) of Austin, TX (as the primary contractor), Scientific Computing Associates, Inc. (SCA) of New Haven, CT, Texaco and CONVEX. Sandia National Laboratory (Albq., NM) was the technology partner from the government side. COMCO had the responsibility of the main kernel design and development, SCA had the lead in parallel solver technology and guidance on OO technologies was Sandia's main expertise in this venture. CONVEX and Texaco supported the partnership by hardware resource and application knowledge, respectively. As such, a minimum of fifty-percent cost-sharing was provided by the industry partnership during this project. This report describes the R&D activities and provides some details about the prototype kernel and example applications.
Highly Parallelized Pattern Matching Execution for the ATLAS Experiment

CERN Document Server

Citraro, Saverio; The ATLAS collaboration

2015-01-01

The trigger system of the ATLAS experiment at LHC will extend its rejection capabilities during operations in 2015-2018 by introducing the Fast TracKer system (FTK). FTK is a hardware based system capable of finding charged particle tracks by analyzing hits in silicon detectors at the rate of 10^5 events per second. The core of track reconstruction is performed in two pipelined steps. In the first step, the candidate tracks are found by matching combinations of low resolution hits to predefined patterns; they are then used in the second step to seed a more precise track fitting algorithm. The key FTK component is an Associative Memory (AM) system that is used to perform pattern matching with a high degree of parallelism. The AM system implementation, the AM Serial Link Processor, is based on an extremely powerful network of 2 Gb/s serial links to sustain a huge traffic of data. We report on the design of the Serial Link Processor consisting of two types of boards, the Little Associative Memory Board (LAMB), a mezzan...
Highly Parallelized Pattern Matching Execution for the ATLAS Experiment

CERN Document Server

Citraro, Saverio; The ATLAS collaboration

2015-01-01

The Associative Memory (AM) system of the Fast Tracker (FTK) processor has been designed to perform pattern matching using as input the data from the silicon tracker in the ATLAS experiment. The AM is the primary component of the FTK system and is designed using ASIC technology (the AM chip) to execute pattern matching with a high degree of parallelism. The FTK system finds track candidates at low resolution that are seeds for a full resolution track fitting. The AM system implementation is named “Serial Link Processor” and is based on an extremely powerful network of 2 Gb/s serial links to sustain a huge traffic of data. This paper reports on the design of the Serial Link Processor consisting of two types of boards, the Little Associative Memory Board (LAMB), a mezzanine where the AM chips are mounted, and the Associative Memory Board (AMB), a 9U VME motherboard which hosts four LAMB daughterboards. We also report on the performance of the prototypes (both hardware and firmware) produced and ...
Design strategies for irregularly adapting parallel applications

International Nuclear Information System (INIS)

Oliker, Leonid; Biswas, Rupak; Shan, Hongzhang; Singh, Jaswinder Pal

2000-01-01

Achieving scalable performance for dynamic irregular applications is eminently challenging. Traditional message-passing approaches have been making steady progress towards this goal; however, they suffer from complex implementation requirements. The use of a global address space greatly simplifies the programming task, but can degrade the performance of dynamically adapting computations. In this work, we examine two major classes of adaptive applications, under five competing programming methodologies and four leading parallel architectures. Results indicate that it is possible to achieve message-passing performance using shared-memory programming techniques by carefully following the same high level strategies. Adaptive applications have computational work loads and communication patterns which change unpredictably at runtime, requiring dynamic load balancing to achieve scalable performance on parallel machines. Efficient parallel implementations of such adaptive applications are therefore a challenging task. This work examines the implementation of two typical adaptive applications, Dynamic Remeshing and N-Body, across various programming paradigms and architectural platforms. We compare several critical factors of the parallel code development, including performance, programmability, scalability, algorithmic development, and portability
Parallel Performance Optimizations on Unstructured Mesh-based Simulations

Energy Technology Data Exchange (ETDEWEB)

Sarje, Abhinav; Song, Sukhyun; Jacobsen, Douglas; Huck, Kevin; Hollingsworth, Jeffrey; Malony, Allen; Williams, Samuel; Oliker, Leonid

2015-01-01

© The Authors. Published by Elsevier B.V. This paper addresses two key parallelization challenges of the unstructured mesh-based ocean modeling code, MPAS-Ocean, which uses a mesh based on Voronoi tessellations: (1) load imbalance across processes, and (2) unstructured data access patterns that inhibit intra- and inter-node performance. Our work analyzes the load imbalance due to naive partitioning of the mesh, and develops methods to generate mesh partitioning with better load balance and reduced communication. Furthermore, we present methods that minimize both inter- and intra-node data movement and maximize data reuse. Our techniques include predictive ordering of data elements for higher cache efficiency, as well as communication reduction approaches. We present detailed performance data when running on thousands of cores using the Cray XC30 supercomputer and show that our optimization strategies can exceed the original performance by over 2×. Additionally, many of these solutions can be broadly applied to a wide variety of unstructured grid-based computations.
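The load imbalance that results from naive partitioning can be quantified as the ratio of the heaviest partition to the average partition load. The sketch below is not the partitioning method developed in the paper, and it ignores communication cost entirely; it only compares an equal-count block split with a simple greedy heaviest-first assignment on hypothetical per-cell work weights. All names and numbers are assumptions for illustration.

```python
import heapq


def imbalance(loads):
    """Ratio of the maximum partition load to the mean load (1.0 is perfect)."""
    return max(loads) / (sum(loads) / len(loads))


def block_partition(weights, parts):
    """Naive split into contiguous chunks of equal cell count, ignoring weights.

    Assumes len(weights) is a multiple of parts, as in the example below.
    """
    size = len(weights) // parts
    return [sum(weights[i:i + size]) for i in range(0, len(weights), size)]


def greedy_partition(weights, parts):
    """Assign cells heaviest-first to whichever partition is currently lightest."""
    heap = [(0, p) for p in range(parts)]
    for w in sorted(weights, reverse=True):
        load, p = heapq.heappop(heap)
        heapq.heappush(heap, (load + w, p))
    return sorted(load for load, _ in heap)


# Hypothetical per-cell work, e.g. deep-ocean cells cost more than shallow ones.
weights = [9, 8, 7, 6, 1, 1, 1, 1]

naive_loads = block_partition(weights, 4)
greedy_loads = greedy_partition(weights, 4)
print(naive_loads, imbalance(naive_loads))
print(greedy_loads, imbalance(greedy_loads))
```

A production mesh partitioner also has to keep neighbouring cells together to limit halo exchange, which this weight-only heuristic does not attempt.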
Connectionist Models and Parallelism in High Level Vision.

Science.gov (United States)

1985-01-01

Author: Jerome A. Feldman. Grant number: N00014-82-K-0193. Performing organization: Computer Science...Connectionist Models 2.1 Background and Overview. Computer science is just beginning to look seriously at parallel computation: it may turn out that...the chair. The program includes intermediate level networks that compute more complex joints and ones that compute parallelograms in the image. These
An efficient implementation of 3D high-resolution imaging for large-scale seismic data with GPU/CPU heterogeneous parallel computing

Science.gov (United States)

Xu, Jincheng; Liu, Wei; Wang, Jin; Liu, Linong; Zhang, Jianfeng

2018-02-01

De-absorption pre-stack time migration (QPSTM) compensates for the absorption and dispersion of seismic waves by introducing an effective Q parameter, thereby making it an effective tool for 3D, high-resolution imaging of seismic data. Although the optimal aperture obtained via stationary-phase migration reduces the computational cost of 3D QPSTM and yields 3D stationary-phase QPSTM, the associated computational efficiency is still the main problem in the processing of 3D, high-resolution images for real large-scale seismic data. In the current paper, we propose a division method for large-scale, 3D seismic data to optimize the performance of stationary-phase QPSTM on clusters of graphics processing units (GPU). Then, we design an imaging point parallel strategy to achieve an optimal parallel computing performance. Afterward, we adopt an asynchronous double buffering scheme for multi-stream to perform the GPU/CPU parallel computing. Moreover, several key optimization strategies of computation and storage based on the compute unified device architecture (CUDA) are adopted to accelerate the 3D stationary-phase QPSTM algorithm. Compared with the initial GPU code, the implementation of the key optimization steps, including thread optimization, shared memory optimization, register optimization and special function units (SFU), greatly improves the efficiency. A numerical example employing real large-scale, 3D seismic data shows that our scheme is nearly 80 times faster than the CPU-QPSTM algorithm. Our GPU/CPU heterogeneous parallel computing framework significantly reduces the computational cost and facilitates 3D high-resolution imaging for large-scale seismic data.
The STAPL Parallel Graph Library

KAUST Repository

Harshvardhan,

2013-01-01

This paper describes the stapl Parallel Graph Library, a high-level framework that abstracts the user from data-distribution and parallelism details and allows them to concentrate on parallel graph algorithm development. It includes a customizable distributed graph container and a collection of commonly used parallel graph algorithms. The library introduces pGraph pViews that separate algorithm design from the container implementation. It supports three graph processing algorithmic paradigms, level-synchronous, asynchronous and coarse-grained, and provides common graph algorithms based on them. Experimental results demonstrate improved scalability in performance and data size over existing graph libraries on more than 16,000 cores and on internet-scale graphs containing over 16 billion vertices and 250 billion edges. © Springer-Verlag Berlin Heidelberg 2013.
Multicenter Clinical Trial of Vibroplasty Couplers to Treat Mixed/Conductive Hearing Loss: First Results.

Science.gov (United States)

Zahnert, Thomas; Löwenheim, Hubert; Beutner, Dirk; Hagen, Rudolf; Ernst, Arneborg; Pau, Hans-Wilhelm; Zehlicke, Thorsten; Kühne, Hilke; Friese, Natascha; Tropitzsch, Anke; Lüers, Jan-Christoffer; Mlynski, Robert; Todt, Ingo; Hüttenbrink, Karl-Bernd

2016-01-01

To evaluate the safety and effectiveness of round window (RW), oval window (OW), CliP and Bell couplers for use with an active middle ear implant. This is a multicenter, long-term, prospective trial with consecutive enrollment, involving 6 university hospitals in Germany. Bone conduction, air conduction, implant-aided warble-tone thresholds and Freiburger monosyllable word recognition scores were compared with unaided preimplantation results in 28 moderate-to-profound hearing-impaired patients after 12 months of follow-up. All patients had previously undergone failed reconstruction surgeries (up to 5 or more). In a subset of patients, additional speech tests at 12 months postoperatively were used to compare the aided with the unaided condition after implantation with the processor switched off. An established quality-of-life questionnaire for hearing aids was used to determine patient satisfaction. Postoperative bone conduction remained stable. Mean functional gain for all couplers was 37 dB HL (RW = 42 dB, OW = 35 dB, Bell = 38 dB, CliP = 27 dB). The mean postoperative Freiburger monosyllable score was 71% at 65 dB SPL. The postimplantation mean SRT50 (speech reception in quiet for 50% understanding of words in sentences) improved on average by 23 dB over unaided testing and signal-to-noise ratios also improved in all patients. The International Outcome Inventory for Hearing Aids (IOI-HA) quality-of-life questionnaire was scored very positively by all patients. A significant improvement was seen with all couplers, and patients were satisfied with the device at 12 months postoperatively. These results demonstrate that an active implant is an advantage in achieving good hearing benefit in patients with prior failed reconstruction surgery. © 2016 S. Karger AG, Basel.
Fabrication of etched facets and vertical couplers in InP for packaging and on-wafer test

NARCIS (Netherlands)

Lemos Alvares Dos Santos, Rui; D'Agostino, D.; Soares, F. M.; Haghighi, H. Rabbani; Williams, K. A.; Leijtens, X. J. M.

2016-01-01

In this letter, the fabrication and the characterization of angled and straight etched facets in InP-based technology are reported. In addition, we report on etched facets combined with coupler mirrors for vertical outcoupling, realized with a wet-etching process.

Iteration schemes for parallelizing models of superconductivity

Energy Technology Data Exchange (ETDEWEB)

Gray, P.A. [Michigan State Univ., East Lansing, MI (United States)

1996-12-31

The time dependent Lawrence-Doniach model, valid for high fields and high values of the Ginzburg-Landau parameter, is often used for studying vortex dynamics in layered high-T_c superconductors. When solving these equations numerically, the added degrees of complexity due to the coupling and nonlinearity of the model often warrant the use of high-performance computers for their solution. However, the interdependence between the layers can be manipulated so as to allow parallelization of the computations at an individual layer level. The reduced parallel tasks may then be solved independently using a heterogeneous cluster of networked workstations connected together with Parallel Virtual Machine (PVM) software. Here, this parallelization of the model is discussed and several computational implementations of varying degrees of parallelism are presented. Computational results are also given which contrast properties of convergence speed, stability, and consistency of these implementations. Included in these results are models involving the motion of vortices due to an applied current and pinning effects due to various material properties.
Low-loss multimode interference couplers for terahertz waves

Science.gov (United States)

Themistos, Christos; Kalli, Kyriacos; Komodromos, Michael; Markides, Christos; Quadir, Anita; Rahman, B. M. Azizur; Grattan, Kenneth T. V.

2012-04-01

The terahertz (THz) frequency region of the electromagnetic spectrum is located between the traditional microwave spectrum and the optical frequencies, and offers significant scientific and technological potential in many fields, such as sensing, imaging and spectroscopy. Waveguiding in this intermediate spectral region is a major challenge. Amongst the various THz waveguides suggested, metal-clad plasmonic waveguides, and specifically hollow core structures coated with insulating material, are the most promising low-loss waveguides used in both active and passive devices. Optical power splitters are important components in the design of optoelectronic systems and optical communication networks such as Mach-Zehnder interferometric switches, polarization splitters and polarization scramblers. Several designs for the implementation of 3dB power splitters have been proposed in the past, such as the directional coupler-based approach, the Y-junction-based devices and the MMI-based approach. In the present paper a novel MMI-based 3dB THz wave splitter is implemented using Gold/polystyrene (PS) coated hollow glass rectangular waveguides. The H-field FEM based full-vector formulation is used here to calculate the complex propagation characteristics of the waveguide structure, and the finite element beam propagation method (FE-BPM) and finite difference time domain (FDTD) approach are used to demonstrate the performance of the proposed 3dB splitter.
Preliminary design of high-power wave-guide/transmission system

Indian Academy of Sciences (India)

... CW klystron followed by wave-guide filter, dual directional coupler, high-power circulator, three 3 dB magic TEE power dividers to split the main channel into four equal channels of 250 kW each. Each individual channel has dual directional couplers, flexible wave-guide sections and high power ceramic vacuum window.
BurstMem: A High-Performance Burst Buffer System for Scientific Applications

Energy Technology Data Exchange (ETDEWEB)

Wang, Teng [Auburn University, Auburn, Alabama; Oral, H Sarp [ORNL; Wang, Yandong [Auburn University, Auburn, Alabama; Settlemyer, Bradley W [ORNL; Atchley, Scott [ORNL; Yu, Weikuan [Auburn University, Auburn, Alabama

2014-01-01

The growth of computing power on large-scale systems requires a commensurate high-bandwidth I/O system. Many parallel file systems are designed to provide fast sustainable I/O in response to applications' soaring requirements. To meet this need, a novel system is imperative to temporarily buffer the bursty I/O and gradually flush datasets to long-term parallel file systems. In this paper, we introduce the design of BurstMem, a high-performance burst buffer system. BurstMem provides a storage framework with efficient storage and communication management strategies. Our experiments demonstrate that BurstMem is able to speed up the I/O performance of scientific applications by up to 8.5× on leadership computer systems.
Radiation-hard/high-speed parallel optical links

Energy Technology Data Exchange (ETDEWEB)

Gan, K.K., E-mail: gan@mps.ohio-state.edu [Department of Physics, The Ohio State University, Columbus, OH 43210 (United States); Buchholz, P.; Heidbrink, S. [Fachbereich Physik, Universität Siegen, Siegen (Germany); Kagan, H.P.; Kass, R.D.; Moore, J.; Smith, D.S. [Department of Physics, The Ohio State University, Columbus, OH 43210 (United States); Vogt, M.; Ziolkowski, M. [Fachbereich Physik, Universität Siegen, Siegen (Germany)

2016-09-21

We have designed and fabricated a compact parallel optical engine for transmitting data at 5 Gb/s. The device consists of a 4-channel ASIC driving a VCSEL (Vertical Cavity Surface Emitting Laser) array in an optical package. The ASIC is designed using only core transistors in a 65 nm CMOS process to enhance the radiation-hardness. The ASIC contains an 8-bit DAC to control the bias and modulation currents of the individual channels in the VCSEL array. The performance of the optical engine up to 5 Gb/s is satisfactory.
PRISMA/DB: A Parallel Main-Memory Relational DBMS

NARCIS (Netherlands)

Apers, Peter M.G.; Flokstra, Jan; van den Berg, Carel A.; Grefen, P.W.P.J.; Wilschut, A.N.; Kersten, Martin L.; van den Berg, C.A.

1992-01-01

PRISMA/DB, a full-fledged parallel, main memory relational database management system (DBMS) is described. PRISMA/DB's high performance is obtained by the use of parallelism for query processing and main memory storage of the entire database. A flexible architecture for experimenting with
High-performance file I/O in Java : existing approaches and bulk I/O extensions.

Energy Technology Data Exchange (ETDEWEB)

Bonachea, D.; Dickens, P.; Thakur, R.; Mathematics and Computer Science; Univ. of California at Berkeley; Illinois Institute of Technology

2001-07-01

There is a growing interest in using Java as the language for developing high-performance computing applications. To be successful in the high-performance computing domain, however, Java must not only be able to provide high computational performance, but also high-performance I/O. In this paper, we first examine several approaches that attempt to provide high-performance I/O in Java - many of which are not obvious at first glance - and evaluate their performance on two parallel machines, the IBM SP and the SGI Origin2000. We then propose extensions to the Java I/O library that address the deficiencies in the Java I/O API and improve performance dramatically. The extensions add bulk (array) I/O operations to Java, thereby removing much of the overhead currently associated with array I/O in Java. We have implemented the extensions in two ways: in a standard JVM using the Java Native Interface (JNI) and in a high-performance parallel dialect of Java called Titanium. We describe the two implementations and present performance results that demonstrate the benefits of the proposed extensions.
High performance image processing of SPRINT

Energy Technology Data Exchange (ETDEWEB)

DeGroot, T. [Lawrence Livermore National Lab., CA (United States)

1994-11-15

This talk will describe computed tomography (CT) reconstruction using filtered back-projection on SPRINT parallel computers. CT is a computationally intensive task, typically requiring several minutes to reconstruct a 512x512 image. SPRINT and other parallel computers can be applied to CT reconstruction to reduce computation time from minutes to seconds. SPRINT is a family of massively parallel computers developed at LLNL. SPRINT-2.5 is a 128-node multiprocessor whose performance can exceed twice that of a Cray-Y/MP. SPRINT-3 will be 10 times faster. Described will be the parallel algorithms for filtered back-projection and their execution on SPRINT parallel computers.
Parallelization of quantum molecular dynamics simulation code

International Nuclear Information System (INIS)

Kato, Kaori; Kunugi, Tomoaki; Shibahara, Masahiko; Kotake, Susumu

1998-02-01

A quantum molecular dynamics simulation code has been developed at the Kansai Research Establishment for the analysis of the thermalization of photon energies in molecules or materials. The simulation code is parallelized for both a scalar massively parallel computer (Intel Paragon XP/S75) and a vector parallel computer (Fujitsu VPP300/12). Scalable speed-up has been obtained on both parallel computers by distributing particle groups across processor units. By distributing not only particle groups but also the fine-grained calculations that make up each particle's computation, high parallelization performance is achieved on the Intel Paragon XP/S75. (author)
Parallel Application Performance on Two Generations of Intel Xeon HPC Platforms

Energy Technology Data Exchange (ETDEWEB)

Chang, Christopher H.; Long, Hai; Sides, Scott; Vaidhynathan, Deepthi; Jones, Wesley

2015-10-15

Two next-generation node configurations hosting the Haswell microarchitecture were tested with a suite of microbenchmarks and application examples, and compared with a current Ivy Bridge production node on NREL's Peregrine high-performance computing cluster. A primary conclusion from this study is that the additional cores are of little value to individual task performance: limitations to application parallelism, or resource contention among concurrently running but independent tasks, limit effective utilization of these added cores. Hyperthreading generally impacts throughput negatively, but can improve performance in the absence of detailed attention to runtime workflow configuration. The observations offer some guidance to procurement of future HPC systems at NREL. First, raw core count must be balanced with available resources, particularly memory bandwidth. Balance-of-system will determine value more than processor capability alone. Second, hyperthreading continues to be largely irrelevant to the workloads that are commonly seen, and were tested here, at NREL. Finally, perhaps the most impactful enhancement to productivity might occur through enabling multiple concurrent jobs per node. Given the right type and size of workload, more may be achieved by doing many slow things at once than by doing fast things in order.
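The closing remark about doing many slow things at once can be made concrete with Amdahl's law. The sketch below is an assumed toy model, not an analysis from the report: each job has a hypothetical parallel fraction `p`, run time is normalised to 1 on a single core, and resource contention between concurrent jobs (which the abstract flags as a real limit) is ignored. The core count and job count are illustrative values only.

```python
def run_time(cores, p):
    """Amdahl's law: normalised run time of one job on `cores` cores."""
    return (1.0 - p) + p / cores


def throughput(total_cores, concurrent_jobs, p):
    """Jobs finished per unit time when the node is split evenly among jobs."""
    cores_per_job = total_cores / concurrent_jobs
    return concurrent_jobs / run_time(cores_per_job, p)


p = 0.9            # hypothetical parallel fraction of each job
node_cores = 24    # hypothetical core count of one node

one_at_a_time = throughput(node_cores, 1, p)    # each job gets all 24 cores
four_at_once = throughput(node_cores, 4, p)     # each job gets 6 cores

print(one_at_a_time, four_at_once)
```

Under these assumptions each individual job runs slower on 6 cores than on 24, yet the node completes roughly twice as many jobs per unit time; shared memory bandwidth would narrow that gap in practice.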
RISC Processors and High Performance Computing

Science.gov (United States)

Bailey, David H.; Saini, Subhash; Craw, James M. (Technical Monitor)

1995-01-01

This tutorial will discuss the top five RISC microprocessors and the parallel systems in which they are used. It will provide a unique cross-machine comparison not available elsewhere. The effective performance of these processors will be compared by citing standard benchmarks in the context of real applications. The latest NAS Parallel Benchmarks, both absolute performance and performance per dollar, will be listed. The next generation of the NPB will be described. The tutorial will conclude with a discussion of future directions in the field. Technology Transfer Considerations: All of these computer systems are commercially available internationally. Information about these processors is available in the public domain, mostly from the vendors themselves. The NAS Parallel Benchmarks and their results have been previously approved numerous times for public release, beginning back in 1991.
Generalized MHD for numerical stability analysis of high-performance plasmas in tokamaks

International Nuclear Information System (INIS)

Mikhailovskii, A.B.

1998-01-01

A set of generalized magnetohydrodynamic (MHD) equations is formulated to accommodate the effects associated with high ion and electron temperatures in high-performance plasmas in tokamaks. The effects of neoclassical bootstrap current, neoclassical ion viscosity, the ion finite Larmor radius effect and electron and ion drift effects are taken into account in two-fluid MHD equations together with gyroviscosity, parallel viscosity, electron parallel inertia and collisionless ion heat flux. The ion velocity is identified as the plasma velocity, while the electron velocity is expressed in terms of the plasma velocity and electric current. Ion and electron momentum equations are combined to give the plasma momentum equation. The perpendicular (with respect to the equilibrium magnetic field) ion momentum equation is used as perpendicular Ohm's law and the parallel electron momentum equation - as parallel Ohm's law. Perpendicular Ohm's law allows for the Hall and ion drift effects. Parallel Ohm's law includes the electron drift effect, collisionless skin effect and bootstrap current. In addition, both perpendicular and parallel Ohm's laws contain the resistivity. Due to the quasineutrality condition, the ions and electrons are characterized by the same number density which is described by the ion continuity equation. On the other hand, the ion and electron temperatures are allowed to be different. The ion temperature is described by the ion energy equation allowing for the oblique heat flux, in addition to the perpendicular ion heat flux. The electron temperature is determined by the condition of high parallel electron heat conductivity. The ion and electron parallel viscosities are represented in a form valid for all the collisionality regimes (Pfirsch-Schluter, plateau, and banana). An optimized form of the generalized MHD equations is then represented in terms of the toroidal coordinate system used in the JET equilibrium and stability codes. The derived equations
Parallel sparse direct solver for integrated circuit simulation

CERN Document Server

Chen, Xiaoming; Yang, Huazhong

2017-01-01

This book describes algorithmic methods and parallelization techniques to design a parallel sparse direct solver which is specifically targeted at integrated circuit simulation problems. The authors describe a complete flow and detailed parallel algorithms of the sparse direct solver. They also show how to improve the performance by simple but effective numerical techniques. The sparse direct solver techniques described can be applied to any SPICE-like integrated circuit simulator and have been proven to be high-performance in actual circuit simulation. Readers will benefit from the state-of-the-art parallel integrated circuit simulation techniques described in this book, especially the latest parallel sparse matrix solution techniques. · Introduces complicated algorithms of sparse linear solvers, using concise principles and simple examples, without complex theory or lengthy derivations; · Describes a parallel sparse direct solver that can be adopted to accelerate any SPICE-like integrated circuit simulato...
SISYPHUS: A high performance seismic inversion factory

Science.gov (United States)

Gokhberg, Alexey; Simutė, Saulė; Boehm, Christian; Fichtner, Andreas

2016-04-01

In recent years, massively parallel high-performance computers have become the standard instruments for solving forward and inverse problems in seismology. The software packages for forward and inverse waveform modelling designed specifically for such computers (SPECFEM3D, SES3D) have become mature and widely available. These packages achieve significant computational performance and allow researchers to solve larger problems at higher resolution in less time. However, a typical seismic inversion process contains various activities that are beyond the common solver functionality. They include management of information on seismic events and stations, 3D models, observed and synthetic seismograms, pre-processing of the observed signals, computation of misfits and adjoint sources, minimization of misfits, and process workflow management. These activities are time consuming, seldom sufficiently automated, and therefore represent a bottleneck that can substantially offset performance benefits provided by even the most powerful modern supercomputers. Furthermore, the typical system architecture of modern supercomputing platforms is oriented towards maximum computational performance and provides limited standard facilities for automating the supporting activities. We present a prototype solution that automates all aspects of the seismic inversion process and is tuned for modern massively parallel high performance computing systems. We address several major aspects of the solution architecture, which include (1) design of an inversion state database for tracing all relevant aspects of the entire solution process, (2) design of an extensible workflow management framework, (3) integration with wave propagation solvers, (4) integration with optimization packages, (5) computation of misfits and adjoint sources, and (6) process monitoring. The inversion state database represents a hierarchical structure with
Kalman Filter Tracking on Parallel Architectures

International Nuclear Information System (INIS)

Cerati, Giuseppe; Elmer, Peter; Krutelyov, Slava; Lantz, Steven; Lefebvre, Matthieu; McDermott, Kevin; Riley, Daniel; Tadel, Matevž; Wittich, Peter; Würthwein, Frank; Yagil, Avi

2016-01-01

Power density constraints are limiting the performance improvements of modern CPUs. To address this we have seen the introduction of lower-power, multi-core processors such as GPGPU, ARM and Intel MIC. In order to achieve the theoretical performance gains of these processors, it will be necessary to parallelize algorithms to exploit larger numbers of lightweight cores and specialized functions like large vector units. Track finding and fitting is one of the most computationally challenging problems for event reconstruction in particle physics. At the High-Luminosity Large Hadron Collider (HL-LHC), for example, this will be by far the dominant problem. The need for greater parallelism has driven investigations of very different track finding techniques such as Cellular Automata or Hough Transforms. The most common track finding techniques in use today, however, are those based on a Kalman filter approach. Significant experience has been accumulated with these techniques on real tracking detector systems, both in the trigger and offline. They are known to provide high physics performance, are robust, and are in use today at the LHC. Given the utility of the Kalman filter in track finding, we have begun to port these algorithms to parallel architectures, namely Intel Xeon and Xeon Phi. We report here on our progress towards an end-to-end track reconstruction algorithm fully exploiting vectorization and parallelization techniques in a simplified experimental environment
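As a rough illustration of the Kalman filter approach mentioned in the abstract above, the sketch below runs a scalar predict/update cycle over a sequence of measurements. It is not the authors' vectorized track fitter: the function names `kalman_step` and `filter_hits`, the static state model, and the noise values `q` and `r` are all assumptions made for the example.

```python
def kalman_step(x, p, z, q=1e-3, r=0.25):
    """One predict/update cycle of a scalar Kalman filter.

    x, p: prior state estimate and its variance
    z:    new measurement
    q, r: process and measurement noise variances (illustrative values)
    """
    # Predict: static model, so the state carries over and uncertainty grows.
    x_pred = x
    p_pred = p + q
    # Update: blend prediction and measurement using the Kalman gain.
    k = p_pred / (p_pred + r)
    x_new = x_pred + k * (z - x_pred)
    p_new = (1.0 - k) * p_pred
    return x_new, p_new


def filter_hits(hits, x0=0.0, p0=1.0):
    """Fold a sequence of measurements into a single estimate."""
    x, p = x0, p0
    for z in hits:
        x, p = kalman_step(x, p, z)
    return x, p
```

Each hit is processed independently of the hits on other tracks, which is one reason the method lends itself to processing many track candidates side by side.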
High Performance Computing in Science and Engineering '99 : Transactions of the High Performance Computing Center

CERN Document Server

Jäger, Willi

2000-01-01

The book contains reports about the most significant projects from science and engineering of the Federal High Performance Computing Center Stuttgart (HLRS). They were carefully selected in a peer-review process and are showcases of an innovative combination of state-of-the-art modeling, novel algorithms and the use of leading-edge parallel computer technology. The projects of HLRS are using supercomputer systems operated jointly by university and industry and therefore a special emphasis has been put on the industrial relevance of results and methods.
High Performance Processing and Analysis of Geospatial Data Using CUDA on GPU

Directory of Open Access Journals (Sweden)

STOJANOVIC, N.

2014-11-01

In this paper, the high-performance processing of massive geospatial data on a many-core GPU (Graphics Processing Unit) is presented. We use the CUDA (Compute Unified Device Architecture) programming framework to implement parallel processing of common Geographic Information Systems (GIS) algorithms, such as viewshed analysis and map-matching. Experimental evaluation indicates an improvement in performance with respect to CPU-based solutions and shows the feasibility of using GPU and CUDA for parallel implementation of GIS algorithms over large-scale geospatial datasets.
The Galley Parallel File System

Science.gov (United States)

Nieuwejaar, Nils; Kotz, David

1996-01-01

Most current multiprocessor file systems are designed to use multiple disks in parallel, using the high aggregate bandwidth to meet the growing I/O requirements of parallel scientific applications. Many multiprocessor file systems provide applications with a conventional Unix-like interface, allowing the application to access multiple disks transparently. This interface conceals the parallelism within the file system, increasing the ease of programmability, but making it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. In addition to providing an insufficient interface, most current multiprocessor file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic scientific multiprocessor workloads. We discuss Galley's file structure and application interface, as well as the performance advantages offered by that interface.
Scientific Programming with High Performance Fortran: A Case Study Using the xHPF Compiler

Directory of Open Access Journals (Sweden)

Eric De Sturler

1997-01-01

Recently, the first commercial High Performance Fortran (HPF) subset compilers have appeared. This article reports on our experiences with the xHPF compiler of Applied Parallel Research, version 1.2, for the Intel Paragon. At this stage, we do not expect very high performance from our HPF programs, even though performance will eventually be of paramount importance for the acceptance of HPF. Instead, our primary objective is to study how to convert large Fortran 77 (F77) programs to HPF such that the compiler generates reasonably efficient parallel code. We report on a case study that identifies several problems when parallelizing code with HPF; most of these problems affect current HPF compiler technology in general, although some are specific to the xHPF compiler. We discuss our solutions from the perspective of the scientific programmer, and present timing results on the Intel Paragon. The case study comprises three programs of different complexity with respect to parallelization. We use the dense matrix-matrix product to show that the distribution of arrays and the order of nested loops significantly influence the performance of the parallel program. We use Gaussian elimination with partial pivoting to study the parallelization strategy of the compiler. There are various ways to structure this algorithm for a particular data distribution. This example shows how much effort may be demanded from the programmer to support the compiler in generating an efficient parallel implementation. Finally, we use a small application to show that the more complicated structure of a larger program may introduce problems for the parallelization, even though all subroutines of the application are easy to parallelize by themselves. The application consists of a finite volume discretization on a structured grid and a nested iterative solver. Our case study shows that it is possible to obtain reasonably efficient parallel programs with xHPF, although the compiler
Acoustic coupler for acquisition of coronary artery murmurs

DEFF Research Database (Denmark)

Zimmermann, Niels Henrik; Schmidt, Samuel; Hansen, John

2011-01-01

in a clinical trial including 463 patients referred for elective coronary angiography. The preliminary results show that it was possible to record heart sound in the diastolic period with a sound pressure level approximately 30 dB above the noise floor of the microphone and recording system in the frequency...... of the coupler, while the low frequency behavior was determined by the properties of the microphone, electronic circuits and inadvertent leakages in the acoustical coupling. The requirement for the microphone and pre-amplifier was a low inherent noise level. The setup was used for collection of heart sounds...... range from 200-700 Hz. The capability of the sensor to record diastolic heart sound in the relevant frequency range indicates that the sensor is suitable for recording of coronary murmurs....

High performance simulation for the Silva project using the tera computer

International Nuclear Information System (INIS)

Bergeaud, V.; La Hargue, J.P.; Mougery, F.; Boulet, M.; Scheurer, B.; Le Fur, J.F.; Comte, M.; Benisti, D.; Lamare, J. de; Petit, A.

2003-01-01

In the context of the SILVA Project (Atomic Vapor Laser Isotope Separation), numerical simulation of the plant scale propagation of laser beams through uranium vapour was a great challenge. The PRODIGE code has been developed to achieve this goal. Here we focus on the task of achieving high performance simulation on the TERA computer. We describe the main issues for optimizing the parallelization of the PRODIGE code on TERA. Thus, we discuss advantages and drawbacks of the implemented diagonal parallelization scheme. As a consequence, it has been found fruitful to tune the code in three respects: memory allocation, MPI communications and interconnection network bandwidth usage. We stress the value of MPI/IO in this context and the benefit obtained for production computations on TERA. Finally, we shall illustrate our developments. We indicate some performance measurements reflecting the good parallelization properties of PRODIGE on the TERA computer. The code is currently used for demonstrating the feasibility of the laser propagation at a plant enrichment level and for preparing the 2003 Menphis experiment. We conclude by emphasizing the contribution of high performance TERA simulation to the project. (authors)
High performance simulation for the Silva project using the tera computer

Energy Technology Data Exchange (ETDEWEB)

Bergeaud, V.; La Hargue, J.P.; Mougery, F. [CS Communication and Systemes, 92 - Clamart (France); Boulet, M.; Scheurer, B. [CEA Bruyeres-le-Chatel, 91 - Bruyeres-le-Chatel (France); Le Fur, J.F.; Comte, M.; Benisti, D.; Lamare, J. de; Petit, A. [CEA Saclay, 91 - Gif sur Yvette (France)

2003-07-01

In the context of the SILVA Project (Atomic Vapor Laser Isotope Separation), numerical simulation of the plant scale propagation of laser beams through uranium vapour was a great challenge. The PRODIGE code has been developed to achieve this goal. Here we focus on the task of achieving high performance simulation on the TERA computer. We describe the main issues for optimizing the parallelization of the PRODIGE code on TERA. Thus, we discuss advantages and drawbacks of the implemented diagonal parallelization scheme. As a consequence, it has been found fruitful to tune the code in three respects: memory allocation, MPI communications and interconnection network bandwidth usage. We stress the value of MPI/IO in this context and the benefit obtained for production computations on TERA. Finally, we shall illustrate our developments. We indicate some performance measurements reflecting the good parallelization properties of PRODIGE on the TERA computer. The code is currently used for demonstrating the feasibility of the laser propagation at a plant enrichment level and for preparing the 2003 Menphis experiment. We conclude by emphasizing the contribution of high performance TERA simulation to the project. (authors)
An efficient parallel algorithm for matrix-vector multiplication

Energy Technology Data Exchange (ETDEWEB)

Hendrickson, B.; Leland, R.; Plimpton, S.

1993-03-01

The multiplication of a vector by a matrix is the kernel computation of many algorithms in scientific computation. A fast parallel algorithm for this calculation is therefore necessary if one is to make full use of the new generation of parallel supercomputers. This paper presents a high performance, parallel matrix-vector multiplication algorithm that is particularly well suited to hypercube multiprocessors. For an n x n matrix on p processors, the communication cost of this algorithm is O(n/√p + log(p)), independent of the matrix sparsity pattern. The performance of the algorithm is demonstrated by employing it as the kernel in the well-known NAS conjugate gradient benchmark, where a run time of 6.09 seconds was observed. This is the best published performance on this benchmark achieved to date using a massively parallel supercomputer.
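The abstract does not spell out the algorithm, so the following is only a serial sketch of a generic two-dimensional block decomposition, which is one common way a per-processor vector segment of length n/√p arises. The function name `block_matvec`, the grid parameter `q` (with p = q·q processors), and the requirement that `q` divides n are assumptions for illustration; the hypercube communication itself is not modelled.

```python
def block_matvec(a, x, q):
    """Compute y = A x by simulating a q-by-q processor grid (p = q*q).

    Processor (i, j) owns the (n/q)-by-(n/q) block A_ij and the j-th
    segment of x; partial products are then summed along each grid row.
    Assumes q divides n. This is a serial simulation, not a parallel code.
    """
    n = len(x)
    b = n // q  # block edge length, i.e. n / sqrt(p)
    y = [0.0] * n
    for i in range(q):          # grid row
        for j in range(q):      # grid column
            # Local work on processor (i, j): block times x segment.
            partial = [
                sum(a[i * b + r][j * b + c] * x[j * b + c] for c in range(b))
                for r in range(b)
            ]
            # Row-wise reduction: each processor contributes n/q values.
            for r in range(b):
                y[i * b + r] += partial[r]
    return y
```

In this layout every simulated processor touches only n/q entries of the input and output vectors, regardless of which matrix entries are nonzero.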
Performance of DS-CDMA systems with optimal hard-decision parallel interference cancellation

NARCIS (Netherlands)

Hofstad, van der R.W.; Klok, M.J.

2003-01-01

We study a multiuser detection system for code-division multiple access (CDMA). We show that applying multistage hard-decision parallel interference cancellation (HD-PIC) significantly improves performance compared to the matched filter system. In (multistage) HD-PIC, estimates of the interfering
Data parallel sorting for particle simulation

Science.gov (United States)

Dagum, Leonardo

1992-01-01

Sorting on a parallel architecture is a communications intensive event which can incur a high penalty in applications where it is required. In the case of particle simulation, only integer sorting is necessary, and sequential implementations easily attain the minimum performance bound of O(N) for N particles. Parallel implementations, however, have to cope with the parallel sorting problem which, in addition to incurring a heavy communications cost, can make the minimum performance bound difficult to attain. This paper demonstrates how the sorting problem in a particle simulation can be reduced to a merging problem, and describes an efficient data parallel algorithm to solve this merging problem in a particle simulation. The new algorithm is shown to be optimal under conditions usual for particle simulation, and its fieldwise implementation on the Connection Machine is analyzed in detail. The new algorithm is about four times faster than a fieldwise implementation of radix sort on the Connection Machine.
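To make the sequential O(N) integer sort concrete, here is a minimal counting-sort sketch that orders particles by cell index. It illustrates only the sequential baseline, not the data parallel merging algorithm of the paper; the function name `sort_particles_by_cell` and the representation of particles as a list of integer cell indices are assumptions.

```python
def sort_particles_by_cell(cells, num_cells):
    """Return particle indices ordered by cell, using a counting sort.

    cells[i] is the integer cell index of particle i, with
    0 <= cells[i] < num_cells. Runs in O(N + num_cells) time and is stable.
    """
    counts = [0] * num_cells
    for c in cells:
        counts[c] += 1
    # Exclusive prefix sum gives the first output slot of each cell.
    starts = [0] * num_cells
    for c in range(1, num_cells):
        starts[c] = starts[c - 1] + counts[c - 1]
    order = [0] * len(cells)
    for i, c in enumerate(cells):
        order[starts[c]] = i
        starts[c] += 1
    return order
```

For example, `sort_particles_by_cell([2, 0, 1, 0, 2], 3)` lists the two particles in cell 0 first, then the one in cell 1, then the two in cell 2.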
Parallel Algorithms for the Exascale Era

Energy Technology Data Exchange (ETDEWEB)

Robey, Robert W. [Los Alamos National Laboratory

2016-10-19

New parallel algorithms are needed to reach the Exascale level of parallelism with millions of cores. We look at some of the research developed by students in projects at LANL. The research blends ideas from the early days of computing while weaving in the fresh approach brought by students new to the field of high performance computing. We look at reproducibility of global sums and why it is important to parallel computing. Next we look at how the concept of hashing has led to the development of more scalable algorithms suitable for next-generation parallel computers. Nearly all of this work has been done by undergraduates and published in leading scientific journals.
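The reproducibility issue with global sums can be seen in a few lines: floating-point addition is not associative, so the order in which partial sums are combined changes the result. The sketch below contrasts a naive left-to-right sum with Python's `math.fsum`, which returns a correctly rounded sum. Using `math.fsum` is a stand-in chosen for this example and is not claimed to be the technique developed at LANL; the helper names are hypothetical.

```python
import math


def naive_sum(values):
    """Left-to-right floating-point accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total


def compare_orderings():
    """Sum the same three numbers in two different orders."""
    values = [1e16, 1.0, -1e16]
    reordered = [1e16, -1e16, 1.0]
    return {
        "naive": (naive_sum(values), naive_sum(reordered)),
        "fsum": (math.fsum(values), math.fsum(reordered)),
    }
```

In a parallel reduction the combination order typically depends on the processor count and layout, so a naive sum can differ from run to run, whereas an order-independent summation gives the same answer each time.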
Turbostratic stacked CVD graphene for high-performance devices

Science.gov (United States)

Uemura, Kohei; Ikuta, Takashi; Maehashi, Kenzo

2018-03-01

We have fabricated turbostratic stacked graphene with high-transport properties by the repeated transfer of CVD monolayer graphene. The turbostratic stacked CVD graphene exhibited higher carrier mobility and conductivity than CVD monolayer graphene. The electron mobility for the three-layer turbostratic stacked CVD graphene surpassed 10,000 cm² V⁻¹ s⁻¹ at room temperature, which is five times greater than that for CVD monolayer graphene. The results indicate that the high performance is derived from maintenance of the linear band dispersion, suppression of the carrier scattering, and parallel conduction. Therefore, turbostratic stacked CVD graphene is a superior material for high-performance devices.
Parallel Scaling Characteristics of Selected NERSC User Project Codes

Energy Technology Data Exchange (ETDEWEB)

Skinner, David; Verdier, Francesca; Anand, Harsh; Carter, Jonathan; Durst, Mark; Gerber, Richard

2005-03-05

This report documents parallel scaling characteristics of NERSC user project codes between Fiscal Year 2003 and the first half of Fiscal Year 2004 (Oct 2002-March 2004). The codes analyzed cover 60% of all the CPU hours delivered during that time frame on seaborg, a 6080 CPU IBM SP and the largest parallel computer at NERSC. The scale in terms of concurrency and problem size of the workload is analyzed. Drawing on batch queue logs, performance data and feedback from researchers we detail the motivations, benefits, and challenges of implementing highly parallel scientific codes on current NERSC High Performance Computing systems. An evaluation and outlook of the NERSC workload for Allocation Year 2005 is presented.
Data driven parallelism in experimental high energy physics applications

International Nuclear Information System (INIS)

Pohl, M.

1987-01-01

I present global design principles for the implementation of high energy physics data analysis code on sequential and parallel processors with mixed shared and local memory. Potential parallelism in the structure of high energy physics tasks is identified with granularity varying from a few times 10^8 instructions all the way down to a few times 10^4 instructions. It follows the hierarchical structure of detector and data acquisition systems. To take advantage of this, while preserving the necessary portability of the code, I propose a computational model with purely data driven concurrency in Single Program Multiple Data (SPMD) mode. The task granularity is defined by varying the granularity of the central data structure manipulated. Concurrent processes coordinate themselves asynchronously using simple lock constructs on parts of the data structure. Load balancing among processes occurs naturally. The scheme allows the internal layout of the data structure to be mapped closely onto the layout of local and shared memory in a parallel architecture. It thus allows the application to be optimized with respect to synchronization as well as data transport overheads. I present a coarse top level design for a portable implementation of this scheme on sequential machines, multiprocessor mainframes (e.g. IBM 3090), tightly coupled multiprocessors (e.g. RP-3) and loosely coupled processor arrays (e.g. LCAP, Emulating Processor Farms). (orig.)
Data driven parallelism in experimental high energy physics applications

Science.gov (United States)

Pohl, Martin

1987-08-01

I present global design principles for the implementation of High Energy Physics data analysis code on sequential and parallel processors with mixed shared and local memory. Potential parallelism in the structure of High Energy Physics tasks is identified with granularity varying from a few times 10^8 instructions all the way down to a few times 10^4 instructions. It follows the hierarchical structure of detector and data acquisition systems. To take advantage of this, while preserving the necessary portability of the code, I propose a computational model with purely data driven concurrency in Single Program Multiple Data (SPMD) mode. The task granularity is defined by varying the granularity of the central data structure manipulated. Concurrent processes coordinate themselves asynchronously using simple lock constructs on parts of the data structure. Load balancing among processes occurs naturally. The scheme allows the internal layout of the data structure to be mapped closely onto the layout of local and shared memory in a parallel architecture. It thus allows the application to be optimized with respect to synchronization as well as data transport overheads. I present a coarse top level design for a portable implementation of this scheme on sequential machines, multiprocessor mainframes (e.g. IBM 3090), tightly coupled multiprocessors (e.g. RP-3) and loosely coupled processor arrays (e.g. LCAP, Emulating Processor Farms).
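A loose modern analogue of the data driven SPMD scheme described above can be sketched with threads that all run the same code and claim pieces of a shared data structure under a lock. This is a simplification: the abstract describes locks on parts of the data structure, whereas the sketch uses a single lock guarding a shared cursor. The names `process_events`, `num_workers`, and `chunk`, and the squaring step standing in for real analysis, are assumptions.

```python
import threading


def process_events(events, num_workers=4, chunk=8):
    """Workers repeatedly claim the next unprocessed chunk of a shared list."""
    results = [None] * len(events)
    next_index = 0
    lock = threading.Lock()

    def worker():
        nonlocal next_index
        while True:
            # The lock protects only the claim of a chunk, not the work itself.
            with lock:
                start = next_index
                next_index = min(start + chunk, len(events))
            if start >= len(events):
                return
            for i in range(start, min(start + chunk, len(events))):
                results[i] = events[i] * events[i]  # stand-in for real analysis

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Changing `chunk` changes the task granularity without touching the worker code, and a fast worker simply claims more chunks, which is how load balancing falls out of the scheme.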
Performing an allreduce operation on a plurality of compute nodes of a parallel computer

Science.gov (United States)

Faraj, Ahmad [Rochester, MN

2012-04-17

Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer. Each compute node includes at least two processing cores. Each processing core has contribution data for the allreduce operation. Performing an allreduce operation on a plurality of compute nodes of a parallel computer includes: establishing one or more logical rings among the compute nodes, each logical ring including at least one processing core from each compute node; performing, for each logical ring, a global allreduce operation using the contribution data for the processing cores included in that logical ring, yielding a global allreduce result for each processing core included in that logical ring; and performing, for each compute node, a local allreduce operation using the global allreduce results for each processing core on that compute node.
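The two-stage structure described above can be sketched as a serial simulation: first a global allreduce along each logical ring (here, ring k links core k of every node), then a local allreduce among the cores of each node. The choice of summation as the reduction operator, the ring assignment by core index, and the function name `hierarchical_allreduce` are assumptions; no actual message passing is performed.

```python
def hierarchical_allreduce(contrib):
    """contrib[node][core] -> value; returns result[node][core].

    Step 1: logical ring k links core k of every node; sum over each ring.
    Step 2: on each node, sum the ring results held by its cores.
    """
    num_nodes = len(contrib)
    cores = len(contrib[0])
    # Global allreduce per ring: every core on ring k ends up with ring_sum[k].
    ring_sum = [
        sum(contrib[n][k] for n in range(num_nodes)) for k in range(cores)
    ]
    # Local allreduce per node: combine the per-ring results across cores.
    total = sum(ring_sum)
    return [[total] * cores for _ in range(num_nodes)]
```

After both stages every core on every node holds the reduction over all contributions, and the rings in the first stage are independent of one another.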
High fidelity thermal-hydraulic analysis using CFD and massively parallel computers

International Nuclear Information System (INIS)

Weber, D.P.; Wei, T.Y.C.; Brewster, R.A.; Rock, Daniel T.; Rizwan-uddin

2000-01-01

Thermal-hydraulic analyses play an important role in design and reload analysis of nuclear power plants. These analyses have historically relied on early generation computational fluid dynamics capabilities, originally developed in the 1960s and 1970s. Over the last twenty years, however, dramatic improvements in both computational fluid dynamics codes in the commercial sector and in computing power have taken place. These developments offer the possibility of performing large scale, high fidelity, core thermal hydraulics analysis. Such analyses will allow a determination of the conservatism employed in traditional design approaches and possibly justify the operation of nuclear power systems at higher powers without compromising safety margins. The objective of this work is to demonstrate such a large scale analysis approach using a state of the art CFD code, STAR-CD, and the computing power of massively parallel computers, provided by IBM. A high fidelity representation of a current generation PWR was analyzed with the STAR-CD CFD code and the results were compared to traditional analyses based on the VIPRE code. Current design methodology typically involves a simplified representation of the assemblies, where a single average pin is used in each assembly to determine the hot assembly from a whole core analysis. After determining this assembly, increased refinement is used in the hot assembly, and possibly some of its neighbors, to refine the analysis for purposes of calculating DNBR. This latter calculation is performed with sub-channel codes such as VIPRE. The modeling simplifications that are used involve the approximate treatment of surrounding assemblies and coarse representation of the hot assembly, where the subchannel is the lowest level of discretization. In the high fidelity analysis performed in this study, both restrictions have been removed. Within the hot assembly, several hundred thousand to several million computational zones have been used, to
High figure of merit ultra-compact 3-channel parallel-connected photonic crystal mini-hexagonal-H1 defect microcavity sensor array

Science.gov (United States)

Wang, Chunhong; Sun, Fujun; Fu, Zhongyuan; Ding, Zhaoxiang; Wang, Chao; Zhou, Jian; Wang, Jiawen; Tian, Huiping

2017-08-01

In this paper, a photonic crystal (PhC) butt-coupled mini-hexagonal-H1 defect (MHHD) microcavity sensor is proposed. The MHHD microcavity is designed by introducing six mini-holes into the initial H1 defect region. Further, based on a well-designed 1×3 PhC beam splitter and three optimal MHHD microcavity sensors with different lattice constants (a), a 3-channel parallel-connected PhC sensor array on monolithic silicon on insulator (SOI) is proposed. Finite-difference time-domain (FDTD) simulations are performed to demonstrate the high performance of our structures. As the results show, the quality factor (Q) of our optimal MHHD microcavity exceeds 7×10^4, while the sensitivity (S) reaches up to 233 nm/RIU (RIU = refractive index unit). Thus, a figure of merit (FOM) >10^4 is obtained for the sensor, which is enhanced by two orders of magnitude compared to the previous butt-coupled sensors [1-4]. As for the 3-channel parallel-connected PhC MHHD microcavity sensor array, the FOMs of three independent MHHD microcavity sensors are 8071, 8250 and 8250, respectively. In addition, the total footprint of the proposed 3-channel parallel-connected PhC sensor array is an ultra-compact 12.5 μm × 31 μm (width × length). Therefore, the proposed high FOM sensor array is an ideal platform for realizing ultra-compact highly parallel refractive index (RI) sensing.
Optical fiber couplers for spectrophotometry. Perspectives for in-situ on-line and remote measurements

International Nuclear Information System (INIS)

Boisde, G.; Linger, C.; Chevalier, G.; Perez, J.J.

1983-01-01

Optical fiber couplers have been developed specially for nuclear chemical spectrophotometric applications. Coupling devices are described for TELEPHOT industrial photometers and some commercial spectrophotometers, together with the probes and measurement cells employed. The value of optical multiplexing is mentioned. Non-nuclear applications in medical analysis are also mentioned, together with the possibilities offered by these devices for uses other than spectrophotometry.
A Review of Lightweight Thread Approaches for High Performance Computing

Energy Technology Data Exchange (ETDEWEB)

Castello, Adrian; Pena, Antonio J.; Seo, Sangmin; Mayo, Rafael; Balaji, Pavan; Quintana-Orti, Enrique S.

2016-09-12

High-level, directive-based solutions are becoming the programming models (PMs) of the multi/many-core architectures. Several solutions relying on operating system (OS) threads work perfectly well with a moderate number of cores. However, exascale systems will spawn hundreds of thousands of threads in order to exploit their massive parallel architectures and thus conventional OS threads are too heavy for that purpose. Several lightweight thread (LWT) libraries have recently appeared offering lighter mechanisms to tackle massive concurrency. In order to examine the suitability of LWTs in high-level runtimes, we develop a set of microbenchmarks consisting of commonly found patterns in current parallel codes. Moreover, we study the semantics offered by some LWT libraries in order to expose the similarities between different LWT application programming interfaces. This study reveals that a reduced set of LWT functions can be sufficient to cover the common parallel code patterns and that those LWT libraries perform better than OS threads-based solutions in cases where task and nested parallelism are becoming more popular with new architectures.
Monolithic optofluidic mode coupler for broadband thermo- and piezo-optical characterization of liquids.

Science.gov (United States)

Pumpe, Sebastian; Chemnitz, Mario; Kobelke, Jens; Schmidt, Markus A

2017-09-18

We present a monolithic fiber device that enables investigation of the thermo- and piezo-optical properties of liquids using straightforward broadband transmission measurements. The device is a directional mode coupler consisting of a multi-mode liquid core and a single-mode glass core with pronounced coupling resonances whose wavelengths strongly depend on the operation temperature. We demonstrated the functionality and flexibility of our device for carbon disulfide, extending the current knowledge of the thermo-optic coefficient by 200 nm at 20 °C and uniquely for high temperatures. Moreover, our device allows measuring the piezo-optic coefficient of carbon disulfide, confirming results first obtained by Röntgen in 1891. Finally, we applied our approach to obtain the dispersion of the thermo-optic coefficients of benzene and tetrachloroethylene between 450 and 800 nm, whereas no data was available for the latter so far.
The Protein Maker: an automated system for high-throughput parallel purification

International Nuclear Information System (INIS)

Smith, Eric R.; Begley, Darren W.; Anderson, Vanessa; Raymond, Amy C.; Haffner, Taryn E.; Robinson, John I.; Edwards, Thomas E.; Duncan, Natalie; Gerdts, Cory J.; Mixon, Mark B.; Nollert, Peter; Staker, Bart L.; Stewart, Lance J.

2011-01-01

The Protein Maker instrument addresses a critical bottleneck in structural genomics by allowing automated purification and buffer testing of multiple protein targets in parallel with a single instrument. Here, the use of this instrument to (i) purify multiple influenza-virus proteins in parallel for crystallization trials and (ii) identify optimal lysis-buffer conditions prior to large-scale protein purification is described. The Protein Maker is an automated purification system developed by Emerald BioSystems for high-throughput parallel purification of proteins and antibodies. This instrument allows multiple load, wash and elution buffers to be used in parallel along independent lines for up to 24 individual samples. To demonstrate its utility, its use in the purification of five recombinant PB2 C-terminal domains from various subtypes of the influenza A virus is described. Three of these constructs crystallized and one diffracted X-rays to sufficient resolution for structure determination and deposition in the Protein Data Bank. Methods for screening lysis buffers for a cytochrome P450 from a pathogenic fungus prior to upscaling expression and purification are also described. The Protein Maker has become a valuable asset within the Seattle Structural Genomics Center for Infectious Disease (SSGCID) and hence is a potentially valuable tool for a variety of high-throughput protein-purification applications
Femtosecond laser inscription of asymmetric directional couplers for in-fiber optical taps and fiber cladding photonics.

Science.gov (United States)

Grenier, Jason R; Fernandes, Luís A; Herman, Peter R

2015-06-29

Precise alignment of femtosecond laser tracks in standard single mode optical fiber is shown to enable controllable optical tapping of the fiber core waveguide light with fiber cladding photonic circuits. Asymmetric directional couplers are presented with tunable coupling ratios up to 62% and bandwidths up to 300 nm at telecommunication wavelengths. Real-time fiber monitoring during laser writing permitted a means of controlling the coupler length to compensate for micron-scale alignment errors and to facilitate tailored design of coupling ratio, spectral bandwidth and polarization properties. Laser induced waveguide birefringence was harnessed for polarization dependent coupling that led to the formation of in-fiber polarization-selective taps with 32 dB extinction ratio. This technology enables the interconnection of light propagating in pre-existing waveguides with laser-formed devices, thereby opening a new practical direction for the three-dimensional integration of optical devices in the cladding of optical fibers and planar lightwave circuits.
Design of Miniaturized 10dB Wideband Branch Line Coupler Using Dual and T-Shape Transmission Lines

Directory of Open Access Journals (Sweden)

M. Kumar

2018-04-01

This paper presents a design mechanism for a miniaturized wideband branch line coupler (BLC) with loose coupling of 10 dB. Dual transmission lines are used as a feed network, which provides a size reduction of 32% with a fractional bandwidth (FBW) of 60% for 10±0.5 dB coupling, but return loss performance is found to be poor in the operating band. To further improve return loss performance and reduce the size of the BLC, T-shape transmission lines are used instead of series quarter-wavelength transmission lines, achieving an overall size reduction of around 44% with an FBW of 50.4%. The return loss and isolation performance is found to be less than 15 dB in the entire operating band (2.5–4.1 GHz) with respect to the design frequency of 3 GHz. The proposed BLC is analyzed, fabricated and tested.
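To put the decibel figures in this abstract into linear terms, the short sketch below converts a coupling or return-loss level in dB to a power fraction. The helper name `db_to_power_fraction` is hypothetical, and the sketch assumes the usual convention that a level of X dB corresponds to a power ratio of 10^(-X/10) relative to the input.

```python
def db_to_power_fraction(level_db):
    """Convert a positive dB level (coupling, return loss) to a power fraction.

    Assumes the common convention: fraction = 10 ** (-level_db / 10).
    """
    return 10.0 ** (-level_db / 10.0)


# Nominal 10 dB coupling: about 10% of the input power reaches the coupled port.
nominal = db_to_power_fraction(10.0)

# The 10 +/- 0.5 dB window spans roughly 8.9% to 11.2% of the input power.
strongest = db_to_power_fraction(9.5)
weakest = db_to_power_fraction(10.5)

# A 15 dB level corresponds to about 3.2% of the input power.
level_15db = db_to_power_fraction(15.0)
```

Seen this way, the ±0.5 dB tolerance used to define the fractional bandwidth allows the coupled power to vary by a little over one percentage point around the nominal 10%.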
Optical loss analysis and parameter optimization for fan-shaped single-polarization grating coupler at wavelength of 1.3 µm band

Science.gov (United States)

Ushida, Jun; Tokushima, Masatoshi; Sobu, Yohei; Shimura, Daisuke; Yashiki, Kenichiro; Takahashi, Shigeki; Kurata, Kazuhiko

2018-05-01

Fan-shaped grating couplers (F-GCs) can be smaller than straight ones but are generally less efficient in coupling to single-mode fibers. To find a small F-GC with sufficiently high fiber-coupling characteristics, we numerically compared the dependencies of coupling efficiencies on wavelengths, the starting width of gratings, and misalignment distances among 25, 45, and 60° tapered angles of fan shape by using the three-dimensional finite-difference time domain method. An F-GC with a tapered angle of 25° exhibited the highest performance for all dependencies. The optical loss origins of F-GCs were discussed in terms of the electric field structures in them and scattering at the joint between the fan-shaped slab and channel waveguide. We fabricated an optimized 25° F-GC by using ArF photolithography, which almost exactly reproduced the optical coupling efficiency and radiation angle characteristics that were numerically expected.
