WorldWideScience

Sample records for 3-d parallel program

  1. Fast implementations of 3D PET reconstruction using vector and parallel programming techniques

    International Nuclear Information System (INIS)

Computationally intensive techniques that offer potential clinical use have arisen in nuclear medicine. Examples include iterative reconstruction, 3D PET data acquisition and reconstruction, and 3D image volume manipulation including image registration. One obstacle to achieving clinical acceptance of these techniques is the computation time required. This study focuses on methods to reduce the computation time for 3D PET reconstruction through the use of fast computer hardware, vector and parallel programming techniques, and algorithm optimization. The strengths and weaknesses of i860-microprocessor-based workstation accelerator boards are investigated in implementations of 3D PET reconstruction.

  2. 3-D parallel program for numerical calculation of gas dynamics problems with heat conductivity on distributed memory computational systems (CS)

    Energy Technology Data Exchange (ETDEWEB)

Sofronov, I.D.; Voronin, B.L.; Butnev, O.I. [VNIIEF (Russian Federation)] [and others]

    1997-12-31

The aim of this work is to develop a 3D parallel program for numerical calculation of gas dynamics problems with heat conductivity on distributed memory computational systems (CS), satisfying the condition that the numerical results are independent of the number of processors involved. Two basically different approaches to the structure of massively parallel computations have been developed. The first approach uses a 3D data matrix decomposition that is rebuilt at each time step and is a development of parallelization algorithms for multiprocessor CS with shared memory. The second approach is based on a 3D data matrix decomposition that is not rebuilt during a time step. The program was developed on the 8-processor CS MP-3 made at VNIIEF and was adapted to the massively parallel CS Meiko-2 at LLNL by the joint efforts of VNIIEF and LLNL staff. A large number of numerical experiments has been carried out with processor counts up to 256, and the parallelization efficiency has been evaluated as a function of the number of processors and their parameters.
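The processor-count-independent decomposition described in this record can be sketched in a few lines. The block-splitting scheme and function names below are illustrative assumptions, not the actual VNIIEF algorithm; the point is that every cell is owned by exactly one rank, so the global result cannot depend on how many processors are used.

```python
from itertools import product

def block_range(n, parts, p):
    """Half-open index range [lo, hi) of block p when n cells are split
    as evenly as possible into `parts` contiguous blocks."""
    base, rem = divmod(n, parts)
    lo = p * base + min(p, rem)
    hi = lo + base + (1 if p < rem else 0)
    return lo, hi

def decompose(shape, grid):
    """Map each rank of a px*py*pz process grid to its 3D subdomain."""
    (nx, ny, nz), (px, py, pz) = shape, grid
    sub = {}
    for rank, (i, j, k) in enumerate(product(range(px), range(py), range(pz))):
        sub[rank] = (block_range(nx, px, i),
                     block_range(ny, py, j),
                     block_range(nz, pz, k))
    return sub

parts = decompose((100, 80, 60), (4, 2, 2))
# Each cell belongs to exactly one rank: the owned-cell counts sum to the mesh size.
owned = sum((x1 - x0) * (y1 - y0) * (z1 - z0)
            for (x0, x1), (y0, y1), (z0, z1) in parts.values())
print(owned)  # 480000 = 100 * 80 * 60
```

The same `decompose` call with a different process grid covers the same cells, which is the property the abstract insists on.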

  3. Shared Memory Parallelism for 3D Cartesian Discrete Ordinates Solver

    Science.gov (United States)

    Moustafa, Salli; Dutka-Malen, Ivan; Plagne, Laurent; Ponçot, Angélique; Ramet, Pierre

    2014-06-01

This paper describes the design and performance of DOMINO, a 3D Cartesian SN solver that implements two nested levels of parallelism (multicore + SIMD) on shared-memory computation nodes. DOMINO is written in C++, a multi-paradigm programming language that enables the use of powerful and generic parallel programming tools such as Intel TBB and Eigen. These two libraries allow us to combine multi-thread parallelism with vector operations in an efficient yet portable way. As a result, DOMINO can exploit the full power of modern multi-core processors and is able to tackle very large simulations that usually require large HPC clusters, using a single computing node. For example, DOMINO solves a 3D full-core PWR eigenvalue problem involving 26 energy groups, 288 angular directions (S16), 46 × 10⁶ spatial cells and 1 × 10¹² DoFs within 11 hours on a single 32-core SMP node. This represents a sustained performance of 235 GFlops and 40.74% of the SMP node peak performance for the DOMINO sweep implementation. The very high Flops/Watt ratio of DOMINO makes it a very interesting building block for a future many-node nuclear simulation tool.
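The headline numbers in this record can be cross-checked with a little arithmetic. The assumption of a few spatial DoFs per cell is mine (it is not stated in the abstract); the implied node peak follows directly from the quoted 235 GFlops at 40.74% of peak.

```python
groups, directions, cells = 26, 288, 46e6
phase_space = groups * directions * cells   # ≈ 3.44e11 cell-angle-group pairs

# The quoted 1e12 DoFs divided by the phase-space size gives ≈ 2.9,
# consistent with roughly three spatial unknowns per cell (an assumption).
dofs_per_cell = 1e12 / phase_space

# 235 GFlops sustained at 40.74% of peak implies a node peak near 577 GFlops.
peak_gflops = 235.0 / 0.4074

print(round(dofs_per_cell, 1), round(peak_gflops))
```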

  4. DYNA3D2000*, Explicit 3-D Hydrodynamic FEM Program

    International Nuclear Information System (INIS)

1 - Description of program or function: DYNA3D2000 is a nonlinear explicit finite element code for analyzing 3-D structures and solid continua. The code is vectorized and available on several computer platforms. The element library includes continuum, shell, beam, truss and spring/damper elements to allow maximum flexibility in modeling physical problems. Many materials are available to represent a wide range of material behavior, including elasticity, plasticity, composites, thermal effects and rate dependence. In addition, DYNA3D has a sophisticated contact interface capability, including frictional sliding, single-surface contact and automatic contact generation. 2 - Method of solution: Discretization of a continuous model transforms partial differential equations into algebraic equations. A numerical solution is then obtained by solving these algebraic equations through a direct time-marching scheme. 3 - Restrictions on the complexity of the problem: Recent software improvements, including dynamic memory allocation and a very large format description, have eliminated most user-identified limitations and pushed potential problem sizes beyond the reach of most users. The dominant restrictions remain code execution speed and robustness, which the developers constantly strive to improve.

  5. Parallel Processor for 3D Recovery from Optical Flow

    Directory of Open Access Journals (Sweden)

    Jose Hugo Barron-Zambrano

    2009-01-01

3D recovery from motion has received major attention in computer vision in recent years. The main problem lies in the number of operations and memory accesses to be performed by the majority of the existing techniques when translated to hardware or software implementations. This paper proposes a parallel processor for 3D recovery from optical flow. Its main features are the maximum reuse of data, the low number of clock cycles needed to calculate the optical flow, and the precision with which 3D recovery is achieved. The results of the proposed architecture as well as those from processor synthesis are presented.

  6. Parallel 3-D SN performance for DANTSYS/MPI on the Cray T3D

    International Nuclear Information System (INIS)

A data parallel version of the 3-D transport solver in DANTSYS has been in use on the SIMD CM-200's at LANL since 1994. This version typically obtains grind times of 150--200 nanoseconds on a 2,048 PE CM-200. The authors have now implemented a new message passing parallel version of DANTSYS, referred to as DANTSYS/MPI, on the 512 PE Cray T3D at Los Alamos. By taking advantage of the SPMD architecture of the Cray T3D, as well as its low latency communications network, they have managed to achieve grind times of less than 10 nanoseconds on real problems. DANTSYS/MPI is fully accelerated using DSA on both the inner and outer iterations. This paper describes the implementation of DANTSYS/MPI on the Cray T3D and presents two simple performance models for the transport sweep which accurately predict the grind time as a function of the number of PEs and problem size, i.e., scalability.
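As a hedged illustration of what a grind-time metric and a sweep performance model look like: the functional form below is a generic pipelined-sweep model of my own construction, not the authors' actual model, and all parameter names are illustrative.

```python
def grind_time(wall_seconds, updates):
    """Aggregate grind time: wall-clock seconds per cell-angle-group update,
    measured across the whole machine."""
    return wall_seconds / updates

def sweep_time(updates, pes, t_update, t_msg, stages):
    """Toy sweep model: per-PE compute work plus a pipeline-fill term
    proportional to the number of sweep stages (a KBA-style assumption)."""
    return updates * t_update / pes + stages * t_msg

# e.g. 1e9 phase-space updates completed in 8 s of wall time
print(grind_time(8.0, 1e9) * 1e9)  # 8.0 -> an 8 ns machine-wide grind time
```

Doubling the PE count in `sweep_time` halves the compute term but leaves the pipeline term fixed, which is why such models predict grind time as a function of both PE count and problem size.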

  7. Programs Lucky and LuckyC - 3D parallel transport codes for the multi-group transport equation solution for XYZ geometry by Pm Sn method

    International Nuclear Information System (INIS)

Powerful supercomputers are available today. MBC-1000M is one of the Russian supercomputers that may be accessed remotely. The programs LUCKY and LUCKYC were created for multiprocessor systems. These programs use algorithms designed especially for such computers and the MPI (message passing interface) service for exchanges between processors. LUCKY can solve shielding tasks by the multigroup discrete ordinates method; LUCKYC can solve criticality tasks by the same method. Only XYZ orthogonal geometry is available. With small space steps to approximate the discrete operator, this geometry may be used as a universal one to describe complex geometrical structures. Cross section libraries are used up to P8 approximation by Legendre polynomials for nuclear data in GIT format. The programming language is Fortran-90. 'Vector' processors can be used, giving a speedup of up to 30 times, but unfortunately MBC-1000M does not have such processors. Nevertheless, good parallel efficiency was obtained with 'space' (LUCKY) and 'space and energy' (LUCKYC) parallelization. The AUTOCAD program is used to check the geometry after treatment of the input data. The programs have a powerful geometry module, a convenient tool for describing any geometry. Output results may be processed by graphic programs on a personal computer. (authors)

  8. Parallel acquisition of 3D-HA(CA)NH and 3D-HACACO spectra

    Energy Technology Data Exchange (ETDEWEB)

    Reddy, Jithender G.; Hosur, Ramakrishna V., E-mail: hosur@tifr.res.in [Tata Institute of Fundamental Research, Department of Chemical Sciences (India)

    2013-06-15

We present here an NMR pulse sequence with 5 independent incrementable time delays within the frame of a 3-dimensional experiment, by incorporating polarization sharing and dual receiver concepts. This has been applied to directly record 3D-HA(CA)NH and 3D-HACACO spectra of proteins simultaneously using parallel detection of ¹H and ¹³C nuclei. While both the experiments display intra-residue backbone correlations, the 3D-HA(CA)NH provides also sequential 'i − 1 → i' correlation along the ¹Hα dimension. Both the spectra contain special peak patterns at glycine locations which serve as check points during the sequential assignment process. The 3D-HACACO spectrum contains, in addition, information on prolines and side chains of residues having H–C–CO network (i.e., ¹Hβ, ¹³Cβ and ¹³COγ of Asp and Asn, and ¹Hγ, ¹³Cγ and ¹³COδ of Glu and Gln), which are generally absent in most conventional proton detected experiments.

  9. Parallel acquisition of 3D-HA(CA)NH and 3D-HACACO spectra

    International Nuclear Information System (INIS)

    We present here an NMR pulse sequence with 5 independent incrementable time delays within the frame of a 3-dimensional experiment, by incorporating polarization sharing and dual receiver concepts. This has been applied to directly record 3D-HA(CA)NH and 3D-HACACO spectra of proteins simultaneously using parallel detection of 1H and 13C nuclei. While both the experiments display intra-residue backbone correlations, the 3D-HA(CA)NH provides also sequential ‘i − 1 → i’ correlation along the 1Hα dimension. Both the spectra contain special peak patterns at glycine locations which serve as check points during the sequential assignment process. The 3D-HACACO spectrum contains, in addition, information on prolines and side chains of residues having H–C–CO network (i.e., 1Hβ, 13Cβ and 13COγ of Asp and Asn, and 1Hγ, 13Cγ and 13COδ of Glu and Gln), which are generally absent in most conventional proton detected experiments.

  10. Parallel Optimization of 3D Cardiac Electrophysiological Model Using GPU

    Directory of Open Access Journals (Sweden)

    Yong Xia

    2015-01-01

Large-scale 3D virtual heart model simulations are highly demanding in computational resources. This poses a major challenge for traditional CPU-based computing resources, which either cannot meet the demands of whole-heart computation or are not easily available due to high costs. The GPU as a parallel computing environment therefore provides an alternative for solving the large-scale computational problems of whole-heart modeling. In this study, using a 3D sheep atrial model as a test bed, we developed a GPU-based simulation algorithm to simulate the conduction of electrical excitation waves in the 3D atria. In the GPU algorithm, a multicellular tissue model was split into two components: the single cell model (ordinary differential equations) and the diffusion term of the monodomain model (partial differential equation). Such a decoupling enabled realization of the GPU parallel algorithm. Furthermore, several optimization strategies were proposed based on the features of the virtual heart model, which enabled a 200-fold speedup as compared to a CPU implementation. In conclusion, an optimized GPU algorithm has been developed that provides an economic and powerful platform for 3D whole-heart simulations.
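The ODE/PDE splitting described in this record can be sketched in a few lines of NumPy. This is a minimal 1D illustration, not the authors' 3D atrial code: the FitzHugh-Nagumo equations stand in for the sheep atrial cell model, and all parameter values are illustrative. The key point is that the ODE update is independent per cell (one GPU thread per cell), while the diffusion update couples neighbors.

```python
import numpy as np

def step(v, w, dt, D, dx):
    """One operator-split step of a toy monodomain model:
    (1) pointwise cell-model ODE update, (2) explicit diffusion update."""
    # ODE part (FitzHugh-Nagumo stand-in): embarrassingly parallel per cell.
    dv = v - v**3 / 3 - w
    dw = 0.08 * (v + 0.7 - 0.8 * w)
    v, w = v + dt * dv, w + dt * dw
    # PDE part: 3-point Laplacian with no-flux boundaries (1D for brevity).
    lap = np.zeros_like(v)
    lap[1:-1] = v[2:] - 2 * v[1:-1] + v[:-2]
    lap[0], lap[-1] = v[1] - v[0], v[-2] - v[-1]
    return v + dt * D * lap / dx**2, w

v = np.full(64, -1.2)   # resting potential (dimensionless)
w = np.full(64, -0.6)   # recovery variable
v[:5] = 1.0             # stimulate one end of the fiber
for _ in range(200):
    v, w = step(v, w, dt=0.05, D=1.0, dx=0.5)
```

Note the explicit diffusion step is stable here because dt·D/dx² = 0.2 < 0.5; a production code would use an implicit or GPU-tuned scheme.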

  11. Parallel OSEM Reconstruction Algorithm for Fully 3-D SPECT on a Beowulf Cluster.

    Science.gov (United States)

    Rong, Zhou; Tianyu, Ma; Yongjie, Jin

    2005-01-01

In order to improve the computation speed of the ordered subset expectation maximization (OSEM) algorithm for fully 3-D single photon emission computed tomography (SPECT) reconstruction, an experimental Beowulf-type cluster was built and several parallel reconstruction schemes were described. We implemented a single-program-multiple-data (SPMD) parallel 3-D OSEM reconstruction algorithm based on the message passing interface (MPI) and tested it with combinations of different numbers of processors and different reconstruction voxel-grid sizes (64×64×64 and 128×128×128). Performance of the parallelization was evaluated in terms of speedup factor and parallel efficiency. This parallel implementation methodology is expected to help make fully 3-D OSEM algorithms more feasible in clinical SPECT studies.
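The two evaluation metrics named in this record have standard definitions, sketched below; the timing numbers in the example are invented for illustration.

```python
def speedup(t_serial, t_parallel):
    """Speedup factor: serial wall time over parallel wall time."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_procs):
    """Parallel efficiency: speedup normalized by processor count."""
    return speedup(t_serial, t_parallel) / n_procs

# e.g. a reconstruction taking 640 s serially and 100 s on 8 nodes
s = speedup(640, 100)        # 6.4
e = efficiency(640, 100, 8)  # 0.8 -> 80% parallel efficiency
print(s, e)
```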

  12. INGRID, 3-D Mesh Generator for Program DYNA3D and NIKE3D and FACET and TOPAZ3D

    International Nuclear Information System (INIS)

1 - Description of program or function: INGRID is a general-purpose, three-dimensional mesh generator developed for use with finite element, nonlinear, structural dynamics codes. INGRID generates the large and complex input data files for DYNA3D (NESC 9909), NIKE3D (NESC 9725), FACET, and TOPAZ3D. One of the greatest advantages of INGRID is that virtually any shape can be described without resorting to wedge elements, tetrahedrons, triangular elements or highly distorted quadrilateral or hexahedral elements. Other capabilities available are in the areas of geometry and graphics. Exact surface equations and surface intersections considerably improve the ability to deal with accurate models, and a hidden line graphics algorithm is included which is efficient on the most complicated meshes. The most important new capability is associated with the boundary conditions, loads, and material properties required by nonlinear mechanics programs. Commands have been designed for each case to minimize user effort. This is particularly important since special processing is almost always required for each load or boundary condition. 2 - Method of solution: Geometries are described primarily using the index space notation of the INGEN program (NESC 975) with an additional type of notation, index progression. Index progressions provide a concise and simple method for describing complex structures; the concept was developed to facilitate defining multiple regions in index space. Rather than specifying the minimum and maximum indices for a region, one specifies the progression of indices along the I, J and K directions, respectively. The index progression method allows the analyst to describe most geometries, including nodes and elements, with roughly the same amount of input as a solids modeler.
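The economy of index progressions can be sketched as follows. This is my reading of the idea, not INGRID's actual input syntax: each list gives the index breakpoints along one direction, and the Cartesian product of the per-direction intervals enumerates every region, replacing a separate min/max pair per region per direction.

```python
from itertools import product

def regions(i_prog, j_prog, k_prog):
    """Hypothetical index-progression expansion: [1, 3, 5] along I means
    blocks [1,3] and [3,5]; all I x J x K block combinations are regions."""
    blocks = lambda p: list(zip(p, p[1:]))
    return list(product(blocks(i_prog), blocks(j_prog), blocks(k_prog)))

rs = regions([1, 3, 5], [1, 4], [1, 2, 6])
print(len(rs))  # 4 regions (2 * 1 * 2) described by three short lists
```

Specifying the same four regions by explicit minimum and maximum indices would take twelve numbers per region instead of eight numbers total.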

  13. Parallel PAB3D: Experiences with a Prototype in MPI

    Science.gov (United States)

    Guerinoni, Fabio; Abdol-Hamid, Khaled S.; Pao, S. Paul

    1998-01-01

PAB3D is a three-dimensional Navier-Stokes solver that has gained acceptance in the research and industrial communities. It takes as its computational domain a set of disjoint blocks covering the physical domain. This is the first report on the implementation of PAB3D using the Message Passing Interface (MPI), a standard for parallel processing. We discuss briefly the characteristics of the code and define a prototype for testing. The principal data structure used for communication is derived from preprocessing "patching". We describe a simple interface (COMMSYS) for MPI communication and some general techniques likely to be encountered when working on problems of this nature. Last, we identify levels of improvement over the current version and outline future work.

  14. Glnemo2: Interactive Visualization 3D Program

    Science.gov (United States)

    Lambert, Jean-Charles

    2011-10-01

Glnemo2 is an interactive 3D visualization program developed in C++ using the OpenGL library and the Nokia Qt 4.x API. It displays in 3D the particle positions of the different components of an N-body snapshot. It quickly gives a lot of information about the data (shape, density area, formation of structures such as spirals, bars, or peanuts). It allows for in/out zooms, rotations, changes of scale, translations, selection of different groups of particles and plots in different blending colors. It can color particles according to their density or temperature, play with the density threshold, trace orbits, display different time steps, take automatic screenshots to make movies, select particles using the mouse, and fly over a simulation using a given camera path. All these features are accessible from a very intuitive graphical user interface. Glnemo2 supports a wide range of input file formats (Nemo, Gadget 1 and 2, phiGrape, Ramses, list of files, realtime gyrfalcON simulation) which are automatically detected at loading time without user intervention. Glnemo2 uses a plugin mechanism to load the data, so that it is easy to add a new file reader. It is powered by a 3D engine which uses the latest OpenGL technology, such as shaders (GLSL), vertex buffer objects and frame buffer objects, and takes into account the power of the graphics card used in order to accelerate the rendering. With a fast GPU, millions of particles can be rendered in real time. Glnemo2 runs on Linux, Windows (using the MinGW compiler), and Mac OS X, thanks to the Qt 4 API.

  15. DPGL: The Direct3D9-based Parallel Graphics Library for Multi-display Environment

    Institute of Scientific and Technical Information of China (English)

    Zhen Liu; Jiao-Ying Shi

    2007-01-01

The emergence of high performance 3D graphics cards has opened the way to PC clusters for high performance multi-display environments. In order to exploit the rendering ability of PC clusters, we should design appropriate parallel rendering algorithms and parallel graphics library interfaces. Due to the rapid development of Direct3D, we bring forward DPGL, the Direct3D9-based parallel graphics library in the D3DPR parallel rendering system, which implements Direct3D9 interfaces to support parallelization of existing Direct3D9 applications with no modification. Based on a parallelism analysis of the Direct3D9 rendering pipeline, we briefly introduce the D3DPR parallel rendering system. DPGL is the fundamental component of D3DPR. After presenting DPGL's three-layer architecture, we discuss rendering resource interception and management. Finally, we describe the design and implementation of DPGL in detail, including the rendering command interception layer, the rendering command interpretation layer and the rendering resource parallelization layer.

  16. Parallel and Cached Scan Matching for Robotic 3D Mapping

    OpenAIRE

    Nuechter, Andreas

    2009-01-01

Intelligent autonomous acting of mobile robots in unstructured environments requires 3D maps. Since manual mapping is a tedious job, automation of this job is necessary. Automatic, consistent volumetric modeling of environments requires a solution to the simultaneous localization and map building problem (SLAM problem). In 3D this task is computationally expensive, since state-of-the-art sensing technology samples the environment with many data points. In ...

  17. Parallel Hall effect from 3D single-component metamaterials

    OpenAIRE

    Kern, Christian; Kadic, Muamer; Wegener, Martin

    2015-01-01

    We propose a class of three-dimensional metamaterial architectures composed of a single doped semiconductor (e.g., n-Si) in air or vacuum that lead to unusual effective behavior of the classical Hall effect. Using an anisotropic structure, we numerically demonstrate a Hall voltage that is parallel---rather than orthogonal---to the external static magnetic-field vector ("parallel Hall effect"). The sign of this parallel Hall voltage can be determined by a structure parameter. Together with the...

  18. Parallel Simulation of 3-D Turbulent Flow Through Hydraulic Machinery

    Institute of Scientific and Technical Information of China (English)

    徐宇; 吴玉林

    2003-01-01

Parallel computational methods were used to analyze incompressible turbulent flow through hydraulic machinery. Two parallel methods were used to simulate the complex flow field. The space decomposition method divides the computational domain into several sub-ranges. Parallel discrete event simulation divides the whole task into several parts according to their functions. The simulation results were compared with the serial simulation results and particle image velocimetry (PIV) experimental results. The results give the distribution and configuration of the complex vortices and illustrate the effectiveness of the parallel algorithms for numerical simulation of turbulent flows.

  19. Developing Parallel Programs

    Directory of Open Access Journals (Sweden)

    Ranjan Sen

    2012-09-01

Parallel programming is an extension of sequential programming; today, it is becoming the mainstream paradigm in day-to-day information processing. Its aim is to build the fastest programs on parallel computers. The methodologies for developing a parallel program can be put into integrated frameworks. Development focuses on the algorithm, languages, and how the program is deployed on the parallel computer.
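The algorithm-then-deployment workflow described in this record can be illustrated with a minimal example: decompose a problem into independent chunks, map the chunks onto workers, and combine the partial results. The sketch below uses Python threads purely to show the decomposition and mapping; in CPython, threads would not speed up this CPU-bound sum, and a real deployment would use processes or MPI.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(bounds):
    """Worker task: sum one contiguous chunk of the index range."""
    lo, hi = bounds
    return sum(range(lo, hi))

n, workers = 1_000_000, 4
step = n // workers
chunks = [(i * step, (i + 1) * step) for i in range(workers)]  # decomposition

with ThreadPoolExecutor(workers) as ex:          # deployment onto workers
    total = sum(ex.map(partial_sum, chunks))     # combine partial results

print(total == n * (n - 1) // 2)  # True: matches the closed-form sum
```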

  20. Parallel Hall effect from 3D single-component metamaterials

    CERN Document Server

    Kern, Christian; Wegener, Martin

    2015-01-01

We propose a class of three-dimensional metamaterial architectures composed of a single doped semiconductor (e.g., n-Si) in air or vacuum that lead to unusual effective behavior of the classical Hall effect. Using an anisotropic structure, we numerically demonstrate a Hall voltage that is parallel---rather than orthogonal---to the external static magnetic-field vector ("parallel Hall effect"). The sign of this parallel Hall voltage can be determined by a structure parameter. Together with the previously demonstrated positive or negative orthogonal Hall voltage, we demonstrate four different sign combinations.

  1. Parallelism in Constraint Programming

    OpenAIRE

    Rolf, Carl Christian

    2011-01-01

    Writing efficient parallel programs is the biggest challenge of the software industry for the foreseeable future. We are currently in a time when parallel computers are the norm, not the exception. Soon, parallel processors will be standard even in cell phones. Without drastic changes in hardware development, all software must be parallelized to its fullest extent. Parallelism can increase performance and reduce power consumption at the same time. Many programs will execute faster on a...

  2. A Parallel Sweeping Preconditioner for Heterogeneous 3D Helmholtz Equations

    KAUST Repository

    Poulson, Jack

    2013-05-02

A parallelization of a sweeping preconditioner for three-dimensional Helmholtz equations without large cavities is introduced and benchmarked for several challenging velocity models. The setup and application costs of the sequential preconditioner are shown to be O(γ² N^(4/3)) and O(γ N log N), where γ(ω) denotes the modestly frequency-dependent number of grid points per perfectly matched layer. Several computational and memory improvements are introduced relative to using black-box sparse-direct solvers for the auxiliary problems, and competitive runtimes and iteration counts are reported for high-frequency problems distributed over thousands of cores. Two open-source packages are released along with this paper: Parallel Sweeping Preconditioner (PSP) and the underlying distributed multifrontal solver, Clique. © 2013 Society for Industrial and Applied Mathematics.

  3. Parallel deterministic neutronics with AMR in 3D

    Energy Technology Data Exchange (ETDEWEB)

    Clouse, C.; Ferguson, J.; Hendrickson, C. [Lawrence Livermore National Lab., CA (United States)

    1997-12-31

AMTRAN, a three-dimensional Sn neutronics code with adaptive mesh refinement (AMR), has been parallelized over spatial domains and energy groups and runs on the Meiko CS-2 with MPI message passing. Block-refined AMR is used with linear finite element representations for the fluxes, which allows for a straightforward interpretation of fluxes at block interfaces with zoning differences. The load balancing algorithm assumes 8 spatial domains, which minimizes idle time among processors.

  4. An efficient parallel algorithm: Poststack and prestack Kirchhoff 3D depth migration using flexi-depth iterations

    Science.gov (United States)

    Rastogi, Richa; Srivastava, Abhishek; Khonde, Kiran; Sirasala, Kirannmayi M.; Londhe, Ashutosh; Chavhan, Hitesh

    2015-07-01

This paper presents an efficient parallel 3D Kirchhoff depth migration algorithm suitable for the current class of multicore architectures. The fundamental Kirchhoff depth migration algorithm exhibits inherent parallelism; however, for 3D data migration, the resource requirements of the algorithm grow with the data size. This challenges its practical implementation even on current-generation high performance computing systems. Therefore a smart parallelization approach is essential to handle 3D data for migration. The most compute-intensive part of the Kirchhoff depth migration algorithm is the calculation of traveltime tables, due to its resource requirements such as memory/storage and I/O. In the current research work, we target this area and develop a competent parallel algorithm for poststack and prestack 3D Kirchhoff depth migration, using hybrid MPI+OpenMP programming techniques. We introduce a concept of flexi-depth iterations while depth migrating data in parallel imaging space, using optimized traveltime table computations. This concept provides flexibility to the algorithm by migrating data in a number of depth iterations, which depends upon the available node memory and the size of the data to be migrated at runtime. Furthermore, it minimizes the requirements of storage, I/O and inter-node communication, thus making it advantageous over conventional parallelization approaches. The developed parallel algorithm is demonstrated and analysed on Yuva II, a PARAM series supercomputer. Optimization, performance and scalability experiment results along with the migration outcome show the effectiveness of the parallel algorithm.
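The flexi-depth idea — let the number of depth iterations adapt at runtime to what fits in node memory — can be sketched as a small sizing calculation. The function and parameter names are my own illustration, not the authors' implementation.

```python
import math

def flexi_depth_iterations(total_depth_steps, bytes_per_depth_slice,
                           node_memory_bytes):
    """Hedged sketch of flexi-depth sizing: migrate as many depth slices per
    iteration as fit in node memory, so the iteration count adapts to the
    data size and the node at runtime."""
    slices_per_iter = max(1, node_memory_bytes // bytes_per_depth_slice)
    return math.ceil(total_depth_steps / slices_per_iter)

# e.g. 2000 depth steps, 1 GiB per depth slice, 64 GiB of usable node memory
print(flexi_depth_iterations(2000, 1 << 30, 64 << 30))  # 32
```

A node with more memory performs fewer, larger iterations; a tight node falls back to many small ones, which is the flexibility the abstract claims.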

  5. Programming structure into 3D nanomaterials

    Directory of Open Access Journals (Sweden)

    Dara Van Gough

    2009-06-01

Programming three-dimensional nanostructures into materials is becoming increasingly important given the need for ever more highly functional solids. Applications for materials with complex programmed structures include solar energy harvesting, energy storage, molecular separation, sensors, pharmaceutical agent delivery, nanoreactors and advanced optical devices. Here we discuss examples of molecular and optical routes to program the structure of three-dimensional nanomaterials with exquisite control over nanomorphology and the resultant properties, and conclude with a discussion of the opportunities and challenges of such an approach.

  6. Explicit parallel programming

    OpenAIRE

    Gamble, James Graham

    1990-01-01

While many parallel programming languages exist, they rarely address the issue of communication (and the expressibility and readability it implies). A new language called Explicit Parallel Programming (EPP) attempts to provide this quality by separating the responsibility for the execution of run-time actions from the responsibility for deciding the order in which they occur. The ordering of a parallel algorithm is specified in the new EPP language; run ti...

  7. The 3D Elevation Program: summary of program direction

    Science.gov (United States)

    Snyder, Gregory I.

    2012-01-01

The 3D Elevation Program (3DEP) initiative responds to a growing need for high-quality topographic data and a wide range of other three-dimensional representations of the Nation's natural and constructed features. The National Enhanced Elevation Assessment (NEEA), which was completed in 2011, clearly documented this need within government and industry sectors. The results of the NEEA indicated that enhanced elevation data have the potential to generate $13 billion in new benefits annually. The benefits apply to flood risk management, agriculture, water supply, homeland security, renewable energy, aviation safety, and other areas. The 3DEP initiative was recommended by the National Digital Elevation Program and its 12 Federal member agencies and was endorsed by the National States Geographic Information Council (NSGIC) and the National Geospatial Advisory Committee (NGAC).

  8. ERROR ANALYSIS OF 3D DETECTING SYSTEM BASED ON WHOLE-FIELD PARALLEL CONFOCAL MICROSCOPE

    Institute of Scientific and Technical Information of China (English)

    Wang Yonghong; Yu Xiaofen

    2005-01-01

Compared with traditional scanning confocal microscopy, the effects of various factors on the characteristics of the multi-beam parallel confocal system are discussed, and the error factors of the multi-beam parallel confocal system are analyzed. The construction and working principle of the non-scanning 3D detecting system are introduced, and experimental results demonstrate the effect of the various factors on the detecting system.

  9. Programming Parallel Computers

    OpenAIRE

    Chandy, K. Mani

    1988-01-01

This paper is from a keynote address to the IEEE International Conference on Computer Languages, October 9, 1988. Keynote addresses are expected to be provocative (and perhaps even entertaining), but not necessarily scholarly. The reader should be warned that this talk was prepared with these expectations in mind. Parallel computers offer the potential of great speed at low cost. The promise of parallelism is limited by the ability to program parallel machines effectively. This paper explores ...

  10. DANTSYS/MPI- a system for 3-D deterministic transport on parallel architectures

    International Nuclear Information System (INIS)

A data parallel version of the 3-D transport solver in DANTSYS has been in use on the SIMD CM-200s at LANL since 1994. This version typically obtains grind times of 150-200 nanoseconds on a 2048 PE CM-200. A new message-passing parallel version of DANTSYS, referred to as DANTSYS/MPI, has been implemented on the 512 PE Cray T3D at Los Alamos. By taking advantage of the SPMD architecture of the Cray T3D, as well as its low latency communications network, we have managed to achieve grind times of less than 10 nanoseconds on real problems. DANTSYS/MPI is fully accelerated using DSA on both the inner and outer iterations. The implementation of DANTSYS/MPI on the Cray T3D is described, and a simple performance model is presented which accurately predicts the grind time as a function of the number of PEs and problem size, i.e., scalability. (author)

  11. Parallel Finite Element Solution of 3D Rayleigh-Benard-Marangoni Flows

    Science.gov (United States)

    Carey, G. F.; McLay, R.; Bicken, G.; Barth, B.; Pehlivanov, A.

    1999-01-01

    A domain decomposition strategy and parallel gradient-type iterative solution scheme have been developed and implemented for computation of complex 3D viscous flow problems involving heat transfer and surface tension effects. Details of the implementation issues are described together with associated performance and scalability studies. Representative Rayleigh-Benard and microgravity Marangoni flow calculations and performance results on the Cray T3D and T3E are presented. The work is currently being extended to tightly-coupled parallel "Beowulf-type" PC clusters and we present some preliminary performance results on this platform. We also describe progress on related work on hierarchic data extraction for visualization.

  12. Reconstruction for Time-Domain In Vivo EPR 3D Multigradient Oximetric Imaging—A Parallel Processing Perspective

    Directory of Open Access Journals (Sweden)

    Christopher D. Dharmaraj

    2009-01-01

    Three-dimensional Oximetric Electron Paramagnetic Resonance Imaging using the Single Point Imaging modality generates unpaired spin density and oxygen images that can readily distinguish between normal and tumor tissues in small animals. It is also possible with fast imaging to track the changes in tissue oxygenation in response to the oxygen content in the breathing air. However, this involves dealing with gigabytes of data for each 3D oximetric imaging experiment, involving digital band-pass filtering and background noise subtraction, followed by 3D Fourier reconstruction. This process is rather slow in a conventional uniprocessor system. This paper presents a parallelization framework using OpenMP runtime support and parallel MATLAB to execute such computationally intensive programs. The Intel compiler is used to develop a parallel C++ code based on OpenMP. The code is executed on four Dual-Core AMD Opteron shared-memory processors to reduce the computational burden of the filtration task significantly. The results show that the parallel code for filtration has achieved a speedup factor of 46.66 over the equivalent serial MATLAB code. In addition, a parallel MATLAB code has been developed to perform 3D Fourier reconstruction. Speedup factors of 4.57 and 4.25 have been achieved during the reconstruction process and oximetry computation, for a data set with 23×23×23 gradient steps. The execution time has been computed for both the serial and parallel implementations using different dimensions of the data and presented for comparison. The reported system has been designed to be easily accessible even from low-cost personal computers through the local intranet (NIHnet). The experimental results demonstrate that parallel computing provides a source of high computational power to obtain biophysical parameters from 3D EPR oximetric imaging, almost in real time.
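    The speedup and efficiency figures quoted above follow from the usual definitions, sketched here together with Amdahl's law. (The 46.66x filtration figure exceeds the core count because the baseline is interpreted MATLAB rather than compiled C++, so it is a cross-language comparison, not pure parallel scaling.)

```python
def speedup(t_serial, t_parallel):
    """Observed speedup: serial wall time over parallel wall time."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_workers):
    """Parallel efficiency: speedup normalized by worker count."""
    return speedup(t_serial, t_parallel) / n_workers

def amdahl_speedup(parallel_fraction, n_workers):
    """Amdahl's law: upper bound on speedup when a fraction of the
    work remains serial."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_workers)
```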

  13. Reconstruction for time-domain in vivo EPR 3D multigradient oximetric imaging--a parallel processing perspective.

    Science.gov (United States)

    Dharmaraj, Christopher D; Thadikonda, Kishan; Fletcher, Anthony R; Doan, Phuc N; Devasahayam, Nallathamby; Matsumoto, Shingo; Johnson, Calvin A; Cook, John A; Mitchell, James B; Subramanian, Sankaran; Krishna, Murali C

    2009-01-01

    Three-dimensional Oximetric Electron Paramagnetic Resonance Imaging using the Single Point Imaging modality generates unpaired spin density and oxygen images that can readily distinguish between normal and tumor tissues in small animals. It is also possible with fast imaging to track the changes in tissue oxygenation in response to the oxygen content in the breathing air. However, this involves dealing with gigabytes of data for each 3D oximetric imaging experiment, involving digital band-pass filtering and background noise subtraction, followed by 3D Fourier reconstruction. This process is rather slow in a conventional uniprocessor system. This paper presents a parallelization framework using OpenMP runtime support and parallel MATLAB to execute such computationally intensive programs. The Intel compiler is used to develop a parallel C++ code based on OpenMP. The code is executed on four Dual-Core AMD Opteron shared-memory processors to reduce the computational burden of the filtration task significantly. The results show that the parallel code for filtration has achieved a speedup factor of 46.66 over the equivalent serial MATLAB code. In addition, a parallel MATLAB code has been developed to perform 3D Fourier reconstruction. Speedup factors of 4.57 and 4.25 have been achieved during the reconstruction process and oximetry computation, for a data set with 23 × 23 × 23 gradient steps. The execution time has been computed for both the serial and parallel implementations using different dimensions of the data and presented for comparison. The reported system has been designed to be easily accessible even from low-cost personal computers through the local intranet (NIHnet). The experimental results demonstrate that parallel computing provides a source of high computational power to obtain biophysical parameters from 3D EPR oximetric imaging, almost in real time.

  14. Parallel Isosurface Extraction for 3D Data Analysis Workflows in Distributed Environments

    OpenAIRE

    D'Agostino, Daniele; Clematis, Andrea; Gianuzzi, Vittoria

    2011-01-01

    In this paper we discuss the issues related to the development of efficient parallel implementations of the Marching Cubes algorithm, one of the most used methods for isosurface extraction, which is a fundamental operation for 3D data analysis and visualization. We present three possible parallelization strategies and we outline pros and cons of each of them, considering isosurface extraction as a stand-alone operation or as part of a dynamic workflow. Our analysis shows tha...
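    One natural parallelization strategy for Marching Cubes is a static decomposition of the volume into contiguous slabs of cell layers, one per worker; each worker then extracts triangles from its slab independently. The partitioning below is a hypothetical sketch for illustration (the abstract does not detail the paper's own strategies).

```python
def slab_partition(n_slices, n_workers):
    """Split the n_slices-1 layers of cells between consecutive grid
    slices into contiguous half-open ranges [start, end), one per
    worker, distributing any remainder to the first workers."""
    n_cells = n_slices - 1          # cell layers between grid slices
    base, extra = divmod(n_cells, n_workers)
    slabs, start = [], 0
    for w in range(n_workers):
        count = base + (1 if w < extra else 0)
        slabs.append((start, start + count))
        start += count
    return slabs
```

    Because Marching Cubes processes each cell independently, slabs need no ghost exchange for the extraction itself; load imbalance (surface-rich vs. empty slabs) is the main drawback of such a static split.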

  15. RELAP5-3D Developer Guidelines and Programming Practices

    Energy Technology Data Exchange (ETDEWEB)

    Dr. George L Mesina

    2014-03-01

    Our ultimate goal is to create and maintain RELAP5-3D as the best software tool available to analyze nuclear power plants. This begins with writing excellent code and requires thorough testing. This document covers development of RELAP5-3D software, the behavior of the RELAP5-3D program that must be maintained, and code testing. RELAP5-3D must perform in a manner consistent with previous code versions, with backward compatibility for the sake of the users. Thus file operations, code termination, input and output must remain consistent in form and content while adding appropriate new files, input and output as new features are developed. As computer hardware, operating systems, and other software change, RELAP5-3D must adapt and maintain performance. The code must be thoroughly tested to ensure that it continues to perform robustly on the supported platforms. The coding must be written in a consistent manner that makes the program easy to read, to reduce the time and cost of development, maintenance and error resolution. The programming guidelines presented here are intended to institutionalize a consistent way of writing FORTRAN code for the RELAP5-3D computer program that will minimize errors and rework. A common format and organization of program units creates a unifying look and feel to the code. This in turn increases readability and reduces the time required for maintenance, development and debugging. It also aids new programmers in reading and understanding the program. Therefore, when undertaking development of the RELAP5-3D computer program, the programmer must write computer code that follows these guidelines. This set of programming guidelines creates a framework of good programming practices, such as initialization, structured programming, and vector-friendly coding. It sets out formatting rules for lines of code, such as indentation, capitalization, spacing, etc. It creates limits on program units, such as subprograms, functions, and modules. It

  16. Programming standards for effective S-3D game development

    Science.gov (United States)

    Schneider, Neil; Matveev, Alexander

    2008-02-01

    When a video game is in development, more often than not it is being rendered in three dimensions - complete with volumetric depth. It is the PC monitor that takes this three-dimensional information and artificially displays it in a flat, two-dimensional format. Stereoscopic drivers take the three-dimensional information captured from DirectX and OpenGL calls and properly display it with unique left and right views for each eye so that a proper stereoscopic 3D image can be seen by the gamer. The two-dimensional limitation of how information is displayed on screen has encouraged programming shortcuts and workarounds that stifle this stereoscopic 3D effect, and the purpose of this guide is to outline techniques to get the best of both worlds. While the programming requirements do not significantly add to the game development time, following these guidelines will greatly enhance your customer's stereoscopic 3D experience, increase your likelihood of earning Meant to be Seen certification, and give you instant cost-free access to the industry's most valued consumer base. While this outline is mostly based on NVIDIA's programming guide and iZ3D resources, it is designed to work with all stereoscopic 3D hardware solutions and is not proprietary in any way.
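    The geometry behind the per-eye views can be illustrated with the textbook off-axis projection relation for on-screen parallax. This is a generic similar-triangles sketch, not NVIDIA's or iZ3D's actual driver formula.

```python
def screen_parallax(eye_sep, conv_dist, depth):
    """Horizontal parallax on a display plane placed at the convergence
    distance, for a point at the given depth (same length units).
    Derived from similar triangles in the off-axis stereo setup."""
    return eye_sep * (depth - conv_dist) / depth
```

    Points at the convergence distance land at identical pixel positions in both views (zero parallax); more distant points separate toward the full eye separation, and nearer points get negative (crossed) parallax, appearing in front of the screen.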

  17. Rotation symmetry axes and the quality index in a 3D octahedral parallel robot manipulator system

    OpenAIRE

    Tanev, T. K.; Rooney, J.

    2002-01-01

    The geometry of a 3D octahedral parallel robot manipulator system is specified in terms of two rigid octahedral structures (the fixed and moving platforms) and six actuation legs. The symmetry of the system is exploited to determine the behaviour of (a new version of) the quality index for various motions. The main results are presented graphically.

  18. Parallel Programming with Intel Parallel Studio XE

    CERN Document Server

    Blair-Chappell , Stephen

    2012-01-01

    Optimize code for multi-core processors with Intel's Parallel Studio Parallel programming is rapidly becoming a "must-know" skill for developers. Yet, where to start? This teach-yourself tutorial is an ideal starting point for developers who already know Windows C and C++ and are eager to add parallelism to their code. With a focus on applying tools, techniques, and language extensions to implement parallelism, this essential resource teaches you how to write programs for multicore and leverage the power of multicore in your programs. Sharing hands-on case studies and real-world examples, the

  19. Imaging Nuclear Waste Plumes at the Hanford Site using Large Domain 3D High Resolution Resistivity Methods and the New Parallel-Processing EarthImager3DCL Program

    Science.gov (United States)

    Greenwood, J.; Rucker, D.; Levitt, M.; Yang, X.; Lagmanson, M.

    2007-12-01

    High Resolution Resistivity data is currently used by hydroGEOPHYSICS, Inc. to detect and characterize the distribution of suspected contaminant plumes beneath leaking tanks and disposal sites within the U.S. Department of Energy Hanford Site, in Eastern Washington State. The success of the characterization effort has led to resistivity data acquisition in extremely large survey areas exceeding 0.6 km² and containing over 6,000 electrodes. Optimal data processing results are achieved by utilizing 10⁵ data points within a single finite difference or finite element model domain. The large number of measurements and electrodes and the high resolution of the modeling domain require a model mesh of over 10⁶ nodes. Existing commercially available resistivity inversion software could not support the domain size due to software and hardware limitations. hydroGEOPHYSICS, Inc. teamed with Advanced Geosciences, Inc. to advance the existing EarthImager3D inversion software to allow for parallel processing and large memory support under a 64-bit operating system. The basis for the selection of EarthImager3D is demonstrated with a series of verification tests and benchmark comparisons using synthetic test models, field-scale experiments and 6 months of intensive modeling using an array of multi-processor servers. The results of benchmark testing show equivalence to other industry-standard inversion codes that perform the same function on significantly smaller domain models. hydroGEOPHYSICS, Inc. included the use of 214 steel-cased monitoring wells as "long electrodes", 6,000 surface electrodes and 8 buried point-source electrodes. Advanced Geosciences, Inc. implemented a long-electrode modeling function to support the Hanford Site well casing data. This utility is unique among commercial resistivity inversion software, and was evaluated through a series of laboratory and field-scale tests using engineered subsurface plumes. The Hanford site is an ideal proving ground for these methods due

  20. 2D/3D Program work summary report

    International Nuclear Information System (INIS)

    The 2D/3D Program was carried out by Germany, Japan and the United States to investigate the thermal-hydraulics of a PWR large-break LOCA. A contributory approach was utilized in which each country contributed significant effort to the program and all three countries shared the research results. Germany constructed and operated the Upper Plenum Test Facility (UPTF), and Japan constructed and operated the Cylindrical Core Test Facility (CCTF) and the Slab Core Test Facility (SCTF). The US contribution consisted of the provision of advanced instrumentation to each of the three test facilities, and assessment of the TRAC computer code against the test results. Evaluations of the test results were carried out in all three countries. This report summarizes the 2D/3D Program in terms of the contributing efforts of the participants, and was prepared in coordination among the three countries. The US and Germany have published the report as NUREG/IA-0126 and GRS-100, respectively. (author)

  1. Gust Acoustics Computation with a Space-Time CE/SE Parallel 3D Solver

    Science.gov (United States)

    Wang, X. Y.; Himansu, A.; Chang, S. C.; Jorgenson, P. C. E.; Reddy, D. R. (Technical Monitor)

    2002-01-01

    The benchmark Problem 2 in Category 3 of the Third Computational Aero-Acoustics (CAA) Workshop is solved using the space-time conservation element and solution element (CE/SE) method. This problem concerns the unsteady response of an isolated finite-span swept flat-plate airfoil bounded by two parallel walls to an incident gust. The acoustic field generated by the interaction of the gust with the flat-plate airfoil is computed by solving the 3D (three-dimensional) Euler equations in the time domain using a parallel version of a 3D CE/SE solver. The effect of the gust orientation on the far-field directivity is studied. Numerical solutions are presented and compared with analytical solutions, showing a reasonable agreement.

  2. Stiffness Analysis of 3-d.o.f. Overconstrained Translational Parallel Manipulators

    CERN Document Server

    Pashkevich, Anatoly; Wenger, Philippe

    2008-01-01

    The paper presents a new stiffness modelling method for overconstrained parallel manipulators, which is applied to 3-d.o.f. translational mechanisms. It is based on a multidimensional lumped-parameter model that replaces the link flexibility by localized 6-d.o.f. virtual springs. In contrast to other works, the method includes a FEA-based link stiffness evaluation and employs a new solution strategy of the kinetostatic equations, which allows computing the stiffness matrix for the overconstrained architectures and for the singular manipulator postures. The advantages of the developed technique are confirmed by application examples, which deal with comparative stiffness analysis of two translational parallel manipulators.
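    The lumped-parameter idea reduces, in the scalar analogue, to the familiar rules for combining spring stiffnesses: compliances add for springs in series, stiffnesses add for springs in parallel. The paper itself works with 6-d.o.f. spring matrices rather than scalars; this one-dimensional sketch only illustrates the aggregation principle.

```python
def series_stiffness(ks):
    """Equivalent stiffness of scalar springs in series: compliances
    (1/k) add, so the chain is softer than its softest member."""
    return 1.0 / sum(1.0 / k for k in ks)

def parallel_stiffness(ks):
    """Equivalent stiffness of scalar springs acting in parallel:
    stiffnesses add directly."""
    return sum(ks)
```

    For a parallel manipulator, each leg is a serial chain of virtual springs (series combination), and the legs act on the platform together (parallel combination), which is why overconstrained architectures tend to be stiffer.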

  3. An improved parallel SPH approach to solve 3D transient generalized Newtonian free surface flows

    Science.gov (United States)

    Ren, Jinlian; Jiang, Tao; Lu, Weigang; Li, Gang

    2016-08-01

    In this paper, a corrected parallel smoothed particle hydrodynamics (C-SPH) method is proposed to simulate 3D generalized Newtonian free surface flows with low Reynolds number; in particular, 3D viscous jet buckling problems are investigated. The proposed C-SPH method is achieved by coupling an improved SPH method based on the incompressible condition with the traditional SPH (TSPH); that is, the improved SPH with a diffusive term and a first-order kernel gradient correction scheme is used in the interior of the fluid domain, and the TSPH is used near the free surface. Thus the C-SPH method possesses the advantages of both methods. Meanwhile, an effective and convenient boundary treatment is presented to deal with the 3D multiple-boundary problem, and the MPI parallelization technique with a dynamic cell neighbor particle searching method is used to improve the computational efficiency. The validity and merits of the C-SPH are first verified by solving several benchmarks and comparing with other results. Then viscous jet folding/coiling based on the Cross model is simulated by the C-SPH method and compared with other experimental or numerical results. In particular, the influences of macroscopic parameters on the flow are discussed. All the numerical results agree well with available data, and show that the C-SPH method has higher accuracy and better stability for solving 3D moving free surface flows than other particle methods.
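    For reference, the cubic-spline kernel that most SPH formulations (including corrected variants like the one above) build on can be written down directly. This is the standard 3D cubic spline with support radius 2h, not necessarily the exact kernel used in the paper.

```python
import math

def w_cubic_spline(r, h):
    """Standard 3-D cubic-spline SPH kernel W(r, h), with compact
    support 2h and normalization 1/(pi h^3)."""
    sigma = 1.0 / (math.pi * h ** 3)
    q = r / h
    if q < 1.0:
        return sigma * (1.0 - 1.5 * q ** 2 + 0.75 * q ** 3)
    if q < 2.0:
        return sigma * 0.25 * (2.0 - q) ** 3
    return 0.0
```

    Kernel gradient correction schemes, such as the first-order correction the abstract mentions, renormalize the discrete gradients of this kernel so that linear fields are reproduced exactly near boundaries and under irregular particle distributions.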

  4. Parallel programming with MPI

    International Nuclear Information System (INIS)

    MPI is a practical, portable, efficient and flexible standard for message passing, which has been implemented on most MPPs and networks of workstations by machine vendors, universities and national laboratories. MPI avoids specifying how operations take place, and avoids superfluous work, to achieve efficiency as well as portability; it is also designed to encourage overlapping communication and computation to hide communication latencies. This presentation briefly explains the MPI standard, and comments on efficient parallel programming to improve performance. (author)
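    The overlap of communication and computation that MPI encourages can be sketched with a background thread standing in for a non-blocking send. This is a toy analogue of the MPI_Isend/MPI_Wait pattern, not real MPI code: communication is posted first, useful work proceeds while it is in flight, and the handle is waited on only when the data is needed.

```python
import threading
import queue

def isend(q, halo):
    """Stand-in for a non-blocking send: push halo data to a queue in
    a background thread and return a handle to wait on."""
    t = threading.Thread(target=q.put, args=(halo,))
    t.start()
    return t

def step(interior, halo, q):
    """One update step with communication/computation overlap."""
    req = isend(q, halo)                 # post communication first
    result = [x * 2 for x in interior]   # overlap: update interior cells
    req.join()                           # MPI_Wait analogue
    return result
```

    In a real stencil code the boundary cells would be updated after the wait, once the neighbours' halo data has arrived; the interior update hides the transfer latency.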

  5. Parallel programming with MPI

    Energy Technology Data Exchange (ETDEWEB)

    Tatebe, Osamu [Electrotechnical Lab., Tsukuba, Ibaraki (Japan)

    1998-03-01

    MPI is a practical, portable, efficient and flexible standard for message passing, which has been implemented on most MPPs and networks of workstations by machine vendors, universities and national laboratories. MPI avoids specifying how operations take place, and avoids superfluous work, to achieve efficiency as well as portability; it is also designed to encourage overlapping communication and computation to hide communication latencies. This presentation briefly explains the MPI standard, and comments on efficient parallel programming to improve performance. (author)

  6. Spatial Parallelism of a 3D Finite Difference, Velocity-Stress Elastic Wave Propagation Code

    Energy Technology Data Exchange (ETDEWEB)

    MINKOFF,SUSAN E.

    1999-12-09

    Finite difference methods for solving the wave equation more accurately capture the physics of waves propagating through the earth than asymptotic solution methods. Unfortunately, finite difference simulations for 3D elastic wave propagation are expensive. We model waves in a 3D isotropic elastic earth. The wave equation solution consists of three velocity components and six stresses. The partial derivatives are discretized using 2nd-order in time and 4th-order in space staggered finite difference operators. Staggered schemes allow one to obtain additional accuracy (via centered finite differences) without requiring additional storage. The serial code is unique in its ability to model a number of different types of seismic sources. The parallel implementation uses the MPI library, thus allowing for portability between platforms. Spatial parallelism provides a highly efficient strategy for parallelizing finite difference simulations. In this implementation, one can decompose the global problem domain into one-, two-, and three-dimensional processor decompositions, with 3D decompositions generally producing the best parallel speedup. Because I/O is handled largely outside of the time-step loop (the most expensive part of the simulation), we have opted for straightforward broadcast and reduce operations to handle I/O. The majority of the communication in the code consists of passing subdomain face information to neighboring processors for use as "ghost cells". When this communication is balanced against computation by allocating subdomains of reasonable size, we observe excellent scaled speedup. Allocating subdomains of size 25 × 25 × 25 on each node, we achieve efficiencies of 94% on 128 processors. Numerical examples for both a layered earth model and a homogeneous medium with a high-velocity blocky inclusion illustrate the accuracy of the parallel code.
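    The ghost-cell exchange described above can be sketched in one dimension: before each stencil update, every subdomain receives a copy of its neighbour's boundary value. This is a serial stand-in for the MPI face passing; at the physical boundaries the edge value is simply replicated, one of several possible boundary treatments.

```python
def exchange_ghosts(subdomains):
    """Given a 1-D decomposition (a list of lists), return padded
    copies with one ghost cell from each neighbour, replicating edge
    values at the physical boundary."""
    padded = []
    for i, sub in enumerate(subdomains):
        left = subdomains[i - 1][-1] if i > 0 else sub[0]
        right = subdomains[i + 1][0] if i < len(subdomains) - 1 else sub[-1]
        padded.append([left] + sub + [right])
    return padded
```

    A 4th-order spatial stencil such as the one in this code would need two ghost layers per face rather than one, but the pattern is identical; the cost of the exchange scales with subdomain surface area while the computation scales with volume, which is why 25 × 25 × 25 subdomains keep the efficiency high.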

  7. Performance analysis of high quality parallel preconditioners applied to 3D finite element structural analysis

    Energy Technology Data Exchange (ETDEWEB)

    Kolotilina, L.; Nikishin, A.; Yeremin, A. [and others]

    1994-12-31

    The solution of large systems of linear equations is a crucial bottleneck when performing 3D finite element analysis of structures. Also, in many cases the reliability and robustness of iterative solution strategies, and their efficiency when exploiting hardware resources, fully determine the scope of industrial applications which can be solved on a particular computer platform. This is especially true for modern vector/parallel supercomputers with large vector length and for modern massively parallel supercomputers. Preconditioned iterative methods have been successfully applied to industrial-class finite element analysis of structures. The construction and application of high quality preconditioners constitutes a high percentage of the total solution time, and their parallel implementation on such architectures is a formidable challenge. Two common types of existing preconditioners are implicit and explicit. Implicit preconditioners (e.g. incomplete factorizations of several types) are generally of high quality but require the solution of lower and upper triangular systems of equations in each iteration, which is difficult to parallelize without deteriorating the convergence rate. Explicit preconditioners (e.g. polynomial or Jacobi-like preconditioners) require only sparse matrix-vector multiplications and can be parallelized, but their preconditioning quality is less than desirable. The authors present results of numerical experiments with Factorized Sparse Approximate Inverses (FSAI) for symmetric positive definite linear systems. These are high quality preconditioners that possess a large resource of parallelism by construction without increasing the serial complexity.
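    The trade-off the authors describe can be made concrete with the simplest explicit preconditioner: Jacobi (diagonal) scaling inside conjugate gradients. Applying it is a trivially parallel pointwise product, at the cost of far weaker preconditioning than an incomplete factorization or FSAI. A minimal dense sketch for a small SPD system, for illustration only:

```python
def pcg(A, b, M_inv_diag, tol=1e-10, max_iter=100):
    """Preconditioned CG with a diagonal (Jacobi) preconditioner,
    given as the inverse of the diagonal of A. Dense lists only;
    real codes use sparse matrix-vector products."""
    n = len(b)
    x = [0.0] * n
    r = b[:]
    z = [mi * ri for mi, ri in zip(M_inv_diag, r)]   # apply M^{-1}: pointwise
    p = z[:]
    rz = sum(ri * zi for ri, zi in zip(r, z))
    for _ in range(max_iter):
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if sum(ri * ri for ri in r) ** 0.5 < tol:
            break
        z = [mi * ri for mi, ri in zip(M_inv_diag, r)]
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x
```

    FSAI replaces the diagonal with a sparse triangular factor approximating the inverse Cholesky factor, so applying the preconditioner remains a (parallel) matrix-vector product while the convergence rate approaches that of implicit methods.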

  8. Late gadolinium enhancement cardiac imaging on a 3T scanner with parallel RF transmission technique: prospective comparison of 3D-PSIR and 3D-IR

    Energy Technology Data Exchange (ETDEWEB)

    Schultz, Anthony [Nouvel Hopital Civil, Strasbourg University Hospital, Radiology Department, Strasbourg Cedex (France); Nouvel Hopital Civil, Service de Radiologie, Strasbourg Cedex (France); Caspar, Thibault [Nouvel Hopital Civil, Strasbourg University Hospital, Cardiology Department, Strasbourg Cedex (France); Schaeffer, Mickael [Nouvel Hopital Civil, Strasbourg University Hospital, Public Health and Biostatistics Department, Strasbourg Cedex (France); Labani, Aissam; Jeung, Mi-Young; El Ghannudi, Soraya; Roy, Catherine [Nouvel Hopital Civil, Strasbourg University Hospital, Radiology Department, Strasbourg Cedex (France); Ohana, Mickael [Nouvel Hopital Civil, Strasbourg University Hospital, Radiology Department, Strasbourg Cedex (France); Universite de Strasbourg / CNRS, UMR 7357, iCube Laboratory, Illkirch (France)

    2016-06-15

    To qualitatively and quantitatively compare different late gadolinium enhancement (LGE) sequences acquired at 3T with a parallel RF transmission technique. One hundred and sixty prospectively enrolled participants underwent a 3T cardiac MRI with three different LGE sequences: 3D Phase-Sensitive Inversion-Recovery (3D-PSIR) acquired 5 minutes after injection, 3D Inversion-Recovery (3D-IR) at 9 minutes and 3D-PSIR at 13 minutes. All LGE-positive patients were qualitatively evaluated, independently and blindly, by two radiologists using a 4-level scale, and quantitatively assessed through measurement of contrast-to-noise ratio and maximal LGE surface. Statistical analyses were performed under a Bayesian paradigm using MCMC methods. Fifty patients (70 % men, 56 ± 19 years old) exhibited LGE (62 % post-ischemic, 30 % related to cardiomyopathy and 8 % post-myocarditis). Early and late 3D-PSIR were superior to 3D-IR sequences (global quality, estimated coefficient IR > early-PSIR: -2.37, CI = [-3.46; -1.38], prob(coef > 0) = 0 %; late-PSIR > IR: 3.12, CI = [0.62; 4.41], prob(coef > 0) = 100 %; LGE surface, estimated coefficient IR > early-PSIR: -0.09, CI = [-1.11; -0.74], prob(coef > 0) = 0 %; late-PSIR > IR: 0.96, CI = [0.77; 1.15], prob(coef > 0) = 100 %). Probabilities for late PSIR being superior to early PSIR concerning global quality and CNR were over 90 %, regardless of the aetiological subgroup. In 3T cardiac MRI acquired with a parallel RF transmission technique, 3D-PSIR is qualitatively and quantitatively superior to 3D-IR. (orig.)
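    For reference, one common definition of the contrast-to-noise ratio measured in such studies is the signal difference between enhancing and remote myocardium divided by the noise standard deviation; the abstract does not spell out the paper's exact formula, so treat this as illustrative.

```python
def contrast_to_noise(signal_lge, signal_remote, noise_sd):
    """One common CNR definition: signal difference between the
    enhancing region and remote tissue, normalized by noise SD."""
    return (signal_lge - signal_remote) / noise_sd
```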

  9. Advanced quadratures and periodic boundary conditions in parallel 3D S{sub n} transport

    Energy Technology Data Exchange (ETDEWEB)

    Manalo, K.; Yi, C.; Huang, M.; Sjoden, G. [Nuclear and Radiological Engineering Program, G.W. Woodruff School of Mechanical Engineering, Georgia Institute of Technology, 770 State Street, Atlanta, GA 30332-0745 (United States)

    2013-07-01

    Significant updates in numerical quadratures have warranted investigation with 3D Sn discrete ordinates transport. We show new applications of quadrature departing from level symmetric (S{sub 2}o), investigating three recently developed quadratures: Even-Odd (EO), Linear-Discontinuous Finite Element - Surface Area (LDFE-SA), and the non-symmetric Icosahedral Quadrature (IC). We discuss implementation changes to 3D Sn codes (applied to the hybrid MOC-Sn TITAN code and the 3D parallel PENTRAN code) that can be performed to accommodate the Icosahedral Quadrature, as this quadrature is not 90-degree rotation invariant. In particular, as demonstrated using PENTRAN, the properties of the Icosahedral Quadrature make it suitable for trivial application with periodic BCs, as opposed to reflective BCs. In addition to implementing periodic BCs for 3D Sn PENTRAN, we implemented a technique termed 'angular re-sweep' which properly conditions periodic BCs for convergence of the outer eigenvalue iterative loop. As demonstrated by two simple transport problems (a 3-group fixed source problem and a 3-group reflected/periodic eigenvalue pin cell), we remark that all of the quadratures we investigated are generally superior to level symmetric quadrature, with the Icosahedral Quadrature performing the most efficiently for the problems tested. (authors)
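    For comparison with these newer quadratures, the lowest-order level-symmetric set (S2) is easy to write down: one ordinate per octant with all direction cosines equal to 1/sqrt(3) and equal weights. The sketch below normalizes the weights to sum to 1; normalization conventions (1, 8, or 4*pi) vary between codes.

```python
import math
import itertools

def s2_quadrature():
    """The eight S2 level-symmetric ordinates: one direction per
    octant, mu = eta = xi = 1/sqrt(3), equal weights summing to 1."""
    mu = 1.0 / math.sqrt(3.0)
    dirs = [(sx * mu, sy * mu, sz * mu)
            for sx, sy, sz in itertools.product((1, -1), repeat=3)]
    weights = [1.0 / 8.0] * 8
    return dirs, weights
```

    By construction this set is invariant under 90-degree rotations, which is what makes reflective boundary conditions trivial for level-symmetric sets; the Icosahedral Quadrature lacks that invariance, which is why the paper turns to periodic boundary conditions instead.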

  10. Parallel Imaging of 3D Surface Profile with Space-Division Multiplexing

    Science.gov (United States)

    Lee, Hyung Seok; Cho, Soon-Woo; Kim, Gyeong Hun; Jeong, Myung Yung; Won, Young Jae; Kim, Chang-Seok

    2016-01-01

    We have developed a modified optical frequency domain imaging (OFDI) system that performs parallel imaging of three-dimensional (3D) surface profiles by using the space-division multiplexing (SDM) method with dual-area swept-source beams. We have also demonstrated that 3D surface information for two different areas can be obtained at the same time with only one camera by our method. In this study, double fields of view (FOVs) of 11.16 mm × 5.92 mm were achieved within 0.5 s. The height range for each FOV was 460 µm, and the axial and transverse resolutions were 3.6 and 5.52 µm, respectively. PMID:26805840

  11. Parallel Imaging of 3D Surface Profile with Space-Division Multiplexing

    Directory of Open Access Journals (Sweden)

    Hyung Seok Lee

    2016-01-01

    We have developed a modified optical frequency domain imaging (OFDI) system that performs parallel imaging of three-dimensional (3D) surface profiles by using the space-division multiplexing (SDM) method with dual-area swept-source beams. We have also demonstrated that 3D surface information for two different areas can be obtained at the same time with only one camera by our method. In this study, double fields of view (FOVs) of 11.16 mm × 5.92 mm were achieved within 0.5 s. The height range for each FOV was 460 µm, and the axial and transverse resolutions were 3.6 and 5.52 µm, respectively.

  12. Parallel implementation of 3D FFT with volumetric decomposition schemes for efficient molecular dynamics simulations

    Science.gov (United States)

    Jung, Jaewoon; Kobayashi, Chigusa; Imamura, Toshiyuki; Sugita, Yuji

    2016-03-01

    Three-dimensional Fast Fourier Transform (3D FFT) plays an important role in a wide variety of computer simulations and data analyses, including molecular dynamics (MD) simulations. In this study, we develop hybrid (MPI+OpenMP) parallelization schemes of 3D FFT based on two new volumetric decompositions, mainly for the particle mesh Ewald (PME) calculation in MD simulations. In one scheme (1d_Alltoall), five all-to-all communications in one dimension are carried out, and in the other (2d_Alltoall), one two-dimensional all-to-all communication is combined with two all-to-all communications in one dimension. 2d_Alltoall is similar to the conventional volumetric decomposition scheme. We performed benchmark tests of 3D FFT for systems with different grid sizes using a large number of processors on the K computer at RIKEN AICS. The two schemes show comparable performance, and are better than existing 3D FFTs. The performance of 1d_Alltoall and 2d_Alltoall depends on the supercomputer network system and the number of processors in each dimension. There is enough leeway for users to optimize performance for their conditions. In the PME method, short-range real-space interactions as well as long-range reciprocal-space interactions are calculated. Our volumetric decomposition schemes are particularly useful when used in conjunction with the recently developed midpoint cell method for short-range interactions, due to the same decompositions of real and reciprocal spaces. The 1d_Alltoall scheme of 3D FFT takes 4.7 ms to simulate one MD cycle for a virus system containing more than 1 million atoms using 32,768 cores on the K computer.
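    The structure that makes 3D FFT communication-heavy is visible even in a serial sketch: the 3D transform factors into 1D transforms along each axis in turn, and when a swept axis is distributed across processors, each sweep requires the all-to-all redistribution the abstract discusses. The naive O(N²) DFT below stands in for a real FFT kernel purely for illustration.

```python
import cmath

def dft1(a):
    """Naive 1-D discrete Fourier transform (O(N^2), illustration only)."""
    n = len(a)
    return [sum(a[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def fft3(grid):
    """3-D transform as three sweeps of 1-D transforms over a nested
    list grid[x][y][z]; each sweep corresponds to one communication
    phase when that axis is distributed."""
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    g = [[dft1(row) for row in plane] for plane in grid]   # z sweep
    for i in range(nx):                                    # y sweep
        for k in range(nz):
            col = dft1([g[i][j][k] for j in range(ny)])
            for j in range(ny):
                g[i][j][k] = col[j]
    for j in range(ny):                                    # x sweep
        for k in range(nz):
            pencil = dft1([g[i][j][k] for i in range(nx)])
            for i in range(nx):
                g[i][j][k] = pencil[i]
    return g
```

    The two schemes in the paper differ in how the data is reshuffled between these sweeps: five 1D all-to-alls (1d_Alltoall) versus one 2D all-to-all plus two 1D all-to-alls (2d_Alltoall).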

  13. The 3D Elevation Program initiative: a call for action

    Science.gov (United States)

    Sugarbaker, Larry J.; Constance, Eric W.; Heidemann, Hans Karl; Jason, Allyson L.; Lukas, Vicki; Saghy, David L.; Stoker, Jason M.

    2014-01-01

    The 3D Elevation Program (3DEP) initiative is accelerating the rate of three-dimensional (3D) elevation data collection in response to a call for action to address a wide range of urgent needs nationwide. It began in 2012 with the recommendation to collect (1) high-quality light detection and ranging (lidar) data for the conterminous United States (CONUS), Hawaii, and the U.S. territories and (2) interferometric synthetic aperture radar (ifsar) data for Alaska. Specifications were created for collecting 3D elevation data, and the data management and delivery systems are being modernized. The National Elevation Dataset (NED) will be completely refreshed with new elevation data products and services. The call for action requires broad support from a large partnership community committed to the achievement of national 3D elevation data coverage. The initiative is being led by the U.S. Geological Survey (USGS) and includes many partners—Federal agencies and State, Tribal, and local governments—who will work together to build on existing programs to complete the national collection of 3D elevation data in 8 years. Private sector firms, under contract to the Government, will continue to collect the data and provide essential technology solutions for the Government to manage and deliver these data and services. The 3DEP governance structure includes (1) an executive forum established in May 2013 to have oversight functions and (2) a multiagency coordinating committee based upon the committee structure already in place under the National Digital Elevation Program (NDEP). The 3DEP initiative is based on the results of the National Enhanced Elevation Assessment (NEEA) that was funded by NDEP agencies and completed in 2011. The study, led by the USGS, identified more than 600 requirements for enhanced (3D) elevation data to address mission-critical information requirements of 34 Federal agencies, all 50 States, and a sample of private sector companies and Tribal and local

  14. Parallel 3D Finite Element Numerical Modelling of DC Electron Guns

    Energy Technology Data Exchange (ETDEWEB)

    Prudencio, E.; Candel, A.; Ge, L.; Kabel, A.; Ko, K.; Lee, L.; Li, Z.; Ng, C.; Schussman, G.; /SLAC

    2008-02-04

    In this paper we present Gun3P, a parallel 3D finite element application that the Advanced Computations Department at the Stanford Linear Accelerator Center is developing for the analysis of beam formation in DC guns and beam transport in klystrons. Gun3P is targeted especially at complex geometries that cannot be described by 2D models and cannot be easily handled by finite difference discretizations. Its parallel capability allows simulations with more accuracy and less processing time than packages currently available. We present simulation results for the L-band Sheet Beam Klystron DC gun, for which Gun3P is able to reduce simulation time from days to a few hours.
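
    The classical 1D benchmark against which DC gun codes are typically validated is the space-charge-limited (Child-Langmuir) current density. The sketch below is only this textbook limit, not the Gun3P algorithm; the voltage and gap values are illustrative.

```python
import math

# Space-charge-limited current density in a planar DC diode (Child-Langmuir):
#   J = (4*eps0/9) * sqrt(2*e/m_e) * V**1.5 / d**2
EPS0 = 8.854e-12   # vacuum permittivity, F/m
E_CH = 1.602e-19   # elementary charge, C
M_E = 9.109e-31    # electron mass, kg

def child_langmuir_j(voltage_v, gap_m):
    """Current density (A/m^2) for a planar diode at the given voltage and gap."""
    return (4.0 * EPS0 / 9.0) * math.sqrt(2.0 * E_CH / M_E) * voltage_v**1.5 / gap_m**2

# Example: 100 kV across a 1 cm gap
j = child_langmuir_j(100e3, 0.01)
```

    The V^1.5 scaling is the signature of space-charge limitation: quadrupling the voltage raises the current density eightfold.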

  15. Assessing the performance of a parallel MATLAB-based 3D convection code

    Science.gov (United States)

    Kirkpatrick, G. J.; Hasenclever, J.; Phipps Morgan, J.; Shi, C.

    2008-12-01

    We are currently building 2D and 3D MATLAB-based parallel finite element codes for mantle convection and melting. The codes use the MATLAB implementation of core MPI commands (e.g. Send, Receive, Broadcast) for message passing between computational subdomains. We have found that code development and algorithm testing are much faster in MATLAB than in our previous work coding in C or FORTRAN; this code was built from scratch with only 12 man-months of effort. The one extra cost with respect to C coding on a Beowulf cluster is the parallel MATLAB license for a >4-core cluster. Here we present some preliminary results on the efficiency of MPI messaging in MATLAB on a small 4-machine, 16-core, 32 GB RAM Intel Q6600 processor-based cluster. Our code implements fully parallelized preconditioned conjugate gradients with a multigrid preconditioner. Our parallel viscous flow solver is currently 20% slower for a 1,000,000-DOF problem on a single core in 2D than the direct-solve MILAMIN MATLAB viscous flow solver. We have tested both continuous and discontinuous pressure formulations. We test with various configurations of network hardware, CPU speeds, and memory using our own and MATLAB's built-in cluster profiler. So far we have only explored relatively small (up to 1.6 GB RAM) test problems. We find that with our current code and Intel memory controller bandwidth limitations we can only get ~2.3 times the performance of a single core out of 4 cores per machine. Even for these small problems the code runs faster with message passing between 4 machines with one core each than on 1 machine with 4 cores and internal messaging (1.29x slower), or on 1 core (2.15x slower). It surprised us that for 2D ~1 GB-sized problems with only 3 multigrid levels, the direct solve on the coarsest mesh consumes time comparable to the iterative solve on the finest mesh - a penalty that is greatly reduced either by using a 4th multigrid level or by using an iterative solve at the coarsest grid level. We plan to
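
    The preconditioned conjugate gradient kernel at the heart of such a solver can be sketched serially as follows; here a simple Jacobi (diagonal) preconditioner stands in for the multigrid cycle, and the message passing between subdomains is omitted (both are simplifying assumptions for illustration).

```python
import numpy as np

def pcg(A, b, tol=1e-10, max_iter=500):
    """Preconditioned conjugate gradients with a Jacobi preconditioner."""
    m_inv = 1.0 / np.diag(A)          # Jacobi preconditioner: M^-1 = diag(A)^-1
    x = np.zeros_like(b)
    r = b - A @ x                     # initial residual
    z = m_inv * r                     # preconditioned residual
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = m_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p     # update search direction
        rz = rz_new
    return x

# SPD test system: 1D Laplacian stiffness matrix
n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = pcg(A, b)
```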

  16. 3D Body Scanning Measurement System Associated with RF Imaging, Zero-padding and Parallel Processing

    Directory of Open Access Journals (Sweden)

    Kim Hyung Tae

    2016-04-01

    Full Text Available This work presents a novel signal processing method for high-speed 3D body measurements using millimeter waves with a general processing unit (GPU) and zero-padding fast Fourier transform (ZPFFT). The proposed measurement system consists of a radio-frequency (RF) antenna array for a penetrable measurement, a high-speed analog-to-digital converter (ADC) for significant data acquisition, and a general processing unit for fast signal processing. The RF waves of the transmitter and the receiver are converted to real and imaginary signals that are sampled by a high-speed ADC and synchronized with the kinematic positions of the scanner. Because the distance between the surface and the antenna is related to the peak frequency of the conjugate signals, a fast Fourier transform (FFT) is applied to the signal processing after the sampling. The sampling time is finite owing to a short scanning time, and the physical resolution needs to be increased; further, zero-padding is applied to interpolate the spectra of the sampled signals to consider a 1/m floating point frequency. The GPU and parallel algorithm are applied to accelerate the speed of the ZPFFT because of the large number of additional mathematical operations of the ZPFFT. 3D body images are finally obtained by spectrograms that are the arrangement of the ZPFFT in a 3D space.
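
    The effect of zero-padding on peak-frequency estimation can be demonstrated in a few lines. The sketch below (with illustrative sample rate and frequency, not values from the paper) shows that padding the FFT interpolates the spectrum and locates an off-bin peak far more precisely than the raw transform.

```python
import numpy as np

fs, n = 1000.0, 64            # sample rate (Hz), number of samples
f_true = 123.4                # true peak frequency, between raw FFT bins
t = np.arange(n) / fs
sig = np.cos(2 * np.pi * f_true * t)

def peak_freq(x, n_fft, fs):
    """Return the frequency of the largest FFT magnitude bin."""
    spec = np.abs(np.fft.rfft(x, n=n_fft))    # n_fft > len(x) => zero-padding
    freqs = np.fft.rfftfreq(n_fft, 1.0 / fs)
    return freqs[np.argmax(spec)]

f_raw = peak_freq(sig, n, fs)        # bin spacing fs/n ~= 15.6 Hz
f_padded = peak_freq(sig, 4096, fs)  # bin spacing fs/4096 ~= 0.24 Hz
```

    Because the padded bins fall much closer together, the estimated peak lands within a fraction of a hertz of the true frequency, which is exactly why the ZPFFT workload (and hence the GPU acceleration) grows with the padded length.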

  17. Three-dimensional parallel UNIPIC-3D code for simulations of high-power microwave devices

    Science.gov (United States)

    Wang, Jianguo; Chen, Zaigao; Wang, Yue; Zhang, Dianhui; Liu, Chunliang; Li, Yongdong; Wang, Hongguang; Qiao, Hailiang; Fu, Meiyan; Yuan, Yuan

    2010-07-01

    This paper introduces a self-developed, three-dimensional, parallel, fully electromagnetic particle simulation code, UNIPIC-3D. In this code, the electromagnetic fields are updated using the second-order finite-difference time-domain method, and the particles are moved using the relativistic Newton-Lorentz force equation. The electromagnetic fields and particles are coupled through the current term in Maxwell's equations. Two numerical examples are used to verify the algorithms adopted in this code; the numerical results agree well with theoretical ones. This code can be used to simulate high-power microwave (HPM) devices, such as the relativistic backward wave oscillator, coaxial vircator, and magnetically insulated line oscillator. UNIPIC-3D is written in the object-oriented C++ language and can be run on a variety of platforms including WINDOWS, LINUX, and UNIX. Users can use the graphical user interface to create the complex geometric structures of the simulated HPM devices, which can be automatically meshed by the UNIPIC-3D code. This code has a powerful postprocessor which can display the electric field, magnetic field, current, voltage, power, spectrum, momentum of particles, etc. For the sake of comparison, the results computed by using the two-and-a-half-dimensional UNIPIC code are also provided for the same parameters of HPM devices; the numerical results computed from these two codes agree well with each other.
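
    The second-order field update at the core of such a code is the leapfrog (Yee) scheme. The minimal 1D vacuum sketch below illustrates only the field update; UNIPIC-3D additionally pushes particles with the relativistic Newton-Lorentz equation and couples them back through the current term. Grid sizes and the source are illustrative.

```python
import numpy as np

# Minimal 1D vacuum FDTD (Yee) update in normalized units (c = 1, dx = 1).
nx, nt = 200, 150
courant = 0.5                      # c*dt/dx, must be <= 1 for stability in 1D
ez = np.zeros(nx)                  # E field at integer grid points
hy = np.zeros(nx - 1)              # H field at half-integer grid points

for n in range(nt):
    hy += courant * np.diff(ez)                # half-step H update
    ez[1:-1] += courant * np.diff(hy)          # half-step E update
    ez[50] += np.exp(-((n - 30) / 8.0) ** 2)   # soft Gaussian source
```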

  18. Parallel CAE system for large-scale 3-D finite element analyses

    International Nuclear Information System (INIS)

    This paper describes a new pre- and post-processing system for the automation of large-scale 3D finite element analyses. In the pre-processing stage, a geometry model to be analyzed is defined by a user through an interactive operation with a 3D graphics editor. The analysis model is constructed by adding analysis conditions and mesh refinement information to the geometry model. The mesh refinement information, i.e. a nodal density distribution over the whole analysis domain, is initially defined by superposing several locally optimum nodal patterns stored in the nodal pattern database of the system. Nodes and tetrahedral elements are generated using some computational geometry techniques whose processing speed is almost proportional to the total number of nodes. In the post-processing stage, scalar and vector values are evaluated at arbitrary points in the analysis domain, and displayed as equi-contours, vector lines, iso-surfaces, particle plots and real-time animation by means of scientific visualization techniques. The present system is also capable of mesh optimization. A posteriori error distribution over the whole analysis domain is obtained based on the simple error estimator proposed by Zienkiewicz and Zhu. The nodal density distribution to be used for mesh generation is optimized by referring to the obtained error distribution. Finally, nodes and tetrahedral elements are re-generated. The present remeshing method is one of the global hr-version mesh adaptation methods. To deal with large-scale 3D finite element analyses in a reasonable computational time and memory requirement, a distributed/parallel processing technique is applied to some part of the present system. Fundamental performances of the present system are clearly demonstrated through 3D thermal conduction analyses. (author)
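
    The idea behind a Zienkiewicz-Zhu-style estimator can be sketched in 1D: compare the piecewise-constant FE gradient with a smoother, nodally averaged ("recovered") gradient, and flag elements with a large mismatch for refinement. The mesh and test field below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

nodes = np.linspace(0.0, 1.0, 11)           # uniform mesh, 10 elements
u = nodes ** 3                              # nodal values of a test field

grad_elem = np.diff(u) / np.diff(nodes)     # FE gradient, one per element

# Recovered nodal gradient: average of the adjacent element gradients
grad_node = np.empty_like(nodes)
grad_node[1:-1] = 0.5 * (grad_elem[:-1] + grad_elem[1:])
grad_node[0], grad_node[-1] = grad_elem[0], grad_elem[-1]

# Elementwise indicator: mismatch between recovered and FE gradients,
# weighted by element size; large values mark refinement candidates
recovered_mid = 0.5 * (grad_node[:-1] + grad_node[1:])
indicator = np.abs(recovered_mid - grad_elem) * np.diff(nodes)
```

    For this cubic test field the indicator grows toward x = 1, where the solution gradient varies fastest, so a density-driven remesher would concentrate nodes there.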

  19. Computer Assisted Parallel Program Generation

    CERN Document Server

    Kawata, Shigeo

    2015-01-01

    Parallel computation is widely employed in scientific research, engineering activities and product development. Parallel program writing is not always a simple task, depending on the problem being solved. Large-scale scientific computing, huge data analyses and precise visualizations, for example, would require parallel computation, and parallel computing needs parallelization techniques. In this chapter a parallel program generation support is discussed, and a computer-assisted parallel program generation system, P-NCAS, is introduced. Computer-assisted problem solving is one of the key methods to promote innovations in science and engineering, and contributes to enriching our society and our life toward a programming-free environment in computing science. Problem solving environment (PSE) research activities started to enhance the programming power in the 1970s. P-NCAS is one of the PSEs; the PSE concept provides an integrated human-friendly computational software and hardware system to solve a target ...

  20. The development of laser-plasma interaction program LAP3D on thousands of processors

    Energy Technology Data Exchange (ETDEWEB)

    Hu, Xiaoyan, E-mail: hu-xiaoyan@iapcm.ac.cn; Hao, Liang; Liu, Zhanjun; Zheng, Chunyang; Li, Bin, E-mail: li.bin@iapcm.ac.cn; Guo, Hong [Institute of Applied Physics and Computational Mathematics, Beijing 100088 (China)

    2015-08-15

    Modeling laser-plasma interaction (LPI) processes at real-experiment scales is recognized as a challenging task. To explore the influence of various instabilities in LPI processes, a three-dimensional laser and plasma code (LAP3D) has been developed, which includes filamentation, stimulated Brillouin backscattering (SBS), stimulated Raman backscattering (SRS), non-local heat transport and plasma flow computation modules. In this program, a second-order upwind scheme is applied to solve the plasma equations, which are represented by an Euler fluid model. An operator splitting method is used to solve the equations of light wave propagation, where the fast Fourier transform (FFT) is applied to compute the diffraction operator and a coordinate translation is used to solve the acoustic wave equation. The coupled terms of the different physics processes are computed by a second-order interpolation algorithm. To simulate LPI processes well on massively parallel computers, several parallel techniques are used, such as a coupled parallel algorithm for the FFT and the fluid numerical computation, a load balancing algorithm, and a data transfer algorithm. The phenomena of filamentation, SBS and SRS have now been studied successfully in low-density plasma with LAP3D. Scalability of the program is demonstrated with a parallel efficiency above 50% on about ten thousand processors.
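
    The split-step treatment of diffraction can be illustrated in isolation: per step, the diffraction operator is applied exactly in Fourier space. The sketch below keeps only free-space paraxial diffraction of a Gaussian beam (no plasma coupling), so the result can be checked against the analytic beam width; units and grid are illustrative assumptions.

```python
import numpy as np

wavelength, w0 = 0.5, 1.0                 # illustrative units
k = 2 * np.pi / wavelength
zr = np.pi * w0**2 / wavelength           # Rayleigh range

n = 1024
x = np.linspace(-20.0, 20.0, n, endpoint=False)
kx = 2 * np.pi * np.fft.fftfreq(n, d=x[1] - x[0])
field = np.exp(-x**2 / w0**2)             # Gaussian beam at the waist

nsteps, dz = 50, zr / 50                  # propagate one Rayleigh range
phase = np.exp(-1j * kx**2 * dz / (2 * k))   # exact diffraction per step
for _ in range(nsteps):
    field = np.fft.ifft(phase * np.fft.fft(field))

intensity = np.abs(field) ** 2
w_num = 2.0 * np.sqrt((intensity * x**2).sum() / intensity.sum())
w_ref = w0 * np.sqrt(2.0)                 # analytic width at z = zr
```

    In a full LPI code the nonlinear plasma-response step would be interleaved between these Fourier-space diffraction steps, which is where the operator splitting earns its keep.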

  3. Approach of generating parallel programs from parallelized algorithm design strategies

    Institute of Scientific and Technical Information of China (English)

    WAN Jian-yi; LI Xiao-ying

    2008-01-01

    Today, parallel programming is dominated by message passing libraries, such as the message passing interface (MPI). This article intends to simplify parallel programming by generating parallel programs from parallelized algorithm design strategies. It uses skeletons to abstract parallelized algorithm design strategies, as well as parallel architectures. Starting from a problem specification, an abstract parallel abstract programming language+ (Apla+) program is generated from parallelized algorithm design strategies and problem-specific function definitions. By combining these with parallel architectures, the implicit parallelism inside the parallelized algorithm design strategies is exploited. With implementation and transformation, a C++ with parallel virtual machine (CPPVM) parallel program is finally generated. The parallelized branch and bound (B&B) and parallelized divide and conquer (D&C) algorithm design strategies are studied in this article as examples, and the approach is illustrated with a case study.
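
    A skeleton in this sense fixes the control structure of a strategy while leaving the problem-specific functions as plug-ins. The sketch below is a generic, serial divide-and-conquer skeleton instantiated as merge sort; it is an illustration of the skeleton idea, not Apla+'s actual notation (a parallel instantiation would evaluate the sub-calls concurrently).

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")

def divide_and_conquer(problem: T,
                       is_base: Callable[[T], bool],
                       solve_base: Callable[[T], R],
                       split: Callable[[T], List[T]],
                       merge: Callable[[List[R]], R]) -> R:
    """D&C skeleton: strategy fixed, problem-specific functions plugged in."""
    if is_base(problem):
        return solve_base(problem)
    subresults = [divide_and_conquer(p, is_base, solve_base, split, merge)
                  for p in split(problem)]
    return merge(subresults)

def merge_sorted(parts):
    """Problem-specific merge step for the merge-sort instantiation."""
    left, right = parts
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

result = divide_and_conquer(
    [5, 2, 9, 1, 7, 3],
    is_base=lambda xs: len(xs) <= 1,
    solve_base=lambda xs: xs,
    split=lambda xs: [xs[:len(xs) // 2], xs[len(xs) // 2:]],
    merge=merge_sorted,
)
```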

  4. 3-D readout-electronics packaging for high-bandwidth massively paralleled imager

    Science.gov (United States)

    Kwiatkowski, Kris; Lyke, James

    2007-12-18

    Dense, massively parallel signal processing electronics are co-packaged behind associated sensor pixels. Microchips containing a linear or bilinear arrangement of photo-sensors, together with associated complex electronics, are integrated into a simple 3-D structure (a "mirror cube"). An array of photo-sensitive cells is disposed on a stacked CMOS chip's surface at a 45-degree angle from light-reflecting mirror surfaces formed on a neighboring CMOS chip surface. Image processing electronics are held within the stacked CMOS chip layers. Electrical connections couple each of said stacked CMOS chip layers and a distribution grid, the connections distributing power and signals to components associated with each stacked CMOS chip layer.

  5. Patterns For Parallel Programming

    CERN Document Server

    Mattson, Timothy G; Massingill, Berna L

    2005-01-01

    From grids and clusters to next-generation game consoles, parallel computing is going mainstream. Innovations such as Hyper-Threading Technology, HyperTransport Technology, and multicore microprocessors from IBM, Intel, and Sun are accelerating the movement's growth. Only one thing is missing: programmers with the skills to meet the soaring demand for parallel software.

  6. In situ patterned micro 3D liver constructs for parallel toxicology testing in a fluidic device.

    Science.gov (United States)

    Skardal, Aleksander; Devarasetty, Mahesh; Soker, Shay; Hall, Adam R

    2015-09-01

    3D tissue models are increasingly being implemented for drug and toxicology testing. However, the creation of tissue-engineered constructs for this purpose often relies on complex biofabrication techniques that are time consuming, expensive, and difficult to scale up. Here, we describe a strategy for realizing multiple tissue constructs in a parallel microfluidic platform using an approach that is simple and can be easily scaled for high-throughput formats. Liver cells mixed with a UV-crosslinkable hydrogel solution are introduced into parallel channels of a sealed microfluidic device and photopatterned to produce stable tissue constructs in situ. The remaining uncrosslinked material is washed away, leaving the structures in place. By using a hydrogel that specifically mimics the properties of the natural extracellular matrix, we closely emulate native tissue, resulting in constructs that remain stable and functional in the device during a 7-day culture time course under recirculating media flow. As proof of principle for toxicology analysis, we expose the constructs to ethyl alcohol (0-500 mM) and show that the cell viability and the secretion of urea and albumin decrease with increasing alcohol exposure, while markers for cell damage increase. PMID:26355538

  7. The 3D Elevation Program: summary for Texas

    Science.gov (United States)

    Carswell, William J.

    2013-01-01

    Elevation data are essential to a broad range of applications, including forest resources management, wildlife and habitat management, national security, recreation, and many others. For the State of Texas, elevation data are critical for natural resources conservation; wildfire management, planning, and response; flood risk management; agriculture and precision farming; infrastructure and construction management; water supply and quality; and other business uses. Today, high-quality light detection and ranging (lidar) data are the source for creating elevation models and other elevation datasets. Federal, State, and local agencies work in partnership to (1) replace data, on a national basis, that are (on average) 30 years old and of lower quality and (2) provide coverage where publicly accessible data do not exist. A joint goal of State and Federal partners is to acquire consistent, statewide coverage to support existing and emerging applications enabled by lidar data. The new 3D Elevation Program (3DEP) initiative, managed by the U.S. Geological Survey (USGS), responds to the growing need for high-quality topographic data and a wide range of other three-dimensional representations of the Nation’s natural and constructed features.

  8. The 3D Elevation Program: summary for Minnesota

    Science.gov (United States)

    Carswell, William J., Jr.

    2013-01-01

    Elevation data are essential to a broad range of applications, including forest resources management, wildlife and habitat management, national security, recreation, and many others. For the State of Minnesota, elevation data are critical for agriculture and precision farming, natural resources conservation, flood risk management, infrastructure and construction management, water supply and quality, coastal zone management, and other business uses. Today, high-quality light detection and ranging (lidar) data are the sources for creating elevation models and other elevation datasets. Federal, State, and local agencies work in partnership to (1) replace data, on a national basis, that are (on average) 30 years old and of lower quality and (2) provide coverage where publicly accessible data do not exist. A joint goal of State and Federal partners is to acquire consistent, statewide coverage to support existing and emerging applications enabled by lidar data. The new 3D Elevation Program (3DEP) initiative, managed by the U.S. Geological Survey (USGS), responds to the growing need for high-quality topographic data and a wide range of other three-dimensional representations of the Nation’s natural and constructed features.

  9. The 3D Elevation Program: summary for Rhode Island

    Science.gov (United States)

    Carswell, William J., Jr.

    2013-01-01

    Elevation data are essential to a broad range of applications, including forest resources management, wildlife and habitat management, national security, recreation, and many others. For the State of Rhode Island, elevation data are critical for flood risk management, natural resources conservation, coastal zone management, sea level rise and subsidence, agriculture and precision farming, and other business uses. Today, high-quality light detection and ranging (lidar) data are the sources for creating elevation models and other elevation datasets. Federal, State, and local agencies work in partnership to (1) replace data, on a national basis, that are (on average) 30 years old and of lower quality and (2) provide coverage where publicly accessible data do not exist. A joint goal of State and Federal partners is to acquire consistent, statewide coverage to support existing and emerging applications enabled by lidar data. The new 3D Elevation Program (3DEP) initiative (Snyder, 2012a,b), managed by the U.S. Geological Survey (USGS), responds to the growing need for high-quality topographic data and a wide range of other three-dimensional representations of the Nation’s natural and constructed features.

  10. The 3D Elevation Program: summary for Wisconsin

    Science.gov (United States)

    Carswell, William J., Jr.

    2013-01-01

    Elevation data are essential to a broad range of applications, including forest resources management, wildlife and habitat management, national security, recreation, and many others. For the State of Wisconsin, elevation data are critical for agriculture and precision farming, natural resources conservation, flood risk management, infrastructure and construction management, water supply and quality, and other business uses. Today, high-quality light detection and ranging (lidar) data are the sources for creating elevation models and other elevation datasets. Federal, State, and local agencies work in partnership to (1) replace data, on a national basis, that are (on average) 30 years old and of lower quality and (2) provide coverage where publicly accessible data do not exist. A joint goal of State and Federal partners is to acquire consistent, statewide coverage to support existing and emerging applications enabled by lidar data. The new 3D Elevation Program (3DEP) initiative, managed by the U.S. Geological Survey (USGS), responds to the growing need for high-quality topographic data and a wide range of other three-dimensional representations of the Nation’s natural and constructed features.

  11. The 3D Elevation Program: summary for California

    Science.gov (United States)

    Carswell, William J., Jr.

    2013-01-01

    Elevation data are essential to a broad range of applications, including forest resources management, wildlife and habitat management, national security, recreation, and many others. For the State of California, elevation data are critical for infrastructure and construction management; natural resources conservation; flood risk management; wildfire management, planning, and response; agriculture and precision farming; geologic resource assessment and hazard mitigation; and other business uses. Today, high-quality light detection and ranging (lidar) data are the sources for creating elevation models and other elevation datasets. Federal, State, and local agencies work in partnership to (1) replace data, on a national basis, that are (on average) 30 years old and of lower quality and (2) provide coverage where publicly accessible data do not exist. A joint goal of State and Federal partners is to acquire consistent, statewide coverage to support existing and emerging applications enabled by lidar data. The new 3D Elevation Program (3DEP) initiative, managed by the U.S. Geological Survey (USGS), responds to the growing need for high-quality topographic data and a wide range of other three-dimensional representations of the Nation’s natural and constructed features.

  12. Continuous parallel ESI-MS analysis of reactions carried out in a bespoke 3D printed device

    Directory of Open Access Journals (Sweden)

    Jennifer S. Mathieson

    2013-04-01

    Full Text Available Herein, we present an approach for the rapid, straightforward and economical preparation of a tailored reactor device using three-dimensional (3D) printing, which can be directly linked to a high-resolution electrospray ionisation mass spectrometer (ESI-MS) for real-time, in-line observations. To highlight the potential of the setup, supramolecular coordination chemistry was carried out in the device, with the product of the reactions being recorded continuously and in parallel by ESI-MS. Utilising in-house-programmed computer control, the reactant flow rates and order were carefully controlled and varied, with the changes in the pump inlets being mirrored by the recorded ESI-MS spectra.

  13. Parallel programming with Python

    CERN Document Server

    Palach, Jan

    2014-01-01

    A fast, easy-to-follow and clear tutorial to help you develop Parallel computing systems using Python. Along with explaining the fundamentals, the book will also introduce you to slightly advanced concepts and will help you in implementing these techniques in the real world. If you are an experienced Python programmer and are willing to utilize the available computing resources by parallelizing applications in a simple way, then this book is for you. You are required to have a basic knowledge of Python development to get the most of this book.

  14. Practical parallel programming

    CERN Document Server

    Bauer, Barr E

    2014-01-01

    This is the book that will teach programmers to write faster, more efficient code for parallel processors. The reader is introduced to a vast array of procedures and paradigms on which actual coding may be based. Examples and real-life simulations using these devices are presented in C and FORTRAN.

  15. Parallel computing simulation of electrical excitation and conduction in the 3D human heart.

    Science.gov (United States)

    Di Yu; Dongping Du; Hui Yang; Yicheng Tu

    2014-01-01

    A correctly beating heart is important to ensure adequate circulation of blood throughout the body. Normal heart rhythm is produced by the orchestrated conduction of electrical signals throughout the heart. Cardiac electrical activity is the result of a series of complex biochemical-mechanical reactions, which involve the transportation and bio-distribution of ionic flows through a variety of biological ion channels. Cardiac arrhythmias are caused by the direct alteration of ion channel activity that results in changes in the action potential (AP) waveform. In this work, we developed a whole-heart simulation model with the use of massively parallel computing with GPGPU and OpenGL. The simulation algorithm was implemented in several different versions for the purpose of comparison, including one conventional CPU version and two GPU versions based on the Nvidia CUDA platform. OpenGL was utilized for the visualization/interaction platform because it is open source, lightweight and universally supported by various operating systems. The experimental results show that the GPU-based simulation outperforms the conventional CPU-based approach and significantly improves the speed of simulation. By adopting modern computer architecture, the present investigation enables real-time simulation and visualization of electrical excitation and conduction in the large and complicated 3D geometry of a real-world human heart.
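
    The structure of such a simulation, reaction kinetics per cell plus diffusive coupling between cells, can be sketched in 1D with a simple excitable model. The FitzHugh-Nagumo dynamics, grid, parameters and stimulus below are illustrative stand-ins for the paper's 3D ionic model, chosen only to show the explicit time-stepping pattern that vectorizes well on GPUs.

```python
import numpy as np

nx, dt, dx, diff = 200, 0.05, 1.0, 1.0
a, b, eps = 0.7, 0.8, 0.08
v = -1.2 * np.ones(nx)              # membrane potential, near rest
w = -0.62 * np.ones(nx)             # recovery variable, near rest

fired_far = False                   # did excitation reach a distant cell?
for step in range(8000):
    lap = np.zeros(nx)
    lap[1:-1] = v[:-2] - 2.0 * v[1:-1] + v[2:]   # interior Laplacian
    stim = np.zeros(nx)
    if step < 100:
        stim[:10] = 1.0                           # brief stimulus at one end
    v = v + dt * (diff * lap / dx**2 + v - v**3 / 3.0 - w + stim)
    w = w + dt * eps * (v + a - b * w)
    fired_far = fired_far or bool(v[100] > 0.0)
```

    Every array operation above applies uniformly across cells, which is precisely the data-parallel shape that maps onto CUDA threads in the GPU versions.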

  16. Development of a Stereo Vision Measurement System for a 3D Three-Axial Pneumatic Parallel Mechanism Robot Arm

    OpenAIRE

    Chien-Lun Hou; Hao-Ting Lin; Mao-Hsiung Chiang

    2011-01-01

    In this paper, a stereo vision 3D position measurement system for a three-axial pneumatic parallel mechanism robot arm is presented. The stereo vision 3D position measurement system aims to measure the 3D trajectories of the end-effector of the robot arm. To track the end-effector of the robot arm, the circle detection algorithm is used to detect the desired target and the SAD algorithm is used to track the moving target and to search the corresponding target location along the conjugate epip...
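
    The sum-of-absolute-differences (SAD) block-matching idea used for tracking can be sketched directly: slide a template over the image and keep the position with the smallest absolute mismatch. The synthetic image and template below are illustrative only.

```python
import numpy as np

img = np.zeros((40, 40))
img[22:27, 13:18] = 1.0                 # bright 5x5 target at row 22, col 13
template = np.ones((5, 5))

h, w = template.shape
best, best_pos = np.inf, None
for r in range(img.shape[0] - h + 1):
    for c in range(img.shape[1] - w + 1):
        # SAD score: sum of absolute pixel differences over the window
        sad = np.abs(img[r:r + h, c:c + w] - template).sum()
        if sad < best:
            best, best_pos = sad, (r, c)
```

    In the stereo setup, the epipolar constraint restricts this search to a single line in the second image, which keeps the matching cheap enough for real-time tracking.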

  17. Teaching Parallel Programming Using Java

    OpenAIRE

    Shafi, Aamir; Akhtar, Aleem; Javed, Ansar; Carpenter, Bryan

    2014-01-01

    This paper presents an overview of the "Applied Parallel Computing" course taught to final year Software Engineering undergraduate students in Spring 2014 at NUST, Pakistan. The main objective of the course was to introduce practical parallel programming tools and techniques for shared and distributed memory concurrent systems. A unique aspect of the course was that Java was used as the principal programming language. The course was divided into three sections. The first section covered paral...

  18. Explicit Parallel Programming: System Description

    OpenAIRE

    Gamble, Jim; Ribbens, Calvin J.

    1991-01-01

    The implementation of the Explicit Parallel Programming (EPP) system is described. EPP is a prototype implementation of a language for writing parallel programs for shared memory multiprocessors. EPP may be viewed as a coordination language, since it is used to define the sequencing or ordering of various tasks, while the tasks themselves are defined in some other compilable language. The two main components of the EPP system---a compiler and an executive---are described in this report. An...

  19. Explicit Parallel Programming: User's Guide

    OpenAIRE

    Gamble, Jim; Ribbens, Calvin J.

    1991-01-01

    The Explicit Parallel Programming (EPP) language is defined and illustrated with several examples. EPP is a prototype implementation of a language for writing parallel programs for shared memory multiprocessors. EPP may be viewed as a coordination language, since it is used to define the sequencing or ordering of various tasks, while the tasks themselves are defined in some other compilable language. The prototype described here requires FORTRAN as the base language, but there is no inheren...

  20. Influence of intrinsic and extrinsic forces on 3D stress distribution using CUDA programming

    Science.gov (United States)

    Räss, Ludovic; Omlin, Samuel; Podladchikov, Yuri

    2013-04-01

    In order to better understand the influence of buoyancy (intrinsic) and boundary (extrinsic) forces in a nonlinear rheology due to a power-law fluid, some basics need to be explored through 3D numerical calculation. As a first approach, the already studied Stokes setup of a rising sphere will be used to calibrate the 3D model. Far-field horizontal tectonic stress is applied to the sphere, which generates a buoyancy-driven vertical acceleration. This simple and well-known setup allows some benchmarking performed through systematic runs. The relative importance of intrinsic and extrinsic forces in producing the wide variety of rates and styles of deformation, including the absence of deformation, and in generating 3D stress patterns will be determined. The relation between vertical motion and the power-law exponent will also be explored. The goal of these investigations is to run models having topography and density structure from geophysical imaging as input, and the 3D stress field as output. The stress distribution in the Swiss Alps and Plateau and its implication for risk analysis is one of the perspectives for this research; in fact, the proximity of the stress state to failure is fundamental for risk assessment. The sensitivity of this to an accurate topography representation can then be evaluated. The developed 3D numerical codes, tuned for mid-sized clusters, need to be optimized, especially when running at good resolution in full 3D. Therefore, two widely used computing platforms, MATLAB and FORTRAN 90, are explored, starting with an easily adaptable and as short as possible MATLAB code, which is then upgraded in order to reach higher performance in simulation times and resolution. A significant speedup using the rising NVIDIA CUDA technology and resources is also possible. Programming in C-CUDA, creating some synchronization features, and comparing the results with previous runs helps us to investigate the new speedup possibilities allowed through GPU parallel computing. These codes
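    The kind of explicit stencil update that ports well from MATLAB to C-CUDA can be illustrated with a minimal sketch (a hypothetical example, not the authors' code): every cell is updated independently from its neighbours, which is what maps each cell onto one GPU thread.

```python
def jacobi_step(u, f, h):
    """One explicit stencil sweep (Jacobi relaxation of a Poisson problem).
    The per-cell independence of this update is what makes the kernel map
    naturally onto one CUDA thread per grid cell."""
    ny, nx = len(u), len(u[0])
    out = [row[:] for row in u]          # boundaries are left untouched
    for i in range(1, ny - 1):
        for j in range(1, nx - 1):
            out[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                + u[i][j - 1] + u[i][j + 1]
                                - h * h * f[i][j])
    return out

# Zero initial field, unit forcing: interior cells relax to -h^2/4 after one sweep.
u = [[0.0] * 5 for _ in range(5)]
f = [[1.0] * 5 for _ in range(5)]
u1 = jacobi_step(u, f, h=1.0)
```

In a CUDA port, the two nested loops become the thread grid and `out[i][j]` is computed by one thread; the double-buffered `u`/`out` pair avoids race conditions between sweeps.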

  1. Development of a 3D Parallel Mechanism Robot Arm with Three Vertical-Axial Pneumatic Actuators Combined with a Stereo Vision System

    OpenAIRE

    Hao-Ting Lin; Mao-Hsiung Chiang

    2011-01-01

    This study aimed to develop a novel 3D parallel mechanism robot driven by three vertical-axial pneumatic actuators with a stereo vision system for path tracking control. The mechanical system and the control system are the primary novel parts for developing a 3D parallel mechanism robot. In the mechanical system, a 3D parallel mechanism robot contains three serial chains, a fixed base, a movable platform and a pneumatic servo system. The parallel mechanism is designed and analyzed first for ...

  2. Design and verification of an ultra-precision 3D-coordinate measuring machine with parallel drives

    International Nuclear Information System (INIS)

    An ultra-precision 3D coordinate measuring machine (CMM), the TriNano N100, has been developed. In our design, the workpiece is mounted on a 3D stage, which is driven by three parallel drives that are mutually orthogonal. The linear drives support the 3D stage using vacuum preloaded (VPL) air bearings, whereby each drive determines the position of the 3D stage along one translation direction only. An exactly constrained design results in highly repeatable machine behavior. Furthermore, the machine complies with the Abbé principle over its full measurement range and the application of parallel drives allows for excellent dynamic behavior. The design allows a 3D measurement uncertainty of 100 nanometers in a measurement range of 200 cubic centimeters. Verification measurements using a Gannen XP 3D tactile probing system on a spherical artifact show a standard deviation in single point repeatability of around 2 nm in each direction. (paper)

  3. TOMO3D: 3-D joint refraction and reflection traveltime tomography parallel code for active-source seismic data—synthetic test

    Science.gov (United States)

    Meléndez, A.; Korenaga, J.; Sallarès, V.; Miniussi, A.; Ranero, C. R.

    2015-10-01

    We present a new 3-D traveltime tomography code (TOMO3D) for the modelling of active-source seismic data that uses the arrival times of both refracted and reflected seismic phases to derive the velocity distribution and the geometry of reflecting boundaries in the subsurface. This code is based on its popular 2-D version TOMO2D from which it inherited the methods to solve the forward and inverse problems. The traveltime calculations are done using a hybrid ray-tracing technique combining the graph and bending methods. The LSQR algorithm is used to perform the iterative regularized inversion to improve the initial velocity and depth models. In order to cope with an increased computational demand due to the incorporation of the third dimension, the forward problem solver, which takes most of the run time (˜90 per cent in the test presented here), has been parallelized with a combination of multi-processing and message passing interface standards. This parallelization distributes the ray-tracing and traveltime calculations among available computational resources. The code's performance is illustrated with a realistic synthetic example, including a checkerboard anomaly and two reflectors, which simulates the geometry of a subduction zone. The code is designed to invert for a single reflector at a time. A data-driven layer-stripping strategy is proposed for cases involving multiple reflectors, and it is tested for the successive inversion of the two reflectors. Layers are bound by consecutive reflectors, and an initial velocity model for each inversion step incorporates the results from previous steps. This strategy poses simpler inversion problems at each step, allowing the recovery of strong velocity discontinuities that would otherwise be smoothened.
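    The graph method used in the hybrid ray tracing above is, at heart, a shortest-path computation over a gridded slowness model. As a rough illustration (a hypothetical sketch, not the TOMO3D implementation), a first-arrival traveltime field can be computed with Dijkstra's algorithm:

```python
import heapq

def first_arrival_times(slowness, source):
    """Graph-method traveltimes: Dijkstra over grid nodes, where the cost of
    an edge is the segment length times the average slowness of its endpoints
    (grid spacing h = 1 for simplicity)."""
    ny, nx = len(slowness), len(slowness[0])
    INF = float("inf")
    t = [[INF] * nx for _ in range(ny)]
    t[source[0]][source[1]] = 0.0
    pq = [(0.0, source)]
    # 8-connected stencil: axis and diagonal neighbours
    nbrs = [(-1, 0), (1, 0), (0, -1), (0, 1), (-1, -1), (-1, 1), (1, -1), (1, 1)]
    while pq:
        tt, (i, j) = heapq.heappop(pq)
        if tt > t[i][j]:
            continue                     # stale heap entry
        for di, dj in nbrs:
            ni, nj = i + di, j + dj
            if 0 <= ni < ny and 0 <= nj < nx:
                dist = (di * di + dj * dj) ** 0.5
                cost = dist * 0.5 * (slowness[i][j] + slowness[ni][nj])
                if tt + cost < t[ni][nj]:
                    t[ni][nj] = tt + cost
                    heapq.heappush(pq, (t[ni][nj], (ni, nj)))
    return t

# Uniform slowness of 1: traveltime approximates distance along grid edges.
t = first_arrival_times([[1.0] * 5 for _ in range(5)], (0, 0))
```

The bending step mentioned in the abstract then refines these angularly coarse graph paths into true rays; parallelizing this solver amounts to distributing independent source/receiver computations, as the paper describes.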

  4. Development of a Stereo Vision Measurement System for a 3D Three-Axial Pneumatic Parallel Mechanism Robot Arm

    Directory of Open Access Journals (Sweden)

    Chien-Lun Hou

    2011-02-01

    In this paper, a stereo vision 3D position measurement system for a three-axial pneumatic parallel mechanism robot arm is presented. The stereo vision 3D position measurement system aims to measure the 3D trajectories of the end-effector of the robot arm. To track the end-effector of the robot arm, the circle detection algorithm is used to detect the desired target and the SAD algorithm is used to track the moving target and to search the corresponding target location along the conjugate epipolar line in the stereo pair. After camera calibration, both intrinsic and extrinsic parameters of the stereo rig can be obtained, so images can be rectified according to the camera parameters. Thus, through the epipolar rectification, the stereo matching process is reduced to a horizontal search along the conjugate epipolar line. Finally, 3D trajectories of the end-effector are computed by stereo triangulation. The experimental results show that the stereo vision 3D position measurement system proposed in this paper can successfully track and measure the fifth-order polynomial trajectory and sinusoidal trajectory of the end-effector of the three-axial pneumatic parallel mechanism robot arm.
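    For a rectified stereo pair, the triangulation step described above reduces to a closed-form depth-from-disparity computation. A minimal sketch (hypothetical parameter values, not the paper's calibration):

```python
def triangulate_rectified(xl, xr, y, f, baseline, cx=0.0, cy=0.0):
    """Triangulation for a rectified stereo pair. After epipolar rectification
    the match lies on the same image row, so depth follows from the
    disparity d = xl - xr:
        Z = f * B / d,   X = (xl - cx) * Z / f,   Y = (y - cy) * Z / f
    xl, xr, y are pixel coordinates; f is focal length in pixels;
    baseline B and the returned X, Y, Z share the same length unit."""
    d = xl - xr
    if d <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    Z = f * baseline / d
    X = (xl - cx) * Z / f
    Y = (y - cy) * Z / f
    return X, Y, Z

# f = 700 px, 10 cm baseline, 70 px disparity -> 1 m depth.
X, Y, Z = triangulate_rectified(350.0, 280.0, 240.0, 700.0, 0.1, cx=320.0, cy=240.0)
```

This is why the rectification matters computationally: the SAD correspondence search becomes a 1-D scan along the row, and the 3D reconstruction per match is just the three lines above.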

  5. Development of a stereo vision measurement system for a 3D three-axial pneumatic parallel mechanism robot arm.

    Science.gov (United States)

    Chiang, Mao-Hsiung; Lin, Hao-Ting; Hou, Chien-Lun

    2011-01-01

    In this paper, a stereo vision 3D position measurement system for a three-axial pneumatic parallel mechanism robot arm is presented. The stereo vision 3D position measurement system aims to measure the 3D trajectories of the end-effector of the robot arm. To track the end-effector of the robot arm, the circle detection algorithm is used to detect the desired target and the SAD algorithm is used to track the moving target and to search the corresponding target location along the conjugate epipolar line in the stereo pair. After camera calibration, both intrinsic and extrinsic parameters of the stereo rig can be obtained, so images can be rectified according to the camera parameters. Thus, through the epipolar rectification, the stereo matching process is reduced to a horizontal search along the conjugate epipolar line. Finally, 3D trajectories of the end-effector are computed by stereo triangulation. The experimental results show that the stereo vision 3D position measurement system proposed in this paper can successfully track and measure the fifth-order polynomial trajectory and sinusoidal trajectory of the end-effector of the three-axial pneumatic parallel mechanism robot arm.

  6. Reactor Dosimetry Applications Using RAPTOR-M3G:. a New Parallel 3-D Radiation Transport Code

    Science.gov (United States)

    Longoni, Gianluca; Anderson, Stanwood L.

    2009-08-01

    The numerical solution of the Linearized Boltzmann Equation (LBE) via the Discrete Ordinates method (SN) requires extensive computational resources for large 3-D neutron and gamma transport applications due to the concurrent discretization of the angular, spatial, and energy domains. This paper will discuss the development of RAPTOR-M3G (RApid Parallel Transport Of Radiation - Multiple 3D Geometries), a new 3-D parallel radiation transport code, and its application to the calculation of ex-vessel neutron dosimetry responses in the cavity of a commercial 2-loop Pressurized Water Reactor (PWR). RAPTOR-M3G is based on domain decomposition algorithms, where the spatial and angular domains are allocated and processed on multi-processor computer architectures. As compared to traditional single-processor applications, this approach reduces the computational load as well as the memory requirement per processor, yielding an efficient solution methodology for large 3-D problems. Measured neutron dosimetry responses in the reactor cavity air gap will be compared to the RAPTOR-M3G predictions. This paper is organized as follows: Section 1 discusses the RAPTOR-M3G methodology; Section 2 describes the 2-loop PWR model and the numerical results obtained. Section 3 addresses the parallel performance of the code, and Section 4 concludes this paper with final remarks and future work.
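    The spatial-plus-angular decomposition described above can be sketched as a toy bookkeeping exercise (hypothetical function and rank counts, not the RAPTOR-M3G implementation): each processor pair owns one block of cells and one subset of discrete-ordinate directions, so per-rank memory shrinks with the processor count.

```python
from itertools import product

def decompose(n_cells, n_angles, n_spatial_ranks, n_angle_ranks):
    """Toy spatial x angular domain decomposition: rank (i, j) owns a
    contiguous block of cells and a subset of discrete-ordinate directions."""
    def split(n, p):
        # Near-equal contiguous partition of range(n) into p pieces.
        q, r = divmod(n, p)
        sizes = [q + (1 if i < r else 0) for i in range(p)]
        starts = [sum(sizes[:i]) for i in range(p)]
        return [range(s, s + z) for s, z in zip(starts, sizes)]
    cells = split(n_cells, n_spatial_ranks)
    angles = split(n_angles, n_angle_ranks)
    return {(i, j): (cells[i], angles[j])
            for i, j in product(range(n_spatial_ranks), range(n_angle_ranks))}

# 1000 cells and 48 directions over a 4 x 2 processor grid.
layout = decompose(n_cells=1000, n_angles=48, n_spatial_ranks=4, n_angle_ranks=2)
```

In the real code the spatial blocks are 3-D subdomains that exchange boundary angular fluxes during transport sweeps; the sketch only shows how the load and memory divide.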

  7. Experiences Using Hybrid MPI/OpenMP in the Real World: Parallelization of a 3D CFD Solver for Multi-Core Node Clusters

    Directory of Open Access Journals (Sweden)

    Gabriele Jost

    2010-01-01

    Today most systems in high-performance computing (HPC) feature a hierarchical hardware design: shared-memory nodes with several multi-core CPUs are connected via a network infrastructure. When parallelizing an application for these architectures it seems natural to employ a hierarchical programming model such as combining MPI and OpenMP. Nevertheless, there is the general lore that pure MPI outperforms the hybrid MPI/OpenMP approach. In this paper, we describe the hybrid MPI/OpenMP parallelization of IR3D (Incompressible Realistic 3-D code, a full-scale real-world application, which simulates the environmental effects on the evolution of vortices trailing behind control surfaces of underwater vehicles. We discuss performance, scalability and limitations of the pure MPI version of the code on a variety of hardware platforms and show how the hybrid approach can help to overcome certain limitations.

  8. Parallel Adaptive Computation of Blood Flow in a 3D ``Whole'' Body Model

    Science.gov (United States)

    Zhou, M.; Figueroa, C. A.; Taylor, C. A.; Sahni, O.; Jansen, K. E.

    2008-11-01

    Accurate numerical simulations of vascular trauma require the consideration of a larger portion of the vasculature than previously considered, due to the systemic nature of the human body's response. A patient-specific 3D model composed of 78 connected arterial branches extending from the neck to the lower legs is constructed to effectively represent the entire body. Recently developed outflow boundary conditions that appropriately represent the downstream vasculature bed which is not included in the 3D computational domain are applied at 78 outlets. In this work, the pulsatile blood flow simulations are started on a fairly uniform, unstructured mesh that is subsequently adapted using a solution-based approach to efficiently resolve the flow features. The adapted mesh contains non-uniform, anisotropic elements resulting in resolution that conforms with the physical length scales present in the problem. The effects of the mesh resolution on the flow field are studied, specifically on relevant quantities of pressure, velocity and wall shear stress.

  9. A first 3D parallel diffusion solver based on a mixed dual finite element approximation

    International Nuclear Information System (INIS)

    This paper presents a new extension of the mixed dual finite element approximation of the diffusion equation in rectangular geometry. The mixed dual formulation has been extended in order to take into account discontinuity conditions. The iterative method is based on an alternating direction method which uses the current as unknown. This method is parallelizable and has very fast convergence properties. Some results for a 3D calculation on the CRAY computer are presented. (orig.)

  10. Compensation of errors in robot machining with a parallel 3D-piezo compensation mechanism

    OpenAIRE

    Schneider, Ulrich; Drust, Manuel; Puzik, Arnold; Verl, Alexander

    2013-01-01

    This paper proposes an approach for a 3D-Piezo Compensation Mechanism unit that is capable of fast and accurate adaption of the spindle position to enhance machining by robots. The mechanical design is explained which focuses on low mass, good stiffness and high bandwidth in order to allow compensating for errors beyond the bandwidth of the robot. In addition to previous works [7] and [9], an advanced actuation design is presented enabling movements in three translational axes allowing a work...

  11. C# game programming cookbook for Unity 3D

    CERN Document Server

    Murray, Jeff W

    2014-01-01

    Contents: Making Games the Modular Way; Important Programming Concepts; Building the Core Game Framework; Controllers and Managers; Building the Core Framework Scripts; Player Structure; Game-Specific Player Controller; Dealing with Input; Player Manager; User Data Manager; Recipes: Common Components; Introduction; The Timer Class; Spawn Scripts; Set Gravity; Pretend Friction (Friction Simulation to Prevent Slipping Around); Cameras; Input Scripts; Automatic Self-Destruction Script; Automatic Object Spinner; Scene Manager; Building Player Movement Controllers; Shoot 'Em Up Spaceship; Humanoid Character; Wheeled Vehicle; Weapon Systems

  12. Parallel load balancing strategy for Volume-of-Fluid methods on 3-D unstructured meshes

    Science.gov (United States)

    Jofre, Lluís; Borrell, Ricard; Lehmkuhl, Oriol; Oliva, Assensi

    2015-02-01

    Volume-of-Fluid (VOF) is one of the methods of choice to reproduce the interface motion in the simulation of multi-fluid flows. One of its main strengths is its accuracy in capturing sharp interface geometries, although requiring for it a number of geometric calculations. Under these circumstances, achieving parallel performance on current supercomputers is a must. The main obstacle for the parallelization is that the computing costs are concentrated only in the discrete elements that lie on the interface between fluids. Consequently, if the interface is not homogeneously distributed throughout the domain, standard domain decomposition (DD) strategies lead to imbalanced workload distributions. In this paper, we present a new parallelization strategy for general unstructured VOF solvers, based on a dynamic load balancing process complementary to the underlying DD. Its parallel efficiency has been analyzed and compared to the DD one using up to 1024 CPU-cores on an Intel SandyBridge based supercomputer. The results obtained on the solution of several artificially generated test cases show a speedup of up to ∼12× with respect to the standard DD, depending on the interface size, the initial distribution and the number of parallel processes engaged. Moreover, the new parallelization strategy presented is of general purpose, therefore, it could be used to parallelize any VOF solver without requiring changes on the coupled flow solver. Finally, note that although designed for the VOF method, our approach could be easily adapted to other interface-capturing methods, such as the Level-Set, which may present similar workload imbalances.
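    The core of the load-balancing idea above, redistributing only the interface work while leaving the underlying domain decomposition untouched, can be sketched as follows (a hypothetical toy, not the paper's algorithm): given how many interface cells each rank owns, compute a minimal set of cell transfers that evens out the load.

```python
def balance_interface_cells(owned_interface_counts):
    """Plan transfers (from_rank, to_rank, n_cells) so that every rank ends
    up with (approximately) the average number of interface cells. Only the
    geometric VOF work migrates; the flow solver's decomposition is untouched."""
    p = len(owned_interface_counts)
    total = sum(owned_interface_counts)
    target = [total // p + (1 if i < total % p else 0) for i in range(p)]
    surplus = [c - t for c, t in zip(owned_interface_counts, target)]
    donors = [(i, s) for i, s in enumerate(surplus) if s > 0]
    takers = [(i, -s) for i, s in enumerate(surplus) if s < 0]
    transfers, di, ti = [], 0, 0
    while di < len(donors) and ti < len(takers):
        d, ds = donors[di]
        t, ts = takers[ti]
        n = min(ds, ts)
        transfers.append((d, t, n))
        donors[di] = (d, ds - n)
        takers[ti] = (t, ts - n)
        if donors[di][1] == 0:
            di += 1
        if takers[ti][1] == 0:
            ti += 1
    return transfers

# A rising-bubble-like case: the interface is concentrated on rank 0.
moves = balance_interface_cells([900, 60, 20, 20])
```

A real implementation must also ship the cell geometry with each transfer and weigh communication cost against the imbalance, which is where the reported ~12x speedup ceiling comes from.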

  13. 3D frequency modeling of elastic seismic wave propagation via a structured massively parallel direct Helmholtz solver

    Science.gov (United States)

    Wang, S.; De Hoop, M. V.; Xia, J.; Li, X.

    2011-12-01

    We consider the modeling of elastic seismic wave propagation on a rectangular domain via the discretization and solution of the inhomogeneous coupled Helmholtz equation in 3D, by exploiting a parallel multifrontal sparse direct solver equipped with Hierarchically Semi-Separable (HSS) structure to reduce the computational complexity and storage. In particular, we are concerned with solving this equation on a large domain, for a large number of different forcing terms in the context of seismic problems in general, and modeling in particular. We resort to a parsimonious mixed grid finite differences scheme for discretizing the Helmholtz operator and Perfectly Matched Layer boundaries, resulting in a non-Hermitian matrix. We make use of a nested dissection based domain decomposition, and introduce an approximate direct solver by developing a parallel HSS matrix compression, factorization, and solution approach. We cast our massive parallelization in the framework of the multifrontal method. The assembly tree is partitioned into local trees and a global tree. The local trees are eliminated independently in each processor, while the global tree is eliminated through massive communication. The solver for the inhomogeneous equation is a parallel hybrid between multifrontal and HSS structure. The computational complexity associated with the factorization is almost linear with the size of the Helmholtz matrix. Our numerical approach can be compared with the spectral element method in 3D seismic applications.

  14. A Framework for Parallel Programming in Java

    OpenAIRE

    Launay, Pascale; Pazat, Jean-Louis

    1997-01-01

    To ease the task of programming parallel and distributed applications, the Do! project aims at the automatic generation of distributed code from multi-threaded Java programs. We provide a parallel programming model, embedded in a framework that constraints parallelism without any extension to the Java language. This framework is described here and is used as a basis to generate distributed programs.

  15. Simulation of the 3D viscoelastic free surface flow by a parallel corrected particle scheme

    Science.gov (United States)

    Jin-Lian, Ren; Tao, Jiang

    2016-02-01

    In this work, the behavior of the three-dimensional (3D) jet coiling based on the viscoelastic Oldroyd-B model is investigated by a corrected particle scheme, which is named the smoothed particle hydrodynamics with corrected symmetric kernel gradient and shifting particle technique (SPH_CS_SP) method. The accuracy and stability of SPH_CS_SP method is first tested by solving Poiseuille flow and Taylor-Green flow. Then the capacity for the SPH_CS_SP method to solve the viscoelastic fluid is verified by the polymer flow through a periodic array of cylinders. Moreover, the convergence of the SPH_CS_SP method is also investigated. Finally, the proposed method is further applied to the 3D viscoelastic jet coiling problem, and the influences of macroscopic parameters on the jet coiling are discussed. The numerical results show that the SPH_CS_SP method has higher accuracy and better stability than the traditional SPH method and other corrected SPH method, and can improve the tensile instability. Project supported by the Natural Science Foundation of Jiangsu Province, China (Grant Nos. BK20130436 and BK20150436) and the Natural Science Foundation of the Higher Education Institutions of Jiangsu Province, China (Grant No. 15KJB110025).

  16. Contributions to computational stereology and parallel programming

    DEFF Research Database (Denmark)

    Rasmusson, Allan

    ...between computer science and stereology, we try to overcome these problems by developing new virtual stereological probes and virtual tissue sections. A concrete result is the development of a new virtual 3D probe, the spatial rotator, which was found to have lower variance than the widely used planar ... rotator, even without the need for isotropic sections. To meet the need for computational power to perform image restoration of virtual tissue sections, parallel programming on GPUs has also been part of the project. This has led to a significant change in paradigm for a previously developed surgical ... simulator and a memory-efficient GPU implementation of connected components labeling. This was furthermore extended to produce signed distance fields and Voronoi diagrams, all with real-time performance. It has during the course of the project been realized that many disciplines within computer science...

  17. 3D seismic modeling and reverse‐time migration with the parallel Fourier method using non‐blocking collective communications

    KAUST Repository

    Chu, Chunlei

    2009-01-01

    The major performance bottleneck of the parallel Fourier method on distributed memory systems is the network communication cost. In this study, we investigate the potential of using non‐blocking all‐to‐all communications to solve this problem by overlapping computation and communication. We present the runtime comparison of a 3D seismic modeling problem with the Fourier method using non‐blocking and blocking calls, respectively, on a Linux cluster. The data demonstrate that a performance improvement of up to 40% can be achieved by simply changing blocking all‐to‐all communication calls to non‐blocking ones to introduce the overlapping capability. A 3D reverse‐time migration result is also presented as an extension to the modeling work based on non‐blocking collective communications.
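    The overlap pattern described above (post the all-to-all, compute while it is in flight, then wait) can be mimicked without MPI using a background thread; the sketch below is a hypothetical stand-in in which `exchange` plays the role of `MPI_Ialltoall` plus `MPI_Wait`, and the "network" is a sleep.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def exchange(data):
    """Stand-in for an all-to-all: transposes the per-rank send buffers
    (the data movement MPI_Alltoall performs) after a simulated network delay."""
    time.sleep(0.05)
    p = len(data)
    return [[data[src][dst] for src in range(p)] for dst in range(p)]

def step_with_overlap(blocks, local_work):
    """Non-blocking collective pattern: start the exchange, do independent
    local computation while it is in flight, then wait for the result."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(exchange, blocks)   # ~ MPI_Ialltoall
        local = [local_work(b) for b in blocks]   # overlapped computation
        received = pending.result()               # ~ MPI_Wait
    return local, received

# Three "ranks", each holding one row of send buffers.
blocks = [[10 * r + c for c in range(3)] for r in range(3)]
local, received = step_with_overlap(blocks, lambda row: sum(row))
```

In the Fourier method the transpose is exactly the expensive step between the local FFT passes, which is why hiding it behind computation yielded the reported improvement of up to 40%.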

  18. High-Performance Computation of Distributed-Memory Parallel 3D Voronoi and Delaunay Tessellation

    Energy Technology Data Exchange (ETDEWEB)

    Peterka, Tom; Morozov, Dmitriy; Phillips, Carolyn

    2014-11-14

    Computing a Voronoi or Delaunay tessellation from a set of points is a core part of the analysis of many simulated and measured datasets: N-body simulations, molecular dynamics codes, and LIDAR point clouds are just a few examples. Such computational geometry methods are common in data analysis and visualization; but as the scale of simulations and observations surpasses billions of particles, the existing serial and shared-memory algorithms no longer suffice. A distributed-memory scalable parallel algorithm is the only feasible approach. The primary contribution of this paper is a new parallel Delaunay and Voronoi tessellation algorithm that automatically determines which neighbor points need to be exchanged among the subdomains of a spatial decomposition. Other contributions include periodic and wall boundary conditions, comparison of our method using two popular serial libraries, and application to numerous science datasets.
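    At its smallest scale, the tessellation being parallelized above reduces to nearest-site assignment; a brute-force sketch (hypothetical, and far from the scalable algorithm in the paper) makes the object concrete:

```python
def voronoi_owner(sites, points):
    """Brute-force Voronoi assignment: each query point belongs to the cell
    of its nearest site. Real tessellation codes compute the cell geometry
    itself, and the parallel algorithm decides which sites near subdomain
    walls must be exchanged so every local cell is globally correct."""
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(sites)), key=lambda s: d2(sites[s], p)) for p in points]

owners = voronoi_owner([(0.0, 0.0), (1.0, 0.0)],
                       [(0.1, 0.2), (0.9, -0.1), (0.4, 0.0)])
```

The paper's contribution is precisely avoiding this all-pairs view: each subdomain only exchanges the neighbor points that can influence cells near its boundary.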

  19. Calibration of 3-d.o.f. Translational Parallel Manipulators Using Leg Observations

    CERN Document Server

    Pashkevich, Anatoly; Wenger, Philippe; Gomolitsky, Roman

    2009-01-01

    The paper proposes a novel approach for the geometrical model calibration of quasi-isotropic parallel kinematic mechanisms of the Orthoglide family. It is based on the observations of the manipulator leg parallelism during motions between the specific test postures and employs a low-cost measuring system composed of standard comparator indicators attached to the universal magnetic stands. They are sequentially used for measuring the deviation of the relevant leg location while the manipulator moves the TCP along the Cartesian axes. Using the measured differences, the developed algorithm estimates the joint offsets and the leg lengths that are treated as the most essential parameters. Validity of the proposed calibration technique is confirmed by the experimental results.

  20. Multiscale simulation of mixing processes using 3D-parallel, fluid-structure interaction techniques

    OpenAIRE

    Valette, Rudy; Vergnes, Bruno; Coupez, Thierry

    2008-01-01

    This work focuses on the development of a general finite element code, called Ximex®, devoted to the three-dimensional direct simulation of mixing processes of complex fluids. The code is based on a simplified fictitious domain method coupled with a "level-set" approach to represent the rigid moving boundaries, such as screws and rotors, as well as free surfaces. These techniques, combined with the use of parallel computing, allow computing the time-dependent flow of...

  1. Comparison of parallel and spiral tagged MRI geometries in estimation of 3-D myocardial strains

    Science.gov (United States)

    Tustison, Nicholas J.; Amini, Amir A.

    2005-04-01

    Research involving the quantification of left ventricular myocardial strain from cardiac tagged magnetic resonance imaging (MRI) is extensive. Two different imaging geometries are commonly employed by these methodologies to extract longitudinal deformation. We refer to these imaging geometries as either parallel or spiral. In the spiral configuration, four long-axis tagged image slices which intersect along the long-axis of the left ventricle are collected and in the parallel configuration, contiguous tagged long-axis images spanning the width of the left ventricle between the lateral wall and the septum are collected. Despite the number of methodologies using either or both imaging configurations, to date, no comparison has been made to determine which geometry results in more accurate estimation of strains. Using previously published work in which left ventricular myocardial strain is calculated from 4-D anatomical NURBS models, we compare the strain calculated from these two imaging geometries in both simulated tagged MR images for which ground truth strain is available as well as in in vivo data. It is shown that strains calculated using the parallel imaging protocol are more accurate than that calculated using spiral protocol.

  2. Sparse Approximations of the Schur Complement for Parallel Algebraic Hybrid Solvers in 3D

    Institute of Scientific and Technical Information of China (English)

    L.Giraud; A.Haidar; Y.Saad

    2010-01-01

    In this paper we study the computational performance of variants of an algebraic additive Schwarz preconditioner for the Schur complement for the solution of large sparse linear systems. In earlier works, the local Schur complements were computed exactly using a sparse direct solver. The robustness of the preconditioner comes at the price of this memory and time intensive computation that is the main bottleneck of the approach for tackling huge problems. In this work we investigate the use of sparse approximation of the dense local Schur complements. These approximations are computed using a partial incomplete LU factorization. Such a numerical calculation is the core of the multi-level incomplete factorization such as the one implemented in pARMS. The numerical and computing performance of the new numerical scheme is illustrated on a set of large 3D convection-diffusion problems; preliminary experiments on linear systems arising from structural mechanics are also reported.
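    The object being approximated above is the local Schur complement S = A_bb - A_bi * inv(A_ii) * A_ib, where "i" indexes a subdomain's interior unknowns and "b" its interface unknowns. A minimal exact computation on a tiny dense system (a hypothetical sketch; the paper replaces the exact solve with a partial ILU) looks like this:

```python
def solve(A, B):
    """Solve A X = B for small dense systems by Gaussian elimination with
    partial pivoting; B holds multiple right-hand sides as columns."""
    n, m = len(A), len(B[0])
    M = [row[:] + rhs[:] for row, rhs in zip(A, B)]   # augmented matrix
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[piv] = M[piv], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + m):
                M[i][j] -= f * M[k][j]
    X = [[0.0] * m for _ in range(n)]
    for i in reversed(range(n)):
        for j in range(m):
            s = M[i][n + j] - sum(M[i][k] * X[k][j] for k in range(i + 1, n))
            X[i][j] = s / M[i][i]
    return X

def schur_complement(Aii, Aib, Abi, Abb):
    """S = Abb - Abi * inv(Aii) * Aib: the local contribution a subdomain
    assembles for the interface system."""
    X = solve(Aii, Aib)                               # inv(Aii) * Aib
    return [[Abb[i][j] - sum(Abi[i][k] * X[k][j] for k in range(len(Aii)))
             for j in range(len(Abb[0]))] for i in range(len(Abb))]

# 1D Laplacian ordered interior-first; the interface unknown couples to the
# last interior node.
S = schur_complement(Aii=[[2.0, -1.0], [-1.0, 2.0]],
                     Aib=[[0.0], [-1.0]],
                     Abi=[[0.0, -1.0]],
                     Abb=[[2.0]])
```

The memory bottleneck the paper targets is visible even here: S is dense although the original blocks are sparse, which is what motivates approximating it via incomplete factorization.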

  3. Knowledge rule base for the beam optics program TRACE 3-D

    International Nuclear Information System (INIS)

    An expert system type of knowledge rule base has been developed for the input parameters used by the particle beam transport program TRACE 3-D. The goal has been to provide the program's user with adequate on-screen information to allow him to initially set up a problem with minimal "off-line" calculations. The focus of this work has been in developing rules for the parameters which define the beam line transport elements. Ten global parameters, the particle mass and charge, beam energy, etc., are used to provide "expert" estimates of lower and upper limits for each of the transport element parameters. For example, the limits for the field strength of the quadrupole element are based on a water-cooled, iron-core electromagnet with dimensions derived from practical engineering constraints, and the upper limit for the effective length is scaled with the particle momenta so that initially parallel trajectories do not cross the axis inside the magnet. Limits for the quadrupole doublet and triplet parameters incorporate these rules and additional rules based on stable FODO lattices and bidirectional focusing requirements. The structure of the rule base is outlined and examples for the quadrupole singlet, doublet and triplet are described. The rule base has been implemented within the Shell for Particle Accelerator Related Codes (SPARC) graphical user interface (GUI)

  4. Massively Parallel Linear Stability Analysis with P_ARPACK for 3D Fluid Flow Modeled with MPSalsa

    Energy Technology Data Exchange (ETDEWEB)

    Lehoucq, R.B.; Salinger, A.G.

    1998-10-13

    We are interested in the stability of three-dimensional fluid flows to small disturbances. One computational approach is to solve a sequence of large sparse generalized eigenvalue problems for the leading modes that arise from discretizing the differential equations modeling the flow. The modes of interest are the eigenvalues of largest real part and their associated eigenvectors. We discuss our work to develop an efficient and reliable eigensolver for use by the massively parallel simulation code MPSalsa. MPSalsa allows simulation of complex 3D fluid flow, heat transfer, and mass transfer with detailed bulk fluid and surface chemical reaction kinetics.
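    The simplest member of the family of methods behind P_ARPACK is power iteration, which extracts a single dominant eigenpair; Arnoldi/ARPACK generalizes this to several leading modes and, with shift-invert transformations, to the rightmost eigenvalues the abstract targets. A hypothetical sketch:

```python
def power_iteration(A, iters=200):
    """Power iteration for the eigenvalue of largest magnitude of a small
    dense matrix A (given as nested lists)."""
    n = len(A)
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = max(abs(x) for x in w)
        v = [x / norm for x in w]
    # Rayleigh quotient gives a signed eigenvalue estimate.
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    lam = sum(v[i] * Av[i] for i in range(n)) / sum(v[i] * v[i] for i in range(n))
    return lam, v

# Dominant eigenvalue of [[2, 1], [1, 2]] is 3, eigenvector (1, 1).
lam, v = power_iteration([[2.0, 1.0], [1.0, 2.0]])
```

For the generalized, non-symmetric problems from flow discretizations, each "apply A" above becomes a sparse matrix-vector product plus a linear solve, which is exactly the operation MPSalsa supplies to the eigensolver in parallel.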

  5. A new ray-tracing scheme for 3D diffuse radiation transfer on highly parallel architectures

    OpenAIRE

    Tanaka, Satoshi; Yoshikawa, Kohji; Okamoto, Takashi; HASEGAWA, Kenji

    2014-01-01

    We present a new numerical scheme to solve the transfer of diffuse radiation on three-dimensional mesh grids which is efficient on processors with highly parallel architecture such as recently popular GPUs and CPUs with multi- and many-core architectures. The scheme is based on the ray-tracing method and the computational cost is proportional to $N_{\rm m}^{5/3}$ where $N_{\rm m}$ is the number of mesh grids, and is devised to compute the radiation transfer along each light-ray completely in ...

  6. 3D parallel-detection microwave tomography for clinical breast imaging

    Energy Technology Data Exchange (ETDEWEB)

    Epstein, N. R., E-mail: nepstein@ucalgary.ca [Schulich School of Engineering, University of Calgary, 2500 University Dr. NW, Calgary, Alberta T2N 1N4 (Canada); Meaney, P. M. [Thayer School of Engineering, Dartmouth College, 14 Engineering Dr., Hanover, New Hampshire 03755 (United States); Paulsen, K. D. [Thayer School of Engineering, Dartmouth College, 14 Engineering Dr., Hanover, New Hampshire 03755 (United States); Department of Radiology, Geisel School of Medicine, Dartmouth College, Hanover, New Hampshire 03755 (United States); Norris Cotton Cancer Center, Dartmouth Hitchcock Medical Center, Lebanon, New Hampshire 03756 (United States); Advanced Surgical Center, Dartmouth Hitchcock Medical Center, Lebanon, New Hampshire 03756 (United States)

    2014-12-15

    A biomedical microwave tomography system with 3D-imaging capabilities has been constructed and translated to the clinic. Updates to the hardware and reconfiguration of the electronic-network layouts in a more compartmentalized construct have streamlined system packaging. Upgrades to the data acquisition and microwave components have increased data-acquisition speeds and improved system performance. By incorporating analog-to-digital boards that accommodate the linear amplification and dynamic-range coverage our system requires, a complete set of data (for a fixed array position at a single frequency) is now acquired in 5.8 s. Replacement of key components (e.g., switches and power dividers) by devices with improved operational bandwidths has enhanced system response over a wider frequency range. High-integrity, low-power signals are routinely measured down to −130 dBm for frequencies ranging from 500 to 2300 MHz. Adequate inter-channel isolation has been maintained, and a dynamic range >110 dB has been achieved for the full operating frequency range (500–2900 MHz). For our primary band of interest, the associated measurement deviations are less than 0.33% and 0.5° for signal amplitude and phase values, respectively. A modified monopole antenna array (composed of two interwoven eight-element sub-arrays), in conjunction with an updated motion-control system capable of independently moving the sub-arrays to various in-plane and cross-plane positions within the illumination chamber, has been configured in the new design for full volumetric data acquisition. Signal-to-noise ratios (SNRs) are more than adequate for all transmit/receive antenna pairs over the full frequency range and for the variety of in-plane and cross-plane configurations. For proximal receivers, in-plane SNRs greater than 80 dB are observed up to 2900 MHz, while cross-plane SNRs greater than 80 dB are seen for 6 cm sub-array spacing (for frequencies up to 1500 MHz). We demonstrate accurate

  7. Refinement of Parallel and Reactive Programs

    OpenAIRE

    Back, R.J.R.

    1992-01-01

    We show how to apply the refinement calculus to stepwise refinement of parallel and reactive programs. We use action systems as our basic program model. Action systems are sequential programs which can be implemented in a parallel fashion. Hence refinement calculus methods, originally developed for sequential programs, carry over to the derivation of parallel programs. Refinement of reactive programs is handled by data refinement techniques originally developed for the sequential refinement c...

  8. The DANTE Boltzmann transport solver: An unstructured mesh, 3-D, spherical harmonics algorithm compatible with parallel computer architectures

    International Nuclear Information System (INIS)

    A spherical harmonics research code (DANTE) has been developed which is compatible with parallel computer architectures. DANTE provides 3-D, multi-material, deterministic transport capabilities using an arbitrary finite element mesh. The linearized Boltzmann transport equation is solved in a second order self-adjoint form utilizing a Galerkin finite element spatial differencing scheme. The core solver utilizes a preconditioned conjugate gradient algorithm. Other distinguishing features of the code include options for discrete-ordinates and simplified spherical harmonics angular differencing, an exact Marshak boundary treatment for arbitrarily oriented boundary faces, in-line matrix construction techniques to minimize memory consumption, and an effective diffusion based preconditioner for scattering dominated problems. Algorithm efficiency is demonstrated for a massively parallel SIMD architecture (CM-5), and compatibility with MPP multiprocessor platforms or workstation clusters is anticipated
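
The core solver named above is a preconditioned conjugate gradient method. As a minimal sketch of that idea (not DANTE's actual implementation, which uses a diffusion-based preconditioner on finite element matrices; here a simple Jacobi preconditioner on a 1D diffusion-like system):

```python
import numpy as np

def preconditioned_cg(A, b, M_inv_diag, tol=1e-10, max_iter=500):
    """Jacobi-preconditioned conjugate gradient for a symmetric
    positive-definite matrix A. M_inv_diag holds 1/diag(A)."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv_diag * r          # apply the (diagonal) preconditioner
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv_diag * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# Example: 1D diffusion-like SPD system (tridiagonal Laplacian)
n = 50
A = np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)
b = np.ones(n)
x = preconditioned_cg(A, b, 1.0 / np.diag(A))
print(np.linalg.norm(A @ x - b))  # small residual
```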

  9. A 3D hybrid grid generation technique and a multigrid/parallel algorithm based on anisotropic agglomeration approach

    Institute of Scientific and Technical Information of China (English)

    Zhang Laiping; Zhao Zhong; Chang Xinghua; He Xin

    2013-01-01

    A hybrid grid generation technique and a multigrid/parallel algorithm are presented in this paper for turbulent flow simulations over three-dimensional (3D) complex geometries. The hybrid grid generation technique is based on an agglomeration method for anisotropic tetrahedrons. First, the complex computational domain is covered by pure tetrahedral grids, in which anisotropic tetrahedrons are used to discretize the boundary layer and isotropic tetrahedrons the outer field. Then, the anisotropic tetrahedrons in the boundary layer are agglomerated to generate prismatic grids. The agglomeration method improves the grid quality in the boundary layer and reduces the cell count, enhancing numerical accuracy and efficiency. To accelerate convergence, a multigrid/parallel algorithm is also developed, likewise based on the anisotropic agglomeration approach. The numerical results demonstrate the excellent accelerating capability of this multigrid method.

  10. Introducing zeus-mp: a 3d, parallel, multiphysics code for astrophysical fluid dynamics

    Directory of Open Access Journals (Sweden)

    Michael L. Norman

    2000-01-01

    Full Text Available We describe ZEUS-MP: a multi-physics, massively parallel, message-passing code for three-dimensional astrophysical fluid dynamics simulations. ZEUS-MP is the successor to the ZEUS-2D and ZEUS-3D codes developed and disseminated by NCSA's Laboratory for Computational Astrophysics (lca.ncsa.uiuc.edu). Version V1.0, released 1/1/2000, contains the following modules: ideal hydrodynamics, ideal MHD, and self-gravity. Upcoming versions will add flux-limited radiation diffusion, heat conduction, a two-temperature plasma, and heating and cooling functions. The covariant equations are advanced on a moving Eulerian mesh in Cartesian, cylindrical, and spherical polar coordinates. Parallelization is by domain decomposition and is implemented in F77 and MPI. The code is portable across a wide range of platforms, from networks of workstations to massively parallel processors. Some parallel-efficiency results are presented, along with an application to turbulent star formation.

  11. 3D interconnect architecture for high-bandwidth massively paralleled imager

    International Nuclear Information System (INIS)

    The proton radiography group at LANL is developing a fast (5×10^6 frames/s, or 5 megaframe/s) multi-frame imager for use in dynamic radiographic experiments with high-energy protons. The mega-pixel imager will acquire and process a burst of 32 frames captured at an inter-frame time of ∼200 ns. Real-time signal processing and storage requirements for entire frames of rapidly acquired pixels impose severe demands on the space available for the electronics in a standard monolithic approach. As such, a 3D arrangement of detector and circuit elements is under development. In this scheme, the readout integrated circuits (ROICs) are stacked vertically (like playing cards) into a cube configuration. Another die, a fully depleted pixel photo-diode focal plane array (FPA), is bump bonded to one of the edge surfaces formed by the resulting ROIC cube. Recently, an assembly of the proof-of-principle test cube and sensor has been completed

  12. 3D interconnect architecture for high-bandwidth massively paralleled imager

    Energy Technology Data Exchange (ETDEWEB)

    Kwiatkowski, K. E-mail: krisk@lanl.gov; Lyke, J.C.; Wojnarowski, R.J.; Beche, J.-F.; Fillion, R.; Kapusta, C.; Millaud, J.; Saia, R.; Wilke, M.D

    2003-08-21

    The proton radiography group at LANL is developing a fast (5×10^6 frames/s, or 5 megaframe/s) multi-frame imager for use in dynamic radiographic experiments with high-energy protons. The mega-pixel imager will acquire and process a burst of 32 frames captured at an inter-frame time of ∼200 ns. Real-time signal processing and storage requirements for entire frames of rapidly acquired pixels impose severe demands on the space available for the electronics in a standard monolithic approach. As such, a 3D arrangement of detector and circuit elements is under development. In this scheme, the readout integrated circuits (ROICs) are stacked vertically (like playing cards) into a cube configuration. Another die, a fully depleted pixel photo-diode focal plane array (FPA), is bump bonded to one of the edge surfaces formed by the resulting ROIC cube. Recently, an assembly of the proof-of-principle test cube and sensor has been completed.

  13. Parallel phase-shifting digital holography and its application to high-speed 3D imaging of dynamic object

    Science.gov (United States)

    Awatsuji, Yasuhiro; Xia, Peng; Wang, Yexin; Matoba, Osamu

    2016-03-01

    Digital holography is a technique for 3D measurement of objects. The technique uses an image sensor to record an interference fringe image containing the complex amplitude of the object, and numerically reconstructs that complex amplitude by computer. Parallel phase-shifting digital holography is capable of accurate 3D measurement of dynamic objects, because it can reconstruct the complex amplitude of the object, free of the superimposed undesired images, from a single hologram. The undesired images are the non-diffracted wave and the conjugate image associated with holography. In parallel phase-shifting digital holography, a hologram in which the phase of the reference wave is shifted spatially and periodically every other pixel is recorded, so that the complex amplitude of the object is obtained in a single-shot exposure. The recorded hologram is decomposed into the multiple holograms required for phase-shifting digital holography, and the complex amplitude of the object, free from the undesired images, is reconstructed from them. To validate parallel phase-shifting digital holography, a high-speed system was constructed, consisting of a Mach-Zehnder interferometer, a continuous-wave laser, and a high-speed polarization imaging camera. A phase motion picture of dynamic air flow sprayed from a nozzle was recorded at 180,000 frames per second (FPS), and a phase motion picture of dynamic air flow induced by discharge between two electrodes was recorded at 1,000,000 FPS with high voltage applied between the electrodes.
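
The decomposition step described above reduces to the standard four-step phase-shifting relations. A minimal synthetic sketch of those relations (not the authors' code; for simplicity the four shifted interferograms are generated as separate full-resolution frames rather than interleaved on a 2×2 pixel mask of a single frame):

```python
import numpy as np

# Four interferograms I(delta) = A + B*cos(phi - delta) for shifts
# delta = 0, pi/2, pi, 3*pi/2 of the reference-wave phase.
x = np.linspace(-1, 1, 128)
X, Y = np.meshgrid(x, x)
phi_true = 2.0 * np.cos(3 * X) * np.sin(2 * Y)   # synthetic object phase
A, B = 1.0, 0.5                                  # bias and modulation

I = [A + B * np.cos(phi_true - d) for d in (0, np.pi / 2, np.pi, 3 * np.pi / 2)]

# Complex amplitude up to a constant: (I0 - I2) + 1j*(I1 - I3) = 2B*exp(1j*phi)
U = (I[0] - I[2]) + 1j * (I[1] - I[3])
phi_rec = np.angle(U)
print(np.max(np.abs(phi_rec - phi_true)))  # ~0 (phi_true lies within (-pi, pi])
```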

  14. A Case Study of a Hybrid Parallel 3D Surface Rendering Graphics Architecture

    DEFF Research Database (Denmark)

    Holten-Lund, Hans Erik; Madsen, Jan; Pedersen, Steen

    1997-01-01

    This paper presents a case study in the design strategy used in building a graphics computer for drawing very complex 3D geometric surfaces. The goal is to build a PC-based computer system capable of handling surfaces built from about 2 million triangles, and able to render a perspective view of these on a computer display at interactive frame rates, i.e. processing around 50 million triangles per second. The paper presents a hardware/software architecture called HPGA (Hybrid Parallel Graphics Architecture) which is likely to be able to carry out this task. The case study focuses on techniques to increase the clock frequency as well as the parallelism of the system. This paper focuses on the back-end graphics pipeline, which is responsible for rasterizing triangles. A pure software implementation of the proposed architecture is currently able to process 300...

  15. High-speed 3D imaging using two-wavelength parallel-phase-shift interferometry.

    Science.gov (United States)

    Safrani, Avner; Abdulhalim, Ibrahim

    2015-10-15

    High-speed three-dimensional imaging based on two-wavelength parallel-phase-shift interferometry is presented. The technique is demonstrated using a high-resolution polarization-based Linnik interferometer operating with three high-speed phase-masked CCD cameras and two quasi-monochromatic modulated light sources. The two light sources allow phase unwrapping of the single-source wrapped phase, so that relatively high step profiles with heights as large as 3.7 μm can be imaged at video rate with ±2 nm accuracy and repeatability. The technique is validated using a certified very large scale integration (VLSI) step standard, followed by a demonstration from the semiconductor industry showing an integrated chip with 2.75 μm tall copper micropillars at different packing densities. PMID:26469586
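
The two-wavelength unwrapping idea can be sketched as follows: the two wrapped phases beat at a much longer synthetic wavelength Λ = λ1·λ2/|λ1−λ2|, over which a tall step is unambiguous. The wavelength values below are illustrative assumptions, not the paper's actual sources:

```python
import numpy as np

def wrap(p):
    """Wrap a phase to (-pi, pi]."""
    return np.angle(np.exp(1j * p))

# Two quasi-monochromatic sources; wavelengths are illustrative only.
lam1, lam2 = 0.610, 0.630                  # micrometres, lam1 < lam2
Lam = lam1 * lam2 / (lam2 - lam1)          # synthetic wavelength ~19.2 um

h_true = 3.7                               # step height (um), below Lam/2
phi1 = wrap(4 * np.pi * h_true / lam1)     # wrapped single-wavelength phases
phi2 = wrap(4 * np.pi * h_true / lam2)     # (reflection: double-pass factor 4*pi)

# The beat phase is unambiguous for heights up to Lam/2 ~ 9.6 um,
# so a 3.7 um step is recovered without unwrapping ambiguity.
phi_beat = wrap(phi1 - phi2)
h_coarse = Lam * phi_beat / (4 * np.pi)
print(h_coarse)  # ~3.7
```

In practice the coarse height from the beat phase is then used to select the correct 2π fringe order of the more precise single-wavelength phase.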

  16. Hybrid Characteristics: 3D radiative transfer for parallel adaptive mesh refinement hydrodynamics

    CERN Document Server

    Rijkhorst, E J; Dubey, A; Mellema, G R; Rijkhorst, Erik-Jan; Plewa, Tomasz; Dubey, Anshu; Mellema, Garrelt

    2005-01-01

    We have developed a three-dimensional radiative transfer method designed specifically for use with parallel adaptive mesh refinement hydrodynamics codes. This new algorithm, which we call hybrid characteristics, introduces a novel form of ray tracing that can neither be classified as long, nor as short characteristics, but which applies the underlying principles, i.e. efficient execution through interpolation and parallelizability, of both. Primary applications of the hybrid characteristics method are radiation hydrodynamics problems that take into account the effects of photoionization and heating due to point sources of radiation. The method is implemented in the hydrodynamics package FLASH. The ionization, heating, and cooling processes are modelled using the DORIC ionization package. Upon comparison with the long characteristics method, we find that our method calculates the column density with a similarly high accuracy and produces sharp and well defined shadows. We show the quality of the new algorithm ...

  17. 3D Profile Filter Algorithm Based on Parallel Generalized B-spline Approximating Gaussian

    Institute of Scientific and Technical Information of China (English)

    REN Zhiying; GAO Chenghui; SHEN Ding

    2015-01-01

    Currently, approximation methods for the Gaussian filter based on other spline filters have been developed. However, these methods are only suitable for one-dimensional filtering; when they are used for three-dimensional filtering, rounding and quantization errors are passed on from each stage to the next. In this paper, a new, high-precision implementation approach for the Gaussian filter is described, which is suitable for three-dimensional reference filtering. Based on the theory of the generalized B-spline function and the variational principle, the transmission characteristics of a digital filter can be changed through the sensitivity of the parameters (t1, t2), and the rounding and quantization errors can be reduced by arranging the filter in a parallel form instead of a cascade form. Finally, an approximation of the Gaussian filter is obtained. To verify the feasibility of the new algorithm, reference extraction by the conventional methods is also performed and compared. Experiments are conducted on a measured optical surface, and the results show that the total calculation with the new algorithm requires only 0.07 s for 480×480 data points; the amplitude deviation between the reference of the parallel-form filter and the Gaussian filter is smaller; and, by analysis of the three-dimensional roughness parameters, the new method is closer to the characteristics of the Gaussian filter than the cascaded generalized B-spline approximation. So the new algorithm is both efficient and accurate for implementing the Gaussian filter in surface roughness measurement.
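
For reference, the target that such spline filters approximate is the standard Gaussian reference filter used in surface metrology, whose amplitude transmission is 50% at the cutoff wavelength. A minimal FFT-based sketch of that reference filter (periodic boundaries; this is the target behaviour, not the paper's parallel B-spline implementation):

```python
import numpy as np

def gaussian_reference(z, cutoff, dx):
    """Areal Gaussian reference surface computed via the FFT.
    Amplitude transmission exp(-pi*(alpha*cutoff*f)^2) with
    alpha = sqrt(ln 2 / pi), i.e. 50% at the cutoff wavelength
    (ISO 16610-style weighting). Assumes periodic boundaries."""
    alpha = np.sqrt(np.log(2) / np.pi)
    ny, nx = z.shape
    fy = np.fft.fftfreq(ny, dx)
    fx = np.fft.fftfreq(nx, dx)
    FX, FY = np.meshgrid(fx, fy)
    H = np.exp(-np.pi * (alpha * cutoff) ** 2 * (FX ** 2 + FY ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(z) * H))

# A flat plane plus short-wavelength roughness: the reference surface
# should keep the plane and suppress the roughness almost entirely.
n, dx, cutoff = 256, 1.0, 32.0
x = np.arange(n) * dx
X, Y = np.meshgrid(x, x)
z = 1.0 + 0.2 * np.sin(2 * np.pi * X * 10 / cutoff)   # wavelength = cutoff/10
ref = gaussian_reference(z, cutoff, dx)
print(np.max(np.abs(ref - 1.0)))  # roughness attenuated to ~0
```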

  18. The Experience of Large Computational Programs Parallelization. Parallel Version of MINUIT Program

    CERN Document Server

    Sapozhnikov, A P

    2003-01-01

    The problems around parallelization of large computational programs are discussed. As an example, a parallel version of MINUIT, a widely used minimization program, is introduced. Results of testing the MPI-based MINUIT on a multiprocessor system demonstrate the parallelism actually achieved.

  19. Adobe Flash 11 Stage3D (Molehill) Game Programming Beginner's Guide

    CERN Document Server

    Kaitila, Christer

    2011-01-01

    Written in an informal and friendly manner, the style and approach of this book will take you on an exciting adventure. Piece by piece, detailed examples help you along the way by providing real-world game code required to make a complete 3D video game. Each chapter builds upon the experience and achievements earned in the last, culminating in the ultimate prize - your game! If you ever wanted to make your own 3D game in Flash, then this book is for you. This book is a perfect introduction to 3D game programming in Adobe Molehill for complete beginners. You do not need to know anything about S

  20. Synergia: A hybrid, parallel beam dynamics code with 3D space charge

    Energy Technology Data Exchange (ETDEWEB)

    James F. Amundson; Panagiotis Spentzouris

    2003-07-09

    We describe Synergia, a hybrid code developed under the DOE SciDAC-supported Accelerator Simulation Program. The code combines and extends the existing accelerator modeling packages IMPACT and beamline/mxyzptlk. We discuss the design and implementation of Synergia, its performance on different architectures, and its potential applications.

  1. Structured Parallel Programming Patterns for Efficient Computation

    CERN Document Server

    McCool, Michael; Robison, Arch

    2012-01-01

    Programming is now parallel programming. Much as structured programming revolutionized traditional serial programming decades ago, a new kind of structured programming, based on patterns, is relevant to parallel programming today. Parallel computing experts and industry insiders Michael McCool, Arch Robison, and James Reinders describe how to design and implement maintainable and efficient parallel algorithms using a pattern-based approach. They present both theory and practice, and give detailed concrete examples using multiple programming models. Examples are primarily given using two of th

  2. Non-Iterative Rigid 2D/3D Point-Set Registration Using Semidefinite Programming

    Science.gov (United States)

    Khoo, Yuehaw; Kapoor, Ankur

    2016-07-01

    We describe a convex programming framework for pose estimation in 2D/3D point-set registration with unknown point correspondences. We give two mixed-integer nonlinear program (MINP) formulations of the 2D/3D registration problem when there are multiple 2D images, and propose convex relaxations for both of the MINPs to semidefinite programs (SDP) that can be solved efficiently by interior point methods. Our approach to the 2D/3D registration problem is non-iterative in nature as we jointly solve for pose and correspondence. Furthermore, these convex programs can readily incorporate feature descriptors of points to enhance registration results. We prove that the convex programs exactly recover the solution to the original nonconvex 2D/3D registration problem under noiseless condition. We apply these formulations to the registration of 3D models of coronary vessels to their 2D projections obtained from multiple intra-operative fluoroscopic images. For this application, we experimentally corroborate the exact recovery property in the absence of noise and further demonstrate robustness of the convex programs in the presence of noise.
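
The SDP relaxation itself is too involved for a short sketch, but the underlying rigid registration objective with *known* correspondences has a classical closed-form solution (Kabsch / orthogonal Procrustes via the SVD), shown here as a simplified illustration. Note the hedge: the paper's contribution is precisely handling unknown correspondences and 2D projections, which this sketch does not address:

```python
import numpy as np

def kabsch(P, Q):
    """Optimal rotation R and translation t minimizing ||R@P + t - Q||_F.
    P, Q: 3xN matched point sets (correspondences assumed known, unlike
    the paper's setting)."""
    p0, q0 = P.mean(axis=1, keepdims=True), Q.mean(axis=1, keepdims=True)
    H = (P - p0) @ (Q - q0).T
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q0 - R @ p0
    return R, t

# Sanity check with a random point cloud under a known rigid motion
rng = np.random.default_rng(0)
P = rng.standard_normal((3, 20))
angle = 0.7
Rz = np.array([[np.cos(angle), -np.sin(angle), 0.0],
               [np.sin(angle),  np.cos(angle), 0.0],
               [0.0, 0.0, 1.0]])
t_true = np.array([[1.0], [2.0], [3.0]])
Q = Rz @ P + t_true
R, t = kabsch(P, Q)
print(np.allclose(R, Rz), np.allclose(t, t_true))  # True True (noiseless case)
```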

  3. About Parallel Programming: Paradigms, Parallel Execution and Collaborative Systems

    Directory of Open Access Journals (Sweden)

    Loredana MOCEAN

    2009-01-01

    Full Text Available In recent years, efforts have been made to delineate a stable and unitary framework in which the problems of parallel logical processing can find solutions, at least at the level of imperative languages. The results obtained so far are not commensurate with the effort invested. This paper aims to make a small contribution to these efforts. We propose an overview of parallel programming, parallel execution and collaborative systems.

  4. A 3D MPI-Parallel GPU-accelerated framework for simulating ocean wave energy converters

    Science.gov (United States)

    Pathak, Ashish; Raessi, Mehdi

    2015-11-01

    We present an MPI-parallel GPU-accelerated computational framework for studying the interaction between ocean waves and wave energy converters (WECs). The computational framework captures the viscous effects, nonlinear fluid-structure interaction (FSI), and breaking of waves around the structure, which cannot be captured in many potential flow solvers commonly used for WEC simulations. The full Navier-Stokes equations are solved using the two-step projection method, which is accelerated by porting the pressure Poisson equation to GPUs. The FSI is captured using the numerically stable fictitious domain method. A novel three-phase interface reconstruction algorithm is used to resolve three phases in a VOF-PLIC context. A consistent mass and momentum transport approach enables simulations at high density ratios. The accuracy of the overall framework is demonstrated via an array of test cases. Numerical simulations of the interaction between ocean waves and WECs are presented. Funding from the National Science Foundation CBET-1236462 grant is gratefully acknowledged.
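
The pressure Poisson equation mentioned above is the usual bottleneck of the two-step projection method, which is why the framework ports it to GPUs. As a hedged stand-in for that solver (the abstract does not detail the actual iteration used), a plain Jacobi sweep verified against a manufactured solution:

```python
import numpy as np

def jacobi_poisson(rhs, h, iters=3000):
    """Solve lap(p) = rhs on the unit square with p = 0 on the boundary,
    using plain Jacobi sweeps on a 5-point stencil (a simple stand-in
    for the GPU-accelerated pressure Poisson solve)."""
    p = np.zeros_like(rhs)
    for _ in range(iters):
        p_new = p.copy()
        p_new[1:-1, 1:-1] = 0.25 * (p[2:, 1:-1] + p[:-2, 1:-1] +
                                    p[1:-1, 2:] + p[1:-1, :-2] -
                                    h * h * rhs[1:-1, 1:-1])
        p = p_new
    return p

# Manufactured solution: p = sin(pi x) sin(pi y), so lap(p) = -2 pi^2 p
n = 33
h = 1.0 / (n - 1)
x = np.linspace(0.0, 1.0, n)
X, Y = np.meshgrid(x, x)
p_exact = np.sin(np.pi * X) * np.sin(np.pi * Y)
p = jacobi_poisson(-2.0 * np.pi ** 2 * p_exact, h)
print(np.max(np.abs(p - p_exact)))  # O(h^2) discretization error
```

Production codes replace Jacobi with a Krylov or multigrid method; Jacobi is shown only because its embarrassingly parallel update is the easiest to map onto a GPU.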

  5. Parallel Programming in the Age of Ubiquitous Parallelism

    Science.gov (United States)

    Pingali, Keshav

    2014-04-01

    Multicore and manycore processors are now ubiquitous, but parallel programming remains as difficult as it was 30-40 years ago. During this time, our community has explored many promising approaches including functional and dataflow languages, logic programming, and automatic parallelization using program analysis and restructuring, but none of these approaches has succeeded except in a few niche application areas. In this talk, I will argue that these problems arise largely from the computation-centric foundations and abstractions that we currently use to think about parallelism. In their place, I will propose a novel data-centric foundation for parallel programming called the operator formulation in which algorithms are described in terms of actions on data. The operator formulation shows that a generalized form of data-parallelism called amorphous data-parallelism is ubiquitous even in complex, irregular graph applications such as mesh generation/refinement/partitioning and SAT solvers. Regular algorithms emerge as a special case of irregular ones, and many application-specific optimization techniques can be generalized to a broader context. The operator formulation also leads to a structural analysis of algorithms called TAO-analysis that provides implementation guidelines for exploiting parallelism efficiently. Finally, I will describe a system called Galois based on these ideas for exploiting amorphous data-parallelism on multicores and GPUs

  6. High-performance parallel solver for 3D time-dependent Schrodinger equation for large-scale nanosystems

    Science.gov (United States)

    Gainullin, I. K.; Sonkin, M. A.

    2015-03-01

    A parallelized three-dimensional (3D) time-dependent Schrodinger equation (TDSE) solver for one-electron systems is presented in this paper. The TDSE Solver is based on the finite-difference method (FDM) in Cartesian coordinates and uses a simple and explicit leap-frog numerical scheme. The simplicity of the numerical method provides very efficient parallelization and high performance of calculations using Graphics Processing Units (GPUs). For example, calculation of 10^6 time-steps on the 1000×1000×1000 numerical grid (10^9 points) takes only 16 hours on 16 Tesla M2090 GPUs. The TDSE Solver demonstrates scalability (parallel efficiency) close to 100% with some limitations on the problem size. The TDSE Solver is validated by calculation of energy eigenstates of the hydrogen atom (13.55 eV) and affinity level of H- ion (0.75 eV). The comparison with other TDSE solvers shows that a GPU-based TDSE Solver is 3 times faster for the problems of the same size and with the same cost of computational resources. The usage of a non-regular Cartesian grid or problem-specific non-Cartesian coordinates increases this benefit up to 10 times. The TDSE Solver was applied to the calculation of the resonant charge transfer (RCT) in nanosystems, including several related physical problems, such as electron capture during H+-H0 collision and electron tunneling between H- ion and thin metallic island film.
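
A one-dimensional analogue of the explicit leap-frog scheme shows why it parallelizes so well: each update touches only nearest-neighbour grid points. This is a sketch under simplifying assumptions (1D, free particle, hbar = m = 1), not the authors' 3D GPU code:

```python
import numpy as np

# 1D free-particle TDSE, explicit leap-frog: psi^{n+1} = psi^{n-1} - 2i*dt*H psi^n
nx, dx, dt = 400, 0.1, 1e-3            # dt well below the ~dx^2 stability limit
x = (np.arange(nx) - nx // 2) * dx
psi_prev = np.exp(-x**2 + 1j * 2 * x)  # Gaussian wave packet, k0 = 2
psi_prev /= np.sqrt(np.sum(np.abs(psi_prev)**2) * dx)

def H(psi):
    """Hamiltonian -(1/2) d2/dx2 via a 3-point stencil, psi = 0 at the ends."""
    lap = np.zeros_like(psi)
    lap[1:-1] = (psi[2:] - 2 * psi[1:-1] + psi[:-2]) / dx**2
    return -0.5 * lap

psi = psi_prev - 1j * dt * H(psi_prev)      # first step: forward Euler
for _ in range(2000):                        # leap-frog time stepping
    psi, psi_prev = psi_prev - 2j * dt * H(psi), psi

norm = np.sum(np.abs(psi)**2) * dx
print(norm)  # stays ~1: the scheme is (nearly) norm-conserving when stable
```

The stencil update of each point is independent of all others, which is exactly the structure that maps efficiently onto GPUs and onto domain decomposition across devices.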

  7. iPhone 3D Programming Developing Graphical Applications with OpenGL ES

    CERN Document Server

    Rideout, Philip

    2010-01-01

    What does it take to build an iPhone app with stunning 3D graphics? This book will show you how to apply OpenGL graphics programming techniques to any device running the iPhone OS -- including the iPad and iPod Touch -- with no iPhone development or 3D graphics experience required. iPhone 3D Programming provides clear step-by-step instructions, as well as lots of practical advice, for using the iPhone SDK and OpenGL. You'll build several graphics programs -- progressing from simple to more complex examples -- that focus on lighting, textures, blending, augmented reality, optimization for pe

  8. Development of a 3D Parallel Mechanism Robot Arm with Three Vertical-Axial Pneumatic Actuators Combined with a Stereo Vision System

    Directory of Open Access Journals (Sweden)

    Hao-Ting Lin

    2011-12-01

    Full Text Available This study aimed to develop a novel 3D parallel mechanism robot driven by three vertical-axial pneumatic actuators with a stereo vision system for path tracking control. The mechanical system and the control system are the primary novel parts for developing a 3D parallel mechanism robot. In the mechanical system, a 3D parallel mechanism robot contains three serial chains, a fixed base, a movable platform and a pneumatic servo system. The parallel mechanism is designed and analyzed first for realizing a 3D motion in the X-Y-Z coordinate system of the robot’s end-effector. The inverse kinematics and the forward kinematics of the parallel mechanism robot are investigated by using the Denavit-Hartenberg notation (D-H notation) coordinate system. The pneumatic actuators in the three vertical motion axes are modeled. In the control system, the Fourier series-based adaptive sliding-mode controller with H∞ tracking performance is used to design the path tracking controllers of the three vertical servo pneumatic actuators for realizing 3D path tracking control of the end-effector. Three optical linear scales are used to measure the position of the three pneumatic actuators. The 3D position of the end-effector is then calculated from the measured positions of the three pneumatic actuators by means of the kinematics. However, the calculated 3D position of the end-effector cannot account for the manufacturing and assembly tolerance of the joints and the parallel mechanism, so that errors between the actual position and the calculated 3D position of the end-effector exist. In order to improve this situation, sensor collaboration is developed in this paper. A stereo vision system is used to collaborate with the three position sensors of the pneumatic actuators. The stereo vision system combining two CCDs serves to measure the actual 3D position of the end-effector and calibrate the error between the actual and the calculated 3D position of the end

  9. Development of a 3D parallel mechanism robot arm with three vertical-axial pneumatic actuators combined with a stereo vision system.

    Science.gov (United States)

    Chiang, Mao-Hsiung; Lin, Hao-Ting

    2011-01-01

    This study aimed to develop a novel 3D parallel mechanism robot driven by three vertical-axial pneumatic actuators with a stereo vision system for path tracking control. The mechanical system and the control system are the primary novel parts for developing a 3D parallel mechanism robot. In the mechanical system, a 3D parallel mechanism robot contains three serial chains, a fixed base, a movable platform and a pneumatic servo system. The parallel mechanism is designed and analyzed first for realizing a 3D motion in the X-Y-Z coordinate system of the robot's end-effector. The inverse kinematics and the forward kinematics of the parallel mechanism robot are investigated by using the Denavit-Hartenberg notation (D-H notation) coordinate system. The pneumatic actuators in the three vertical motion axes are modeled. In the control system, the Fourier series-based adaptive sliding-mode controller with H(∞) tracking performance is used to design the path tracking controllers of the three vertical servo pneumatic actuators for realizing 3D path tracking control of the end-effector. Three optical linear scales are used to measure the position of the three pneumatic actuators. The 3D position of the end-effector is then calculated from the measured positions of the three pneumatic actuators by means of the kinematics. However, the calculated 3D position of the end-effector cannot account for the manufacturing and assembly tolerance of the joints and the parallel mechanism, so that errors between the actual position and the calculated 3D position of the end-effector exist. In order to improve this situation, sensor collaboration is developed in this paper. A stereo vision system is used to collaborate with the three position sensors of the pneumatic actuators. The stereo vision system combining two CCDs serves to measure the actual 3D position of the end-effector and calibrate the error between the actual and the calculated 3D position of the end-effector. Furthermore, to
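
The forward kinematics via D-H notation mentioned above can be sketched generically: each link contributes one homogeneous transform, and the end-effector pose is their product. The link parameters below are a hypothetical planar two-link check, not this robot's actual geometry:

```python
import numpy as np

def dh_transform(theta, d, a, alpha):
    """Homogeneous transform for one link, standard D-H convention:
    Rz(theta) * Tz(d) * Tx(a) * Rx(alpha)."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([[ct, -st * ca,  st * sa, a * ct],
                     [st,  ct * ca, -ct * sa, a * st],
                     [0.0,      sa,       ca,      d],
                     [0.0,     0.0,      0.0,    1.0]])

def forward_kinematics(dh_rows):
    """Chain the per-link transforms; returns the end-effector pose."""
    T = np.eye(4)
    for row in dh_rows:
        T = T @ dh_transform(*row)
    return T

# Hypothetical planar 2-link arm: unit link lengths, both joints at 90 deg
T = forward_kinematics([(np.pi / 2, 0.0, 1.0, 0.0),
                        (np.pi / 2, 0.0, 1.0, 0.0)])
print(T[:3, 3])  # end-effector at (-1, 1, 0)
```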

  10. An innovative hybrid 3D analytic-numerical model for air breathing parallel channel counter-flow PEM fuel cells.

    Science.gov (United States)

    Tavčar, Gregor; Katrašnik, Tomaž

    2014-01-01

    The parallel straight channel PEM fuel cell model presented in this paper extends the innovative hybrid 3D analytic-numerical (HAN) approach previously published by the authors with capabilities to address ternary diffusion systems and counter-flow configurations. The model's core principle is modelling species transport by obtaining a 2D analytic solution for species concentration distribution in the plane perpendicular to the channel gas-flow and coupling consecutive 2D solutions by means of a 1D numerical pipe-flow model. Electrochemical and other nonlinear phenomena are coupled to the species transport by a routine that uses derivative approximation with prediction-iteration. The latter is also the core of the counter-flow computation algorithm. A HAN model of a laboratory test fuel cell is presented and evaluated against a professional 3D CFD simulation tool, showing very good agreement between results of the presented model and those of the CFD simulation. Furthermore, high accuracy results are achieved at moderate computational times, which is owed to the semi-analytic nature and to the efficient computational coupling of electrochemical kinetics and species transport. PMID:25125112

  11. HEXBU-3D, a three-dimensional PWR-simulator program for hexagonal fuel assemblies

    International Nuclear Information System (INIS)

    HEXBU-3D is a three-dimensional nodal simulator program for PWR reactors. It is designed for a reactor core that consists of hexagonal fuel assemblies and of big follower-type control assemblies. The program solves two-group diffusion equations in homogenized fuel assembly geometry by a sophisticated nodal method. The treatment of feedback effects from xenon-poisoning, fuel temperature, moderator temperature and density, and soluble boron concentration is included in the program. The nodal equations are solved by a fast two-level iteration technique, and the eigenvalue can be either the effective multiplication factor or the boron concentration of the moderator. Burnup calculations are performed with tabulated sets of burnup-dependent cross sections evaluated by a cell burnup program. HEXBU-3D was originally programmed in FORTRAN V for the UNIVAC 1108 computer, but there is also another version operable on the CDC CYBER 170 computer. (author)

  12. PDDP, A Data Parallel Programming Model

    Directory of Open Access Journals (Sweden)

    Karen H. Warren

    1996-01-01

    Full Text Available PDDP, the parallel data distribution preprocessor, is a data parallel programming model for distributed memory parallel computers. PDDP implements high-performance Fortran-compatible data distribution directives and parallelism expressed by the use of Fortran 90 array syntax, the FORALL statement, and the WHERE construct. Distributed data objects belong to a global name space; other data objects are treated as local and replicated on each processor. PDDP allows the user to program in a shared memory style and generates codes that are portable to a variety of parallel machines. For interprocessor communication, PDDP uses the fastest communication primitives on each platform.

  13. Playable Stories: Making Programming and 3D Role-Playing Game Design Personally and Socially Relevant

    Science.gov (United States)

    Ingram-Goble, Adam

    2013-01-01

    This is an exploratory design study of a novel system for learning programming and 3D role-playing game design as tools for social change. This study was conducted at two sites. Participants in the study were ages 9-14 and worked for up to 15 hours with the platform to learn how to program and design video games with personally or socially…

  14. TBIEM3D: A Computer Program for Predicting Ducted Fan Engine Noise. Version 1.1

    Science.gov (United States)

    Dunn, M. H.

    1997-01-01

    This document describes the usage of the ducted fan noise prediction program TBIEM3D (Thin duct - Boundary Integral Equation Method - 3 Dimensional). A scattering approach is adopted in which the acoustic pressure field is split into known incident and unknown scattered parts. The scattering of fan-generated noise by a finite length circular cylinder in a uniform flow field is considered. The fan noise is modeled by a collection of spinning point thrust dipoles. The program, based on a Boundary Integral Equation Method (BIEM), calculates circumferential modal coefficients of the acoustic pressure at user-specified field locations. The duct interior can be of the hard wall type or lined. The duct liner is axisymmetric, locally reactive, and can be uniform or axially segmented. TBIEM3D is written in the FORTRAN programming language. Input to TBIEM3D is minimal and consists of geometric and kinematic parameters. Discretization and numerical parameters are determined automatically by the code. Several examples are presented to demonstrate TBIEM3D capabilities.

  15. Parallel programming characteristics of a DSP-based parallel system

    Institute of Scientific and Technical Information of China (English)

    GAO Shu; GUO Qing-ping

    2006-01-01

    This paper first introduces the structure and working principle of a DSP-based parallel system, a parallel accelerating board, and the SHARC DSP chip. It then investigates the system's programming characteristics, especially the mode of communication, discussing how to design parallel algorithms and presenting a domain-decomposition-based complete multigrid parallel algorithm with virtual boundary forecast (VBF) to solve large-scale, complicated heat problems. Finally, the Mandelbrot set and a non-linear heat transfer equation of a ceramic/metal composite material are taken as examples to illustrate the implementation of the proposed algorithm. The results show that the solutions are highly efficient and achieve linear speedup.
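    As an illustration of the domain-decomposition idea (though not of the VBF algorithm itself, which additionally forecasts the boundary values to reduce communication), a minimal sketch of a 1D heat equation split into two subdomains, with a ghost-cell exchange at the shared "virtual boundary" each step, might look like this:

```python
import numpy as np

def heat_step(u, r):
    """One explicit Euler step of u_t = u_xx on interior points (r = dt/dx^2)."""
    un = u.copy()
    un[1:-1] = u[1:-1] + r * (u[2:] - 2*u[1:-1] + u[:-2])
    return un

# Split a 1D rod into two subdomains overlapping by two cells.
N, r, steps = 64, 0.25, 200
u = np.zeros(N); u[N//2 - 4:N//2 + 4] = 1.0          # initial hot spot
left, right = u[:N//2 + 1].copy(), u[N//2 - 1:].copy()

for _ in range(steps):
    # "virtual boundary" exchange: each side receives its neighbor's edge value
    left[-1] = right[1]
    right[0] = left[-2]
    left, right = heat_step(left, r), heat_step(right, r)

u_dd = np.concatenate([left[:-1], right[1:]])
# reference: the same steps on the undecomposed domain
for _ in range(steps):
    u = heat_step(u, r)
print(np.max(np.abs(u - u_dd)))                       # decomposition is exact
```

    With a 3-point stencil, one ghost-cell exchange per step reproduces the undecomposed computation exactly; the subdomains can then be stepped in parallel.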

  16. 2D/3D Program work summary report, [January 1988--December 1992]

    International Nuclear Information System (INIS)

    The 2D/3D Program was carried out by Germany, Japan and the United States to investigate the thermal-hydraulics of a PWR large-break LOCA. A contributory approach was utilized in which each country contributed significant effort to the program and all three countries shared the research results. Germany constructed and operated the Upper Plenum Test Facility (UPTF), and Japan constructed and operated the Cylindrical Core Test Facility (CCTF) and the Slab Core Test Facility (SCTF). The US contribution consisted of provision of advanced instrumentation to each of the three test facilities, and assessment of the TRAC computer code against the test results. Evaluations of the test results were carried out in all three countries. This report summarizes the 2D/3D Program in terms of the contributing efforts of the participants

  17. Massively Parallel Finite Element Programming

    KAUST Repository

    Heister, Timo

    2010-01-01

    Today's large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data structures and algorithms to take advantage of the power of high performance computers in generic finite element codes. Existing generic finite element libraries often restrict the parallelization to parallel linear algebra routines. This is a limiting factor when solving on more than a few hundred cores. We describe routines for distributed storage of all major components coupled with efficient, scalable algorithms. We give an overview of our effort to enable the modern and generic finite element library deal.II to take advantage of the power of large clusters. In particular, we describe the construction of a distributed mesh and develop algorithms to fully parallelize the finite element calculation. Numerical results demonstrate good scalability. © 2010 Springer-Verlag.

  18. Experiences in Data-Parallel Programming

    Directory of Open Access Journals (Sweden)

    Terry W. Clark

    1997-01-01

    Full Text Available To efficiently parallelize a scientific application with a data-parallel compiler requires certain structural properties in the source program, and conversely, the absence of others. A recent parallelization effort of ours reinforced this observation and motivated this correspondence. Specifically, we have transformed a Fortran 77 version of GROMOS, a popular dusty-deck program for molecular dynamics, into Fortran D, a data-parallel dialect of Fortran. During this transformation we have encountered a number of difficulties that probably are neither limited to this particular application nor do they seem likely to be addressed by improved compiler technology in the near future. Our experience with GROMOS suggests a number of points to keep in mind when developing software that may at some time in its life cycle be parallelized with a data-parallel compiler. This note presents some guidelines for engineering data-parallel applications that are compatible with Fortran D or High Performance Fortran compilers.

  19. Towards Distributed Memory Parallel Program Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Quinlan, D; Barany, G; Panas, T

    2008-06-17

    This paper presents a parallel attribute evaluation for distributed memory parallel computer architectures, where previously only shared memory parallel support for this technique had been developed. Attribute evaluation is a part of how attribute grammars are used for program analysis within modern compilers. Within this work, we have extended ROSE, an open compiler infrastructure, with a distributed memory parallel attribute evaluation mechanism to support user-defined global program analysis required for some forms of security analysis, which cannot be addressed by a file-by-file view of large scale applications. As a result, user-defined security analyses may now run in parallel without the user having to specify the way data is communicated between processors. The automation of communication enables an extensible open-source parallel program analysis infrastructure.

  20. Productive Parallel Programming: The PCN Approach

    Directory of Open Access Journals (Sweden)

    Ian Foster

    1992-01-01

    Full Text Available We describe the PCN programming system, focusing on those features designed to improve the productivity of scientists and engineers using parallel supercomputers. These features include a simple notation for the concise specification of concurrent algorithms, the ability to incorporate existing Fortran and C code into parallel applications, facilities for reusing parallel program components, a portable toolkit that allows applications to be developed on a workstation or small parallel computer and run unchanged on supercomputers, and integrated debugging and performance analysis tools. We survey representative scientific applications and identify problem classes for which PCN has proved particularly useful.

  1. Analysis results from the Los Alamos 2D/3D program

    International Nuclear Information System (INIS)

    Los Alamos National Laboratory is a participant in the 2D/3D program. Activities conducted at Los Alamos National Laboratory in support of 2D/3D program goals include analysis support of facility design, construction, and operation; provision of boundary and initial conditions for test-facility operations based on analysis of pressurized water reactors; performance of pretest and posttest predictions and analyses; and use of experimental results to validate and assess the single- and multi-dimensional, nonequilibrium features in the Transient Reactor Analysis Code (TRAC). During fiscal year 1987, Los Alamos conducted analytical assessment activities using data from the Slab Core Test Facility, The Cylindrical Core Test Facility, and the Upper Plenum Test Facility. Finally, Los Alamos continued work to provide TRAC improvements. In this paper, Los Alamos activities during fiscal year 1987 will be summarized; several significant accomplishments will be described in more detail to illustrate the work activities at Los Alamos

  2. Analysis results from the Los Alamos 2D/3D program

    International Nuclear Information System (INIS)

    Los Alamos National Laboratory is a participant in the 2D/3D program. Activities conducted at Los Alamos National Laboratory in support of 2D/3D program goals include analysis support of facility design, construction, and operation; provision of boundary and initial conditions for test-facility operations based on analysis of pressurized water reactors; performance of pretest and post-test predictions and analyses; and use of experimental results to validate and assess the single- and multi-dimensional, nonequilibrium features in the Transient Reactor Analysis Code (TRAC). During fiscal year 1987, Los Alamos conducted analytical assessment activities using data from the Slab Core Test Facility, the Cylindrical Core Test Facility, and the Upper Plenum Test Facility. Finally, Los Alamos continued work to provide TRAC improvements. In this paper, Los Alamos activities during fiscal year 1987 are summarized; several significant accomplishments are described in more detail to illustrate the work activities at Los Alamos

  3. Software Programs, from Sequential to Parallel

    Directory of Open Access Journals (Sweden)

    Felician ALECU

    2009-10-01

    Full Text Available Several methods can be applied to parallelize a program containing input/output operations, but one of the most important challenges is to parallelize loops, because they usually consume most of the CPU time even if the code they contain is very small. A loop can be parallelized by distributing its iterations among processes, so that every process executes just a subset of the loop's iteration range.

  4. DVR3D: a program suite for the calculation of rotation-vibration spectra of triatomic molecules

    Science.gov (United States)

    Tennyson, Jonathan; Kostin, Maxim A.; Barletta, Paolo; Harris, Gregory J.; Polyansky, Oleg L.; Ramanlal, Jayesh; Zobov, Nikolai F.

    2004-11-01

    from: CPC Program Library, Queen's University of Belfast, N. Ireland. Reference in CPC to previous version: 86 (1995) 175. Catalogue identifier of previous version: ADAK. Authors of previous version: J. Tennyson, J.R. Henderson and N.G. Fulton. Does the new version supersede the original program?: DVR3DRJZ supersedes DVR3DRJ. Computer: PC running Linux. Installation: desktop. Other machines on which program tested: Compaq running True64 Unix; SGI Origin 2000, Sunfire V750 and V880 systems running SunOS; IBM p690 Regatta running AIX. Programming language used in the new version: Fortran 90. Memory required to execute: case dependent. No. of lines in distributed program, including test data, etc.: 4203. No. of bytes in distributed program, including test data, etc.: 30 087. Has the code been vectorised or parallelised?: The code has been extensively vectorised; a parallel version of the code, PDVR3D, has been developed [1], contact the first author for details. Additional keywords: perpendicular embedding. Distribution format: gz. Nature of physical problem: DVR3DRJZ calculates the bound vibrational or Coriolis-decoupled rotational-vibrational states of a triatomic system in body-fixed Jacobi (scattering) or Radau coordinates [2]. Method of solution: All coordinates are treated in a discrete variable representation (DVR). The angular coordinate uses a DVR based on (associated) Legendre polynomials, and the radial coordinates use a DVR based on either Morse oscillator-like or spherical oscillator functions. Intermediate diagonalisation and truncation is performed on the hierarchical expression of the Hamiltonian operator to yield the final secular problem. DVR3DRJZ provides the vibrational wavefunctions necessary for ROTLEV3, ROTLEV3B or ROTLEV3Z to calculate rotationally excited states, DIPOLE3 to calculate rotational-vibrational transition strengths, and XPECT3 to compute expectation values. Restrictions on the complexity of the problem: (1) The size of the final Hamiltonian matrix that can
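    The DVR idea underlying DVR3D (representing the Hamiltonian on a grid associated with a set of basis functions, so that the potential becomes diagonal) can be illustrated with a deliberately simple 1D example: a sinc (Colbert-Miller) DVR for the harmonic oscillator. This basis is not one of those DVR3D uses; the sketch only shows the mechanics:

```python
import numpy as np

# Colbert-Miller sinc-DVR on a uniform grid (units with hbar = m = omega = 1)
n, L = 201, 20.0
x = np.linspace(-L/2, L/2, n)
dx = x[1] - x[0]

i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
with np.errstate(divide="ignore"):
    # kinetic energy matrix: pi^2/3 on the diagonal, 2(-1)^(i-j)/(i-j)^2 off it
    T = np.where(i == j, np.pi**2 / 3.0,
                 2.0 * (-1.0)**(i - j) / (i - j)**2) / (2.0 * dx**2)
V = np.diag(0.5 * x**2)            # potential is diagonal in the DVR
E = np.linalg.eigvalsh(T + V)
print(E[:3])                       # close to the exact levels 0.5, 1.5, 2.5
```

    As in DVR3D, the payoff is that the potential matrix is diagonal on the grid, so only the (analytically known) kinetic matrix couples grid points before diagonalisation.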

  5. Speedup predictions on large scientific parallel programs

    International Nuclear Information System (INIS)

    How much speedup can we expect for large scientific parallel programs running on supercomputers? For insight into this problem we extend the parallel processing environment currently existing on the Cray X-MP (a shared memory multiprocessor with at most four processors) to a simulated N-processor environment, where N is greater than or equal to 1. Several large scientific parallel programs from Los Alamos National Laboratory were run in this simulated environment, and speedups were predicted. A speedup of 14.4 on 16 processors was measured for one of the three most used codes at the Laboratory.
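    Speedup predictions of this kind are often discussed in terms of Amdahl's law. As an illustration (this calculation is mine, not from the paper), the serial fraction implied by a measured speedup can be recovered by inverting the law:

```python
def amdahl_speedup(serial_fraction, n_procs):
    """Amdahl's law: speedup attainable with n_procs processors."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)

def implied_serial_fraction(speedup, n_procs):
    """Invert Amdahl's law for the serial fraction implied by a measurement."""
    return (n_procs / speedup - 1.0) / (n_procs - 1.0)

f = implied_serial_fraction(14.4, 16)
print(f)                      # ~0.0074: under 1% serial work
print(amdahl_speedup(f, 16))  # recovers the measured 14.4
```

    By this reading, a speedup of 14.4 on 16 processors is consistent with less than one percent of the work being inherently serial, though real codes also lose speedup to synchronization and memory contention that Amdahl's law ignores.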

  6. Wireless Rover Meets 3D Design and Product Development

    Science.gov (United States)

    Deal, Walter F., III; Hsiung, Steve C.

    2016-01-01

    Today there are a number of 3D printing technologies that are low cost and within the budgets of middle and high school programs. Educational technology companies offer a variety of 3D printing technologies and parallel curriculum materials to enable technology and engineering teachers to easily add 3D learning activities to their programs.…

  7. Portable parallel programming in a Fortran environment

    International Nuclear Information System (INIS)

    Experience using the Argonne-developed PARMACs macro package to implement a portable parallel programming environment is described. Fortran programs with intrinsic parallelism of coarse and medium granularity are easily converted to parallel programs which are portable among a number of commercially available parallel processors in the class of shared-memory bus-based and local-memory network based MIMD processors. The parallelism is implemented using standard UNIX (tm) tools and a small number of easily understood synchronization concepts (monitors and message-passing techniques) to construct and coordinate multiple cooperating processes on one or many processors. Benchmark results are presented for parallel computers such as the Alliant FX/8, the Encore MultiMax, the Sequent Balance, the Intel iPSC/2 Hypercube and a network of Sun 3 workstations. These parallel machines are typical MIMD types with from 8 to 30 processors, each rated at from 1 to 10 MIPS processing power. The demonstration code used for this work is a Monte Carlo simulation of the response to photons of a ''nearly realistic'' lead, iron and plastic electromagnetic and hadronic calorimeter, using the EGS4 code system. 6 refs., 2 figs., 2 tabs

  8. Reactor safety issues resolved by the 2D/3D program

    International Nuclear Information System (INIS)

    The 2D/3D Program studied multidimensional thermal-hydraulics in a PWR core and primary system during the end-of-blowdown and post-blowdown phases of a large-break LOCA (LBLOCA), and during selected small-break LOCA (SBLOCA) transients. The program included tests at the Cylindrical Core Test Facility (CCTF), the Slab Core Test Facility (SCTF), and the Upper Plenum Test Facility (UPTF), and computer analyses using TRAC. Tests at CCTF investigated core thermal-hydraulics and overall system behavior while tests at SCTF concentrated on multidimensional core thermal-hydraulics. The UPTF tests investigated two-phase flow behavior in the downcomer, upper plenum, tie plate region, and primary loops. TRAC analyses evaluated thermal-hydraulic behavior throughout the primary system in tests as well as in PWRs. This report summarizes the test and analysis results in each of the main areas where improved information was obtained in the 2D/3D Program. The discussion is organized in terms of the reactor safety issues investigated. This report was prepared in coordination among the US, Germany and Japan. The US and Germany have published the report as NUREG/IA-0127 and GRS-101, respectively. (author)

  9. Reactor safety issues resolved by the 2D/3D Program

    International Nuclear Information System (INIS)

    The 2D/3D Program studied multidimensional thermal-hydraulics in a PWR core and primary system during the end-of-blowdown and post-blowdown phases of a large-break LOCA (LBLOCA), and during selected small-break LOCA (SBLOCA) transients. The program included tests at the Cylindrical Core Test Facility (CCTF), the Slab Core Test Facility (SCTF), and the Upper Plenum Test Facility (UPTF), and computer analyses using TRAC. Tests at CCTF investigated core thermal-hydraulics and overall system behavior while tests at SCTF concentrated on multidimensional core thermal-hydraulics. The UPTF tests investigated two-phase flow behavior in the downcomer, upper plenum, tie plate region, and primary loops. TRAC analyses evaluated thermal-hydraulic behavior throughout the primary system in tests as well as in PWRs. This report summarizes the test and analysis results in each of the main areas where improved information was obtained in the 2D/3D Program. The discussion is organized in terms of the reactor safety issues investigated

  10. Playable stories: Making programming and 3D role-playing game design personally and socially relevant

    Science.gov (United States)

    Ingram-Goble, Adam

    This is an exploratory design study of a novel system for learning programming and 3D role-playing game design as tools for social change. This study was conducted at two sites. Participants in the study were ages 9-14 and worked for up to 15 hours with the platform to learn how to program and design video games with personally or socially relevant narratives. This first study was successful in that students learned to program a narrative game, and they viewed the social problem framing for the practices as an interesting aspect of the experience. The second study provided illustrative examples of how providing less general structure up-front afforded players the opportunity to produce the necessary structures as needed for their particular designs, and therefore to gain a richer understanding of what those structures represented. This study demonstrates that not only were participants able to use computational thinking skills such as Boolean and conditional logic, planning, modeling, abstraction, and encapsulation, they were able to bridge these skills to social domains they cared about. In particular, participants created stories about socially relevant topics without explicit pushes from the instructors. The findings also suggest that the rapid uptake and successful creation of personally and socially relevant narratives may have been facilitated by close alignment between the conceptual tools represented in the platform and the domain of 3D role-playing games.

  11. Detecting drug use in adolescents using a 3D simulation program

    Directory of Open Access Journals (Sweden)

    Luis Iribarne

    2010-11-01

    Full Text Available This work presents a new 3D simulation program, called Mii School, and its application to the detection of problem behaviours appearing in school settings. We begin by describing some of the main features of the Mii School program. Then, we present the results of a study in which adolescents responded to Mii School simulations involving the consumption of alcoholic drinks, cigarettes, cannabis, cocaine, and MDMA (ecstasy). We established a "risk profile" based on the observed response patterns. We also present results concerning user satisfaction with the program and the extent to which users felt that the simulated scenes were realistic. Lastly, we discuss the usefulness of Mii School as a tool for assessing drug use in school settings.

  12. PLOT 3D: a multipurpose, interactive program for plotting three dimensional graphs

    International Nuclear Information System (INIS)

    PLOT3D is a general-purpose, interactive, three-dimensional display and plotter program. Written in Fortran-77, it uses two-dimensional plotter software (PLOT-10) to draw orthographic axonometric projections of a three-dimensional graph comprising a smooth surface or cubical histogram, in any desired orientation, magnification and window, employing highly accurate hidden-line removal techniques throughout. The figure so generated can optionally be clipped, smoothened by interpolation, or shaded selectively to distinguish among its different faces. The program accepts data from an external file, or generates them through built-in functions. It can be used for graphical representation of data such as neutron flux, theta(x,y), in the form of a smooth surface, even if the available data are very few in number. It is also capable of drawing histograms of quantities such as fuel power in a reactor lattice. A listing of the program is given. (author)
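    The orthographic axonometric projection at the heart of such a program amounts to rotating the 3D points and discarding the depth coordinate. The following fragment is a hypothetical illustration of that step only, not PLOT3D's actual algorithm (which also performs hidden-line removal); the angle choices are arbitrary:

```python
import numpy as np

def axonometric_project(points, yaw_deg=30.0, pitch_deg=20.0):
    """Orthographic axonometric projection: rotate 3D points, drop depth.

    points: (n, 3) array of (x, y, z); returns (n, 2) screen coordinates.
    """
    a, b = np.radians(yaw_deg), np.radians(pitch_deg)
    Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0, 0.0, 1.0]])
    Rx = np.array([[1.0, 0.0, 0.0],
                   [0.0, np.cos(b), -np.sin(b)],
                   [0.0, np.sin(b),  np.cos(b)]])
    rotated = points @ (Rx @ Rz).T      # rotate about z, then about x
    return rotated[:, [0, 2]]           # orthographic: keep x and z, drop depth

# project a small surface grid z = f(x, y) to screen coordinates
x, y = np.meshgrid(np.linspace(-1, 1, 5), np.linspace(-1, 1, 5))
z = x**2 + y**2
pts = np.column_stack([x.ravel(), y.ravel(), z.ravel()])
screen = axonometric_project(pts)
print(screen.shape)                     # (25, 2)
```

    A plotter program then draws the projected mesh edges, culling segments hidden behind nearer parts of the surface.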

  13. Parallelism for cryo-EM 3D reconstruction on CPU-GPU heterogeneous system

    Institute of Scientific and Technical Information of China (English)

    李兴建; 李临川; 谭光明; 张佩珩

    2011-01-01

    It is a challenge to efficiently utilize massive parallelism on both applications and architectures for heterogeneous systems. A practice of accelerating a cryo-EM 3D reconstruction program is presented, showing how to exploit and orchestrate the parallelism of an application to take advantage of the underlying parallelism exposed at the architecture level. All possible parallelism in cryo-EM 3D is exploited, and a self-adaptive dynamic scheduling algorithm is leveraged to efficiently implement the parallelism mapping between the application and the architecture. An experiment on part of the Dawning Nebulae system (32 nodes) confirms that hierarchical parallelism is an efficient pattern of parallel programming to utilize the capabilities of both CPU and GPU on a heterogeneous system. The hybrid CPU-GPU program improves performance by 2.4 times over the best CPU-only one for certain problem sizes.
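    The dynamic-scheduling idea can be illustrated, in heavily simplified form, by a shared work queue from which heterogeneous workers pull tasks at their own pace, so faster devices naturally absorb more of the load. The worker names, speeds, and task body below are invented for illustration and are not taken from the paper:

```python
import queue
import threading
import time

def worker(name, speed, tasks, results):
    """Pull tasks until the queue is empty; faster workers take more tasks."""
    while True:
        try:
            i = tasks.get_nowait()
        except queue.Empty:
            return
        time.sleep(0.0005 / speed)        # stand-in for one reconstruction slice
        results.append((name, i, i * i))  # list.append is thread-safe in CPython

tasks, results = queue.Queue(), []
for i in range(100):
    tasks.put(i)

# one fast "gpu" worker and two slower "cpu" workers share the same queue
threads = [threading.Thread(target=worker, args=(n, s, tasks, results))
           for n, s in [("gpu", 8.0), ("cpu0", 1.0), ("cpu1", 1.0)]]
for t in threads: t.start()
for t in threads: t.join()
print(len(results))  # all 100 tasks completed exactly once
```

    Because nothing is statically assigned, the split between devices adapts automatically to their relative throughput, which is the property a self-adaptive scheduler exploits on a CPU-GPU system.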

  14. The JLAB 3D program at 12 GeV (TMDs + GPDs)

    Energy Technology Data Exchange (ETDEWEB)

    Pisano, Silvia [Istituto Nazionale di Fisica Nucleare (INFN), Frascati (Italy)

    2015-01-01

    The Jefferson Lab CEBAF accelerator is undergoing an upgrade that will increase the beam energy up to 12 GeV. The three experimental Halls operating in the 6-GeV era are upgrading their detectors to adapt their performance to the newly available kinematics, and a new Hall (D) is being built. The investigation of the three-dimensional nucleon structure, both in coordinate and in momentum space, represents an essential part of the 12-GeV physics program, and several proposals aiming at the extraction of related observables have already been approved in Halls A, B and C. In these proceedings, the focus of the JLab 3D program is described, and a selection of proposals is discussed.

  15. Evaluation of single photon and Geiger mode Lidar for the 3D Elevation Program

    Science.gov (United States)

    Stoker, Jason M.; Abdullah, Qassim; Nayegandhi, Amar; Winehouse, Jayna

    2016-01-01

    Data acquired by Harris Corporation’s (Melbourne, FL, USA) Geiger-mode IntelliEarth™ sensor and Sigma Space Corporation’s (Lanham-Seabrook, MD, USA) Single Photon HRQLS sensor were evaluated and compared to accepted 3D Elevation Program (3DEP) data and survey ground control to assess the suitability of these new technologies for the 3DEP. While these sensors cannot currently collect data that meet the USGS lidar base specification, this is partly because the specification was written specifically for linear-mode systems. With little effort on the part of the manufacturers of the new lidar systems and the USGS lidar specifications team, data from these systems could soon serve the 3DEP program and its users. Many of the shortcomings noted in this study are reported to have been corrected or improved upon in next-generation sensors.

  16. CMAS 3D, a new program to visualize and project major elements compositions in the CMAS system

    Science.gov (United States)

    France, L.; Ouillon, N.; Chazot, G.; Kornprobst, J.; Boivin, P.

    2009-06-01

    CMAS 3D, developed in MATLAB®, is a program to support visualization of major element chemical data in three dimensions. Such projections are used to discuss correlations, metamorphic reactions and the chemical evolution of rocks, melts or minerals. It can also project data into 2D plots. The CMAS 3D interface makes it easy to use, and does not require any knowledge of MATLAB® programming. CMAS 3D uses data compiled in a Microsoft Excel™ spreadsheet. Although useful for scientific research, the program is also a powerful tool for teaching.

  17. Plasmonics and the parallel programming problem

    Science.gov (United States)

    Vishkin, Uzi; Smolyaninov, Igor; Davis, Chris

    2007-02-01

    While many parallel computers have been built, it has generally been too difficult to program them. Now, all computers are effectively becoming parallel machines. Biannual doubling in the number of cores on a single chip, or faster, over the coming decade is planned by most computer vendors. Thus, the parallel programming problem is becoming more critical. The only known solution to the parallel programming problem in the theory of computer science is through a parallel algorithmic theory called PRAM. Unfortunately, some of the PRAM theory assumptions regarding the bandwidth between processors and memories did not properly reflect a parallel computer that could be built in previous decades. Reaching memories, or other processors in a multi-processor organization, required off-chip connections through pins on the boundary of each electric chip. Using the number of transistors that is becoming available on chip, on-chip architectures that adequately support the PRAM are becoming possible. However, the bandwidth of off-chip connections remains insufficient and the latency remains too high. This creates a bottleneck at the boundary of the chip for a PRAM-On-Chip architecture. This also prevents scalability to larger "supercomputing" organizations spanning across many processing chips that can handle massive amounts of data. Instead of connections through pins and wires, power-efficient CMOS-compatible on-chip conversion to plasmonic nanowaveguides is introduced for improved latency and bandwidth. Proper incorporation of our ideas offers exciting avenues to resolving the parallel programming problem, and an alternative way for building faster, more usable and much more compact supercomputers.
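    The PRAM model referred to above assumes synchronous lock-step rounds in which all processors read, then all write. A classic PRAM-style algorithm is the O(log n)-round inclusive prefix sum; the sketch below simulates those rounds sequentially in Python (an illustration of the model only, not an actual parallel implementation, and not taken from this paper):

```python
import numpy as np

def pram_scan(a):
    """Hillis-Steele inclusive prefix sum in O(log n) PRAM-style rounds.

    Each loop iteration is one synchronous round: every "processor" reads
    from the previous round's array, then writes, as the PRAM model assumes.
    """
    a = np.array(a, dtype=np.int64)
    shift = 1
    while shift < len(a):
        prev = a.copy()                            # synchronous read phase
        a[shift:] = prev[shift:] + prev[:-shift]   # parallel write phase
        shift *= 2
    return a

print(pram_scan([1, 2, 3, 4, 5]))  # [ 1  3  6 10 15]
```

    On a real PRAM each round would take constant time across all processors, so n partial sums are produced in logarithmic time; the bandwidth assumptions the abstract discusses are precisely what make such lock-step rounds affordable or not in hardware.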

  18. Parallelism and programming in classifier systems

    CERN Document Server

    Forrest, Stephanie

    1990-01-01

    Parallelism and Programming in Classifier Systems deals with the computational properties of the underlying parallel machine, including computational completeness, programming and representation techniques, and efficiency of algorithms. In particular, efficient classifier system implementations of symbolic data structures and reasoning procedures are presented and analyzed in detail. The book shows how classifier systems can be used to implement a set of useful operations for the classification of knowledge in semantic networks. A subset of the KL-ONE language was chosen to demonstrate these operations.

  19. Fabrication of 3-D Reconstituted Organoid Arrays by DNA-Programmed Assembly of Cells (DPAC).

    Science.gov (United States)

    Todhunter, Michael E; Weber, Robert J; Farlow, Justin; Jee, Noel Y; Cerchiari, Alec E; Gartner, Zev J

    2016-01-01

    Tissues are the organizational units of function in metazoan organisms. Tissues comprise an assortment of cellular building blocks, soluble factors, and extracellular matrix (ECM) composed into specific three-dimensional (3-D) structures. The capacity to reconstitute tissues in vitro with the structural complexity observed in vivo is key to understanding processes such as morphogenesis, homeostasis, and disease. In this article, we describe DNA-programmed assembly of cells (DPAC), a method to fabricate viable, functional arrays of organoid-like tissues within 3-D ECM gels. In DPAC, dissociated cells are chemically functionalized with degradable oligonucleotide "Velcro," allowing rapid, specific, and reversible cell adhesion to a two-dimensional (2-D) template patterned with complementary DNA. An iterative assembly process builds up organoids, layer-by-layer, from this initial 2-D template and into the third dimension. Cleavage of the DNA releases the completed array of tissues that are captured and fully embedded in ECM gels for culture and observation. DPAC controls the size, shape, composition, and spatial heterogeneity of organoids and permits positioning of constituent cells with single-cell resolution even within cultures several centimeters long. © 2016 by John Wiley & Sons, Inc. PMID:27622567

  20. Parallel GRISYS/Power Challenge System Version 1.0 and 3D Prestack Depth Migration Package

    Institute of Scientific and Technical Information of China (English)

    Zhao Zhenwen

    1995-01-01

    Based on the achievements and experience in parallel seismic data processing gained over the past years by Beijing Global Software Corporation (GS) of CNPC, the Parallel GRISYS/Power Challenge seismic data processing system version 1.0 has been cooperatively developed and integrated on the Power Challenge computer by GS, SGI (USA) and the Shuangyuan Company of Academia Sinica.

  1. Parallel Volunteer Learning during Youth Programs

    Science.gov (United States)

    Lesmeister, Marilyn K.; Green, Jeremy; Derby, Amy; Bothum, Candi

    2012-01-01

    Lack of time is a hindrance for volunteers to participate in educational opportunities, yet volunteer success in an organization is tied to the orientation and education they receive. Meeting diverse educational needs of volunteers can be a challenge for program managers. Scheduling a Volunteer Learning Track for chaperones that is parallel to a…

  2. Distributed parallel computing using navigational programming

    OpenAIRE

    Pan, Lei; Lai, M. K.; Noguchi, K; Huseynov, J J; L. F. Bic; Dillencourt, M B

    2004-01-01

    Message Passing ( MP) and Distributed Shared Memory (DSM) are the two most common approaches to distributed parallel computing. MP is difficult to use, whereas DSM is not scalable. Performance scalability and ease of programming can be achieved at the same time by using navigational programming (NavP). This approach combines the advantages of MP and DSM, and it balances convenience and flexibility. Similar to MP, NavP suggests to its programmers the principle of pivot-computes and hence is ef...

  3. Collision Avoidance with Potential Fields Based on Parallel Processing of 3D-Point Cloud Data on the GPU

    OpenAIRE

    Kaldestad, Knut B.; Haddadin, Sami; Belder, Rico; Hovland, Geir; Anisi, David A.

    2014-01-01

    In this paper we present an experimental study on real-time collision avoidance with potential fields that are based on 3D point cloud data and processed on the Graphics Processing Unit (GPU). The virtual forces from the potential fields serve two purposes. First, they are used for changing the reference trajectory. Second, they are projected to and applied on the torque control level to generate corresponding nullspace behavior together with a Cartesian impedance main control ...

  4. Control cycloconverter using transputer based parallel programming

    Energy Technology Data Exchange (ETDEWEB)

    Chiu, D.M.; Li, S.J. [Victoria Univ. of Technology, Melbourne, Victoria (Australia). Dept. of Electrical and Electronic Engineering

    1995-12-31

    The naturally commutated cycloconverter is a valuable tool for speed control in AC machines. Over the years, its control circuits have been designed and implemented using vacuum tubes, transistors, integrated circuits, and single processors. However, the problem of obtaining accurate data on triggering pulse generation for quantitative analysis has remained unsolved. In this work, the triggering instants of the cycloconverter are precisely controlled using a transputer-based parallel computing technique. The HELIOS operating system is found to be an efficient tool for the development of the parallel programs. Different topology configurations and various communication mechanisms of HELIOS are employed to determine and select the fastest technique that is suitable for the present work.

  5. Skeleton based parallel programming: functional and parallel semantics in a single shot

    OpenAIRE

    Aldinucci, Marco; Danelutto, Marco

    2004-01-01

    Different skeleton based parallel programming systems have been developed in past years. The main goal of these programming environments is to provide programmers with handy, effective ways of writing parallel applications. In particular, skeleton based parallel programming environments automatically deal with most of the difficult, cumbersome programming problems that must be usually handled by programmers of parallel applications using traditional programming environments (e.g. environments...

  6. Hybrid MPI+OpenMP parallelization of an FFT-based 3D Poisson solver with one periodic direction

    OpenAIRE

    Gorobets, Andrei; Trias Miquel, Francesc Xavier; Borrell Pol, Ricard; Lehmkuhl Barba, Oriol; Oliva Llena, Asensio

    2011-01-01

    This work is devoted to the development of efficient parallel algorithms for the direct numerical simulation (DNS) of incompressible flows on modern supercomputers. In doing so, a Poisson equation needs to be solved at each time-step to project the velocity field onto a divergence-free space. Due to the non-local nature of its solution, this elliptic system is the part of the algorithm that is most difficult to parallelize. The Poisson solver presented here is restricted to problems with o...

  7. Programming massively parallel processors a hands-on approach

    CERN Document Server

    Kirk, David B

    2010-01-01

    Programming Massively Parallel Processors discusses basic concepts about parallel programming and GPU architecture. "Massively parallel" refers to the use of a large number of processors to perform a set of computations in a coordinated parallel way. The book details various techniques for constructing parallel programs. It also discusses the development process, performance level, floating-point format, parallel patterns, and dynamic parallelism. The book serves as a teaching guide where parallel programming is the main topic of the course. It builds on the basics of C programming for CUDA, a parallel programming environment that is supported on NVIDIA GPUs. Composed of 12 chapters, the book begins with basic information about the GPU as a parallel computer source. It also explains the main concepts of CUDA, data parallelism, and the importance of memory access efficiency using CUDA. The target audience of the book is graduate and undergraduate students from all science and engineering disciplines who ...

  8. International training program: 3D S.UN.COP - Scaling, uncertainty and 3D thermal-hydraulics/neutron-kinetics coupled codes seminar

    International Nuclear Information System (INIS)

    Thermal-hydraulic system computer codes are extensively used worldwide for analysis of nuclear facilities by utilities, regulatory bodies, nuclear power plant designers and vendors, nuclear fuel companies, research organizations, consulting companies, and technical support organizations. The computer code user represents a source of uncertainty that can influence the results of system code calculations. This influence is commonly known as the 'user effect' and stems from the limitations embedded in the codes as well as from the limited capability of the analysts to use the codes. Code user training and qualification is an effective means for reducing the variation of results caused by the application of the codes by different users. This paper describes a systematic approach to training code users who, upon completion of the training, should be able to perform calculations making the best possible use of the capabilities of best estimate codes. In other words, the program aims at contributing towards solving the problem of the user effect. The 3D S.UN.COP 2005 (Scaling, Uncertainty and 3D COuPled code calculations) seminar has been organized by the University of Pisa and the University of Zagreb as a follow-up to the proposal to the IAEA for a Permanent Training Course for System Code Users (D'Auria, 1998). It was recognized that such a course represented both a source of continuing education for current code users and a means for current code users to enter the formal training structure of a proposed 'permanent' stepwise approach to user training. The seminar-training was successfully held with the participation of 19 persons coming from 9 countries and 14 different institutions (universities, vendors, national laboratories and regulatory bodies). More than 15 scientists were involved in the organization of the seminar, presenting theoretical aspects of the proposed methodologies and holding the training and the final examination. A certificate (LA Code User grade) was released.

  9. Automatic Performance Debugging of SPMD Parallel Programs

    CERN Document Server

    Liu, Xu; Zhan, Jianfeng; Tu, Bibo; Meng, Dan

    2010-01-01

    Automatic performance debugging of parallel applications usually involves two steps: automatic detection of performance bottlenecks and uncovering their root causes for performance optimization. Previous work fails to resolve this challenging issue in several ways: first, several previous efforts automate analysis processes, but present the results in a confined way that only identifies performance problems with a priori knowledge; second, several tools take exploratory or confirmatory data analysis to automatically discover relevant performance data relationships. However, these efforts do not focus on locating performance bottlenecks or uncovering their root causes. In this paper, we design and implement an innovative system, AutoAnalyzer, to automatically debug the performance problems of single program multi-data (SPMD) parallel programs. Our system is unique in terms of two dimensions: first, without any a priori knowledge, we automatically locate bottlenecks and uncover their root causes for performance o...

  10. Modifications of the PRONTO 3D finite element program tailored to fast burst nuclear reactor design

    International Nuclear Information System (INIS)

    This update discusses modifications of PRONTO 3D tailored to the design of fast burst nuclear reactors. A thermoelastic constitutive model and spatially variant thermal history load were added for this special application. Included are descriptions of the thermoelastic constitutive model and the thermal loading algorithm, two example problems used to benchmark the new capability, a user's guide, and PRONTO 3D input files for the example problems. The results from PRONTO 3D thermoelastic finite element analysis are benchmarked against measured data and finite difference calculations. PRONTO 3D is a three-dimensional transient solid dynamics code for analyzing large deformations of highly non-linear materials subjected to high strain rates. The code modifications are implemented in PRONTO 3D Version 5.3.3. 12 refs., 30 figs., 9 tabs

  11. Timing-Sequence Testing of Parallel Programs

    Institute of Scientific and Technical Information of China (English)

    LIANG Yu; LI Shu; ZHANG Hui; HAN Chengde

    2000-01-01

    Testing of parallel programs involves two parts: testing of control flow within the processes and testing of timing-sequences. This paper focuses on the latter, particularly on the timing-sequences of message-passing paradigms. First, a coarse-grained SYN-sequence model is built up to describe the execution of distributed programs; all of the topics discussed in this paper are based on it. The most direct way to test a program is to run it. A fault-free parallel program should produce both correct computing results and a proper SYN-sequence. In order to analyze the validity of an observed SYN-sequence, this paper presents a formal specification (in Backus Normal Form) of valid SYN-sequences. To date there has been little work on testing coverage for distributed programs. Calculating the number of valid SYN-sequences is the key to the coverage problem, yet that number is extremely large and it is very hard to obtain the combination law among SYN-events. To resolve this problem, this paper proposes an efficient testing strategy, atomic SYN-event testing, which first linearizes the SYN-sequence (making it consist only of serial atomic SYN-events) and then tests each atomic SYN-event independently. The paper provides the calculating formula for the number of valid SYN-sequences of tree-topology atomic SYN-events (broadcast and combine). Furthermore, the number of valid SYN-sequences also, to some degree, mirrors the testability of parallel programs. Taking the tree-topology atomic SYN-event as an example, the paper examines the testability and communication speed of the tree-topology atomic SYN-event under different numbers of branches in order to achieve a more satisfactory tradeoff between testability and communication efficiency.
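    As a hedged aside on why the coverage count explodes: if each process's internal event order is fixed but the processes are otherwise independent, the number of valid interleavings is a multinomial coefficient. The helper below illustrates that baseline in Python; it is not the paper's formula for tree-topology atomic SYN-events, and the name is an assumption of this sketch.

```python
from math import comb

def interleavings(*lengths):
    """Number of distinct interleavings of independent per-process event
    sequences whose internal orders are fixed: the multinomial coefficient
    (m1 + m2 + ...)! / (m1! * m2! * ...), built up by repeated binomials."""
    total, result = 0, 1
    for n in lengths:
        total += n
        result *= comb(total, n)  # choose slots for this process's events
    return result
```

    Even two processes with three events each already admit 20 valid sequences, which is why enumerating SYN-sequences directly is infeasible and the paper linearizes them into atomic SYN-events instead.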

  12. Progress in the Simulation of Steady and Time-Dependent Flows with 3D Parallel Unstructured Cartesian Methods

    Science.gov (United States)

    Aftosmis, M. J.; Berger, M. J.; Murman, S. M.; Kwak, Dochan (Technical Monitor)

    2002-01-01

    The proposed paper will present recent extensions in the development of an efficient Euler solver for adaptively-refined Cartesian meshes with embedded boundaries. The paper will focus on extensions of the basic method to include solution adaptation, time-dependent flow simulation, and arbitrary rigid domain motion. The parallel multilevel method makes use of on-the-fly parallel domain decomposition to achieve extremely good scalability on large numbers of processors, and is coupled with an automatic coarse mesh generation algorithm for efficient processing by a multigrid smoother. Numerical results are presented demonstrating parallel speed-ups of up to 435 on 512 processors. Solution-based adaptation may be keyed off truncation error estimates using tau-extrapolation or a variety of feature-detection-based refinement parameters. The multigrid method is extended to time-dependent flows through the use of a dual-time approach. The extension to rigid domain motion uses an Arbitrary Lagrangian-Eulerian (ALE) formulation, and results will be presented for a variety of two- and three-dimensional example problems with both simple and complex geometry.

  13. Professional WebGL Programming Developing 3D Graphics for the Web

    CERN Document Server

    Anyuru, Andreas

    2012-01-01

    Everything you need to know about developing hardware-accelerated 3D graphics with WebGL! As the newest technology for creating 3D graphics on the web, in games, applications, and regular websites, WebGL gives web developers the capability to produce eye-popping graphics. This book teaches you how to use WebGL to create stunning cross-platform apps. The book features several detailed examples that show you how to develop 3D graphics with WebGL, including explanations of code snippets that help you understand the why behind the how. You will also develop a stronger understanding of W

  14. Scalable, incremental learning with MapReduce parallelization for cell detection in high-resolution 3D microscopy data

    KAUST Repository

    Sung, Chul

    2013-08-01

    Accurate estimation of neuronal count and distribution is central to the understanding of the organization and layout of cortical maps in the brain, and changes in the cell population induced by brain disorders. High-throughput 3D microscopy techniques such as Knife-Edge Scanning Microscopy (KESM) are enabling whole-brain survey of neuronal distributions. Data from such techniques pose serious challenges to quantitative analysis due to the massive, growing, and sparsely labeled nature of the data. In this paper, we present a scalable, incremental learning algorithm for cell body detection that can address these issues. Our algorithm is computationally efficient (linear mapping, non-iterative) and does not require retraining (unlike gradient-based approaches) or retention of old raw data (unlike instance-based learning). We tested our algorithm on our rat brain Nissl data set, showing superior performance compared to an artificial neural network-based benchmark, and also demonstrated robust performance in a scenario where the data set is rapidly growing in size. Our algorithm is also highly parallelizable due to its incremental nature, and we demonstrated this empirically using a MapReduce-based implementation of the algorithm. We expect our scalable, incremental learning approach to be widely applicable to medical imaging domains where there is a constant flux of new data. © 2013 IEEE.
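    The incremental, MapReduce-friendly character described above rests on mergeable partial summaries: each shard is reduced to statistics that can be combined in any order, so new data folds in without revisiting old raw data. The sketch below is a generic Python illustration of that pattern (per-label count/sum statistics), not the paper's cell-detection model; all names are assumptions of this sketch.

```python
from functools import reduce

def map_partition(samples):
    """Map step: summarize one data shard as (count, sum) per label."""
    summary = {}
    for label, value in samples:
        c, s = summary.get(label, (0, 0.0))
        summary[label] = (c + 1, s + value)
    return summary

def reduce_summaries(a, b):
    """Reduce step: merge two shard summaries. The merge is associative and
    commutative, so shards can arrive incrementally and in any order."""
    merged = dict(a)
    for label, (c, s) in b.items():
        c0, s0 = merged.get(label, (0, 0.0))
        merged[label] = (c0 + c, s0 + s)
    return merged

def class_means(summaries):
    """Fold all shard summaries together and report the per-label mean."""
    total = reduce(reduce_summaries, summaries, {})
    return {label: s / c for label, (c, s) in total.items()}
```

    When a new shard of microscopy data arrives, only `map_partition` on the new shard and one `reduce_summaries` call are needed, which is the property that makes the approach scale with a constantly growing data set.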

  15. Extending a serial 3D two-phase CFD code to parallel execution over MPI by using the PETSc library for domain decomposition

    CERN Document Server

    Ervik, Åsmund; Müller, Bernhard

    2014-01-01

    To leverage the last two decades' transition in High-Performance Computing (HPC) towards clusters of compute nodes bound together with fast interconnects, a modern scalable CFD code must be able to efficiently distribute work amongst several nodes using the Message Passing Interface (MPI). MPI can enable very large simulations running on very large clusters, but it is necessary that the bulk of the CFD code be written with MPI in mind, an obstacle to parallelizing an existing serial code. In this work we present the results of extending an existing two-phase 3D Navier-Stokes solver, which was completely serial, to a parallel execution model using MPI. The 3D Navier-Stokes equations for two immiscible incompressible fluids are solved by the continuum surface force method, while the location of the interface is determined by the level-set method. We employ the Portable Extensible Toolkit for Scientific Computing (PETSc) for domain decomposition (DD) in a framework where only a fraction of the code needs to be a...
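    A minimal, serial Python sketch of the domain-decomposition idea: split a 1D field into contiguous subdomains with one ghost cell per side, apply a stencil locally, and check that the result matches the undecomposed computation. In the real solver the ghost values would arrive via MPI halo exchanges managed through PETSc; here they are simply copied, and all names are illustrative.

```python
def split_with_ghosts(field, nparts):
    """Decompose a 1D field into contiguous subdomains, each padded with one
    ghost cell per side (copied from the neighbour, like an MPI halo
    exchange; physical boundaries reuse the boundary value)."""
    n = len(field)
    bounds = [(p * n // nparts, (p + 1) * n // nparts) for p in range(nparts)]
    parts = []
    for lo, hi in bounds:
        left = field[lo - 1] if lo > 0 else field[0]
        right = field[hi] if hi < n else field[-1]
        parts.append([left] + field[lo:hi] + [right])
    return parts

def smooth_local(part):
    """Three-point average over the interior of a ghosted subdomain."""
    return [(part[i - 1] + part[i] + part[i + 1]) / 3.0
            for i in range(1, len(part) - 1)]

def smooth_parallel(field, nparts):
    """Apply the stencil subdomain by subdomain and gather the results."""
    out = []
    for part in split_with_ghosts(field, nparts):
        out.extend(smooth_local(part))
    return out
```

    Because each subdomain sees correct neighbour values in its ghost cells, the decomposed result is identical to the serial one, which is exactly the invariant a halo exchange must preserve.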

  16. Development of parallel/serial program analyzing tool

    International Nuclear Information System (INIS)

    The Japan Atomic Energy Research Institute has been developing 'KMtool', a parallel/serial program analyzing tool, in order to promote the parallelization of science and engineering computation programs. KMtool analyzes the performance of programs written in FORTRAN77 and MPI, and it reduces the effort required for parallelization. This paper describes the development purpose, design, utilization, and evaluation of KMtool. (author)

  17. Parallel 3-D finite element analysis based on EBE method

    Institute of Scientific and Technical Information of China (English)

    刘耀儒; 周维垣; 杨强

    2006-01-01

    Using Jacobi preconditioning, a preconditioned conjugate gradient algorithm based on the element-by-element (EBE) method is derived, and the computational procedure of the finite element EBE method on distributed-memory parallel machines is given, enabling parallelization of the entire 3-D finite element computation. A PFEM (Parallel Finite Element Method) program for 3-D finite element solution was developed and implemented on a network cluster system. The PFEM program was numerically tested on a cantilever beam of rectangular cross-section, and the efficiency of serial and parallel computation was analyzed; finally, the PFEM program was applied to the 3-D finite element computation of the Ertan arch dam-foundation system. The results show that the 3-D finite element EBE algorithm does not need to assemble the global stiffness matrix during solution, which effectively reduces memory requirements; it exhibits good parallelism and can effectively perform large-scale numerical analysis of complex 3-D structures.
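    A compact Python sketch of the solver component described above: conjugate gradients with a Jacobi (diagonal) preconditioner. For illustration it uses an explicit matrix; the point of the EBE formulation is that the matrix-vector product can instead be accumulated element by element, so the global stiffness matrix never needs to be assembled. Function and variable names are assumptions of this sketch.

```python
def jacobi_pcg(A, b, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradients for SPD A with M = diag(A).
    Only matrix-vector products are needed, which is what makes an
    element-by-element (unassembled) implementation possible."""
    n = len(b)

    def matvec(v):  # stand-in for an element-by-element product
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    inv_diag = [1.0 / A[i][i] for i in range(n)]  # Jacobi preconditioner
    x = [0.0] * n
    r = b[:]                                      # residual b - A x0
    z = [inv_diag[i] * r[i] for i in range(n)]    # preconditioned residual
    p = z[:]
    rz = sum(r[i] * z[i] for i in range(n))
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rz / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        if sum(ri * ri for ri in r) < tol * tol:
            break
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = sum(r[i] * z[i] for i in range(n))
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x
```

    In a distributed EBE setting, `matvec` and the dot products are the only operations needing communication, which is why the method parallelizes well.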

  18. Four styles of parallel and net programming

    Institute of Scientific and Technical Information of China (English)

    Zhiwei XU; Yongqiang HE; Wei LIN; Li ZHA

    2009-01-01

    This paper reviews the programming landscape for parallel and network computing systems, focusing on four styles of concurrent programming models and example languages/libraries. The four styles correspond to four scales of the targeted systems. At the smallest, coprocessor scale, Single Instruction Multiple Thread (SIMT) and Compute Unified Device Architecture (CUDA) are considered. Transactional memory is discussed at the multicore or process scale. The MapReduce style is examined at the datacenter scale. At the Internet scale, Grid Service Markup Language (GSML) is reviewed, which intends to integrate resources distributed across multiple datacenters. The four styles are concerned with and emphasize different issues, which are needed by systems at different scales. This paper discusses issues related to efficiency, ease of use, and expressiveness.

  19. Flexible Language Constructs for Large Parallel Programs

    Directory of Open Access Journals (Sweden)

    Matt Rosing

    1994-01-01

    The goal of the research described in this article is to develop flexible language constructs for writing large data parallel numerical programs for distributed memory (multiple instruction multiple data [MIMD]) multiprocessors. Previously, several models have been developed to support synchronization and communication. Models for global synchronization include single instruction multiple data (SIMD), single program multiple data (SPMD), and sequential programs annotated with data distribution statements. The two primary models for communication are implicit communication based on shared memory and explicit communication based on messages. None of these models by themselves seems sufficient to permit the natural and efficient expression of the variety of algorithms that occur in large scientific computations. In this article, we give an overview of a new language that combines many of these programming models in a clean manner. This is done in a modular fashion such that different models can be combined to support large programs. Within a module, the selection of a model depends on the algorithm and its efficiency requirements. In this article, we give an overview of the language and discuss some of the critical implementation details.

  20. VERIFICATION OF PARALLEL AUTOMATA-BASED PROGRAMS

    Directory of Open Access Journals (Sweden)

    M. A. Lukin

    2014-01-01

    The paper deals with an interactive method of automatic verification for parallel automata-based programs. The hierarchical state machines can be implemented in different threads and can interact with each other. Verification is done by means of the Spin tool and includes automatic Promela model construction, conversion of LTL-formulae to Spin format, and counterexamples expressed in terms of automata. Interactive verification makes it possible to decrease verification time and increase the maximum size of verifiable programs. The considered method supports verification of parallel systems of hierarchical automata that interact with each other through messages and shared variables. A feature of the automaton model is that each state machine is treated as a new data type and can have an arbitrary bounded number of instances. Each state machine in the system can run a different state machine in a new thread or have a nested state machine. This method was implemented in the developed Stater tool. Stater shows correct operation for all test cases.

  1. A Fast Parallel Simulation Code for Interaction between Proto-Planetary Disk and Embedded Proto-Planets: Implementation for 3D Code

    Energy Technology Data Exchange (ETDEWEB)

    Li, Shengtai [Los Alamos National Laboratory; Li, Hui [Los Alamos National Laboratory

    2012-06-14

    sensitive to the position of the planet, we adopt the corotating frame, which allows the planet to move only in the radial direction if only one planet is present. This code has been extensively tested on a number of problems. For the earth-mass planet with constant aspect ratio h = 0.05, the torque calculated using our code matches quite well with the 3D linear theory results of Tanaka et al. (2002). The code is fully parallelized via the message-passing interface (MPI) and has very high parallel efficiency. Several numerical examples for both fixed and moving planets are provided to demonstrate the efficacy of the numerical method and code.

  2. A task-based parallelism and vectorized approach to 3D Method of Characteristics (MOC) reactor simulation for high performance computing architectures

    Science.gov (United States)

    Tramm, John R.; Gunow, Geoffrey; He, Tim; Smith, Kord S.; Forget, Benoit; Siegel, Andrew R.

    2016-05-01

    In this study we present and analyze a formulation of the 3D Method of Characteristics (MOC) technique applied to the simulation of full core nuclear reactors. Key features of the algorithm include a task-based parallelism model that allows independent MOC tracks to be assigned to threads dynamically, ensuring load balancing, and a wide vectorizable inner loop that takes advantage of modern SIMD computer architectures. The algorithm is implemented in a set of highly optimized proxy applications in order to investigate its performance characteristics on CPU, GPU, and Intel Xeon Phi architectures. Speed, power, and hardware cost efficiencies are compared. Additionally, performance bottlenecks are identified for each architecture in order to determine the prospects for continued scalability of the algorithm on next generation HPC architectures.
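    The task-based model above, where independent tracks are pulled dynamically by threads so faster workers absorb more work, can be mimicked in a few lines of Python. This is an illustrative work-queue sketch, not the proxy applications' implementation; summing a list stands in for integrating one characteristic track, and all names are assumptions.

```python
import queue
import threading

def run_tracks(tracks, nworkers=4):
    """Dynamically assign independent 'tracks' (here, plain lists of
    numbers) to worker threads from a shared queue, so the load stays
    balanced without a static schedule."""
    q = queue.Queue()
    for t in tracks:
        q.put(t)
    results, lock = [], threading.Lock()

    def worker():
        while True:
            try:
                t = q.get_nowait()   # dynamic assignment: pull next track
            except queue.Empty:
                return
            val = sum(t)             # stand-in for integrating one track
            with lock:
                results.append(val)

    threads = [threading.Thread(target=worker) for _ in range(nworkers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return sorted(results)
```

    In the actual MOC proxy applications the per-track work is a wide, vectorizable inner loop over segments and energy groups, which is what makes the combination of dynamic tasking and SIMD effective.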

  3. Parallel programming in Go and Scala : A performance comparison

    OpenAIRE

    Johnell, Carl

    2015-01-01

    This thesis provides a performance comparison of parallel programming in Go and Scala. Go supports concurrency through goroutines and channels. Scala has parallel collections, futures, and actors that can be used for concurrent and parallel programming. The experiment used two different types of algorithms to compare the performance between Go and Scala. Parallel versions of matrix multiplication and matrix chain multiplication were implemented with goroutines and channels in Go. Matrix m...
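    For illustration, here is a row-parallel matrix product analogous to the goroutine-per-row decomposition, written with a Python thread pool. The thread pool is an assumption of this sketch (the thesis uses goroutines/channels and Scala collections), and Python's GIL limits the actual speed-up for pure-Python arithmetic; the decomposition structure is the point.

```python
from concurrent.futures import ThreadPoolExecutor

def matmul_row(args):
    """Compute one row of A @ B: the unit of work handed to each task."""
    row, B = args
    return [sum(a * b for a, b in zip(row, col)) for col in zip(*B)]

def matmul_parallel(A, B, workers=4):
    """Row-parallel matrix product: each task computes one result row,
    mirroring a goroutine-per-row decomposition with a channel collecting
    the rows in order."""
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(matmul_row, ((row, B) for row in A)))
```

    `ex.map` preserves row order, playing the role the result channel plays in the Go version.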

  4. THERM3D -- A boundary element computer program for transient heat conduction problems

    Energy Technology Data Exchange (ETDEWEB)

    Ingber, M.S. [New Mexico Univ., Albuquerque, NM (United States). Dept. of Mechanical Engineering

    1994-02-01

    The computer code THERM3D implements the direct boundary element method (BEM) to solve transient heat conduction problems in arbitrary three-dimensional domains. This particular implementation of the BEM avoids performing time-consuming domain integrations by approximating a "generalized forcing function" in the interior of the domain with the use of radial basis functions. An approximate particular solution is then constructed, and the original problem is transformed into a sequence of Laplace problems. The code is capable of handling a large variety of boundary conditions including isothermal, specified flux, convection, radiation, and combined convection and radiation conditions. The computer code is benchmarked by comparisons with analytic and finite element results.

  5. A Parallel 3D Spectral Difference Method for Solutions of Compressible Navier Stokes Equations on Deforming Grids and Simulations of Vortex Induced Vibration

    Science.gov (United States)

    DeJong, Andrew

    Numerical models of fluid-structure interaction have grown in importance due to increasing interest in environmental energy harvesting, airfoil-gust interactions, and bio-inspired formation flying. Powered by increasingly powerful parallel computers, such models seek to explain the fundamental physics behind the complex, unsteady fluid-structure phenomena. To this end, a high-fidelity computational model based on the high-order spectral difference method on 3D unstructured, dynamic meshes has been developed. The spectral difference method constructs continuous solution fields within each element with a Riemann solver to compute the inviscid fluxes at the element interfaces and an averaging mechanism to compute the viscous fluxes. This method has shown promise in the past as a highly accurate, yet sufficiently fast method for solving unsteady viscous compressible flows. The solver is monolithically coupled to the equations of motion of an elastically mounted 3-degree of freedom rigid bluff body undergoing flow-induced lift, drag, and torque. The mesh is deformed using 4 methods: an analytic function, Laplace equation, biharmonic equation, and a bi-elliptic equation with variable diffusivity. This single system of equations -- fluid and structure -- is advanced through time using a 5-stage, 4th-order Runge-Kutta scheme. Message Passing Interface is used to run the coupled system in parallel on up to 240 processors. The solver is validated against previously published numerical and experimental data for an elastically mounted cylinder. The effect of adding an upstream body and inducing wake galloping is observed.

  6. Introduction to 3D Graphics through Excel

    Science.gov (United States)

    Benacka, Jan

    2013-01-01

    The article presents a method of explaining the principles of 3D graphics through making a revolvable and sizable orthographic parallel projection of a cuboid in Excel. No programming is used. The method was tried in fourteen 90 minute lessons with 181 participants, which were Informatics teachers, undergraduates of Applied Informatics and gymnasium…
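    The Excel construction can be mirrored in a few lines of Python: rotate the cuboid's eight vertices and drop the depth coordinate, which is exactly an orthographic parallel projection. The rotation order and parameter names here are illustrative assumptions, not the article's spreadsheet formulas.

```python
import math

def project_cuboid(a, b, c, yaw, pitch):
    """Orthographic parallel projection of a cuboid's 8 vertices onto the
    screen plane after rotating by `yaw` about the vertical axis and
    `pitch` about the horizontal axis."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    pts = []
    for x in (0.0, a):
        for y in (0.0, b):
            for z in (0.0, c):
                # rotate about z (yaw), then about x (pitch)
                x1, y1 = cy * x - sy * y, sy * x + cy * y
                y2, z2 = cp * y1 - sp * z, sp * y1 + cp * z
                pts.append((x1, z2))  # drop depth y2: parallel projection
    return pts
```

    Plotting the returned points (and joining the edges) gives the revolvable wireframe; in the Excel version the two angles are driven by spinner controls.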

  7. Data-Parallel Programming in a Multithreaded Environment

    Directory of Open Access Journals (Sweden)

    Matthew Haines

    1997-01-01

    Research on programming distributed memory multiprocessors has resulted in a well-understood programming model, namely data-parallel programming. However, data-parallel programming in a multithreaded environment is far less understood. For example, if multiple threads within the same process belong to different data-parallel computations, then the architecture, compiler, or run-time system must ensure that relative indexing and collective operations are handled properly and efficiently. We introduce a run-time-based solution for data-parallel programming in a distributed memory environment that handles the problems of relative indexing and collective communications among thread groups. As a result, the data-parallel programming model can now be executed in a multithreaded environment, such as a system using threads to support both task and data parallelism.

  8. Experiences Using Hybrid MPI/OpenMP in the Real World: Parallelization of a 3D CFD Solver for Multi-Core Node Clusters

    OpenAIRE

    Gabriele Jost; Bob Robins

    2010-01-01

    Today most systems in high-performance computing (HPC) feature a hierarchical hardware design: shared-memory nodes with several multi-core CPUs are connected via a network infrastructure. When parallelizing an application for these architectures it seems natural to employ a hierarchical programming model such as combining MPI and OpenMP. Nevertheless, there is the general lore that pure MPI outperforms the hybrid MPI/OpenMP approach. In this paper, we describe the hybrid MPI/OpenMP paralleliz...

  9. Parallel phase model : a programming model for high-end parallel machines with manycores.

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Junfeng (Syracuse University, Syracuse, NY); Wen, Zhaofang; Heroux, Michael Allen; Brightwell, Ronald Brian

    2009-04-01

    This paper presents a parallel programming model, Parallel Phase Model (PPM), for next-generation high-end parallel machines based on a distributed memory architecture consisting of a networked cluster of nodes with a large number of cores on each node. PPM has a unified high-level programming abstraction that facilitates the design and implementation of parallel algorithms to exploit both the parallelism of the many cores and the parallelism at the cluster level. The programming abstraction will be suitable for expressing both fine-grained and coarse-grained parallelism. It includes a few high-level parallel programming language constructs that can be added as an extension to an existing (sequential or parallel) programming language such as C; and the implementation of PPM also includes a light-weight runtime library that runs on top of an existing network communication software layer (e.g. MPI). Design philosophy of PPM and details of the programming abstraction are also presented. Several unstructured applications that inherently require high-volume random fine-grained data accesses have been implemented in PPM with very promising results.

  10. 3D and Education

    Science.gov (United States)

    Meulien Ohlmann, Odile

    2013-02-01

    Today the industry offers a chain of 3D products. Learning to "read" and to "create in 3D" becomes an issue of education of primary importance. Drawing on 25 years of professional experience in France, the United States, and Germany, Odile Meulien has set up a personal method of initiation to 3D creation that entails the spatial/temporal experience of the holographic visual. She will present some of the tools and techniques used for this learning, their advantages and disadvantages, programs and issues of educational policies, and the constraints and expectations related to the development of new techniques for 3D imaging. Although the creation of display holograms is much reduced compared to that of the 1990s, the holographic concept is spreading in all scientific, social, and artistic activities of our time. She will also raise many questions: What does 3D mean? Is it communication? Is it perception? How do seeing and not seeing interfere? What else has to be taken into consideration to communicate in 3D? How do we handle the non-visible relations of moving objects with subjects? Does this transform our model of exchange with others? What kind of interaction does this have with our everyday life? Then come more practical questions: How does one learn to create 3D visualizations, to learn 3D grammar, 3D language, 3D thinking? What for? At what level? In which matter? For whom?

  11. PDDP: A data parallel programming model. Revision 1

    Energy Technology Data Exchange (ETDEWEB)

    Warren, K.H.

    1995-06-01

    PDDP, the Parallel Data Distribution Preprocessor, is a data parallel programming model for distributed memory parallel computers. PDDP implements High Performance Fortran-compatible data distribution directives and parallelism expressed by the use of Fortran 90 array syntax, the FORALL statement, and the WHERE construct. Distributed data objects belong to a global name space; other data objects are treated as local and replicated on each processor. PDDP allows the user to program in a shared-memory style and generates codes that are portable to a variety of parallel machines. For interprocessor communication, PDDP uses the fastest communication primitives on each platform.
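    As a hedged sketch of one ingredient of such models, the helpers below compute an HPF-style BLOCK distribution: which processor owns a global index, and which global indices a given rank holds locally. The function names are illustrative, not PDDP's API.

```python
def block_owner(i, n, nprocs):
    """Owner of global index i under an HPF-style BLOCK distribution of an
    n-element array over nprocs processors (last block may be short)."""
    block = -(-n // nprocs)  # ceil(n / nprocs)
    return i // block

def local_indices(rank, n, nprocs):
    """Global indices held locally by `rank` under the same distribution."""
    block = -(-n // nprocs)
    lo = rank * block
    return range(lo, min(lo + block, n))
```

    An owner-computes compiler or preprocessor uses exactly this mapping to decide which processor executes each iteration of a FORALL and where communication is needed.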

  12. 3-D inelastic analysis methods for hot section components (base program). [turbine blades, turbine vanes, and combustor liners

    Science.gov (United States)

    Wilson, R. B.; Bak, M. J.; Nakazawa, S.; Banerjee, P. K.

    1984-01-01

    A 3-D inelastic analysis methods program consists of a series of computer codes embodying a progression of mathematical models (mechanics of materials, special finite element, boundary element) for streamlined analysis of combustor liners, turbine blades, and turbine vanes. These models address the effects of high temperatures and thermal/mechanical loadings on the local (stress/strain) and global (dynamics, buckling) structural behavior of the three selected components. These models are used to solve 3-D inelastic problems using linear approximations in the sense that stresses/strains and temperatures in generic modeling regions are linear functions of the spatial coordinates, and solution increments for load, temperature and/or time are extrapolated linearly from previous information. Three linear formulation computer codes, referred to as MOMM (Mechanics of Materials Model), MHOST (MARC-Hot Section Technology), and BEST (Boundary Element Stress Technology), were developed and are described.

  13. Optimizing FORTRAN Programs for Hierarchical Memory Parallel Processing Systems

    Institute of Scientific and Technical Information of China (English)

    金国华; 陈福接

    1993-01-01

    Parallel loops account for the greatest amount of parallelism in numerical programs. Executing nested loops in parallel with low run-time overhead is thus very important for achieving high performance in parallel processing systems. However, in parallel processing systems with caches or local memories in memory hierarchies, a “thrashing problem” may arise whenever data move back and forth between the caches or local memories in different processors. Previous techniques can only deal with the rather simple cases with one linear function in the perfectly nested loop. In this paper, we present a parallel program optimizing technique called hybrid loop interchange (HLI) for the cases with multiple linear functions and loop-carried data dependences in the nested loop. With HLI we can easily eliminate or reduce the thrashing phenomena without reducing the program parallelism.
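The basic move that HLI builds on, loop interchange, can be sketched as follows: when iterations carry no dependence, swapping the loops changes only the data access order, not the result. This is an illustrative example, not HLI's actual algorithm:

```python
# Loop interchange sketch: both orders compute the same result, but the
# interchanged version walks each row contiguously, which gives better cache
# locality for row-major storage. Names here are illustrative, not from HLI.

N = 64
a = [[0] * N for _ in range(N)]

# Column-major traversal: strides across rows, poor locality.
for j in range(N):
    for i in range(N):
        a[i][j] = i + j

b = [[0] * N for _ in range(N)]

# After loop interchange: row-major traversal, same values computed.
for i in range(N):
    for j in range(N):
        b[i][j] = i + j

assert a == b  # interchange is legal here: no loop-carried dependence
```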

  14. Parallel Programming with Matrix Distributed Processing

    OpenAIRE

    Di Pierro, Massimo

    2005-01-01

    Matrix Distributed Processing (MDP) is a C++ library for fast development of efficient parallel algorithms. It constitutes the core of FermiQCD. MDP enables programmers to focus on algorithms, while parallelization is dealt with automatically and transparently. Here we present a brief overview of MDP and examples of applications in Computer Science (Cellular Automata), Engineering (PDE Solver) and Physics (Ising Model).

  15. Matlab-based 3D visualization programming and its application in uranium exploration

    International Nuclear Information System (INIS)

    Geological theory, geophysical curves, and Matlab three-dimensional visualization programming were combined and applied to uranium exploration. With its simple programming, convenient numerical processing, and graphical visualization features, Matlab proved effective in identifying ore bodies, tracing ore, delineating the extent of ore bodies, and analyzing the sedimentary environment. (author)

  16. The SMC (Short Model Coil) Nb3Sn Program: FE Analysis with 3D Modeling

    CERN Document Server

    Kokkinos, C; Guinchard, M; Karppinen, M; Manil, P; Perez, J C; Regis, F

    2012-01-01

    The SMC (Short Model Coil) project aims at testing superconducting coils in racetrack configuration, wound with Nb3Sn cable. The degradation of the magnetic properties of the cable is studied by applying different levels of pre-stress. It is an essential step in the validation of procedures for the construction of superconducting magnets with high performance conductor. Two SMC assemblies have been completed and cold tested in the frame of a European collaboration between CEA (FR), CERN and STFC (UK), with technical support from LBNL (US). The second assembly showed remarkably good quench results, reaching a peak field of 12.5 T. This paper details the new 3D modeling method of the SMC, implemented using the ANSYS® Workbench environment. Advanced computer-aided-design (CAD) tools are combined with multi-physics Finite Element Analyses (FEA), in the same integrated graphic interface, forming a fully parametric model that enables simulation driven development of the SMC project. The magnetic and structural ...

  17. Programming parallel architectures - The BLAZE family of languages

    Science.gov (United States)

    Mehrotra, Piyush

    1989-01-01

    This paper gives an overview of the various approaches to programming multiprocessor architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive, since they remove much of the burden of exploiting parallel architectures from the user. This paper also describes recent work in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described.

  18. The Double Hierarchy Method. A parallel 3D contact method for the interaction of spherical particles with rigid FE boundaries using the DEM

    Science.gov (United States)

    Santasusana, Miquel; Irazábal, Joaquín; Oñate, Eugenio; Carbonell, Josep Maria

    2016-07-01

    In this work, we present a new methodology for the treatment of the contact interaction between rigid boundaries and spherical discrete elements (DE). Rigid body parts are present in most large-scale simulations. The surfaces of the rigid parts are commonly meshed with a finite element-like (FE) discretization. The contact detection and calculation between those DE and the discretized boundaries is not straightforward and has been addressed by different approaches. The algorithm presented in this paper considers the contact of the DEs with the geometric primitives of a FE mesh, i.e. facet, edge or vertex. To do so, the original hierarchical method presented by Horner et al. (J Eng Mech 127(10):1027-1032, 2001) is extended with a new insight leading to a robust, fast and accurate 3D contact algorithm which is fully parallelizable. The implementation of the method has been developed to deal ideally with triangles and quadrilaterals. If the boundaries are discretized with another type of geometry, the method can be easily extended to higher order planar convex polyhedra. A detailed description of the procedure followed to treat a wide range of cases is presented, and the developed algorithm is validated with several practical examples. The parallelization capabilities and the obtained performance are presented with the study of an industrial application example.

  19. Survey on present status and trend of parallel programming environments

    International Nuclear Information System (INIS)

    This report intends to provide useful information on software tools for parallel programming through a survey of the parallel programming environments of the following six parallel computers: Fujitsu VPP300/500, NEC SX-4, Hitachi SR2201, Cray T94, IBM SP, and Intel Paragon, all of which are installed at the Japan Atomic Energy Research Institute (JAERI). Moreover, the present status of R and D on parallel software such as parallel languages, compilers, debuggers, performance evaluation tools, and integrated tools is reported. This survey has been made as a part of our project of developing a basic software for parallel programming environments, which is designed on the concept of STA (Seamless Thinking Aid to programmers). (author)

  20. The BLAZE language: A parallel language for scientific programming

    Science.gov (United States)

    Mehrotra, P.; Vanrosendale, J.

    1985-01-01

    A Pascal-like scientific programming language, Blaze, is described. Blaze contains array arithmetic, forall loops, and APL-style accumulation operators, which allow natural expression of fine grained parallelism. It also employs an applicative or functional procedure invocation mechanism, which makes it easy for compilers to extract coarse grained parallelism using machine specific program restructuring. Thus Blaze should allow one to achieve highly parallel execution on multiprocessor architectures, while still providing the user with conceptually sequential control flow. A central goal in the design of Blaze is portability across a broad range of parallel architectures. The multiple levels of parallelism present in Blaze code, in principle, allow a compiler to extract the types of parallelism appropriate for the given architecture while neglecting the remainder. The features of Blaze are described, and it is shown how this language would be used in typical scientific programming.

  1. The BLAZE language - A parallel language for scientific programming

    Science.gov (United States)

    Mehrotra, Piyush; Van Rosendale, John

    1987-01-01

    A Pascal-like scientific programming language, BLAZE, is described. BLAZE contains array arithmetic, forall loops, and APL-style accumulation operators, which allow natural expression of fine grained parallelism. It also employs an applicative or functional procedure invocation mechanism, which makes it easy for compilers to extract coarse grained parallelism using machine specific program restructuring. Thus BLAZE should allow one to achieve highly parallel execution on multiprocessor architectures, while still providing the user with conceptually sequential control flow. A central goal in the design of BLAZE is portability across a broad range of parallel architectures. The multiple levels of parallelism present in BLAZE code, in principle, allow a compiler to extract the types of parallelism appropriate for the given architecture while neglecting the remainder. The features of BLAZE are described and it is shown how this language would be used in typical scientific programming.

  2. Professional Parallel Programming with C# Master Parallel Extensions with NET 4

    CERN Document Server

    Hillar, Gastón

    2010-01-01

    Expert guidance for those programming today's dual-core processor PCs. As PC processors explode from one or two to now eight cores, there is an urgent need for programmers to master concurrent programming. This book dives deep into the latest technologies available to programmers for creating professional parallel applications using C#, .NET 4, and Visual Studio 2010. The book covers task-based programming, coordination data structures, PLINQ, thread pools, the asynchronous programming model, and more. It also teaches other parallel programming techniques, such as SIMD and vectorization.

  3. The 3D structure of the hadrons: recent results and experimental program at Jefferson Lab

    Directory of Open Access Journals (Sweden)

    Muñoz Camacho C.

    2014-04-01

    The understanding of Quantum Chromodynamics (QCD) at large distances still remains one of the main outstanding problems of nuclear physics. Studying the internal structure of hadrons provides a way to probe QCD in the non-perturbative domain and can help us unravel the internal structure of the most elementary blocks of matter. Jefferson Lab (JLab) has already delivered results on how elementary quarks and gluons create nucleon structure and properties. The upgrade of JLab to 12 GeV will allow the full exploration of the valence-quark structure of nucleons and the extraction of real three-dimensional pictures. I will present recent results and review the future experimental program at JLab.

  4. Using Fuzzy Gaussian Inference and Genetic Programming to Classify 3D Human Motions

    Science.gov (United States)

    Khoury, Mehdi; Liu, Honghai

    This research introduces and builds on the concept of Fuzzy Gaussian Inference (FGI) (Khoury and Liu in Proceedings of UKCI, 2008 and IEEE Workshop on Robotic Intelligence in Informationally Structured Space (RiiSS 2009), 2009) as a novel way to build Fuzzy Membership Functions that map to hidden Probability Distributions underlying human motions. This method is now combined with a Genetic Programming Fuzzy rule-based system in order to classify boxing moves from natural human Motion Capture data. In this experiment, FGI alone is able to recognise seven different boxing stances simultaneously with an accuracy superior to a GMM-based classifier. Results seem to indicate that adding an evolutionary Fuzzy Inference Engine on top of FGI improves the accuracy of the classifier in a consistent way.

  5. Programming parallel architectures: The BLAZE family of languages

    Science.gov (United States)

    Mehrotra, Piyush

    1988-01-01

    Programming multiprocessor architectures is a critical research issue. An overview is given of the various approaches to programming these architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive since they remove much of the burden of exploiting parallel architectures from the user. Also described is recent work by the author in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described, as well as the relations of this work to other current language research projects.

  6. A Performance Analysis Tool for PVM Parallel Programs

    Institute of Scientific and Technical Information of China (English)

    Chen Wang; Yin Liu; Changjun Jiang; Zhaoqing Zhang

    2004-01-01

    In this paper, we introduce the design and implementation of ParaVT, which is a visual performance analysis and parallel debugging tool. In ParaVT, we propose an automated instrumentation mechanism. Based on this mechanism, ParaVT automatically analyzes the performance bottleneck of parallel applications and provides a visual user interface to monitor and analyze the performance of parallel programs. In addition, it also supports certain extensions.

  7. Hybrid 3-D rocket trajectory program. Part 1: Formulation and analysis. Part 2: Computer programming and user's instruction. [computerized simulation using three dimensional motion analysis

    Science.gov (United States)

    Huang, L. C. P.; Cook, R. A.

    1973-01-01

    Models utilizing various sub-sets of the six degrees of freedom are used in trajectory simulation. A 3-D model with only linear degrees of freedom is especially attractive, since the coefficients for the angular degrees of freedom are the most difficult to determine and the angular equations are the most time consuming for the computer to evaluate. A computer program is developed that uses three separate subsections to predict trajectories. A launch rail subsection is used until the rocket has left its launcher. The program then switches to a special 3-D section which computes motions in two linear and one angular degrees of freedom. When the rocket trims out, the program switches to the standard, three linear degrees of freedom model.
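A toy phase-switched integrator in the spirit of the three subsections described above might look like this in Python; all dynamics, constants, and thresholds here are invented for illustration and are not taken from the report:

```python
# Minimal sketch of a phase-switched trajectory simulation: a launch-rail
# phase followed by free flight. The dynamics are deliberately simplistic
# (constant thrust on a 45-degree rail, then ballistic motion in 2 linear
# DOF); none of these values come from the original program.

def simulate(rail_length=1.0, dt=0.01, steps=400):
    x, z, vx, vz = 0.0, 0.0, 0.0, 0.0
    thrust, g = 30.0, 9.81   # accelerations in m/s^2, invented values
    phase = "rail"
    for _ in range(steps):
        if phase == "rail":
            # constrained to the launcher: accelerate along the rail
            a = thrust - g * 0.707
            vx += a * 0.707 * dt
            vz += a * 0.707 * dt
            if (x * x + z * z) ** 0.5 >= rail_length:
                phase = "free"           # rocket has left its launcher
        else:
            vz -= g * dt                 # free flight: gravity only
        x += vx * dt
        z += vz * dt
    return phase, x, z

phase, x, z = simulate()
```

The structure mirrors the report's design: the integrator state persists across the phase switch, and each phase only changes which force model is applied.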

  8. Portable Parallel CORBA Objects: an Approach to Combine Parallel and Distributed Programming for Grid Computing

    OpenAIRE

    Denis, Alexandre; Pérez, Christian; Priol, Thierry

    2001-01-01

    With the availability of Computational Grids, new kinds of applications that will soon emerge will raise the problem of how to program them on such computing systems. In this paper, we advocate a programming model that is based on a combination of parallel and distributed programming models. Compared to previous approaches, this work aims at bringing SPMD programming into CORBA. For example, we want to interconnect two MPI codes by CORBA without modifying MPI or CORBA. We show that such an ap...

  9. Detection of And—Parallelism in Logic Programs

    Institute of Scientific and Technical Information of China (English)

    黄志毅; 胡守仁

    1990-01-01

    In this paper, we present a detection technique of and-parallelism in logic programs. The detection consists of three phases: analysis of entry modes, derivation of exit modes, and determination of execution graph expressions. Compared with other techniques [2,4,5], our approach, with its compile-time program-level data-dependence analysis of logic programs, can efficiently exploit and-parallelism in logic programs. Two precompilers, based on our technique and DeGroot's approach [3] respectively, have been implemented in the SES-PIM system [12]. Through compiling and running some typical benchmarks in SES-PIM, we conclude that our technique can, in most cases, exploit as much and-parallelism as the dynamic approach [13] does under the “producer-consumer” scheme, and needs less dynamic overhead while exploiting more and-parallelism than DeGroot's approach does.

  10. Programming Probabilistic Structural Analysis for Parallel Processing Computer

    Science.gov (United States)

    Sues, Robert H.; Chen, Heh-Chyun; Twisdale, Lawrence A.; Chamis, Christos C.; Murthy, Pappu L. N.

    1991-01-01

    The ultimate goal of this research program is to make Probabilistic Structural Analysis (PSA) computationally efficient and hence practical for the design environment by achieving large scale parallelism. The paper identifies the multiple levels of parallelism in PSA, identifies methodologies for exploiting this parallelism, describes the development of a parallel stochastic finite element code, and presents results of two example applications. It is demonstrated that speeds within five percent of those theoretically possible can be achieved. A special-purpose numerical technique, the stochastic preconditioned conjugate gradient method, is also presented and demonstrated to be extremely efficient for certain classes of PSA problems.

  11. Defeasible logic programming: language definition, operational semantics, and parallelism

    OpenAIRE

    García, Alejandro Javier

    2001-01-01

    This thesis defines Defeasible Logic Programming and provides a concrete specification of this new language through its operational semantics. Defeasible Logic Programming, or DeLP for short, has been defined based on the Logic Programming paradigm and considering features of recent developments in the area of Defeasible Argumentation. DeLP relates and improves many aspects of the areas of Logic Programming, Defeasible Argumentation, Intelligent Agents, and Parallel Logic Programming

  12. Documentation of program AFTBDY to generate coordinate system for 3D after body using body fitted curvilinear coordinates, part 1

    Science.gov (United States)

    Kumar, D.

    1980-01-01

    The computer program AFTBDY generates a body fitted curvilinear coordinate system for a wedge curved after body. This wedge curved after body is being used in an experimental program. The coordinate system generated by AFTBDY is used to solve 3D compressible N.S. equations. The coordinate system in the physical plane is a cartesian x,y,z system, whereas, in the transformed plane a rectangular xi, eta, zeta system is used. The coordinate system generated is such that in the transformed plane coordinate spacing in the xi, eta, zeta direction is constant and equal to unity. The physical plane coordinate lines in the different regions are clustered heavily or sparsely depending on the regions where physical quantities to be solved for by the N.S. equations have high or low gradients. The coordinate distribution in the physical plane is such that x stays constant in eta and zeta direction, whereas, z stays constant in xi and eta direction. The desired distribution in x and z is input to the program. Consequently, only the y-coordinate is solved for by the program AFTBDY.

  13. The FORCE - A highly portable parallel programming language

    Science.gov (United States)

    Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

    1989-01-01

    This paper explains why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low-level machine dependencies and to build machine-independent high-level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared-memory multiprocessor executing them.
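The Force-style SPMD pattern, in which every process runs the same program, splits loop iterations by process id, and synchronizes at barriers, can be sketched with threads; the helper names below are ours, not actual Force constructs:

```python
# SPMD sketch: the same body runs in NPROC threads, work is divided by
# thread id, and a barrier separates phases. The code is written so that it
# is correct for any NPROC, mirroring the Force's process-count independence.
import threading

NPROC = 4
N = 100
partial = [0] * NPROC
barrier = threading.Barrier(NPROC)
total = []

def body(me):
    # prescheduled loop: iterations me, me + NPROC, me + 2*NPROC, ...
    partial[me] = sum(range(me, N, NPROC))
    barrier.wait()                      # all partial sums are now ready
    if me == 0:                         # one process combines the results
        total.append(sum(partial))
    barrier.wait()

threads = [threading.Thread(target=body, args=(p,)) for p in range(NPROC)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(total[0])  # sum of 0..99 = 4950
```

Changing NPROC does not change the result, only how the iterations are divided, which is the portability property the abstract emphasizes.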

  14. The FORCE: A highly portable parallel programming language

    Science.gov (United States)

    Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

    1989-01-01

    Here, it is explained why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low-level machine dependencies and to build machine-independent high-level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared memory multiprocessor executing them.

  15. Declarative Parallel Programming in Spreadsheet End-User Development

    DEFF Research Database (Denmark)

    Biermann, Florian

    2016-01-01

    Spreadsheets are first-order functional languages and are widely used in research and industry as a tool to conveniently perform all kinds of computations. Because cells on a spreadsheet are immutable, there are possibilities for implicit parallelization of spreadsheet computations. In this literature study, we provide an overview of the publications on spreadsheet end-user programming and declarative array programming to inform further research on parallel programming in spreadsheets. Our results show that there is a clear overlap between spreadsheet programming and array programming and we can directly apply results from functional array programming to a spreadsheet model of computations.
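The implicit parallelism the study points to can be sketched by evaluating cells level by level in dependency order; cells returned together by the sorter are mutually independent and could be mapped over in parallel (the sheet below is an invented example, not from the thesis):

```python
# Spreadsheet evaluation sketch: because cells are immutable, any group of
# cells whose dependencies are already computed can be evaluated in parallel.
from graphlib import TopologicalSorter

# cell -> (dependencies, formula over the dependency values)
sheet = {
    "A1": ((), lambda: 2),
    "A2": ((), lambda: 3),
    "B1": (("A1", "A2"), lambda a, b: a + b),
    "B2": (("A1",), lambda a: a * 10),
    "C1": (("B1", "B2"), lambda a, b: a * b),
}

ts = TopologicalSorter({cell: deps for cell, (deps, _) in sheet.items()})
ts.prepare()
values = {}
while ts.is_active():
    ready = ts.get_ready()          # these cells are mutually independent:
    for cell in ready:              # a parallel runtime could map over them
        deps, fn = sheet[cell]
        values[cell] = fn(*(values[d] for d in deps))
    ts.done(*ready)

print(values["C1"])  # (2+3) * (2*10) = 100
```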

  16. Integrated Task And Data Parallel Programming: Language Design

    Science.gov (United States)

    Grimshaw, Andrew S.; West, Emily A.

    1998-01-01

    This research investigates the combination of task and data parallel language constructs within a single programming language. There are a number of applications that exhibit properties which would be well served by such an integrated language. Examples include global climate models, aircraft design problems, and multidisciplinary design optimization problems. Our approach incorporates data parallel language constructs into an existing, object oriented, task parallel language. The language will support creation and manipulation of parallel classes and objects of both types (task parallel and data parallel). Ultimately, the language will allow data parallel and task parallel classes to be used either as building blocks or managers of parallel objects of either type, thus allowing the development of single and multi-paradigm parallel applications. 1995 Research Accomplishments: In February I presented a paper at Frontiers '95 describing the design of the data parallel language subset. During the spring I wrote and defended my dissertation proposal. Since that time I have developed a runtime model for the language subset. I have begun implementing the model and hand-coding simple examples which demonstrate the language subset. I have identified an astrophysical fluid flow application which will validate the data parallel language subset. 1996 Research Agenda: Milestones for the coming year include implementing a significant portion of the data parallel language subset over the Legion system. Using simple hand-coded methods, I plan to demonstrate (1) concurrent task and data parallel objects and (2) task parallel objects managing both task and data parallel objects. My next steps will focus on constructing a compiler and implementing the fluid flow application with the language. Concurrently, I will conduct a search for a real-world application exhibiting both task and data parallelism within the same program. Additional 1995 Activities: During the fall I collaborated

  17. Evaluating integration of inland bathymetry in the U.S. Geological Survey 3D Elevation Program, 2014

    Science.gov (United States)

    Miller-Corbett, Cynthia

    2016-09-01

    Inland bathymetry survey collections, survey data types, features, sources, availability, and the effort required to integrate inland bathymetric data into the U.S. Geological Survey 3D Elevation Program are assessed to help determine the feasibility of integrating three-dimensional water feature elevation data into The National Map. Available data from wading, acoustic, light detection and ranging, and combined technique surveys are provided by the U.S. Geological Survey, National Oceanic and Atmospheric Administration, U.S. Army Corps of Engineers, and other sources. Inland bathymetric data accessed through Web-hosted resources or contacts provide useful baseline parameters for evaluating survey types and techniques used for collection and processing, and serve as a basis for comparing survey methods and the quality of results. Historically, boat-mounted acoustic surveys have provided most inland bathymetry data. Light detection and ranging techniques that are beneficial in areas hard to reach by boat, that can collect dense data in shallow water to provide comprehensive coverage, and that can be cost effective for surveying large areas with good water clarity are becoming more common; however, optimal conditions and techniques for collecting and processing light detection and ranging inland bathymetry surveys are not yet well defined.Assessment of site condition parameters important for understanding inland bathymetry survey issues and results, and an evaluation of existing inland bathymetry survey coverage are proposed as steps to develop criteria for implementing a useful and successful inland bathymetry survey plan in the 3D Elevation Program. These survey parameters would also serve as input for an inland bathymetry survey data baseline. Integration and interpolation techniques are important factors to consider in developing a robust plan; however, available survey data are usually in a triangulated irregular network format or other format compatible with


  19. Characterizing and Mitigating Work Time Inflation in Task Parallel Programs

    Directory of Open Access Journals (Sweden)

    Stephen L. Olivier

    2013-01-01

    Task parallelism raises the level of abstraction in shared memory parallel programming to simplify the development of complex applications. However, task parallel applications can exhibit poor performance due to thread idleness, scheduling overheads, and work time inflation – additional time spent by threads in a multithreaded computation beyond the time required to perform the same work in a sequential computation. We identify the contributions of each factor to lost efficiency in various task parallel OpenMP applications and diagnose the causes of work time inflation in those applications. Increased data access latency can cause significant work time inflation in NUMA systems. Our locality framework for task parallel OpenMP programs mitigates this cause of work time inflation. Our extensions to the Qthreads library demonstrate that locality-aware scheduling can improve performance up to 3X compared to the Intel OpenMP task scheduler.
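The work time inflation metric described above reduces to simple arithmetic: the ratio of summed per-thread work time to the sequential work time. A sketch with invented timing numbers:

```python
# Work time inflation sketch. A value above 1.0 means threads collectively
# spent more time doing the work than a sequential run would, even before
# counting idleness or scheduling overhead. The numbers are invented.

def work_time_inflation(per_thread_work, sequential_time):
    """Ratio of summed threaded work time to sequential work time."""
    return sum(per_thread_work) / sequential_time

# 4 threads each spend 3.0 s of pure work time on a job that takes 10.0 s
# sequentially: 12.0 s of work for 10.0 s worth of computation, i.e. 20%
# inflation (e.g. from remote NUMA accesses).
print(work_time_inflation([3.0, 3.0, 3.0, 3.0], 10.0))  # 1.2
```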

  20. Retargeting of existing FORTRAN program and development of parallel compilers

    Science.gov (United States)

    Agrawal, Dharma P.

    1988-01-01

    The software models used in implementing the parallelizing compiler for the B-HIVE multiprocessor system are described. The various models and strategies used in the compiler development are: flexible granularity model, which allows a compromise between two extreme granularity models; communication model, which is capable of precisely describing the interprocessor communication timings and patterns; loop type detection strategy, which identifies different types of loops; critical path with coloring scheme, which is a versatile scheduling strategy for any multicomputer with some associated communication costs; and loop allocation strategy, which realizes optimum overlapped operations between computation and communication of the system. Using these models, several sample routines of the AIR3D package are examined and tested. It may be noted that automatically generated codes are highly parallelized to provide the maximized degree of parallelism, obtaining speedups of up to 28 on a 32-processor system. A comparison of parallel codes for both the existing and proposed communication models is performed and the corresponding expected speedup factors are obtained. The experimentation shows that the B-HIVE compiler produces more efficient codes than existing techniques. Work is progressing well in completing the final phase of the compiler. Numerous enhancements are needed to improve the capabilities of the parallelizing compiler.

  1. HPF Implementation of ARC3D

    Science.gov (United States)

    Frumkin, Michael; Yan, Jerry

    1999-01-01

    We present an HPF (High Performance Fortran) implementation of ARC3D code along with the profiling and performance data on SGI Origin 2000. Advantages and limitations of HPF as a parallel programming language for CFD applications are discussed. For achieving good performance results we used the data distributions optimized for implementation of implicit and explicit operators of the solver and boundary conditions. We compare the results with MPI and directive based implementations.

  2. Methodological proposal for the volumetric study of archaeological ceramics through 3D editing free-software programs: the case of the Celtiberian cemeteries of the Meseta

    Directory of Open Access Journals (Sweden)

    Álvaro Sánchez Climent

    2014-10-01

    Nowadays free-software programs have become ideal tools for archaeological research, reaching the same level as commercial programs. The 3D modeling tool Blender has accordingly gained great popularity in recent years, offering features similar to those of commercial 3D editing programs such as 3D Studio Max or AutoCAD. Recently, the script necessary for the volumetric calculation of three-dimensional objects has been developed, offering great possibilities for calculating the volume of archaeological ceramics. In this paper, we present a methodological approach to volumetric studies with Blender and a case study of funerary urns from several Celtiberian cemeteries of the Spanish Meseta. The goal is to demonstrate the great potential that free-software 3D editing tools currently have for volumetric studies.
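Volumetric scripts for triangle meshes typically sum signed tetrahedron volumes via the divergence theorem; the following is our own sketch of that standard technique (not the paper's Blender script), checked on a unit cube:

```python
# Signed-tetrahedron volume of a closed triangle mesh: each triangle forms a
# tetrahedron with the origin; summing the signed volumes over a consistently
# oriented closed surface yields the enclosed volume.

def mesh_volume(vertices, triangles):
    """Volume of a closed triangle mesh with outward-facing triangles."""
    vol = 0.0
    for i, j, k in triangles:
        (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = (
            vertices[i], vertices[j], vertices[k])
        # signed volume of the tetrahedron (origin, vi, vj, vk)
        vol += (x1 * (y2 * z3 - y3 * z2)
                - x2 * (y1 * z3 - y3 * z1)
                + x3 * (y1 * z2 - y2 * z1)) / 6.0
    return abs(vol)

# unit cube: 8 vertices, 12 outward-oriented triangles
cube_v = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
          (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)]
cube_t = [(0, 3, 2), (0, 2, 1), (4, 5, 6), (4, 6, 7),
          (0, 1, 5), (0, 5, 4), (2, 3, 7), (2, 7, 6),
          (0, 4, 7), (0, 7, 3), (1, 2, 6), (1, 6, 5)]
print(mesh_volume(cube_v, cube_t))  # → 1.0 (up to float rounding)
```

The same formula works on a vessel's triangulated surface regardless of shape, which is what makes it suitable for capacity studies of ceramics.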

  3. 3D Printing an Octahedron

    OpenAIRE

    Aboufadel, Edward F.

    2014-01-01

    The purpose of this short paper is to describe a project to manufacture a regular octahedron on a 3D printer. We assume that the reader is familiar with the basics of 3D printing. In the project, we use fundamental ideas to calculate the vertices and faces of an octahedron. Then, we utilize the OPENSCAD program to create a virtual 3D model and an STereoLithography (.stl) file that can be used by a 3D printer.
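The workflow the paper describes (vertices and faces first, then an .stl file) can be sketched in Python without OPENSCAD: a regular octahedron with unit circumradius has its six vertices on the coordinate axes and eight triangular faces. A hedged sketch (ASCII rather than binary STL; the normals are written as zero, which most slicers tolerate by recomputing them from the vertex order):

```python
def octahedron():
    """Vertices and faces of a regular octahedron with circumradius 1."""
    vertices = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    faces = [(0, 2, 4), (2, 1, 4), (1, 3, 4), (3, 0, 4),   # upper four
             (2, 0, 5), (1, 2, 5), (3, 1, 5), (0, 3, 5)]   # lower four
    return vertices, faces

def write_ascii_stl(path, vertices, faces, name="octahedron"):
    """Write an ASCII STL file: one 'facet' block per triangle."""
    with open(path, "w") as f:
        f.write(f"solid {name}\n")
        for a, b, c in faces:
            f.write("  facet normal 0 0 0\n    outer loop\n")
            for idx in (a, b, c):
                x, y, z = vertices[idx]
                f.write(f"      vertex {x} {y} {z}\n")
            f.write("    endloop\n  endfacet\n")
        f.write(f"endsolid {name}\n")
```

The resulting file can be loaded directly by common slicing programs, mirroring the role the .stl export plays in the paper's OPENSCAD workflow.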

  4. Parallelization of detailed thermal-hydraulic analysis program SPIRAL

    International Nuclear Information System (INIS)

    The detailed thermal-hydraulic analysis computer program SPIRAL is under development for the evaluation of local flow and temperature fields in wire-wrapped fuel pin bundles deformed under high burn-up, which are hard to reveal by experiment due to measurement difficulty. Coupling this program with a subchannel analysis program offers a practical method to evaluate thermal-hydraulic behavior in a whole fuel assembly with high accuracy. This report describes the parallelization of SPIRAL for improving its applicability to larger numerical simulations. The domain decomposition method using overlapped elements was adopted for the parallelization because SPIRAL is based on the finite element method, and this approach minimizes the number of communications between processor elements. The Message Passing Interface (MPI) was applied as the parallelization programming library. Several numerical simulations were carried out to verify the parallelized version of SPIRAL and to evaluate parallelization efficiency. These simulation results confirmed the validity of this version. Although no good parallelization efficiency was obtained for small-scale simulations due to overhead, a speedup of approximately twelve was achieved using 16 processor elements in larger-scale simulations. (author)
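The figures quoted above can be checked with the standard definitions: a speedup of about twelve on 16 processor elements is a parallel efficiency of 75%, and by Amdahl's law it implies a serial-plus-overhead fraction of roughly 2%. A small sketch of those definitions:

```python
def speedup(t_serial, t_parallel):
    """Ratio of single-processor run time to parallel run time."""
    return t_serial / t_parallel

def efficiency(s, p):
    """Parallel efficiency: speedup divided by processor count."""
    return s / p

def amdahl_serial_fraction(s, p):
    """Serial fraction f implied by Amdahl's law, S = 1 / (f + (1 - f) / p),
    solved for f given an observed speedup s on p processors."""
    return (p / s - 1) / (p - 1)
```

For s = 12 and p = 16 this gives efficiency 0.75 and a serial fraction of 1/45, consistent with the report's observation that overhead dominates only in small-scale runs.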

  5. Development of massively parallel quantum chemistry program SMASH

    International Nuclear Information System (INIS)

    A massively parallel program for quantum chemistry calculations SMASH was released under the Apache License 2.0 in September 2014. The SMASH program is written in the Fortran90/95 language with MPI and OpenMP standards for parallelization. Frequently used routines, such as one- and two-electron integral calculations, are modularized to make program developments simple. The speed-up of the B3LYP energy calculation for (C150H30)2 with the cc-pVDZ basis set (4500 basis functions) was 50,499 on 98,304 cores of the K computer.

  6. Development of massively parallel quantum chemistry program SMASH

    Science.gov (United States)

    Ishimura, Kazuya

    2015-12-01

    A massively parallel program for quantum chemistry calculations SMASH was released under the Apache License 2.0 in September 2014. The SMASH program is written in the Fortran90/95 language with MPI and OpenMP standards for parallelization. Frequently used routines, such as one- and two-electron integral calculations, are modularized to make program developments simple. The speed-up of the B3LYP energy calculation for (C150H30)2 with the cc-pVDZ basis set (4500 basis functions) was 50,499 on 98,304 cores of the K computer.

  7. Development of massively parallel quantum chemistry program SMASH

    Energy Technology Data Exchange (ETDEWEB)

    Ishimura, Kazuya [Department of Theoretical and Computational Molecular Science, Institute for Molecular Science 38 Nishigo-Naka, Myodaiji, Okazaki, Aichi 444-8585 (Japan)

    2015-12-31

    A massively parallel program for quantum chemistry calculations SMASH was released under the Apache License 2.0 in September 2014. The SMASH program is written in the Fortran90/95 language with MPI and OpenMP standards for parallelization. Frequently used routines, such as one- and two-electron integral calculations, are modularized to make program developments simple. The speed-up of the B3LYP energy calculation for (C{sub 150}H{sub 30}){sub 2} with the cc-pVDZ basis set (4500 basis functions) was 50,499 on 98,304 cores of the K computer.

  8. The MHOST finite element program: 3-D inelastic analysis methods for hot section components. Volume 1: Theoretical manual

    Science.gov (United States)

    Nakazawa, Shohei

    1991-01-01

    Formulations and algorithms implemented in the MHOST finite element program are discussed. The code uses a novel concept of the mixed iterative solution technique for the efficient 3-D computations of turbine engine hot section components. The general framework of variational formulation and solution algorithms are discussed which were derived from the mixed three field Hu-Washizu principle. This formulation enables the use of nodal interpolation for coordinates, displacements, strains, and stresses. Algorithmic description of the mixed iterative method includes variations for the quasi static, transient dynamic and buckling analyses. The global-local analysis procedure referred to as the subelement refinement is developed in the framework of the mixed iterative solution, of which the detail is presented. The numerically integrated isoparametric elements implemented in the framework is discussed. Methods to filter certain parts of strain and project the element discontinuous quantities to the nodes are developed for a family of linear elements. Integration algorithms are described for linear and nonlinear equations included in MHOST program.

  9. Web Based Parallel Programming Workshop for Undergraduate Education.

    Science.gov (United States)

    Marcus, Robert L.; Robertson, Douglass

    Central State University (Ohio), under a contract with Nichols Research Corporation, has developed a World Wide Web-based workshop on high performance computing entitled "IBM SP2 Parallel Programming Workshop." The research is part of the DoD (Department of Defense) High Performance Computing Modernization Program. The research activities included…

  10. Protocol-Based Verification of Message-Passing Parallel Programs

    DEFF Research Database (Denmark)

    López-Acosta, Hugo-Andrés; Marques, Eduardo R. B.; Martins, Francisco;

    2015-01-01

    The paper presents a protocol language based on a dependent type system for message-passing parallel programs, which includes various communication operators, such as point-to-point messages, broadcast, reduce, array scatter and gather. For the verification of a program against a given protocol, the protocol is first...

  11. Parallelizing Particle Swarm Optimization in a Functional Programming Environment

    Directory of Open Access Journals (Sweden)

    Pablo Rabanal

    2014-10-01

    Full Text Available Many bioinspired methods are based on using several simple entities which search for a reasonable solution (somehow independently). This is the case of Particle Swarm Optimization (PSO), where many simple particles search for the optimum solution by using both their local information and the information of the best solution found so far by any of the other particles. Particles are partially independent, and we can take advantage of this fact to parallelize PSO programs. Unfortunately, providing good parallel implementations of each specific PSO program can be tricky and time-consuming for the programmer. In this paper we introduce several parallel functional skeletons which, given a sequential PSO implementation, automatically provide the corresponding parallel implementations of it. We use these skeletons and report some experimental results. We observe that, despite the low effort required from programmers to use these skeletons, empirical results show that the skeletons reach reasonable speedups.
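The skeleton idea can be illustrated in Python: write the sequential PSO once, but route the fitness evaluations (the independent, parallelisable step) through a map function supplied as a parameter. Passing the builtin map gives the sequential version; passing multiprocessing.Pool().map gives a parallel one with no change to the algorithm. This is a sketch of the concept only, not the paper's functional skeletons, which target a functional language:

```python
import random

def pso(fitness, dim, n_particles=20, iters=50, map_fn=map, seed=0):
    """Minimal PSO: all fitness evaluations go through map_fn."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = list(map_fn(fitness, pbest))
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.5 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
        vals = list(map_fn(fitness, pos))   # the parallelisable step
        for i, v in enumerate(vals):
            if v < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], v
                if v < gbest_val:
                    gbest, gbest_val = pos[i][:], v
    return gbest, gbest_val

def sphere(x):
    """Benchmark objective: minimum 0 at the origin."""
    return sum(c * c for c in x)
```

Note that for multiprocessing.Pool the fitness function must be defined at module level (lambdas are not picklable), a constraint the skeleton approach hides from the user.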

  12. Optimized Parallel Execution of Declarative Programs on Distributed Memory Multiprocessors

    Institute of Scientific and Technical Information of China (English)

    Shen Meiming; Tian Xinmin; et al.

    1993-01-01

    In this paper, we focus on the compiling implementation of the parallel logic language PARLOG and the functional language ML on distributed-memory multiprocessors. Under the graph rewriting framework, a Heterogeneous Parallel Graph Rewriting Execution Model (HPGREM) is presented first. Then, based on HPGREM, a parallel abstract machine PAM/TGR is described. Furthermore, several optimizing compilation schemes for executing declarative programs on a transputer array are proposed. The performance statistics on the transputer array demonstrate the effectiveness of our model, parallel abstract machine, optimizing compilation strategies and compiler.

  13. Parallel Programming with MatlabMPI

    OpenAIRE

    Kepner, Jeremy

    2001-01-01

    MatlabMPI is a Matlab implementation of the Message Passing Interface (MPI) standard and allows any Matlab program to exploit multiple processors. MatlabMPI currently implements the basic six functions that are the core of the MPI point-to-point communications standard. The key technical innovation of MatlabMPI is that it implements the widely used MPI ``look and feel'' on top of standard Matlab file I/O, resulting in an extremely compact (~100 lines) and ``pure'' implementation which runs an...
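The core trick, implementing MPI-style point-to-point messaging purely with file I/O, can be sketched in Python (hypothetical function names; MatlabMPI itself is Matlab code). The sender writes the payload to a data file and only then creates an empty flag file, so a receiver that waits for the flag never reads a half-written message:

```python
import os
import pickle
import time

def msg_path(comm_dir, src, dst, tag):
    """One file pair per (source, destination, tag) triple."""
    return os.path.join(comm_dir, f"msg_{src}_{dst}_{tag}")

def send(comm_dir, src, dst, tag, obj):
    """Write the payload, then create the flag; the flag's appearance
    signals that the payload is complete."""
    p = msg_path(comm_dir, src, dst, tag)
    with open(p + ".pkl", "wb") as f:
        pickle.dump(obj, f)
    open(p + ".flag", "w").close()

def recv(comm_dir, src, dst, tag, poll=0.01, timeout=5.0):
    """Blocking receive implemented as polling for the flag file."""
    p = msg_path(comm_dir, src, dst, tag)
    deadline = time.time() + timeout
    while not os.path.exists(p + ".flag"):
        if time.time() > deadline:
            raise TimeoutError("no message")
        time.sleep(poll)
    with open(p + ".pkl", "rb") as f:
        return pickle.load(f)
```

With comm_dir on a shared file system, processes on different machines can exchange messages without any networking code, which is exactly what makes the MatlabMPI approach so compact and portable.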

  14. On program restructuring, scheduling, and communication for parallel processor systems

    Energy Technology Data Exchange (ETDEWEB)

    Polychronopoulos, Constantine D.

    1986-08-01

    This dissertation discusses several software and hardware aspects of program execution on large-scale, high-performance parallel processor systems. The issues covered are program restructuring, partitioning, scheduling and interprocessor communication, synchronization, and hardware design issues of specialized units. All this work was performed focusing on a single goal: to maximize program speedup, or equivalently, to minimize parallel execution time. Parafrase, a Fortran restructuring compiler, was used to transform programs into parallel form and to conduct experiments. Two new program restructuring techniques are presented, loop coalescing and subscript blocking. Compile-time and run-time scheduling schemes are covered extensively. Depending on the program construct, these algorithms generate optimal or near-optimal schedules. For the case of arbitrarily nested hybrid loops, two optimal scheduling algorithms for dynamic and static scheduling are presented. Simulation results are given for a new dynamic scheduling algorithm. The performance of this algorithm is compared to that of self-scheduling. Techniques for program partitioning and minimization of interprocessor communication for idealized program models and for real Fortran programs are also discussed. The close relationship between scheduling, interprocessor communication, and synchronization becomes apparent at several points in this work. Finally, the impact of various types of overhead on program speedup and experimental results are presented. 69 refs., 74 figs., 14 tabs.
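Of the restructuring techniques named, loop coalescing is easy to illustrate: a perfect loop nest is collapsed into a single loop over the whole iteration space, and the original subscripts are recovered from the flat index by division and modulus, giving a scheduler one long loop to partition instead of a nest. A Python sketch of the index mapping (illustrative only, not Parafrase's implementation):

```python
def coalesced_indices(extents):
    """Yield the original loop subscripts for each flat index of a
    coalesced loop nest with the given per-loop extents (row-major
    order, innermost loop last), recovered by repeated div/mod."""
    total = 1
    for n in extents:
        total *= n
    for flat in range(total):          # the single coalesced loop
        idx, rem = [], flat
        for n in reversed(extents):    # peel off innermost extent first
            idx.append(rem % n)
            rem //= n
        yield tuple(reversed(idx))
```

Because the flat range can be cut into contiguous chunks at any point, the coalesced loop can be distributed evenly across processors even when the original nest had awkward extents.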

  15. X3D: Extensible 3D Graphics Standard

    OpenAIRE

    Daly, Leonard; Brutzman, Don

    2007-01-01

    The article of record as published may be located at http://dx.doi.org/10.1109/MSP.2007.905889 Extensible 3D (X3D) is the open standard for Web-delivered three-dimensional (3D) graphics. It specifies a declarative geometry definition language, a run-time engine, and an application program interface (API) that provide an interactive, animated, real-time environment for 3D graphics. The X3D specification documents are freely available, the standard can be used without paying any royalties,...

  16. Structural synthesis of parallel programs (Methodology and Tools)

    Directory of Open Access Journals (Sweden)

    G. Cejtlin

    1995-11-01

    Full Text Available Concepts of structured programming and propositional program logics were anticipated in the systems of algorithmic algebras (SAAs) introduced by V.M. Glushkov in 1965. This paper correlates the SAA language with the known formalisms describing logic schemes of structured and unstructured programs. Complete axiomatics is constructed for modified SAAs (SAA-M) oriented towards the formalization of parallel computations and abstract data types. An apparatus formalizing (top-down, bottom-up, or combined) design of programs is suggested that incorporates SAA-M, grammar and automaton models oriented towards multiprocessing. A method and tools of parallel program design are developed. They feature orientation towards validation, transformations and synthesis of programs.

  17. Three-dimensional simulation of an 18-month cycle for a BWR type reactor using the Nod3D program; Simulacion en 3 dimensiones de un ciclo de 18 meses para un reactor BWR usando el programa Nod3D

    Energy Technology Data Exchange (ETDEWEB)

    Hernandez, N.; Alonso, G. [ININ, A.P. 18-1027, 11801 Mexico D.F. (Mexico)]. E-mail: nhm@nuclear.inin.mx; Valle, E. del [IPN, ESFM, 07738 Mexico D.F. (Mexico)

    2004-07-01

    The development of in-house codes that allow 3D simulation of a reactor core and are easy to maintain, without the consequent payment of expensive use licenses, can be a factor that fosters technological independence. In the Department of Nuclear Engineering (DIN) of the Superior School of Physics and Mathematics (ESFM) of the National Polytechnic Institute (IPN), a program called Nod3D has been developed with which the operation of a BWR can be simulated in 3 dimensions, calculating the effective multiplication factor (keff) as well as the neutron flux distribution and the axial and radial power profiles within a medium of known characteristics, by numerically solving the neutron diffusion equations in steady state and XYZ geometry using the nodal mathematical method RTN0 (Raviart-Thomas-Nedelec of index zero). One limitation of Nod3D is that it does not treat fuel burnup independently with feedback; burnup is handled implicitly by taking the effective cross sections at each burnup step from the Core Master LEND code. Even given this limitation, the results obtained in the simulation of a typical operating cycle of a BWR-type reactor are similar to those reported by Core Master LEND. The keff results obtained with Nod3D were compared with those of Core Master LEND, showing a difference smaller than 0.2% (200 pcm); for the axial power profile, the maximum difference was 2.5%. (Author)
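The 0.2% / 200 pcm equivalence quoted above follows from the definition of pcm (per cent mille, 1 pcm = 1e-5). A one-line check using the simple Δk/k convention (reactivity comparisons sometimes use Δk/(k₁k₂) instead, which differs negligibly near criticality):

```python
def keff_difference_pcm(k_ref, k_test):
    """Relative keff discrepancy in pcm (1 pcm = 1e-5 = 0.001%)."""
    return abs(k_test - k_ref) / k_ref * 1e5
```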

  18. Programming frames for the efficient use of parallel systems

    OpenAIRE

    Thomas, Römke; Petit Silvestre, Jordi

    1997-01-01

    Frames will provide support for the programming of distributed memory machines via a library of basic algorithms, data structures and so-called programming frames (or frameworks). The latter are skeletons with problem-dependent parameters to be provided by the users. Frames focuses on re-usability and portability as well as on small and easy-to-learn interfaces. Thus, expert and non-expert users will be provided with tools to program and exploit parallel machines efficiently...

  19. Deductive Verification of Parallel Programs Using Why3

    OpenAIRE

    Santos, César; Martins, Francisco; Vasconcelos, Vasco Thudichum

    2015-01-01

    The Message Passing Interface specification (MPI) defines a portable message-passing API used to program parallel computers. MPI programs manifest a number of challenges on what concerns correctness: sent and expected values in communications may not match, resulting in incorrect computations possibly leading to crashes; and programs may deadlock resulting in wasted resources. Existing tools are not completely satisfactory: model-checking does not scale with the number of processes; testing t...

  20. JCOGIN. A parallel programming infrastructure for Monte Carlo particle transport

    International Nuclear Information System (INIS)

    The advantages of the Monte Carlo method for reactor analysis are well known, but full-core reactor analysis strains both computational time and computer memory. Meanwhile, the exponential growth of computer power over the last 10 years is creating a great opportunity for large-scale parallel computing in Monte Carlo full-core reactor analysis. In this paper, a parallel programming infrastructure for Monte Carlo particle transport, named JCOGIN, is introduced; it aims at accelerating the development of Monte Carlo codes for large-scale parallel simulations of the full core. JCOGIN currently implements hybrid parallelism, combining spatial decomposition with the traditional particle parallelism, on MPI and OpenMP. Finally, the JMCT code was developed on JCOGIN; it reaches a parallel efficiency of 70% on 20480 cores for a fixed-source problem. Using the hybrid parallelism, a full-core pin-by-pin simulation of the Dayawan reactor was performed, with up to 10 million cells and flux tallies occupying over 40 GB of memory. (author)
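The particle-parallelism half of JCOGIN's hybrid scheme is easy to sketch: particle histories are independent, so each worker simulates its share with its own random stream and the tallies are merged afterwards. A toy Python version for a purely absorbing slab (the per-worker loop could be handed to multiprocessing or MPI ranks unchanged; this is an illustration of the concept, not JCOGIN's API):

```python
import random

def simulate_histories(n, thickness, mfp, seed):
    """One worker's share of histories: count particles that cross a
    purely absorbing slab.  The analytic answer is exp(-thickness/mfp)."""
    rng = random.Random(seed)
    crossed = 0
    for _ in range(n):
        # distance to first collision is exponentially distributed
        if rng.expovariate(1.0 / mfp) > thickness:
            crossed += 1
    return crossed

def parallel_transport(n_total, thickness, mfp, n_workers=4):
    """Particle parallelism: split histories over workers with
    independent RNG streams and merge the tallies at the end."""
    share = n_total // n_workers
    tally = sum(simulate_histories(share, thickness, mfp, seed=w)
                for w in range(n_workers))
    return tally / (share * n_workers)
```

Spatial decomposition, the other half of the hybrid scheme, would additionally partition the geometry (and hence the tally memory) across ranks, which is what makes 10-million-cell, 40 GB tally problems fit at all.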

  1. Center for Programming Models for Scalable Parallel Computing

    Energy Technology Data Exchange (ETDEWEB)

    John Mellor-Crummey

    2008-02-29

    Rice University's achievements as part of the Center for Programming Models for Scalable Parallel Computing include: (1) design and implementation of cafc, the first multi-platform CAF compiler for distributed and shared-memory machines, (2) performance studies of the efficiency of programs written using the CAF and UPC programming models, (3) a novel technique to analyze explicitly-parallel SPMD programs that facilitates optimization, (4) design, implementation, and evaluation of new language features for CAF, including communication topologies, multi-version variables, and distributed multithreading to simplify development of high-performance codes in CAF, and (5) a synchronization strength reduction transformation for automatically replacing barrier-based synchronization with more efficient point-to-point synchronization. The prototype Co-array Fortran compiler cafc developed in this project is available as open source software from http://www.hipersoft.rice.edu/caf.

  2. Parallel implementation of the PHOENIX generalized stellar atmosphere program. II. Wavelength parallelization

    International Nuclear Information System (INIS)

    We describe an important addition to the parallel implementation of our generalized nonlocal thermodynamic equilibrium (NLTE) stellar atmosphere and radiative transfer computer program PHOENIX. In a previous paper in this series we described data and task parallel algorithms we have developed for radiative transfer, spectral line opacity, and NLTE opacity and rate calculations. These algorithms divided the work spatially or by spectral lines, that is, distributing the radial zones, individual spectral lines, or characteristic rays among different processors and employ, in addition, task parallelism for logically independent functions (such as atomic and molecular line opacities). For finite, monotonic velocity fields, the radiative transfer equation is an initial value problem in wavelength, and hence each wavelength point depends upon the previous one. However, for sophisticated NLTE models of both static and moving atmospheres needed to accurately describe, e.g., novae and supernovae, the number of wavelength points is very large (200,000 - 300,000) and hence parallelization over wavelength can lead both to considerable speedup in calculation time and the ability to make use of the aggregate memory available on massively parallel supercomputers. Here, we describe an implementation of a pipelined design for the wavelength parallelization of PHOENIX, where the necessary data from the processor working on a previous wavelength point is sent to the processor working on the succeeding wavelength point as soon as it is known. Our implementation uses a MIMD design based on a relatively small number of standard message passing interface (MPI) library calls and is fully portable between serial and parallel computers. copyright 1998 The American Astronomical Society
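The payoff of the pipelined design follows from the usual linear-pipeline model: with n wavelength points flowing through p processors, completion takes about (n + p − 1) steps instead of n·p, so for the 200,000 to 300,000 points quoted the fill and drain phases are negligible and the speedup approaches p. A sketch under the idealized assumptions of equal per-point work and free communication:

```python
def pipeline_time(n_items, n_stages, step=1.0):
    """Completion time of an ideal linear pipeline: (n_stages - 1)
    steps to fill, then one item completes per step."""
    return (n_items + n_stages - 1) * step

def pipeline_speedup(n_items, n_stages):
    """Speedup over processing every item through all stages serially."""
    return (n_items * n_stages) / pipeline_time(n_items, n_stages)
```

For 250,000 wavelength points on 64 processors this model predicts a speedup within a fraction of a percent of 64, which is why the wavelength dependence chain costs so little in the pipelined scheme.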

  3. SANDS Tools : an environment for deriving parallel programs

    OpenAIRE

    Buchs, Didier; Buffo, Mathieu; Flumet, Jacques; Racloz, Pascal; Urland, Erik; Jelly, Innes; Gorton, Ian

    1994-01-01

    This paper describes the techniques and the tools developed to construct CO-OPN specifications (Concurrent Object Oriented Petri Nets) and to derive parallel programs. CO-OPN is a specification language for describing the concurrent aspects and data structures of computer programs in an abstract way. The concurrent part of the formalism is described with Petri nets, while the data structures are described with algebraic abstract data types. In the CO-OPN formalism, this association is structur...

  4. Session-Based Programming for Parallel Algorithms: Expressiveness and Performance

    OpenAIRE

    Andi Bejleri; Raymond Hu; Nobuko Yoshida

    2010-01-01

    This paper investigates session programming and typing of benchmark examples to compare productivity, safety and performance with other communications programming languages. Parallel algorithms are used to examine the above aspects due to their extensive use of message passing for interaction, and their increasing prominence in algorithmic research with the rising availability of hardware resources such as multicore machines and clusters. We contribute new benchmark results for SJ, an extensi...

  5. Modelling parallel programs and multiprocessor architectures with AXE

    Science.gov (United States)

    Yan, Jerry C.; Fineman, Charles E.

    1991-01-01

    AXE, An Experimental Environment for Parallel Systems, was designed to model and simulate parallel systems at the process level. It provides an integrated environment for specifying computation models, multiprocessor architectures, data collection, and performance visualization. AXE is being used at NASA-Ames for developing resource management strategies, parallel problem formulation, multiprocessor architectures, and operating system issues related to the High Performance Computing and Communications Program. AXE's simple, structured user interface enables the user to model parallel programs and machines precisely and efficiently. Its quick turn-around time keeps the user interested and productive. AXE models multicomputers. The user may easily modify various architectural parameters including the number of sites, connection topologies, and overhead for operating system activities. Parallel computations in AXE are represented as collections of autonomous computing objects known as players; their use and behavior are described. Performance data of the multiprocessor model can be observed on a color screen, including CPU and message routing bottlenecks, and the dynamic status of the software.

  6. Heterogeneous Multicore Parallel Programming for Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    Francois Bodin

    2009-01-01

    Full Text Available Hybrid parallel multicore architectures based on graphics processing units (GPUs) can provide tremendous computing power. Current NVIDIA and AMD Graphics Product Group hardware displays a peak performance of hundreds of gigaflops. However, exploiting GPUs from existing applications is a difficult task that requires non-portable rewriting of the code. In this paper, we present HMPP, a Heterogeneous Multicore Parallel Programming workbench with compilers, developed by CAPS entreprise, that allows the integration of heterogeneous hardware accelerators in an unintrusive manner while preserving the legacy code.

  7. Advanced parallel programming models research and development opportunities.

    Energy Technology Data Exchange (ETDEWEB)

    Wen, Zhaofang.; Brightwell, Ronald Brian

    2004-07-01

    There is currently a large research and development effort within the high-performance computing community on advanced parallel programming models. This research can potentially have an impact on parallel applications, system software, and computing architectures in the next several years. Given Sandia's expertise and unique perspective in these areas, particularly on very large-scale systems, there are many areas in which Sandia can contribute to this effort. This technical report provides a survey of past and present parallel programming model research projects and provides a detailed description of the Partitioned Global Address Space (PGAS) programming model. The PGAS model may offer several improvements over the traditional distributed memory message passing model, which is the dominant model currently being used at Sandia. This technical report discusses these potential benefits and outlines specific areas where Sandia's expertise could contribute to current research activities. In particular, we describe several projects in the areas of high-performance networking, operating systems and parallel runtime systems, compilers, application development, and performance evaluation.

  8. Feedback Driven Annotation and Refactoring of Parallel Programs

    DEFF Research Database (Denmark)

    Larsen, Per

    This thesis combines programmer knowledge and feedback to improve modeling and optimization of software. The research is motivated by two observations. First, there is a great need for automatic analysis of software for embedded systems - to expose and model parallelism inherent in programs. Second...... are not effective unless programmers are told how and when they are beneficial. A prototype compilation feedback system was developed in collaboration with IBM Haifa Research Labs. It reports issues that prevent further analysis to the programmer. Performance evaluation shows that three programs performed significantly......, some program properties are beyond reach of such analysis for theoretical and practical reasons - but can be described by programmers. Three aspects are explored. The first is annotation of the source code. Two annotations are introduced. These allow more accurate modeling of parallelism...

  9. 3D Elevation Program—Virtual USA in 3D

    Science.gov (United States)

    Lukas, Vicki; Stoker, J.M.

    2016-01-01

    The U.S. Geological Survey (USGS) 3D Elevation Program (3DEP) uses a laser system called ‘lidar’ (light detection and ranging) to create a virtual reality map of the Nation that is very accurate. 3D maps have many uses with new uses being discovered all the time.  

  10. CaKernel – A Parallel Application Programming Framework for Heterogenous Computing Architectures

    Directory of Open Access Journals (Sweden)

    Marek Blazewicz

    2011-01-01

    Full Text Available With the recent advent of new heterogeneous computing architectures there is still a lack of parallel problem solving environments that can help scientists to use hybrid supercomputers easily and efficiently. Many scientific simulations that use structured grids to solve partial differential equations in fact rely on stencil computations. Stencil computations have become crucial in solving many challenging problems in various domains, e.g., engineering or physics. Although many parallel stencil computing approaches have been proposed, in most cases they solve only particular problems. As a result, scientists struggle when it comes to implementing a new stencil-based simulation, especially on high performance hybrid supercomputers. In response to this need we extend our previous work on a parallel programming framework for CUDA – CaCUDA – that now supports OpenCL. We present CaKernel – a tool that simplifies the development of parallel scientific applications on hybrid systems. CaKernel is built on the highly scalable and portable Cactus framework. In the CaKernel framework, Cactus manages the inter-process communication via MPI while CaKernel manages the code running on Graphics Processing Units (GPUs) and the interactions between them. As a non-trivial test case we have developed a 3D CFD code to demonstrate the performance and scalability of the automatically generated code.
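A stencil computation of the kind CaKernel generates can be illustrated with the 5-point Jacobi relaxation of Laplace's equation: each interior point is updated from its neighbours' previous values only, which is what makes the loop trivially data-parallel and a good fit for GPUs. A pure-Python 2D sketch of the numerics only (no CUDA/OpenCL, and much simpler than the paper's 3D CFD test case):

```python
def jacobi_step(grid):
    """One Jacobi sweep of the 5-point Laplace stencil on a 2D grid
    (list of lists); boundary values are held fixed."""
    n, m = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            # each point depends only on the *previous* iterate's
            # neighbours, so all updates are independent
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return new

def relax(grid, iters):
    """Iterate Jacobi sweeps; converges to the discrete harmonic
    function matching the fixed boundary."""
    for _ in range(iters):
        grid = jacobi_step(grid)
    return grid
```

In a distributed setting such as Cactus/CaKernel, the grid would be block-decomposed across processes, with a one-cell halo exchanged via MPI before each sweep; the per-block inner loop is exactly the kernel the framework offloads to the GPU.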

  11. Programming Massively Parallel Architectures using MARTE: a Case Study

    CERN Document Server

    Rodrigues, Wendell; Dekeyser, Jean-Luc

    2011-01-01

    Nowadays, several industrial applications are being ported to parallel architectures. These applications take advantage of the potential parallelism provided by multiple-core processors. Many-core processors, especially GPUs (Graphics Processing Units), have led the race of floating-point performance since 2003. While the performance improvement of general-purpose microprocessors has slowed significantly, GPUs have continued to improve relentlessly. As of 2009, the ratio between many-core GPUs and multicore CPUs for peak floating-point calculation throughput is about 10 times. However, as parallel programming requires a non-trivial distribution of tasks and data, developers find it hard to implement their applications effectively. Aiming to improve the use of many-core processors, this work presents a case study using UML and the MARTE profile to specify and generate OpenCL code for intensive signal processing applications. Benchmark results show us the viability of the use of MDE approaches to generate G...

  12. MAP3D: a media processor approach for high-end 3D graphics

    Science.gov (United States)

    Darsa, Lucia; Stadnicki, Steven; Basoglu, Chris

    1999-12-01

    Equator Technologies, Inc. has used a software-first approach to produce several programmable and advanced VLIW processor architectures that have the flexibility to run both traditional systems tasks and an array of media-rich applications. For example, Equator's MAP1000A is the world's fastest single-chip programmable signal and image processor targeted for digital consumer and office automation markets. The Equator MAP3D is a proposal for the architecture of the next generation of the Equator MAP family. The MAP3D is designed to achieve high-end 3D performance and a variety of customizable special effects by combining special graphics features with a high-performance floating-point and media processor architecture. As a programmable media processor, it offers the advantages of a completely configurable 3D pipeline--allowing developers to experiment with different algorithms and to tailor their pipeline to achieve the highest performance for a particular application. With the support of Equator's advanced C compiler and toolkit, MAP3D programs can be written in a high-level language. This allows the compiler to successfully find and exploit any parallelism in a programmer's code, thus decreasing the time to market of a given application. The ability to run an operating system makes it possible to run concurrent applications in the MAP3D chip, such as video decoding while executing the 3D pipelines, so that integration of applications is easily achieved--using real-time decoded imagery for texturing 3D objects, for instance. This novel architecture enables an affordable, integrated solution for high performance 3D graphics.

  13. Testing New Programming Paradigms with NAS Parallel Benchmarks

    Science.gov (United States)

    Jin, H.; Frumkin, M.; Schultz, M.; Yan, J.

    2000-01-01

    Over the past decade, high performance computing has evolved rapidly, not only in hardware architectures but also with increasing complexity of real applications. Technologies have been developing to aim at scaling up to thousands of processors on both distributed and shared memory systems. Development of parallel programs on these computers is always a challenging task. Today, writing parallel programs with message passing (e.g. MPI) is the most popular way of achieving scalability and high performance. However, writing message passing programs is difficult and error prone. Recent years new effort has been made in defining new parallel programming paradigms. The best examples are: HPF (based on data parallelism) and OpenMP (based on shared memory parallelism). Both provide simple and clear extensions to sequential programs, thus greatly simplify the tedious tasks encountered in writing message passing programs. HPF is independent of memory hierarchy, however, due to the immaturity of compiler technology its performance is still questionable. Although use of parallel compiler directives is not new, OpenMP offers a portable solution in the shared-memory domain. Another important development involves the tremendous progress in the internet and its associated technology. Although still in its infancy, Java promisses portability in a heterogeneous environment and offers possibility to "compile once and run anywhere." In light of testing these new technologies, we implemented new parallel versions of the NAS Parallel Benchmarks (NPBs) with HPF and OpenMP directives, and extended the work with Java and Java-threads. The purpose of this study is to examine the effectiveness of alternative programming paradigms. NPBs consist of five kernels and three simulated applications that mimic the computation and data movement of large scale computational fluid dynamics (CFD) applications. We started with the serial version included in NPB2.3. 
Optimization of memory and cache usage
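
    The directive-style, shared-memory loop parallelism that the record above contrasts with message passing can be illustrated in miniature. The following is a hedged sketch using Python's standard thread pool rather than OpenMP itself; the function name and the chunking scheme are illustrative, not taken from the NPB work.

    ```python
    # Sketch of an OpenMP-style "parallel for" reduction: the iteration space is
    # split into chunks, each chunk is handled by a worker thread, and the
    # partial results are combined, mirroring a worksharing loop + reduction.
    from concurrent.futures import ThreadPoolExecutor

    def parallel_sum_of_squares(values, workers=4):
        """Split `values` into chunks and reduce partial sums across threads."""
        chunk = max(1, len(values) // workers)
        chunks = [values[i:i + chunk] for i in range(0, len(values), chunk)]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            partial = pool.map(lambda c: sum(v * v for v in c), chunks)
        return sum(partial)

    print(parallel_sum_of_squares(list(range(10))))  # 285
    ```

    The appeal the abstract describes is visible even here: the sequential code structure (a loop and a sum) is preserved, whereas a message-passing version would require explicit distribution, sends, and receives.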

  14. Fast 3D coronary artery contrast-enhanced magnetic resonance angiography with magnetization transfer contrast, fat suppression and parallel imaging as applied on an anthropomorphic moving heart phantom.

    Science.gov (United States)

    Irwan, Roy; Rüssel, Iris K; Sijens, Paul E

    2006-09-01

    A magnetic resonance sequence for high-resolution imaging of coronary arteries in a very short acquisition time is presented. The technique is based on fast low-angle shot and uses fat saturation and magnetization transfer contrast prepulses to improve image contrast. GeneRalized Autocalibrating Partially Parallel Acquisitions (GRAPPA) is implemented to shorten acquisition time. The sequence was tested on a moving anthropomorphic silicone heart phantom where the coronary arteries were filled with a gadolinium contrast agent solution, and imaging was performed at varying heart rates using GRAPPA. The clinical relevance of the phantom was validated by comparing the myocardial relaxation times of the phantom's homogeneous silicone cardiac wall to those of humans. Signal-to-noise ratio and contrast-to-noise ratio were higher when parallel imaging was used, possibly benefiting from the acquisition of one partition per heartbeat. Another advantage of parallel imaging for visualizing the coronary arteries is that the entire heart can be imaged within a few breath-holds.

  15. Performance Evaluation Methodologies and Tools for Massively Parallel Programs

    Science.gov (United States)

    Yan, Jerry C.; Sarukkai, Sekhar; Tucker, Deanne (Technical Monitor)

    1994-01-01

    The need for computing power has forced a migration from serial computation on a single processor to parallel processing on multiprocessors. However, without effective means to monitor (and analyze) program execution, tuning the performance of parallel programs becomes exponentially difficult as program complexity and machine size increase. The recent introduction of performance tuning tools from various supercomputer vendors (Intel's ParAide, TMC's PRISM, CRI's Apprentice, and Convex's CXtrace) seems to indicate the maturity of performance tool technologies and vendors'/customers' recognition of their importance. However, a few important questions remain: What kind of performance bottlenecks can these tools detect (or correct)? How time-consuming is the performance tuning process? What are some important technical issues that remain to be tackled in this area? This workshop reviews the fundamental concepts involved in analyzing and improving the performance of parallel and heterogeneous message-passing programs. Several alternative strategies will be contrasted, and for each we will describe how currently available tuning tools (e.g., AIMS, ParAide, PRISM, Apprentice, CXtrace, ATExpert, Pablo, IPS-2) can be used to facilitate the process. We will characterize the effectiveness of the tools and methodologies based on actual user experiences at NASA Ames Research Center. Finally, we will discuss their limitations and outline recent approaches taken by vendors and the research community to address them.

  16. Scientific programming on massively parallel processor CP-PACS

    International Nuclear Information System (INIS)

    The massively parallel processor CP-PACS targets a variety of problems in computational physics, and its architecture was designed to handle diverse numerical processing. This report outlines the CP-PACS and gives a programming example, the Kernel CG benchmark from the NAS Parallel Benchmarks, version 1. It describes the pseudo-vector processing mechanism and the parallel-processing tuning of scientific and technical computation using the three-dimensional hyper-crossbar network, the two major features of the CP-PACS architecture. The CP-PACS uses PUs based on a RISC processor augmented with a pseudo-vector processor; pseudo-vector processing is realized as loop processing by scalar instructions. The features of the network connecting the PUs are explained. The algorithm of the NPB version 1 Kernel CG is shown. The most time-consuming part of the main loop is the matrix-vector product (matvec), and its parallelization is explained. The computation time on the CPU is determined. As performance evaluation, the execution time, the short-vector processing of the pseudo-vector processor based on a sliding window, and a comparison with other parallel computers are reported. (K.I.)
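
    The matvec parallelization at the heart of the Kernel CG benchmark can be sketched in miniature: the matrix rows are partitioned across workers, each worker computes its slice of y = A x, and the slices are concatenated. This is an illustrative Python sketch under assumed names, not the CP-PACS implementation.

    ```python
    # Row-partitioned parallel matrix-vector product, the kernel inside CG.
    # multiprocessing.dummy provides a thread-backed Pool from the stdlib.
    from multiprocessing.dummy import Pool

    def matvec_rows(args):
        """Compute the product of a block of rows with the full vector x."""
        rows, x = args
        return [sum(a * b for a, b in zip(row, x)) for row in rows]

    def parallel_matvec(A, x, nprocs=2):
        """Partition rows of A across nprocs workers and concatenate results."""
        step = max(1, len(A) // nprocs)
        blocks = [A[i:i + step] for i in range(0, len(A), step)]
        with Pool(nprocs) as pool:
            parts = pool.map(matvec_rows, [(b, x) for b in blocks])
        return [v for part in parts for v in part]

    print(parallel_matvec([[2, 0], [0, 3]], [1, 1]))  # [2, 3]
    ```

    On a real machine such as the CP-PACS, each PU would hold only its row block and the vector would be replicated or gathered across the interconnect; the row partitioning itself is the same.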

  17. Scientific programming on massively parallel processor CP-PACS

    Energy Technology Data Exchange (ETDEWEB)

    Boku, Taisuke [Tsukuba Univ., Ibaraki (Japan). Inst. of Information Sciences and Electronics

    1998-03-01

    The massively parallel processor CP-PACS targets a variety of problems in computational physics, and its architecture was designed to handle diverse numerical processing. This report outlines the CP-PACS and gives a programming example, the Kernel CG benchmark from the NAS Parallel Benchmarks, version 1. It describes the pseudo-vector processing mechanism and the parallel-processing tuning of scientific and technical computation using the three-dimensional hyper-crossbar network, the two major features of the CP-PACS architecture. The CP-PACS uses PUs based on a RISC processor augmented with a pseudo-vector processor; pseudo-vector processing is realized as loop processing by scalar instructions. The features of the network connecting the PUs are explained. The algorithm of the NPB version 1 Kernel CG is shown. The most time-consuming part of the main loop is the matrix-vector product (matvec), and its parallelization is explained. The computation time on the CPU is determined. As performance evaluation, the execution time, the short-vector processing of the pseudo-vector processor based on a sliding window, and a comparison with other parallel computers are reported. (K.I.)

  18. Organization of an interface system for the coupling of the WINS-D4 and SNAP-3D programs

    International Nuclear Information System (INIS)

    In this report, a modular system developed for the physics calculations of the CC-1 critical assembly is described. It is based on the WINS-D4 and SNAP-3D codes, which are coupled by means of an interface module and a group diffusion cross-section library.

  19. Final Report: Center for Programming Models for Scalable Parallel Computing

    Energy Technology Data Exchange (ETDEWEB)

    Mellor-Crummey, John [William Marsh Rice University

    2011-09-13

    As part of the Center for Programming Models for Scalable Parallel Computing, Rice University collaborated with project partners in the design, development and deployment of language, compiler, and runtime support for parallel programming models to support application development for the “leadership-class” computer systems at DOE national laboratories. Work over the course of this project has focused on the design, implementation, and evaluation of a second-generation version of Coarray Fortran. Research and development efforts of the project have focused on the CAF 2.0 language, compiler, runtime system, and supporting infrastructure. This has involved working with the teams that provide infrastructure for CAF that we rely on, implementing new language and runtime features, producing an open source compiler that enabled us to evaluate our ideas, and evaluating our design and implementation through the use of benchmarks. The report details the research, development, findings, and conclusions from this work.

  20. 07361 Abstracts Collection -- Programming Models for Ubiquitous Parallelism

    OpenAIRE

    Wong, David Chi-Leung; Cohen, Albert; Garzarán, María J.; Lengauer, Christian; Midkiff, Samuel P.

    2008-01-01

    From 02.09. to 07.09.2007, the Dagstuhl Seminar 07361 "Programming Models for Ubiquitous Parallelism" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section des...

  1. A parallel domain decomposition-based implicit method for the Cahn-Hilliard-Cook phase-field equation in 3D

    KAUST Repository

    Zheng, Xiang

    2015-03-01

    We present a numerical algorithm for simulating the spinodal decomposition described by the three dimensional Cahn-Hilliard-Cook (CHC) equation, which is a fourth-order stochastic partial differential equation with a noise term. The equation is discretized in space and time based on a fully implicit, cell-centered finite difference scheme, with an adaptive time-stepping strategy designed to accelerate the progress to equilibrium. At each time step, a parallel Newton-Krylov-Schwarz algorithm is used to solve the nonlinear system. We discuss various numerical and computational challenges associated with the method. The numerical scheme is validated by a comparison with an explicit scheme of high accuracy (and unreasonably high cost). We present steady state solutions of the CHC equation in two and three dimensions. The effect of the thermal fluctuation on the spinodal decomposition process is studied. We show that the existence of the thermal fluctuation accelerates the spinodal decomposition process and that the final steady morphology is sensitive to the stochastic noise. We also show the evolution of the energies and statistical moments. In terms of the parallel performance, it is found that the implicit domain decomposition approach scales well on supercomputers with a large number of processors. © 2015 Elsevier Inc.
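
    The fully implicit time stepping described above means a nonlinear system must be solved at every step. As a hedged illustration of that inner step (not the paper's Newton-Krylov-Schwarz solver, which couples a Krylov method with a Schwarz preconditioner), here is a plain Newton iteration for a backward-Euler step of the toy ODE du/dt = -u**3; all names are illustrative.

    ```python
    # One backward-Euler step u_{n+1} = u_n - dt * u_{n+1}**3, solved by Newton
    # iteration on the residual f(u) = u - u_n + dt*u**3. In the CHC scheme the
    # same implicit structure appears, but f is a large discretized PDE residual
    # and the linear solve inside Newton uses a preconditioned Krylov method.
    def implicit_step(u_n, dt, tol=1e-12, max_iter=50):
        u = u_n                              # initial Newton guess
        for _ in range(max_iter):
            f = u - u_n + dt * u**3          # backward-Euler residual
            fp = 1.0 + 3.0 * dt * u**2       # analytic Jacobian (scalar case)
            du = -f / fp                     # Newton update
            u += du
            if abs(du) < tol:
                break
        return u

    u1 = implicit_step(1.0, 0.1)
    print(abs(u1 - 1.0 + 0.1 * u1**3) < 1e-10)  # residual is driven to ~0
    ```

    The payoff of the implicit formulation, as the abstract notes, is that the time step can be chosen adaptively for accuracy and progress to equilibrium rather than being capped by a stability limit.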

  2. PLOT3D user's manual

    Science.gov (United States)

    Walatka, Pamela P.; Buning, Pieter G.; Pierce, Larry; Elson, Patricia A.

    1990-01-01

    PLOT3D is a computer graphics program designed to visualize the grids and solutions of computational fluid dynamics. Seventy-four functions are available. Versions are available for many systems. PLOT3D can handle multiple grids with a million or more grid points, and can produce a variety of model renderings, such as wireframe or flat shaded. Output from PLOT3D can be used in animation programs. The first part of this manual is a tutorial that takes the reader, keystroke by keystroke, through a PLOT3D session. The second part of the manual contains reference chapters, including the helpfile, data file formats, advice on changing PLOT3D, and sample command files.

  3. NIF Ignition Target 3D Point Design

    Energy Technology Data Exchange (ETDEWEB)

    Jones, O; Marinak, M; Milovich, J; Callahan, D

    2008-11-05

    We have developed an input file for running 3D NIF hohlraums that is optimized so that it can be run in 1-2 days on parallel computers. We have incorporated increasing levels of automation into the 3D input file: (1) configuration-controlled input files; (2) a common file for 2D and 3D and for different types of capsules (symcap, etc.); and (3) the ability to obtain target dimensions, laser pulse, and diagnostics settings automatically from the NIF Campaign Management Tool. We are using 3D Hydra calculations to investigate different problems: (1) intrinsic 3D asymmetry; (2) tolerance to nonideal 3D effects (e.g. laser power balance, pointing errors); and (3) synthetic diagnostics.

  4. 3D Animation Essentials

    CERN Document Server

    Beane, Andy

    2012-01-01

    The essential fundamentals of 3D animation for aspiring 3D artists 3D is everywhere--video games, movie and television special effects, mobile devices, etc. Many aspiring artists and animators have grown up with 3D and computers, and naturally gravitate to this field as their area of interest. Bringing a blend of studio and classroom experience to offer you thorough coverage of the 3D animation industry, this must-have book shows you what it takes to create compelling and realistic 3D imagery. Serves as the first step to understanding the language of 3D and computer graphics (CG)Covers 3D anim

  5. 3D video

    CERN Document Server

    Lucas, Laurent; Loscos, Céline

    2013-01-01

    While 3D vision has existed for many years, the use of 3D cameras and video-based modeling by the film industry has induced an explosion of interest for 3D acquisition technology, 3D content and 3D displays. As such, 3D video has become one of the new technology trends of this century.The chapters in this book cover a large spectrum of areas connected to 3D video, which are presented both theoretically and technologically, while taking into account both physiological and perceptual aspects. Stepping away from traditional 3D vision, the authors, all currently involved in these areas, provide th

  6. Session-Based Programming for Parallel Algorithms: Expressiveness and Performance

    CERN Document Server

    Bejleri, Andi; Yoshida, Nobuko; 10.4204/EPTCS.17.2

    2010-01-01

    This paper investigates session programming and typing of benchmark examples to compare productivity, safety and performance with other communications programming languages. Parallel algorithms are used to examine the above aspects due to their extensive use of message passing for interaction, and their increasing prominence in algorithmic research with the rising availability of hardware resources such as multicore machines and clusters. We contribute new benchmark results for SJ, an extension of Java for type-safe, binary session programming, against MPJ Express, a Java messaging system based on the MPI standard. In conclusion, we observe that (1) despite rich libraries and functionality, MPI remains a low-level API, and can suffer from commonly perceived disadvantages of explicit message passing such as deadlocks and unexpected message types, and (2) the benefits of high-level session abstraction, which has significant impact on program structure to improve readability and reliability, and session type-saf...

  7. MLP: A Parallel Programming Alternative to MPI for New Shared Memory Parallel Systems

    Science.gov (United States)

    Taft, James R.

    1999-01-01

    Recent developments at the NASA AMES Research Center's NAS Division have demonstrated that the new generation of NUMA based Symmetric Multi-Processing systems (SMPs), such as the Silicon Graphics Origin 2000, can successfully execute legacy vector oriented CFD production codes at sustained rates far exceeding processing rates possible on dedicated 16 CPU Cray C90 systems. This high level of performance is achieved via shared memory based Multi-Level Parallelism (MLP). This programming approach, developed at NAS and outlined below, is distinct from the message passing paradigm of MPI. It offers parallelism at both the fine and coarse grained level, with communication latencies that are approximately 50-100 times lower than typical MPI implementations on the same platform. Such latency reductions offer the promise of performance scaling to very large CPU counts. The method draws on, but is also distinct from, the newly defined OpenMP specification, which uses compiler directives to support a limited subset of multi-level parallel operations. The NAS MLP method is general, and applicable to a large class of NASA CFD codes.

  8. An informal introduction to program transformation and parallel processors

    Energy Technology Data Exchange (ETDEWEB)

    Hopkins, K.W. [Southwest Baptist Univ., Bolivar, MO (United States)

    1994-08-01

    In the summer of 1992, I had the opportunity to participate in a Faculty Research Program at Argonne National Laboratory. I worked under Dr. Jim Boyle on a project transforming code written in pure functional Lisp to Fortran code to run on distributed-memory parallel processors. To perform this project, I had to learn three things: the transformation system, the basics of distributed-memory parallel machines, and the Lisp programming language. Each of these topics in computer science was unfamiliar to me as a mathematician, but I found that they (especially parallel processing) are greatly impacting many fields of mathematics and science. Since most mathematicians have some exposure to computers, but certainly are not computer scientists, I felt it was appropriate to write a paper summarizing my introduction to these areas and how they can fit together. This paper is not meant to be a full explanation of the topics, but an informal introduction for the "mathematical layman." I place myself in that category as well, as my previous use of computers was as a classroom demonstration tool.

  9. Users manual for the Chameleon parallel programming tools

    Energy Technology Data Exchange (ETDEWEB)

    Gropp, W. [Argonne National Lab., IL (United States); Smith, B. [Univ. of California, Los Angeles, CA (United States). Dept. of Mathematics

    1993-06-01

    Message passing is a common method for writing programs for distributed-memory parallel computers. Unfortunately, the lack of a standard for message passing has hampered the construction of portable and efficient parallel programs. In an attempt to remedy this problem, a number of groups have developed their own message-passing systems, each with its own strengths and weaknesses. Chameleon is a second-generation system of this type. Rather than replacing these existing systems, Chameleon is meant to supplement them by providing a uniform way to access many of these systems. Chameleon's goals are to (a) be very lightweight (low overhead), (b) be highly portable, and (c) help standardize program startup and the use of emerging message-passing operations such as collective operations on subsets of processors. Chameleon also provides a way to port programs written using PICL or Intel NX message passing to other systems, including collections of workstations. Chameleon is tracking the Message-Passing Interface (MPI) draft standard and will provide both an MPI implementation and an MPI transport layer. Chameleon provides support for heterogeneous computing by using p4 and PVM. Chameleon's support for homogeneous computing includes the portable libraries p4, PICL, and PVM and vendor-specific implementations for Intel NX, IBM EUI (SP-1), and Thinking Machines CMMD (CM-5). Support for Ncube and PVM 3.x is also under development.

  10. Advanced Programming Platform for efficient use of Data Parallel Hardware

    CERN Document Server

    Cabellos, Luis

    2012-01-01

    Graphics processing units (GPUs) have evolved from specialized hardware capable of rendering high-quality graphics in games to commodity hardware for effectively processing blocks of data in a parallel schema. This evolution is particularly interesting for scientific groups, which traditionally used mainly CPUs as workhorses and can now profit from the arrival of GPU hardware in HPC clusters. This new GPU hardware promises a boost in peak performance, but it is not trivial to use. In this article, a programming platform designed to promote direct use of this specialized hardware is presented. The platform includes a visual editor of parallel data flows and is oriented toward execution on distributed clusters with GPUs. Examples of application to two characteristic problems, the Fast Fourier Transform and image compression, are also shown.

  11. A Parallel 3D Model for The Multi-Species Low Energy Beam Transport System of the RIA Prototype ECR Ion Source Venus

    International Nuclear Information System (INIS)

    The driver linac of the proposed Rare Isotope Accelerator (RIA) requires a great variety of high intensity, high charge state ion beams. In order to design and to optimize the low energy beamline optics of the RIA front end, we have developed a new parallel three-dimensional model to simulate the low energy, multi-species ion beam formation and transport from the ECR ion source extraction region to the focal plane of the analyzing magnet. A multisection overlapped computational domain has been used to break the original transport system into a number of subsystems. Within each subsystem, macro-particle tracking is used to obtain the charge density distribution in that subdomain. The three-dimensional Poisson equation is solved within the subdomain, and particle tracking is repeated until the solution converges. Two new Poisson solvers based on a combination of the spectral method and the multigrid method have been developed to solve the Poisson equation in cylindrical coordinates for the beam extraction region and in the Frenet-Serret coordinates for the bending magnet region. Some test examples and initial applications will also be presented
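
    The overlapped-subdomain iteration described above (solve locally, exchange interface values, repeat until the global field converges) can be sketched on a much simpler problem. The following is an illustrative Python sketch of alternating Schwarz iteration for a 1D Laplace problem u'' = 0, u(0)=0, u(1)=1, whose exact solution is u(x)=x; all grid sizes and sweep counts are assumptions for the demo, not the paper's beamline configuration.

    ```python
    # Two overlapping subdomains on a 1D grid, relaxed alternately with
    # Gauss-Seidel sweeps; interface values propagate through the overlap,
    # which is what couples the local solves into a global solution.
    def schwarz_poisson(n=21, overlap=3, outer_iters=400, local_sweeps=20):
        u = [0.0] * n
        u[-1] = 1.0                              # boundary conditions u(0)=0, u(1)=1
        mid = n // 2
        left = range(1, mid + overlap)           # interior nodes of left subdomain
        right = range(mid - overlap, n - 1)      # interior nodes of right subdomain
        for _ in range(outer_iters):
            for dom in (left, right):            # alternate between subdomains
                for _ in range(local_sweeps):    # inexact local solve
                    for i in dom:
                        u[i] = 0.5 * (u[i - 1] + u[i + 1])
        return u

    u = schwarz_poisson()
    err = max(abs(u[i] - i / 20) for i in range(21))
    print(err < 1e-6)  # converged to the exact linear solution
    ```

    In the beam transport model the "local solve" is a 3D Poisson solve with the tracked charge density as the source term, but the convergence loop has the same shape.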

  12. PARALLEL 3D FINITE ELEMENT ANALYSIS AND ITS APPLICATION TO HYDRAULIC ENGINEERING

    Institute of Scientific and Technical Information of China (English)

    刘耀儒; 周维垣; 杨强; 陈新

    2005-01-01

    To meet the demand for large-scale numerical computation of large, complex three-dimensional structures in hydraulic engineering, a finite element EBE (element-by-element) algorithm for distributed-memory parallel computers was developed on the basis of the J-PCG method (Jacobi-preconditioned conjugate gradient method). The algorithm requires no consideration of mesh topology or element ordering, does not assemble the global stiffness matrix, and avoids domain decomposition of the complex 3D structure. A PFEM (parallel finite element method) program for parallel 3D finite element solution was written using this algorithm and implemented on a networked cluster system, and then applied to the 3D finite element analysis of the Ertan arch dam-foundation system and the Shuibuya underground caverns. The numerical results show that the parallel 3D finite element EBE method is well suited to large-scale numerical computation of complex 3D structures in hydraulic engineering.
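
    The element-by-element idea in this record, never assembling the global stiffness matrix, is easy to illustrate: the product K v needed inside a conjugate gradient iteration is accumulated one element at a time from local stiffness matrices. The 1D spring elements below are a hedged toy example, not the paper's 3D formulation.

    ```python
    # Assembly-free (EBE) matrix-vector product: each element contributes its
    # local stiffness times the local part of v, scattered into the global
    # result. This is the kernel a J-PCG solver would call in place of K @ v.
    def ebe_matvec(elements, v, n):
        """elements: list of (node_i, node_j, 2x2 local stiffness) tuples."""
        y = [0.0] * n
        for (i, j, ke) in elements:
            y[i] += ke[0][0] * v[i] + ke[0][1] * v[j]
            y[j] += ke[1][0] * v[i] + ke[1][1] * v[j]
        return y

    # Two unit-stiffness spring elements in a chain of three nodes: 0--1--2.
    ke = [[1.0, -1.0], [-1.0, 1.0]]
    elems = [(0, 1, ke), (1, 2, ke)]
    print(ebe_matvec(elems, [1.0, 2.0, 4.0], 3))  # [-1.0, -1.0, 2.0]
    ```

    Because each element's contribution is independent, the element loop distributes naturally across processors, which is why, as the abstract notes, the method needs neither a global matrix nor an explicit geometric domain decomposition.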

  13. Simulation of a simple RCCS experiment with RELAP5-3D system code and computational fluid dynamics computer program

    International Nuclear Information System (INIS)

    A small-scale experimental facility was designed to study the thermal-hydraulic phenomena in the Reactor Cavity Cooling System (RCCS). The facility was scaled down from the full-scale RCCS by applying scaling laws. A set of RELAP5-3D simulations was performed to confirm the scaling calculations, and to refine and optimize the facility's configuration, instrumentation selection, and layout. Computational Fluid Dynamics (CFD) calculations using StarCCM+ were performed in order to study the flow patterns and two-phase water behavior in selected locations of the facility where complex flow structures are expected. (author)

  14. Automatic Performance Debugging of SPMD-style Parallel Programs

    CERN Document Server

    Liu, Xu; Zhan, Kunlin; Shi, Weisong; Yuan, Lin; Meng, Dan; Wang, Lei

    2011-01-01

    The single program, multiple data (SPMD) programming model is widely used for both high performance computing and Cloud computing. In this paper, we design and implement an innovative system, AutoAnalyzer, that automates the process of debugging performance problems of SPMD-style parallel programs, including data collection, performance behavior analysis, locating bottlenecks, and uncovering their root causes. AutoAnalyzer is unique in terms of two features: first, without any a priori knowledge, it automatically locates bottlenecks and uncovers their root causes for performance optimization; second, it is lightweight in terms of the size of performance data to be collected and analyzed. Our contributions are three-fold: first, we propose two effective clustering algorithms to investigate the existence of performance bottlenecks that cause process behavior dissimilarity or code region behavior disparity, respectively; meanwhile, we present two searching algorithms to locate bottlenecks; second, on a basis o...

  15. NavP: Structured and Multithreaded Distributed Parallel Programming

    Science.gov (United States)

    Pan, Lei

    2007-01-01

    We present Navigational Programming (NavP) -- a distributed parallel programming methodology based on the principles of migrating computations and multithreading. The four major steps of NavP are: (1) Distribute the data using the data communication pattern in a given algorithm; (2) Insert navigational commands for the computation to migrate and follow large-sized distributed data; (3) Cut the sequential migrating thread and construct a mobile pipeline; and (4) Loop back for refinement. NavP is significantly different from the current prevailing Message Passing (MP) approach. The advantages of NavP include: (1) NavP is structured distributed programming and it does not change the code structure of an original algorithm. This is in sharp contrast to MP, as MP implementations in general do not resemble the original sequential code; (2) NavP implementations are always competitive with the best MPI implementations in terms of performance. Approaches such as DSM or HPF, in contrast, have failed to deliver satisfactory performance to date, even though they are relatively easy to use compared to MP; (3) NavP provides incremental parallelization, which is beyond the reach of MP; and (4) NavP is a unifying approach that allows us to exploit both fine-grained (multithreading on shared memory) and coarse-grained (pipelined tasks on distributed memory) parallelism. This is in contrast to the currently popular hybrid use of MP+OpenMP, which is known to be complex to use. We present experimental results that demonstrate the effectiveness of NavP.

  16. A New Approach to Improve Cognition, Muscle Strength, and Postural Balance in Community-Dwelling Elderly with a 3-D Virtual Reality Kayak Program.

    Science.gov (United States)

    Park, Junhyuck; Yim, JongEun

    2016-01-01

    Aging is usually accompanied by deterioration of physical abilities, such as muscular strength, sensory sensitivity, and functional capacity. Recently, intervention methods with virtual reality have been introduced, providing an enjoyable therapy for the elderly. The aim of this study was to investigate whether a 3-D virtual reality kayak program could improve the cognitive function, muscle strength, and balance of community-dwelling elderly. Importantly, kayaking involves most of the upper body musculature and requires balance control. Seventy-two participants were randomly allocated into the kayak program group (n = 36) and the control group (n = 36). The two groups were well matched with respect to general characteristics at baseline. The participants in both groups performed a conventional exercise program for 30 min, and then the 3-D virtual reality kayak program was performed in the kayak program group for 20 min, two times a week for 6 weeks. Cognitive function was measured using the Montreal Cognitive Assessment. Muscle strength was measured using the arm curl and handgrip strength tests. Standing and sitting balance was measured using the Good Balance system. The post-test was performed in the same manner as the pre-test; the overall outcomes, including cognitive function, improved significantly in the kayak program group. The 3-D virtual reality kayak program is a promising intervention method for improving the cognitive function, muscle strength, and balance of the elderly.

  17. Parallelization and checkpointing of GPU applications through program transformation

    Energy Technology Data Exchange (ETDEWEB)

    Solano-Quinde, Lizandro Damian [Iowa State Univ., Ames, IA (United States)

    2012-01-01

    GPUs have emerged as a powerful tool for accelerating general-purpose applications. The availability of programming languages that make writing general-purpose applications for GPUs tractable has consolidated GPUs as an alternative for accelerating general-purpose applications. Among the areas that have benefited from GPU acceleration are: signal and image processing, computational fluid dynamics, quantum chemistry, and, in general, the High Performance Computing (HPC) industry. In order to continue to exploit higher levels of parallelism with GPUs, multi-GPU systems are gaining popularity. In this context, single-GPU applications are parallelized for running on multi-GPU systems. Furthermore, multi-GPU systems help to solve the GPU memory limitation for applications with a large application memory footprint. Parallelizing single-GPU applications has been approached with libraries that distribute the workload at runtime; however, they impose execution overhead and are not portable. On the other hand, on traditional CPU systems, parallelization has been approached through application transformation at pre-compile time, which enhances the application to distribute the workload at application level and does not have the issues of library-based approaches. Hence, a parallelization scheme for GPU systems based on application transformation is needed. Like any computing engine of today, reliability is also a concern in GPUs. GPUs are vulnerable to transient and permanent failures. Current checkpoint/restart techniques are not suitable for systems with GPUs. Checkpointing for GPU systems presents new and interesting challenges, primarily due to the natural differences imposed by the hardware design, the memory subsystem architecture, the massive number of threads, and the limited amount of synchronization among threads. Therefore, a checkpoint/restart technique suitable for GPU systems is needed. The goal of this work is to exploit higher levels of parallelism

  19. EUROPEANA AND 3D

    Directory of Open Access Journals (Sweden)

    D. Pletinckx

    2012-09-01

    Full Text Available The current 3D hype creates a lot of interest in 3D. People go to 3D movies, but are we ready to use 3D in our homes, in our offices, in our communication? Are we ready to deliver real 3D to a general public and use interactive 3D in a meaningful way to enjoy, learn, communicate? The CARARE project is realising this for the moment in the domain of monuments and archaeology, so that real 3D of archaeological sites and European monuments will be available to the general public by 2012. There are several aspects to this endeavour. First of all is the technical aspect of flawlessly delivering 3D content over all platforms and operating systems, without installing software. We currently have a working solution in PDF, but HTML5 will probably be the future. Secondly, there is still little knowledge on how to create 3D learning objects, 3D tourist information or 3D scholarly communication. We are still in a prototype phase when it comes to integrating 3D objects in physical or virtual museums. Nevertheless, Europeana has a tremendous potential as a multi-faceted virtual museum. Finally, 3D has a large potential to act as a hub of information, linking to related 2D imagery, texts, video, sound. We describe how to create such rich, explorable 3D objects that can be used intuitively by the generic Europeana user and what metadata is needed to support the semantic linking.

  20. Solid works 3D

    Energy Technology Data Exchange (ETDEWEB)

    Choi, Cheol Yeong

    2004-02-15

    This book explains modeling with SolidWorks 3D and applications of 3D CAD/CAM. The contents cover an outline of modeling (CAD, 2D and 3D), SolidWorks composition, sketching methods, fixing dimensions, selecting projections, choosing constraint conditions, sketch practice, making and reworking parts, 3D modeling and its revision, using pattern functions, modeling accessories, assembly, floor plans, 3D modeling methods, practice floor plans for industrial-engineer computer-aided manufacturing, and CAD/CAM interface processing.

  1. Solid works 3D

    International Nuclear Information System (INIS)

    This book explains modeling with SolidWorks 3D and applications of 3D CAD/CAM. The contents cover an outline of modeling (CAD, 2D and 3D), SolidWorks composition, sketching methods, fixing dimensions, selecting projections, choosing constraint conditions, sketch practice, making and reworking parts, 3D modeling and its revision, using pattern functions, modeling accessories, assembly, floor plans, 3D modeling methods, practice floor plans for industrial-engineer computer-aided manufacturing, and CAD/CAM interface processing.

  2. Session-Based Programming for Parallel Algorithms: Expressiveness and Performance

    Directory of Open Access Journals (Sweden)

    Andi Bejleri

    2010-02-01

    Full Text Available This paper investigates session programming and typing of benchmark examples to compare productivity, safety and performance with other communications programming languages. Parallel algorithms are used to examine the above aspects due to their extensive use of message passing for interaction, and their increasing prominence in algorithmic research with the rising availability of hardware resources such as multicore machines and clusters. We contribute new benchmark results for SJ, an extension of Java for type-safe, binary session programming, against MPJ Express, a Java messaging system based on the MPI standard. In conclusion, we observe that (1) despite rich libraries and functionality, MPI remains a low-level API, and can suffer from commonly perceived disadvantages of explicit message passing such as deadlocks and unexpected message types, and (2) the benefits of high-level session abstraction, which has a significant impact on program structure, improving readability and reliability, and of session type-safety can greatly facilitate the task of communications programming whilst retaining competitive performance.

  3. Development Program of LOCA Licensing Calculation Capability with RELAP5-3D in Accordance with Appendix K of 10 CFR 50.46

    International Nuclear Information System (INIS)

    In light water reactors, particularly the pressurized water reactors, the severity of loss-of-coolant accidents (LOCAs) will limit how high the reactor power can extend. Although the best-estimate LOCA methodology can provide the greatest margin on the peak cladding temperature (PCT) evaluation during LOCA, it takes many more resources to develop and to get final approval from the licensing authority. Instead, implementation of evaluation models required by Appendix K of the Code of Federal Regulations, Title 10, Part 50 (10 CFR 50), upon an advanced thermal-hydraulic platform can also gain significant margin on the PCT calculation. A program to modify RELAP5-3D in accordance with Appendix K of 10 CFR 50 was launched by the Institute of Nuclear Energy Research, Taiwan, and it consists of six sequential phases of work. The compliance of the current RELAP5-3D with Appendix K of 10 CFR 50 has been evaluated, and it was found that there are 11 areas where code modifications are required to satisfy the requirements set forth in Appendix K of 10 CFR 50. To verify and assess the development of the Appendix K version of RELAP5-3D, nine kinds of separate-effect experiments and six sets of integral-effect experiments will be adopted. Through the assessment program, all the model changes will be verified.

  4. A Programming Model Performance Study Using the NAS Parallel Benchmarks

    Directory of Open Access Journals (Sweden)

    Hongzhang Shan

    2010-01-01

    Full Text Available Harnessing the power of multicore platforms is challenging due to the additional levels of parallelism present. In this paper we use the NAS Parallel Benchmarks to study three programming models, MPI, OpenMP and PGAS, to understand their performance and memory-usage characteristics on current multicore architectures. To understand these characteristics we use the Integrated Performance Monitoring tool and other means to measure communication versus computation time, as well as the fraction of the run time spent in OpenMP. The benchmarks are run on two different Cray XT5 systems and an InfiniBand cluster. Our results show that, in general, the three programming models exhibit very similar performance characteristics. In a few cases, OpenMP is significantly faster because it explicitly avoids communication. For these particular cases, we were able to re-write the UPC versions and achieve performance equal to OpenMP. OpenMP was also the most advantageous in terms of memory usage. We also compare performance differences between the two Cray systems, which have quad-core and hex-core processors, and show that at scale the performance is almost always slower on the hex-core system because of increased contention for network resources.

  5. An empirical study of FORTRAN programs for parallelizing compilers

    Science.gov (United States)

    Shen, Zhiyu; Li, Zhiyuan; Yew, Pen-Chung

    1990-01-01

    Some results are reported from an empirical study of program characteristics that are important to writers of parallelizing compilers, especially in the areas of data dependence analysis and program transformations. The state of the art in data dependence analysis and some parallel execution techniques are examined. The major findings follow. Many subscripts contain symbolic terms with unknown values; a few methods for determining their values at compile time are evaluated. Array references with coupled subscripts appear quite frequently; these subscripts must be handled simultaneously in a dependence test, rather than separately as in current test algorithms. Nonzero coefficients of loop indexes in most subscripts are found to be simple: they are either 1 or -1. This allows an exact real-valued test to be as accurate as an exact integer-valued test for one-dimensional or two-dimensional arrays. Dependences with uncertain distance are found to be rather common, one of the main reasons being the frequent appearance of symbolic terms with unknown values.
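    The observation that most subscript coefficients are 1 or -1 bears directly on standard dependence tests of the kind the study evaluates. The following Python sketch is illustrative only (the GCD test is a classic textbook test, not a procedure taken from this study):

```python
from math import gcd

def may_depend(a, c0, b, c1):
    """GCD test for the reference pair A[a*i + c0] and A[b*j + c1].

    An integer solution of a*i - b*j = c1 - c0 exists iff gcd(a, b)
    divides c1 - c0; if it does not, the two references can never
    touch the same element, so there is no dependence. Loop bounds
    are ignored, so True means only 'dependence not disproved'.
    """
    return (c1 - c0) % gcd(a, b) == 0
```

For example, A[2*i] versus A[2*j + 1] is provably independent, since gcd(2, 2) = 2 does not divide 1.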

  6. Energy consumption model over parallel programs implemented on multicore architectures

    Directory of Open Access Journals (Sweden)

    Ricardo Isidro-Ramirez

    2015-06-01

    Full Text Available In High Performance Computing, energy consumption is becoming an important aspect to consider. Because of the high cost of energy production in all countries, there is a drive to find ways to save energy, reflected in efforts to reduce the energy requirements of hardware components and applications. Several options have appeared for scaling down energy use and, consequently, scaling up energy efficiency. One of these strategies is the multithreaded programming paradigm, whose purpose is to produce parallel programs able to use the full set of computing resources available in a microprocessor. This energy-saving strategy focuses on the efficient use of the multicore processors found in many computing devices; indeed, as a growing trend since 2003, multicore processors have appeared in various specific-purpose computers, from High Performance Computing servers to mobile devices. However, it is not clear how multithreaded programming affects energy efficiency. This paper presents an analysis of different types of multicore-based architectures used in computing and then proposes a model, based on Amdahl's Law, that considers different scenarios of energy use in multicore architectures. Interesting results were found in experiments with the developed algorithm, which was executed in both parallel and sequential versions. A lower limit of energy consumption was found for one type of multicore architecture, and this behavior was observed experimentally.
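    The model in this record builds on Amdahl's Law. The sketch below reproduces the standard speedup term together with a deliberately naive energy ratio; the `energy_vs_serial` formulation is an assumption made for illustration and is not the paper's model:

```python
def amdahl_speedup(p, n):
    """Amdahl's Law: speedup on n cores when a fraction p of the
    serial run time is parallelizable (0 <= p <= 1)."""
    return 1.0 / ((1.0 - p) + p / n)

def energy_vs_serial(p, n):
    """Naive energy ratio (parallel run / serial run), assuming all n
    cores draw full power for the entire parallel run. Values above
    1.0 mean the parallel run costs more energy even though it
    finishes sooner. Illustrative assumption only."""
    time_ratio = 1.0 / amdahl_speedup(p, n)  # parallel time / serial time
    return n * time_ratio
```

With a perfectly parallel workload (p = 1) the naive ratio is exactly 1.0; any serial fraction pushes it above 1.0, which is the kind of trade-off the paper's scenarios explore.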

  7. Center for Programming Models for Scalable Parallel Computing: Future Programming Models

    Energy Technology Data Exchange (ETDEWEB)

    Gao, Guang, R.

    2008-07-24

    The mission of the pmodel center project is to develop software technology to support scalable parallel programming models for terascale systems. The goal of the specific UD subproject is to develop an efficient and robust methodology and tools for HPC programming; more specifically, the focus is on new programming models that help programmers port their applications onto parallel high-performance computing systems. During the course of the research over the past 5 years, the landscape of microprocessor chip architecture has undergone a fundamental change: multi-core/many-core chip architectures have become mainstream technology and will have a major impact on future generations of parallel machines. The programming model for shared-address-space machines is becoming critical for such multi-core architectures. Our research highlight is an in-depth study of proposed fine-grain parallelism/multithreading support on such future-generation multi-core architectures. Our research has demonstrated the significant impact such a fine-grain multithreading model can have on the productivity of parallel programming models and their efficient implementation.

  8. Simulation of the Control System of a 3-DOF Parallel Robot Based on Simulink/SimMechanics

    Institute of Scientific and Technical Information of China (English)

    胡峰; 骆德渊; 雷霆; 柯辉

    2012-01-01

    Compared with traditional serial robots, the control system of a parallel robot is more complicated, so virtual simulation technology was applied to the study of parallel-robot control strategies. Taking a 3-DOF Delta parallel robot as an example, and in order to achieve control-system simulation conveniently and rapidly, Simulink was used as the simulation platform together with the SimMechanics Link interface software: a modeling method that translates Pro/E CAD assemblies into a SimMechanics model was presented to build the mechanical system model, and a PID controller model was then designed for simulation analysis. The results show that this approach provides an efficient simulation platform for research on the control strategies of parallel robots.

  9. A Parallel Vector Machine for the PM Programming Language

    Science.gov (United States)

    Bellerby, Tim

    2016-04-01

    PM is a new programming language which aims to make the writing of computational geoscience models on parallel hardware accessible to scientists who are not themselves expert parallel programmers. It is based around the concept of communicating operators: language constructs that enable variables local to a single invocation of a parallelised loop to be viewed as if they were arrays spanning the entire loop domain. This mechanism enables different loop invocations (which may or may not be executing on different processors) to exchange information in a manner that extends the successful Communicating Sequential Processes idiom from single messages to collective communication. Communicating operators avoid the additional synchronisation mechanisms, such as atomic variables, required when programming using the Partitioned Global Address Space (PGAS) paradigm. Using a single loop invocation as the fundamental unit of concurrency enables PM to uniformly represent different levels of parallelism from vector operations through shared memory systems to distributed grids. This paper describes an implementation of PM based on a vectorised virtual machine. On a single processor node, concurrent operations are implemented using masked vector operations. Virtual machine instructions operate on vectors of values and may be unmasked, masked using a Boolean field, or masked using an array of active vector cell locations. Conditional structures (such as if-then-else or while statement implementations) calculate and apply masks to the operations they control. A shift in mask representation from Boolean to location-list occurs when active locations become sufficiently sparse. Parallel loops unfold data structures (or vectors of data structures for nested loops) into vectors of values that may additionally be distributed over multiple computational nodes and then split into micro-threads compatible with the size of the local cache. Inter-node communication is accomplished using
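    The masked vector operations described above can be emulated in a few lines. This Python sketch is illustrative only; the function names are invented here and are not part of PM or its virtual machine:

```python
def masked_select(mask, then_vals, else_vals):
    """One lane per element: take the 'then' value where the mask is
    True and the 'else' value where it is False, the way a vectorised
    if-then-else applies the mask it has calculated to the operations
    it controls."""
    return [t if m else e for m, t, e in zip(mask, then_vals, else_vals)]

def to_location_list(mask):
    """Shift the mask representation from one Boolean per lane to a
    list of active lane indices, worthwhile once the active lanes
    become sufficiently sparse."""
    return [i for i, m in enumerate(mask) if m]
```

In the vectorised virtual machine, the Boolean form touches every lane while the location-list form touches only active lanes, which is why the representation shift pays off for sparse masks.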

  10. Parallelizing dynamic sequential programs using polyhedral process networks

    NARCIS (Netherlands)

    Nadezhkin, Dmitry

    2012-01-01

    The Polyhedral Process Network (PPN) is a suitable parallel model of computation (MoC) used to specify embedded streaming applications in a parallel form facilitating the efficient mapping onto embedded parallel execution platforms. Unfortunately, specifying an application using a parallel MoC is a

  11. 3d-3d correspondence revisited

    Science.gov (United States)

    Chung, Hee-Joong; Dimofte, Tudor; Gukov, Sergei; Sułkowski, Piotr

    2016-04-01

    In fivebrane compactifications on 3-manifolds, we point out the importance of all flat connections in the proper definition of the effective 3d {N}=2 theory. The Lagrangians of some theories with the desired properties can be constructed with the help of homological knot invariants that categorify colored Jones polynomials. Higgsing the full 3d theories constructed this way recovers theories found previously by Dimofte-Gaiotto-Gukov. We also consider the cutting and gluing of 3-manifolds along smooth boundaries and the role played by all flat connections in this operation.

  12. IZDELAVA TISKALNIKA 3D

    OpenAIRE

    Brdnik, Lovro

    2015-01-01

    This diploma thesis analyses the current state of 3D printers on the market. The development and operating principles of 3D printers are presented, together with the types of 3D printers and their advantages and drawbacks. The structure and operation of stepper motors are covered in detail, and measurements of stepper motors are performed. The software for operating 3D printers and the components required to build one are described. The thesis addresses the question of whether building a 3D printer is more economical than investing in ...

  13. MoldaNet: a network distributed molecular graphics and modelling program that integrates secure signed applet and Java 3D technologies.

    Science.gov (United States)

    Yoshida, H; Rzepa, H S; Tonge, A P

    1998-06-01

    MoldaNet is a molecular graphics and modelling program that integrates several new Java technologies, including authentication as a Secure Signed Applet, and implementation of Java 3D classes to enable access to hardware graphics acceleration. It is the first example of a novel class of Internet-based distributed computational chemistry tool designed to eliminate the need for user pre-installation of software on their client computer other than a standard Internet browser. The creation of a properly authenticated tool using a signed digital X.509 certificate permits the user to employ MoldaNet to read and write the files to a local file store; actions that are normally disallowed in Java applets. The modularity of the Java language also allows straightforward inclusion of Java3D and Chemical Markup Language classes in MoldaNet to permit the user to filter their model into 3D model descriptors such as VRML97 or CML for saving on local disk. The implications for both distance-based training environments and chemical commerce are noted.

  14. OpenMP and Parallel Programming

    Institute of Scientific and Technical Information of China (English)

    陈崚; 陈宏建; 秦玲

    2003-01-01

    The application programming interface OpenMP for shared-memory parallel computer systems and its characteristics are described, and OpenMP is compared with the parallel programming tool MPI. To overcome the large overhead of OpenMP programs, several optimization methods for OpenMP programming are presented to increase execution efficiency.

  15. Development of a computational program for fuel management in a 3-D, two energy groups nuclear reactor core

    International Nuclear Information System (INIS)

    Full text: A computational program was developed for reactor fuel management in three dimensional Cartesian coordinates using two-group neutron diffusion theory (fast neutron and thermal neutron energy group). Three fuel loading patterns were considered as follow: 1. uniform loading, 2. out-in loading and 3. in-scatter loading. Criticality, peak power distribution and loaded fuel depletion measured in megawatt-day per kilogram (MW d/kg) of uranium were also calculated by the developed program. The results showed that the in-scatter loading pattern gave the best power peaking for fuel management

  16. Parallel functional programming in Sisal: Fictions, facts, and future

    Energy Technology Data Exchange (ETDEWEB)

    McGraw, J.R.

    1993-07-01

    This paper provides a status report on the progress of research and development on the functional language Sisal. This project focuses on providing a highly effective method of writing large scientific applications that can efficiently execute on a spectrum of different multiprocessors. The paper includes sections on the language definition, compilation strategies, and programming techniques intended for readers with little or no background with Sisal. The section on performance presents our most recent results on execution speed for shared-memory multiprocessors, our findings using Sisal to develop codes, and our experiences migrating the same source code to different machines. For large programs, the execution performance of Sisal (with minimal supporting advice from the programmer) usually exceeds that of the best available automatic, vector/parallel Fortran compilers. Our evidence also indicates that Sisal programs tend to be shorter in length, faster to write, and clearer to understand than equivalent algorithms in Fortran. The paper concludes with a substantial discussion of common criticisms of the language and our plans for addressing them. Most notably, efficient implementations for distributed memory machines are lacking; an issue we plan to remedy.

  17. Performance Measurement, Visualization and Modeling of Parallel and Distributed Programs

    Science.gov (United States)

    Yan, Jerry C.; Sarukkai, Sekhar R.; Mehra, Pankaj; Lum, Henry, Jr. (Technical Monitor)

    1994-01-01

    This paper presents a methodology for debugging the performance of message-passing programs on both tightly coupled and loosely coupled distributed-memory machines. The AIMS (Automated Instrumentation and Monitoring System) toolkit, a suite of software tools for measurement and analysis of performance, is introduced and its application illustrated using several benchmark programs drawn from the field of computational fluid dynamics. AIMS includes (i) Xinstrument, a powerful source-code instrumentor, which supports both Fortran77 and C as well as a number of different message-passing libraries including Intel's NX, Thinking Machines' CMMD, and PVM; (ii) Monitor, a library of timestamping and trace-collection routines that run on supercomputers (such as Intel's iPSC/860, Delta, and Paragon, and Thinking Machines' CM5) as well as on networks of workstations (including Convex Cluster and SparcStations connected by a LAN); (iii) Visualization Kernel, a trace-animation facility that supports source-code clickback, simultaneous visualization of computation and communication patterns, as well as analysis of data movements; (iv) Statistics Kernel, an advanced profiling facility that associates a variety of performance data with various syntactic components of a parallel program; (v) Index Kernel, a diagnostic tool that helps pinpoint performance bottlenecks through the use of abstract indices; (vi) Modeling Kernel, a facility for automated modeling of message-passing programs that supports both simulation-based and analytical approaches to performance prediction and scalability analysis; (vii) Intrusion Compensator, a utility for recovering true performance from observed performance by removing the overheads of monitoring and their effects on the communication pattern of the program; and (viii) Compatibility Tools, which convert AIMS-generated traces into formats used by other performance-visualization tools, such as ParaGraph, Pablo, and certain AVS/Explorer modules.

  18. Development of an Artificial Intelligence Programming Course and Unity3d Based Framework to Motivate Learning in Artistic Minded Students

    DEFF Research Database (Denmark)

    Reng, Lars

    2012-01-01

    between technical and artistic minded students is, however, increased once the students reach the sixth semester. The complex algorithms of the artificial intelligence course seemed to demotivate the artistic minded students even before the course began. This paper will present the extensive changes made...... to the sixth semester artificial intelligence programming course, in order to provide a highly motivating direct visual feedback, and thereby remove the steep initial learning curve for artistic minded students. The framework was developed with close dialog to both the game industry and experienced master...

  19. 3-D Hybrid Kinetic Modeling of the Interaction Between the Solar Wind and Lunar-like Exospheric Pickup Ions in Case of Oblique/ Quasi-Parallel/Parallel Upstream Magnetic Field

    Science.gov (United States)

    Lipatov, A. S.; Farrell, W. M.; Cooper, J. F.; Sittler, E. C., Jr.; Hartle, R. E.

    2015-01-01

    The interactions between the solar wind and Moon-sized objects are determined by a set of solar wind parameters and the plasma environment of the space objects. The orientation of the upstream magnetic field is one of the key factors that determines the formation and structure of the bow shock wave/Mach cone or Alfven wing near the obstacle. The study of the effects of the direction of the upstream magnetic field on the lunar-like plasma environment is the main subject of our investigation in this paper. Photoionization, electron-impact ionization, and charge exchange are included in our hybrid model. The computational model includes the self-consistent dynamics of the light (hydrogen (+), helium (+)) and heavy (sodium (+)) pickup ions, and the lunar interior is treated as a weakly conducting body. Our previous 2013 lunar work, as reported in this journal, found the formation of a triple structure of the Mach cone near the Moon in the case of a perpendicular upstream magnetic field. Further advances in modeling now reveal the presence of strong wave activity in the upstream solar wind and plasma wake in the cases of quasi-parallel and parallel upstream magnetic fields, whereas little wave activity is found for the perpendicular case. The modeling does not show the formation of a Mach cone in the case of θ(B,U) ≈ 0°.

  20. FastScript3D - A Companion to Java 3D

    Science.gov (United States)

    Koenig, Patti

    2005-01-01

    FastScript3D is a computer program, written in the Java 3D(TM) programming language, that establishes an alternative language that helps users who lack expertise in Java 3D to use Java 3D for constructing three-dimensional (3D)-appearing graphics. The FastScript3D language provides a set of simple, intuitive, one-line text-string commands for creating, controlling, and animating 3D models. The first word in a string is the name of a command; the rest of the string contains the data arguments for the command. The commands can also be used as an aid to learning Java 3D. Developers can extend the language by adding custom text-string commands. The commands can define new 3D objects or load representations of 3D objects from files in formats compatible with such other software systems as X3D. The text strings can be easily integrated into other languages. FastScript3D facilitates communication between scripting languages [which enable programming of hyper-text markup language (HTML) documents to interact with users] and Java 3D. The FastScript3D language can be extended and customized on both the scripting side and the Java 3D side.
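    The one-line text-string command format described above (the first word is the command name, the rest of the string holds the data arguments) can be parsed trivially. The sketch below is illustrative Python, and the example command is hypothetical rather than a documented FastScript3D command:

```python
def parse_command(line):
    """Split a one-line text-string command: the first word is the
    command name, the remainder of the string holds the data
    arguments (here simply whitespace-separated tokens)."""
    name, _, rest = line.strip().partition(" ")
    return name, rest.split()
```

A hypothetical command such as "rotate cube 45 0 0" would parse into the name "rotate" and the argument list ["cube", "45", "0", "0"], which a dispatcher could then route to the matching 3D operation.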

  1. 3D virtuel udstilling

    DEFF Research Database (Denmark)

    Tournay, Bruno; Rüdiger, Bjarne

    2006-01-01

    A 3D digital model of the Architecture School's courtyard with a virtual exhibition of graduation projects from the summer 2006 graduating class. 10 pp.

  2. Current Development of Parallel Programming Languages

    Institute of Scientific and Technical Information of China (English)

    韩卫; 郝红宇; 代丽

    2003-01-01

    In this paper we introduce the history of parallel programming languages and list some current parallel programming languages. Then, according to a classification principle, we analyze some representative parallel programming languages in detail. Finally, we outline future directions for parallel programming languages.

  3. Processor Allocation for Optimistic Parallelization of Irregular Programs

    CERN Document Server

    Versaci, Francesco

    2012-01-01

    Optimistic parallelization is a promising approach for the parallelization of irregular algorithms: potentially interfering tasks are launched dynamically, and the runtime system detects conflicts between concurrent activities, aborting and rolling back conflicting tasks. However, parallelism in irregular algorithms is very complex. In a regular algorithm like dense matrix multiplication, the amount of parallelism can usually be expressed as a function of the problem size, so it is reasonably straightforward to determine how many processors should be allocated to execute a regular algorithm of a certain size (this is called the processor allocation problem). In contrast, parallelism in irregular algorithms can be a function of input parameters, and the amount of parallelism can vary dramatically during the execution of the irregular algorithm. Therefore, the processor allocation problem for irregular algorithms is very difficult. In this paper, we describe the first systematic strategy for addressing this pro...

  4. Martian terrain - 3D

    Science.gov (United States)

    1997-01-01

    This area of terrain near the Sagan Memorial Station was taken on Sol 3 by the Imager for Mars Pathfinder (IMP). 3D glasses are necessary to identify surface detail.The IMP is a stereo imaging system with color capability provided by 24 selectable filters -- twelve filters per 'eye.' It stands 1.8 meters above the Martian surface, and has a resolution of two millimeters at a range of two meters.Mars Pathfinder is the second in NASA's Discovery program of low-cost spacecraft with highly focused science goals. The Jet Propulsion Laboratory, Pasadena, CA, developed and manages the Mars Pathfinder mission for NASA's Office of Space Science, Washington, D.C. JPL is an operating division of the California Institute of Technology (Caltech). The Imager for Mars Pathfinder (IMP) was developed by the University of Arizona Lunar and Planetary Laboratory under contract to JPL. Peter Smith is the Principal Investigator.

  5. PLOT3D- DRAWING THREE DIMENSIONAL SURFACES

    Science.gov (United States)

    Canright, R. B.

    1994-01-01

    PLOT3D is a package of programs to draw three-dimensional surfaces of the form z = f(x,y). The function f and the boundary values for x and y are the input to PLOT3D. The surface thus defined may be drawn after arbitrary rotations. However, it is designed to draw only functions in rectangular coordinates expressed explicitly in the above form. It cannot, for example, draw a sphere. Output is to an off-line incremental plotter or an on-line microfilm recorder. This package, unlike others, will plot any function of the form z = f(x,y), portraying continuous and bounded functions of two independent variables. With curve fitting, however, it can also draw experimental data and pictures that cannot be expressed in the above form. The method used is division of the given x and y ranges into a uniform rectangular grid. The values of the supplied function at the grid points (x, y) are calculated and stored; this defines the surface. The surface is portrayed by connecting successive (y,z) points with straight-line segments for each x value on the grid and, in turn, connecting successive (x,z) points for each fixed y value on the grid. These lines are then projected by parallel projection onto the fixed yz-plane for plotting. The program has been implemented on the IBM 360/67 with an on-line CDC microfilm recorder.
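    The grid-evaluation and projection steps described above can be sketched as follows. This is an illustrative Python fragment, not PLOT3D's Fortran-era implementation, and the rotation applied before projection is omitted:

```python
def surface_grid(f, xs, ys):
    """Evaluate z = f(x, y) on a uniform rectangular grid: one row of
    z values per x grid line; the stored values define the surface."""
    return [[f(x, y) for y in ys] for x in xs]

def project_yz(x, y, z):
    """Parallel projection onto the fixed yz-plane: simply drop the x
    coordinate (any prior rotation of the surface is omitted here)."""
    return (y, z)
```

Connecting successive projected points along each grid line with straight segments then yields the wireframe rendering the record describes.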

  6. MulticoreBSP for C : A high-performance library for shared-memory parallel programming

    NARCIS (Netherlands)

    Yzelman, A. N.; Bisseling, R. H.; Roose, D.; Meerbergen, K.

    2014-01-01

    The bulk synchronous parallel (BSP) model, as well as parallel programming interfaces based on BSP, classically target distributed-memory parallel architectures. In earlier work, Yzelman and Bisseling designed a MulticoreBSP for Java library specifically for shared-memory architectures. In the prese

  7. Determining the Architecture of a Protein-DNA Complex by Combining FeBABE Cleavage Analyses, 3-D Printed Structures, and the ICM Molsoft Program.

    Science.gov (United States)

    James, Tamara; Hsieh, Meng-Lun; Knipling, Leslie; Hinton, Deborah

    2015-01-01

    Determining the structure of a protein-DNA complex can be difficult, particularly if the protein does not bind tightly to the DNA, if there are no homologous proteins from which the DNA binding can be inferred, and/or if only portions of the protein can be crystallized. If the protein comprises just a part of a large multi-subunit complex, other complications can arise such as the complex being too large for NMR studies, or it is not possible to obtain the amounts of protein and nucleic acids needed for crystallographic analyses. Here, we describe a technique we used to map the position of an activator protein relative to the DNA within a large transcription complex. We determined the position of the activator on the DNA from data generated using activator proteins that had been conjugated at specific residues with the chemical cleaving reagent, iron bromoacetamidobenzyl-EDTA (FeBABE). These analyses were combined with 3-D models of the available structures of portions of the activator protein and B-form DNA to obtain a 3-D picture of the protein relative to the DNA. Finally, the Molsoft program was used to refine the position, revealing the architecture of the protein-DNA within the transcription complex. PMID:26404142

  8. Blender 3D cookbook

    CERN Document Server

    Valenza, Enrico

    2015-01-01

    This book is aimed at professionals who already have good 3D CGI experience with commercial packages and have now decided to try the open source Blender, and who want to experiment with something more complex than the average tutorials on the web. However, it is also aimed at intermediate Blender users who simply want to go a few steps further. It is taken for granted that you already know how to move around the Blender interface, that you have 3D modeling experience, and that you are familiar with basic 3D modeling and rendering concepts, for example, edge-loops, n-gons, or samples. In any case, it'

  9. HPF: a data parallel programming interface for large-scale numerical simulations

    Energy Technology Data Exchange (ETDEWEB)

    Seo, Yoshiki; Suehiro, Kenji; Murai, Hitoshi [NEC Corp., Tokyo (Japan)]

    1998-03-01

    HPF (High Performance Fortran) is a data parallel language designed for programming on distributed memory parallel systems. The first draft of HPF 1.0 was defined in 1993 as a de facto standard language. Recently, relatively reliable HPF compilers have become available on several distributed memory parallel systems. Many projects to parallelize real-world programs have started, mainly in the U.S. and Europe, and the weak and strong points of the current HPF have been made clear. In this paper, the major data transfer patterns required to parallelize numerical simulations, such as SHIFT, matrix transposition, reduction, GATHER/SCATTER and irregular communication, and the programming methods to implement them in HPF are described. The problems in the current HPF interface for developing efficient parallel programs and recent activities to deal with them are presented as well. (author)
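
    One of the patterns listed, SHIFT, moves a distributed array one position along an axis, with each block passing a halo element to its neighbour. A minimal Python sketch of the communication pattern (the chunk representation and the `shift_right` helper are assumptions for illustration, not HPF syntax):

```python
def shift_right(chunks, fill=0):
    """Emulate HPF's SHIFT on a block-distributed array: every block
    sends its last element to its right-hand neighbour, and the first
    block receives a boundary fill value."""
    halos = [fill] + [c[-1] for c in chunks[:-1]]     # the communication step
    return [[h] + c[:-1] for h, c in zip(halos, chunks)]
```

    Flattening the shifted blocks reproduces the globally shifted array, which is why a compiler can implement SHIFT with one nearest-neighbour message per block.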

  10. Detection Of Control Flow Errors In Parallel Programs At Compile Time

    Directory of Open Access Journals (Sweden)

    Bruce P. Lester

    2010-12-01

    Full Text Available This paper describes a general technique to identify control flow errors in parallel programs, which can be automated into a compiler. The compiler builds a system of linear equations that describes the global control flow of the whole program. Solving these equations using standard techniques of linear algebra can locate a wide range of control flow bugs at compile time. This paper also describes an implementation of this control flow analysis technique in a prototype compiler for a well-known parallel programming language. In contrast to previous research in automated parallel program analysis, our technique is efficient for large programs, and does not limit the range of language features.
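
    The idea of solving flow-balance equations can be illustrated at toy scale: each unknown is an edge-execution count, each equation balances a node, and an inconsistent system flags a control flow bug. A hedged sketch using exact rational elimination (the equation encoding is mine, not the paper's):

```python
from fractions import Fraction

def consistent(eqs, n):
    """Gaussian elimination over the rationals. Each equation is
    (coefficients, rhs) over n unknown edge-execution counts; a
    contradictory system (a row reading 0 = nonzero) signals a bug."""
    rows = [[Fraction(c) for c in co] + [Fraction(b)] for co, b in eqs]
    r = 0
    for c in range(n):
        piv = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    # consistent iff no row reduced to "0 = nonzero"
    return all(any(row[:n]) or not row[n] for row in rows)
```

    For example, requiring an edge to execute once (e0 = 1) alongside both e1 = e0 and e1 = 2·e0 yields an unsatisfiable system, the kind of mismatch the compile-time analysis would report.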

  11. About the Capability of Some Parallel Program Metric Prediction Using Neural Network Approach

    OpenAIRE

    Vera, Yu; Nina, N.

    2007-01-01

    Parallel program execution on a multiprocessor system is affected by many factors. It is often rather difficult to estimate how a program would behave when running on a given number of processors. Most tools designed to evaluate performance and other characteristics of parallel programs affect the source code of the program. In this paper we consider a neural network approach to predicting job run time (when using a queue-based submission system). All analysis that is needed for the prediction can be p...

  12. Cache-aware Parallel Programming for Manycore Processors

    OpenAIRE

    Tousimojarad, Ashkan; Vanderbauwhede, Wim

    2014-01-01

    With rapidly evolving technology, multicore and manycore processors have emerged as promising architectures to benefit from increasing transistor numbers. The transition towards these parallel architectures makes today an exciting time to investigate challenges in parallel computing. The TILEPro64 is a manycore accelerator, composed of 64 tiles interconnected via multiple 8x8 mesh networks. It contains per-tile caches and supports cache-coherent shared memory by default. In this paper we pres...

  13. A parallel dynamic programming algorithm for multi-reservoir system optimization

    Science.gov (United States)

    Li, Xiang; Wei, Jiahua; Li, Tiejian; Wang, Guangqian; Yeh, William W.-G.

    2014-05-01

    This paper develops a parallel dynamic programming algorithm to optimize the joint operation of a multi-reservoir system. First, a multi-dimensional dynamic programming (DP) model is formulated for a multi-reservoir system. Second, the DP algorithm is parallelized using a peer-to-peer parallel paradigm. The parallelization is based on the distributed memory architecture and the message passing interface (MPI) protocol. We consider both the distributed computing and distributed computer memory in the parallelization. The parallel paradigm aims at reducing the computation time as well as alleviating the computer memory requirement associated with running a multi-dimensional DP model. Next, we test the parallel DP algorithm on the classic, benchmark four-reservoir problem on a high-performance computing (HPC) system with up to 350 cores. Results indicate that the parallel DP algorithm exhibits good performance in parallel efficiency; the parallel DP algorithm is scalable and will not be restricted by the number of cores. Finally, the parallel DP algorithm is applied to a real-world, five-reservoir system in China. The results demonstrate the parallel efficiency and practical utility of the proposed methodology.
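
    The structure the algorithm exploits — states within one DP stage are mutually independent — can be sketched with a thread pool standing in for MPI ranks. The `parallel_dp` function, its stage/cost encoding, and the thread-based backend are illustrative assumptions, not the authors' implementation:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_dp(stages, cost, workers=4):
    """Forward dynamic programming over a list of stages, where each
    stage is a list of discrete states. The minimisation for each state
    depends only on the previous stage, so the states within a stage
    are evaluated in parallel by a worker pool."""
    best = {s: 0 for s in stages[0]}           # boundary condition
    with ThreadPoolExecutor(workers) as pool:
        for cur in stages[1:]:
            prev = dict(best)                  # freeze last stage's values
            def value(s, prev=prev):
                return s, min(prev[p] + cost(p, s) for p in prev)
            best = dict(pool.map(value, cur))  # states solved concurrently
    return best
```

    In a real multi-reservoir model each "state" is a vector of storage levels, so distributing the per-state work (and the DP table itself) across nodes also relieves the memory pressure the abstract mentions.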

  14. Feasibility of 3D Partially Parallel Acquisition DCE MRI in Pulmonary Parenchyma Perfusion

    Institute of Scientific and Technical Information of China (English)

    夏艺; 范丽; 刘士远; 管宇; 徐雪原; 于红; 肖湘生

    2012-01-01

    Objective: To assess the feasibility of 3D partially parallel acquisition dynamic contrast-enhanced MRI (DCE-MRI) for regional pulmonary parenchyma perfusion imaging. Materials and Methods: Ten healthy volunteers and 47 patients with lung disease underwent perfusion imaging on a clinical GE 1.5 T MRI system. The homogeneity of the perfusion images was assessed; where a perfusion abnormality was present, the signal intensity ratio (RSI) between the abnormal region and normal lung tissue was calculated. Results: Pulmonary parenchyma perfusion was well depicted with DCE-MRI. The perfusion images of the 10 healthy volunteers were homogeneous, with no perfusion defects. Twelve wedge-shaped perfusion defects were visualized in 10 patients with pulmonary embolism (PE), one of whom had bilateral PE with three defects. Perfusion defects in the corresponding supplied territories were also seen in all 12 patients with lung cancer invading the adjacent pulmonary artery; the difference in RSI was statistically significant on a one-sample t-test (t = -24.74, P < 0.05). In the remaining 25 cases (20 lung cancers without pulmonary artery invasion and 5 inflammatory lesions), the lesions appeared hypointense at the peak of first-pass parenchymal enhancement. Conclusion: 3D partially parallel acquisition DCE-MRI can complete a dynamic multi-phase scan within a single breath-hold and acquire volumetric perfusion data for the whole lung; semi-quantitative analysis of MR pulmonary perfusion images clearly distinguishes regions of abnormal perfusion from normally perfused lung.

  15. [Real time 3D echocardiography]

    Science.gov (United States)

    Bauer, F.; Shiota, T.; Thomas, J. D.

    2001-01-01

    Three-dimensional representation of the heart is an old concern. Usually, 3D reconstruction of the cardiac mass is made by successive acquisition of 2D sections, the spatial localisation and orientation of which require complex guiding systems. More recently, the concept of volumetric acquisition has been introduced. A matricial emitter-receiver probe complex with parallel data processing provides instantaneous acquisition of a pyramidal 64° × 64° volume. The image is rendered in real time and is composed of 3 planes (the B and C planes) which can be displaced in all spatial directions at any time during acquisition. The flexibility of this system of acquisition allows volume and mass measurement with greater accuracy and reproducibility, limiting inter-observer variability. Free navigation of the planes of investigation allows reconstruction for qualitative and quantitative analysis of valvular heart disease and other pathologies. Although real-time 3D echocardiography is ready for clinical use, some improvements are still necessary to make it more user-friendly. Real-time 3D echocardiography could then become an essential tool for the understanding, diagnosis and management of patients.

  16. General design method for 3-dimensional, potential flow fields. Part 2: Computer program DIN3D1 for simple, unbranched ducts

    Science.gov (United States)

    Stanitz, J. D.

    1985-01-01

    The general design method for three-dimensional, potential, incompressible or subsonic-compressible flow developed in part 1 of this report is applied to the design of simple, unbranched ducts. A computer program, DIN3D1, is developed and five numerical examples are presented: a nozzle, two elbows, an S-duct, and the preliminary design of a side inlet for turbomachines. The two major inputs to the program are the upstream boundary shape and the lateral velocity distribution on the duct wall. As a result of these inputs, boundary conditions are overprescribed and the problem is ill posed. However, it appears that there are degrees of compatibility between these two major inputs and that, for reasonably compatible inputs, satisfactory solutions can be obtained. By not prescribing the shape of the upstream boundary, the problem presumably becomes well posed, but it is not clear how to formulate a practical design method under this circumstance. Nor does it appear desirable, because the designer usually needs to retain control over the upstream (or downstream) boundary shape. The problem is further complicated by the fact that, unlike the two-dimensional case, and irrespective of the upstream boundary shape, some prescribed lateral velocity distributions do not have proper solutions.

  17. Radiochromic 3D Detectors

    Science.gov (United States)

    Oldham, Mark

    2015-01-01

    Radiochromic materials exhibit a colour change when exposed to ionising radiation. Radiochromic film has been used for clinical dosimetry for many years, and increasingly so recently, as films of higher sensitivities have become available. The two principal advantages of radiochromic dosimetry are greater tissue equivalence (radiologically) and the lack of any requirement for development of the colour change. In a radiochromic material, the colour change arises directly from ionising interactions affecting dye molecules, without requiring any latent chemical, optical or thermal development, with important implications for increased accuracy and convenience. It is only relatively recently, however, that 3D radiochromic dosimetry has become possible. In this article we review recent developments and the current state of the art of 3D radiochromic dosimetry, and the potential for a more comprehensive solution for the verification of complex radiation therapy treatments, and for 3D dose measurement in general.

  18. 3D Projection Installations

    DEFF Research Database (Denmark)

    Halskov, Kim; Johansen, Stine Liv; Bach Mikkelsen, Michelle

    2014-01-01

    Three-dimensional projection installations are particular kinds of augmented spaces in which a digital 3-D model is projected onto a physical three-dimensional object, thereby fusing the digital content and the physical object. Based on interaction design research and media studies, this article contributes to the understanding of the distinctive characteristics of such a new medium, and identifies three strategies for designing 3-D projection installations: establishing space; interplay between the digital and the physical; and transformation of materiality. The principal empirical case, From Fingerplan to Loop City, is a 3-D projection installation presenting the history and future of city planning for the Copenhagen area in Denmark. The installation was presented as part of the 12th Architecture Biennale in Venice in 2010.

  19. HPC parallel programming model for gyrokinetic MHD simulation

    International Nuclear Information System (INIS)

    The 3-dimensional gyrokinetic PIC (particle-in-cell) code for MHD simulation, Gpic-MHD, was installed on SR16000 (“Plasma Simulator”), which is a scalar cluster system consisting of 8,192 logical cores. The Gpic-MHD code advances particle and field quantities in time. In order to distribute calculations over a large number of logical cores, the total simulation domain in cylindrical geometry was broken up into NDD-r × NDD-z (number of radial decompositions times number of axial decompositions) small domains, each containing approximately the same number of particles. The axial direction was uniformly decomposed, while the radial direction was non-uniformly decomposed. NRP replicas (copies) of each decomposed domain were used (“particle decomposition”). A hybrid parallelization model of multi-threads and multi-processes was employed: threads were parallelized by auto-parallelization, and the NDD-r × NDD-z × NRP processes were parallelized by MPI (message-passing interface). The parallelization performance of Gpic-MHD was investigated for a medium-size system of Nr × Nθ × Nz = 1025 × 128 × 128 mesh with 4.196 or 8.192 billion particles. The highest speed for a fixed number of logical cores was obtained with two threads, the maximum number of NDD-z, and the optimum combination of NDD-r and NRP. The observed optimum speeds demonstrated good scaling up to 8,192 logical cores. (author)
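
    The non-uniform radial decomposition described above, which balances particle counts across domains, can be sketched as a prefix-sum cut-point search. The `radial_cuts` helper and its ring-count input format are assumptions for illustration, not Gpic-MHD code:

```python
import bisect
from itertools import accumulate

def radial_cuts(particles_per_ring, ndd_r):
    """Choose NDD-r - 1 cut indices over the radial rings so that each
    radial domain holds roughly the same number of particles. A cut at
    index i puts rings 0..i into the domain to its left (the axial
    direction would simply be split uniformly)."""
    cum = list(accumulate(particles_per_ring))      # prefix particle counts
    total = cum[-1]
    return [bisect.bisect_left(cum, total * k // ndd_r)
            for k in range(1, ndd_r)]
```

    With a strongly peaked radial profile such as `[4, 1, 1, 1, 1]` and two domains, the cut falls immediately after ring 0, giving each domain four particles — the load balance that uniform splitting would miss.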

  20. RELAP5-3D User Problems

    Energy Technology Data Exchange (ETDEWEB)

    Riemke, Richard Allan

    2002-09-01

    The Reactor Excursion and Leak Analysis Program with 3D capability1 (RELAP5-3D) is a reactor system analysis code that has been developed at the Idaho National Engineering and Environmental Laboratory (INEEL) for the U. S. Department of Energy (DOE). The 3D capability in RELAP5-3D includes 3D hydrodynamics2 and 3D neutron kinetics3,4. Assessment, verification, and validation of the 3D capability in RELAP5-3D is discussed in the literature5,6,7,8,9,10. Additional assessment, verification, and validation of the 3D capability of RELAP5-3D will be presented in other papers in this users seminar. As with any software, user problems occur. User problems usually fall into the categories of input processing failure, code execution failure, restart/renodalization failure, unphysical result, and installation. This presentation will discuss some of the more generic user problems that have been reported on RELAP5-3D as well as their resolution.

  1. Herramientas SIG 3D

    Directory of Open Access Journals (Sweden)

    Francisco R. Feito Higueruela

    2010-04-01

    Full Text Available Applications of Geographical Information Systems in several fields of archaeology have been increasing in recent years. Recent advances in these technologies make it possible to work with more realistic 3D models. In this paper we introduce a new paradigm for these systems, the GIS Tetrahedron, in which we define the fundamental elements of GIS in order to provide a better understanding of their capabilities. At the same time, the basic 3D characteristics of some commercial and open source software are described, as well as their application to some examples of archaeological research.

  2. TOWARDS: 3D INTERNET

    OpenAIRE

    Ms. Swapnali R. Ghadge

    2013-01-01

    In today’s ever-shifting media landscape, it can be a complex task to find effective ways to reach your desired audience. As traditional media such as television continue to lose audience share, one venue in particular stands out for its ability to attract highly motivated audiences and for its tremendous growth potential: the 3D Internet. The concept of '3D Internet' has recently come into the spotlight in the R&D arena, catching the attention of many people, and leading to a lot o...

  3. Bootstrapping 3D fermions

    Science.gov (United States)

    Iliesiu, Luca; Kos, Filip; Poland, David; Pufu, Silviu S.; Simmons-Duffin, David; Yacoby, Ran

    2016-03-01

    We study the conformal bootstrap for a 4-point function of fermions in 3D. We first introduce an embedding formalism for 3D spinors and compute the conformal blocks appearing in fermion 4-point functions. Using these results, we find general bounds on the dimensions of operators appearing in the ψ × ψ OPE, and also on the central charge C_T. We observe features in our bounds that coincide with scaling dimensions in the Gross-Neveu models at large N. We also speculate that other features could coincide with a fermionic CFT containing no relevant scalar operators.

  4. Interaktiv 3D design

    DEFF Research Database (Denmark)

    Villaume, René Domine; Ørstrup, Finn Rude

    2002-01-01

    The project investigates the potential of interactive 3D design via the Internet. Architect Jørn Utzon's project for Espansiva was developed as a building system with the aim of enabling a multitude of plan layouts and a multitude of facade and room configurations. The system's building components have been digitized as 3D elements and made available. Via the Internet it is now possible to combine and test an endless range of the building types that the system was conceived and developed for.

  5. Tangible 3D Modelling

    DEFF Research Database (Denmark)

    Hejlesen, Aske K.; Ovesen, Nis

    2012-01-01

    This paper presents an experimental approach to teaching 3D modelling techniques in an Industrial Design programme. The approach includes the use of tangible free form models as tools for improving the overall learning. The paper is based on lecturer and student experiences obtained through facil...

  6. 3D Harmonic Echocardiography:

    NARCIS (Netherlands)

    M.M. Voormolen

    2007-01-01

    Three-dimensional (3D) echocardiography has recently developed from an experimental technique in the 1990s into an imaging modality for daily clinical practice. This dissertation describes the considerations, implementation, validation and clinical application of a unique

  7. Machine and Collection Abstractions for User-Implemented Data-Parallel Programming

    Directory of Open Access Journals (Sweden)

    Magne Haveraaen

    2000-01-01

    Full Text Available Data parallelism has appeared as a fruitful approach to the parallelisation of compute-intensive programs. Data parallelism has the advantage of mimicking the sequential (and deterministic) structure of programs, as opposed to task parallelism, where the explicit interaction of processes has to be programmed. In data parallelism, data structures, typically collection classes in the form of large arrays, are distributed on the processors of the target parallel machine. Trying to extract distribution aspects from conventional code often runs into problems with a lack of uniformity in the use of the data structures and in the expression of data dependency patterns within the code. Here we propose a framework with two conceptual classes, Machine and Collection. The Machine class abstracts hardware communication and distribution properties. This gives a programmer high-level access to the important parts of the low-level architecture. The Machine class may readily be used in the implementation of a Collection class, giving the programmer full control of the parallel distribution of data, as well as allowing normal sequential implementation of this class. Any program using such a collection class will be parallelisable, without requiring any modification, by choosing between sequential and parallel versions at link time. Experiments with a commercial application, built using the Sophus library which uses this approach to parallelisation, show good parallel speed-ups, without any adaptation of the application program being needed.
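
    A minimal rendering of the two proposed abstractions might look as follows. The class interfaces are guesses at the spirit of the framework, not the Sophus library's actual API, and the "processors" here are plain Python lists:

```python
class Machine:
    """Abstracts the hardware side: how many processors there are and
    which one owns a given element of a block-distributed array."""
    def __init__(self, nprocs):
        self.nprocs = nprocs

    def owner(self, i, n):
        # contiguous block distribution of n elements over nprocs
        return min(i * self.nprocs // n, self.nprocs - 1)


class Collection:
    """A distributed array built on a Machine. Execution is sequential
    here; a parallel version could keep the same interface and swap in
    a parallel backend at link (or import) time, as the paper suggests."""
    def __init__(self, data, machine):
        data = list(data)
        self.machine = machine
        self.parts = [[] for _ in range(machine.nprocs)]
        for i, x in enumerate(data):
            self.parts[machine.owner(i, len(data))].append(x)

    def map(self, f):
        # each "processor" applies f to its own block independently
        for part in self.parts:
            part[:] = [f(x) for x in part]
        return self

    def gather(self):
        # collect the distributed blocks back into one list
        return [x for part in self.parts for x in part]
```

    The point of the split is that application code only ever talks to `Collection`; all distribution decisions live in `Machine`, so changing the target architecture does not touch the application.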

  8. The Effect of Parallel Programming Languages on the Performance and Energy Consumption of HPC Applications

    Directory of Open Access Journals (Sweden)

    Muhammad Aqib

    2016-02-01

    Full Text Available Big and complex applications need many resources and long computation time to execute sequentially. In this scenario, all of an application's processes are handled sequentially even if they are independent of each other. In a high-performance computing environment, multiple processors are available to run applications in parallel, so mutually independent blocks of code can run in parallel. This approach not only increases the efficiency of the system without affecting the results, but also saves a significant amount of energy. Many parallel programming models and APIs, such as Open MPI, OpenMP, and CUDA, are available for running multiple instructions in parallel. In this paper, the efficiency and energy consumption of two well-known tasks, matrix multiplication and quicksort, are analyzed using different parallel programming models and a multiprocessor machine. The obtained results, which can be generalized, outline the effect of choosing a programming model on the efficiency and energy consumption when running different codes on different machines.
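
    The row-wise decomposition behind a parallel matrix multiplication, one of the two benchmark tasks, can be sketched with a thread pool. The names and the thread backend are illustrative choices; note that CPython's GIL limits the actual speed-up for pure-Python arithmetic:

```python
from concurrent.futures import ThreadPoolExecutor

def matmul_rows(A, B, workers=4):
    """Row-parallel matrix product: each pool task computes one row of
    the result, since output rows are mutually independent."""
    cols = list(zip(*B))                            # transpose B once
    def row(a):
        return [sum(x * y for x, y in zip(a, c)) for c in cols]
    with ThreadPoolExecutor(workers) as pool:
        return list(pool.map(row, A))
```

    The same decomposition carries over unchanged to OpenMP (a parallel loop over rows) or MPI (a block of rows per rank), which is what makes it a convenient benchmark for comparing models.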

  9. A 3d game in python

    OpenAIRE

    Xu, Minghui

    2014-01-01

    3D games have been widely accepted and loved by many players, and more and more kinds of 3D games are being developed to meet their needs. The most common programming language for 3D game development nowadays is C++. Python is a high-level scripting language; it is simple and clear, and its concise syntax can speed up the development cycle. This project was to develop a 3D game using only Python. The game is about how a cat lives in the street. In order to live, the player need...

  10. Buffered coscheduling for parallel programming and enhanced fault tolerance

    Science.gov (United States)

    Petrini, Fabrizio; Feng, Wu-chun

    2006-01-31

    A computer implemented method schedules processor jobs on a network of parallel machine processors or distributed system processors. Control information communications generated by each process performed by each processor during a defined time interval is accumulated in buffers, where adjacent time intervals are separated by strobe intervals for a global exchange of control information. A global exchange of the control information communications at the end of each defined time interval is performed during an intervening strobe interval so that each processor is informed by all of the other processors of the number of incoming jobs to be received by each processor in a subsequent time interval. The buffered coscheduling method of this invention also enhances the fault tolerance of a network of parallel machine processors or distributed system processors
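
    The buffering-plus-strobe cycle described in the abstract can be sketched as follows; the class and method names are invented for illustration, not taken from the patent:

```python
from collections import defaultdict

class BufferedCoscheduler:
    """Sketch of buffered coscheduling: sends made during a time
    interval are only buffered, then exchanged in one global strobe,
    so every processor learns how many incoming jobs to expect in the
    next interval."""
    def __init__(self, nprocs):
        self.nprocs = nprocs
        self.pending = defaultdict(list)      # dest -> buffered messages

    def send(self, dest, msg):
        self.pending[dest].append(msg)        # deferred until the strobe

    def strobe(self):
        # global exchange at the end of the interval: deliver everything
        # and report the per-processor incoming-job counts
        delivered = {p: self.pending.get(p, []) for p in range(self.nprocs)}
        counts = {p: len(msgs) for p, msgs in delivered.items()}
        self.pending = defaultdict(list)      # start the next interval
        return delivered, counts
```

    Because every processor sees the same counts at the strobe, a missing or surplus message is detected within one interval, which is the hook for the fault-tolerance claim.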

  11. 3-D Technology Approaches for Biological Ecologies

    Science.gov (United States)

    Liu, Liyu; Austin, Robert; U. S-China Physical-Oncology Sciences Alliance (PS-OA) Team

    Constructing three-dimensional (3-D) landscapes is an unavoidable issue in the deep study of biological ecologies, because at every scale in nature, ecosystems are composed of complex 3-D environments and biological behaviors. If a 3-D technology could allow complex ecosystems to be built easily, mimicking the in vivo microenvironment realistically with flexible environmental controls, it would be a powerful thrust to assist researchers in their explorations. For years, we have been utilizing and developing different technologies for constructing 3-D micro landscapes for in vitro biophysics studies. Here, I will review our past efforts, including probing cancer cell invasiveness with 3-D silicon-based Tepuis, constructing 3-D microenvironments for cell invasion and metastasis through polydimethylsiloxane (PDMS) soft lithography, as well as explorations of optimized stenting positions for coronary bifurcation disease with 3-D wax printing and the latest home-designed 3-D bio-printer. Although 3-D technologies are currently considered not mature enough for arbitrary 3-D micro-ecological models with easy design and fabrication, I hope that through my talk the audience will be able to sense their significance and the predictable breakthroughs in the near future. This work was supported by the State Key Development Program for Basic Research of China (Grant No. 2013CB837200), the National Natural Science Foundation of China (Grant No. 11474345) and the Beijing Natural Science Foundation (Grant No. 7154221).

  12. Parallel graph reduction for divide-and-conquer applications - Part I: program transformations

    NARCIS (Netherlands)

    Vree, Willem G.; Hartel, Pieter H.

    1988-01-01

    A proposal is made to base parallel evaluation of functional programs on graph reduction combined with a form of string reduction that avoids duplication of work. Pure graph reduction poses some rather difficult problems to implement on a parallel reduction machine, but with certain restrictions, pa

  13. F-Nets and Software Cabling: Deriving a Formal Model and Language for Portable Parallel Programming

    Science.gov (United States)

    DiNucci, David C.; Saini, Subhash (Technical Monitor)

    1998-01-01

    Parallel programming is still being based upon antiquated sequence-based definitions of the terms "algorithm" and "computation", resulting in programs which are architecture dependent and difficult to design and analyze. By focusing on obstacles inherent in existing practice, a more portable model is derived here, which is then formalized into a model called Soviets which utilizes a combination of imperative and functional styles. This formalization suggests more general notions of algorithm and computation, as well as insights into the meaning of structured programming in a parallel setting. To illustrate how these principles can be applied, a very-high-level graphical architecture-independent parallel language, called Software Cabling, is described, with many of the features normally expected from today's computer languages (e.g. data abstraction, data parallelism, and object-based programming constructs).

  14. On the Performance of the Python Programming Language for Serial and Parallel Scientific Computations

    OpenAIRE

    Xing Cai; Hans Petter Langtangen; Halvard Moe

    2005-01-01

    This article addresses the performance of scientific applications that use the Python programming language. First, we investigate several techniques for improving the computational efficiency of serial Python codes. Then, we discuss the basic programming techniques in Python for parallelizing serial scientific applications. It is shown that an efficient implementation of the array-related operations is essential for achieving good parallel performance, as for the serial case. Once the array-r...

  15. Performance Evaluation of Parallel Message Passing and Thread Programming Model on Multicore Architectures

    OpenAIRE

    D.T Hasta; Mutiara, A. B.

    2010-01-01

    The current trend of multicore architectures on shared memory systems underscores the need for parallelism. While there are several programming models for expressing parallelism, thread programming models such as OpenMP and POSIX threads have become the standard for supporting these systems. MPI (Message Passing Interface), which remains the dominant model used in high-performance computing today, faces this challenge. The previous version of MPI, MPI-1, has no shared memory concept, and the current MPI version ...

  16. Parallel Programming Model for the Epiphany Many-Core Coprocessor Using Threaded MPI

    OpenAIRE

    Ross, James A.; Richie, David A.; Park, Song J.; Shires, Dale R.

    2015-01-01

    The Adapteva Epiphany many-core architecture comprises a 2D tiled mesh Network-on-Chip (NoC) of low-power RISC cores with minimal uncore functionality. It offers high computational energy efficiency for both integer and floating point calculations as well as parallel scalability. Yet despite the interesting architectural features, a compelling programming model has not been presented to date. This paper demonstrates an efficient parallel programming model for the Epiphany architecture based o...

  17. Particle Acceleration in 3D Magnetic Reconnection

    Science.gov (United States)

    Dahlin, J.; Drake, J. F.; Swisdak, M.

    2015-12-01

    Magnetic reconnection is an important driver of energetic particles in phenomena such as magnetospheric storms and solar flares. Using kinetic particle-in-cell (PIC) simulations, we show that the stochastic magnetic field structure which develops during 3D reconnection plays a vital role in particle acceleration and transport. In a 2D system, electrons are trapped in magnetic islands which limits their energy gain. In a 3D system, however, the stochastic magnetic field enables the energetic electrons to access volume-filling acceleration regions and therefore gain energy much more efficiently than in the 2D system. We also examine the relative roles of two important acceleration drivers: parallel electric fields and a Fermi mechanism associated with reflection of charged particles from contracting field lines. We find that parallel electric fields are most important for accelerating low energy particles, whereas Fermi reflection dominates energetic particle production. We also find that proton energization is reduced in the 3D system.

  18. User's guide of parallel program development environment (PPDE). The 2nd edition

    Energy Technology Data Exchange (ETDEWEB)

    Ueno, Hirokazu; Takemiya, Hiroshi; Imamura, Toshiyuki; Koide, Hiroshi; Matsuda, Katsuyuki; Higuchi, Kenji; Hirayama, Toshio [Center for Promotion of Computational Science and Engineering, Japan Atomic Energy Research Institute, Tokyo (Japan)]; Ohta, Hirofumi [Hitachi Ltd., Tokyo (Japan)]

    2000-03-01

    The STA basic system has been enhanced to accelerate support for parallel programming on heterogeneous parallel computers, through a series of R&D efforts on parallel processing technology. The enhancement has been made by extending the functions of the PPDE, the Parallel Program Development Environment in the STA basic system. The extended PPDE provides: 1) automatic creation of a 'makefile' and a shell script for its execution; 2) multi-tool execution, which lets tools on heterogeneous computers carry out a task with a single operation; and 3) mirror composition, which reflects the results of editing a file on one computer into all related files on other computers. These additional functions will enhance work efficiency for program development across computers. More functions have been added to the PPDE to support parallel program development. New functions were also designed to complement an HPF translator and a parallelizing support tool working together, so that a sequential program is efficiently converted to a parallel program. This report describes the use of the extended PPDE. (author)

  19. A Review on New Paradigm’s of Parallel Programming Models in High Performance Computing

    Directory of Open Access Journals (Sweden)

    Mr.Amitkumar S Manekar

    2012-08-01

    Full Text Available High Performance Computing (HPC) is the use of multiple computer resources to solve large critical problems. Multiprocessor and multicore systems are two broad classes of parallel computers which support parallelism. Clustered Symmetric Multiprocessors (SMP) are the most fruitful way out for large-scale applications. Enhancing the performance of computer applications is the main role of parallel processing. Single-processor performance on high-end systems often enjoys a noteworthy outlay advantage when implemented in parallel on systems utilizing multiple, lower-cost, commodity microprocessors. Parallel computers are going mainstream because clusters of SMP (Symmetric Multiprocessor) nodes provide support for an ample collection of parallel programming paradigms. MPI and OpenMP are the trendy flavors of parallel programming. In this paper we review the parallel paradigms available on multiprocessor and multicore systems.

  20. Discrete Method of Images for 3D Radio Propagation Modeling

    Science.gov (United States)

    Novak, Roman

    2016-09-01

    Discretization by rasterization is introduced into the method of images (MI) in the context of 3D deterministic radio propagation modeling as a way to exploit spatial coherence of electromagnetic propagation for fine-grained parallelism. Traditional algebraic treatment of bounding regions and surfaces is replaced by computer graphics rendering of 3D reflections and double refractions while building the image tree. The visibility of reception points and surfaces is also resolved by shader programs. The proposed rasterization is shown to be of comparable run time to that of the fundamentally parallel shooting and bouncing rays. The rasterization does not affect the signal evaluation backtracking step, thus preserving its advantage over the brute force ray-tracing methods in terms of accuracy. Moreover, the rendering resolution may be scaled back for a given level of scenario detail with only marginal impact on the image tree size. This allows selection of scene optimized execution parameters for faster execution, giving the method a competitive edge. The proposed variant of MI can be run on any GPU that supports real-time 3D graphics.

  1. Massive 3D Supergravity

    CERN Document Server

    Andringa, Roel; de Roo, Mees; Hohm, Olaf; Sezgin, Ergin; Townsend, Paul K

    2009-01-01

    We construct the N=1 three-dimensional supergravity theory with cosmological, Einstein-Hilbert, Lorentz Chern-Simons, and general curvature squared terms. We determine the general supersymmetric configuration, and find a family of supersymmetric adS vacua with the supersymmetric Minkowski vacuum as a limiting case. Linearizing about the Minkowski vacuum, we find three classes of unitary theories; one is the supersymmetric extension of the recently discovered `massive 3D gravity'. Another is a `new topologically massive supergravity' (with no Einstein-Hilbert term) that propagates a single (2,3/2) helicity supermultiplet.

  2. Massive 3D supergravity

    Energy Technology Data Exchange (ETDEWEB)

    Andringa, Roel; Bergshoeff, Eric A; De Roo, Mees; Hohm, Olaf [Centre for Theoretical Physics, University of Groningen, Nijenborgh 4, 9747 AG Groningen (Netherlands); Sezgin, Ergin [George and Cynthia Woods Mitchell Institute for Fundamental Physics and Astronomy, Texas A and M University, College Station, TX 77843 (United States); Townsend, Paul K, E-mail: E.A.Bergshoeff@rug.n, E-mail: O.Hohm@rug.n, E-mail: sezgin@tamu.ed, E-mail: P.K.Townsend@damtp.cam.ac.u [Department of Applied Mathematics and Theoretical Physics, Centre for Mathematical Sciences, University of Cambridge, Wilberforce Road, Cambridge, CB3 0WA (United Kingdom)

    2010-01-21

    We construct the N=1 three-dimensional supergravity theory with cosmological, Einstein-Hilbert, Lorentz Chern-Simons, and general curvature squared terms. We determine the general supersymmetric configuration, and find a family of supersymmetric adS vacua with the supersymmetric Minkowski vacuum as a limiting case. Linearizing about the Minkowski vacuum, we find three classes of unitary theories; one is the supersymmetric extension of the recently discovered 'massive 3D gravity'. Another is a 'new topologically massive supergravity' (with no Einstein-Hilbert term) that propagates a single (2,3/2) helicity supermultiplet.

  3. 3D Digital Modelling

    DEFF Research Database (Denmark)

    Hundebøl, Jesper

    ABSTRACT: Lack of productivity in construction is a well known issue. Despite the fact that causes hereof are multiple, the introduction of information technology is a frequently observed response to almost any challenge. ICT in construction is a thoroughly researched matter, however, the current...... important to appreciate the analysis. Before turning to the presentation of preliminary findings and a discussion of 3D digital modelling, it begins, however, with an outline of industry specific ICT strategic issues. Paper type. Multi-site field study...

  4. Generalized recovery algorithm for 3D super-resolution microscopy using rotating point spread functions

    Science.gov (United States)

    Shuang, Bo; Wang, Wenxiao; Shen, Hao; Tauzin, Lawrence J.; Flatebo, Charlotte; Chen, Jianbo; Moringo, Nicholas A.; Bishop, Logan D. C.; Kelly, Kevin F.; Landes, Christy F.

    2016-01-01

    Super-resolution microscopy with phase masks is a promising technique for 3D imaging and tracking. Due to the complexity of the resultant point spread functions, generalized recovery algorithms are still missing. We introduce a 3D super-resolution recovery algorithm that works for a variety of phase masks generating 3D point spread functions. A fast deconvolution process generates initial guesses, which are further refined by least squares fitting. Overfitting is suppressed using a machine learning determined threshold. Preliminary results on experimental data show that our algorithm can be used to super-localize 3D adsorption events within a porous polymer film and is useful for evaluating potential phase masks. Finally, we demonstrate that parallel computation on graphics processing units can reduce the processing time required for 3D recovery. Simulations reveal that, through desktop parallelization, the ultimate limit of real-time processing is possible. Our program is the first open source recovery program for generalized 3D recovery using rotating point spread functions. PMID:27488312
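
    The recovery pipeline this abstract outlines, a fast initial guess that is then refined by fitting, can be illustrated with a deliberately simplified 2D sketch. The Gaussian PSF, the brightest-pixel guess, and the center-of-mass refinement below are stand-ins chosen for brevity, not the authors' rotating-PSF deconvolution and least-squares algorithm.

```python
import numpy as np

def simulate_psf(shape, x0, y0, sigma=1.5, amp=100.0):
    """Render a 2D Gaussian point spread function at sub-pixel position (x0, y0)."""
    y, x = np.indices(shape)
    return amp * np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))

def localize(image, window=3):
    """Coarse guess from the brightest pixel, refined by a local center of mass."""
    cy, cx = np.unravel_index(np.argmax(image), image.shape)   # initial guess
    patch = image[cy - window:cy + window + 1,
                  cx - window:cx + window + 1]
    yy, xx = np.indices(patch.shape)
    m = patch.sum()
    return (cx - window + (patch * xx).sum() / m,              # refined x
            cy - window + (patch * yy).sum() / m)              # refined y

img = simulate_psf((32, 32), x0=17.3, y0=12.6)
x, y = localize(img)   # sub-pixel estimate of the emitter position
```

    The structure mirrors the paper's approach: a cheap global search provides a starting point, and a local refinement recovers sub-pixel coordinates.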

  5. Generalized recovery algorithm for 3D super-resolution microscopy using rotating point spread functions.

    Science.gov (United States)

    Shuang, Bo; Wang, Wenxiao; Shen, Hao; Tauzin, Lawrence J; Flatebo, Charlotte; Chen, Jianbo; Moringo, Nicholas A; Bishop, Logan D C; Kelly, Kevin F; Landes, Christy F

    2016-01-01

    Super-resolution microscopy with phase masks is a promising technique for 3D imaging and tracking. Due to the complexity of the resultant point spread functions, generalized recovery algorithms are still missing. We introduce a 3D super-resolution recovery algorithm that works for a variety of phase masks generating 3D point spread functions. A fast deconvolution process generates initial guesses, which are further refined by least squares fitting. Overfitting is suppressed using a machine learning determined threshold. Preliminary results on experimental data show that our algorithm can be used to super-localize 3D adsorption events within a porous polymer film and is useful for evaluating potential phase masks. Finally, we demonstrate that parallel computation on graphics processing units can reduce the processing time required for 3D recovery. Simulations reveal that, through desktop parallelization, the ultimate limit of real-time processing is possible. Our program is the first open source recovery program for generalized 3D recovery using rotating point spread functions. PMID:27488312

  6. Generalized recovery algorithm for 3D super-resolution microscopy using rotating point spread functions

    Science.gov (United States)

    Shuang, Bo; Wang, Wenxiao; Shen, Hao; Tauzin, Lawrence J.; Flatebo, Charlotte; Chen, Jianbo; Moringo, Nicholas A.; Bishop, Logan D. C.; Kelly, Kevin F.; Landes, Christy F.

    2016-08-01

    Super-resolution microscopy with phase masks is a promising technique for 3D imaging and tracking. Due to the complexity of the resultant point spread functions, generalized recovery algorithms are still missing. We introduce a 3D super-resolution recovery algorithm that works for a variety of phase masks generating 3D point spread functions. A fast deconvolution process generates initial guesses, which are further refined by least squares fitting. Overfitting is suppressed using a machine learning determined threshold. Preliminary results on experimental data show that our algorithm can be used to super-localize 3D adsorption events within a porous polymer film and is useful for evaluating potential phase masks. Finally, we demonstrate that parallel computation on graphics processing units can reduce the processing time required for 3D recovery. Simulations reveal that, through desktop parallelization, the ultimate limit of real-time processing is possible. Our program is the first open source recovery program for generalized 3D recovery using rotating point spread functions.

  7. Managing Algorithmic Skeleton Nesting Requirements in Realistic Image Processing Applications: The Case of the SKiPPER-II Parallel Programming Environment's Operating Model

    Science.gov (United States)

    Coudarcher, Rémi; Duculty, Florent; Serot, Jocelyn; Jurie, Frédéric; Derutin, Jean-Pierre; Dhome, Michel

    2005-12-01

    SKiPPER is a SKeleton-based Parallel Programming EnviRonment developed since 1996 at the LASMEA Laboratory of Blaise Pascal University, France. The main goal of the project was to demonstrate the applicability of skeleton-based parallel programming techniques to the fast prototyping of reactive vision applications. This paper deals with the special features embedded in the latest version of the project: algorithmic skeleton nesting capabilities and a fully dynamic operating model. Through the case study of a complete and realistic image processing application, in which we have pointed out the requirement for skeleton nesting, we present the operating model of this feature. The work described here is one of the few reported experiments showing the application of skeleton nesting facilities to the parallelisation of a realistic application, especially in the area of image processing. The image processing application we have chosen is a 3D face-tracking algorithm from appearance.

  8. Concurrent extensions to the FORTRAN language for parallel programming of computational fluid dynamics algorithms

    Science.gov (United States)

    Weeks, Cindy Lou

    1986-01-01

    Experiments were conducted at NASA Ames Research Center to define multi-tasking software requirements for multiple-instruction, multiple-data stream (MIMD) computer architectures. The focus was on specifying solutions for algorithms in the field of computational fluid dynamics (CFD). The program objectives were to allow researchers to produce usable parallel application software as soon as possible after acquiring MIMD computer equipment, to provide researchers with an easy-to-learn and easy-to-use parallel software language which could be implemented on several different MIMD machines, and to enable researchers to list preferred design specifications for future MIMD computer architectures. Analysis of CFD algorithms indicated that extensions of an existing programming language, adaptable to new computer architectures, provided the best solution to meeting program objectives. The CoFORTRAN Language was written in response to these objectives and to provide researchers a means to experiment with parallel software solutions to CFD algorithms on machines with parallel architectures.

  9. On the Performance of the Python Programming Language for Serial and Parallel Scientific Computations

    Directory of Open Access Journals (Sweden)

    Xing Cai

    2005-01-01

    Full Text Available This article addresses the performance of scientific applications that use the Python programming language. First, we investigate several techniques for improving the computational efficiency of serial Python codes. Then, we discuss the basic programming techniques in Python for parallelizing serial scientific applications. It is shown that an efficient implementation of the array-related operations is essential for achieving good parallel performance, as for the serial case. Once the array-related operations are efficiently implemented, probably using a mixed-language implementation, good serial and parallel performance become achievable. This is confirmed by a set of numerical experiments. Python is also shown to be well suited for writing high-level parallel programs.
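
    The article's central point, that array-related operations must be moved out of the interpreter loop to achieve good serial (and hence parallel) performance, can be demonstrated with a minimal NumPy comparison; the function names are illustrative:

```python
import time
import numpy as np

def serial_update(a):
    """Naive element-wise loop: every access crosses the interpreter boundary."""
    out = np.empty_like(a)
    for i in range(a.size):
        out[i] = 2.0 * a[i] + 1.0
    return out

def vectorized_update(a):
    """The same operation as a whole-array expression, executed in compiled code."""
    return 2.0 * a + 1.0

a = np.random.rand(100_000)
t0 = time.perf_counter(); slow = serial_update(a)
t1 = time.perf_counter(); fast = vectorized_update(a)
t2 = time.perf_counter()
speedup = (t1 - t0) / max(t2 - t1, 1e-9)   # typically orders of magnitude on CPython
```

    The same principle carries over to the parallel case in the article: once array operations are expressed at this level, they can be distributed efficiently.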

  10. Comparative Study of Message Passing and Shared Memory Parallel Programming Models in Neural Network Training

    Energy Technology Data Exchange (ETDEWEB)

    Vitela, J.; Gordillo, J.; Cortina, L; Hanebutte, U.

    1999-12-14

    It is presented a comparative performance study of a coarse grained parallel neural network training code, implemented in both OpenMP and MPI, standards for shared memory and message passing parallel programming environments, respectively. In addition, these versions of the parallel training code are compared to an implementation utilizing SHMEM the native SGI/CRAY environment for shared memory programming. The multiprocessor platform used is a SGI/Cray Origin 2000 with up to 32 processors. It is shown that in this study, the native CRAY environment outperforms MPI for the entire range of processors used, while OpenMP shows better performance than the other two environments when using more than 19 processors. In this study, the efficiency is always greater than 60% regardless of the parallel programming environment used as well as of the number of processors.

  11. TOWARDS: 3D INTERNET

    Directory of Open Access Journals (Sweden)

    Ms. Swapnali R. Ghadge

    2013-08-01

    Full Text Available In today’s ever-shifting media landscape, it can be a complex task to find effective ways to reach your desired audience. As traditional media such as television continue to lose audience share, one venue in particular stands out for its ability to attract highly motivated audiences and for its tremendous growth potential the 3D Internet. The concept of '3D Internet' has recently come into the spotlight in the R&D arena, catching the attention of many people, and leading to a lot of discussions. Basically, one can look into this matter from a few different perspectives: visualization and representation of information, and creation and transportation of information, among others. All of them still constitute research challenges, as no products or services are yet available or foreseen for the near future. Nevertheless, one can try to envisage the directions that can be taken towards achieving this goal. People who take part in virtual worlds stay online longer with a heightened level of interest. To take advantage of that interest, diverse businesses and organizations have claimed an early stake in this fast-growing market. They include technology leaders such as IBM, Microsoft, and Cisco, companies such as BMW, Toyota, Circuit City, Coca Cola, and Calvin Klein, and scores of universities, including Harvard, Stanford and Penn State.

  12. Performance and productivity of parallel python programming: a study with a CFD test case

    OpenAIRE

    Basermann, Achim; Röhrig-Zöllner, Melven; Illmer, Joachim

    2015-01-01

    The programming language Python is widely used to create rapidly compact software. However, compared to low-level programming languages like C or Fortran low performance is preventing its use for HPC applications. Efficient parallel programming of multi-core systems and graphic cards is generally a complex task. Python with add-ons might provide a simple approach to program those systems. This paper evaluates the performance of Python implementations with different libraries and compares it t...

  13. Process-Oriented Parallel Programming with an Application to Data-Intensive Computing

    OpenAIRE

    Givelberg, Edward

    2014-01-01

    We introduce process-oriented programming as a natural extension of object-oriented programming for parallel computing. It is based on the observation that every class of an object-oriented language can be instantiated as a process, accessible via a remote pointer. The introduction of process pointers requires no syntax extension, identifies processes with programming objects, and enables processes to exchange information simply by executing remote methods. Process-oriented programming is a h...

  14. “3D Turtle Graphics” by using a 3D Printer

    Directory of Open Access Journals (Sweden)

    Yasusi Kanada

    2015-04-01

    Full Text Available When creating shapes by using a 3D printer, usually a static (declarative) model designed by using a 3D CAD system is translated to a CAM program and sent to the printer. However, widely-used FDM-type 3D printers accept a dynamic (procedural) program that describes control of the motion of the print head and the extrusion of the filament. If this program is expressed by using a programming language or a library in a straightforward manner, solids can be created by a method similar to turtle graphics. An open-source library that enables this “turtle 3D printing” method was written in Python and tested. This method currently has the limitation that it cannot print in the air; however, if this limitation is overcome by an appropriate technique, shapes drawn freely by 3D turtle graphics can be embodied by this method.
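
    A minimal sketch of the “turtle 3D printing” idea, assuming a generic FDM dialect of G-code (G1 moves with an E extrusion axis): the turtle walks in the XY plane, emitting one move command per step, and climbs in Z between layers. The class and its parameters are hypothetical, not the open-source library the abstract describes.

```python
import math

class PrintingTurtle:
    """Minimal turtle that emits G-code moves instead of drawing on screen."""
    def __init__(self, feed_per_mm=0.05):
        self.x = self.y = self.z = 0.0
        self.heading = 0.0              # degrees, in the XY plane
        self.extruded = 0.0             # cumulative filament, in mm
        self.feed_per_mm = feed_per_mm
        self.gcode = ["G28", "G92 E0"]  # home, reset extruder

    def forward(self, dist):
        """Advance the head, extruding filament proportional to the distance."""
        self.x += dist * math.cos(math.radians(self.heading))
        self.y += dist * math.sin(math.radians(self.heading))
        self.extruded += dist * self.feed_per_mm
        self.gcode.append(f"G1 X{self.x:.3f} Y{self.y:.3f} E{self.extruded:.4f}")

    def turn(self, angle):
        self.heading = (self.heading + angle) % 360.0

    def up(self, dz):
        """Lift to the next layer (no extrusion)."""
        self.z += dz
        self.gcode.append(f"G1 Z{self.z:.3f}")

t = PrintingTurtle()
for _ in range(4):                      # one square layer perimeter
    t.forward(20); t.turn(90)
t.up(0.2)                               # ready for the next layer
```

    Stacking such perimeters layer by layer yields a printable solid, which is exactly the procedural style the abstract contrasts with declarative CAD models.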

  15. Parallel Goals of the Early Childhood Music Program.

    Science.gov (United States)

    Cohen, Veronica Wolf

    Early childhood music programs should be based on two interacting goals: (1) to teach those skills most appropriate to a particular level and (2) to nurture musical creativity and self-expression. Early childhood is seen as the optimum time for acquiring certain musical skills, of which the ability to sing in tune is considered primary. The vocal…

  16. Parallelized Solution to Semidefinite Programmings in Quantum Complexity Theory

    CERN Document Server

    Wu, Xiaodi

    2010-01-01

    In this paper we present an equilibrium-value-based framework for solving SDPs via the multiplicative weight update method, which differs from the one in Kale's thesis [Kale07]. One of the main advantages of the new framework is that we can guarantee the convertibility from approximate to exact feasibility for a much more general class of SDPs than previous results. Another advantage is that the design of the oracle, which is necessary for applying the multiplicative weight update method, is much simplified in general cases. This leads to alternative and easier proofs of the SDPs used in the previous results QIP(2) ⊆ PSPACE [JainUW09] and QMAM = PSPACE [JainJUW09]. Furthermore, we provide a generic form of SDPs which can be solved in a similar way. By parallelizing every step in our solution, we are able to solve a class of SDPs in NC. Although our motivation is from quantum computing, our result will also apply directly to any SDP which satisfie...

  17. “3D Turtle Graphics” by using a 3D Printer

    OpenAIRE

    Yasusi Kanada

    2015-01-01

    When creating shapes by using a 3D printer, usually, a static (declarative) model designed by using a 3D CAD system is translated to a CAM program and it is sent to the printer. However, widely-used FDM-type 3D printers input a dynamical (procedural) program that describes control of motions of the print head and extrusion of the filament. If the program is expressed by using a programming language or a library in a straight manner, solids can be created by a method similar to tur...

  18. Guide to development of a scalar massive parallel programming on Paragon

    Energy Technology Data Exchange (ETDEWEB)

    Ueshima, Yutaka; Arakawa, Takuya; Sasaki, Akira [Japan Atomic Energy Research Inst., Neyagawa, Osaka (Japan). Kansai Research Establishment; Yokota, Hisasi

    1998-10-01

    Parallel calculations using more than a hundred computers began in Japan only several years ago. The Intel Paragon XP/S 15GP256 and 75MP834 were introduced as pioneers at the Japan Atomic Energy Research Institute (JAERI) to pursue massive parallel simulations for advanced photon and fusion research. Recently, a large number of parallel programs have been ported or newly produced to perform parallel calculations on those computers. However, these programs were developed based on software technologies for conventional supercomputers, and therefore they sometimes cause trouble in massive parallel computing. In principle, when programs are developed under a different computer and operating system (OS), prudent directions and knowledge are needed. However, integration of knowledge and standardization of the environment are quite difficult because the number of Paragon systems and Paragon users is very small in Japan. Therefore, we summarized information obtained through the process of developing a massive parallel program on the Paragon XP/S 75MP834. (author)

  19. Shaping 3-D boxes

    DEFF Research Database (Denmark)

    Stenholt, Rasmus; Madsen, Claus B.

    2011-01-01

    Enabling users to shape 3-D boxes in immersive virtual environments is a non-trivial problem. In this paper, a new family of techniques for creating rectangular boxes of arbitrary position, orientation, and size is presented and evaluated. These new techniques are based solely on position data......, making them different from typical, existing box shaping techniques. The basis of the proposed techniques is a new algorithm for constructing a full box from just three of its corners. The evaluation of the new techniques compares their precision and completion times in a 9 degree-of-freedom (DoF) docking experiment against an existing technique, which requires the user to perform the rotation and scaling of the box explicitly. The precision of the users' box construction is evaluated by a novel error metric measuring the difference between two boxes. The results of the experiment strongly indicate...

  20. RAG-3D: a search tool for RNA 3D substructures.

    Science.gov (United States)

    Zahran, Mai; Sevim Bayrak, Cigdem; Elmetwaly, Shereef; Schlick, Tamar

    2015-10-30

    To address many challenges in RNA structure/function prediction, the characterization of RNA's modular architectural units is required. Using the RNA-As-Graphs (RAG) database, we have previously explored the existence of secondary structure (2D) submotifs within larger RNA structures. Here we present RAG-3D, a dataset of RNA tertiary (3D) structures and substructures plus a web-based search tool, designed to exploit graph representations of RNAs for the goal of searching for similar 3D structural fragments. The objects in RAG-3D consist of 3D structures translated into 3D graphs, cataloged based on the connectivity between their secondary structure elements. Each graph is additionally described in terms of its subgraph building blocks. The RAG-3D search tool then compares a query RNA 3D structure to those in the database to obtain structurally similar structures and substructures. This comparison reveals conserved 3D RNA features and thus may suggest functional connections. Though RNA search programs based on similarity in sequence, 2D, and/or 3D structural elements are available, our graph-based search tool may be advantageous for illuminating similarities that are not obvious; using motifs rather than sequence space also reduces search times considerably. Ultimately, such substructuring could be useful for RNA 3D structure prediction, structure/function inference and inverse folding. PMID:26304547
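
    As a toy illustration of graph-based substructure search, the sketch below indexes graphs by their sorted degree sequence, a much coarser signature than RAG-3D's cataloging by secondary-structure-element connectivity, but enough to show why searching in graph space rather than sequence space prunes candidates cheaply. All class names and example graphs are hypothetical.

```python
from collections import defaultdict

def degree_signature(edges):
    """Coarse graph signature: the sorted degree sequence. Graphs with different
    signatures cannot be isomorphic; equal signatures mark match candidates."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return tuple(sorted(deg.values()))

class GraphCatalog:
    """Toy stand-in for a substructure database indexed by graph signature."""
    def __init__(self):
        self.index = defaultdict(list)

    def add(self, name, edges):
        self.index[degree_signature(edges)].append(name)

    def candidates(self, edges):
        """Return cataloged graphs whose signature matches the query's."""
        return self.index.get(degree_signature(edges), [])

cat = GraphCatalog()
cat.add("hairpin", [(1, 2), (2, 3)])             # path on 3 vertices
cat.add("junction", [(1, 2), (1, 3), (1, 4)])    # 3-way star
hits = cat.candidates([(5, 6), (6, 7)])          # another 3-vertex path
```

    A real tool would follow this cheap filter with a proper (sub)graph comparison on the surviving candidates, which is the role the RAG-3D search step plays.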

  1. 3D Membrane Imaging and Porosity Visualization

    KAUST Repository

    Sundaramoorthi, Ganesh

    2016-03-03

    Ultrafiltration asymmetric porous membranes were imaged by two microscopy methods, which allow 3D reconstruction: Focused Ion Beam and Serial Block Face Scanning Electron Microscopy. A new algorithm was proposed to evaluate porosity and average pore size in different layers orthogonal and parallel to the membrane surface. The 3D-reconstruction enabled additionally the visualization of pore interconnectivity in different parts of the membrane. The method was demonstrated for a block copolymer porous membrane and can be extended to other membranes with application in ultrafiltration, supports for forward osmosis, etc, offering a complete view of the transport paths in the membrane.
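
    The layer-wise porosity evaluation described here can be sketched for a binary (pore/solid) voxel volume: porosity in each slice orthogonal to a chosen axis is simply the fraction of pore voxels in that slice. The function and the synthetic volume below are an illustrative simplification, not the paper's proposed algorithm.

```python
import numpy as np

def porosity_profile(volume, axis=0):
    """Fraction of pore voxels (True) in each slice along 'axis' of a binary volume."""
    other = tuple(i for i in range(volume.ndim) if i != axis)
    return volume.mean(axis=other)

# synthetic volume: porosity decreasing with depth (hypothetical data)
rng = np.random.default_rng(0)
depth_target = np.linspace(0.7, 0.2, 50)            # intended porosity per layer
vol = rng.random((50, 64, 64)) < depth_target[:, None, None]
profile = porosity_profile(vol, axis=0)             # one value per layer
```

    Running the same reduction along the in-plane axes gives the profiles parallel to the membrane surface that the abstract mentions.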

  2. Task scheduling of parallel programs to optimize communications for cluster of SMPs

    Institute of Scientific and Technical Information of China (English)

    郑纬民; 杨博; 林伟坚; 李志光

    2001-01-01

    This paper discusses compile-time task scheduling of parallel programs running on clusters of SMP workstations. First, the problem is stated formally, transformed into a graph partition problem, and proved to be NP-complete. A heuristic algorithm, MMP-Solver, is then proposed to solve the problem. Experimental results show that the task scheduling can greatly reduce the communication overhead of parallel applications and that MMP-Solver outperforms existing algorithms.
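
    The flavor of such a heuristic can be shown with a simple greedy sketch (illustrative only, not the paper's MMP-Solver): each task is placed on the node that already holds the neighbors it communicates with most, subject to a balanced-load capacity, so heavily communicating tasks tend to land on the same node and the partition cut stays small.

```python
import math

def greedy_schedule(tasks, comm, n_nodes):
    """Greedy task placement minimizing inter-node communication.
    comm[(a, b)] gives the communication volume between tasks a and b."""
    cap = math.ceil(len(tasks) / n_nodes)      # keep the load balanced
    placement, load = {}, [0] * n_nodes
    for t in tasks:
        affinity = [0] * n_nodes               # volume to already-placed neighbors
        for (a, b), vol in comm.items():
            if a == t and b in placement:
                affinity[placement[b]] += vol
            elif b == t and a in placement:
                affinity[placement[a]] += vol
        candidates = [n for n in range(n_nodes) if load[n] < cap]
        best = max(candidates, key=lambda n: (affinity[n], -load[n]))
        placement[t] = best
        load[best] += 1
    return placement

# chain A-B-C-D with heavy A-B and C-D traffic: the cut should fall on B-C
comm = {("A", "B"): 10, ("B", "C"): 1, ("C", "D"): 10}
plan = greedy_schedule(["A", "B", "C", "D"], comm, n_nodes=2)
```

    On this example the heuristic cuts only the cheap B-C edge, leaving the two heavy edges intra-node, which is the kind of communication reduction the paper measures.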

  3. MPI parallel programming of mixed integer optimization problems using CPLEX with COIN-OR

    OpenAIRE

    Aldasoro Marcellan, Unai; Garín Martín, María Araceli; Merino Maestre, María; Pérez Sainz de Rozas, Gloria

    2012-01-01

    The aim of this technical report is to present some detailed explanations in order to help to understand and use the Message Passing Interface (MPI) parallel programming for solving several mixed integer optimization problems. We have developed a C++ experimental code that uses the IBM ILOG CPLEX optimizer within the COmputational INfrastructure for Operations Research (COIN-OR) and MPI parallel computing for solving the optimization models under UNIX-like systems. The computational experienc...

  4. Dynamic programming parallel implementations for the knapsack problem

    OpenAIRE

    Andonov, Rumen; Raimbault, Frédéric; Quinton, Patrice

    1993-01-01

    A systolic algorithm for the dynamic programming approach to the knapsack problem is presented. The algorithm can run on any number of processors and has optimal time speedup and processor efficiency. The running time of the algorithm is O(mc/q + m) on a ring of q processors, where c is the knapsack size and m is the number of object types. A new procedure for the backtracking phase of the algorithm with a time complexity O(m) is also proposed, which is an improvement on the usual strategi...

  5. Design for scalability in 3D computer graphics architectures

    DEFF Research Database (Denmark)

    Holten-Lund, Hans Erik

    2002-01-01

    This thesis describes useful methods and techniques for designing scalable hybrid parallel rendering architectures for 3D computer graphics. Various techniques for utilizing parallelism in a pipelined system are analyzed. During the Ph.D. study a prototype 3D graphics architecture named Hybris has...... been developed. Hybris is a prototype rendering architecture which can be tailored to many specific 3D graphics applications and implemented in various ways. Parallel software implementations for both single and multi-processor Windows 2000 systems have been demonstrated. Working hardware...

  6. An Improved Version of TOPAZ 3D

    CERN Document Server

    Krasnykh, Anatoly K

    2003-01-01

    An improved version of the TOPAZ 3D gun code is presented as a powerful tool for beam optics simulation. In contrast to the previous version of TOPAZ 3D, the geometry of the device under test is introduced into TOPAZ 3D directly from a CAD program, such as Solid Edge or AutoCAD. In order to have this new feature, an interface was developed, using the GiD software package as a meshing code. The article describes this method with two models to illustrate the results.

  7. An Improved Version of TOPAZ 3D

    International Nuclear Information System (INIS)

    An improved version of the TOPAZ 3D gun code is presented as a powerful tool for beam optics simulation. In contrast to the previous version of TOPAZ 3D, the geometry of the device under test is introduced into TOPAZ 3D directly from a CAD program, such as Solid Edge or AutoCAD. In order to have this new feature, an interface was developed, using the GiD software package as a meshing code. The article describes this method with two models to illustrate the results

  8. Wireless 3D Chocolate Printer

    Directory of Open Access Journals (Sweden)

    FROILAN G. DESTREZA

    2014-02-01

    Full Text Available This study is for the BSHRM students of Batangas State University (BatStateU ARASOF), as the researchers believe that the Wireless 3D Chocolate Printer would be helpful in their degree program, especially in making creative, artistic, personalized and decorative chocolate designs. The researchers used the prototyping model as the procedural method for the successful development and implementation of the hardware and software. This method has five phases: quick plan, quick design, prototype construction, delivery and feedback, and communication. The study was evaluated by the BSHRM students, and the respondents' assessment of the software and hardware application is excellent in terms of accuracy, effectiveness, efficiency, maintainability, reliability and user-friendliness. The overall level of acceptability of the design project, as evaluated by the respondents, is also excellent. With regard to the observations about the best raw material to use in 3D printing: chocolate is good to use, as the printed material is only slightly distorted, durable and very easy to prepare; icing is also good to use, as the printed material is not distorted and is very durable but takes time to prepare; flour is not good, as the printed material is distorted and not durable, though it is easy to prepare. The computed economic viability level of the 3D printer with reference to ROI is 37.14%. The recommendations of the researchers for the design project are as follows: adding a cooling system so that the raw material will be more durable, developing a more simplified version, and improving the extrusion process so that the user does not need to stop the printing process just to replace an empty syringe with a new one.

  9. 3D printing for dummies

    CERN Document Server

    Hausman, Kalani Kirk

    2014-01-01

    Get started printing out 3D objects quickly and inexpensively! 3D printing is no longer just a figment of your imagination. This remarkable technology is coming to the masses with the growing availability of 3D printers. 3D printers create 3-dimensional layered models and they allow users to create prototypes that use multiple materials and colors.  This friendly-but-straightforward guide examines each type of 3D printing technology available today and gives artists, entrepreneurs, engineers, and hobbyists insight into the amazing things 3D printing has to offer. You'll discover methods for

  10. A Tool for Performance Modeling of Parallel Programs

    Directory of Open Access Journals (Sweden)

    J.A. González

    2003-01-01

    Full Text Available Current performance prediction analytical models try to characterize the performance behavior of actual machines through a small set of parameters. In practice, substantial deviations are observed. These differences are due to factors such as memory hierarchies and network latency. A natural approach is to associate a different proportionality constant with each basic block and, analogously, different latencies and bandwidths with each "communication block". Unfortunately, this approach implies that the parameters must be evaluated for each algorithm. This is a heavy task, involving experiment design, timing, statistics, pattern recognition and multi-parameter fitting algorithms, so software support is required. We present a compiler that takes as source a C program annotated with complexity formulas and produces as output an instrumented code. The trace files obtained from the execution of the resulting code are analyzed with an interactive interpreter, giving us, among other information, the values of those parameters.
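
    The multi-parameter fitting step mentioned above can be sketched with ordinary least squares: given timings of runs annotated with a complexity formula, recover one proportionality constant per term. The model T(n) = a·n + b·n² + c and the synthetic timings below are hypothetical stand-ins for the instrumented traces the tool analyzes.

```python
import numpy as np

# synthetic "trace": timings of runs at several problem sizes, with noise
rng = np.random.default_rng(1)
n = np.array([100, 200, 400, 800, 1600], dtype=float)
true_a, true_b, true_c = 3e-6, 2e-9, 1e-3
T = true_a * n + true_b * n**2 + true_c + rng.normal(0, 1e-6, n.size)

# design matrix built from the annotated complexity formula's terms
A = np.column_stack([n, n**2, np.ones_like(n)])
(a, b, c), *_ = np.linalg.lstsq(A, T, rcond=None)   # one constant per term
```

    With more terms (one per basic block, plus latency/bandwidth pairs per communication block) the same fit yields the full parameter set the interactive interpreter reports.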

  11. 3D monitor

    OpenAIRE

    Szkandera, Jan

    2009-01-01

    This Bachelor's thesis deals with the design and realization of a system that makes it possible to perceive a scene displayed on a flat screen spatially. Spatial perception of 2D image information is enabled both by stereo projection and by changing the image according to the position of the observer; this thesis deals mainly with the second of these problems. This Bachelor's thesis's goal is to design and realize a system which allows the user to perceive 2D visual information as three-dimensional. 3D visual perception of a 2D image i...

  12. Remote Memory Access: A Case for Portable, Efficient and Library Independent Parallel Programming

    Directory of Open Access Journals (Sweden)

    Alexandros V. Gerbessiotis

    2004-01-01

    Full Text Available In this work we make a strong case for remote memory access (RMA) as the effective way to program a parallel computer by proposing a framework that supports RMA in a library independent, simple and intuitive way. With our approach, the parallel code one writes will run transparently not only under MPI-2 enabled libraries but also under bulk-synchronous parallel libraries. The advantages of using RMA are code simplicity, reduced programming complexity, and increased efficiency. We support the latter claims by implementing under this framework a collection of benchmark programs consisting of a communication and synchronization performance assessment program, a dense matrix multiplication algorithm, and two variants of a parallel radix-sort algorithm, and by examining their performance on a LINUX-based PC cluster under three different RMA enabled libraries: LAM MPI, BSPlib, and PUB. We conclude that implementations of such parallel algorithms using RMA communication primitives lead to code that is as efficient as the message-passing equivalent code and, in the case of radix-sort, substantially more efficient. In addition our work can be used as a comparative study of the relevant capabilities of the three libraries.

  13. Scalable parallel programming for high performance seismic simulation on petascale heterogeneous supercomputers

    Science.gov (United States)

    Zhou, Jun

    The 1994 Northridge earthquake in Los Angeles, California, killed 57 people, injured over 8,700 and caused an estimated $20 billion in damage. Petascale simulations are needed in California and elsewhere to provide society with a better understanding of the rupture and wave dynamics of the largest earthquakes at shaking frequencies required to engineer safe structures. As heterogeneous supercomputing infrastructures become more common, numerical developments in earthquake system research are particularly challenged by the dependence on accelerator elements to enable "the Big One" simulations with higher frequency and finer resolution. Reducing time to solution and power consumption are two primary focus areas today for the enabling technology of fault rupture dynamics and seismic wave propagation in realistic 3D models of the crust's heterogeneous structure. This dissertation presents scalable parallel programming techniques for high performance seismic simulation running on petascale heterogeneous supercomputers. A real world earthquake simulation code, AWP-ODC, one of the most advanced earthquake codes to date, was chosen as the base code in this research, and the testbed is based on Titan at Oak Ridge National Laboratory, the world's largest heterogeneous supercomputer. The research work is primarily related to architecture study, computation performance tuning and software system scalability. An earthquake simulation workflow has also been developed to support efficient production runs of simulations. The highlights of the technical development are an aggressive performance optimization focusing on data locality and a notable data communication model that hides the data communication latency. This development results in optimal computation efficiency and throughput for the 13-point stencil code on heterogeneous systems, which can be extended to general high-order stencil codes. Started from scratch, the hybrid CPU/GPU version of AWP

  14. 3D game environments create professional 3D game worlds

    CERN Document Server

    Ahearn, Luke

    2008-01-01

    The ultimate resource to help you create triple-A quality art for a variety of game worlds; 3D Game Environments offers detailed tutorials on creating 3D models, applying 2D art to 3D models, and clear, concise advice on issues of efficiency and optimization for a 3D game engine. Using Photoshop and 3ds Max as his primary tools, Luke Ahearn explains how to create realistic textures from photo source and uses a variety of techniques to portray dynamic and believable game worlds. From a modern city to a steamy jungle, learn about the planning and technological considerations for 3D modelin

  15. SSI 2D/3D soil structure interaction: A program system for the calculation of structure-soil interactions using the boundary element method. Project C1

    International Nuclear Information System (INIS)

    SSI 2D/3D is a computer program that calculates dynamic stiffness matrices for soil-structure interaction problems in the frequency domain. It is applicable to two- or three-dimensional situations. The present report is a detailed manual for the use of the computer code, which is written in FORTRAN 77. In addition, it gives a survey of the possibilities of the Boundary Element Method applied to dynamic problems in infinite domains. (orig.)

  16. High performance parallelism pearls 2 multicore and many-core programming approaches

    CERN Document Server

    Jeffers, Jim

    2015-01-01

    High Performance Parallelism Pearls Volume 2 offers another set of examples that demonstrate how to leverage parallelism. Similar to Volume 1, the techniques included here explain how to use processors and coprocessors with the same programming - illustrating the most effective ways to combine Xeon Phi coprocessors with Xeon and other multicore processors. The book includes examples of successful programming efforts, drawn from across industries and domains such as biomed, genetics, finance, manufacturing, imaging, and more. Each chapter in this edited work includes detailed explanations of t

  17. Performance Evaluation of Remote Memory Access (RMA) Programming on Shared Memory Parallel Computers

    Science.gov (United States)

    Jin, Hao-Qiang; Jost, Gabriele; Biegel, Bryan A. (Technical Monitor)

    2002-01-01

    The purpose of this study is to evaluate the feasibility of remote memory access (RMA) programming on shared memory parallel computers. We discuss different RMA based implementations of selected CFD application benchmark kernels and compare them to corresponding message passing based codes. For the message-passing implementation we use MPI point-to-point and global communication routines. For the RMA based approach we consider two different libraries supporting this programming model. One is a shared memory parallelization library (SMPlib) developed at NASA Ames, the other is the MPI-2 extensions to the MPI Standard. We give timing comparisons for the different implementation strategies and discuss the performance.

  18. Teaching Parallel Programming Using Both High-Level and Low-Level Languages

    OpenAIRE

    Pan, Yi

    2001-01-01

    We discuss the use of both MPI and OpenMP in the teaching of senior undergraduate and junior graduate classes in parallel programming. We briefly introduce the OpenMP standard and discuss why we have chosen to use it in parallel programming classes. Advantages of using OpenMP over message passing methods are discussed. We also include a brief enumeration of some of the drawbacks of using OpenMP and how these drawbacks are being addressed by supplementing OpenMP with additional MPI codes and p...

  19. Salient Local 3D Features for 3D Shape Retrieval

    CERN Document Server

    Godil, Afzal

    2011-01-01

    In this paper we describe a new formulation for the 3D salient local features based on the voxel grid inspired by the Scale Invariant Feature Transform (SIFT). We use it to identify the salient keypoints (invariant points) on a 3D voxelized model and calculate invariant 3D local feature descriptors at these keypoints. We then use the bag of words approach on the 3D local features to represent the 3D models for shape retrieval. The advantages of the method are that it can be applied to rigid as well as to articulated and deformable 3D models. Finally, this approach is applied for 3D Shape Retrieval on the McGill articulated shape benchmark and then the retrieval results are presented and compared to other methods.

  20. LB3D: a protein three-dimensional substructure search program based on the lower bound of a root mean square deviation value.

    Science.gov (United States)

    Terashi, Genki; Shibuya, Tetsuo; Takeda-Shitaka, Mayuko

    2012-05-01

    Searching for protein structure-function relationships using three-dimensional (3D) structural coordinates represents a fundamental approach for determining the function of proteins with unknown functions. Since protein structure databases are rapidly growing in size, the development of a fast search method to find similar protein substructures by comparison of protein 3D structures is essential. In this article, we present a novel protein 3D structure search method to find all substructures with root mean square deviations (RMSDs) to the query structure that are lower than a given threshold value. Our new algorithm runs in O(m + N/m^0.5) time, after O(N log N) preprocessing, where N is the database size and m is the query length. The new method is 1.8-41.6 times faster than the practically best known O(N) algorithm, according to computational experiments using a huge database (i.e., >20,000,000 C-alpha coordinates).

  1. 3D modelling and recognition

    OpenAIRE

    Rodrigues, Marcos; Robinson, Alan; Alboul, Lyuba; Brink, Willie

    2006-01-01

    3D face recognition is an open field. In this paper we present a method for 3D facial recognition based on Principal Components Analysis. The method uses a relatively large number of facial measurements and ratios and yields reliable recognition. We also highlight our approach to sensor development for fast 3D model acquisition and automatic facial feature extraction.

  2. MOM3D/EM-ANIMATE - MOM3D WITH ANIMATION CODE

    Science.gov (United States)

    Shaeffer, J. F.

    1994-01-01

    compare surface-current distribution due to various initial excitation directions or electric field orientations. The program can accept up to 50 planes of field data consisting of a grid of 100 by 100 field points. These planes of data are user selectable and can be viewed individually or concurrently. With these preset limits, the program requires 55 megabytes of core memory to run. These limits can be changed in the header files to accommodate the available core memory of an individual workstation. An estimate of memory required can be made as follows: approximate memory in bytes equals (number of nodes times number of surfaces times 14 variables times bytes per word, typically 4 bytes per floating point) plus (number of field planes times number of nodes per plane times 21 variables times bytes per word). This gives the approximate memory size required to store the field and surface-current data. The total memory size is approximately 400,000 bytes plus the data memory size. The animation calculations are performed in real time at any user set time step. For Silicon Graphics Workstations that have multiple processors, this program has been optimized to perform these calculations on multiple processors to increase animation rates. The optimized program uses the SGI PFA (Power FORTRAN Accelerator) library. On single processor machines, the parallelization directives are seen as comments to the program and will have no effect on compilation or execution. MOM3D and EM-ANIMATE are written in FORTRAN 77 for interactive or batch execution on SGI series computers running IRIX 3.0 or later. The RAM requirements for these programs vary with the size of the problem being solved. A minimum of 30Mb of RAM is required for execution of EM-ANIMATE; however, the code may be modified to accommodate the available memory of an individual workstation. For EM-ANIMATE, twenty-four bit, double-buffered color capability is suggested, but not required. 
Sample executables and sample input and

  3. Grid Service Framework:Supporting Multi-Models Parallel Grid Programming

    Institute of Scientific and Technical Information of China (English)

    邓倩妮; 陆鑫达

    2004-01-01

    Web service is a grid computing technology that promises greater ease-of-use and interoperability than previous distributed computing technologies. This paper proposes the Group Service Framework, a grid computing platform based on Microsoft .NET that uses web services to: (1) locate and harness volunteer computing resources for different applications; (2) support multiple parallel programming paradigms such as Master/Slave, Divide and Conquer, and Phase Parallel in a Grid environment; and (3) allocate data and balance load dynamically and transparently for grid computing applications. The framework was used to implement several simple parallel computing applications. The results show that the proposed Group Service Framework is suitable for generic parallel numerical computing.

  4. High performance parallel computers for science: New developments at the Fermilab advanced computer program

    International Nuclear Information System (INIS)

    Fermilab's Advanced Computer Program (ACP) has been developing highly cost effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor has been developed for theoretical physics. Each $4000 node is a FORTRAN- or C-programmable pipelined 20 MFlops (peak), 10 MByte single board computer. These are plugged into a 16 port crossbar switch crate which handles both inter- and intra-crate communication. The crates are connected in a hypercube. Site oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256 node, 5 GFlop, system is under construction. 10 refs., 7 figs

  5. Mechanical properties of structures 3D printed with cementitious powders

    OpenAIRE

    Feng, Peng; Meng, Xinmiao; Chen, Jian Fei; Ye, Lieping

    2015-01-01

    The three dimensional (3D) printing technology has undergone rapid development in the last few years and it is now possible to print engineering structures. This paper presents a study of the mechanical behavior of 3D printed structures using cementitious powder. Microscopic observation reveals that the 3D printed products have a layered orthotropic microstructure, in which each layer consists of parallel strips. Compression and flexural tests were conducted to determine the mechanical proper...

  6. Architecture-Adaptive Computing Environment: A Tool for Teaching Parallel Programming

    Science.gov (United States)

    Dorband, John E.; Aburdene, Maurice F.

    2002-01-01

    Recently, networked and cluster computation have become very popular. This paper is an introduction to a new C based parallel language for architecture-adaptive programming, aCe C. The primary purpose of aCe (Architecture-adaptive Computing Environment) is to encourage programmers to implement applications on parallel architectures by providing them the assurance that future architectures will be able to run their applications with a minimum of modification. A secondary purpose is to encourage computer architects to develop new types of architectures by providing an easily implemented software development environment and a library of test applications. This new language should be an ideal tool to teach parallel programming. In this paper, we will focus on some fundamental features of aCe C.

  7. 3D Visualization of Recent Sumatra Earthquake

    Science.gov (United States)

    Nayak, Atul; Kilb, Debi

    2005-04-01

    Scientists and visualization experts at the Scripps Institution of Oceanography have created an interactive three-dimensional visualization of the 28 March 2005 magnitude 8.7 earthquake in Sumatra. The visualization shows the earthquake's hypocenter and aftershocks recorded until 29 March 2005, and compares it with the location of the 26 December 2004 magnitude 9 event and the consequent seismicity in that region. The 3D visualization was created using the Fledermaus software developed by Interactive Visualization Systems (http://www.ivs.unb.ca/) and stored as a ``scene'' file. To view this visualization, viewers need to download and install the free viewer program iView3D (http://www.ivs3d.com/products/iview3d).

  8. LOTT RANCH 3D PROJECT

    International Nuclear Information System (INIS)

    , and scout ticket data were integrated with the 3D interpretations to evaluate drilling opportunities resulting in an initial three well drilling program. Thousands of miles of signed bit data exist. Much of this data was processed during a time when software and hardware capabilities were either incapable or cost prohibitive to glean the full potential of the data. In fact in some circles signed bit gained an undeserved reputation for being less than optimum. As a consequence much of the older signed bit data sits on the shelf long forgotten or overlooked. With the high cost of new acquisition and permitting it might behoove other exploration companies to reconsider resurrecting perfectly viable existing volumes and have them reprocessed at a fraction of the cost of new acquisition

  9. Scoops3D: software to analyze 3D slope stability throughout a digital landscape

    Science.gov (United States)

    Reid, Mark E.; Christian, Sarah B.; Brien, Dianne L.; Henderson, Scott T.

    2015-01-01

    The computer program, Scoops3D, evaluates slope stability throughout a digital landscape represented by a digital elevation model (DEM). The program uses a three-dimensional (3D) method of columns approach to assess the stability of many (typically millions) potential landslides within a user-defined size range. For each potential landslide (or failure), Scoops3D assesses the stability of a rotational, spherical slip surface encompassing many DEM cells using a 3D version of either Bishop’s simplified method or the Ordinary (Fellenius) method of limit-equilibrium analysis. Scoops3D has several options for the user to systematically and efficiently search throughout an entire DEM, thereby incorporating the effects of complex surface topography. In a thorough search, each DEM cell is included in multiple potential failures, and Scoops3D records the lowest stability (factor of safety) for each DEM cell, as well as the size (volume or area) associated with each of these potential landslides. It also determines the least-stable potential failure for the entire DEM. The user has a variety of options for building a 3D domain, including layers or full 3D distributions of strength and pore-water pressures, simplistic earthquake loading, and unsaturated suction conditions. Results from Scoops3D can be readily incorporated into a geographic information system (GIS) or other visualization software. This manual includes information on the theoretical basis for the slope-stability analysis, requirements for constructing and searching a 3D domain, a detailed operational guide (including step-by-step instructions for using the graphical user interface [GUI] software, Scoops3D-i) and input/output file specifications, practical considerations for conducting an analysis, results of verification tests, and multiple examples illustrating the capabilities of Scoops3D. 
Easy-to-use software installation packages are available for the Windows or Macintosh operating systems; these packages

  10. Parallel Implementation of a Semidefinite Programming Solver based on CSDP in a distributed memory cluster

    NARCIS (Netherlands)

    Ivanov, I.D.; de Klerk, E.

    2007-01-01

    In this paper we present the algorithmic framework and practical aspects of implementing a parallel version of a primal-dual semidefinite programming solver on a distributed memory computer cluster. Our implementation is based on the CSDP solver and uses a message passing interface (MPI), and the Sc

  11. All-pairs Shortest Path Algorithm based on MPI+CUDA Distributed Parallel Programming Model

    Directory of Open Access Journals (Sweden)

    Qingshuang Wu

    2013-12-01

    Full Text Available In view of the problem that computing shortest paths in a graph is a complex and time-consuming process, and that traditional algorithms relying solely on the CPU as the computing unit cannot meet the demand of real-time processing, in this paper we present an all-pairs shortest paths algorithm using an MPI+CUDA hybrid programming model, which can exploit the overwhelming computing power of a GPU cluster to speed up the processing. The proposed algorithm combines the advantages of the MPI and CUDA programming models and realizes two-level parallel computing. At the cluster level, we use the MPI programming model to achieve coarse-grained parallel computing between the computational nodes of the GPU cluster. At the node level, we use the CUDA programming model to achieve GPU-accelerated fine-grained parallel computing within each computational node. The experimental results show that the MPI+CUDA-based parallel algorithm can take full advantage of the powerful computing capability of the GPU cluster and can achieve speedups of several hundred times; the whole algorithm has good computing performance, reliability and scalability, and it is able to meet the demand of real-time processing of massive spatial shortest path analysis.

  12. Hybrid MPI+UPC parallel programming paradigm on an SMP cluster

    OpenAIRE

    BOZKUŞ, Zeki

    2011-01-01

    The symmetric multiprocessing (SMP) cluster system, which consists of shared memory nodes with several multicore central processing units connected to a high-speed network to form a distributed memory system, is the most widely available hardware architecture for the high-performance computing community. Today, the Message Passing Interface (MPI) is the most widely used parallel programming paradigm for SMP clusters, in which the MPI provides programming both for an SMP node and among ...

  13. A language for data-parallel and task parallel programming dedicated to multi-SIMD computers. Contributions to hydrodynamic simulation with lattice gases

    International Nuclear Information System (INIS)

    Parallel programming covers task-parallelism and data-parallelism. Many problems need both parallelisms. Multi-SIMD computers allow a hierarchical approach to these parallelisms. The T++ language, based on C++, is dedicated to exploiting Multi-SIMD computers using a programming paradigm which extends array programming to task management. Our language introduces arrays of independent tasks executed separately (MIMD) on subsets of processors of identical behaviour (SIMD), in order to translate the hierarchical inclusion of data-parallelism in task-parallelism. To manipulate tasks and data in a symmetrical way we propose meta-operations which have the same behaviour on task arrays and on data arrays. We explain how to implement this language on our parallel computer SYMPHONIE in order to profit from the locally shared memory, the hardware virtualization, and the multiplicity of communication networks. We simultaneously analyse a typical application of such an architecture. Finite element schemes for fluid mechanics need powerful parallel computers with large floating-point capabilities. Lattice gases are an alternative to such simulations. Boolean lattice gases are simple, stable and modular, and need no floating-point computation, but they include numerical noise. Boltzmann lattice gases offer high numerical precision, but need floating-point arithmetic and are only locally stable. We propose a new scheme, called multi-bit, which keeps the advantages of each boolean model to which it is applied, with high numerical precision and reduced noise. Experiments on viscosity, physical behaviour, noise reduction and spurious invariants are shown and implementation techniques for parallel Multi-SIMD computers detailed. (author)

  14. Planet-Disk Interaction on the GPU: The FARGO3D code

    Science.gov (United States)

    Masset, F. S.; Benítez-Llambay, P.

    2015-10-01

    We present the new code FARGO3D. It is a finite difference code that solves the equations of hydrodynamics or magnetohydrodynamics on a Cartesian, cylindrical or spherical mesh. It features orbital advection and conserves mass and (angular) momentum to machine accuracy. Special emphasis is put on the description of planet-disk tidal interactions. It is parallelized with MPI, and it can run on either CPUs or GPUs, without the need to program in a GPU-oriented language.

  15. LDRD final report on massively-parallel linear programming : the parPCx system.

    Energy Technology Data Exchange (ETDEWEB)

    Parekh, Ojas (Emory University, Atlanta, GA); Phillips, Cynthia Ann; Boman, Erik Gunnar

    2005-02-01

    This report summarizes the research and development performed from October 2002 to September 2004 at Sandia National Laboratories under the Laboratory-Directed Research and Development (LDRD) project ''Massively-Parallel Linear Programming''. We developed a linear programming (LP) solver designed to use a large number of processors. LP is the optimization of a linear objective function subject to linear constraints. Companies and universities have expended huge efforts over decades to produce fast, stable serial LP solvers. Previous parallel codes run on shared-memory systems and have little or no distribution of the constraint matrix. We have seen no reports of general LP solver runs on large numbers of processors. Our parallel LP code is based on an efficient serial implementation of Mehrotra's interior-point predictor-corrector algorithm (PCx). The computational core of this algorithm is the assembly and solution of a sparse linear system. We have substantially rewritten the PCx code and based it on Trilinos, the parallel linear algebra library developed at Sandia. Our interior-point method can use either direct or iterative solvers for the linear system. To achieve a good parallel data distribution of the constraint matrix, we use a (pre-release) version of a hypergraph partitioner from the Zoltan partitioning library. We describe the design and implementation of our new LP solver called parPCx and give preliminary computational results. We summarize a number of issues related to efficient parallel solution of LPs with interior-point methods including data distribution, numerical stability, and solving the core linear system using both direct and iterative methods. We describe a number of applications of LP specific to US Department of Energy mission areas and we summarize our efforts to integrate parPCx (and parallel LP solvers in general) into Sandia's massively-parallel integer programming solver PICO (Parallel Integer and

  16. 3D-skannaukseen perehtyminen

    OpenAIRE

    Santaluoto, Olli

    2012-01-01

    This engineering thesis examines various 3D scanning techniques and methods. The thesis also uses examples to describe the applications of the different 3D scanning techniques. 3D scanning is still fairly rare in Finland, which is why the different techniques and their possible uses are unfamiliar to many. A 3D scanner is a device that examines real-world objects or environments by collecting data about the target's shape. 3D scanners are in many ways similar to ordinary cameras. As with cameras, 3D...

  17. ORMGEN3D, 3-D Crack Geometry FEM Mesh Generator

    International Nuclear Information System (INIS)

    1 - Description of program or function: ORMGEN3D is a finite element mesh generator for computational fracture mechanics analysis. The program automatically generates a three-dimensional finite element model for six different crack geometries. These geometries include flat plates with straight or curved surface cracks and cylinders with part-through cracks on the outer or inner surface. Mathematical or user-defined crack shapes may be considered. The curved cracks may be semicircular, semi-elliptical, or user-defined. A cladding option is available that allows for either an embedded or penetrating crack in the clad material. 2 - Method of solution: In general, one eighth or one-quarter of the structure is modelled depending on the configuration or option selected. The program generates a core of special wedge or collapsed prism elements at the crack front to introduce the appropriate stress singularity at the crack tip. The remainder of the structure is modelled with conventional 20-node iso-parametric brick elements. Element group I of the finite element model consists of an inner core of special crack tip elements surrounding the crack front enclosed by a single layer of conventional brick elements. Eight element divisions are used in a plane orthogonal to the crack front, while the number of element divisions along the arc length of the crack front is user-specified. The remaining conventional brick elements of the model constitute element group II. 3 - Restrictions on the complexity of the problem: Maxima of 5,500 nodes, 4 layers of clad elements

  18. Speedup properties of phases in the execution profile of distributed parallel programs

    Energy Technology Data Exchange (ETDEWEB)

    Carlson, B.M. [Toronto Univ., ON (Canada). Computer Systems Research Institute; Wagner, T.D.; Dowdy, L.W. [Vanderbilt Univ., Nashville, TN (United States). Dept. of Computer Science; Worley, P.H. [Oak Ridge National Lab., TN (United States)

    1992-08-01

    The execution profile of a distributed-memory parallel program specifies the number of busy processors as a function of time. Periods of homogeneous processor utilization are manifested in many execution profiles. These periods can usually be correlated with the algorithms implemented in the underlying parallel code. Three families of methods for smoothing execution profile data are presented. These approaches simplify the problem of detecting end points of periods of homogeneous utilization. These periods, called phases, are then examined in isolation, and their speedup characteristics are explored. A specific workload executed on an Intel iPSC/860 is used for validation of the techniques described.

  19. Programming Environment for a High-Performance Parallel Supercomputer with Intelligent Communication

    Directory of Open Access Journals (Sweden)

    A. Gunzinger

    1996-01-01

    Full Text Available At the Electronics Laboratory of the Swiss Federal Institute of Technology (ETH) in Zürich, the high-performance parallel supercomputer MUSIC (MUlti processor System with Intelligent Communication) has been developed. As applications like neural network simulation and molecular dynamics show, the Electronics Laboratory supercomputer is absolutely on par with those of conventional supercomputers, but electric power requirements are reduced by a factor of 1,000, weight is reduced by a factor of 400, and price is reduced by a factor of 100. Software development is a key issue of such parallel systems. This article focuses on the programming environment of the MUSIC system and on its applications.

  20. Teaching Scientific Computing: A Model-Centered Approach to Pipeline and Parallel Programming with C

    Directory of Open Access Journals (Sweden)

    Vladimiras Dolgopolovas

    2015-01-01

    Full Text Available The aim of this study is to present an approach to the introduction into pipeline and parallel computing, using a model of the multiphase queueing system. Pipeline computing, including software pipelines, is among the key concepts in modern computing and electronics engineering. The modern computer science and engineering education requires a comprehensive curriculum, so the introduction to pipeline and parallel computing is the essential topic to be included in the curriculum. At the same time, the topic is among the most motivating tasks due to the comprehensive multidisciplinary and technical requirements. To enhance the educational process, the paper proposes a novel model-centered framework and develops the relevant learning objects. It allows implementing an educational platform of constructivist learning process, thus enabling learners’ experimentation with the provided programming models, obtaining learners’ competences of the modern scientific research and computational thinking, and capturing the relevant technical knowledge. It also provides an integral platform that allows a simultaneous and comparative introduction to pipelining and parallel computing. The programming language C for developing programming models and message passing interface (MPI and OpenMP parallelization tools have been chosen for implementation.

  1. 3D Printing Functional Nanocomposites

    OpenAIRE

    Leong, Yew Juan

    2016-01-01

    3D printing offers rapid prototyping and rapid manufacturing. Techniques such as stereolithography (SLA) and fused deposition modeling (FDM) have been developed and utilized since the inception of 3D printing. In such techniques, polymers are the most commonly used material for 3D printing due to material properties such as thermoplasticity as well as their ability to be polymerized from monomers. Polymer nanocomposites are polymers with nanomaterials composited into the ...

  2. Academic training: From Evolution Theory to Parallel and Distributed Genetic Programming

    CERN Multimedia

    2007-01-01

    2006-2007 ACADEMIC TRAINING PROGRAMME LECTURE SERIES 15, 16 March From 11:00 to 12:00 - Main Auditorium, bldg. 500 From Evolution Theory to Parallel and Distributed Genetic Programming F. FERNANDEZ DE VEGA / Univ. of Extremadura, SP Lecture No. 1: From Evolution Theory to Evolutionary Computation Evolutionary computation is a subfield of artificial intelligence (more particularly computational intelligence) involving combinatorial optimization problems, which are based to some degree on the evolution of biological life in the natural world. In this tutorial we will review the source of inspiration for this metaheuristic and its capability for solving problems. We will show the main flavours within the field, and different problems that have been successfully solved employing this kind of techniques. Lecture No. 2: Parallel and Distributed Genetic Programming The successful application of Genetic Programming (GP, one of the available Evolutionary Algorithms) to optimization problems has encouraged an ...

  3. 3D IBFV : Hardware-Accelerated 3D Flow Visualization

    NARCIS (Netherlands)

    Telea, Alexandru; Wijk, Jarke J. van

    2003-01-01

    We present a hardware-accelerated method for visualizing 3D flow fields. The method is based on insertion, advection, and decay of dye. To this aim, we extend the texture-based IBFV technique for 2D flow visualization in two main directions. First, we decompose the 3D flow visualization problem in a

  4. Research on Parallelization of Multiphase Space Numerical Simulation%多相空间数值模拟并行化研究

    Institute of Scientific and Technical Information of China (English)

    陆林生; 董超群; 李志辉

    2003-01-01

    This paper researches the parallelization of multiphase space numerical simulation, taking the gas-kinetic algorithm for 3-D flows as its case study. It focuses on the techniques of domain decomposition methods, vector reduction and boundary-processing parallel optimization. The HPF parallel program design has been developed using the Parallel Programming Concept Design (PPCDS). The HPF program achieved favourable parallel efficiency on a high-performance computer with massive-scale parallel computing.
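
    Domain decomposition of this kind typically starts from an even block split of each grid axis among processors. A minimal sketch in C, assuming a simple 1-D block distribution per axis (the function name is ours, not from the paper):

    ```c
    /* Even 1-D block decomposition: split n cells among p ranks,
       giving the first (n mod p) ranks one extra cell. A 3-D
       decomposition applies this independently per axis. */
    void block_range(int n, int p, int rank, int *lo, int *hi) {
        int base = n / p, rem = n % p;
        *lo = rank * base + (rank < rem ? rank : rem);
        *hi = *lo + base + (rank < rem ? 1 : 0);  /* half-open [lo, hi) */
    }
    ```

    For example, splitting 10 cells over 3 ranks gives the ranges [0, 4), [4, 7) and [7, 10), so no rank's share differs from another's by more than one cell.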

  5. Interactive 3D multimedia content

    CERN Document Server

    Cellary, Wojciech

    2012-01-01

    The book describes recent research results in the areas of modelling, creation, management and presentation of interactive 3D multimedia content. The book describes the current state of the art in the field and identifies the most important research and design issues. Consecutive chapters address these issues. These are: database modelling of 3D content, security in 3D environments, describing interactivity of content, searching content, visualization of search results, modelling mixed reality content, and efficient creation of interactive 3D content. Each chapter is illustrated with example a

  6. 3D Bayesian contextual classifiers

    DEFF Research Database (Denmark)

    Larsen, Rasmus

    2000-01-01

    We extend a series of multivariate Bayesian 2-D contextual classifiers to 3-D by specifying a simultaneous Gaussian distribution for the feature vectors as well as a prior distribution of the class variables of a pixel and its 6 nearest 3-D neighbours.
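
    The 6-neighbour system used by the classifier corresponds to the face-adjacent voxels of a 3-D grid. A small sketch in C (the helper is our own illustration, with boundary clipping):

    ```c
    /* Face-adjacent (6-connected) neighbourhood of voxel (x,y,z) in an
       nx*ny*nz grid. Returns how many of the six neighbours fall inside
       the grid: 6 in the interior, fewer on faces, edges and corners. */
    int count_valid_neighbours(int x, int y, int z, int nx, int ny, int nz) {
        static const int off[6][3] = {
            { 1, 0, 0 }, { -1, 0, 0 },
            { 0, 1, 0 }, { 0, -1, 0 },
            { 0, 0, 1 }, { 0, 0, -1 }
        };
        int count = 0;
        for (int k = 0; k < 6; ++k) {
            int xx = x + off[k][0], yy = y + off[k][1], zz = z + off[k][2];
            if (xx >= 0 && xx < nx && yy >= 0 && yy < ny && zz >= 0 && zz < nz)
                ++count;
        }
        return count;
    }
    ```

    A contextual classifier would gather the class labels at exactly these six offsets when forming the prior for the central voxel.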

  7. 3-D printers for libraries

    CERN Document Server

    Griffey, Jason

    2014-01-01

    As the maker movement continues to grow and 3-D printers become more affordable, an expanding group of hobbyists is keen to explore this new technology. In the time-honored tradition of introducing new technologies, many libraries are considering purchasing a 3-D printer. Jason Griffey, an early enthusiast of 3-D printing, has researched the marketplace and seen several systems first hand at the Consumer Electronics Show. In this report he introduces readers to the 3-D printing marketplace, covering such topics as how fused deposition modeling (FDM) printing works and basic terminology such as build ...

  8. 3D for Graphic Designers

    CERN Document Server

    Connell, Ellery

    2011-01-01

    Helping graphic designers expand their 2D skills into the 3D space. The trend in graphic design is towards 3D, with the demand for motion graphics, animation, photorealism, and interactivity rapidly increasing. And with the meteoric rise of iPads, smartphones, and other interactive devices, the design landscape is changing faster than ever. 2D digital artists who need a quick and efficient way to join this brave new world will want 3D for Graphic Designers. Readers get hands-on basic training in working in the 3D space, including product design, industrial design and visualization, modeling, ani...

  9. Using 3D in Visualization

    DEFF Research Database (Denmark)

    Wood, Jo; Kirschenbauer, Sabine; Döllner, Jürgen;

    2005-01-01

    to display 3D imagery. The extra cartographic degree of freedom offered by using 3D is explored and offered as a motivation for employing 3D in visualization. The use of VR and the construction of virtual environments exploit navigational and behavioral realism, but become most useful when combined with abstracted representations embedded in a 3D space. The interactions between the development of geovisualization, the technology used to implement it and the theory surrounding cartographic representation are explored. The dominance of computing technologies, driven particularly by the gaming industry ...

  10. Algorithmic differentiation of pragma-defined parallel regions differentiating computer programs containing OpenMP

    CERN Document Server

    Förster, Michael

    2014-01-01

    Numerical programs often use parallel programming techniques such as OpenMP to compute the program's output values as efficiently as possible. In addition, derivative values of these output values with respect to certain input values play a crucial role. To achieve code that computes not only the output values but also the derivative values simultaneously, this work introduces several source-to-source transformation rules. These rules are based on a technique called algorithmic differentiation. The main focus of this work lies on the important reverse mode of algorithmic differentiation. The inh...
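
    The book's focus is reverse mode, which does not condense into a few lines; the flavour of derivative-augmented code produced by source transformation can, however, be shown with forward (tangent) mode. This is an illustration only, not the book's method:

    ```c
    #include <math.h>

    /* Forward (tangent) mode illustration: a dual number carries a
       value v and its derivative d, and every operation propagates
       both. Reverse mode instead records operations and sweeps back. */
    typedef struct { double v, d; } dual;

    dual d_mul(dual a, dual b) {            /* product rule */
        dual r = { a.v * b.v, a.d * b.v + a.v * b.d };
        return r;
    }

    dual d_sin(dual a) {                    /* chain rule */
        dual r = { sin(a.v), cos(a.v) * a.d };
        return r;
    }

    /* f(x) = x * sin(x), hence f'(x) = sin(x) + x * cos(x). */
    dual f(double x) {
        dual xd = { x, 1.0 };               /* seed dx/dx = 1 */
        return d_mul(xd, d_sin(xd));
    }
    ```

    An OpenMP-parallel loop evaluating f over an array would transform the same way, each iteration carrying its (value, derivative) pair independently.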

  11. P3T+: A Performance Estimator for Distributed and Parallel Programs

    Directory of Open Access Journals (Sweden)

    T. Fahringer

    2000-01-01

    Full Text Available Developing distributed and parallel programs on today's multiprocessor architectures is still a challenging task. Particularly distressing is the lack of effective performance tools that support the programmer in evaluating changes in code, problem and machine sizes, and target architectures. In this paper we introduce P3T+, a performance estimator for mostly regular HPF (High Performance Fortran) programs that also partially covers message passing programs (MPI). P3T+ is unique in modeling programs, compiler code transformations, and parallel and distributed architectures. It computes at compile time a variety of performance parameters including work distribution, number of transfers, amount of data transferred, transfer times, computation times, and number of cache misses. Several novel technologies are employed to compute these parameters: loop iteration spaces, array access patterns, and data distributions are modeled by employing highly effective symbolic analysis. Communication is estimated by simulating the behavior of the communication library used by the underlying compiler. Computation times are predicted through pre-measured kernels on every target architecture of interest. We carefully model the most critical architecture-specific factors such as cache line sizes, number of cache lines available, startup times, message transfer time per byte, etc. P3T+ has been implemented and is closely integrated with the Vienna High Performance Compiler (VFC) to support programmers developing parallel and distributed applications. Experimental results for realistic kernel codes taken from real-world applications are presented to demonstrate both the accuracy and the usefulness of P3T+.
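
    The transfer-time portion of such an estimator reduces, in its simplest form, to a linear cost model over the pre-measured machine parameters. A minimal sketch, where alpha and beta are our notation for the startup time and the per-byte transfer time, not P3T+'s actual internals:

    ```c
    /* Hockney-style point-to-point cost model: alpha is the startup
       time, beta the transfer time per byte; both would be
       pre-measured on the target architecture. */
    double transfer_time(double alpha, double beta, long bytes) {
        return alpha + beta * (double)bytes;
    }

    /* Total communication estimate for n identical transfers, as a
       compile-time estimator would sum them over a loop nest. */
    double comm_estimate(int n, double alpha, double beta, long bytes) {
        return n * transfer_time(alpha, beta, bytes);
    }
    ```

    With alpha = 1 µs and beta = 1 ns/byte, a 1000-byte message costs about 2 µs, so ten such transfers are estimated at roughly 20 µs.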

  12. Run-Time and Compiler Support for Programming in Adaptive Parallel Environments

    Directory of Open Access Journals (Sweden)

    Guy Edjlali

    1997-01-01

    Full Text Available For better utilization of computing resources, it is important to consider parallel programming environments in which the number of available processors varies at run-time. In this article, we discuss run-time support for data-parallel programming in such an adaptive environment. Executing programs in an adaptive environment requires redistributing data when the number of processors changes, and also requires determining new loop bounds and communication patterns for the new set of processors. We have developed a run-time library to provide this support. We discuss how the run-time library can be used by compilers of High Performance Fortran (HPF)-like languages to generate code for an adaptive environment. We present performance results for a Navier-Stokes solver and a multigrid template run on a network of workstations and an IBM SP-2. Our experiments show that if the number of processors is not varied frequently, the cost of data redistribution is not significant compared to the time required for the actual computation. Overall, our work establishes the feasibility of compiling HPF for a network of nondedicated workstations, which are likely to be an important resource for parallel programming in the future.

  13. Towards Interactive Visual Exploration of Parallel Programs using a Domain-Specific Language

    KAUST Repository

    Klein, Tobias

    2016-04-19

    The use of GPUs and the massively parallel computing paradigm have become wide-spread. We describe a framework for the interactive visualization and visual analysis of the run-time behavior of massively parallel programs, especially OpenCL kernels. This facilitates understanding a program's function and structure, finding the causes of possible slowdowns, locating program bugs, and interactively exploring and visually comparing different code variants in order to improve performance and correctness. Our approach enables very specific, user-centered analysis, both in terms of the recording of the run-time behavior and the visualization itself. Instead of having to manually write instrumented code to record data, simple code annotations tell the source-to-source compiler which code instrumentation to generate automatically. The visualization part of our framework then enables the interactive analysis of kernel run-time behavior in a way that can be very specific to a particular problem or optimization goal, such as analyzing the causes of memory bank conflicts or understanding an entire parallel algorithm.

  14. Documentation of a computer program to simulate lake-aquifer interaction using the MODFLOW ground water flow model and the MOC3D solute-transport model

    Science.gov (United States)

    Merritt, Michael L.; Konikow, Leonard F.

    2000-01-01

    Heads and flow patterns in surficial aquifers can be strongly influenced by the presence of stationary surface-water bodies (lakes) that are in direct contact, vertically and laterally, with the aquifer. Conversely, lake stages can be significantly affected by the volume of water that seeps through the lakebed that separates the lake from the aquifer. For these reasons, a set of computer subroutines called the Lake Package (LAK3) was developed to represent lake/aquifer interaction in numerical simulations using the U.S. Geological Survey three-dimensional, finite-difference, modular ground-water flow model MODFLOW and the U.S. Geological Survey three-dimensional method-of-characteristics solute-transport model MOC3D. In the Lake Package described in this report, a lake is represented as a volume of space within the model grid which consists of inactive cells extending downward from the upper surface of the grid. Active model grid cells bordering this space, representing the adjacent aquifer, exchange water with the lake at a rate determined by the relative heads and by conductances that are based on grid cell dimensions, hydraulic conductivities of the aquifer material, and user-specified leakance distributions that represent the resistance to flow through the material of the lakebed. Parts of the lake may become "dry" as upper layers of the model are dewatered, with a concomitant reduction in lake surface area, and may subsequently rewet when aquifer heads rise. An empirical approximation has been encoded to simulate the rewetting of a lake that becomes completely dry. The variations of lake stages are determined by independent water budgets computed for each lake in the model grid. This lake budget process makes the package a simulator of the response of lake stage to hydraulic stresses applied to the aquifer. Implementation of a lake water budget requires input of parameters including those representing the rate of lake atmospheric recharge and evaporation

  15. Improvement of 3D Scanner

    Institute of Scientific and Technical Information of China (English)

    2003-01-01

    The disadvantages remaining in the 3D scanning system and their causes are discussed. A new host-and-slave structure with a high-speed image acquisition and processing system is proposed to speed up image processing and improve the performance of the 3D scanning system.

  16. 3D Printing for Bricks

    OpenAIRE

    ECT Team, Purdue

    2015-01-01

    Building Bytes, by Brian Peters, is a project that uses desktop 3D printers to print bricks for architecture. Instead of using an expensive custom-made printer, it uses a normal, standard 3D printer that is available to everyone, which makes the approach more accessible and also easier for fabrication.

  17. Supernova Remnant in 3-D

    Science.gov (United States)

    2009-01-01

of the wavelength shift is related to the speed of motion, one can determine how fast the debris is moving in either direction. Because Cas A is the result of an explosion, the stellar debris is expanding radially outwards from the explosion center. Using simple geometry, the scientists were able to construct a 3-D model using all of this information. A program called 3-D Slicer, modified for astronomical use by the Astronomical Medicine Project at Harvard University in Cambridge, Mass., was used to display and manipulate the 3-D model. Commercial software was then used to create the 3-D fly-through. The blue filaments defining the blast wave were not mapped using the Doppler effect because they emit a different kind of light, synchrotron radiation, which does not emit light at discrete wavelengths, but rather in a broad continuum. The blue filaments are only a representation of the actual filaments observed at the blast wave. This visualization shows that there are two main components to this supernova remnant: a spherical component in the outer parts of the remnant and a flattened (disk-like) component in the inner region. The spherical component consists of the outer layer of the star that exploded, probably made of helium and carbon. These layers drove a spherical blast wave into the diffuse gas surrounding the star. The flattened component that astronomers were unable to map into 3-D prior to these Spitzer observations consists of the inner layers of the star. It is made from various heavier elements, not all shown in the visualization, such as oxygen, neon, silicon, sulphur, argon and iron. High-velocity plumes, or jets, of this material are shooting out from the explosion in the plane of the disk-like component mentioned above. Plumes of silicon appear in the northeast and southwest, while those of iron are seen in the southeast and north. These jets were already known and Doppler velocity measurements have been made for these structures, but their orientation and

  18. Couillard: Parallel Programming via Coarse-Grained Data-Flow Compilation

    CERN Document Server

    Marzulo, Leandro A J; França, Felipe M G; Costa, Vítor Santos

    2011-01-01

    Data-flow is a natural approach to parallelism. However, describing dependencies and control between fine-grained data-flow tasks can be complex and present unwanted overheads. TALM (an Architecture and Language for Multi-threading) introduces a user-defined coarse-grained parallel data-flow model, where programmers identify code blocks, called super-instructions, to be run in parallel and connect them in a data-flow graph. TALM has been implemented as a hybrid Von Neumann/data-flow execution system: the Trebuchet. We have observed that TALM's usefulness largely depends on how programmers specify and connect super-instructions. Thus, we present Couillard, a full compiler that creates, based on an annotated C program, a data-flow graph and C code corresponding to each super-instruction. We show that our toolchain allows one to benefit from data-flow execution and explore sophisticated parallel programming techniques, with small effort. To evaluate our system we have executed a set of real...

  19. 3D printing in dentistry.

    Science.gov (United States)

    Dawood, A; Marti Marti, B; Sauret-Jackson, V; Darwood, A

    2015-12-01

    3D printing has been hailed as a disruptive technology which will change manufacturing. Used in aerospace, defence, art and design, 3D printing is becoming a subject of great interest in surgery. The technology has a particular resonance with dentistry, and with advances in 3D imaging and modelling technologies such as cone beam computed tomography and intraoral scanning, and with the relatively long history of the use of CAD CAM technologies in dentistry, it will become of increasing importance. Uses of 3D printing include the production of drill guides for dental implants, the production of physical models for prosthodontics, orthodontics and surgery, the manufacture of dental, craniomaxillofacial and orthopaedic implants, and the fabrication of copings and frameworks for implant and dental restorations. This paper reviews the types of 3D printing technologies available and their various applications in dentistry and in maxillofacial surgery. PMID:26657435

  1. 3DINVER.M: a MATLAB program to invert the gravity anomaly over a 3D horizontal density interface by Parker-Oldenburg's algorithm

    Science.gov (United States)

    Gómez-Ortiz, David; Agarwal, Bhrigu N. P.

    2005-05-01

    A MATLAB source code, 3DINVER.M, is described to compute the 3D geometry of a horizontal density interface from a gridded gravity anomaly by the Parker-Oldenburg iterative method. This procedure is based on a relationship between the Fourier transform of the gravity anomaly and the sum of the Fourier transforms of the powers of the interface topography. Given the mean depth of the density interface and the density contrast between the two media, the three-dimensional geometry of the interface is iteratively calculated. The iterative process is terminated either when the RMS error between two successive approximations falls below a pre-assigned value, used as the convergence criterion, or when a pre-assigned maximum number of iterations is reached. A high-cut filter in the frequency domain has been incorporated to enhance convergence in the iterative process. The algorithm is capable of handling large data sets, requiring only direct and inverse Fourier transforms. The inversion of a gravity anomaly over Brittany (France) to compute the Moho depth is presented as a practical example.
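
    The convergence test described above, the RMS error between two successive interface approximations, is easy to state in code. A minimal sketch (the function name is ours; 3DINVER.M itself is MATLAB):

    ```c
    #include <math.h>

    /* RMS difference between two successive interface approximations
       z_k and z_{k+1}, sampled on the same n grid points. The iteration
       stops when this drops below a preset threshold, or when a maximum
       number of iterations is reached. */
    double rms_diff(const double *a, const double *b, int n) {
        double s = 0.0;
        for (int i = 0; i < n; ++i) {
            double d = a[i] - b[i];
            s += d * d;
        }
        return sqrt(s / n);
    }
    ```

    Each iteration of the method would compute a new depth grid via the Fourier-domain relationship, evaluate rms_diff against the previous grid, and terminate once the value is small enough.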

  2. Creating a 3D Game Character Model

    OpenAIRE

    Paasikivi, Joni

    2014-01-01

    This thesis goes through the process of modeling a low-poly 3D model for a video game project from the perspective of a novice 3D artist. The goal was to prepare a stylized low-polygon model of fewer than 6000 triangles, based on a pre-made design and a living person. The program used in this project was 3ds Max. The process starts with the creation of the reference images for 3ds Max and goes through the modeling of the wireframe model, unwrapping the model for texturing, and crea...

  3. AutoCAD Civil 3D - Survey

    OpenAIRE

    Luketić, Antonio; Padovan, Ivan

    2011-01-01

    AutoCAD Civil 3D is a very complex and advanced program for surveying and civil engineering. The Survey menu represents only a small part of this software. The program offers the possibility of creating custom codes as well as commands that can speed up the visualization of surveyed objects. Using it considerably shortens data-processing time. The article is based on a simple but useful example intended to introduce readers to a rather little-known application offered by AutoCAD Civil 3D. ...

  4. From functional programming to multicore parallelism: A case study based on Presburger Arithmetic

    DEFF Research Database (Denmark)

    Dung, Phan Anh; Hansen, Michael Reichhardt

    2011-01-01

    The overall goal of this work is to study the parallelization of functional programs, with the specific case study of decision procedures for Presburger Arithmetic (PA). PA is a first-order theory of integers accepting addition as its only operation. Whereas it has wide applications in different areas, we are interested in using PA in connection with the Duration Calculus Model Checker (DCMC) [5]. There are effective decision procedures for PA, including Cooper's algorithm and the Omega Test; however, their complexity is extremely high, with a doubly exponential lower bound and a triply exponential upper ... in the SMT-solver Z3 [8], which has the capability of solving Presburger formulas. Functional programming is well suited to the domain of decision procedures, and its immutability feature helps to reduce parallelization effort. While Haskell has progressed with a lot of parallelism-related research [6], we ...

  5. R3D Align: global pairwise alignment of RNA 3D structures using local superpositions

    Science.gov (United States)

    Rahrig, Ryan R.; Leontis, Neocles B.; Zirbel, Craig L.

    2010-01-01

    Motivation: Comparing 3D structures of homologous RNA molecules yields information about sequence and structural variability. To compare large RNA 3D structures, accurate automatic comparison tools are needed. In this article, we introduce a new algorithm and web server to align large homologous RNA structures nucleotide by nucleotide using local superpositions that accommodate the flexibility of RNA molecules. Local alignments are merged to form a global alignment by employing a maximum clique algorithm on a specially defined graph that we call the ‘local alignment’ graph. Results: The algorithm is implemented in a program suite and web server called ‘R3D Align’. The R3D Align alignment of homologous 3D structures of 5S, 16S and 23S rRNA was compared to a high-quality hand alignment. A full comparison of the 16S alignment with the other state-of-the-art methods is also provided. The R3D Align program suite includes new diagnostic tools for the structural evaluation of RNA alignments. The R3D Align alignments were compared to those produced by other programs and were found to be the most accurate, in comparison with a high quality hand-crafted alignment and in conjunction with a series of other diagnostics presented. The number of aligned base pairs as well as measures of geometric similarity are used to evaluate the accuracy of the alignments. Availability: R3D Align is freely available through a web server http://rna.bgsu.edu/R3DAlign. The MATLAB source code of the program suite is also freely available for download at that location. Supplementary information: Supplementary data are available at Bioinformatics online. Contact: r-rahrig@onu.edu PMID:20929913

  6. Parallel Implementation of a Semidefinite Programming Solver based on CSDP in a distributed memory cluster

    OpenAIRE

    Ivanov, I.D.; de Klerk, E.

    2007-01-01

    In this paper we present the algorithmic framework and practical aspects of implementing a parallel version of a primal-dual semidefinite programming solver on a distributed memory computer cluster. Our implementation is based on the CSDP solver and uses a message passing interface (MPI), and the ScaLAPACK library. A new feature is implemented to deal with problems that have rank-one constraint matrices. We show that significant improvement is obtained for a test set of problems with rank one...

  7. EMPIRE: a highly parallel semiempirical molecular orbital program: 2: periodic boundary conditions

    OpenAIRE

    Margraf, Johannes T.; Hennemann, Matthias; Meyer, Bernd; Clark, Timothy

    2015-01-01

    The increasing role of quantum chemical calculations in drug and materials design has led to a demand for methods that can describe the electronic structures of large and complex systems. Semiempirical methods based on the neglect of diatomic differential overlap (NDDO) approximation (e.g., the MNDO, MNDO/d, AM1, AM1*, and PMx methods) are important representatives of such approaches. Many of these methods have been implemented in the massively parallel program EMPIRE, which makes the ful...

  8. 3-D Video Processing for 3-D TV

    Science.gov (United States)

    Sohn, Kwanghoon; Kim, Hansung; Kim, Yongtae

    One of the most desirable ways of realizing high-quality information and telecommunication services has been called "The Sensation of Reality," which can be achieved by visual communication based on 3-D (three-dimensional) images. These kinds of 3-D imaging systems have revealed potential applications in the fields of education, entertainment, medical surgery, video conferencing, etc. In particular, three-dimensional television (3-D TV) is believed to be the next generation of TV technology. Figure 13.1 shows how TV display technologies have evolved, and Fig. 13.2 details the evolution of TV broadcasting as forecasted by ETRI (Electronics and Telecommunications Research Institute). It is clear that 3-D TV broadcasting will be the next development in this field, and realistic broadcasting will soon follow.

  9. Real time 3D scanner: investigations and results

    Science.gov (United States)

    Nouri, Taoufik; Pflug, Leopold

    1993-12-01

    This article presents a concept for the reconstruction of 3-D objects using non-invasive, touchless techniques. The principle of the method is to project parallel interference optical fringes onto an object and then to record the object from two angles of view. With appropriate processing, the 3-D object is reconstructed even when the object has no plane of symmetry. The 3-D surface data are available immediately in digital form for computer visualization and for analysis software tools. The optical set-up for recording the 3-D object, the 3-D data extraction and processing, as well as the reconstruction of the 3-D object are reported and commented on. This application is intended for reconstructive/cosmetic surgery, CAD, animation and research purposes.

  10. Importing a 3D model from an industrial design

    OpenAIRE

    Tran Thi, Thien

    2015-01-01

    In the media industry, sharing and transferring a 3D model to other programs for different stages of design is widely used. The final year project was carried out based on a case study in which a 3D model was imported from an industrial design to Autodesk 3ds Max. The thesis focuses on defining the workflow for importing a third-party 3D model to the 3ds Max program. In general, importing a 3D model made by one program to another one always presents many challenges. The purposes of this s...

  11. ADT-3D Tumor Detection Assistant in 3D

    Directory of Open Access Journals (Sweden)

    Jaime Lazcano Bello

    2008-12-01

    Full Text Available The present document describes ADT-3D (Three-Dimensional Tumor Detector Assistant), a prototype application developed to assist doctors in diagnosing, detecting and locating tumors in the brain using CT scans. In this document the reader may find an introduction to tumor detection; ADT-3D's main goals; development details; a description of the product; the motivation for its development; a study of results; and areas of applicability.

  12. Unassisted 3D camera calibration

    Science.gov (United States)

    Atanassov, Kalin; Ramachandra, Vikas; Nash, James; Goma, Sergio R.

    2012-03-01

    With the rapid growth of 3D technology, 3D image capture has become a critical part of the 3D feature set on mobile phones. 3D image quality is affected by the scene geometry as well as on-the-device processing. An automatic 3D system usually assumes known camera poses accomplished by factory calibration using a special chart. In real life settings, pose parameters estimated by factory calibration can be negatively impacted by movements of the lens barrel due to shaking, focusing, or camera drop. If any of these factors displaces the optical axes of either or both cameras, vertical disparity might exceed the maximum tolerable margin and the 3D user may experience eye strain or headaches. To make 3D capture more practical, one needs to consider unassisted (on arbitrary scenes) calibration. In this paper, we propose an algorithm that relies on detection and matching of keypoints between left and right images. Frames containing erroneous matches, along with frames with insufficiently rich keypoint constellations, are detected and discarded. Roll, pitch, yaw, and scale differences between left and right frames are then estimated. The algorithm performance is evaluated in terms of the remaining vertical disparity as compared to the maximum tolerable vertical disparity.

  13. DNA origami design of 3D nanostructures

    DEFF Research Database (Denmark)

    Andersen, Ebbe Sloth; Nielsen, Morten Muhlig

    2009-01-01

    Structural DNA nanotechnology has been heavily dependent on the development of dedicated software tools for the design of unique helical junctions, to define unique sticky-ends for tile assembly, and for predicting the products of the self-assembly reaction of multiple DNA strands [1-3]. Recently, several dedicated 3D editors for computer-aided design of DNA structures have been developed [4-7]. However, many of these tools are not efficient for designing DNA origami structures, which require the design of more than 200 unique DNA strands to be folded along a scaffold strand into a defined 3D shape [8]. We have recently developed a semi-automated DNA origami software package [9] that uses a 2D sequence editor in conjunction with several automated tools to facilitate the design process. Here we extend the use of the program to designing DNA origami structures in 3D and show the application ...

  14. The interactive presentation of 3D information obtained from reconstructed datasets and 3D placement of single histological sections with the 3D portable document format

    OpenAIRE

    de Boer, B. A.; Soufan, A. T.; Hagoort, J.; Mohun, T. J.; van den Hoff, M. J. B.; Hasman, A.; Voorbraak, F. P. J. M.; Moorman, A. F. M.; Ruijter, J.M.

    2011-01-01

    Interpretation of the results of anatomical and embryological studies relies heavily on proper visualization of complex morphogenetic processes and patterns of gene expression in a three-dimensional (3D) context. However, reconstruction of complete 3D datasets is time consuming and often researchers study only a few sections. To help in understanding the resulting 2D data we developed a program (TRACTS) that places such arbitrary histological sections into a high-resolution 3D model of the de...

  15. Analysis of Magnetic Flux Leakage of Broken Wire of Parallel Wire Strands Based on 3D Finite Element

    Institute of Scientific and Technical Information of China (English)

    贲安然; 武新军; 袁建明; 徐志远

    2011-01-01

    Magnetic flux leakage (MFL) testing technology has been applied to the inspection of in-service cables with the development of bridge inspection. Owing to the structural complexity of cables, MFL testing of cables is characterized by large diameter and large liftoff, which raise the background magnetic field. An MFL testing model is established using the 3D finite element simulation method. The true leakage-field signals are obtained by subtracting the defect-free magnetic field signal from the defective field signals. The relationships between the distribution of the MFL field and parameters such as the broken wire's gap width and burial depth are obtained, and curves of the MFL field distribution against these parameters are given. The finite element simulation provides a basis for the interpretation of MFL testing results and plays an important role in the quantitative analysis of broken wires.

  16. MIST: An Open Source Environmental Modelling Programming Language Incorporating Easy to Use Data Parallelism.

    Science.gov (United States)

    Bellerby, Tim

    2014-05-01

    Model Integration System (MIST) is an open-source environmental modelling programming language that directly incorporates data parallelism. The language is designed to enable straightforward programming structures, such as nested loops and conditional statements, to be directly translated into sequences of whole-array (or more generally whole-data-structure) operations. MIST thus enables the programmer to use well-understood constructs, directly relating to the mathematical structure of the model, without having to explicitly vectorize code or worry about details of parallelization. A range of common modelling operations are supported by dedicated language structures operating on cell neighbourhoods rather than individual cells (e.g., the 3x3 local neighbourhood needed to implement an averaging image filter can be simply accessed from within a simple loop traversing all image pixels). This facility hides details of inter-process communication behind more mathematically relevant descriptions of model dynamics. The MIST automatic vectorization/parallelization process serves both to distribute work among available nodes and, separately, to control storage requirements for intermediate expressions, enabling operations on very large domains for which memory availability may be an issue. MIST is designed to facilitate efficient interpreter-based implementations. A prototype open-source interpreter is available, coded in standard FORTRAN 95, with tools to rapidly integrate existing FORTRAN 77 or 95 code libraries. The language is formally specified and thus not limited to FORTRAN implementation or to an interpreter-based approach. A MIST-to-FORTRAN compiler is under development and volunteers are sought to create an ANSI C implementation. Parallel processing is currently implemented using OpenMP. However, the parallelization code is fully modularised and could be replaced with implementations using other libraries. GPU implementation is potentially possible.
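The kind of loop-to-whole-array translation that MIST automates can be sketched in NumPy (an illustrative analogy, not MIST code): the explicit 3x3 averaging loop mentioned in the abstract, and the equivalent whole-array formulation that a vectorizing translator would emit.

```python
import numpy as np

def mean3x3_loop(img):
    """Explicit per-pixel loop: the form the modeller writes."""
    out = np.empty((img.shape[0] - 2, img.shape[1] - 2))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = img[i:i + 3, j:j + 3].mean()
    return out

def mean3x3_vectorized(img):
    """Whole-array form: nine shifted views summed, no explicit pixel loop."""
    h, w = img.shape
    s = sum(img[di:h - 2 + di, dj:w - 2 + dj]
            for di in range(3) for dj in range(3))
    return s / 9.0
```

Both functions compute the same interior-pixel average; the second expresses it as array-wide operations that parallelize trivially, which is the transformation MIST performs automatically.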

  17. Handbook of 3D integration

    CERN Document Server

    Garrou, Philip; Ramm, Peter

    2014-01-01

    Edited by key figures in 3D integration and written by top authors from high-tech companies and renowned research institutions, this book covers the intricate details of 3D process technology.As such, the main focus is on silicon via formation, bonding and debonding, thinning, via reveal and backside processing, both from a technological and a materials science perspective. The last part of the book is concerned with assessing and enhancing the reliability of the 3D integrated devices, which is a prerequisite for the large-scale implementation of this emerging technology. Invaluable reading fo

  18. Tuotekehitysprojekti: 3D-tulostin

    OpenAIRE

    Pihlajamäki, Janne

    2011-01-01

    This thesis examined the technology of 3D printing and walked through a product development project for a 3D printer. In addition, the product development process and possible methods of protecting the resulting designs were presented at a general level. The goal of this work was to develop the home-printer-class 3D device technology already available on the market closer to a professional-level solution. This was pursued by focusing on improving the printing accuracy and speed achievable with the device...

  19. 3D on the internet

    OpenAIRE

    Puntar, Matej

    2012-01-01

    The purpose of this thesis is the presentation of already established and new technologies for displaying 3D content in a web browser. The thesis begins with a short presentation of the history of 3D content on the internet and its development, together with the advantages and disadvantages of the individual technologies. These are described in detail, as well as their use and the differences among them. Special emphasis has been given to WebGL, the newest technology for 3D conte...

  20. Color 3D Reverse Engineering

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    This paper presents a principle and a method of color 3D laser scanning measurement. Based on the fundamental monochrome 3D measurement study, color information capture, color texture mapping, coordinate computation and other techniques are performed to achieve color 3D measurement. The system is designed and composed of a line laser light emitter, one color CCD camera, a motor-driven rotary filter, a circuit card and a computer. Two steps in capturing the object's images in the measurement process: Firs...

  1. Exploration of 3D Printing

    OpenAIRE

    Lin, Zeyu

    2014-01-01

    3D printing technology is introduced and defined in this thesis. Some methods of 3D printing are illustrated and their principles are explained with pictures. Most of the essential parts are presented with pictures and their effects within the whole system are explained. Problems on the Up! Plus 3D printer are solved and a DIY product is made with this machine. The process of making the product is recorded, and the items which need attention during the process are the highlight of this th...

  2. 3-D Force-balanced Magnetospheric Configurations

    International Nuclear Information System (INIS)

    The knowledge of plasma pressure is essential for many physics applications in the magnetosphere, such as computing magnetospheric currents and deriving magnetosphere-ionosphere coupling. A thorough knowledge of the 3-D pressure distribution has however eluded the community, as most in-situ pressure observations are either in the ionosphere or the equatorial region of the magnetosphere. With the assumption of pressure isotropy there have been attempts to obtain the pressure at different locations by either (a) mapping observed data (e.g., in the ionosphere) along the field lines of an empirical magnetospheric field model or (b) computing a pressure profile in the equatorial plane (in 2-D) or along the Sun-Earth axis (in 1-D) that is in force balance with the magnetic stresses of an empirical model. However, the pressure distributions obtained through these methods are not in force balance with the empirical magnetic field at all locations. In order to find a global 3-D plasma pressure distribution in force balance with the magnetospheric magnetic field, we have developed the MAG-3D code, which solves the 3-D force balance equation J × B = ∇P computationally. Our calculation is performed in a flux coordinate system in which the magnetic field is expressed in terms of Euler potentials as B = ∇psi × ∇alpha. The pressure distribution, P = P(psi, alpha), is prescribed in the equatorial plane and is based on satellite measurements. In addition, computational boundary conditions for psi surfaces are imposed using empirical field models. Our results provide 3-D distributions of magnetic field and plasma pressure as well as parallel and transverse currents for both quiet-time and disturbed magnetospheric conditions.
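In standard notation, the force-balance relations quoted in the abstract read:

```latex
\mathbf{J} \times \mathbf{B} = \nabla P, \qquad
\mathbf{B} = \nabla\psi \times \nabla\alpha, \qquad
P = P(\psi, \alpha)
```

Since B = ∇psi × ∇alpha is perpendicular to both ∇psi and ∇alpha, it follows that B·∇P = P_psi (B·∇psi) + P_alpha (B·∇alpha) = 0: pressure is constant along each field line, which is why prescribing P(psi, alpha) in the equatorial plane determines it everywhere along the field lines crossing that plane.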

  3. The Mathematical Method of Simplifying Level of Detail Model in 3D Game Programming

    Institute of Scientific and Technical Information of China (English)

    张云苑

    2013-01-01

    3D video games that use realistic characters and scene graphics to reproduce the real world have become mainstream electronic games. Simplifying level-of-detail models through programmed algorithms is a technique for applying real-time 3D graphics to game characters and scene graphics. This article discusses mathematical approaches to simplifying models at different levels of detail: by reducing the complexity of a game's characters or scene graphics, real-time 3D rendering requirements can be met while achieving the visual quality of a realistic 3D game.
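One common criterion for switching between simplified models is distance-scaled screen-space error: a coarser mesh is acceptable once the pixels its geometric error would occupy fall below a tolerance. The sketch below is illustrative (the function, the pinhole-projection error model, and all parameter values are assumptions, not the article's specific method).

```python
def select_lod(errors, distance, focal=800.0, tau=1.0):
    """Pick a level of detail for rendering.

    errors[i]  -- geometric error (model units) of LOD i, increasing with i
                  (LOD 0 = full-detail mesh, higher = coarser simplification).
    distance   -- camera-to-object distance in the same units.
    focal      -- focal length in pixels (pinhole camera model).
    tau        -- tolerated screen-space error in pixels.

    Returns the coarsest LOD whose projected error stays under tau.
    """
    best = 0
    for lod, err in enumerate(errors):
        # projected size (pixels) of a model-space error at this distance
        if err * focal / distance <= tau:
            best = lod
    return best
```

As the object recedes, the same geometric error projects to fewer pixels, so progressively coarser models pass the test; this is the runtime half of the LOD pipeline, complementing the offline mesh-simplification mathematics the article discusses.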

  4. A Model for Managing 3D Printing Services in Academic Libraries

    Science.gov (United States)

    Scalfani, Vincent F.; Sahib, Josh

    2013-01-01

    The appearance of 3D printers in university libraries opens many opportunities for advancing outreach, teaching, and research programs. The University of Alabama (UA) Libraries recently adopted 3D printing technology and maintains an open access 3D Printing Studio. The Studio consists of a 3D printer, multiple 3D design workstations, and other…

  5. 3D CT Image-Guided Parallel Mechanism-Assisted Femur Fracture Reduction

    Institute of Scientific and Technical Information of China (English)

    龚敏丽; 徐颖; 唐佩福; 胡磊; 杜海龙; 吕振天; 姚腾洲

    2011-01-01

    Traditional clinical femur fracture reduction surgery is imperfect and often results in misalignment and high intraoperative radiation exposure. To address this, a fracture reduction method is proposed based on preoperative CT images and a 6-degree-of-freedom parallel mechanism fixed to the injured femur. Based on the body's symmetry principle, the method uses a mirror image of the patient's healthy contralateral femur as the standard to guide reduction of femoral shaft fractures. Through twelve markers on the actuating mechanism, the computer calculates the length of each strut in virtual space in real time. Finally, animal bone experiments demonstrated the effectiveness of the approach.

  6. Materialedreven 3d digital formgivning

    DEFF Research Database (Denmark)

    Hansen, Flemming Tvede

    2010-01-01

    ...a traditional ceramic production context. The problem furthermore encouraged, in collaboration with a programmer, the development of a 3D digital tool, which has been called a digital interactive shaping tool (DIF). The experiment investigates interactive 3D digital dynamic systems that... collaborate with designers from fields such as interaction design and programming. The dissertation points to a future research field within generative and responsive digital systems for 3D shaping that also includes the sense of touch. Furthermore, it is relevant to research how the RP techniques... shaping and Rapid Prototyping (RP). RP is a collective term for a number of techniques that make it possible to transfer a digital form into 3D physical form. The research project concentrates on two overarching research questions. The first concerns how knowledge and experience within the ceramic...

  7. High performance parallel computers for science: New developments at the Fermilab advanced computer program

    Energy Technology Data Exchange (ETDEWEB)

    Nash, T.; Areti, H.; Atac, R.; Biel, J.; Cook, A.; Deppe, J.; Edel, M.; Fischler, M.; Gaines, I.; Hance, R.

    1988-08-01

    Fermilab's Advanced Computer Program (ACP) has been developing highly cost-effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors, each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor has been developed for theoretical physics. Each $4000 node is a FORTRAN- or C-programmable pipelined 20 MFlops (peak), 10 MByte single board computer. These are plugged into a 16-port crossbar switch crate which handles both inter- and intra-crate communication. The crates are connected in a hypercube. Site-oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256-node, 5 GFlop system is under construction. 10 refs., 7 figs.

  8. The FORCE: A portable parallel programming language supporting computational structural mechanics

    Science.gov (United States)

    Jordan, Harry F.; Benten, Muhammad S.; Brehm, Juergen; Ramanan, Aruna

    1989-01-01

    This project supports the conversion of codes in Computational Structural Mechanics (CSM) to a parallel form which will efficiently exploit the computational power available from multiprocessors. The work is part of a comprehensive, FORTRAN-based system to form a basis for a parallel version of the NICE/SPAR combination which will form the CSM Testbed. The software is macro-based and rests on the Force methodology developed by the principal investigator in connection with an early scientific multiprocessor. Machine independence is an important characteristic of the system, so retargeting it to the Flex/32, or any other multiprocessor on which NICE/SPAR might be implemented, is well supported. The principal investigator has experience in producing parallel software for both full and sparse systems of linear equations using the Force macros. Other researchers have used the Force in finite element programs. It has been possible to rapidly develop software which performs at maximum efficiency on a multiprocessor. The inherent machine independence of the system also means that the parallelization will not be limited to a specific multiprocessor.

  9. 3D Face Apperance Model

    DEFF Research Database (Denmark)

    Lading, Brian; Larsen, Rasmus; Astrom, K

    2006-01-01

    We build a 3D face shape model, including inter- and intra-shape variations, derive the analytical Jacobian of its resulting 2D rendered image, and show examples of its fitting performance with light, pose, id, expression and texture variations.

  10. Main: TATCCAYMOTIFOSRAMY3D [PLACE

    Lifescience Database Archive (English)

    Full Text Available TATCCAYMOTIFOSRAMY3D S000256 01-August-2006 (last modified) kehi TATCCAY motif found in rice (O.s.) RAmy3D alpha-amylase gene promoter; Y=T/C; a GATA motif as its antisense sequence; TATCCAY motif and G motif (see S000130) are responsible for sugar repression (Toyofuku et al. 1998); GATA; amylase; sugar; repression; rice (Oryza sativa)

  11. Combinatorial 3D Mechanical Metamaterials

    Science.gov (United States)

    Coulais, Corentin; Teomy, Eial; de Reus, Koen; Shokef, Yair; van Hecke, Martin

    2015-03-01

    We present a class of elastic structures which exhibit 3D-folding motion. Our structures consist of cubic lattices of anisotropic unit cells that can be tiled in a complex combinatorial fashion. We design and 3d-print this complex ordered mechanism, in which we combine elastic hinges and defects to tailor the mechanics of the material. Finally, we use this large design space to encode smart functionalities such as surface patterning and multistability.

  12. AI 3D Cybug Gaming

    CERN Document Server

    Ahmed, Zeeshan

    2010-01-01

    In this short paper I briefly discuss a 3D war game based on artificial intelligence concepts, called AI WAR. Going into the details, I present the importance of the CAICL language and how this language is used in AI WAR. Moreover, I also present a designed and implemented 3D war cybug for AI WAR using CAICL and discuss the implemented strategy to defeat its enemies during the game life.

  13. 3D Face Appearance Model

    DEFF Research Database (Denmark)

    Lading, Brian; Larsen, Rasmus; Åström, Kalle

    2006-01-01

    We build a 3D face shape model, including inter- and intra-shape variations, derive the analytical Jacobian of its resulting 2D rendered image, and show examples of its fitting performance with light, pose, id, expression and texture variations.

  14. GPU-based rapid reconstruction of cellular 3D refractive index maps from tomographic phase microscopy (Conference Presentation)

    Science.gov (United States)

    Dardikman, Gili; Shaked, Natan T.

    2016-03-01

    We present highly parallel and efficient algorithms for real-time reconstruction of the quantitative three-dimensional (3-D) refractive-index maps of biological cells without labeling, as obtained from the interferometric projections acquired by tomographic phase microscopy (TPM). The new algorithms are implemented on the graphics processing unit (GPU) of the computer using the CUDA programming environment. The reconstruction process includes two main parts. First, we use parallel complex wave-front reconstruction of the TPM-based interferometric projections acquired at various angles. The complex wave-front reconstructions are done on the GPU in parallel, while minimizing the calculation time of the Fourier transforms and phase unwrapping needed. Next, we implemented on the GPU in parallel the 3-D refractive index map retrieval using the TPM filtered back-projection algorithm. The incorporation of algorithms that are inherently parallel with a programming environment such as Nvidia's CUDA makes it possible to obtain a real-time processing rate, and enables a high-throughput platform for label-free, 3-D cell visualization and diagnosis.
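The per-projection filtering stage of filtered back-projection is embarrassingly parallel across angles, which is what makes the GPU mapping natural. A CPU-side sketch of that structure (an illustrative NumPy/threads stand-in for CUDA kernels; the `ramp_filter` name and row-per-angle sinogram layout are assumptions):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def ramp_filter(projection):
    """Apply a Ram-Lak (ramp) filter to one 1-D projection in the Fourier domain."""
    freqs = np.fft.fftfreq(projection.size)
    return np.real(np.fft.ifft(np.fft.fft(projection) * np.abs(freqs)))

def filter_sinogram_parallel(sinogram, workers=4):
    """Filter every projection angle independently -- data-parallel across rows."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        rows = list(pool.map(ramp_filter, sinogram))  # one task per angle
    return np.vstack(rows)
```

On a GPU each row (and indeed each frequency bin) becomes a thread; the back-projection step that follows is similarly parallel over output voxels.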

  15. Forward ramp in 3D

    Science.gov (United States)

    1997-01-01

    Mars Pathfinder's forward rover ramp can be seen successfully unfurled in this image, taken in stereo by the Imager for Mars Pathfinder (IMP) on Sol 3. 3D glasses are necessary to identify surface detail. This ramp was not used for the deployment of the microrover Sojourner, which occurred at the end of Sol 2. When this image was taken, Sojourner was still latched to one of the lander's petals, waiting for the command sequence that would execute its descent off of the lander's petal.The image helped Pathfinder scientists determine whether to deploy the rover using the forward or backward ramps and the nature of the first rover traverse. The metallic object at the lower left of the image is the lander's low-gain antenna. The square at the end of the ramp is one of the spacecraft's magnetic targets. Dust that accumulates on the magnetic targets will later be examined by Sojourner's Alpha Proton X-Ray Spectrometer instrument for chemical analysis. At right, a lander petal is visible.The IMP is a stereo imaging system with color capability provided by 24 selectable filters -- twelve filters per 'eye.' It stands 1.8 meters above the Martian surface, and has a resolution of two millimeters at a range of two meters.Mars Pathfinder is the second in NASA's Discovery program of low-cost spacecraft with highly focused science goals. The Jet Propulsion Laboratory, Pasadena, CA, developed and manages the Mars Pathfinder mission for NASA's Office of Space Science, Washington, D.C. JPL is an operating division of the California Institute of Technology (Caltech). The Imager for Mars Pathfinder (IMP) was developed by the University of Arizona Lunar and Planetary Laboratory under contract to JPL. Peter Smith is the Principal Investigator.

  16. Performance Evaluation of Parallel Message Passing and Thread Programming Model on Multicore Architectures

    CERN Document Server

    Hasta, D T

    2010-01-01

    The current trend of multicore architectures on shared memory systems underscores the need for parallelism. While there are several programming models for expressing parallelism, the thread programming model has become a standard for supporting these systems, for example OpenMP and POSIX threads. MPI (Message Passing Interface), which remains the dominant model used in high-performance computing today, faces this challenge. The previous version of MPI, MPI-1, has no shared memory concept, and the current version, MPI-2, has only limited support for shared memory systems. In this research, the MPI-2 version of MPI will be compared with OpenMP to see how well MPI performs on multicore / SMP (Symmetric Multiprocessor) machines. The comparison between OpenMP, for the thread programming model, and MPI, for the message passing programming model, will be conducted on multicore shared memory machine architectures to see which has better performance in terms of speed and throughput. The application used to assess the scalability of the evaluated parall...
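The two programming models being compared can be caricatured in one Python sketch: threads updating shared storage stand in for the OpenMP style, while workers that communicate only by sending partial results through a queue stand in for the MPI style. This is purely illustrative of the structural difference, not a performance benchmark.

```python
import threading
import queue

def shared_memory_sum(data, workers=4):
    """OpenMP-style: workers write into shared storage (one slot each, no lock needed)."""
    partial = [0] * workers
    def work(rank):
        partial[rank] = sum(data[rank::workers])  # strided chunk of shared data
    threads = [threading.Thread(target=work, args=(r,)) for r in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partial)

def message_passing_sum(data, workers=4):
    """MPI-style: workers own disjoint chunks and send partial sums as messages."""
    inbox = queue.Queue()
    def work(rank):
        inbox.put(sum(data[rank::workers]))  # 'MPI_Send' of a partial result
    threads = [threading.Thread(target=work, args=(r,)) for r in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(inbox.get() for _ in range(workers))  # 'MPI_Reduce' at the root
```

The shared-memory version relies on all workers seeing the same array; the message-passing version would work unchanged if the workers lived in separate address spaces, which is exactly the portability/performance trade-off the study measures.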

  17. MPML3D: Scripting Agents for the 3D Internet.

    Science.gov (United States)

    Prendinger, Helmut; Ullrich, Sebastian; Nakasone, Arturo; Ishizuka, Mitsuru

    2011-05-01

    The aim of this paper is two-fold. First, it describes a scripting language for specifying communicative behavior and interaction of computer-controlled agents ("bots") in the popular three-dimensional (3D) multiuser online world of "Second Life" and the emerging "OpenSimulator" project. While tools for designing avatars and in-world objects in Second Life exist, technology for nonprogrammer content creators of scenarios involving scripted agents is currently missing. Therefore, we have implemented new client software that controls bots based on the Multimodal Presentation Markup Language 3D (MPML3D), a highly expressive XML-based scripting language for controlling the verbal and nonverbal behavior of interacting animated agents. Second, the paper compares Second Life and OpenSimulator platforms and discusses the merits and limitations of each from the perspective of agent control. Here, we also conducted a small study that compares the network performance of both platforms.

  18. Speeding up matrix element computations by means of MPI parallel programming

    International Nuclear Information System (INIS)

    The solution of some quantum-mechanical problems requires computing large sets of matrix elements involving non-factorable two-dimensional integrals. In the present paper we discuss effective means, using MPI parallel programming, of speeding up the computation of large sets of such matrix elements. To make a sound comparison of the computing times, we have developed the sequential code PotentialPC and the parallel code PotentialMPI, able to solve the problems of interest on a personal computer and on a multi-core cluster, respectively. Investigation of case-study problems involving matrix elements of specific operators over oscillatory bases showed a gain in CPU time exceeding two orders of magnitude.
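The structure of such a computation — many independent matrix elements, each a non-factorable 2-D integral — parallelizes naturally over the element indices. A hedged sketch of that structure (illustrative integrand, midpoint quadrature, and thread pool standing in for MPI ranks; none of this is the paper's PotentialMPI code):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor
from itertools import product

N = 64                                   # midpoint-rule nodes per axis
x = (np.arange(N) + 0.5) / N             # midpoints on [0, 1]
X, Y = np.meshgrid(x, x, indexing="ij")
W = 1.0 / (N * N)                        # uniform midpoint weight

def element(ij):
    """One matrix element M[i, j] as a 2-D integral over an oscillatory basis."""
    i, j = ij
    # exp(-x*y) couples the variables, so the integral does not factor
    f = np.sin((i + 1) * np.pi * X) * np.sin((j + 1) * np.pi * Y) * np.exp(-X * Y)
    return f.sum() * W

def matrix(order, workers=4):
    """Distribute the independent (i, j) elements across workers."""
    pairs = list(product(range(order), repeat=2))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        vals = list(pool.map(element, pairs))
    return np.array(vals).reshape(order, order)
```

In an MPI code each rank would own a block of (i, j) pairs and the root would gather the results; the independence of the elements is what yields the near-linear speedups the abstract reports.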

  19. Emerging Applications of Bedside 3D Printing in Plastic Surgery.

    Science.gov (United States)

    Chae, Michael P; Rozen, Warren M; McMenamin, Paul G; Findlay, Michael W; Spychal, Robert T; Hunter-Smith, David J

    2015-01-01

    Modern imaging techniques are an essential component of preoperative planning in plastic and reconstructive surgery. However, conventional modalities, including three-dimensional (3D) reconstructions, are limited by their representation on 2D workstations. 3D printing, also known as rapid prototyping or additive manufacturing, was once the province of industry to fabricate models from a computer-aided design (CAD) in a layer-by-layer manner. The early adopters in clinical practice have embraced the medical imaging-guided 3D-printed biomodels for their ability to provide tactile feedback and a superior appreciation of visuospatial relationship between anatomical structures. With increasing accessibility, investigators are able to convert standard imaging data into a CAD file using various 3D reconstruction softwares and ultimately fabricate 3D models using 3D printing techniques, such as stereolithography, multijet modeling, selective laser sintering, binder jet technique, and fused deposition modeling. However, many clinicians have questioned whether the cost-to-benefit ratio justifies its ongoing use. The cost and size of 3D printers have rapidly decreased over the past decade in parallel with the expiration of key 3D printing patents. Significant improvements in clinical imaging and user-friendly 3D software have permitted computer-aided 3D modeling of anatomical structures and implants without outsourcing in many cases. These developments offer immense potential for the application of 3D printing at the bedside for a variety of clinical applications. In this review, existing uses of 3D printing in plastic surgery practice spanning the spectrum from templates for facial transplantation surgery through to the formation of bespoke craniofacial implants to optimize post-operative esthetics are described. Furthermore, we discuss the potential of 3D printing to become an essential office-based tool in plastic surgery to assist in preoperative planning, developing

  20. Emerging Applications of Bedside 3D Printing in Plastic Surgery

    Directory of Open Access Journals (Sweden)

    Michael P Chae

    2015-06-01

    Full Text Available Modern imaging techniques are an essential component of preoperative planning in plastic and reconstructive surgery. However, conventional modalities, including three-dimensional (3D reconstructions, are limited by their representation on 2D workstations. 3D printing has been embraced by early adopters to produce medical imaging-guided 3D printed biomodels that facilitate various aspects of clinical practice. The cost and size of 3D printers have rapidly decreased over the past decade in parallel with the expiration of key 3D printing patents. With increasing accessibility, investigators are now able to convert standard imaging data into Computer Aided Design (CAD files using various 3D reconstruction softwares and ultimately fabricate 3D models using 3D printing techniques, such as stereolithography (SLA, multijet modeling (MJM, selective laser sintering (SLS, binder jet technique (BJT, and fused deposition modeling (FDM. Significant improvements in clinical imaging and user-friendly 3D software have permitted computer-aided 3D modeling of anatomical structures and implants without out-sourcing in many cases. These developments offer immense potential for the application of 3D printing at the bedside for a variety of clinical applications. However, many clinicians have questioned whether the cost-to-benefit ratio justifies its ongoing use. In this review the existing uses of 3D printing in plastic surgery practice, spanning the spectrum from templates for facial transplantation surgery through to the formation of bespoke craniofacial implants to optimize post-operative aesthetics, are described. Furthermore, we discuss the potential of 3D printing to become an essential office-based tool in plastic surgery to assist in preoperative planning, patient and surgical trainee education, and the development of intraoperative guidance tools and patient-specific prosthetics in everyday surgical practice.

  1. FRAME3D V2.0 - A Program for the Three-Dimensional Nonlinear Time-History Analysis of Steel Structures: User Guide

    OpenAIRE

    Krishnan, Swaminathan

    2009-01-01

    This is Version 2.0 of the user guide and should be used along with Version 2.0 of the program. Updates include: 1. Realistic PMM interaction surfaces for plastic hinge elements (output file PMM). 2. 5-Segment modified elastofiber element for brace and slender column modeling. 3. Eigen value problem solver using subspace iteration (output files MODES and EIGEN). 4. Output the sum of forces of groups of elements (output file ELMGRPRES). Additional input is required as a result of t...

  2. From 3D view to 3D print

    Science.gov (United States)

    Dima, M.; Farisato, G.; Bergomi, M.; Viotto, V.; Magrin, D.; Greggio, D.; Farinato, J.; Marafatto, L.; Ragazzoni, R.; Piazza, D.

    2014-08-01

    In the last few years 3D printing has become more and more popular and is used in many fields, from manufacturing to industrial design, architecture, medical support and aerospace. 3D printing is an evolution of two-dimensional printing which allows one to obtain a solid object from a 3D model, realized with 3D modelling software. The final product is obtained using an additive process, in which successive layers of material are laid down one over the other. A 3D printer makes it possible to realize, in a simple way, very complex shapes which would be quite difficult to produce with dedicated conventional facilities. Since the 3D print is obtained by superposing one layer on the others, it doesn't need any particular workflow: it is sufficient to simply draw the model and send it to print. Many different kinds of 3D printers exist, based on the technology and material used for layer deposition. A common material is ABS plastic, a light and rigid thermoplastic polymer whose mechanical properties make it widely used in several fields, like pipe production and car interior manufacturing. I used this technology to create a 1:1 scale model of the telescope which is the hardware core of the small space mission CHEOPS (CHaracterising ExOPlanets Satellite) by ESA, which aims to characterize exoplanets via transit observations. The telescope has a Ritchey-Chrétien configuration with a 30 cm aperture and the launch is foreseen in 2017. In this paper, I present the different phases of the realization of such a model, focusing on the pros and cons of this kind of technology. For example, because of the finite printable volume (10×10×12 inches in the x, y and z directions respectively), it was necessary to split the largest parts of the instrument into smaller components to be then reassembled and post-processed. A further issue is the resolution of the printed material, which is expressed in terms of layers

  3. PLOT3D/AMES, UNIX SUPERCOMPUTER AND SGI IRIS VERSION (WITHOUT TURB3D)

    Science.gov (United States)

    Buning, P.

    1994-01-01

    PLOT3D is an interactive graphics program designed to help scientists visualize computational fluid dynamics (CFD) grids and solutions. Today, supercomputers and CFD algorithms can provide scientists with simulations of such highly complex phenomena that obtaining an understanding of the simulations has become a major problem. Tools which help the scientist visualize the simulations can be of tremendous aid. PLOT3D/AMES offers more functions and features, and has been adapted for more types of computers than any other CFD graphics program. Version 3.6b+ is supported for five computers and graphic libraries. Using PLOT3D, CFD physicists can view their computational models from any angle, observing the physics of problems and the quality of solutions. As an aid in designing aircraft, for example, PLOT3D's interactive computer graphics can show vortices, temperature, reverse flow, pressure, and dozens of other characteristics of air flow during flight. As critical areas become obvious, they can easily be studied more closely using a finer grid. PLOT3D is part of a computational fluid dynamics software cycle. First, a program such as 3DGRAPE (ARC-12620) helps the scientist generate computational grids to model an object and its surrounding space. Once the grids have been designed and parameters such as the angle of attack, Mach number, and Reynolds number have been specified, a "flow-solver" program such as INS3D (ARC-11794 or COS-10019) solves the system of equations governing fluid flow, usually on a supercomputer. Grids sometimes have as many as two million points, and the "flow-solver" produces a solution file which contains density, x- y- and z-momentum, and stagnation energy for each grid point. With such a solution file and a grid file containing up to 50 grids as input, PLOT3D can calculate and graphically display any one of 74 functions, including shock waves, surface pressure, velocity vectors, and particle traces. PLOT3D's 74 functions are organized into
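The solution-file layout described above (density, three momentum components and stagnation energy per grid point) can be sketched with a minimal writer/reader. This is an illustrative single-grid binary layout, not the exact PLOT3D format, which exists in several variants (multi-grid, Fortran record markers, single or double precision); the function names are my own.

```python
import os
import tempfile

import numpy as np

def write_q(fname, q):
    """Write a single-grid solution array q of shape (5, ni, nj, nk):
    density, x-, y-, z-momentum and stagnation energy per grid point.
    Illustrative layout only -- real PLOT3D q files have several variants."""
    ni, nj, nk = q.shape[1:]
    with open(fname, "wb") as f:
        np.array([ni, nj, nk], dtype=np.int32).tofile(f)
        q.astype(np.float32).tofile(f)

def read_q(fname):
    """Read back the layout produced by write_q."""
    with open(fname, "rb") as f:
        ni, nj, nk = np.fromfile(f, dtype=np.int32, count=3)
        return np.fromfile(f, dtype=np.float32).reshape(5, ni, nj, nk)

# Round-trip a small synthetic solution through a temporary file.
path = os.path.join(tempfile.gettempdir(), "demo_q.bin")
q = np.random.rand(5, 4, 3, 2).astype(np.float32)
write_q(path, q)
q2 = read_q(path)
```

A derived function such as pressure or Mach number would then be computed per grid point from these five conserved quantities, which is essentially what PLOT3D's function menu does.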

  4. PLOT3D/AMES, UNIX SUPERCOMPUTER AND SGI IRIS VERSION (WITH TURB3D)

    Science.gov (United States)

    Buning, P.

    1994-01-01

    PLOT3D is an interactive graphics program designed to help scientists visualize computational fluid dynamics (CFD) grids and solutions. Today, supercomputers and CFD algorithms can provide scientists with simulations of such highly complex phenomena that obtaining an understanding of the simulations has become a major problem. Tools which help the scientist visualize the simulations can be of tremendous aid. PLOT3D/AMES offers more functions and features, and has been adapted for more types of computers than any other CFD graphics program. Version 3.6b+ is supported for five computers and graphic libraries. Using PLOT3D, CFD physicists can view their computational models from any angle, observing the physics of problems and the quality of solutions. As an aid in designing aircraft, for example, PLOT3D's interactive computer graphics can show vortices, temperature, reverse flow, pressure, and dozens of other characteristics of air flow during flight. As critical areas become obvious, they can easily be studied more closely using a finer grid. PLOT3D is part of a computational fluid dynamics software cycle. First, a program such as 3DGRAPE (ARC-12620) helps the scientist generate computational grids to model an object and its surrounding space. Once the grids have been designed and parameters such as the angle of attack, Mach number, and Reynolds number have been specified, a "flow-solver" program such as INS3D (ARC-11794 or COS-10019) solves the system of equations governing fluid flow, usually on a supercomputer. Grids sometimes have as many as two million points, and the "flow-solver" produces a solution file which contains density, x- y- and z-momentum, and stagnation energy for each grid point. With such a solution file and a grid file containing up to 50 grids as input, PLOT3D can calculate and graphically display any one of 74 functions, including shock waves, surface pressure, velocity vectors, and particle traces. PLOT3D's 74 functions are organized into

  5. FARGO3D: A New GPU-oriented MHD Code

    Science.gov (United States)

    Benítez-Llambay, Pablo; Masset, Frédéric S.

    2016-03-01

    We present the FARGO3D code, recently publicly released. It is a magnetohydrodynamics code developed with special emphasis on the physics of protoplanetary disks and planet-disk interactions, and parallelized with MPI. The hydrodynamics algorithms are based on finite-difference upwind, dimensionally split methods. The magnetohydrodynamics algorithms consist of the constrained transport method to preserve the divergence-free property of the magnetic field to machine accuracy, coupled to a method of characteristics for the evaluation of electromotive forces and Lorentz forces. Orbital advection is implemented, and an N-body solver is included to simulate planets or stars interacting with the gas. We present our implementation in detail and present a number of widely known tests for comparison purposes. One strength of FARGO3D is that it can run on either graphical processing units (GPUs) or central processing units (CPUs), achieving large speed-up with respect to CPU cores. We describe our implementation choices, which allow a user with no prior knowledge of GPU programming to develop new routines for CPUs, and have them translated automatically for GPUs.
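The dimensionally split, finite-difference upwind idea mentioned above can be illustrated in one dimension. This is a first-order donor-cell sketch for a constant advection speed, far simpler than FARGO3D's actual schemes (which add higher order, orbital advection and MHD); the names and parameters are my own.

```python
import numpy as np

def upwind_step_1d(q, v, dx, dt):
    """One first-order donor-cell upwind update of q_t + v*q_x = 0
    on a periodic grid, assuming constant v > 0."""
    return q - v * dt / dx * (q - np.roll(q, 1))

# Advect a Gaussian bump once around a periodic unit domain.
n = 200
dx = 1.0 / n
v = 1.0
x = np.arange(n) * dx
q0 = np.exp(-((x - 0.5) / 0.05) ** 2)
dt = 0.5 * dx / v                         # CFL number 0.5
q = q0.copy()
for _ in range(int(round(1.0 / (v * dt)))):
    q = upwind_step_1d(q, v, dx, dt)
```

A dimensionally split 3D scheme applies such a 1D sweep along each axis in turn; the price of first order is numerical diffusion, which is why production codes use higher-order slope reconstruction.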

  6. FARGO3D: A NEW GPU-ORIENTED MHD CODE

    Energy Technology Data Exchange (ETDEWEB)

    Benitez-Llambay, Pablo [Instituto de Astronomía Teórica y Experimental, Observatorio Astronónomico, Universidad Nacional de Córdoba. Laprida 854, X5000BGR, Córdoba (Argentina); Masset, Frédéric S., E-mail: pbllambay@oac.unc.edu.ar, E-mail: masset@icf.unam.mx [Instituto de Ciencias Físicas, Universidad Nacional Autónoma de México (UNAM), Apdo. Postal 48-3,62251-Cuernavaca, Morelos (Mexico)

    2016-03-15

    We present the FARGO3D code, recently publicly released. It is a magnetohydrodynamics code developed with special emphasis on the physics of protoplanetary disks and planet–disk interactions, and parallelized with MPI. The hydrodynamics algorithms are based on finite-difference upwind, dimensionally split methods. The magnetohydrodynamics algorithms consist of the constrained transport method to preserve the divergence-free property of the magnetic field to machine accuracy, coupled to a method of characteristics for the evaluation of electromotive forces and Lorentz forces. Orbital advection is implemented, and an N-body solver is included to simulate planets or stars interacting with the gas. We present our implementation in detail and present a number of widely known tests for comparison purposes. One strength of FARGO3D is that it can run on either graphical processing units (GPUs) or central processing units (CPUs), achieving large speed-up with respect to CPU cores. We describe our implementation choices, which allow a user with no prior knowledge of GPU programming to develop new routines for CPUs, and have them translated automatically for GPUs.

  7. YouDash3D: exploring stereoscopic 3D gaming for 3D movie theaters

    Science.gov (United States)

    Schild, Jonas; Seele, Sven; Masuch, Maic

    2012-03-01

    Along with the success of the digitally revived stereoscopic cinema, events beyond 3D movies become attractive for movie theater operators, i.e. interactive 3D games. In this paper, we present a case that explores possible challenges and solutions for interactive 3D games to be played by a movie theater audience. We analyze the setting and showcase current issues related to lighting and interaction. Our second focus is to provide gameplay mechanics that make special use of stereoscopy, especially depth-based game design. Based on these results, we present YouDash3D, a game prototype that explores public stereoscopic gameplay in a reduced kiosk setup. It features live 3D HD video stream of a professional stereo camera rig rendered in a real-time game scene. We use the effect to place the stereoscopic effigies of players into the digital game. The game showcases how stereoscopic vision can provide for a novel depth-based game mechanic. Projected trigger zones and distributed clusters of the audience video allow for easy adaptation to larger audiences and 3D movie theater gaming.

  8. Remote 3D Medical Consultation

    Science.gov (United States)

    Welch, Greg; Sonnenwald, Diane H.; Fuchs, Henry; Cairns, Bruce; Mayer-Patel, Ketan; Yang, Ruigang; State, Andrei; Towles, Herman; Ilie, Adrian; Krishnan, Srinivas; Söderholm, Hanna M.

    Two-dimensional (2D) video-based telemedical consultation has been explored widely in the past 15-20 years. Two issues that seem to arise in most relevant case studies are the difficulty associated with obtaining the desired 2D camera views, and poor depth perception. To address these problems we are exploring the use of a small array of cameras to synthesize a spatially continuous range of dynamic three-dimensional (3D) views of a remote environment and events. The 3D views can be sent across wired or wireless networks to remote viewers with fixed displays or mobile devices such as a personal digital assistant (PDA). The viewpoints could be specified manually or automatically via user head or PDA tracking, giving the remote viewer virtual head- or hand-slaved (PDA-based) remote cameras for mono or stereo viewing. We call this idea remote 3D medical consultation (3DMC). In this article we motivate and explain the vision for 3D medical consultation; we describe the relevant computer vision/graphics, display, and networking research; we present a proof-of-concept prototype system; and we present some early experimental results supporting the general hypothesis that 3D remote medical consultation could offer benefits over conventional 2D televideo.

  9. Novel 3D media technologies

    CERN Document Server

    Dagiuklas, Tasos

    2015-01-01

    This book describes recent innovations in 3D media and technologies, with coverage of 3D media capturing, processing, encoding, and adaptation, networking aspects for 3D Media, and quality of user experience (QoE). The contributions are based on the results of the FP7 European Project ROMEO, which focuses on new methods for the compression and delivery of 3D multi-view video and spatial audio, as well as the optimization of networking and compression jointly across the future Internet. The delivery of 3D media to individual users remains a highly challenging problem due to the large amount of data involved, diverse network characteristics and user terminal requirements, as well as the user’s context such as their preferences and location. As the number of visual views increases, current systems will struggle to meet the demanding requirements in terms of delivery of consistent video quality to fixed and mobile users. ROMEO will present hybrid networking solutions that combine the DVB-T2 and DVB-NGH broadcas...

  10. 3D future internet media

    CERN Document Server

    Dagiuklas, Tasos

    2014-01-01

    This book describes recent innovations in 3D media and technologies, with coverage of 3D media capturing, processing, encoding, and adaptation, networking aspects for 3D Media, and quality of user experience (QoE). The main contributions are based on the results of the FP7 European Project ROMEO, which focuses on new methods for the compression and delivery of 3D multi-view video and spatial audio, as well as the optimization of networking and compression jointly across the Future Internet (www.ict-romeo.eu). The delivery of 3D media to individual users remains a highly challenging problem due to the large amount of data involved, diverse network characteristics and user terminal requirements, as well as the user’s context such as their preferences and location. As the number of visual views increases, current systems will struggle to meet the demanding requirements in terms of delivery of constant video quality to both fixed and mobile users. ROMEO will design and develop hybrid-networking solutions that co...

  11. Time domain topology optimization of 3D nanophotonic devices

    DEFF Research Database (Denmark)

    Elesin, Yuriy; Lazarov, Boyan Stefanov; Jensen, Jakob Søndergaard;

    2014-01-01

    We present an efficient parallel topology optimization framework for design of large scale 3D nanophotonic devices. The code shows excellent scalability and is demonstrated for optimization of a broadband frequency splitter, a waveguide intersection, a photonic crystal-based waveguide and a nanowire-based waveguide. The obtained results are compared to simplified 2D studies and we demonstrate that 3D topology optimization may lead to significant performance improvements. © 2013 Elsevier B.V. All rights reserved.

  12. 3D- VISUALIZATION BY RAYTRACING IMAGE SYNTHESIS ON GPU

    Directory of Open Access Journals (Sweden)

    Al-Oraiqat Anas M.

    2016-06-01

    Full Text Available This paper presents an approach to spatial stereo visualization of 3D images using a parallel graphics processing unit (GPU). Experiments on synthesizing images of a 3D scene by ray tracing on a GPU with the Compute Unified Device Architecture (CUDA) showed that approximately 60% of the time is spent solving the computational problem, while the remaining 40% is spent transferring data between the central processing unit and the GPU and organizing the visualization process. A study of the influence of increasing the GPU grid size on computation speed showed the importance of correctly structuring the parallel computing network and the overall parallelization mechanism.

  13. Eighth SIAM conference on parallel processing for scientific computing: Final program and abstracts

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-12-31

    This SIAM conference is the premier forum for developments in parallel numerical algorithms, a field that has seen very lively and fruitful developments over the past decade, and whose health is still robust. Themes for this conference were: combinatorial optimization; data-parallel languages; large-scale parallel applications; message-passing; molecular modeling; parallel I/O; parallel libraries; parallel software tools; parallel compilers; particle simulations; problem-solving environments; and sparse matrix computations.

  14. JAC3D -- A three-dimensional finite element computer program for the nonlinear quasi-static response of solids with the conjugate gradient method; Yucca Mountain Site Characterization Project

    Energy Technology Data Exchange (ETDEWEB)

    Biffle, J.H.

    1993-02-01

    JAC3D is a three-dimensional finite element program designed to solve quasi-static nonlinear mechanics problems. A set of continuum equations describes the nonlinear mechanics involving large rotation and strain. A nonlinear conjugate gradient method is used to solve the equations. The method is implemented in a three-dimensional setting with various methods for accelerating convergence. Sliding interface logic is also implemented. An eight-node Lagrangian uniform strain element is used with hourglass stiffness to control the zero-energy modes. This report documents the elastic and isothermal elastic-plastic material model. Other material models, documented elsewhere, are also available. The program is vectorized for efficient performance on Cray computers. Sample problems described are the bending of a thin beam, the rotation of a unit cube, and the pressurization and thermal loading of a hollow sphere.
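The conjugate gradient recurrence at the heart of such a solver can be sketched for the linear, symmetric positive definite case; JAC3D applies a nonlinear generalization of this iteration (with convergence acceleration and sliding-interface logic) that this toy version omits entirely.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Classical conjugate gradient for an SPD system A x = b.
    Sketch of the linear-algebra core only; not JAC3D's actual solver."""
    n = len(b)
    max_iter = max_iter or n
    x = np.zeros(n)
    r = b - A @ x                  # residual (= -gradient of the energy)
    d = r.copy()                   # first direction: steepest descent
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        Ad = A @ d
        alpha = rs / (d @ Ad)      # exact minimizer along d
        x += alpha * d
        r -= alpha * Ad
        rs_new = r @ r
        d = r + (rs_new / rs) * d  # A-conjugate direction update
        rs = rs_new
    return x

# Small SPD test system; CG converges in at most 2 iterations here.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x = conjugate_gradient(A, b)
```

For finite element problems the attraction is that only matrix-vector products are needed, so the global stiffness matrix never has to be factorized, which is also what makes the method vectorize well.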

  15. Facing competition: Neural mechanisms underlying parallel programming of antisaccades and prosaccades.

    Science.gov (United States)

    Talanow, Tobias; Kasparbauer, Anna-Maria; Steffens, Maria; Meyhöfer, Inga; Weber, Bernd; Smyrnis, Nikolaos; Ettinger, Ulrich

    2016-08-01

    The antisaccade task is a prominent tool to investigate the response inhibition component of cognitive control. Recent theoretical accounts explain performance in terms of parallel programming of exogenous and endogenous saccades, linked to the horse race metaphor. Previous studies have tested the hypothesis of competing saccade signals at the behavioral level by selectively slowing the programming of endogenous or exogenous processes e.g. by manipulating the probability of antisaccades in an experimental block. To gain a better understanding of inhibitory control processes in parallel saccade programming, we analyzed task-related eye movements and blood oxygenation level dependent (BOLD) responses obtained using functional magnetic resonance imaging (fMRI) at 3T from 16 healthy participants in a mixed antisaccade and prosaccade task. The frequency of antisaccade trials was manipulated across blocks of high (75%) and low (25%) antisaccade frequency. In blocks with high antisaccade frequency, antisaccade latencies were shorter and error rates lower whilst prosaccade latencies were longer and error rates were higher. At the level of BOLD, activations in the task-related saccade network (left inferior parietal lobe, right inferior parietal sulcus, left precentral gyrus reaching into left middle frontal gyrus and inferior frontal junction) and deactivations in components of the default mode network (bilateral temporal cortex, ventromedial prefrontal cortex) compensated increased cognitive control demands. These findings illustrate context dependent mechanisms underlying the coordination of competing decision signals in volitional gaze control. PMID:27363008

  16. Facing competition: Neural mechanisms underlying parallel programming of antisaccades and prosaccades.

    Science.gov (United States)

    Talanow, Tobias; Kasparbauer, Anna-Maria; Steffens, Maria; Meyhöfer, Inga; Weber, Bernd; Smyrnis, Nikolaos; Ettinger, Ulrich

    2016-08-01

    The antisaccade task is a prominent tool to investigate the response inhibition component of cognitive control. Recent theoretical accounts explain performance in terms of parallel programming of exogenous and endogenous saccades, linked to the horse race metaphor. Previous studies have tested the hypothesis of competing saccade signals at the behavioral level by selectively slowing the programming of endogenous or exogenous processes e.g. by manipulating the probability of antisaccades in an experimental block. To gain a better understanding of inhibitory control processes in parallel saccade programming, we analyzed task-related eye movements and blood oxygenation level dependent (BOLD) responses obtained using functional magnetic resonance imaging (fMRI) at 3T from 16 healthy participants in a mixed antisaccade and prosaccade task. The frequency of antisaccade trials was manipulated across blocks of high (75%) and low (25%) antisaccade frequency. In blocks with high antisaccade frequency, antisaccade latencies were shorter and error rates lower whilst prosaccade latencies were longer and error rates were higher. At the level of BOLD, activations in the task-related saccade network (left inferior parietal lobe, right inferior parietal sulcus, left precentral gyrus reaching into left middle frontal gyrus and inferior frontal junction) and deactivations in components of the default mode network (bilateral temporal cortex, ventromedial prefrontal cortex) compensated increased cognitive control demands. These findings illustrate context dependent mechanisms underlying the coordination of competing decision signals in volitional gaze control.

  17. 3D Imager and Method for 3D imaging

    NARCIS (Netherlands)

    Kumar, P.; Staszewski, R.; Charbon, E.

    2013-01-01

    3D imager comprising at least one pixel, each pixel comprising a photodetector for detecting photon incidence and a time-to-digital converter system configured for referencing said photon incidence to a reference clock, and further comprising a reference clock generator provided for generating the re

  18. Modification of 3D milling machine to 3D printer

    OpenAIRE

    Halamíček, Lukáš

    2015-01-01

    This thesis deals with the conversion of an engraving milling machine into a 3D printer. The first part discusses available 3D printing technologies and their suitability for the conversion. Suitable components for the conversion are then described and selected. The next part implements control of the bed heating, the nozzle and the filament feed using TwinCat software from Beckhoff on an industrial PC. The result of the work should be a working 3D printer.

  19. A Prototypical 3D Graphical Visualizer for Object-Oriented Systems

    Institute of Scientific and Technical Information of China (English)

    1996-01-01

    This paper describes a framework for visualizing object-oriented systems within a 3D interactive environment. The 3D visualizer represents the structure of a program as a Cylinder Net that simultaneously depicts two relationships between objects within the 3D virtual space. Additionally, it represents further relationships on demand when objects are moved into local focus. The 3D visualizer is implemented using a 3D graphics toolkit, TOAST, which provides 3D widgets to ease the programming task for 3D visualization.

  20. PLOT3D/AMES, SGI IRIS VERSION (WITH TURB3D)

    Science.gov (United States)

    Buning, P.

    1994-01-01

    PLOT3D is an interactive graphics program designed to help scientists visualize computational fluid dynamics (CFD) grids and solutions. Today, supercomputers and CFD algorithms can provide scientists with simulations of such highly complex phenomena that obtaining an understanding of the simulations has become a major problem. Tools which help the scientist visualize the simulations can be of tremendous aid. PLOT3D/AMES offers more functions and features, and has been adapted for more types of computers than any other CFD graphics program. Version 3.6b+ is supported for five computers and graphic libraries. Using PLOT3D, CFD physicists can view their computational models from any angle, observing the physics of problems and the quality of solutions. As an aid in designing aircraft, for example, PLOT3D's interactive computer graphics can show vortices, temperature, reverse flow, pressure, and dozens of other characteristics of air flow during flight. As critical areas become obvious, they can easily be studied more closely using a finer grid. PLOT3D is part of a computational fluid dynamics software cycle. First, a program such as 3DGRAPE (ARC-12620) helps the scientist generate computational grids to model an object and its surrounding space. Once the grids have been designed and parameters such as the angle of attack, Mach number, and Reynolds number have been specified, a "flow-solver" program such as INS3D (ARC-11794 or COS-10019) solves the system of equations governing fluid flow, usually on a supercomputer. Grids sometimes have as many as two million points, and the "flow-solver" produces a solution file which contains density, x- y- and z-momentum, and stagnation energy for each grid point. With such a solution file and a grid file containing up to 50 grids as input, PLOT3D can calculate and graphically display any one of 74 functions, including shock waves, surface pressure, velocity vectors, and particle traces. PLOT3D's 74 functions are organized into

  1. PLOT3D/AMES, SGI IRIS VERSION (WITHOUT TURB3D)

    Science.gov (United States)

    Buning, P.

    1994-01-01

    PLOT3D is an interactive graphics program designed to help scientists visualize computational fluid dynamics (CFD) grids and solutions. Today, supercomputers and CFD algorithms can provide scientists with simulations of such highly complex phenomena that obtaining an understanding of the simulations has become a major problem. Tools which help the scientist visualize the simulations can be of tremendous aid. PLOT3D/AMES offers more functions and features, and has been adapted for more types of computers than any other CFD graphics program. Version 3.6b+ is supported for five computers and graphic libraries. Using PLOT3D, CFD physicists can view their computational models from any angle, observing the physics of problems and the quality of solutions. As an aid in designing aircraft, for example, PLOT3D's interactive computer graphics can show vortices, temperature, reverse flow, pressure, and dozens of other characteristics of air flow during flight. As critical areas become obvious, they can easily be studied more closely using a finer grid. PLOT3D is part of a computational fluid dynamics software cycle. First, a program such as 3DGRAPE (ARC-12620) helps the scientist generate computational grids to model an object and its surrounding space. Once the grids have been designed and parameters such as the angle of attack, Mach number, and Reynolds number have been specified, a "flow-solver" program such as INS3D (ARC-11794 or COS-10019) solves the system of equations governing fluid flow, usually on a supercomputer. Grids sometimes have as many as two million points, and the "flow-solver" produces a solution file which contains density, x- y- and z-momentum, and stagnation energy for each grid point. With such a solution file and a grid file containing up to 50 grids as input, PLOT3D can calculate and graphically display any one of 74 functions, including shock waves, surface pressure, velocity vectors, and particle traces. PLOT3D's 74 functions are organized into

  2. Numerical Simulation of 3-D Wave Crests

    Institute of Scientific and Technical Information of China (English)

    YU Dingyong; ZHANG Hanyuan

    2003-01-01

    A clear definition of a 3-D wave crest and a description of the procedure to detect the boundary of a wave crest are presented in the paper. Using random wave theory and a directional wave spectrum, a MATLAB-based program is designed to simulate random wave crests for various directional spectral conditions in deep water. Statistics of wave crests with different directional spreading parameters and different directional functions are obtained and discussed.
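A linear random-wave surface of the kind simulated above can be sketched by superposing random-phase directional components. The amplitudes and the cos² spreading weight below are illustrative assumptions, not the paper's actual directional spectrum.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_sea_surface(nx=64, ny=64, L=200.0, n_comp=100):
    """Superpose random-phase directional wave components into a 2D
    elevation field eta(x, y). Illustrative linear random-wave model:
    uniform wavenumbers and a cos^2 directional spreading weight."""
    x = np.linspace(0.0, L, nx)
    y = np.linspace(0.0, L, ny)
    X, Y = np.meshgrid(x, y)
    eta = np.zeros_like(X)
    for _ in range(n_comp):
        k = rng.uniform(2 * np.pi / L, 0.5)          # wavenumber magnitude
        theta = rng.uniform(-np.pi / 2, np.pi / 2)   # direction about mean
        a = 0.1 * np.cos(theta) ** 2                 # cos^2 spreading weight
        phase = rng.uniform(0.0, 2 * np.pi)
        eta += a * np.cos(k * (X * np.cos(theta) + Y * np.sin(theta)) + phase)
    return eta

eta = simulate_sea_surface()
crests = eta > eta.std()   # a simple crest mask: surface above one std dev
```

Crest detection in the paper then amounts to tracing the boundaries of such connected above-threshold regions and collecting their statistics.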

  3. Markerless 3D Face Tracking

    DEFF Research Database (Denmark)

    Walder, Christian; Breidt, Martin; Bulthoff, Heinrich;

    2009-01-01

    We present a novel algorithm for the markerless tracking of deforming surfaces such as faces. We acquire a sequence of 3D scans along with color images at 40 Hz. The data is then represented by implicit surface and color functions, using a novel partition-of-unity type method of efficiently combining local regressors using nearest neighbor searches. Both these functions act on the 4D space of 3D plus time, and use temporal information to handle the noise in individual scans. After interactive registration of a template mesh to the first frame, it is then automatically deformed to track the scanned surface, using the variation of both shape and color as features in a dynamic energy minimization problem. Our prototype system yields high-quality animated 3D models in correspondence, at a rate of approximately twenty seconds per timestep. Tracking results for faces and other objects...

  4. Crowded Field 3D Spectroscopy

    CERN Document Server

    Becker, T; Roth, M M; Becker, Thomas; Fabrika, Sergei; Roth, Martin M.

    2003-01-01

    The quantitative spectroscopy of stellar objects in complex environments is mainly limited by the ability of separating the object from the background. Standard slit spectroscopy, restricting the field of view to one dimension, is obviously not the proper technique in general. The emerging Integral Field (3D) technique with spatially resolved spectra of a two-dimensional field of view provides a great potential for applying advanced subtraction methods. In this paper an image reconstruction algorithm to separate point sources and a smooth background is applied to 3D data. Several performance tests demonstrate the photometric quality of the method. The algorithm is applied to real 3D observations of a sample Planetary Nebula in M31, whose spectrum is contaminated by the bright and complex galaxy background. The ability of separating sources is also studied in a crowded stellar field in M33.

  5. 3D-grafiikkamoottori mobiililaitteille

    OpenAIRE

    Vahlman, Lauri

    2014-01-01

    This thesis covers the design and implementation of a simple 3D graphics engine for mobile devices using the OpenGL ES API. The work presents the techniques used in implementing the graphics engine and examines the engine's structure and implementation details. The first part of the work also reviews the general principles and operation of modern 3D graphics and discusses performance problems related to 3D graphics. The final part presents...

  6. 3D vector flow imaging

    DEFF Research Database (Denmark)

    Pihl, Michael Johannes

    The main purpose of this PhD project is to develop an ultrasonic method for 3D vector flow imaging. The motivation is to advance the field of velocity estimation in ultrasound, which plays an important role in the clinic. The velocity of blood has components in all three spatial dimensions, yet conventional methods can estimate only the axial component. Several approaches for 3D vector velocity estimation have been suggested, but none of these methods have so far produced convincing in vivo results nor have they been adopted by commercial manufacturers. The basis for this project is the Transverse... on the TO fields are suggested. They can be used to optimize the TO method. In the third part, a TO method for 3D vector velocity estimation is proposed. It employs a 2D phased array transducer and decouples the velocity estimation into three velocity components, which are estimated simultaneously based on 5...

  7. Microfluidic 3D Helix Mixers

    Directory of Open Access Journals (Sweden)

    Georgette B. Salieb-Beugelaar

    2016-10-01

    Full Text Available Polymeric microfluidic systems are well suited for miniaturized devices with complex functionality, and rapid prototyping methods for 3D microfluidic structures are increasingly used. Mixing and performing chemical reactions at the microscale are important applications of such systems, and we therefore explored the feasibility, mixing characteristics and ability to control a chemical reaction in helical 3D channels produced by the emerging thread template method. Mixing at the microscale is challenging because channel size reduction for improving solute diffusion comes at the price of a reduced Reynolds number that induces a strictly laminar flow regime and abolishes the turbulence that would be desired for improved mixing. Microfluidic 3D helix mixers were rapidly prototyped in polydimethylsiloxane (PDMS) using low-surface-energy polymeric threads, twisted to form 2-channel and 3-channel helices. Structure and flow characteristics were assessed experimentally by microscopy, hydraulic measurements and a chromogenic reaction, and were modeled by computational fluid dynamics. We found that helical 3D microfluidic systems produced by thread templating allow rapid prototyping and can be used for mixing and for controlled chemical reaction with two or three reaction partners at the microscale. Compared to the conventional T-shaped microfluidic system used as a control device, enhanced mixing and faster chemical reaction were found to occur due to the combination of diffusive mixing in small channels and flow folding due to the 3D helix shape. Thus, microfluidic 3D helix mixers can be rapidly prototyped using the thread template method and are an attractive and competitive method for fluid mixing and chemical reactions at the microscale.
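The laminar-regime claim above follows directly from the channel Reynolds number; a quick calculation with illustrative values for water in a 100 µm channel shows how far below the turbulent threshold microfluidic flows sit.

```python
def reynolds_number(density, velocity, diameter, viscosity):
    """Re = rho * v * d / mu. Pipe flow stays laminar below roughly Re ~ 2300."""
    return density * velocity * diameter / viscosity

# Illustrative values: water (rho = 1000 kg/m^3, mu = 1e-3 Pa*s)
# flowing at 1 mm/s through a 100-micrometer channel.
re = reynolds_number(1000.0, 1e-3, 100e-6, 1e-3)  # → 0.1
print(re)
```

At Re ≈ 0.1, inertial effects are negligible and mixing relies on diffusion and geometric flow folding, which is exactly what the helical channels exploit.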

  8. Program Suite for Conceptual Designing of Parallel Mechanism-Based Robots and Machine Tools

    Directory of Open Access Journals (Sweden)

    Slobodan Tabaković

    2013-06-01

    This paper describes the categorization of criteria for the conceptual design of parallel mechanism‐based robots or machine tools, resulting from workspace analysis, as well as the procedure for defining them. Furthermore, it presents the design methodology implemented in the program for the creation of a robot or machine tool space model and the optimization of the resulting solution. For verification of the criteria and the program suite, three common (conceptually different) mechanisms with similar mechanical structures and kinematic characteristics were used.

  9. Total maximum allocated load calculation of nitrogen pollutants by linking a 3D biogeochemical-hydrodynamic model with a programming model in Bohai Sea

    Science.gov (United States)

    Dai, Aiquan; Li, Keqiang; Ding, Dongsheng; Li, Yan; Liang, Shengkang; Li, Yanbin; Su, Ying; Wang, Xiulin

    2015-12-01

    The equal percent removal (EPR) method, in which the pollutant reduction ratio is set to be the same in all administrative regions, failed to satisfy the requirement for water quality improvement in the Bohai Sea imposed by the Coastal Pollution Total Load Control Management program. The total maximum allocated load (TMAL) of nitrogen pollutants in the sea-sink source regions (SSRs) around the Bohai Rim, i.e., the maximum pollutant load of every outlet under the limitation of water quality criteria, was estimated by an optimization-simulation method (OSM) combined with a loop approximation calculation. In OSM, water quality is simulated using a water quality model and the pollutant load is calculated with a programming model. The effect of changes in pollutant loads on TMAL was discussed. Results showed that the TMAL of nitrogen pollutants in 34 SSRs was 1.49×10⁵ ton/year. The highest TMAL was observed in summer and the lowest in winter. TMAL was also higher in the Bohai Strait and central Bohai Sea and lower in the inner areas of Liaodong Bay, Bohai Bay and Laizhou Bay. In the loop approximation calculation, the TMAL obtained was considered satisfactory for the water quality criteria once the fluctuation of the concentration response matrix with pollutant loads was eliminated. Results of a numerical experiment further showed that water quality improved faster, and more evidently, under TMAL input than under the EPR method.
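
    The interplay between a concentration response matrix and water quality criteria that underlies OSM can be illustrated with a toy calculation. The matrix, background concentrations, and criteria below are invented placeholders, and the uniform-load solver is a deliberate simplification (closer in spirit to EPR than to the per-outlet TMAL optimization):

```python
import numpy as np

# Hypothetical concentration response matrix: R[i, j] is the steady-state
# nitrogen concentration increase at monitoring point i per unit load at outlet j.
R = np.array([[0.8, 0.2, 0.1],
              [0.3, 0.9, 0.2],
              [0.1, 0.3, 0.7]])
background = np.array([0.10, 0.15, 0.12])  # concentration already present, mg/L
criteria = np.array([0.50, 0.50, 0.50])    # water quality criteria, mg/L

def max_uniform_load(R, background, criteria):
    """Largest per-outlet load t such that background + R @ (t * ones) <= criteria."""
    headroom = criteria - background           # remaining concentration margin
    response = R @ np.ones(R.shape[1])         # concentration rise per unit uniform load
    return float(np.min(headroom / response))  # the binding monitoring point sets the limit

t = max_uniform_load(R, background, criteria)
print(t)  # ~0.25: the second monitoring point is binding
```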

  10. Ideal 3D asymmetric concentrator

    Energy Technology Data Exchange (ETDEWEB)

    Garcia-Botella, Angel [Departamento Fisica Aplicada a los Recursos Naturales, Universidad Politecnica de Madrid, E.T.S.I. de Montes, Ciudad Universitaria s/n, 28040 Madrid (Spain); Fernandez-Balbuena, Antonio Alvarez; Vazquez, Daniel; Bernabeu, Eusebio [Departamento de Optica, Universidad Complutense de Madrid, Fac. CC. Fisicas, Ciudad Universitaria s/n, 28040 Madrid (Spain)

    2009-01-15

    Nonimaging optics is a field devoted to the design of optical components for applications such as solar concentration or illumination. In this field, many different techniques have been used for producing reflective and refractive optical devices, including reverse engineering techniques. In this paper we apply photometric field theory and the elliptic ray bundles method to study 3D asymmetric concentrators (those without rotational or translational symmetry), which can be useful components for nontracking solar applications. We study the one-sheet hyperbolic concentrator and demonstrate its behaviour as an ideal 3D asymmetric concentrator. (author)
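
    For reference, étendue conservation bounds any ideal 3D concentrator at C_max = n²/sin²θ, with θ the half-acceptance angle and n the exit refractive index. A minimal sketch of that textbook limit (not a computation from the paper):

```python
import math

def ideal_3d_concentration(half_acceptance_deg, n_exit=1.0):
    """Etendue-conservation limit for an ideal 3-D concentrator:
    C_max = n^2 / sin^2(theta)."""
    theta = math.radians(half_acceptance_deg)
    return (n_exit / math.sin(theta)) ** 2

print(ideal_3d_concentration(30.0))  # ~4x for a 30-degree half-acceptance in air
```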

  11. Advanced 3-D Ultrasound Imaging

    DEFF Research Database (Denmark)

    Rasmussen, Morten Fischer

    The main purpose of the PhD project was to develop methods that increase the 3-D ultrasound imaging quality available for the medical personnel in the clinic. Acquiring a 3-D volume gives the medical doctor the freedom to investigate the measured anatomy in any slice desirable after the scan has...... beamforming. This is achieved partly because synthetic aperture imaging removes the limitation of a fixed transmit focal depth and instead enables dynamic transmit focusing. Lately, the major ultrasound companies have produced ultrasound scanners using 2-D transducer arrays with enough transducer elements...
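
    The dynamic transmit focusing enabled by synthetic aperture imaging boils down to delay-and-sum beamforming with two-way propagation delays, transmit element to focal point to receive element. A minimal sketch with hypothetical geometry (positions in metres, sound speed 1540 m/s; none of these values are from the thesis):

```python
import math

def two_way_delay(focus, tx, rx, c=1540.0):
    """Two-way time of flight used in synthetic aperture delay-and-sum focusing:
    transmit element -> focal point -> receive element, at sound speed c (m/s)."""
    return (math.dist(tx, focus) + math.dist(rx, focus)) / c

# Focal point 30 mm straight below a transducer element at the origin
tau = two_way_delay(focus=(0.0, 0.03), tx=(0.0, 0.0), rx=(0.0, 0.0))
print(tau)  # ~39 microseconds round trip
```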

  12. 3D reconstruction based on spatial vanishing information

    Institute of Scientific and Technical Information of China (English)

    Yuan Shu; Zheng Tan

    2005-01-01

    An approach for the three-dimensional (3D) reconstruction of architectural scenes from two uncalibrated images is described in this paper. From two views of one architectural structure, three pairs of corresponding vanishing points of three major, mutually orthogonal directions can be extracted. The simple but powerful constraints of parallelism and orthogonality of lines in architectural scenes can be used to calibrate the cameras and to recover the 3D information of the structure. This approach is applied to real images of architectural scenes, and a 3D model of a building in virtual reality modelling language (VRML) format is presented, illustrating the method's successful performance.
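
    One calibration step of this kind can be sketched directly: assuming square pixels and a principal point p at the image centre, two vanishing points v1, v2 of orthogonal scene directions satisfy (v1 - p)·(v2 - p) + f² = 0, which yields the focal length. The pixel coordinates below are hypothetical:

```python
import math

def focal_from_orthogonal_vps(v1, v2, principal_point):
    """Focal length in pixels from the vanishing points of two orthogonal
    scene directions, assuming square pixels and a known principal point."""
    dx1, dy1 = v1[0] - principal_point[0], v1[1] - principal_point[1]
    dx2, dy2 = v2[0] - principal_point[0], v2[1] - principal_point[1]
    dot = dx1 * dx2 + dy1 * dy2
    if dot >= 0:
        raise ValueError("vanishing points not consistent with orthogonal directions")
    return math.sqrt(-dot)

# Two orthogonal horizontal directions seen by a camera with principal point (320, 240)
f = focal_from_orthogonal_vps((1120, 240), (-480, 240), (320, 240))
print(f)  # 800.0
```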

  13. The New Realm of 3-D Vision

    Science.gov (United States)

    2002-01-01

    Dimension Technologies Inc. (DTI) developed a line of 2-D/3-D Liquid Crystal Display (LCD) screens, including a 15-inch model priced at consumer levels. DTI's family of flat panel LCD displays, called the Virtual Window(TM), provides real-time 3-D images without the use of glasses, head trackers, helmets, or other viewing aids. Most of the company's initial 3-D display research was funded through NASA's Small Business Innovation Research (SBIR) program. The images on DTI's displays appear to leap off the screen and hang in space. The display accepts input from computers or stereo video sources, and can be switched from 3-D to full-resolution 2-D viewing with the push of a button. The Virtual Window displays have applications in data visualization, medicine, architecture, business, real estate, entertainment, and other research, design, military, and consumer applications. Displays are currently used for computer games, protein analysis, and surgical imaging. The technology greatly benefits the medical field, as surgical simulators are helping to increase the skills of surgical residents. Virtual Window(TM) is a trademark of Dimension Technologies Inc.

  14. Full Parallel Implementation of an All-Electron Four-Component Dirac-Kohn-Sham Program.

    Science.gov (United States)

    Rampino, Sergio; Belpassi, Leonardo; Tarantelli, Francesco; Storchi, Loriano

    2014-09-01

    A full distributed-memory implementation of the Dirac-Kohn-Sham (DKS) module of the program BERTHA (Belpassi et al., Phys. Chem. Chem. Phys. 2011, 13, 12368-12394) is presented, where the self-consistent field (SCF) procedure is replicated on all the parallel processes, each process working on subsets of the global matrices. The key feature of the implementation is an efficient procedure for switching between two matrix distribution schemes, one (integral-driven) optimal for the parallel computation of the matrix elements and another (block-cyclic) optimal for the parallel linear algebra operations. This approach, making both CPU-time and memory scalable with the number of processors used, virtually overcomes at once both time and memory barriers associated with DKS calculations. Performance, portability, and numerical stability of the code are illustrated on the basis of test calculations on three gold clusters of increasing size, an organometallic compound, and a perovskite model. The calculations are performed on a Beowulf and a BlueGene/Q system.
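
    The block-cyclic distribution referred to above is the ScaLAPACK-style layout in which square blocks of the global matrix are dealt round-robin to a Pr × Pc process grid. A minimal owner-mapping sketch (illustrative only, not code from BERTHA):

```python
def block_cyclic_owner(i, j, block, pgrid):
    """Coordinates of the process owning global element (i, j) under a 2-D
    block-cyclic distribution with square blocks on a (Pr, Pc) process grid."""
    pr, pc = pgrid
    return ((i // block) % pr, (j // block) % pc)

# 8x8 matrix, 2x2 blocks, 2x2 process grid: blocks cycle over the grid
row0 = [block_cyclic_owner(0, j, block=2, pgrid=(2, 2)) for j in range(8)]
print(row0)  # [(0, 0), (0, 0), (0, 1), (0, 1), (0, 0), (0, 0), (0, 1), (0, 1)]
```

    Switching between such a layout and an integral-driven one, as BERTHA does, amounts to an all-to-all redistribution of these blocks among processes.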

  15. Streamlining of the RELAP5-3D Code

    Energy Technology Data Exchange (ETDEWEB)

    Mesina, George L; Hykes, Joshua; Guillen, Donna Post

    2007-11-01

    RELAP5-3D is widely used by the nuclear community to simulate general thermal hydraulic systems and has proven to be so versatile that the spectrum of transient two-phase problems that can be analyzed has increased substantially over time. To accommodate the many new types of problems that are analyzed by RELAP5-3D, both the physics and numerical methods of the code have been continuously improved. In the area of computational methods and mathematical techniques, many upgrades and improvements have been made to decrease code run time and increase solution accuracy. These include vectorization, parallelization, use of improved equation solvers for thermal hydraulics and neutron kinetics, and incorporation of improved library utilities. In the area of applied nuclear engineering, expanded capabilities include boron and level tracking models, a radiation/conduction enclosure model, feedwater heater and compressor components, fluids and corresponding correlations for modeling Generation IV reactor designs, and coupling to computational fluid dynamics solvers. Ongoing and proposed future developments include improvements to the two-phase pump model, conversion to FORTRAN 90, and coupling to more computer programs. This paper summarizes the general improvements made to RELAP5-3D, with an emphasis on streamlining the code infrastructure for improved maintenance and development. With all these past, present and planned developments, it is necessary to modify the code infrastructure to incorporate modifications in a consistent and maintainable manner. Modifying a complex code such as RELAP5-3D to incorporate new models, upgrade numerics, and optimize existing code becomes more difficult as the code grows larger. The difficulty of this, as well as the chance of introducing errors, is significantly reduced when the code is structured. To streamline the code into a structured program, a commercial restructuring tool, FOR_STRUCT, was applied to the RELAP5-3D source files. The

  16. 3D Equilibrium Reconstructions in DIII-D

    Science.gov (United States)

    Lao, L. L.; Ferraro, N. W.; Strait, E. J.; Turnbull, A. D.; King, J. D.; Hirshman, H. P.; Lazarus, E. A.; Sontag, A. C.; Hanson, J.; Trevisan, G.

    2013-10-01

    Accurate and efficient 3D equilibrium reconstruction is needed in tokamaks for study of 3D magnetic field effects on experimentally reconstructed equilibrium and for analysis of MHD stability experiments with externally imposed magnetic perturbations. A large number of new magnetic probes have been recently installed in DIII-D to improve 3D equilibrium measurements and to facilitate 3D reconstructions. The V3FIT code has been in use in DIII-D to support 3D reconstruction and the new magnetic diagnostic design. V3FIT is based on the 3D equilibrium code VMEC that assumes nested magnetic surfaces. V3FIT uses a pseudo-Newton least-square algorithm to search for the solution vector. In parallel, the EFIT equilibrium reconstruction code is being extended to allow for 3D effects using a perturbation approach based on an expansion of the MHD equations. EFIT uses the cylindrical coordinate system and can include the magnetic island and stochastic effects. Algorithms are being developed to allow EFIT to reconstruct 3D perturbed equilibria directly making use of plasma response to 3D perturbations from the GATO, MARS-F, or M3D-C1 MHD codes. DIII-D 3D reconstruction examples using EFIT and V3FIT and the new 3D magnetic data will be presented. Work supported in part by US DOE under DE-FC02-04ER54698, DE-FG02-95ER54309 and DE-AC05-06OR23100.
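
    The pseudo-Newton least-squares search mentioned above is in the Gauss-Newton family: the solution vector is updated repeatedly from the Jacobian of the modeled signals. A generic sketch on a toy exponential-fit problem (the model and "measurements" are illustrative, not V3FIT's diagnostic signals):

```python
import numpy as np

def gauss_newton(residual, jacobian, x0, iters=20):
    """Minimize ||residual(x)||^2 by Gauss-Newton steps: x -= (J^T J)^-1 J^T r."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        r = residual(x)
        J = jacobian(x)
        x = x - np.linalg.solve(J.T @ J, J.T @ r)
    return x

# Toy problem: recover (a, b) from noise-free "signals" y = a * exp(b * t)
t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * t)
residual = lambda x: x[0] * np.exp(x[1] * t) - y
jacobian = lambda x: np.column_stack([np.exp(x[1] * t),
                                      x[0] * t * np.exp(x[1] * t)])
a, b = gauss_newton(residual, jacobian, x0=[2.0, -1.0])
print(a, b)  # converges to about (2.0, -1.5)
```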

  17. 3D Simulations of line emission from ICF capsules

    International Nuclear Information System (INIS)

    Line emission from ICF implosions can be used to diagnose the temperature of the DT fuel and provides an indication of the distortion of the fuel-pusher interface. 2D simulations have provided valuable insights into the usefulness of argon and titanium dopants as diagnostics of instabilities. Characterizing the effects of drive asymmetries requires 3D modeling with large demands on computer time and memory, necessitating the use of parallel computers. We present the results of some 3D simulations achieved with a code utilizing both shared-memory and distributed-memory parallelism. We discuss the code structure and related performance issues

  18. PubChem3D: Biologically relevant 3-D similarity

    Directory of Open Access Journals (Sweden)

    Kim Sunghwan

    2011-07-01

    Background: The use of 3-D similarity techniques in the analysis of biological data and virtual screening is pervasive, but what is a biologically meaningful 3-D similarity value? Can one find statistically significant separation between "active/active" and "active/inactive" spaces? These questions are explored using 734,486 biologically tested chemical structures, 1,389 biological assay data sets, and six different 3-D similarity types utilized by PubChem analysis tools. Results: The similarity value distributions of 269.7 billion unique conformer pairs from 734,486 biologically tested compounds (all-against-all) from PubChem were utilized to help work towards an answer to the question: what is a biologically meaningful 3-D similarity score? The average and standard deviation for the six similarity measures STST-opt, CTST-opt, ComboTST-opt, STCT-opt, CTCT-opt, and ComboTCT-opt were 0.54 ± 0.10, 0.07 ± 0.05, 0.62 ± 0.13, 0.41 ± 0.11, 0.18 ± 0.06, and 0.59 ± 0.14, respectively. Considering that this random distribution of biologically tested compounds was constructed using a single theoretical conformer per compound (the "default" conformer provided by PubChem), further study may be necessary using multiple diverse conformers per compound; however, given the breadth of the compound set, the single-conformer results may still apply to the case of multi-conformer 3-D similarity value distributions. As such, this work is a critical step, covering a very wide corpus of chemical structures and biological assays, creating a statistical framework to build upon. The second part of this study explored whether it is possible to realize a statistically meaningful 3-D similarity value separation between reputed biological assay "inactives" and "actives". Using the terminology of noninactive-noninactive (NN) pairs and noninactive-inactive (NI) pairs to represent comparison of the "active/active" and
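
    Given the background statistics reported above, one can standardize an observed similarity against the random-pair distribution to judge whether it is likely to be meaningful. A small sketch using the reported ComboTST-opt background of 0.62 ± 0.13 (the query value 1.10 is hypothetical):

```python
def similarity_z_score(value, mean, std):
    """Standard deviations by which a 3-D similarity value exceeds the
    random-pair background; larger values are less likely by chance."""
    return (value - mean) / std

# ComboTST-opt background from the study: mean 0.62, standard deviation 0.13
z = similarity_z_score(1.10, mean=0.62, std=0.13)
print(round(z, 2))  # 3.69, well separated from the random background
```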

  19. 3D Face Appearance Model

    OpenAIRE

    Lading, Brian; Larsen, Rasmus; Astrom, K

    2006-01-01

    We build a 3D face shape model, including inter- and intra-shape variations, derive the analytical Jacobian of its resulting 2D rendered image, and show examples of its fitting performance with light, pose, ID, expression and texture variations.

  20. 3D Face Appearance Model

    OpenAIRE

    Lading, Brian; Larsen, Rasmus; Åström, Kalle

    2006-01-01

    We build a 3D face shape model, including inter- and intra-shape variations, derive the analytical Jacobian of its resulting 2D rendered image, and show examples of its fitting performance with light, pose, ID, expression and texture variations.