WorldWideScience

Sample records for 3-d parallel program

  1. 3-D parallel program for numerical calculation of gas dynamics problems with heat conductivity on distributed memory computational systems (CS)

    Energy Technology Data Exchange (ETDEWEB)

    Sofronov, I.D.; Voronin, B.L.; Butnev, O.I. [VNIIEF (Russian Federation)] [and others

    1997-12-31

    The aim of the work performed is to develop a 3D parallel program for numerical calculation of gas dynamics problem with heat conductivity on distributed memory computational systems (CS), satisfying the condition of numerical result independence from the number of processors involved. Two basically different approaches to the structure of massive parallel computations have been developed. The first approach uses the 3D data matrix decomposition reconstructed at temporal cycle and is a development of parallelization algorithms for multiprocessor CS with shareable memory. The second approach is based on using a 3D data matrix decomposition not reconstructed during a temporal cycle. The program was developed on 8-processor CS MP-3 made in VNIIEF and was adapted to a massive parallel CS Meiko-2 in LLNL by joint efforts of VNIIEF and LLNL staffs. A large number of numerical experiments has been carried out with different number of processors up to 256 and the efficiency of parallelization has been evaluated in dependence on processor number and their parameters.

  2. Shared Memory Parallelism for 3D Cartesian Discrete Ordinates Solver

    Science.gov (United States)

    Moustafa, Salli; Dutka-Malen, Ivan; Plagne, Laurent; Ponçot, Angélique; Ramet, Pierre

    2014-06-01

    This paper describes the design and the performance of DOMINO, a 3D Cartesian SN solver that implements two nested levels of parallelism (multicore+SIMD) on shared memory computation nodes. DOMINO is written in C++, a multi-paradigm programming language that enables the use of powerful and generic parallel programming tools such as Intel TBB and Eigen. These two libraries allow us to combine multi-thread parallelism with vector operations in an efficient and yet portable way. As a result, DOMINO can exploit the full power of modern multi-core processors and is able to tackle very large simulations, that usually require large HPC clusters, using a single computing node. For example, DOMINO solves a 3D full core PWR eigenvalue problem involving 26 energy groups, 288 angular directions (S16), 46 × 106 spatial cells and 1 × 1012 DoFs within 11 hours on a single 32-core SMP node. This represents a sustained performance of 235 GFlops and 40:74% of the SMP node peak performance for the DOMINO sweep implementation. The very high Flops/Watt ratio of DOMINO makes it a very interesting building block for a future many-nodes nuclear simulation tool.

  3. Parallel computing helps 3D depth imaging, processing

    Energy Technology Data Exchange (ETDEWEB)

    Nestvold, E. O. [IBM, Houston, TX (United States); Su, C. B. [IBM, Dallas, TX (United States); Black, J. L. [Landmark Graphics, Denver, CO (United States); Jack, I. G. [BP Exploration, London (United Kingdom)

    1996-10-28

    The significance of 3D seismic data in the petroleum industry during the past decade cannot be overstated. Having started as a technology too expensive to be utilized except by major oil companies, 3D technology is now routinely used by independent operators in the US and Canada. As with all emerging technologies, documentation of successes has been limited. There are some successes, however, that have been summarized in the literature in the recent past. Key technological developments contributing to this success have been major advances in RISC workstation technology, 3D depth imaging, and parallel computing. This article presents the basic concepts of parallel seismic computing, showing how it impacts both 3D depth imaging and more-conventional 3D seismic processing.

  4. Parallel Processor for 3D Recovery from Optical Flow

    Directory of Open Access Journals (Sweden)

    Jose Hugo Barron-Zambrano

    2009-01-01

    Full Text Available 3D recovery from motion has received a major effort in computer vision systems in the recent years. The main problem lies in the number of operations and memory accesses to be performed by the majority of the existing techniques when translated to hardware or software implementations. This paper proposes a parallel processor for 3D recovery from optical flow. Its main feature is the maximum reuse of data and the low number of clock cycles to calculate the optical flow, along with the precision with which 3D recovery is achieved. The results of the proposed architecture as well as those from processor synthesis are presented.

  5. CALTRANS: A parallel, deterministic, 3D neutronics code

    Energy Technology Data Exchange (ETDEWEB)

    Carson, L.; Ferguson, J.; Rogers, J.

    1994-04-01

    Our efforts to parallelize the deterministic solution of the neutron transport equation has culminated in a new neutronics code CALTRANS, which has full 3D capability. In this article, we describe the layout and algorithms of CALTRANS and present performance measurements of the code on a variety of platforms. Explicit implementation of the parallel algorithms of CALTRANS using both the function calls of the Parallel Virtual Machine software package (PVM 3.2) and the Meiko CS-2 tagged message passing library (based on the Intel NX/2 interface) are provided in appendices.

  6. Programs Lucky and Lucky{sub C} - 3D parallel transport codes for the multi-group transport equation solution for XYZ geometry by Pm Sn method

    Energy Technology Data Exchange (ETDEWEB)

    Moriakov, A. [Russian Research Centre, Kurchatov Institute, Moscow (Russian Federation); Vasyukhno, V.; Netecha, M.; Khacheresov, G. [Research and Development Institute of Power Engineering, Moscow (Russian Federation)

    2003-07-01

    Powerful supercomputers are available today. MBC-1000M is one of Russian supercomputers that may be used by distant way access. Programs LUCKY and LUCKY{sub C} were created to work for multi-processors systems. These programs have algorithms created especially for these computers and used MPI (message passing interface) service for exchanges between processors. LUCKY may resolved shielding tasks by multigroup discreet ordinate method. LUCKY{sub C} may resolve critical tasks by same method. Only XYZ orthogonal geometry is available. Under little space steps to approximate discreet operator this geometry may be used as universal one to describe complex geometrical structures. Cross section libraries are used up to P8 approximation by Legendre polynomials for nuclear data in GIT format. Programming language is Fortran-90. 'Vector' processors may be used that lets get a time profit up to 30 times. But unfortunately MBC-1000M has not these processors. Nevertheless sufficient value for efficiency of parallel calculations was obtained under 'space' (LUCKY) and 'space and energy' (LUCKY{sub C}) paralleling. AUTOCAD program is used to control geometry after a treatment of input data. Programs have powerful geometry module, it is a beautiful tool to achieve any geometry. Output results may be processed by graphic programs on personal computer. (authors)

  7. 3-D Visualization on Workspace of Parallel Manipulators

    Science.gov (United States)

    Tanaka, Yoshito; Yokomichi, Isao; Ishii, Junko; Makino, Toshiaki

    In parallel mechanisms, the form and volume of workspace also change variously with the attitude of a platform. This paper presents a method to search for the workspace of parallel mechanisms with 6-DOF and 3D visualization of the workspace. Workspace is a search for the movable range of the central point of a platform when it moves with a given orientation. In order to search workspace, geometric analysis based on inverse kinematics is considered. Plots of 2D of calculations are compared with those measured by position sensors. The test results are shown to have good agreement with simulation results. The workspace variations are demonstrated in terms of 3D and 2D plots for prototype mechanisms. The workspace plots are created with OpenGL and Visual C++ by implementation of the algorithm. An application module is developed, which displays workspace of the mechanism in 3D images. The effectiveness and practicability of 3D visualization on workspace are successfully demonstrated by 6-DOF parallel mechanisms.

  8. Parallel Optimization of 3D Cardiac Electrophysiological Model Using GPU

    Directory of Open Access Journals (Sweden)

    Yong Xia

    2015-01-01

    Full Text Available Large-scale 3D virtual heart model simulations are highly demanding in computational resources. This imposes a big challenge to the traditional computation resources based on CPU environment, which already cannot meet the requirement of the whole computation demands or are not easily available due to expensive costs. GPU as a parallel computing environment therefore provides an alternative to solve the large-scale computational problems of whole heart modeling. In this study, using a 3D sheep atrial model as a test bed, we developed a GPU-based simulation algorithm to simulate the conduction of electrical excitation waves in the 3D atria. In the GPU algorithm, a multicellular tissue model was split into two components: one is the single cell model (ordinary differential equation and the other is the diffusion term of the monodomain model (partial differential equation. Such a decoupling enabled realization of the GPU parallel algorithm. Furthermore, several optimization strategies were proposed based on the features of the virtual heart model, which enabled a 200-fold speedup as compared to a CPU implementation. In conclusion, an optimized GPU algorithm has been developed that provides an economic and powerful platform for 3D whole heart simulations.

  9. Parallel OSEM Reconstruction Algorithm for Fully 3-D SPECT on a Beowulf Cluster.

    Science.gov (United States)

    Rong, Zhou; Tianyu, Ma; Yongjie, Jin

    2005-01-01

    In order to improve the computation speed of ordered subset expectation maximization (OSEM) algorithm for fully 3-D single photon emission computed tomography (SPECT) reconstruction, an experimental beowulf-type cluster was built and several parallel reconstruction schemes were described. We implemented a single-program-multiple-data (SPMD) parallel 3-D OSEM reconstruction algorithm based on message passing interface (MPI) and tested it with combinations of different number of calculating processors and different size of voxel grid in reconstruction (64×64×64 and 128×128×128). Performance of parallelization was evaluated in terms of the speedup factor and parallel efficiency. This parallel implementation methodology is expected to be helpful to make fully 3-D OSEM algorithms more feasible in clinical SPECT studies.

  10. Parallel tempering and 3D spin glass models

    Science.gov (United States)

    Papakonstantinou, T.; Malakis, A.

    2014-03-01

    We review parallel tempering schemes and examine their main ingredients for accuracy and efficiency. We discuss two selection methods of temperatures and some alternatives for the exchange of replicas, including all-pair exchange methods. We measure specific heat errors and round-trip efficiency using the two-dimensional (2D) Ising model, and also test the efficiency for the ground state production in 3D spin glass models. We find that the optimization of the GS problem is highly influenced by the choice of the temperature range of the PT process. Finally, we present numerical evidence concerning the universality aspects of an anisotropic case of the 3D spin-glass model.

  11. Massive parallel 3D PIC simulation of negative ion extraction

    Science.gov (United States)

    Revel, Adrien; Mochalskyy, Serhiy; Montellano, Ivar Mauricio; Wünderlich, Dirk; Fantz, Ursel; Minea, Tiberiu

    2017-09-01

    The 3D PIC-MCC code ONIX is dedicated to modeling Negative hydrogen/deuterium Ion (NI) extraction and co-extraction of electrons from radio-frequency driven, low pressure plasma sources. It provides valuable insight on the complex phenomena involved in the extraction process. In previous calculations, a mesh size larger than the Debye length was used, implying numerical electron heating. Important steps have been achieved in terms of computation performance and parallelization efficiency allowing successful massive parallel calculations (4096 cores), imperative to resolve the Debye length. In addition, the numerical algorithms have been improved in terms of grid treatment, i.e., the electric field near the complex geometry boundaries (plasma grid) is calculated more accurately. The revised model preserves the full 3D treatment, but can take advantage of a highly refined mesh. ONIX was used to investigate the role of the mesh size, the re-injection scheme for lost particles (extracted or wall absorbed), and the electron thermalization process on the calculated extracted current and plasma characteristics. It is demonstrated that all numerical schemes give the same NI current distribution for extracted ions. Concerning the electrons, the pair-injection technique is found well-adapted to simulate the sheath in front of the plasma grid.

  12. 3D data denoising via Nonlocal Means filter by using parallel GPU strategies.

    Science.gov (United States)

    Cuomo, Salvatore; De Michele, Pasquale; Piccialli, Francesco

    2014-01-01

    Nonlocal Means (NLM) algorithm is widely considered as a state-of-the-art denoising filter in many research fields. Its high computational complexity leads researchers to the development of parallel programming approaches and the use of massively parallel architectures such as the GPUs. In the recent years, the GPU devices had led to achieving reasonable running times by filtering, slice-by-slice, and 3D datasets with a 2D NLM algorithm. In our approach we design and implement a fully 3D NonLocal Means parallel approach, adopting different algorithm mapping strategies on GPU architecture and multi-GPU framework, in order to demonstrate its high applicability and scalability. The experimental results we obtained encourage the usability of our approach in a large spectrum of applicative scenarios such as magnetic resonance imaging (MRI) or video sequence denoising.

  13. Glnemo2: Interactive Visualization 3D Program

    Science.gov (United States)

    Lambert, Jean-Charles

    2011-10-01

    Glnemo2 is an interactive 3D visualization program developed in C++ using the OpenGL library and Nokia QT 4.X API. It displays in 3D the particles positions of the different components of an nbody snapshot. It quickly gives a lot of information about the data (shape, density area, formation of structures such as spirals, bars, or peanuts). It allows for in/out zooms, rotations, changes of scale, translations, selection of different groups of particles and plots in different blending colors. It can color particles according to their density or temperature, play with the density threshold, trace orbits, display different time steps, take automatic screenshots to make movies, select particles using the mouse, and fly over a simulation using a given camera path. All these features are accessible from a very intuitive graphic user interface. Glnemo2 supports a wide range of input file formats (Nemo, Gadget 1 and 2, phiGrape, Ramses, list of files, realtime gyrfalcON simulation) which are automatically detected at loading time without user intervention. Glnemo2 uses a plugin mechanism to load the data, so that it is easy to add a new file reader. It's powered by a 3D engine which uses the latest OpenGL technology, such as shaders (glsl), vertex buffer object, frame buffer object, and takes in account the power of the graphic card used in order to accelerate the rendering. With a fast GPU, millions of particles can be rendered in real time. Glnemo2 runs on Linux, Windows (using minGW compiler), and MaxOSX, thanks to the QT4API.

  14. DPGL: The Direct3D9-based Parallel Graphics Library for Multi-display Environment

    Institute of Scientific and Technical Information of China (English)

    Zhen Liu; Jiao-Ying Shi

    2007-01-01

    The emergence of high performance 3D graphics cards has opened the way to PC clusters for high performance multidisplay environment. In order to exploit the rendering ability of PC clusters, we should design appropriate parallel rendering algorithms and parallel graphics library interfaces. Due to the rapid development of Direct3D, we bring forward DPGL, the Direct3D9-based parallel graphics library in D3DPR parallel rendering system, which implements Direct3D9 interfaces to support existing Direct3D9 application parallelization with no modification. Based on the parallelism analysis of Direct3D9 rendering pipeline, we briefly introduce D3DPR parallel rendering system. DPGL is the fundamental component of D3DPR. After presenting DPGL three layers architecture,we discuss the rendering resource interception and management. Finally, we describe the design and implementation of DPGL in detail,including rendering command interception layer, rendering command interpretation layer and rendering resource parallelization layer.

  15. On the Parallel Design and Analysis for 3-D ADI Telegraph Problem with MPI

    Directory of Open Access Journals (Sweden)

    Simon Uzezi Ewedafe

    2014-05-01

    Full Text Available In this paper we describe the 3-D Telegraph Equation (3-DTEL with the use of Alternating Direction Implicit (ADI method on Geranium Cadcam Cluster (GCC with Message Passing Interface (MPI parallel software. The algorithm is presented by the use of Single Program Multiple Data (SPMD technique. The implementation is discussed by means of Parallel Design and Analysis with the use of Domain Decomposition (DD strategy. The 3-DTEL with ADI scheme is implemented on the GCC cluster, with an objective to evaluate the overhead it introduces, with ability to exploit the inherent parallelism of the computation. Results of the parallel experiments are presented. The Speedup and Efficiency from the experiments on different block sizes agree with the theoretical analysis.

  16. Study of improved ray tracing parallel algorithm for CGH of 3D objects on GPU

    Science.gov (United States)

    Cong, Bin; Jiang, Xiaoyu; Yao, Jun; Zhao, Kai

    2014-11-01

    An improved parallel algorithm for holograms of three-dimensional objects was presented. According to the physical characteristics and mathematical properties of the original ray tracing algorithm for computer generated holograms (CGH), using transform approximation and numerical analysis methods, we extract parts of ray tracing algorithm which satisfy parallelization features and implement them on graphics processing unit (GPU). Meanwhile, through proper design of parallel numerical procedure, we did parallel programming to the two-dimensional slices of three-dimensional object with CUDA. According to the experiments, an effective method of dealing with occlusion problem in ray tracing is proposed, as well as generating the holograms of 3D objects with additive property. Our results indicate that the improved algorithm can effectively shorten the computing time. Due to the different sizes of spatial object points and hologram pixels, the speed has increased 20 to 70 times comparing with original ray tracing algorithm.

  17. Developing Parallel Programs

    Directory of Open Access Journals (Sweden)

    Ranjan Sen

    2012-09-01

    Full Text Available Parallel programming is an extension of sequential programming; today, it is becoming the mainstream paradigm in day-to-day information processing. Its aim is to build the fastest programs on parallel computers. The methodologies for developing a parallelprogram can be put into integrated frameworks. Development focuses on algorithm, languages, and how the program is deployed on the parallel computer.

  18. Introduction to parallel programming

    CERN Document Server

    Brawer, Steven

    1989-01-01

    Introduction to Parallel Programming focuses on the techniques, processes, methodologies, and approaches involved in parallel programming. The book first offers information on Fortran, hardware and operating system models, and processes, shared memory, and simple parallel programs. Discussions focus on processes and processors, joining processes, shared memory, time-sharing with multiple processors, hardware, loops, passing arguments in function/subroutine calls, program structure, and arithmetic expressions. The text then elaborates on basic parallel programming techniques, barriers and race

  19. Parallel Simulation of 3-D Turbulent Flow Through Hydraulic Machinery

    Institute of Scientific and Technical Information of China (English)

    徐宇; 吴玉林

    2003-01-01

    Parallel calculational methods were used to analyze incompressible turbulent flow through hydraulic machinery. Two parallel methods were used to simulate the complex flow field. The space decomposition method divides the computational domain into several sub-ranges. Parallel discrete event simulation divides the whole task into several parts according to their functions. The simulation results were compared with the serial simulation results and particle image velocimetry (PIV) experimental results. The results give the distribution and configuration of the complex vortices and illustrate the effectiveness of the parallel algorithms for numerical simulation of turbulent flows.

  20. Parallel Hall effect from 3D single-component metamaterials

    CERN Document Server

    Kern, Christian; Wegener, Martin

    2015-01-01

    We propose a class of three-dimensional metamaterial architectures composed of a single doped semiconductor (e.g., n-Si) in air or vacuum that lead to unusual effective behavior of the classical Hall effect. Using an anisotropic structure, we numerically demonstrate a Hall voltage that is parallel---rather than orthogonal---to the external static magnetic-field vector ("parallel Hall effect"). The sign of this parallel Hall voltage can be determined by a structure parameter. Together with the previously demonstrated positive or negative orthogonal Hall voltage, we demonstrate four different sign combinations

  1. Parallel magnetohydrodynamics on the Cray T3D

    NARCIS (Netherlands)

    Meijer, P. M.; Poedts, S.; Goedbloed, J. P.

    1996-01-01

    The equations of magnetohydrodynamics (MHD) are discussed in the framework of parallel computing. Both linear and nonlinear MHD models are addressed. Special attention is given to the parallellisation of the kernels of the existing sequential MHD codes. These kernels involve matrix-vector multiplica

  2. A Parallel Sweeping Preconditioner for Heterogeneous 3D Helmholtz Equations

    KAUST Repository

    Poulson, Jack

    2013-05-02

    A parallelization of a sweeping preconditioner for three-dimensional Helmholtz equations without large cavities is introduced and benchmarked for several challenging velocity models. The setup and application costs of the sequential preconditioner are shown to be O(γ2N4/3) and O(γN logN), where γ(ω) denotes the modestly frequency-dependent number of grid points per perfectly matched layer. Several computational and memory improvements are introduced relative to using black-box sparse-direct solvers for the auxiliary problems, and competitive runtimes and iteration counts are reported for high-frequency problems distributed over thousands of cores. Two open-source packages are released along with this paper: Parallel Sweeping Preconditioner (PSP) and the underlying distributed multifrontal solver, Clique. © 2013 Society for Industrial and Applied Mathematics.

  3. Parallel deterministic neutronics with AMR in 3D

    Energy Technology Data Exchange (ETDEWEB)

    Clouse, C.; Ferguson, J.; Hendrickson, C. [Lawrence Livermore National Lab., CA (United States)

    1997-12-31

    AMTRAN, a three dimensional Sn neutronics code with adaptive mesh refinement (AMR) has been parallelized over spatial domains and energy groups and runs on the Meiko CS-2 with MPI message passing. Block refined AMR is used with linear finite element representations for the fluxes, which allows for a straight forward interpretation of fluxes at block interfaces with zoning differences. The load balancing algorithm assumes 8 spatial domains, which minimizes idle time among processors.

  4. A STUDY ON USING 3D VISUALIZATION AND SIMULATION PROGRAM (OPTITEX 3D ON LEATHER APPAREL

    Directory of Open Access Journals (Sweden)

    Ork Nilay

    2016-05-01

    Full Text Available Leather is a luxury garment. Design, material, labor, fitting and time costs are very effective on the production cost of the consumer leather good. 3D visualization and simulation programs which are getting popular in textile industry can be used for material, labor and time saving in leather apparel. However these programs have a very limited use in leather industry because leather material databases are not sufficient as in textile industry. In this research, firstly material properties of leather and textile fabric were determined by using both textile and leather physical test methods, and interpreted and introduced in the program. Detailed measures of an experimental human body were measured from a 3D body scanner. An avatar was designed according to these measurements. Then a prototype dress was made by using Computer Aided Design-CAD program for designing the patterns. After the pattern making, OptiTex 3D visualization and simulation program was used to visualize and simulate the dresses. Additionally the leather and cotton fabric dresses were sewn in real life. Then the visual and real life dresses were compared and discussed. 3D virtual prototyping seems a promising potential in future manufacturing technologies by evaluating the fitting of garments in a simple and quick way, filling the gap between 3D pattern design and manufacturing, providing virtual demonstrations to customers.

  5. Time efficient 3-D electromagnetic modeling on massively parallel computers

    Energy Technology Data Exchange (ETDEWEB)

    Alumbaugh, D.L.; Newman, G.A.

    1995-08-01

    A numerical modeling algorithm has been developed to simulate the electromagnetic response of a three dimensional earth to a dipole source for frequencies ranging from 100 to 100MHz. The numerical problem is formulated in terms of a frequency domain--modified vector Helmholtz equation for the scattered electric fields. The resulting differential equation is approximated using a staggered finite difference grid which results in a linear system of equations for which the matrix is sparse and complex symmetric. The system of equations is solved using a preconditioned quasi-minimum-residual method. Dirichlet boundary conditions are employed at the edges of the mesh by setting the tangential electric fields equal to zero. At frequencies less than 1MHz, normal grid stretching is employed to mitigate unwanted reflections off the grid boundaries. For frequencies greater than this, absorbing boundary conditions must be employed by making the stretching parameters of the modified vector Helmholtz equation complex which introduces loss at the boundaries. To allow for faster calculation of realistic models, the original serial version of the code has been modified to run on a massively parallel architecture. This modification involves three distinct tasks; (1) mapping the finite difference stencil to a processor stencil which allows for the necessary information to be exchanged between processors that contain adjacent nodes in the model, (2) determining the most efficient method to input the model which is accomplished by dividing the input into ``global`` and ``local`` data and then reading the two sets in differently, and (3) deciding how to output the data which is an inherently nonparallel process.

  6. Parallel processing for efficient 3D slope stability modelling

    Science.gov (United States)

    Marchesini, Ivan; Mergili, Martin; Alvioli, Massimiliano; Metz, Markus; Schneider-Muntau, Barbara; Rossi, Mauro; Guzzetti, Fausto

    2014-05-01

    We test the performance of the GIS-based, three-dimensional slope stability model r.slope.stability. The model was developed as a C- and python-based raster module of the GRASS GIS software. It considers the three-dimensional geometry of the sliding surface, adopting a modification of the model proposed by Hovland (1977), and revised and extended by Xie and co-workers (2006). Given a terrain elevation map and a set of relevant thematic layers, the model evaluates the stability of slopes for a large number of randomly selected potential slip surfaces, ellipsoidal or truncated in shape. Any single raster cell may be intersected by multiple sliding surfaces, each associated with a value of the factor of safety, FS. For each pixel, the minimum value of FS and the depth of the associated slip surface are stored. This information is used to obtain a spatial overview of the potentially unstable slopes in the study area. We test the model in the Collazzone area, Umbria, central Italy, an area known to be susceptible to landslides of different type and size. Availability of a comprehensive and detailed landslide inventory map allowed for a critical evaluation of the model results. The r.slope.stability code automatically splits the study area into a defined number of tiles, with proper overlap in order to provide the same statistical significance for the entire study area. The tiles are then processed in parallel by a given number of processors, exploiting a multi-purpose computing environment at CNR IRPI, Perugia. The map of the FS is obtained collecting the individual results, taking the minimum values on the overlapping cells. This procedure significantly reduces the processing time. We show how the gain in terms of processing time depends on the tile dimensions and on the number of cores.

  7. Programming structure into 3D nanomaterials

    Directory of Open Access Journals (Sweden)

    Dara Van Gough

    2009-06-01

    Full Text Available Programming three dimensional nanostructures into materials is becoming increasingly important given the need for ever more highly functional solids. Applications for materials with complex programmed structures include solar energy harvesting, energy storage, molecular separation, sensors, pharmaceutical agent delivery, nanoreactors and advanced optical devices. Here we discuss examples of molecular and optical routes to program the structure of three-dimensional nanomaterials with exquisite control over nanomorphology and the resultant properties and conclude with a discussion of the opportunities and challenges of such an approach.

  8. Parallel programming with PCN

    Energy Technology Data Exchange (ETDEWEB)

    Foster, I.; Tuecke, S.

    1991-12-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. In includes both tutorial and reference material. It also presents the basic concepts that underly PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous FTP from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (c.f. Appendix A).

  9. Parallel Programming Paradigms

    Science.gov (United States)

    1987-07-01

    GOVT ACCESSION NO. 3. RECIPIENT’S CATALOG NUMBER 4, TITL.: td Subtitle) S. TYPE OF REPORT & PERIOD COVERED Parallel Programming Paradigms...studied. 0A ITI is Jt, t’i- StCUI-eASSIICATION OFvrHIS PAGFrm".n Def. £ntered, Parallel Programming Paradigms Philip Arne Nelson Department of Computer...8416878 and by the Office of Naval Research Contracts No. N00014-86-K-0264 and No. N00014-85- K-0328. 8 ?~~ O .G 1 49 II Parallel Programming Paradigms

  10. ERROR ANALYSIS OF 3D DETECTING SYSTEM BASED ON WHOLE-FIELD PARALLEL CONFOCAL MICROSCOPE

    Institute of Scientific and Technical Information of China (English)

    Wang Yonghong; Yu Xiaofen

    2005-01-01

    Compared with the traditional scanning confocal microscopy, the effect of various factors on characteristic in multi-beam parallel confocal system is discussed, the error factors in multi-beam parallel confocal system are analyzed. The factors influencing the characteristics of the multi-beam parallel confocal system are discussed. The construction and working principle of the non-scanning 3D detecting system is introduced, and some experiment results prove the effect of various factors on the detecting system.

  11. Parallel programming with PCN

    Energy Technology Data Exchange (ETDEWEB)

    Foster, I.; Tuecke, S.

    1993-01-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and Cthat allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous ftp from Argonne National Laboratory in the directory pub/pcn at info.mcs. ani.gov (cf. Appendix A). This version of this document describes PCN version 2.0, a major revision of the PCN programming system. It supersedes earlier versions of this report.

  12. Parallel Finite Element Solution of 3D Rayleigh-Benard-Marangoni Flows

    Science.gov (United States)

    Carey, G. F.; McLay, R.; Bicken, G.; Barth, B.; Pehlivanov, A.

    1999-01-01

    A domain decomposition strategy and parallel gradient-type iterative solution scheme have been developed and implemented for computation of complex 3D viscous flow problems involving heat transfer and surface tension effects. Details of the implementation issues are described together with associated performance and scalability studies. Representative Rayleigh-Benard and microgravity Marangoni flow calculations and performance results on the Cray T3D and T3E are presented. The work is currently being extended to tightly-coupled parallel "Beowulf-type" PC clusters and we present some preliminary performance results on this platform. We also describe progress on related work on hierarchic data extraction for visualization.

  13. Parallel Programming with Intel Parallel Studio XE

    CERN Document Server

    Blair-Chappell , Stephen

    2012-01-01

    Optimize code for multi-core processors with Intel's Parallel Studio Parallel programming is rapidly becoming a "must-know" skill for developers. Yet, where to start? This teach-yourself tutorial is an ideal starting point for developers who already know Windows C and C++ and are eager to add parallelism to their code. With a focus on applying tools, techniques, and language extensions to implement parallelism, this essential resource teaches you how to write programs for multicore and leverage the power of multicore in your programs. Sharing hands-on case studies and real-world examples, the

  14. Reconstruction for time-domain in vivo EPR 3D multigradient oximetric imaging--a parallel processing perspective.

    Science.gov (United States)

    Dharmaraj, Christopher D; Thadikonda, Kishan; Fletcher, Anthony R; Doan, Phuc N; Devasahayam, Nallathamby; Matsumoto, Shingo; Johnson, Calvin A; Cook, John A; Mitchell, James B; Subramanian, Sankaran; Krishna, Murali C

    2009-01-01

    Three-dimensional Oximetric Electron Paramagnetic Resonance Imaging using the Single Point Imaging modality generates unpaired spin density and oxygen images that can readily distinguish between normal and tumor tissues in small animals. It is also possible with fast imaging to track the changes in tissue oxygenation in response to the oxygen content in the breathing air. However, this involves dealing with gigabytes of data for each 3D oximetric imaging experiment involving digital band pass filtering and background noise subtraction, followed by 3D Fourier reconstruction. This process is rather slow in a conventional uniprocessor system. This paper presents a parallelization framework using OpenMP runtime support and parallel MATLAB to execute such computationally intensive programs. The Intel compiler is used to develop a parallel C++ code based on OpenMP. The code is executed on four Dual-Core AMD Opteron shared memory processors, to reduce the computational burden of the filtration task significantly. The results show that the parallel code for filtration has achieved a speed up factor of 46.66 as against the equivalent serial MATLAB code. In addition, a parallel MATLAB code has been developed to perform 3D Fourier reconstruction. Speedup factors of 4.57 and 4.25 have been achieved during the reconstruction process and oximetry computation, for a data set with 23 x 23 x 23 gradient steps. The execution time has been computed for both the serial and parallel implementations using different dimensions of the data and presented for comparison. The reported system has been designed to be easily accessible even from low-cost personal computers through local internet (NIHnet). The experimental results demonstrate that the parallel computing provides a source of high computational power to obtain biophysical parameters from 3D EPR oximetric imaging, almost in real-time.

  15. Reconstruction for Time-Domain In Vivo EPR 3D Multigradient Oximetric Imaging—A Parallel Processing Perspective

    Directory of Open Access Journals (Sweden)

    Christopher D. Dharmaraj

    2009-01-01

    Full Text Available Three-dimensional Oximetric Electron Paramagnetic Resonance Imaging using the Single Point Imaging modality generates unpaired spin density and oxygen images that can readily distinguish between normal and tumor tissues in small animals. It is also possible with fast imaging to track the changes in tissue oxygenation in response to the oxygen content in the breathing air. However, this involves dealing with gigabytes of data for each 3D oximetric imaging experiment involving digital band pass filtering and background noise subtraction, followed by 3D Fourier reconstruction. This process is rather slow in a conventional uniprocessor system. This paper presents a parallelization framework using OpenMP runtime support and parallel MATLAB to execute such computationally intensive programs. The Intel compiler is used to develop a parallel C++ code based on OpenMP. The code is executed on four Dual-Core AMD Opteron shared memory processors, to reduce the computational burden of the filtration task significantly. The results show that the parallel code for filtration has achieved a speed up factor of 46.66 as against the equivalent serial MATLAB code. In addition, a parallel MATLAB code has been developed to perform 3D Fourier reconstruction. Speedup factors of 4.57 and 4.25 have been achieved during the reconstruction process and oximetry computation, for a data set with 23×23×23 gradient steps. The execution time has been computed for both the serial and parallel implementations using different dimensions of the data and presented for comparison. The reported system has been designed to be easily accessible even from low-cost personal computers through local internet (NIHnet. The experimental results demonstrate that the parallel computing provides a source of high computational power to obtain biophysical parameters from 3D EPR oximetric imaging, almost in real-time.

  16. VPython: Writing Real-time 3D Physics Programs

    Science.gov (United States)

    Chabay, Ruth

    2001-06-01

    VPython (http://cil.andrew.cmu.edu/projects/visual) combines the Python programming language with an innovative 3D graphics module called Visual, developed by David Scherer. Designed to make 3D physics simulations accessible to novice programmers, VPython allows the programmer to write a purely computational program without any graphics code, and produces an interactive realtime 3D graphical display. In a program 3D objects are created and their positions modified by computational algorithms. Running in a separate thread, the Visual module monitors the positions of these objects and renders them many times per second. Using the mouse, one can zoom and rotate to navigate through the scene. After one hour of instruction, students in an introductory physics course at Carnegie Mellon University, including those who have never programmed before, write programs in VPython to model the behavior of physical systems and to visualize fields in 3D. The Numeric array processing module allows the construction of more sophisticated simulations and models as well. VPython is free and open source. The Visual module is based on OpenGL, and runs on Windows, Linux, and Macintosh.

  17. IM3D: A parallel Monte Carlo code for efficient simulations of primary radiation displacements and damage in 3D geometry

    National Research Council Canada - National Science Library

    Li, Yong Gang; Yang, Yang; Short, Michael P; Ding, Ze Jun; Zeng, Zhi; Li, Ju

    2015-01-01

    ... and > 10(4) times faster using parallel computation. For 3D problems, it provides a fast approach for analyzing the spatial distributions of primary displacements and defect generation under ion irradiation...

  18. RELAP5-3D Developer Guidelines and Programming Practices

    Energy Technology Data Exchange (ETDEWEB)

    Dr. George L Mesina

    2014-03-01

    Our ultimate goal is to create and maintain RELAP5-3D as the best software tool available to analyze nuclear power plants. This begins with writing excellent programming and requires thorough testing. This document covers development of RELAP5-3D software, the behavior of the RELAP5-3D program that must be maintained, and code testing. RELAP5-3D must perform in a manner consistent with previous code versions with backward compatibility for the sake of the users. Thus file operations, code termination, input and output must remain consistent in form and content while adding appropriate new files, input and output as new features are developed. As computer hardware, operating systems, and other software change, RELAP5-3D must adapt and maintain performance. The code must be thoroughly tested to ensure that it continues to perform robustly on the supported platforms. The coding must be written in a consistent manner that makes the program easy to read to reduce the time and cost of development, maintenance and error resolution. The programming guidelines presented her are intended to institutionalize a consistent way of writing FORTRAN code for the RELAP5-3D computer program that will minimize errors and rework. A common format and organization of program units creates a unifying look and feel to the code. This in turn increases readability and reduces time required for maintenance, development and debugging. It also aids new programmers in reading and understanding the program. Therefore, when undertaking development of the RELAP5-3D computer program, the programmer must write computer code that follows these guidelines. This set of programming guidelines creates a framework of good programming practices, such as initialization, structured programming, and vector-friendly coding. It sets out formatting rules for lines of code, such as indentation, capitalization, spacing, etc. It creates limits on program units, such as subprograms, functions, and modules. It

  19. 3 - D Animation as Applied to the Solving of Coupling Relations in the 6 - DOF Parallel Robot

    Institute of Scientific and Technical Information of China (English)

    XU Lei-lin; WANG Li-rong; LI En-guang

    2002-01-01

    How to solve the coupling relations in a 6 - DOF parallel robot quickly and accurately within the limits of realtime control is a critical problem. In traditional analytic method, the complicated mathemtical model must first be constructed and then solved by programming.Obviously, this method is not very practical. This paper,therefore, proposes a new way of approach with a new method using 3- D animation for the solving of coupling relations in the 6 - DOF parallel robot. This method is much simpler and its solving accuracy approaches that of the more complicated analytic method.

  20. Parallel programming with MPI

    Energy Technology Data Exchange (ETDEWEB)

    Tatebe, Osamu [Electrotechnical Lab., Tsukuba, Ibaraki (Japan)

    1998-03-01

    MPI is a practical, portable, efficient and flexible standard for message passing, which has been implemented on most MPPs and network of workstations by machine vendors, universities and national laboratories. MPI avoids specifying how operations will take place and superfluous work to achieve efficiency as well as portability, and is also designed to encourage overlapping communication and computation to hide communication latencies. This presentation briefly explains the MPI standard, and comments on efficient parallel programming to improve performance. (author)

  1. 2D/3D Program work summary report

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1995-09-01

    The 2D/3D Program was carried out by Germany, Japan and the United States to investigate the thermal-hydraulics of a PWR large-break LOCA. A contributory approach was utilized in which each country contributed significant effort to the program and all three countries shared the research results. Germany constructed and operated the Upper Plenum Test Facility (UPTF), and Japan constructed and operated the Cylindrical Core Test Facility (CCTF) and the Slab Core Test Facility (SCTF). The US contribution consisted of provision of advanced instrumentation to each of the three test facilities, and assessment of the TRAC computer code against the test results. Evaluations of the test results were carried out in all three countries. This report summarizes the 2D/3D Program in terms of the contributing efforts of the participants, and was prepared in a coordination among three countries. US and Germany have published the report as NUREG/IA-0126 and GRS-100, respectively. (author).

  2. Fast 3D Variable-FOV Reconstruction for Parallel Imaging with Localized Sensitivities

    CERN Document Server

    Can, Yiğit Baran; Çukur, Tolga

    2016-01-01

    Several successful iterative approaches have recently been proposed for parallel-imaging reconstructions of variable-density (VD) acquisitions, but they often induce substantial computational burden for non-Cartesian data. Here we propose a generalized variable-FOV PILS reconstruction 3D VD Cartesian and non-Cartesian data. The proposed method separates k-space into non-intersecting annuli based on sampling density, and sets the 3D reconstruction FOV for each annulus based on the respective sampling density. The variable-FOV method is compared against conventional gridding, PILS, and ESPIRiT reconstructions. Results indicate that the proposed method yields better artifact suppression compared to gridding and PILS, and improves noise conditioning relative to ESPIRiT, enabling fast and high-quality reconstructions of 3D datasets.

  3. CH5M3D: an HTML5 program for creating 3D molecular structures.

    Science.gov (United States)

    Earley, Clarke W

    2013-11-18

    While a number of programs and web-based applications are available for the interactive display of 3-dimensional molecular structures, few of these provide the ability to edit these structures. For this reason, we have developed a library written in JavaScript to allow for the simple creation of web-based applications that should run on any browser capable of rendering HTML5 web pages. While our primary interest in developing this application was for educational use, it may also prove useful to researchers who want a light-weight application for viewing and editing small molecular structures. Molecular compounds are drawn on the HTML5 Canvas element, with the JavaScript code making use of standard techniques to allow display of three-dimensional structures on a two-dimensional canvas. Information about the structure (bond lengths, bond angles, and dihedral angles) can be obtained using a mouse or other pointing device. Both atoms and bonds can be added or deleted, and rotation about bonds is allowed. Routines are provided to read structures either from the web server or from the user's computer, and creation of galleries of structures can be accomplished with only a few lines of code. Documentation and examples are provided to demonstrate how users can access all of the molecular information for creation of web pages with more advanced features. A light-weight (≈ 75 kb) JavaScript library has been made available that allows for the simple creation of web pages containing interactive 3-dimensional molecular structures. Although this library is designed to create web pages, a web server is not required. Installation on a web server is straightforward and does not require any server-side modules or special permissions. The ch5m3d.js library has been released under the GNU GPL version 3 open-source license and is available from http://sourceforge.net/projects/ch5m3d/.

  4. Gust Acoustics Computation with a Space-Time CE/SE Parallel 3D Solver

    Science.gov (United States)

    Wang, X. Y.; Himansu, A.; Chang, S. C.; Jorgenson, P. C. E.; Reddy, D. R. (Technical Monitor)

    2002-01-01

    The benchmark Problem 2 in Category 3 of the Third Computational Aero-Acoustics (CAA) Workshop is solved using the space-time conservation element and solution element (CE/SE) method. This problem concerns the unsteady response of an isolated finite-span swept flat-plate airfoil bounded by two parallel walls to an incident gust. The acoustic field generated by the interaction of the gust with the flat-plate airfoil is computed by solving the 3D (three-dimensional) Euler equations in the time domain using a parallel version of a 3D CE/SE solver. The effect of the gust orientation on the far-field directivity is studied. Numerical solutions are presented and compared with analytical solutions, showing a reasonable agreement.

  5. Stiffness Analysis of 3-d.o.f. Overconstrained Translational Parallel Manipulators

    CERN Document Server

    Pashkevich, Anatoly; Wenger, Philippe

    2008-01-01

    The paper presents a new stiffness modelling method for overconstrained parallel manipulators, which is applied to 3-d.o.f. translational mechanisms. It is based on a multidimensional lumped-parameter model that replaces the link flexibility by localized 6-d.o.f. virtual springs. In contrast to other works, the method includes a FEA-based link stiffness evaluation and employs a new solution strategy of the kinetostatic equations, which allows computing the stiffness matrix for the overconstrained architectures and for the singular manipulator postures. The advantages of the developed technique are confirmed by application examples, which deal with comparative stiffness analysis of two translational parallel manipulators.

  6. An improved parallel SPH approach to solve 3D transient generalized Newtonian free surface flows

    Science.gov (United States)

    Ren, Jinlian; Jiang, Tao; Lu, Weigang; Li, Gang

    2016-08-01

    In this paper, a corrected parallel smoothed particle hydrodynamics (C-SPH) method is proposed to simulate the 3D generalized Newtonian free surface flows with low Reynolds number, especially the 3D viscous jets buckling problems are investigated. The proposed C-SPH method is achieved by coupling an improved SPH method based on the incompressible condition with the traditional SPH (TSPH), that is, the improved SPH with diffusive term and first-order Kernel gradient correction scheme is used in the interior of the fluid domain, and the TSPH is used near the free surface. Thus the C-SPH method possesses the advantages of two methods. Meanwhile, an effective and convenient boundary treatment is presented to deal with 3D multiple-boundary problem, and the MPI parallelization technique with a dynamic cells neighbor particle searching method is considered to improve the computational efficiency. The validity and the merits of the C-SPH are first verified by solving several benchmarks and compared with other results. Then the viscous jet folding/coiling based on the Cross model is simulated by the C-SPH method and compared with other experimental or numerical results. Specially, the influences of macroscopic parameters on the flow are discussed. All the numerical results agree well with available data, and show that the C-SPH method has higher accuracy and better stability for solving 3D moving free surface flows over other particle methods.

  7. Spatial Parallelism of a 3D Finite Difference, Velocity-Stress Elastic Wave Propagation Code

    Energy Technology Data Exchange (ETDEWEB)

    MINKOFF,SUSAN E.

    1999-12-09

    Finite difference methods for solving the wave equation more accurately capture the physics of waves propagating through the earth than asymptotic solution methods. Unfortunately. finite difference simulations for 3D elastic wave propagation are expensive. We model waves in a 3D isotropic elastic earth. The wave equation solution consists of three velocity components and six stresses. The partial derivatives are discretized using 2nd-order in time and 4th-order in space staggered finite difference operators. Staggered schemes allow one to obtain additional accuracy (via centered finite differences) without requiring additional storage. The serial code is most unique in its ability to model a number of different types of seismic sources. The parallel implementation uses the MP1 library, thus allowing for portability between platforms. Spatial parallelism provides a highly efficient strategy for parallelizing finite difference simulations. In this implementation, one can decompose the global problem domain into one-, two-, and three-dimensional processor decompositions with 3D decompositions generally producing the best parallel speed up. Because i/o is handled largely outside of the time-step loop (the most expensive part of the simulation) we have opted for straight-forward broadcast and reduce operations to handle i/o. The majority of the communication in the code consists of passing subdomain face information to neighboring processors for use as ''ghost cells''. When this communication is balanced against computation by allocating subdomains of reasonable size, we observe excellent scaled speed up. Allocating subdomains of size 25 x 25 x 25 on each node, we achieve efficiencies of 94% on 128 processors. Numerical examples for both a layered earth model and a homogeneous medium with a high-velocity blocky inclusion illustrate the accuracy of the parallel code.

  8. Spatial parallelism of a 3D finite difference, velocity-stress elastic wave propagation code

    Energy Technology Data Exchange (ETDEWEB)

    Minkoff, S.E.

    1999-12-01

    Finite difference methods for solving the wave equation more accurately capture the physics of waves propagating through the earth than asymptotic solution methods. Unfortunately, finite difference simulations for 3D elastic wave propagation are expensive. The authors model waves in a 3D isotropic elastic earth. The wave equation solution consists of three velocity components and six stresses. The partial derivatives are discretized using 2nd-order in time and 4th-order in space staggered finite difference operators. Staggered schemes allow one to obtain additional accuracy (via centered finite differences) without requiring additional storage. The serial code is most unique in its ability to model a number of different types of seismic sources. The parallel implementation uses the MPI library, thus allowing for portability between platforms. Spatial parallelism provides a highly efficient strategy for parallelizing finite difference simulations. In this implementation, one can decompose the global problem domain into one-, two-, and three-dimensional processor decompositions with 3D decompositions generally producing the best parallel speedup. Because I/O is handled largely outside of the time-step loop (the most expensive part of the simulation) the authors have opted for straight-forward broadcast and reduce operations to handle I/O. The majority of the communication in the code consists of passing subdomain face information to neighboring processors for use as ghost cells. When this communication is balanced against computation by allocating subdomains of reasonable size, they observe excellent scaled speedup. Allocating subdomains of size 25 x 25 x 25 on each node, they achieve efficiencies of 94% on 128 processors. Numerical examples for both a layered earth model and a homogeneous medium with a high-velocity blocky inclusion illustrate the accuracy of the parallel code.

  9. Architectural Adaptability in Parallel Programming

    Science.gov (United States)

    1991-05-01

    I AD-A247 516 Architectural Adaptability in Parallel Programming Lawrence Alan Crowl Technical Report 381 May 1991 92-06322 UNIVERSITY OF ROC R...COMPUTER SCIENCE Best Avai~lable Copy Architectural Adaptability in Parallel Programming by Lawrence Alan Crowl Submitted in Partial Fulfillment of the...in the development of their programs. In applying abstraction to parallel programming , we can use abstractions to represent potential parallelism

  10. 3-D electromagnetic plasma particle simulations on the Intel Delta parallel computer

    Energy Technology Data Exchange (ETDEWEB)

    Wang, J.; Liewer, P.C. [California Inst. of Tech., Pasadena, CA (United States). Jet Propulsion Lab.; Decyk, V.K. [Univ. of California, Los Angeles, CA (United States)

    1994-12-31

    A three-dimensional electromagnetic PIC code has been developed on the 512 node Intel Touchstone Delta MIMD parallel computer. This code is based on the General Concurrent PIC algorithm which uses a domain decomposition to divide the computation among the processors. The 3D simulation domain can be partitioned into 1-, 2-, or 3-dimensional sub-domains. Particles must be exchanged between processors as they move among the subdomains. The Intel Delta allows one to use this code for very-large-scale simulations (i.e. over 10{sup 8} particles and 10{sup 6} grid cells). The parallel efficiency of this code is measured, and the overall code performance on the Delta is compared with that on Cray supercomputers. It is shown that their code runs with a high parallel efficiency of {ge} 95% for large size problems. The particle push time achieved is 115 nsecs/particle/time step for 162 million particles on 512 nodes. Comparing with the performance on a single processor Cray C90, this represents a factor of 58 speedup. The code uses a finite-difference leap frog method for field solve which is significantly more efficient than fast fourier transforms on parallel computers. The performance of this code on the 128 node Cray T3D will also be discussed.

  11. The 3D Elevation Program and America's infrastructure

    Science.gov (United States)

    Lukas, Vicki; Carswell, Jr., William J.

    2016-11-07

    Infrastructure—the physical framework of transportation, energy, communications, water supply, and other systems—and construction management—the overall planning, coordination, and control of a project from beginning to end—are critical to the Nation’s prosperity. The American Society of Civil Engineers has warned that, despite the importance of the Nation’s infrastructure, it is in fair to poor condition and needs sizable and urgent investments to maintain and modernize it, and to ensure that it is sustainable and resilient. Three-dimensional (3D) light detection and ranging (lidar) elevation data provide valuable productivity, safety, and cost-saving benefits to infrastructure improvement projects and associated construction management. By providing data to users, the 3D Elevation Program (3DEP) of the U.S. Geological Survey reduces users’ costs and risks and allows them to concentrate on their mission objectives. 3DEP includes (1) data acquisition partnerships that leverage funding, (2) contracts with experienced private mapping firms, (3) technical expertise, lidar data standards, and specifications, and (4) most important, public access to high-quality 3D elevation data. The size and breadth of improvements for the Nation’s infrastructure and construction management needs call for an efficient, systematic approach to acquiring foundational 3D elevation data. The 3DEP approach to national data coverage will yield large cost savings over individual project-by-project acquisitions and will ensure that data are accessible for other critical applications.

  12. CH5M3D: an HTML5 program for creating 3D molecular structures

    Science.gov (United States)

    2013-01-01

    Background While a number of programs and web-based applications are available for the interactive display of 3-dimensional molecular structures, few of these provide the ability to edit these structures. For this reason, we have developed a library written in JavaScript to allow for the simple creation of web-based applications that should run on any browser capable of rendering HTML5 web pages. While our primary interest in developing this application was for educational use, it may also prove useful to researchers who want a light-weight application for viewing and editing small molecular structures. Results Molecular compounds are drawn on the HTML5 Canvas element, with the JavaScript code making use of standard techniques to allow display of three-dimensional structures on a two-dimensional canvas. Information about the structure (bond lengths, bond angles, and dihedral angles) can be obtained using a mouse or other pointing device. Both atoms and bonds can be added or deleted, and rotation about bonds is allowed. Routines are provided to read structures either from the web server or from the user’s computer, and creation of galleries of structures can be accomplished with only a few lines of code. Documentation and examples are provided to demonstrate how users can access all of the molecular information for creation of web pages with more advanced features. Conclusions A light-weight (≈ 75 kb) JavaScript library has been made available that allows for the simple creation of web pages containing interactive 3-dimensional molecular structures. Although this library is designed to create web pages, a web server is not required. Installation on a web server is straightforward and does not require any server-side modules or special permissions. The ch5m3d.js library has been released under the GNU GPL version 3 open-source license and is available from http://sourceforge.net/projects/ch5m3d/. PMID:24246004

  13. Performance analysis of high quality parallel preconditioners applied to 3D finite element structural analysis

    Energy Technology Data Exchange (ETDEWEB)

    Kolotilina, L.; Nikishin, A.; Yeremin, A. [and others

    1994-12-31

    The solution of large systems of linear equations is a crucial bottleneck when performing 3D finite element analysis of structures. Also, in many cases the reliability and robustness of iterative solution strategies, and their efficiency when exploiting hardware resources, fully determine the scope of industrial applications which can be solved on a particular computer platform. This is especially true for modern vector/parallel supercomputers with large vector length and for modern massively parallel supercomputers. Preconditioned iterative methods have been successfully applied to industrial class finite element analysis of structures. The construction and application of high quality preconditioners constitutes a high percentage of the total solution time. Parallel implementation of high quality preconditioners on such architectures is a formidable challenge. Two common types of existing preconditioners are the implicit preconditioners and the explicit preconditioners. The implicit preconditioners (e.g. incomplete factorizations of several types) are generally high quality but require solution of lower and upper triangular systems of equations per iteration which are difficult to parallelize without deteriorating the convergence rate. The explicit type of preconditionings (e.g. polynomial preconditioners or Jacobi-like preconditioners) require sparse matrix-vector multiplications and can be parallelized but their preconditioning qualities are less than desirable. The authors present results of numerical experiments with Factorized Sparse Approximate Inverses (FSAI) for symmetric positive definite linear systems. These are high quality preconditioners that possess a large resource of parallelism by construction without increasing the serial complexity.

  14. Late gadolinium enhancement cardiac imaging on a 3T scanner with parallel RF transmission technique: prospective comparison of 3D-PSIR and 3D-IR

    Energy Technology Data Exchange (ETDEWEB)

    Schultz, Anthony [Nouvel Hopital Civil, Strasbourg University Hospital, Radiology Department, Strasbourg Cedex (France); Nouvel Hopital Civil, Service de Radiologie, Strasbourg Cedex (France); Caspar, Thibault [Nouvel Hopital Civil, Strasbourg University Hospital, Cardiology Department, Strasbourg Cedex (France); Schaeffer, Mickael [Nouvel Hopital Civil, Strasbourg University Hospital, Public Health and Biostatistics Department, Strasbourg Cedex (France); Labani, Aissam; Jeung, Mi-Young; El Ghannudi, Soraya; Roy, Catherine [Nouvel Hopital Civil, Strasbourg University Hospital, Radiology Department, Strasbourg Cedex (France); Ohana, Mickael [Nouvel Hopital Civil, Strasbourg University Hospital, Radiology Department, Strasbourg Cedex (France); Universite de Strasbourg / CNRS, UMR 7357, iCube Laboratory, Illkirch (France)

    2016-06-15

    To qualitatively and quantitatively compare different late gadolinium enhancement (LGE) sequences acquired at 3T with a parallel RF transmission technique. One hundred and sixty participants prospectively enrolled underwent a 3T cardiac MRI with 3 different LGE sequences: 3D Phase-Sensitive Inversion-Recovery (3D-PSIR) acquired 5 minutes after injection, 3D Inversion-Recovery (3D-IR) at 9 minutes and 3D-PSIR at 13 minutes. All LGE-positive patients were qualitatively evaluated both independently and blindly by two radiologists using a 4-level scale, and quantitatively assessed with measurement of contrast-to-noise ratio and LGE maximal surface. Statistical analyses were calculated under a Bayesian paradigm using MCMC methods. Fifty patients (70 % men, 56yo ± 19) exhibited LGE (62 % were post-ischemic, 30 % related to cardiomyopathy and 8 % post-myocarditis). Early and late 3D-PSIR were superior to 3D-IR sequences (global quality, estimated coefficient IR > early-PSIR: -2.37 CI = [-3.46; -1.38], prob(coef > 0) = 0 % and late-PSIR > IR: 3.12 CI = [0.62; 4.41], prob(coef > 0) = 100 %), LGE surface estimated coefficient IR > early-PSIR: -0.09 CI = [-1.11; -0.74], prob(coef > 0) = 0 % and late-PSIR > IR: 0.96 CI = [0.77; 1.15], prob(coef > 0) = 100 %. Probabilities for late PSIR being superior to early PSIR concerning global quality and CNR were over 90 %, regardless of the aetiological subgroup. In 3T cardiac MRI acquired with parallel RF transmission technique, 3D-PSIR is qualitatively and quantitatively superior to 3D-IR. (orig.)

  15. Parallel Imaging of 3D Surface Profile with Space-Division Multiplexing

    Directory of Open Access Journals (Sweden)

    Hyung Seok Lee

    2016-01-01

    Full Text Available We have developed a modified optical frequency domain imaging (OFDI system that performs parallel imaging of three-dimensional (3D surface profiles by using the space division multiplexing (SDM method with dual-area swept sourced beams. We have also demonstrated that 3D surface information for two different areas could be well obtained in a same time with only one camera by our method. In this study, double field of views (FOVs of 11.16 mm × 5.92 mm were achieved within 0.5 s. Height range for each FOV was 460 µm and axial and transverse resolutions were 3.6 and 5.52 µm, respectively.

  16. Squire's transformation and 3D Optimal Perturbations in Bounded Parallel Shear Flows

    Science.gov (United States)

    Chomaz, Jean-Marc; Soundar Jerome, J. John

    2011-11-01

    The aim of this short communication is to present the implication of Squire's transformation on the optimal transient growth of arbitrary 3D disturbances in parallel shear flow bounded in the cross-stream direction. To our best knowledge this simple property has never been discussed before. In particular it allows to express the long-time optimal growth for perturbations of arbitrary wavenumbers as the product of the gains from the 2D optimal at a lower Reynolds number itself due to the Orr-mechanism by a term that may be identified as due to the lift-up mechanism. This property predict scalings for the 3D optimal perturbation well verified by direct computation. It may be extended to take into account buoyancy effect.

  17. LB3D: A parallel implementation of the Lattice-Boltzmann method for simulation of interacting amphiphilic fluids

    Science.gov (United States)

    Schmieschek, S.; Shamardin, L.; Frijters, S.; Krüger, T.; Schiller, U. D.; Harting, J.; Coveney, P. V.

    2017-08-01

    We introduce the lattice-Boltzmann code LB3D, version 7.1. Building on a parallel program and supporting tools which have enabled research utilising high performance computing resources for nearly two decades, LB3D version 7 provides a subset of the research code functionality as an open source project. Here, we describe the theoretical basis of the algorithm as well as computational aspects of the implementation. The software package is validated against simulations of meso-phases resulting from self-assembly in ternary fluid mixtures comprising immiscible and amphiphilic components such as water-oil-surfactant systems. The impact of the surfactant species on the dynamics of spinodal decomposition are tested and quantitative measurement of the permeability of a body centred cubic (BCC) model porous medium for a simple binary mixture is described. Single-core performance and scaling behaviour of the code are reported for simulations on current supercomputer architectures.

  18. PARALLEL 3-D SPACE CHARGE CALCULATIONS IN THE UNIFIED ACCELERATOR LIBRARY.

    Energy Technology Data Exchange (ETDEWEB)

    D' IMPERIO, N.L.; LUCCIO, A.U.; MALITSKY, N.

    2006-06-26

    The paper presents the integration of the SIMBAD space charge module in the UAL framework. SIMBAD is a Particle-in-Cell (PIC) code. Its 3-D Parallel approach features an optimized load balancing scheme based on a genetic algorithm. The UAL framework enhances the SIMBAD standalone version with the interactive ROOT-based analysis environment and an open catalog of accelerator algorithms. The composite package addresses complex high intensity beam dynamics and has been developed as part of the FAIR SIS 100 project.

  19. Parallel implementation of 3D FFT with volumetric decomposition schemes for efficient molecular dynamics simulations

    Science.gov (United States)

    Jung, Jaewoon; Kobayashi, Chigusa; Imamura, Toshiyuki; Sugita, Yuji

    2016-03-01

    Three-dimensional Fast Fourier Transform (3D FFT) plays an important role in a wide variety of computer simulations and data analyses, including molecular dynamics (MD) simulations. In this study, we develop hybrid (MPI+OpenMP) parallelization schemes of 3D FFT based on two new volumetric decompositions, mainly for the particle mesh Ewald (PME) calculation in MD simulations. In one scheme, (1d_Alltoall), five all-to-all communications in one dimension are carried out, and in the other, (2d_Alltoall), one two-dimensional all-to-all communication is combined with two all-to-all communications in one dimension. 2d_Alltoall is similar to the conventional volumetric decomposition scheme. We performed benchmark tests of 3D FFT for the systems with different grid sizes using a large number of processors on the K computer in RIKEN AICS. The two schemes show comparable performances, and are better than existing 3D FFTs. The performances of 1d_Alltoall and 2d_Alltoall depend on the supercomputer network system and number of processors in each dimension. There is enough leeway for users to optimize performance for their conditions. In the PME method, short-range real-space interactions as well as long-range reciprocal-space interactions are calculated. Our volumetric decomposition schemes are particularly useful when used in conjunction with the recently developed midpoint cell method for short-range interactions, due to the same decompositions of real and reciprocal spaces. The 1d_Alltoall scheme of 3D FFT takes 4.7 ms to simulate one MD cycle for a virus system containing more than 1 million atoms using 32,768 cores on the K computer.

  20. Status of the 3D Elevation Program, 2015

    Science.gov (United States)

    Sugarbaker, Larry J.; Eldridge, Diane F.; Jason, Allyson L.; Lukas, Vicki; Saghy, David L.; Stoker, Jason M.; Thunen, Diana R.

    2017-01-18

    The 3D Elevation Program (3DEP) is a cooperative activity to collect light detection and ranging (lidar) data for the conterminous United States, Hawaii, and U.S. territories; and interferometric synthetic aperture radar (IfSAR) elevation data for Alaska during an 8-year period. The U.S. Geological Survey (USGS) and partner organizations acquire high-quality three-dimensional elevation data for the United States and its territories that support requirements beyond what could be realized if agencies independently pursued lidar and IfSAR data collection activities. Data collection rates have been increasing as a growing number of State and Federal agencies participate in cooperative data acquisition projects. USGS and partner agencies expanded data collection, completed the initial product delivery systems and implemented changes to the program governance to include a restructuring of the 3DEP working group and formalizing the relationship to the Federal Geographic Data Committee during the final year (2015) of program preparation.

  1. Simulating Growth Kinetics in a Data-Parallel 3D Lattice Photobioreactor

    Directory of Open Access Journals (Sweden)

    A. V. Husselmann

    2013-01-01

    Full Text Available Though there have been many attempts to address growth kinetics in algal photobioreactors, surprisingly little have attempted an agent-based modelling (ABM approach. ABM has been heralded as a method of practical scientific inquiry into systems of a complex nature and has been applied liberally in a range of disciplines including ecology, physics, social science, and microbiology with special emphasis on pathogenic bacterial growth. We bring together agent-based simulation with the Photosynthetic Factory (PSF model, as well as certain key bioreactor characteristics in a visual 3D, parallel computing fashion. Despite being at small scale, the simulation gives excellent visual cues on the dynamics of such a reactor, and we further investigate the model in a variety of ways. Our parallel implementation on graphical processing units of the simulation provides key advantages, which we also briefly discuss. We also provide some performance data, along with particular effort in visualisation, using volumetric and isosurface rendering.

  2. Porting a 3D-model for the transport of reactive air pollutants to the parallel machine T3D

    NARCIS (Netherlands)

    Kessler, C.; Blom, J.G.; Verwer, J.G.

    1995-01-01

    Air pollution forecasting puts a high demand on the memory and the floating point performance of modern computers. For this kind of problems massively parallel computers are very promising, although the software tools and the I/O facilities on those machines are still under-developed. This report de

  3. Assessing the performance of a parallel MATLAB-based 3D convection code

    Science.gov (United States)

    Kirkpatrick, G. J.; Hasenclever, J.; Phipps Morgan, J.; Shi, C.

    2008-12-01

    We are currently building 2D and 3D MATLAB-based parallel finite element codes for mantle convection and melting. The codes use the MATLAB implementation of core MPI commands (eg. Send, Receive, Broadcast) for message passing between computational subdomains. We have found that code development and algorithm testing are much faster in MATLAB than in our previous work coding in C or FORTRAN, this code was built from scratch with only 12 man-months of effort. The one extra cost w.r.t. C coding on a Beowulf cluster is the cost of the parallel MATLAB license for a >4core cluster. Here we present some preliminary results on the efficiency of MPI messaging in MATLAB on a small 4 machine, 16core, 32Gb RAM Intel Q6600 processor-based cluster. Our code implements fully parallelized preconditioned conjugate gradients with a multigrid preconditioner. Our parallel viscous flow solver is currently 20% slower for a 1,000,000 DOF problem on a single core in 2D as the direct solve MILAMIN MATLAB viscous flow solver. We have tested both continuous and discontinuous pressure formulations. We test with various configurations of network hardware, CPU speeds, and memory using our own and MATLAB's built in cluster profiler. So far we have only explored relatively small (up to 1.6GB RAM) test problems. We find that with our current code and Intel memory controller bandwidth limitations we can only get ~2.3 times performance out of 4 cores than 1 core per machine. Even for these small problems the code runs faster with message passing between 4 machines with one core each than 1 machine with 4 cores and internal messaging (1.29x slower), or 1 core (2.15x slower). It surprised us that for 2D ~1GB-sized problems with only 3 multigrid levels, the direct- solve on the coarsest mesh consumes comparable time to the iterative solve on the finest mesh - a penalty that is greatly reduced either by using a 4th multigrid level or by using an iterative solve at the coarsest grid level. We plan to

  4. Computer Assisted Parallel Program Generation

    CERN Document Server

    Kawata, Shigeo

    2015-01-01

    Parallel computation is widely employed in scientific researches, engineering activities and product development. Parallel program writing itself is not always a simple task depending on problems solved. Large-scale scientific computing, huge data analyses and precise visualizations, for example, would require parallel computations, and the parallel computing needs the parallelization techniques. In this Chapter a parallel program generation support is discussed, and a computer-assisted parallel program generation system P-NCAS is introduced. Computer assisted problem solving is one of key methods to promote innovations in science and engineering, and contributes to enrich our society and our life toward a programming-free environment in computing science. Problem solving environments (PSE) research activities had started to enhance the programming power in 1970's. The P-NCAS is one of the PSEs; The PSE concept provides an integrated human-friendly computational software and hardware system to solve a target ...

  5. The 3D Elevation Program initiative: a call for action

    Science.gov (United States)

    Sugarbaker, Larry J.; Constance, Eric W.; Heidemann, Hans Karl; Jason, Allyson L.; Lukas, Vicki; Saghy, David L.; Stoker, Jason M.

    2014-01-01

    The 3D Elevation Program (3DEP) initiative is accelerating the rate of three-dimensional (3D) elevation data collection in response to a call for action to address a wide range of urgent needs nationwide. It began in 2012 with the recommendation to collect (1) high-quality light detection and ranging (lidar) data for the conterminous United States (CONUS), Hawaii, and the U.S. territories and (2) interferometric synthetic aperture radar (ifsar) data for Alaska. Specifications were created for collecting 3D elevation data, and the data management and delivery systems are being modernized. The National Elevation Dataset (NED) will be completely refreshed with new elevation data products and services. The call for action requires broad support from a large partnership community committed to the achievement of national 3D elevation data coverage. The initiative is being led by the U.S. Geological Survey (USGS) and includes many partners—Federal agencies and State, Tribal, and local governments—who will work together to build on existing programs to complete the national collection of 3D elevation data in 8 years. Private sector firms, under contract to the Government, will continue to collect the data and provide essential technology solutions for the Government to manage and deliver these data and services. The 3DEP governance structure includes (1) an executive forum established in May 2013 to have oversight functions and (2) a multiagency coordinating committee based upon the committee structure already in place under the National Digital Elevation Program (NDEP). The 3DEP initiative is based on the results of the National Enhanced Elevation Assessment (NEEA) that was funded by NDEP agencies and completed in 2011. The study, led by the USGS, identified more than 600 requirements for enhanced (3D) elevation data to address mission-critical information requirements of 34 Federal agencies, all 50 States, and a sample of private sector companies and Tribal and local

  6. Introducing ZEUS-MP A 3D, Parallel, Multiphysics Code for Astrophysical Fluid Dynamics

    CERN Document Server

    Norman, M L

    2000-01-01

    We describe ZEUS-MP: a Multi-Physics, Massively-Parallel, Message-Passing code for astrophysical fluid dynamics simulations in 3 dimensions. ZEUS-MP is a follow-on to the sequential ZEUS-2D and ZEUS-3D codes developed and disseminated by the Laboratory for Computational Astrophysics (lca.ncsa.uiuc.edu) at NCSA. V1.0 released 1/1/2000 includes the following physics modules: ideal hydrodynamics, ideal MHD, and self-gravity. Future releases will include flux-limited radiation diffusion, thermal heat conduction, two-temperature plasma, and heating and cooling functions. The covariant equations are cast on a moving Eulerian grid with Cartesian, cylindrical, and spherical polar coordinates currently supported. Parallelization is done by domain decomposition and implemented in F77 and MPI. The code is portable across a wide range of platforms from networks of workstations to massively parallel processors. Some parallel performance results are presented as well as an application to turbulent star formation.

  7. Parallel implementation of 3D protein structure similarity searches using a GPU and the CUDA.

    Science.gov (United States)

    Mrozek, Dariusz; Brożek, Miłosz; Małysiak-Mrozek, Bożena

    2014-02-01

    Searching for similar 3D protein structures is one of the primary processes employed in the field of structural bioinformatics. However, the computational complexity of this process means that it is constantly necessary to search for new methods that can perform such a process faster and more efficiently. Finding molecular substructures that complex protein structures have in common is still a challenging task, especially when entire databases containing tens or even hundreds of thousands of protein structures must be scanned. Graphics processing units (GPUs) and general purpose graphics processing units (GPGPUs) can perform many time-consuming and computationally demanding processes much more quickly than a classical CPU can. In this paper, we describe the GPU-based implementation of the CASSERT algorithm for 3D protein structure similarity searching. This algorithm is based on the two-phase alignment of protein structures when matching fragments of the compared proteins. The GPU (GeForce GTX 560Ti: 384 cores, 2GB RAM) implementation of CASSERT ("GPU-CASSERT") parallelizes both alignment phases and yields an average 180-fold increase in speed over its CPU-based, single-core implementation on an Intel Xeon E5620 (2.40GHz, 4 cores). In this paper, we show that massive parallelization of the 3D structure similarity search process on many-core GPU devices can reduce the execution time of the process, allowing it to be performed in real time. GPU-CASSERT is available at: http://zti.polsl.pl/dmrozek/science/gpucassert/cassert.htm.

  8. Approach of generating parallel programs from parallelized algorithm design strategies

    Institute of Scientific and Technical Information of China (English)

    WAN Jian-yi; LI Xiao-ying

    2008-01-01

    Today, parallel programming is dominated by message passing libraries, such as message passing interface (MPI). This article intends to simplify parallel programming by generating parallel programs from parallelized algorithm design strategies. It uses skeletons to abstract parallelized algorithm design strategies, as well as parallel architectures. Starting from problem specification, an abstract parallel abstract programming language+ (Apla+) program is generated from parallelized algorithm design strategies and problem-specific function definitions. By combining with parallel architectures, implicity of parallelism inside the parallelized algorithm design strategies is exploited. With implementation and transformation, C++ and parallel virtual machine (CPPVM) parallel program is finally generated. Parallelized branch and bound (B&B) algorithm design strategy and parallelized divide and conquer (D & C) algorithm design strategy are studied in this article as examples. And it also illustrates the approach with a case study.

  9. Patterns For Parallel Programming

    CERN Document Server

    Mattson, Timothy G; Massingill, Berna L

    2005-01-01

    From grids and clusters to next-generation game consoles, parallel computing is going mainstream. Innovations such as Hyper-Threading Technology, HyperTransport Technology, and multicore microprocessors from IBM, Intel, and Sun are accelerating the movement's growth. Only one thing is missing: programmers with the skills to meet the soaring demand for parallel software.

  10. The development of laser-plasma interaction program LAP3D on thousands of processors

    Directory of Open Access Journals (Sweden)

    Xiaoyan Hu

    2015-08-01

    Full Text Available Modeling laser-plasma interaction (LPI processes in real-size experiments scale is recognized as a challenging task. For explorering the influence of various instabilities in LPI processes, a three-dimensional laser and plasma code (LAP3D has been developed, which includes filamentation, stimulated Brillouin backscattering (SBS, stimulated Raman backscattering (SRS, non-local heat transport and plasmas flow computation modules. In this program, a second-order upwind scheme is applied to solve the plasma equations which are represented by an Euler fluid model. Operator splitting method is used for solving the equations of the light wave propagation, where the Fast Fourier translation (FFT is applied to compute the diffraction operator and the coordinate translations is used to solve the acoustic wave equation. The coupled terms of the different physics processes are computed by the second-order interpolations algorithm. In order to simulate the LPI processes in massively parallel computers well, several parallel techniques are used, such as the coupled parallel algorithm of FFT and fluid numerical computation, the load balance algorithm, and the data transfer algorithm. Now the phenomena of filamentation, SBS and SRS have been studied in low-density plasma successfully with LAP3D. Scalability of the program is demonstrated with a parallel efficiency above 50% on about ten thousand of processors.

  11. 3-D readout-electronics packaging for high-bandwidth massively paralleled imager

    Science.gov (United States)

    Kwiatkowski, Kris; Lyke, James

    2007-12-18

    Dense, massively parallel signal processing electronics are co-packaged behind associated sensor pixels. Microchips containing a linear or bilinear arrangement of photo-sensors, together with associated complex electronics, are integrated into a simple 3-D structure (a "mirror cube"). An array of photo-sensitive cells are disposed on a stacked CMOS chip's surface at a 45.degree. angle from light reflecting mirror surfaces formed on a neighboring CMOS chip surface. Image processing electronics are held within the stacked CMOS chip layers. Electrical connections couple each of said stacked CMOS chip layers and a distribution grid, the connections for distributing power and signals to components associated with each stacked CSMO chip layer.

  12. A parallel 3-D discrete wavelet transform architecture using pipelined lifting scheme approach for video coding

    Science.gov (United States)

    Hegde, Ganapathi; Vaya, Pukhraj

    2013-10-01

    This article presents a parallel architecture for 3-D discrete wavelet transform (3-DDWT). The proposed design is based on the 1-D pipelined lifting scheme. The architecture is fully scalable beyond the present coherent Daubechies filter bank (9, 7). This 3-DDWT architecture has advantages such as no group of pictures restriction and reduced memory referencing. It offers low power consumption, low latency and high throughput. The computing technique is based on the concept that lifting scheme minimises the storage requirement. The application specific integrated circuit implementation of the proposed architecture is done by synthesising it using 65 nm Taiwan Semiconductor Manufacturing Company standard cell library. It offers a speed of 486 MHz with a power consumption of 2.56 mW. This architecture is suitable for real-time video compression even with large frame dimensions.

  13. Implementation of a 3D mixing layer code on parallel computers

    Energy Technology Data Exchange (ETDEWEB)

    Roe, K.; Thakur, R.; Dang, T.; Bogucz, E. [Syracuse Univ., NY (United States)

    1995-09-01

    This paper summarizes our progress and experience in the development of a Computational-Fluid-Dynamics code on parallel computers to simulate three-dimensional spatially-developing mixing layers. In this initial study, the three-dimensional time-dependent Euler equations are solved using a finite-volume explicit time-marching algorithm. The code was first programmed in Fortran 77 for sequential computers. The code was then converted for use on parallel computers using the conventional message-passing technique, while we have not been able to compile the code with the present version of HPF compilers.

  14. Hybrid Parallel Bundle Adjustment for 3D Scene Reconstruction with Massive Points

    Institute of Scientific and Technical Information of China (English)

    Xin Liu; Wei Gao; Zhan-Yi Hu

    2012-01-01

    Bundle adjustment (BA) is a crucial but time consuming step in 3D reconstruction.In this paper,we intend to tackle a special class of BA problems where the reconstructed 3D points are much more numerous than the camera parameters,called Massive-Points BA (MPBA) problems.This is often the case when high-resolution images are used.We present a design and implementation of a new bundle adjustment algorithm for efficiently solving the MPBA problems.The use of hardware parallelism,the multi-core CPUs as well as GPUs,is explored.By careful memory-usage design,the graphic-memory limitation is effectively alleviated.Several modern acceleration strategies for bundle adjustment,such as the mixed-precision arithmetics,the embedded point iteration,and the preconditioned conjugate gradients,are explored and compared.By using several high-resolution image datasets,we generate a variety of MPBA problems,with which the performance of five bundle adjustment algorithms are evaluated.The experimental results show that our algorithm is up to 40 times faster than classical Sparse Bundle Adjustment,while maintaining comparable precision.

  15. 3D multiphysics modeling of superconducting cavities with a massively parallel simulation suite

    Directory of Open Access Journals (Sweden)

    Oleksiy Kononenko

    2017-10-01

    Full Text Available Radiofrequency cavities based on superconducting technology are widely used in particle accelerators for various applications. The cavities usually have high quality factors and hence narrow bandwidths, so the field stability is sensitive to detuning from the Lorentz force and external loads, including vibrations and helium pressure variations. If not properly controlled, the detuning can result in a serious performance degradation of a superconducting accelerator, so an understanding of the underlying detuning mechanisms can be very helpful. Recent advances in the simulation suite ace3p have enabled realistic multiphysics characterization of such complex accelerator systems on supercomputers. In this paper, we present the new capabilities in ace3p for large-scale 3D multiphysics modeling of superconducting cavities, in particular, a parallel eigensolver for determining mechanical resonances, a parallel harmonic response solver to calculate the response of a cavity to external vibrations, and a numerical procedure to decompose mechanical loads, such as from the Lorentz force or piezoactuators, into the corresponding mechanical modes. These capabilities have been used to do an extensive rf-mechanical analysis of dressed TESLA-type superconducting cavities. The simulation results and their implications for the operational stability of the Linac Coherent Light Source-II are discussed.

  16. 3-D Parallel Simulation Model of Continuous Beam-Electron Cloud Interactions

    CERN Document Server

    Ghalam, Ali F; Decyk, Viktor K; Huang Cheng Kun; Katsouleas, Thomas C; Mori, Warren; Rumolo, Giovanni; Zimmermann, Frank

    2005-01-01

    A 3D Particle-In-Cell model for continuous modeling of beam and electron cloud interaction in a circular accelerator is presented. A simple model for lattice structure, mainly the Quadruple and dipole magnets and chromaticity have been added to a plasma PIC code, QuickPIC, used extensively to model plasma wakefield acceleration concept. The code utilizes parallel processing techniques with domain decomposition in both longitudinal and transverse domains to overcome the massive computational costs of continuously modeling the beam-cloud interaction. Through parallel modeling, we have been able to simulate long-term beam propagation in the presence of electron cloud in many existing and future circular machines around the world. The exact dipole lattice structure has been added to the code and the simulation results for CERN-SPS and LHC with the new lattice structure have been studied. Also the simulation results are compared to the results from the two macro-particle modeling for strong head-tail instability. ...

  17. Parallel 3-d simulations for porous media models in soil mechanics

    Science.gov (United States)

    Wieners, C.; Ammann, M.; Diebels, S.; Ehlers, W.

    Numerical simulations in 3-d for porous media models in soil mechanics are a difficult task for the engineering modelling as well as for the numerical realization. Here, we present a general numerical scheme for the simulation of two-phase models in combination with an material model via the stress response with a specialized parallel saddle point solver. Therefore, we give a brief introduction into the theoretical background of the Theory of Porous Media and constitute a two-phase model consisting of a porous solid skeleton saturated by a viscous pore-fluid. The material behaviour of the skeleton is assumed to be elasto-viscoplastic. The governing equations are transfered to a weak formulation suitable for the application of the finite element method. Introducing an formulation in terms of the stress response, we define a clear interface between the assembling process and the parallel solver modules. We demonstrate the efficiency of this approach by challenging numerical experiments realized on the Linux Cluster in Chemnitz.

  18. Parallel programming with Python

    CERN Document Server

    Palach, Jan

    2014-01-01

    A fast, easy-to-follow and clear tutorial to help you develop Parallel computing systems using Python. Along with explaining the fundamentals, the book will also introduce you to slightly advanced concepts and will help you in implementing these techniques in the real world. If you are an experienced Python programmer and are willing to utilize the available computing resources by parallelizing applications in a simple way, then this book is for you. You are required to have a basic knowledge of Python development to get the most of this book.

  19. 3-D Parallel, Object-Oriented, Hybrid, PIC Code for Ion Ring Studies

    Science.gov (United States)

    Omelchenko, Y. A.

    1997-08-01

    The 3-D hybrid, Particle-in-Cell (PIC) code, FLAME has been developed to study low-frequency, large orbit plasmas in realistic cylindrical configurations. FLAME assumes plasma quasineutrality and solves the Maxwell equations with displacement current neglected. The electron component is modeled as a massless fluid and all ion components are represented by discrete macro-particles. The poloidal discretization is done by a finite-difference staggered grid method. FFT is applied in the azimuthal direction. A substantial reduction of CPU time is achieved by enabling separate time advances of background and beam particle species in the time-averaged fields. The FLAME structure follows the guidelines of object-oriented programming. Its C++ class hierarchy comprises the Utility, Geometry, Particle, Grid and Distributed base class packages. The latter encapsulates implementation of concurrent grid and particle algorithms. The particle and grid data interprocessor communications are unified and designed to be independent of both the underlying message-passing library and the actual poloidal domain decomposition technique (FFT's are local). Load balancing concerns are addressed by using adaptive domain partitions to account for nonuniform spatial distributions of particle objects. The results of 2-D and 3-D FLAME simulations in support of the FIREX program at Cornell are presented.

  20. Practical parallel programming

    CERN Document Server

    Bauer, Barr E

    2014-01-01

    This is the book that will teach programmers to write faster, more efficient code for parallel processors. The reader is introduced to a vast array of procedures and paradigms on which actual coding may be based. Examples and real-life simulations using these devices are presented in C and FORTRAN.

  1. The 3D Elevation Program: summary for Texas

    Science.gov (United States)

    Carswell, William J.

    2013-01-01

    Elevation data are essential to a broad range of applications, including forest resources management, wildlife and habitat management, national security, recreation, and many others. For the State of Texas, elevation data are critical for natural resources conservation; wildfire management, planning, and response; flood risk management; agriculture and precision farming; infrastructure and construction management; water supply and quality; and other business uses. Today, high-quality light detection and ranging (lidar) data are the source for creating elevation models and other elevation datasets. Federal, State, and local agencies work in partnership to (1) replace data, on a national basis, that are (on average) 30 years old and of lower quality and (2) provide coverage where publicly accessible data do not exist. A joint goal of State and Federal partners is to acquire consistent, statewide coverage to support existing and emerging applications enabled by lidar data. The new 3D Elevation Program (3DEP) initiative, managed by the U.S. Geological Survey (USGS), responds to the growing need for high-quality topographic data and a wide range of other three-dimensional representations of the Nation’s natural and constructed features.

  2. The 3D Elevation Program: summary for Virginia

    Science.gov (United States)

    Carswell, William J.

    2013-01-01

    Elevation data are essential to a broad range of applications, including forest resources management, wildlife and habitat management, national security, recreation, and many others. For the Commonwealth of Virginia, elevation data are critical for urban and regional planning, natural resources conservation, flood risk management, agriculture and precision farming, resource mining, infrastructure and construction management, and other business uses. Today, high-quality light detection and ranging (lidar) data are the sources for creating elevation models and other elevation datasets. Federal, State, and local agencies work in partnership to (1) replace data, on a national basis, that are (on average) 30 years old and of lower quality and (2) provide coverage where publicly accessible data do not exist. A joint goal of State and Federal partners is to acquire consistent, statewide coverage to support existing and emerging applications enabled by lidar data. The new 3D Elevation Program (3DEP) initiative, managed by the U.S. Geological Survey (USGS), responds to the growing need for high-quality topographic data and a wide range of other three-dimensional representations of the Nation’s natural and constructed features.

  3. 3D Kirchhoff depth migration algorithm: A new scalable approach for parallelization on multicore CPU based cluster

    Science.gov (United States)

    Rastogi, Richa; Londhe, Ashutosh; Srivastava, Abhishek; Sirasala, Kirannmayi M.; Khonde, Kiran

    2017-03-01

    In this article, a new scalable 3D Kirchhoff depth migration algorithm is presented on state of the art multicore CPU based cluster. Parallelization of 3D Kirchhoff depth migration is challenging due to its high demand of compute time, memory, storage and I/O along with the need of their effective management. The most resource intensive modules of the algorithm are traveltime calculations and migration summation which exhibit an inherent trade off between compute time and other resources. The parallelization strategy of the algorithm largely depends on the storage of calculated traveltimes and its feeding mechanism to the migration process. The presented work is an extension of our previous work, wherein a 3D Kirchhoff depth migration application for multicore CPU based parallel system had been developed. Recently, we have worked on improving parallel performance of this application by re-designing the parallelization approach. The new algorithm is capable to efficiently migrate both prestack and poststack 3D data. It exhibits flexibility for migrating large number of traces within the available node memory and with minimal requirement of storage, I/O and inter-node communication. The resultant application is tested using 3D Overthrust data on PARAM Yuva II, which is a Xeon E5-2670 based multicore CPU cluster with 16 cores/node and 64 GB shared memory. Parallel performance of the algorithm is studied using different numerical experiments and the scalability results show striking improvement over its previous version. An impressive 49.05X speedup with 76.64% efficiency is achieved for 3D prestack data and 32.00X speedup with 50.00% efficiency for 3D poststack data, using 64 nodes. The results also demonstrate the effectiveness and robustness of the improved algorithm with high scalability and efficiency on a multicore CPU cluster.

  4. High-Level Parallel Programming.

    Science.gov (United States)

    parallel programming languages. These issues were evaluated via the utilization of a language called UC. UC is a programming language aimed at balancing notational simplicity with execution efficiency and portability. UC accomplishes this by separating the programming task from the efficiency issues. This report gives a description of the language, its current implementation, its verification methodology and its use in designing various

  5. 3D magnetospheric parallel hybrid multi-grid method applied to planet–plasma interactions

    Energy Technology Data Exchange (ETDEWEB)

    Leclercq, L., E-mail: ludivine.leclercq@latmos.ipsl.fr [LATMOS/IPSL, UVSQ Université Paris-Saclay, UPMC Univ. Paris 06, CNRS, Guyancourt (France); Modolo, R., E-mail: ronan.modolo@latmos.ipsl.fr [LATMOS/IPSL, UVSQ Université Paris-Saclay, UPMC Univ. Paris 06, CNRS, Guyancourt (France); Leblanc, F. [LATMOS/IPSL, UPMC Univ. Paris 06 Sorbonne Universités, UVSQ, CNRS, Paris (France); Hess, S. [ONERA, Toulouse (France); Mancini, M. [LUTH, Observatoire Paris-Meudon (France)

    2016-03-15

    We present a new method to exploit multiple refinement levels within a 3D parallel hybrid model, developed to study planet–plasma interactions. This model is based on the hybrid formalism: ions are kinetically treated whereas electrons are considered as a inertia-less fluid. Generally, ions are represented by numerical particles whose size equals the volume of the cells. Particles that leave a coarse grid subsequently entering a refined region are split into particles whose volume corresponds to the volume of the refined cells. The number of refined particles created from a coarse particle depends on the grid refinement rate. In order to conserve velocity distribution functions and to avoid calculations of average velocities, particles are not coalesced. Moreover, to ensure the constancy of particles' shape function sizes, the hybrid method is adapted to allow refined particles to move within a coarse region. Another innovation of this approach is the method developed to compute grid moments at interfaces between two refinement levels. Indeed, the hybrid method is adapted to accurately account for the special grid structure at the interfaces, avoiding any overlapping grid considerations. Some fundamental test runs were performed to validate our approach (e.g. quiet plasma flow, Alfven wave propagation). Lastly, we also show a planetary application of the model, simulating the interaction between Jupiter's moon Ganymede and the Jovian plasma.

  6. Parallel computing simulation of electrical excitation and conduction in the 3D human heart.

    Science.gov (United States)

    Di Yu; Dongping Du; Hui Yang; Yicheng Tu

    2014-01-01

    A correctly beating heart is important to ensure adequate circulation of blood throughout the body. Normal heart rhythm is produced by the orchestrated conduction of electrical signals throughout the heart. Cardiac electrical activity is the resulted function of a series of complex biochemical-mechanical reactions, which involves transportation and bio-distribution of ionic flows through a variety of biological ion channels. Cardiac arrhythmias are caused by the direct alteration of ion channel activity that results in changes in the AP waveform. In this work, we developed a whole-heart simulation model with the use of massive parallel computing with GPGPU and OpenGL. The simulation algorithm was implemented under several different versions for the purpose of comparisons, including one conventional CPU version and two GPU versions based on Nvidia CUDA platform. OpenGL was utilized for the visualization / interaction platform because it is open source, light weight and universally supported by various operating systems. The experimental results show that the GPU-based simulation outperforms the conventional CPU-based approach and significantly improves the speed of simulation. By adopting modern computer architecture, this present investigation enables real-time simulation and visualization of electrical excitation and conduction in the large and complicated 3D geometry of a real-world human heart.

  7. Tensor3D: A computer graphics program to simulate 3D real-time deformation and visualization of geometric bodies

    Science.gov (United States)

    Pallozzi Lavorante, Luca; Dirk Ebert, Hans

    2008-07-01

    Tensor3D is a geometric modeling program with the capacity to simulate and visualize in real-time the deformation, specified through a tensor matrix and applied to triangulated models representing geological bodies. 3D visualization allows the study of deformational processes that are traditionally conducted in 2D, such as simple and pure shears. Besides geometric objects that are immediately available in the program window, the program can read other models from disk, thus being able to import objects created with different open-source or proprietary programs. A strain ellipsoid and a bounding box are simultaneously shown and instantly deformed with the main object. The principal axes of strain are visualized as well to provide graphical information about the orientation of the tensor's normal components. The deformed models can also be saved, retrieved later and deformed again, in order to study different steps of progressive strain, or to make this data available to other programs. The shape of stress ellipsoids and the corresponding Mohr circles defined by any stress tensor can also be represented. The application was written using the Visualization ToolKit, a powerful scientific visualization library in the public domain. This development choice, allied to the use of the Tcl/Tk programming language, which is independent on the host computational platform, makes the program a useful tool for the study of geometric deformations directly in three dimensions in teaching as well as research activities.

  8. Task-parallel implementation of 3D shortest path raytracing for geophysical applications

    Science.gov (United States)

    Giroux, Bernard; Larouche, Benoît

    2013-04-01

    This paper discusses two variants of the shortest path method and their parallel implementation on a shared-memory system. One variant is designed to perform raytracing in models with stepwise distributions of interval velocity while the other is better suited for continuous velocity models. Both rely on a discretization scheme where primary nodes are located at the corners of cuboid cells and where secondary nodes are found on the edges and sides of the cells. The parallel implementations allow raytracing concurrently for different sources, providing an attractive framework for ray-based tomography. The accuracy and performance of the implementations were measured by comparison with the analytic solution for a layered model and for a vertical gradient model. Mean relative error less than 0.2% was obtained with 5 secondary nodes for the layered model and 9 secondary nodes for the gradient model. Parallel performance depends on the level of discretization refinement, on the number of threads, and on the problem size, with the most determinant variable being the level of discretization refinement (number of secondary nodes). The results indicate that a good trade-off between speed and accuracy is achieved with the number of secondary nodes equal to 5. The programs are written in C++ and rely on the Standard Template Library and OpenMP.

  9. Writing parallel programs that work

    CERN Document Server

    CERN. Geneva

    2012-01-01

    Serial algorithms typically run inefficiently on parallel machines. This may sound like an obvious statement, but it is the root cause of why parallel programming is considered to be difficult. The current state of the computer industry is still that almost all programs in existence are serial. This talk will describe the techniques used in the Intel Parallel Studio to provide a developer with the tools necessary to understand the behaviors and limitations of the existing serial programs. Once the limitations are known the developer can refactor the algorithms and reanalyze the resulting programs with the tools in the Intel Parallel Studio to create parallel programs that work. About the speaker Paul Petersen is a Sr. Principal Engineer in the Software and Solutions Group (SSG) at Intel. He received a Ph.D. degree in Computer Science from the University of Illinois in 1993. After UIUC, he was employed at Kuck and Associates, Inc. (KAI) working on auto-parallelizing compiler (KAP), and was involved in th...

  10. New 3D parallel GILD electromagnetic modeling and nonlinear inversion using global magnetic integral and local differential equation

    Energy Technology Data Exchange (ETDEWEB)

    Xie, G.; Li, J.; Majer, E.; Zuo, D.

    1998-07-01

    This paper describes a new 3D parallel GILD electromagnetic (EM) modeling and nonlinear inversion algorithm. The algorithm consists of: (a) a new magnetic integral equation instead of the electric integral equation to solve the electromagnetic forward modeling and inverse problem; (b) a collocation finite element method for solving the magnetic integral and a Galerkin finite element method for the magnetic differential equations; (c) a nonlinear regularizing optimization method to make the inversion stable and of high resolution; and (d) a new parallel 3D modeling and inversion using a global integral and local differential domain decomposition technique (GILD). The new 3D nonlinear electromagnetic inversion has been tested with synthetic data and field data. The authors obtained very good imaging for the synthetic data and reasonable subsurface EM imaging for the field data. The parallel algorithm has high parallel efficiency over 90% and can be a parallel solver for elliptic, parabolic, and hyperbolic modeling and inversion. The parallel GILD algorithm can be extended to develop a high resolution and large scale seismic and hydrology modeling and inversion in the massively parallel computer.

  11. A Heterogeneous Parallel Programming Capability

    Science.gov (United States)

    1990-11-30

    the various implementations of Express attempted to address only the first of these is- sues - providing a portable, standard platform for parallel ... programming on a wide variety of dif- I I! 5 ferent systems. Each implementation, however, was independent, but allowed programs to execute on a single

  12. Influence of intrinsic and extrinsic forces on 3D stress distribution using CUDA programming

    Science.gov (United States)

    Räss, Ludovic; Omlin, Samuel; Podladchikov, Yuri

    2013-04-01

    In order to have a better understanding of the influence of buoyancy (intrinsic) and boundary (extrinsic) forces in a nonlinear rheology due to a power law fluid, some basics needs to be explored through 3D numerical calculation. As first approach, the already studied Stokes setup of a rising sphere will be used to calibrate the 3D model. Far field horizontal tectonic stress is applied to the sphere, which generates a vertical acceleration, buoyancy driven. This simple and known setup allows some benchmarking performed through systematic runs. The relative importance of intrinsic and extrinsic forces producing the wide variety of rates and styles of deformation, including absence of deformation and generating 3D stress patterns, will be determined. Relation between vertical motion and power law exponent will also be explored. The goal of these investigations will be to run models having topography and density structure from geophysical imaging as input, and 3D stress field as output. The stress distribution in Swiss Alps and Plateau and its implication for risk analysis is one of the perspective for this research. In fact, proximity of the stress to the failure is fundamental for risk assessment. Sensitivity of this to the accurate topography representation can then be evaluated. The developed 3D numerical codes, tuned for mid-sized cluster, need to be optimized, especially while running good resolution in full 3D. Therefor, two largely used computing platforms, MATLAB and FORTRAN 90 are explored. Starting with an easy adaptable and as short as possible MATLAB code, which is then upgraded in order to reach higher performance in simulation times and resolution. A significant speedup using the rising NVIDIA CUDA technology and resources is also possible. Programming in C-CUDA, creating some synchronization feature, and comparing the results with previous runs, helps us to investigate the new speedup possibilities allowed through GPU parallel computing. These codes

  13. PROGRAM MODULE FOR 3-DIMENSIONAL (3D) VISUALIZATION OF FABRIC

    OpenAIRE

    Drole, Blaž

    2012-01-01

    Development of new designs in textile industry is a work, that requires a great deal of precision and the need for end design visualasation. Altough computers are being used for many years now, we were technologically limited when visualasing the final result to only two dimensional display. With the advances in computer technology, especialy graphic cards, we are now able to display complex weave in three dimensional (3D) space even on personal computers. This greatly simplifies the visualis...

  14. TOMO3D: 3-D joint refraction and reflection traveltime tomography parallel code for active-source seismic data—synthetic test

    Science.gov (United States)

    Meléndez, A.; Korenaga, J.; Sallarès, V.; Miniussi, A.; Ranero, C. R.

    2015-10-01

    We present a new 3-D traveltime tomography code (TOMO3D) for the modelling of active-source seismic data that uses the arrival times of both refracted and reflected seismic phases to derive the velocity distribution and the geometry of reflecting boundaries in the subsurface. This code is based on its popular 2-D version TOMO2D from which it inherited the methods to solve the forward and inverse problems. The traveltime calculations are done using a hybrid ray-tracing technique combining the graph and bending methods. The LSQR algorithm is used to perform the iterative regularized inversion to improve the initial velocity and depth models. In order to cope with an increased computational demand due to the incorporation of the third dimension, the forward problem solver, which takes most of the run time (˜90 per cent in the test presented here), has been parallelized with a combination of multi-processing and message passing interface standards. This parallelization distributes the ray-tracing and traveltime calculations among available computational resources. The code's performance is illustrated with a realistic synthetic example, including a checkerboard anomaly and two reflectors, which simulates the geometry of a subduction zone. The code is designed to invert for a single reflector at a time. A data-driven layer-stripping strategy is proposed for cases involving multiple reflectors, and it is tested for the successive inversion of the two reflectors. Layers are bound by consecutive reflectors, and an initial velocity model for each inversion step incorporates the results from previous steps. This strategy poses simpler inversion problems at each step, allowing the recovery of strong velocity discontinuities that would otherwise be smoothened.

  15. Reactor Dosimetry Applications Using RAPTOR-M3G:. a New Parallel 3-D Radiation Transport Code

    Science.gov (United States)

    Longoni, Gianluca; Anderson, Stanwood L.

    2009-08-01

    The numerical solution of the Linearized Boltzmann Equation (LBE) via the Discrete Ordinates method (SN) requires extensive computational resources for large 3-D neutron and gamma transport applications due to the concurrent discretization of the angular, spatial, and energy domains. This paper will discuss the development RAPTOR-M3G (RApid Parallel Transport Of Radiation - Multiple 3D Geometries), a new 3-D parallel radiation transport code, and its application to the calculation of ex-vessel neutron dosimetry responses in the cavity of a commercial 2-loop Pressurized Water Reactor (PWR). RAPTOR-M3G is based domain decomposition algorithms, where the spatial and angular domains are allocated and processed on multi-processor computer architectures. As compared to traditional single-processor applications, this approach reduces the computational load as well as the memory requirement per processor, yielding an efficient solution methodology for large 3-D problems. Measured neutron dosimetry responses in the reactor cavity air gap will be compared to the RAPTOR-M3G predictions. This paper is organized as follows: Section 1 discusses the RAPTOR-M3G methodology; Section 2 describes the 2-loop PWR model and the numerical results obtained. Section 3 addresses the parallel performance of the code, and Section 4 concludes this paper with final remarks and future work.

  16. Development of a stereo vision measurement system for a 3D three-axial pneumatic parallel mechanism robot arm.

    Science.gov (United States)

    Chiang, Mao-Hsiung; Lin, Hao-Ting; Hou, Chien-Lun

    2011-01-01

    In this paper, a stereo vision 3D position measurement system for a three-axial pneumatic parallel mechanism robot arm is presented. The stereo vision 3D position measurement system aims to measure the 3D trajectories of the end-effector of the robot arm. To track the end-effector of the robot arm, the circle detection algorithm is used to detect the desired target and the SAD algorithm is used to track the moving target and to search the corresponding target location along the conjugate epipolar line in the stereo pair. After camera calibration, both intrinsic and extrinsic parameters of the stereo rig can be obtained, so images can be rectified according to the camera parameters. Thus, through the epipolar rectification, the stereo matching process is reduced to a horizontal search along the conjugate epipolar line. Finally, 3D trajectories of the end-effector are computed by stereo triangulation. The experimental results show that the stereo vision 3D position measurement system proposed in this paper can successfully track and measure the fifth-order polynomial trajectory and sinusoidal trajectory of the end-effector of the three- axial pneumatic parallel mechanism robot arm.

  17. Experiences Using Hybrid MPI/OpenMP in the Real World: Parallelization of a 3D CFD Solver for Multi-Core Node Clusters

    Directory of Open Access Journals (Sweden)

    Gabriele Jost

    2010-01-01

    Full Text Available Today most systems in high-performance computing (HPC feature a hierarchical hardware design: shared-memory nodes with several multi-core CPUs are connected via a network infrastructure. When parallelizing an application for these architectures it seems natural to employ a hierarchical programming model such as combining MPI and OpenMP. Nevertheless, there is the general lore that pure MPI outperforms the hybrid MPI/OpenMP approach. In this paper, we describe the hybrid MPI/OpenMP parallelization of IR3D (Incompressible Realistic 3-D code, a full-scale real-world application, which simulates the environmental effects on the evolution of vortices trailing behind control surfaces of underwater vehicles. We discuss performance, scalability and limitations of the pure MPI version of the code on a variety of hardware platforms and show how the hybrid approach can help to overcome certain limitations.

  18. Parallel rapid relaxation inversion of 3D magnetotelluric data%大地电磁三维快速松弛反演并行算法研究

    Institute of Scientific and Technical Information of China (English)

    林昌洪; 谭捍东; 佟拓

    2009-01-01

    We implement a parallel algorithm with the advantage of MPI (Message Passing Interface) to speed up the rapid relaxation inversion for 3D magnetotelluric data. We test the parallel rapid relaxation algorithm with synthetic and real data. The execution efficiency of the algorithm for several different situations is also compared. The results 'indicate that the parallel rapid relaxation algorithm for 3D magnetotelluric inversion, is effective. This parallel algorithm implemented on a common PC promotes the practical application of 3D magnetotelluric inversion and can be suitable for the other geophysical 3D modeling and inversion.

  19. Massively parallel computers for 3D single-photon-emission computed tomography

    Energy Technology Data Exchange (ETDEWEB)

    Butler, C.S.; Miller, M.I. (Washington Univ., St. Louis, MO (United States). Electronic Systems and Signals Research Lab.); Miller, T.R.; Wallis, J.W. (Washington Univ., St. Louis, MO (United States). Edward Mallinckrodt Inst. of Radiology)

    1994-03-01

    Since the introduction of the expectation-maximization (EM) algorithm for generating maximum-likelihood (ML) and maximum a posteriori (MAP) estimates in emission tomography, there have been many investigators applying the ML method. However, almost all of the previous work has been restricted to two-dimensional (2D) reconstructions. The major focus and contribution of this paper is to demonstrate a fully three-dimensional (3D) implementation of the MAP method for single-photon-emission computed tomography (SPECT). The 3D reconstruction exhibits an improvement in resolution when compared to the generation of the series of separate 2D slice reconstructions. (Author).

  20. Parallel Adaptive Computation of Blood Flow in a 3D ``Whole'' Body Model

    Science.gov (United States)

    Zhou, M.; Figueroa, C. A.; Taylor, C. A.; Sahni, O.; Jansen, K. E.

    2008-11-01

    Accurate numerical simulations of vascular trauma require the consideration of a larger portion of the vasculature than previously considered, due to the systemic nature of the human body's response. A patient-specific 3D model composed of 78 connected arterial branches extending from the neck to the lower legs is constructed to effectively represent the entire body. Recently developed outflow boundary conditions that appropriately represent the downstream vasculature bed which is not included in the 3D computational domain are applied at 78 outlets. In this work, the pulsatile blood flow simulations are started on a fairly uniform, unstructured mesh that is subsequently adapted using a solution-based approach to efficiently resolve the flow features. The adapted mesh contains non-uniform, anisotropic elements resulting in resolution that conforms with the physical length scales present in the problem. The effects of the mesh resolution on the flow field are studied, specifically on relevant quantities of pressure, velocity and wall shear stress.

  1. Parallel Programming Archetypes in Combinatorics and Optimization

    Science.gov (United States)

    1995-06-12

    A parallel programming archetype is a language independent program design strategy. We describe two archetypes in combinatorics and optimization...the systematic design of efficient sequential and parallel programs. The research whose results are presented in this document is part of the ongoing project on Parallel Programming Archetype.

  2. Graphics-Based Parallel Programming Tools

    Science.gov (United States)

    1991-09-01

    AD-A254 406 (9 FINAL REPORT DLECTF ’AUG 13 1992 Graphics-Based Parallel Programming Tools .p Janice E. Cuny, Principal Investigator Department of...suggest parallel (either because we use a parallel graph rewriting mechanism or because we apply our results to parallel programming ), we interpret it to...was to provide support for the ex- plicit representation of graphs for use within a parallel programming environ- ment. In our environment, we view a

  3. C# game programming cookbook for Unity 3D

    CERN Document Server

    Murray, Jeff W

    2014-01-01

    Making Games the Modular Way Important Programming ConceptsBuilding the Core Game Framework Controllers and Managers Building the Core Framework ScriptsPlayer Structure Game-Specific Player Controller Dealing with Input Player Manager User Data ManagerRecipes: Common Components Introduction The Timer Class Spawn ScriptsSet Gravity Pretend Friction-Friction Simulation to Prevent Slipping Around Cameras Input Scripts Automatic Self-Destruction Script Automatic Object SpinnerScene Manager Building Player Movement ControllersShoot 'Em Up Spaceship Humanoid Character Wheeled Vehicle Weapon Systems

  4. Focusing optics of a parallel beam CCD optical tomography apparatus for 3D radiation gel dosimetry.

    Science.gov (United States)

    Krstajić, Nikola; Doran, Simon J

    2006-04-21

    Optical tomography of gel dosimeters is a promising and cost-effective avenue for quality control of radiotherapy treatments such as intensity-modulated radiotherapy (IMRT). Systems based on a laser coupled to a photodiode have so far shown the best results within the context of optical scanning of radiosensitive gels, but are very slow ( approximately 9 min per slice) and poorly suited to measurements that require many slices. Here, we describe a fast, three-dimensional (3D) optical computed tomography (optical-CT) apparatus, based on a broad, collimated beam, obtained from a high power LED and detected by a charged coupled detector (CCD). The main advantages of such a system are (i) an acquisition speed approximately two orders of magnitude higher than a laser-based system when 3D data are required, and (ii) a greater simplicity of design. This paper advances our previous work by introducing a new design of focusing optics, which take information from a suitably positioned focal plane and project an image onto the CCD. An analysis of the ray optics is presented, which explains the roles of telecentricity, focusing, acceptance angle and depth-of-field (DOF) in the formation of projections. A discussion of the approximation involved in measuring the line integrals required for filtered backprojection reconstruction is given. Experimental results demonstrate (i) the effect on projections of changing the position of the focal plane of the apparatus, (ii) how to measure the acceptance angle of the optics, and (iii) the ability of the new scanner to image both absorbing and scattering gel phantoms. The quality of reconstructed images is very promising and suggests that the new apparatus may be useful in a clinical setting for fast and accurate 3D dosimetry.

  5. Massive parallelization of a 3D finite difference electromagnetic forward solution using domain decomposition methods on multiple CUDA enabled GPUs

    Science.gov (United States)

    Schultz, A.

    2010-12-01

    3D forward solvers lie at the core of inverse formulations used to image the variation of electrical conductivity within the Earth's interior. This property is associated with variations in temperature, composition, phase, presence of volatiles, and in specific settings, the presence of groundwater, geothermal resources, oil/gas or minerals. The high cost of 3D solutions has been a stumbling block to wider adoption of 3D methods. Parallel algorithms for modeling frequency domain 3D EM problems have not achieved wide scale adoption, with emphasis on fairly coarse grained parallelism using MPI and similar approaches. The communications bandwidth as well as the latency required to send and receive network communication packets is a limiting factor in implementing fine grained parallel strategies, inhibiting wide adoption of these algorithms. Leading Graphics Processor Unit (GPU) companies now produce GPUs with hundreds of GPU processor cores per die. The footprint, in silicon, of the GPU's restricted instruction set is much smaller than the general purpose instruction set required of a CPU. Consequently, the density of processor cores on a GPU can be much greater than on a CPU. GPUs also have local memory, registers and high speed communication with host CPUs, usually through PCIe type interconnects. The extremely low cost and high computational power of GPUs provides the EM geophysics community with an opportunity to achieve fine grained (i.e. massive) parallelization of codes on low cost hardware. The current generation of GPUs (e.g. NVidia Fermi) provides 3 billion transistors per chip die, with nearly 500 processor cores and up to 6 GB of fast (DDR5) GPU memory. This latest generation of GPU supports fast hardware double precision (64 bit) floating point operations of the type required for frequency domain EM forward solutions. Each Fermi GPU board can sustain nearly 1 TFLOP in double precision, and multiple boards can be installed in the host computer system. We

  6. Field Verification Program, Coastal Flooding and Storm Protection Program. Preliminary User’s Manual 3-D Mathematical Model of Coastal, Estuarine, and Lake Currents (CELC3D).

    Science.gov (United States)

    1984-04-01

    on any other virtual machine, e.g., the CDC cyber 203, or .other non-virtual machine with sufficient memory. The CELC3D program solves the mean...Princeton, NJ, 287 pp; also WES Technical Report (in Press), U.S. Army Eng. Waterways Experiment Station, - Vicksburg, MS. " Sheng, Y.P., H. Segur , and W.S

  7. A Parallel Programming Model With Sequential Semantics

    Science.gov (United States)

    1996-01-01

    Parallel programming is more difficult than sequential programming in part because of the complexity of reasoning, testing, and debugging in the...context of concurrency. In the thesis, we present and investigate a parallel programming model that provides direct control of parallelism in a notation

  8. High-Performance Computation of Distributed-Memory Parallel 3D Voronoi and Delaunay Tessellation

    Energy Technology Data Exchange (ETDEWEB)

    Peterka, Tom; Morozov, Dmitriy; Phillips, Carolyn

    2014-11-14

    Computing a Voronoi or Delaunay tessellation from a set of points is a core part of the analysis of many simulated and measured datasets: N-body simulations, molecular dynamics codes, and LIDAR point clouds are just a few examples. Such computational geometry methods are common in data analysis and visualization; but as the scale of simulations and observations surpasses billions of particles, the existing serial and shared-memory algorithms no longer suffice. A distributed-memory scalable parallel algorithm is the only feasible approach. The primary contribution of this paper is a new parallel Delaunay and Voronoi tessellation algorithm that automatically determines which neighbor points need to be exchanged among the subdomains of a spatial decomposition. Other contributions include periodic and wall boundary conditions, comparison of our method using two popular serial libraries, and application to numerous science datasets.

  9. Calibration of 3-d.o.f. Translational Parallel Manipulators Using Leg Observations

    CERN Document Server

    Pashkevich, Anatoly; Wenger, Philippe; Gomolitsky, Roman

    2009-01-01

    The paper proposes a novel approach for the geometrical model calibration of quasi-isotropic parallel kinematic mechanisms of the Orthoglide family. It is based on the observations of the manipulator leg parallelism during motions between the specific test postures and employs a low-cost measuring system composed of standard comparator indicators attached to the universal magnetic stands. They are sequentially used for measuring the deviation of the relevant leg location while the manipulator moves the TCP along the Cartesian axes. Using the measured differences, the developed algorithm estimates the joint offsets and the leg lengths that are treated as the most essential parameters. Validity of the proposed calibration technique is confirmed by the experimental results.

  10. Parallel 3-D numerical simulation of dielectric barrier discharge plasma actuators

    Science.gov (United States)

    Houba, Tomas

    Dielectric barrier discharge plasma actuators have shown promise in a range of applications including flow control, sterilization and ozone generation. Developing numerical models of plasma actuators is of great importance, because a high-fidelity parallel numerical model allows new design configurations to be tested rapidly. Additionally, it provides a better understanding of the plasma actuator physics which is useful for further innovation. The physics of plasma actuators is studied numerically. A loosely coupled approach is utilized for the coupling of the plasma to the neutral fluid. The state of the art in numerical plasma modeling is advanced by the development of a parallel, three-dimensional, first-principles model with detailed air chemistry. The model incorporates 7 charged species and 18 reactions, along with a solution of the electron energy equation. To the author's knowledge, a parallel three-dimensional model of a gas discharge with a detailed air chemistry model and the solution of electron energy is unique. Three representative geometries are studied using the gas discharge model. The discharge of gas between two parallel electrodes is used to validate the air chemistry model developed for the gas discharge code. The gas discharge model is then applied to the discharge produced by placing a dc powered wire and grounded plate electrodes in a channel. Finally, a three-dimensional simulation of gas discharge produced by electrodes placed inside a riblet is carried out. The body force calculated with the gas discharge model is loosely coupled with a fluid model to predict the induced flow inside the riblet.

  11. Comparison of parallel and spiral tagged MRI geometries in estimation of 3-D myocardial strains

    Science.gov (United States)

    Tustison, Nicholas J.; Amini, Amir A.

    2005-04-01

    Research involving the quantification of left ventricular myocardial strain from cardiac tagged magnetic resonance imaging (MRI) is extensive. Two different imaging geometries are commonly employed by these methodologies to extract longitudinal deformation. We refer to these imaging geometries as either parallel or spiral. In the spiral configuration, four long-axis tagged image slices which intersect along the long-axis of the left ventricle are collected and in the parallel configuration, contiguous tagged long-axis images spanning the width of the left ventricle between the lateral wall and the septum are collected. Despite the number of methodologies using either or both imaging configurations, to date, no comparison has been made to determine which geometry results in more accurate estimation of strains. Using previously published work in which left ventricular myocardial strain is calculated from 4-D anatomical NURBS models, we compare the strain calculated from these two imaging geometries in both simulated tagged MR images for which ground truth strain is available as well as in in vivo data. It is shown that strains calculated using the parallel imaging protocol are more accurate than that calculated using spiral protocol.

  12. Sparse Approximations of the Schur Complement for Parallel Algebraic Hybrid Solvers in 3D

    Institute of Scientific and Technical Information of China (English)

    L.Giraud; A.Haidar; Y.Saad

    2010-01-01

    In this paper we study the computational performance of variants of an al-gebraic additive Schwarz preconditioner for the Schur complement for the solution of large sparse linear systems. In earlier works, the local Schur complements were com- puted exactly using a sparse direct solver. The robustness of the preconditioner comes at the price of this memory and time intensive computation that is the main bottleneck of the approach for tackling huge problems. In this work we investigate the use of sparse approximation of the dense local Schur complements. These approximations are com-puted using a partial incomplete LU factorization. Such a numerical calculation is the core of the multi-level incomplete factorization such as the one implemented in pARMS. The numerical and computing performance of the new numerical scheme is illustrated on a set of large 3D convection-diffusion problems;preliminary experiments on linear systems arising from structural mechanics are also reported.

  13. 3D seismic modeling and reverse‐time migration with the parallel Fourier method using non‐blocking collective communications

    KAUST Repository

    Chu, Chunlei

    2009-01-01

    The major performance bottleneck of the parallel Fourier method on distributed memory systems is the network communication cost. In this study, we investigate the potential of using non‐blocking all‐to‐all communications to solve this problem by overlapping computation and communication. We present the runtime comparison of a 3D seismic modeling problem with the Fourier method using non‐blocking and blocking calls, respectively, on a Linux cluster. The data demonstrate that a performance improvement of up to 40% can be achieved by simply changing blocking all‐to‐all communication calls to non‐blocking ones to introduce the overlapping capability. A 3D reverse‐time migration result is also presented as an extension to the modeling work based on non‐blocking collective communications.

  14. Design and Sensitivity Analysis Simulation of a Novel 3D Force Sensor Based on a Parallel Mechanism

    Directory of Open Access Journals (Sweden)

    Eileen Chih-Ying Yang

    2016-12-01

    Full Text Available Automated force measurement is one of the most important technologies in realizing intelligent automation systems. However, while many methods are available for micro-force sensing, measuring large three-dimensional (3D forces and loads remains a significant challenge. Accordingly, the present study proposes a novel 3D force sensor based on a parallel mechanism. The transformation function and sensitivity index of the proposed sensor are analytically derived. The simulation results show that the sensor has a larger effective measuring capability than traditional force sensors. Moreover, the sensor has a greater measurement sensitivity for horizontal forces than for vertical forces over most of the measurable force region. In other words, compared to traditional force sensors, the proposed sensor is more sensitive to shear forces than normal forces.

  15. Massively Parallel Linear Stability Analysis with P_ARPACK for 3D Fluid Flow Modeled with MPSalsa

    Energy Technology Data Exchange (ETDEWEB)

    Lehoucq, R.B.; Salinger, A.G.

    1998-10-13

    We are interested in the stability of three-dimensional fluid flows to small dkturbances. One computational approach is to solve a sequence of large sparse generalized eigenvalue problems for the leading modes that arise from discretizating the differential equations modeling the flow. The modes of interest are the eigenvalues of largest real part and their associated eigenvectors. We discuss our work to develop an effi- cient and reliable eigensolver for use by the massively parallel simulation code MPSalsa. MPSalsa allows simulation of complex 3D fluid flow, heat transfer, and mass transfer with detailed bulk fluid and surface chemical reaction kinetics.

  16. A parallel fast multipole BEM and its applications to large-scale analysis of 3-D fiber-reinforced composites

    Institute of Scientific and Technical Information of China (English)

    Ting Lei; Zhenhan Yao; Haitao Wang; Pengbo Wang

    2006-01-01

    In this paper, an adaptive boundary element method (BEM) is presented for solving 3-D elasticity problems. The numerical scheme is accelerated by the new version of fast multipole method (FMM) and parallelized on distributed memory architectures. The resulting solver is applied to the study of representative volume element (RVE)for short fiberreinforced composites with complex inclusion geometry. Numerical examples performed on a 32-processor cluster show that the proposed method is both accurate and efficient. And can solve problems of large size that are challenging to existing state-of-the-art domain methods.

  17. 3D parallel-detection microwave tomography for clinical breast imaging

    Science.gov (United States)

    Epstein, N. R.; Meaney, P. M.; Paulsen, K. D.

    2014-12-01

    A biomedical microwave tomography system with 3D-imaging capabilities has been constructed and translated to the clinic. Updates to the hardware and reconfiguration of the electronic-network layouts in a more compartmentalized construct have streamlined system packaging. Upgrades to the data acquisition and microwave components have increased data-acquisition speeds and improved system performance. By incorporating analog-to-digital boards that accommodate the linear amplification and dynamic-range coverage our system requires, a complete set of data (for a fixed array position at a single frequency) is now acquired in 5.8 s. Replacement of key components (e.g., switches and power dividers) by devices with improved operational bandwidths has enhanced system response over a wider frequency range. High-integrity, low-power signals are routinely measured down to -130 dBm for frequencies ranging from 500 to 2300 MHz. Adequate inter-channel isolation has been maintained, and a dynamic range >110 dB has been achieved for the full operating frequency range (500-2900 MHz). For our primary band of interest, the associated measurement deviations are less than 0.33% and 0.5° for signal amplitude and phase values, respectively. A modified monopole antenna array (composed of two interwoven eight-element sub-arrays), in conjunction with an updated motion-control system capable of independently moving the sub-arrays to various in-plane and cross-plane positions within the illumination chamber, has been configured in the new design for full volumetric data acquisition. Signal-to-noise ratios (SNRs) are more than adequate for all transmit/receive antenna pairs over the full frequency range and for the variety of in-plane and cross-plane configurations. For proximal receivers, in-plane SNRs greater than 80 dB are observed up to 2900 MHz, while cross-plane SNRs greater than 80 dB are seen for 6 cm sub-array spacing (for frequencies up to 1500 MHz). We demonstrate accurate recovery

  18. Structured Parallel Programming Patterns for Efficient Computation

    CERN Document Server

    McCool, Michael; Robison, Arch

    2012-01-01

    Programming is now parallel programming. Much as structured programming revolutionized traditional serial programming decades ago, a new kind of structured programming, based on patterns, is relevant to parallel programming today. Parallel computing experts and industry insiders Michael McCool, Arch Robison, and James Reinders describe how to design and implement maintainable and efficient parallel algorithms using a pattern-based approach. They present both theory and practice, and give detailed concrete examples using multiple programming models. Examples are primarily given using two of th

  19. About Parallel Programming: Paradigms, Parallel Execution and Collaborative Systems

    Directory of Open Access Journals (Sweden)

    Loredana MOCEAN

    2009-01-01

    Full Text Available In the last years, there were made efforts for delineation of a stabile and unitary frame, where the problems of logical parallel processing must find solutions at least at the level of imperative languages. The results obtained by now are not at the level of the made efforts. This paper wants to be a little contribution at these efforts. We propose an overview in parallel programming, parallel execution and collaborative systems.

  20. Introducing zeus-mp: a 3d, parallel, multiphysics code for astrophysical fluid dynamics

    Directory of Open Access Journals (Sweden)

    Michael L. Norman

    2000-01-01

    Full Text Available Describimos ZEUS-MP: un c odigo Multi-F sica, Masivamente-Paralelo, Pasa-Mensajes para simulaciones tridimensionales de din amica de uidos astrof sicos. ZEUS-MP es la continuaci on de los c odigos ZEUS-2D y ZEUS-3D, desarrollados y diseminados por el Laboratorio de Astrof sica Computacional (lca.ncsa.uiuc.edu del NCSA. La versi on V1.0, liberada el 1/1/2000, contiene los siguientes m odulos: hidrodin amica ideal, MHD ideal y auto-gravedad. Las pr oximas versiones tendr an difusi on radiativa de ujo limitado, conducci on de calor, plasma de dos temperaturas y funciones de enfriamiento y calentamiento. Las ecuaciones covariantes est an avanzadas en una malla Euleriana m ovil en coordenadas Cartesianas, cil ndricas y polares esf ericas. La paralelizaci on es hecha por descomposici on del dominio y est a implementada en F77 y MPI. El c odigo es portable en un amplio rango de plataformas, desde redes de estaciones de trabajo hasta procesadores de paralelismo masivo. Se presentan algunos resultados de la e ciencia en paralelo junto con una aplicaci on a formaci on estelar turbulenta.

  1. The ParaScope parallel programming environment

    Science.gov (United States)

    Cooper, Keith D.; Hall, Mary W.; Hood, Robert T.; Kennedy, Ken; Mckinley, Kathryn S.; Mellor-Crummey, John M.; Torczon, Linda; Warren, Scott K.

    1993-01-01

    The ParaScope parallel programming environment, developed to support scientific programming of shared-memory multiprocessors, includes a collection of tools that use global program analysis to help users develop and debug parallel programs. This paper focuses on ParaScope's compilation system, its parallel program editor, and its parallel debugging system. The compilation system extends the traditional single-procedure compiler by providing a mechanism for managing the compilation of complete programs. Thus, ParaScope can support both traditional single-procedure optimization and optimization across procedure boundaries. The ParaScope editor brings both compiler analysis and user expertise to bear on program parallelization. It assists the knowledgeable user by displaying and managing analysis and by providing a variety of interactive program transformations that are effective in exposing parallelism. The debugging system detects and reports timing-dependent errors, called data races, in execution of parallel programs. The system combines static analysis, program instrumentation, and run-time reporting to provide a mechanical system for isolating errors in parallel program executions. Finally, we describe a new project to extend ParaScope to support programming in FORTRAN D, a machine-independent parallel programming language intended for use with both distributed-memory and shared-memory parallel computers.

  2. A 3D hybrid grid generation technique and a multigrid/parallel algorithm based on anisotropic agglomeration approach

    Institute of Scientific and Technical Information of China (English)

    Zhang Laiping; Zhao Zhong; Chang Xinghua; He Xin

    2013-01-01

    A hybrid grid generation technique and a multigrid/parallel algorithm are presented in this paper for turbulence flow simulations over three-dimensional (3D) complex geometries.The hybrid grid generation technique is based on an agglomeration method of anisotropic tetrahedrons.Firstly,the complex computational domain is covered by pure tetrahedral grids,in which anisotropic tetrahedrons are adopted to discrete the boundary layer and isotropic tetrahedrons in the outer field.Then,the anisotropic tetrahedrons in the boundary layer are agglomerated to generate prismatic grids.The agglomeration method can improve the grid quality in boundary layer and reduce the grid quantity to enhance the numerical accuracy and efficiency.In order to accelerate the convergence history,a multigrid/parallel algorithm is developed also based on anisotropic agglomeration approach.The numerical results demonstrate the excellent accelerating capability of this multigrid method.

  3. A Case Study of a Hybrid Parallel 3D Surface Rendering Graphics Architecture

    DEFF Research Database (Denmark)

    Holten-Lund, Hans Erik; Madsen, Jan; Pedersen, Steen

    1997-01-01

    This paper presents a case study in the design strategy used inbuilding a graphics computer, for drawing very complex 3Dgeometric surfaces. The goal is to build a PC based computer systemcapable of handling surfaces built from about 2 million triangles, andto be able to render a perspective view...... of these on a computer displayat interactive frame rates, i.e. processing around 50 milliontriangles per second. The paper presents a hardware/softwarearchitecture called HPGA (Hybrid Parallel Graphics Architecture) whichis likely to be able to carry out this task. The case study focuses ontechniques to increase...... the clock frequency as well as the parallelismof the system. This paper focuses on the back-end graphics pipeline,which is responsible for rasterizing triangles.%with a practically linear increase in performance. A pure software implementation of the proposed architecture iscurrently able to process 300...

  4. Parallel deconvolution of large 3D images obtained by confocal laser scanning microscopy.

    Science.gov (United States)

    Pawliczek, Piotr; Romanowska-Pawliczek, Anna; Soltys, Zbigniew

    2010-03-01

    Various deconvolution algorithms are often used for restoration of digital images. Image deconvolution is especially needed for the correction of three-dimensional images obtained by confocal laser scanning microscopy. Such images suffer from distortions, particularly in the Z dimension. As a result, reliable automatic segmentation of these images may be difficult or even impossible. Effective deconvolution algorithms are memory-intensive and time-consuming. In this work, we propose a parallel version of the well-known Richardson-Lucy deconvolution algorithm developed for a system with distributed memory and implemented with the use of Message Passing Interface (MPI). It enables significantly more rapid deconvolution of two-dimensional and three-dimensional images by efficiently splitting the computation across multiple computers. The implementation of this algorithm can be used on professional clusters provided by computing centers as well as on simple networks of ordinary PC machines.

  5. Fast and Precise 3D Computation of Capacitance of Parallel Narrow Beam MEMS Structures

    CERN Document Server

    Majumdar, N

    2007-01-01

    Efficient design and performance of electrically actuated MEMS devices necessitate accurate estimation of electrostatic forces on the MEMS structures. This in turn requires thorough study of the capacitance of the structures and finally the charge density distribution on the various surfaces of a device. In this work, nearly exact BEM solutions have been provided in order to estimate these properties of a parallel narrow beam structure found in MEMS devices. The effect of three-dimensionality, which is an important aspect for these structures, and associated fringe fields have been studied in detail. A reasonably large parameter space has been covered in order to follow the variation of capacitance with various geometric factors. The present results have been compared with those obtained using empirical parametrized expressions keeping in view the requirement of the speed of computation. The limitations of the empirical expressions have been pointed out and possible approaches of their improvement have been d...

  6. High-speed 3D imaging using two-wavelength parallel-phase-shift interferometry.

    Science.gov (United States)

    Safrani, Avner; Abdulhalim, Ibrahim

    2015-10-15

    High-speed three dimensional imaging based on two-wavelength parallel-phase-shift interferometry is presented. The technique is demonstrated using a high-resolution polarization-based Linnik interferometer operating with three high-speed phase-masked CCD cameras and two quasi-monochromatic modulated light sources. The two light sources allow for phase unwrapping the single source wrapped phase so that relatively high step profiles having heights as large as 3.7 μm can be imaged in video rate with ±2  nm accuracy and repeatability. The technique is validated using a certified very large scale integration (VLSI) step standard followed by a demonstration from the semiconductor industry showing an integrated chip with 2.75 μm height copper micro pillars at different packing densities.

  7. Parallel Programming in the Age of Ubiquitous Parallelism

    Science.gov (United States)

    Pingali, Keshav

    2014-04-01

    Multicore and manycore processors are now ubiquitous, but parallel programming remains as difficult as it was 30-40 years ago. During this time, our community has explored many promising approaches including functional and dataflow languages, logic programming, and automatic parallelization using program analysis and restructuring, but none of these approaches has succeeded except in a few niche application areas. In this talk, I will argue that these problems arise largely from the computation-centric foundations and abstractions that we currently use to think about parallelism. In their place, I will propose a novel data-centric foundation for parallel programming called the operator formulation in which algorithms are described in terms of actions on data. The operator formulation shows that a generalized form of data-parallelism called amorphous data-parallelism is ubiquitous even in complex, irregular graph applications such as mesh generation/refinement/partitioning and SAT solvers. Regular algorithms emerge as a special case of irregular ones, and many application-specific optimization techniques can be generalized to a broader context. The operator formulation also leads to a structural analysis of algorithms called TAO-analysis that provides implementation guidelines for exploiting parallelism efficiently. Finally, I will describe a system called Galois based on these ideas for exploiting amorphous data-parallelism on multicores and GPUs

  8. December 1993 National Drunk and Drugged Driving (3D) Prevention Month: Program Planner.

    Science.gov (United States)

    National Highway Traffic Safety Administration (DOT), Washington, DC.

    This program planner's kit is based on the experiences of the first 12 years of the National Drunk and Drugged Driving (3D) Prevention Month program and provides practical advice to help readers plan activities for this year's campaign. Included in the kit is a background and resource guide that explains the background and goals of the program and…

  9. Adobe Flash 11 Stage3D (Molehill) Game Programming Beginner's Guide

    CERN Document Server

    Kaitila, Christer

    2011-01-01

    Written in an informal and friendly manner, the style and approach of this book will take you on an exciting adventure. Piece by piece, detailed examples help you along the way by providing real-world game code required to make a complete 3D video game. Each chapter builds upon the experience and achievements earned in the last, culminating in the ultimate prize - your game! If you ever wanted to make your own 3D game in Flash, then this book is for you. This book is a perfect introduction to 3D game programming in Adobe Molehill for complete beginners. You do not need to know anything about S

  10. Genetic Parallel Programming: design and implementation.

    Science.gov (United States)

    Cheang, Sin Man; Leung, Kwong Sak; Lee, Kin Hong

    2006-01-01

    This paper presents a novel Genetic Parallel Programming (GPP) paradigm for evolving parallel programs running on a Multi-Arithmetic-Logic-Unit (Multi-ALU) Processor (MAP). The MAP is a Multiple Instruction-streams, Multiple Data-streams (MIMD), general-purpose register machine that can be implemented on modern Very Large-Scale Integrated Circuits (VLSIs) in order to evaluate genetic programs at high speed. For human programmers, writing parallel programs is more difficult than writing sequential programs. However, experimental results show that GPP evolves parallel programs with less computational effort than that of their sequential counterparts. It creates a new approach to evolving a feasible problem solution in parallel program form and then serializes it into a sequential program if required. The effectiveness and efficiency of GPP are investigated using a suite of 14 well-studied benchmark problems. Experimental results show that GPP speeds up evolution substantially.

  11. Requirements for Data-Parallel Programming Environments

    Science.gov (United States)

    1994-04-22

    fully automatic techniques would be insufficient by themselves to support general parallel programming , even in the limited domain of scientific...computation. In other words, in an effective parallel programming system, the programmer would have to provide additional information to help the system...convey an understanding of the tools and strategies that will be needed to adequately support efficient, machine-independent, data- parallel programming .

  12. Parallel Inversion Arithmetic for 3D Multi-Wave Pre-stack Elasticity Parameters and Its Application

    Science.gov (United States)

    Luo, S.; Li, L.

    2009-12-01

    Multi-wave seismic prospect is an elastic wave prospect by which all wave fields can be achieved. We can inverse the stratum lithological parameters and elastic parameters by the multi-wave amplitude characteristics in order to get the information of the reservoirs and fluids. At present, the main methods of multi-wave inversion are post-stack inversion and single component partial-stack inversion, which are based on the approximative expressions and isotropy media. Widely known, the post-stack inversion can only be used to inverse the impedance, three lithological parameters(P wave velocity,S wave velocity and density)can not be obtained independently in this kind of method. The single component is not the whole elastic wave inversion, and the theory formula of the isotropy media is unfit for the anisotropy media inversion. Therefore, based on the anisotropy media, the method of the multi-wave associated pre-stack inversion is studied by using of the precise AVA formulae in this paper. To the questions of the lithology identification and the prediction of reservoirs, the authors studied the associated inversion of 3D lithological parameters for the anisotropy media with 3D3C data. The basic processes of the parameter inversion are as follows: (1) create the velocity model and produce the NMO gathers, (2) match the layers of the P wave with the same layers of P-SV wave, and convert AVO gathers into AVA gathers, (3) inverse the lithology parameters and anisotropy coefficients with the NMO gather, and (4) compute the elastic parameters, elastic impedance, elastic impedance grads based on the inversed parameters. Because of the huge amount of computing work of 3D pre-stack parameter inversion, the parallel arithmetic of the 3D pre-stack parameter inversion is utilized to improve the computing efficiency. Via the 3D real data processing, it is proved that this method is effective and can be applied in the oil and gas prediction of the reservoirs.

  13. Development of a 3D Parallel Mechanism Robot Arm with Three Vertical-Axial Pneumatic Actuators Combined with a Stereo Vision System

    Directory of Open Access Journals (Sweden)

    Hao-Ting Lin

    2011-12-01

    Full Text Available This study aimed to develop a novel 3D parallel mechanism robot driven by three vertical-axial pneumatic actuators with a stereo vision system for path tracking control. The mechanical system and the control system are the primary novel parts for developing a 3D parallel mechanism robot. In the mechanical system, a 3D parallel mechanism robot contains three serial chains, a fixed base, a movable platform and a pneumatic servo system. The parallel mechanism are designed and analyzed first for realizing a 3D motion in the X-Y-Z coordinate system of the robot’s end-effector. The inverse kinematics and the forward kinematics of the parallel mechanism robot are investigated by using the Denavit-Hartenberg notation (D-H notation coordinate system. The pneumatic actuators in the three vertical motion axes are modeled. In the control system, the Fourier series-based adaptive sliding-mode controller with H∞ tracking performance is used to design the path tracking controllers of the three vertical servo pneumatic actuators for realizing 3D path tracking control of the end-effector. Three optical linear scales are used to measure the position of the three pneumatic actuators. The 3D position of the end-effector is then calculated from the measuring position of the three pneumatic actuators by means of the kinematics. However, the calculated 3D position of the end-effector cannot consider the manufacturing and assembly tolerance of the joints and the parallel mechanism so that errors between the actual position and the calculated 3D position of the end-effector exist. In order to improve this situation, sensor collaboration is developed in this paper. A stereo vision system is used to collaborate with the three position sensors of the pneumatic actuators. The stereo vision system combining two CCD serves to measure the actual 3D position of the end-effector and calibrate the error between the actual and the calculated 3D position of the end

  14. Development of a 3D parallel mechanism robot arm with three vertical-axial pneumatic actuators combined with a stereo vision system.

    Science.gov (United States)

    Chiang, Mao-Hsiung; Lin, Hao-Ting

    2011-01-01

    This study aimed to develop a novel 3D parallel mechanism robot driven by three vertical-axial pneumatic actuators with a stereo vision system for path tracking control. The mechanical system and the control system are the primary novel parts for developing a 3D parallel mechanism robot. In the mechanical system, a 3D parallel mechanism robot contains three serial chains, a fixed base, a movable platform and a pneumatic servo system. The parallel mechanism are designed and analyzed first for realizing a 3D motion in the X-Y-Z coordinate system of the robot's end-effector. The inverse kinematics and the forward kinematics of the parallel mechanism robot are investigated by using the Denavit-Hartenberg notation (D-H notation) coordinate system. The pneumatic actuators in the three vertical motion axes are modeled. In the control system, the Fourier series-based adaptive sliding-mode controller with H(∞) tracking performance is used to design the path tracking controllers of the three vertical servo pneumatic actuators for realizing 3D path tracking control of the end-effector. Three optical linear scales are used to measure the position of the three pneumatic actuators. The 3D position of the end-effector is then calculated from the measuring position of the three pneumatic actuators by means of the kinematics. However, the calculated 3D position of the end-effector cannot consider the manufacturing and assembly tolerance of the joints and the parallel mechanism so that errors between the actual position and the calculated 3D position of the end-effector exist. In order to improve this situation, sensor collaboration is developed in this paper. A stereo vision system is used to collaborate with the three position sensors of the pneumatic actuators. The stereo vision system combining two CCD serves to measure the actual 3D position of the end-effector and calibrate the error between the actual and the calculated 3D position of the end-effector. Furthermore, to

  15. Parallel Programming Environment for OpenMP

    Directory of Open Access Journals (Sweden)

    Insung Park

    2001-01-01

    Full Text Available We present our effort to provide a comprehensive parallel programming environment for the OpenMP parallel directive language. This environment includes a parallel programming methodology for the OpenMP programming model and a set of tools (Ursa Minor and InterPol that support this methodology. Our toolset provides automated and interactive assistance to parallel programmers in time-consuming tasks of the proposed methodology. The features provided by our tools include performance and program structure visualization, interactive optimization, support for performance modeling, and performance advising for finding and correcting performance problems. The presented evaluation demonstrates that our environment offers significant support in general parallel tuning efforts and that the toolset facilitates many common tasks in OpenMP parallel programming in an efficient manner.

  16. Parallelized 3D CSEM modeling using edge-based finite element with total field formulation and unstructured mesh

    Science.gov (United States)

    Cai, Hongzhu; Hu, Xiangyun; Li, Jianhui; Endo, Masashi; Xiong, Bin

    2017-02-01

    We solve the 3D controlled-source electromagnetic (CSEM) problem using the edge-based finite element method. The modeling domain is discretized using unstructured tetrahedral mesh. We adopt the total field formulation for the quasi-static variant of Maxwell's equation and the computation cost to calculate the primary field can be saved. We adopt a new boundary condition which approximate the total field on the boundary by the primary field corresponding to the layered earth approximation of the complicated conductivity model. The primary field on the modeling boundary is calculated using fast Hankel transform. By using this new type of boundary condition, the computation cost can be reduced significantly and the modeling accuracy can be improved. We consider that the conductivity can be anisotropic. We solve the finite element system of equations using a parallelized multifrontal solver which works efficiently for multiple source and large scale electromagnetic modeling.

  17. iPhone 3D Programming Developing Graphical Applications with OpenGL ES

    CERN Document Server

    Rideout, Philip

    2010-01-01

    What does it take to build an iPhone app with stunning 3D graphics? This book will show you how to apply OpenGL graphics programming techniques to any device running the iPhone OS -- including the iPad and iPod Touch -- with no iPhone development or 3D graphics experience required. iPhone 3D Programming provides clear step-by-step instructions, as well as lots of practical advice, for using the iPhone SDK and OpenGL. You'll build several graphics programs -- progressing from simple to more complex examples -- that focus on lighting, textures, blending, augmented reality, optimization for pe

  18. PDDP, A Data Parallel Programming Model

    Directory of Open Access Journals (Sweden)

    Karen H. Warren

    1996-01-01

    Full Text Available PDDP, the parallel data distribution preprocessor, is a data parallel programming model for distributed memory parallel computers. PDDP implements high-performance Fortran-compatible data distribution directives and parallelism expressed by the use of Fortran 90 array syntax, the FORALL statement, and the WHERE construct. Distributed data objects belong to a global name space; other data objects are treated as local and replicated on each processor. PDDP allows the user to program in a shared memory style and generates codes that are portable to a variety of parallel machines. For interprocessor communication, PDDP uses the fastest communication primitives on each platform.

  19. Non-Iterative Rigid 2D/3D Point-Set Registration Using Semidefinite Programming.

    Science.gov (United States)

    Khoo, Yuehaw; Kapoor, Ankur

    2016-07-01

    We describe a convex programming framework for pose estimation in 2D/3D point-set registration with unknown point correspondences. We give two mixed-integer nonlinear program (MINLP) formulations of the 2D/3D registration problem when there are multiple 2D images, and propose convex relaxations for both the MINLPs to semidefinite programs that can be solved efficiently by interior point methods. Our approach to the 2D/3D registration problem is non-iterative in nature as we jointly solve for pose and correspondence. Furthermore, these convex programs can readily incorporate feature descriptors of points to enhance registration results. We prove that the convex programs exactly recover the solution to the MINLPs under certain noiseless condition. We apply these formulations to the registration of 3D models of coronary vessels to their 2D projections obtained from multiple intra-operative fluoroscopic images. For this application, we experimentally corroborate the exact recovery property in the absence of noise and further demonstrate robustness of the convex programs in the presence of noise.

  20. Parallel programming with PCN. Revision 1

    Energy Technology Data Exchange (ETDEWEB)

    Foster, I.; Tuecke, S.

    1991-12-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. In includes both tutorial and reference material. It also presents the basic concepts that underly PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous FTP from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (c.f. Appendix A).

  1. Synthetic models of distributed memory parallel programs

    Energy Technology Data Exchange (ETDEWEB)

    Poplawski, D.A. (Michigan Technological Univ., Houghton, MI (USA). Dept. of Computer Science)

    1990-09-01

    This paper deals with the construction and use of simple synthetic programs that model the behavior of more complex, real parallel programs. Synthetic programs can be used in many ways: to construct an easily ported suite of benchmark programs, to experiment with alternate parallel implementations of a program without actually writing them, and to predict the behavior and performance of an algorithm on a new or hypothetical machine. Synthetic programs are constructed easily from scratch, from existing programs, and can even be constructed using nothing but information obtained from traces of the real program's execution.

  2. Parallel programming characteristics of a DSP-based parallel system

    Institute of Scientific and Technical Information of China (English)

    GAO Shu; GUO Qing-ping

    2006-01-01

    This paper firstly introduces the structure and working principle of DSP-based parallel system, parallel accelerating board and SHARC DSP chip. Then it pays attention to investigating the system's programming characteristics, especially the mode of communication, discussing how to design parallel algorithms and presenting a domain-decomposition-based complete multi-grid parallel algorithm with virtual boundary forecast (VBF) to solve a lot of large-scale and complicated heat problems. In the end, Mandelbrot Set and a non-linear heat transfer equation of ceramic/metal composite material are taken as examples to illustrate the implementation of the proposed algorithm. The results showed that the solutions are highly efficient and have linear speedup.

  3. Massively Parallel Finite Element Programming

    KAUST Repository

    Heister, Timo

    2010-01-01

    Today\\'s large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data structures and algorithms to take advantage of the power of high performance computers in generic finite element codes. Existing generic finite element libraries often restrict the parallelization to parallel linear algebra routines. This is a limiting factor when solving on more than a few hundreds of cores. We describe routines for distributed storage of all major components coupled with efficient, scalable algorithms. We give an overview of our effort to enable the modern and generic finite element library deal.II to take advantage of the power of large clusters. In particular, we describe the construction of a distributed mesh and develop algorithms to fully parallelize the finite element calculation. Numerical results demonstrate good scalability. © 2010 Springer-Verlag.

  4. Experiences in Data-Parallel Programming

    Directory of Open Access Journals (Sweden)

    Terry W. Clark

    1997-01-01

    Full Text Available To efficiently parallelize a scientific application with a data-parallel compiler requires certain structural properties in the source program, and conversely, the absence of others. A recent parallelization effort of ours reinforced this observation and motivated this correspondence. Specifically, we have transformed a Fortran 77 version of GROMOS, a popular dusty-deck program for molecular dynamics, into Fortran D, a data-parallel dialect of Fortran. During this transformation we have encountered a number of difficulties that probably are neither limited to this particular application nor do they seem likely to be addressed by improved compiler technology in the near future. Our experience with GROMOS suggests a number of points to keep in mind when developing software that may at some time in its life cycle be parallelized with a data-parallel compiler. This note presents some guidelines for engineering data-parallel applications that are compatible with Fortran D or High Performance Fortran compilers.

  5. Superpose3D: a local structural comparison program that allows for user-defined structure representations.

    Directory of Open Access Journals (Sweden)

    Pier Federico Gherardini

    Full Text Available Local structural comparison methods can be used to find structural similarities involving functional protein patches such as enzyme active sites and ligand binding sites. The outcome of such analyses is critically dependent on the representation used to describe the structure. Indeed different categories of functional sites may require the comparison program to focus on different characteristics of the protein residues. We have therefore developed superpose3D, a novel structural comparison software that lets users specify, with a powerful and flexible syntax, the structure description most suited to the requirements of their analysis. Input proteins are processed according to the user's directives and the program identifies sets of residues (or groups of atoms that have a similar 3D position in the two structures. The advantages of using such a general purpose program are demonstrated with several examples. These test cases show that no single representation is appropriate for every analysis, hence the usefulness of having a flexible program that can be tailored to different needs. Moreover we also discuss how to interpret the results of a database screening where a known structural motif is searched against a large ensemble of structures. The software is written in C++ and is released under the open source GPL license. Superpose3D does not require any external library, runs on Linux, Mac OSX, Windows and is available at http://cbm.bio.uniroma2.it/superpose3D.

  6. Productive Parallel Programming: The PCN Approach

    Directory of Open Access Journals (Sweden)

    Ian Foster

    1992-01-01

    Full Text Available We describe the PCN programming system, focusing on those features designed to improve the productivity of scientists and engineers using parallel supercomputers. These features include a simple notation for the concise specification of concurrent algorithms, the ability to incorporate existing Fortran and C code into parallel applications, facilities for reusing parallel program components, a portable toolkit that allows applications to be developed on a workstation or small parallel computer and run unchanged on supercomputers, and integrated debugging and performance analysis tools. We survey representative scientific applications and identify problem classes for which PCN has proved particularly useful.

  7. Playable Stories: Making Programming and 3D Role-Playing Game Design Personally and Socially Relevant

    Science.gov (United States)

    Ingram-Goble, Adam

    2013-01-01

    This is an exploratory design study of a novel system for learning programming and 3D role-playing game design as tools for social change. This study was conducted at two sites. Participants in the study were ages 9-14 and worked for up to 15 hours with the platform to learn how to program and design video games with personally or socially…

  8. GPU-based, parallel-line, omni-directional integration of measured acceleration field to obtain the 3D pressure distribution

    Science.gov (United States)

    Wang, Jin; Zhang, Cao; Katz, Joseph

    2016-11-01

    A PIV based method to reconstruct the volumetric pressure field by direct integration of the 3D material acceleration directions has been developed. Extending the 2D virtual-boundary omni-directional method (Omni2D, Liu & Katz, 2013), the new 3D parallel-line omni-directional method (Omni3D) integrates the material acceleration along parallel lines aligned in multiple directions. Their angles are set by a spherical virtual grid. The integration is parallelized on a Tesla K40c GPU, which reduced the computing time from three hours to one minute for a single realization. To validate its performance, this method is utilized to calculate the 3D pressure fields in isotropic turbulence and channel flow using the JHU DNS Databases (http://turbulence.pha.jhu.edu). Both integration of the DNS acceleration as well as acceleration from synthetic 3D particles are tested. Results are compared to other method, e.g. solution to the Pressure Poisson Equation (e.g. PPE, Ghaemi et al., 2012) with Bernoulli based Dirichlet boundary conditions, and the Omni2D method. The error in Omni3D prediction is uniformly low, and its sensitivity to acceleration errors is local. It agrees with the PPE/Bernoulli prediction away from the Dirichlet boundary. The Omni3D method is also applied to experimental data obtained using tomographic PIV, and results are correlated with deformation of a compliant wall. ONR.

  9. A survey of parallel programming tools

    Science.gov (United States)

    Cheng, Doreen Y.

    1991-01-01

    This survey examines 39 parallel programming tools. Focus is placed on those tool capabilites needed for parallel scientific programming rather than for general computer science. The tools are classified with current and future needs of Numerical Aerodynamic Simulator (NAS) in mind: existing and anticipated NAS supercomputers and workstations; operating systems; programming languages; and applications. They are divided into four categories: suggested acquisitions, tools already brought in; tools worth tracking; and tools eliminated from further consideration at this time.

  10. 2D/3D Program work summary report, [January 1988--December 1992

    Energy Technology Data Exchange (ETDEWEB)

    Damerell, P. S.; Simons, J. W. [eds., MPR Associates, Washington, DC (United States)

    1993-06-01

    The 2D/3D Program was carried out by Germany, Japan and the United States to investigate the thermal-hydraulics of a PWR large-break LOCA. A contributory approach was utilized in which each country contributed significant effort to the program and all three countries shared the research results. Germany constructed and operated the Upper Plenum Test Facility (UPTF), and Japan constructed and operated the Cylindrical Core Test Facility (CCTF) and the Slab Core Test Facility (SCTF). The US contribution consisted of provision of advanced instrumentation to each of the three test facilities, and assessment of the TRAC computer code against the test results. Evaluations of the test results were carried out in all three countries. This report summarizes the 2D/3D Program in terms of the contributing efforts of the participants.

  11. Analysis results from the Los Alamos 2D/3D program

    Energy Technology Data Exchange (ETDEWEB)

    Boyack, B.E.; Cappiello, M.W.; Harmony, S.C.; Shire, P.R.; Siebe, D.A.

    1987-01-01

    Los Alamos National Laboratory is a participant in the 2D/3D program. Activities conducted at Los Alamos National Laboratory in support of 2D/3D program goals include analysis support of facility design, construction, and operation; provision of boundary and initial conditions for test-facility operations based on analysis of pressurized water reactors; performance of pretest and posttest predictions and analyses; and use of experimental results to validate and assess the single- and multi-dimensional, nonequilibrium features in the Transient Reactor Analysis Code (TRAC). During fiscal year 1987, Los Alamos conducted analytical assessment activities using data from the Slab Core Test Facility, The Cylindrical Core Test Facility, and the Upper Plenum Test Facility. Finally, Los Alamos continued work to provide TRAC improvements. In this paper, Los Alamos activities during fiscal year 1987 will be summarized; several significant accomplishments will be described in more detail to illustrate the work activities at Los Alamos.

  12. Integrated Task and Data Parallel Programming

    Science.gov (United States)

    Grimshaw, A. S.

    1998-01-01

    This research investigates the combination of task and data parallel language constructs within a single programming language. There are an number of applications that exhibit properties which would be well served by such an integrated language. Examples include global climate models, aircraft design problems, and multidisciplinary design optimization problems. Our approach incorporates data parallel language constructs into an existing, object oriented, task parallel language. The language will support creation and manipulation of parallel classes and objects of both types (task parallel and data parallel). Ultimately, the language will allow data parallel and task parallel classes to be used either as building blocks or managers of parallel objects of either type, thus allowing the development of single and multi-paradigm parallel applications. 1995 Research Accomplishments In February I presented a paper at Frontiers 1995 describing the design of the data parallel language subset. During the spring I wrote and defended my dissertation proposal. Since that time I have developed a runtime model for the language subset. I have begun implementing the model and hand-coding simple examples which demonstrate the language subset. I have identified an astrophysical fluid flow application which will validate the data parallel language subset. 1996 Research Agenda Milestones for the coming year include implementing a significant portion of the data parallel language subset over the Legion system. Using simple hand-coded methods, I plan to demonstrate (1) concurrent task and data parallel objects and (2) task parallel objects managing both task and data parallel objects. My next steps will focus on constructing a compiler and implementing the fluid flow application with the language. Concurrently, I will conduct a search for a real-world application exhibiting both task and data parallelism within the same program. Additional 1995 Activities During the fall I collaborated

  13. The PISCES 2 parallel programming environment

    Science.gov (United States)

    Pratt, Terrence W.

    1987-01-01

    PISCES 2 is a programming environment for scientific and engineering computations on MIMD parallel computers. It is currently implemented on a flexible FLEX/32 at NASA Langley, a 20 processor machine with both shared and local memories. The environment provides an extended Fortran for applications programming, a configuration environment for setting up a run on the parallel machine, and a run-time environment for monitoring and controlling program execution. This paper describes the overall design of the system and its implementation on the FLEX/32. Emphasis is placed on several novel aspects of the design: the use of a carefully defined virtual machine, programmer control of the mapping of virtual machine to actual hardware, forces for medium-granularity parallelism, and windows for parallel distribution of data. Some preliminary measurements of storage use are included.

  14. Automatic Parallelization Tool: Classification of Program Code for Parallel Computing

    Directory of Open Access Journals (Sweden)

    Mustafa Basthikodi

    2016-04-01

    Full Text Available Performance growth of single-core processors has come to a halt in the past decade, but was re-enabled by the introduction of parallelism in processors. Multicore frameworks along with Graphical Processing Units empowered to enhance parallelism broadly. Couples of compilers are updated to developing challenges forsynchronization and threading issues. Appropriate program and algorithm classifications will have advantage to a great extent to the group of software engineers to get opportunities for effective parallelization. In present work we investigated current species for classification of algorithms, in that related work on classification is discussed along with the comparison of issues that challenges the classification. The set of algorithms are chosen which matches the structure with different issues and perform given task. We have tested these algorithms utilizing existing automatic species extraction toolsalong with Bones compiler. We have added functionalities to existing tool, providing a more detailed characterization. The contributions of our work include support for pointer arithmetic, conditional and incremental statements, user defined types, constants and mathematical functions. With this, we can retain significant data which is not captured by original speciesof algorithms. We executed new theories into the device, empowering automatic characterization of program code.

  15. Parallel programming with PCN. Revision 2

    Energy Technology Data Exchange (ETDEWEB)

    Foster, I.; Tuecke, S.

    1993-01-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and Cthat allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous ftp from Argonne National Laboratory in the directory pub/pcn at info.mcs. ani.gov (cf. Appendix A). This version of this document describes PCN version 2.0, a major revision of the PCN programming system. It supersedes earlier versions of this report.

  16. DVR3D: a program suite for the calculation of rotation-vibration spectra of triatomic molecules

    Science.gov (United States)

    Tennyson, Jonathan; Kostin, Maxim A.; Barletta, Paolo; Harris, Gregory J.; Polyansky, Oleg L.; Ramanlal, Jayesh; Zobov, Nikolai F.

    2004-11-01

    from: CPC Program Library, Queen's University of Belfast, N. Ireland Reference in CPC to previous version: 86 (1995) 175 Catalogue identifier of previous version: ADAK Authors of previous version: J. Tennyson, J.R. Henderson and N.G. Fulton Does the new version supersede the original program?: DVR3DRJZ supersedes DVR3DRJ Computer: PC running Linux Installation: desktop Other machines on which program tested: Compaq running True64 Unix; SGI Origin 2000, Sunfire V750 and V880 systems running SunOS, IBM p690 Regatta running AIX Programming language used in the new version: Fortran 90 Memory required to execute: case dependent No. of lines in distributed program, including test data, etc.: 4203 No. of bytes in distributed program, including test data, etc.: 30 087 Has code been vectorised or parallelised?: The code has been extensively vectorised. A parallel version of the code, PDVR3D has been developed [1], contact the first author for details Additional keywords: perpendicular embedding Distribution format: gz Nature of physical problem: DVR3DRJZ calculates the bound vibrational or Coriolis decoupled rotational-vibrational states of a triatomic system in body-fixed Jacobi (scattering) or Radau coordinates [2] Method of solution: All coordinates are treated in a discrete variable representation (DVR). The angular coordinate uses a DVR based on (associated) Legendre polynomials and the radial coordinates utilise a DVR based on either Morse oscillator-like or spherical oscillator functions. Intermediate diagonalisation and truncation is performed on the hierarchical expression of the Hamiltonian operator to yield the final secular problem. DVR3DRJ provides the vibrational wavefunctions necessary for ROTLEV3, ROLEV3B or ROTLEV3Z to calculate rotationally excited states, DIPOLE3 to calculate rotational-vibrational transition strengths and XPECT3 to compute expectation values Restrictions on the complexity of the problem: (1) The size of the final Hamiltonian matrix that can

  17. Rhorix: An interface between quantum chemical topology and the 3D graphics program blender.

    Science.gov (United States)

    Mills, Matthew J L; Sale, Kenneth L; Simmons, Blake A; Popelier, Paul L A

    2017-08-31

    Chemical research is assisted by the creation of visual representations that map concepts (such as atoms and bonds) to 3D objects. These concepts are rooted in chemical theory that predates routine solution of the Schrödinger equation for systems of interesting size. The method of Quantum Chemical Topology (QCT) provides an alternative, parameter-free means to understand chemical phenomena directly from quantum mechanical principles. Representation of the topological elements of QCT has lagged behind the best tools available. Here, we describe a general abstraction (and corresponding file format) that permits the definition of mappings between topological objects and their 3D representations. Possible mappings are discussed and a canonical example is suggested, which has been implemented as a Python "Add-On" named Rhorix for the state-of-the-art 3D modeling program Blender. This allows chemists to use modern drawing tools and artists to access QCT data in a familiar context. A number of examples are discussed. © 2017 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc. © 2017 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc.

  18. Detecting drug use in adolescents using a 3D simulation program

    Directory of Open Access Journals (Sweden)

    Luis Iribarne

    2010-11-01

    Full Text Available This work presents a new 3D simulation program, called MiiSchool, and its application to the detection of problem behaviours appearing in school settings. We begin by describing some of the main features of the Mii School program. Then, we present the results of a study in which adolescents responded to Mii School simulations involving the consumption of alcoholic drinks, cigarettes, cannabis, cocaine, and MDMA (ecstasy. We established a“risk profile” based on the observed response patterns. We also present results concerning user satisfaction with the program and the extent to which users felt that the simulated scenes were realistic. Lastly, we discuss the usefulness of Mii School as a tool for assessing drug use in school settings.

  19. Modeling of AAR affected structures using the GROW3D FEA program

    Energy Technology Data Exchange (ETDEWEB)

    Curtis, D.D. [Acres International Limited, Niagara Falls, Ontario (Canada)

    1995-12-31

    The objective of this paper is to present a rational and practical methodology for finite element stress analysis of AAR affected structures. The methodology is presented using case history studies which illustrate the practical application of the GROW3D program. GROW3D uses an anisotropic expansion strain function and concrete properties which simulates the following key characteristics of AAR affected concrete (1) concrete growth expansion rates dependent on the stress vectors at each point; (2) concrete growth rate variation due to changes in moisture content and temperature; and (3) time-dependent, enhanced creep behavior. GROW3D has been applied to several hydropower structures and case histories from the Mactaquac Generating Station are presented herein. Mactaquac is selected because extensive instrumentation data before and after remedial measures have been used to calibrate and test the model. The results of analyses of three different structures are given, i.e., the intake, diversion sluiceway and powerhouse. The analysis results are used to identify potential structural problems and the need and timing of remedial measures. The output from GROW3D includes displacement rates, total displacements, global stresses and local factors of safety. The local factors of safety (or strength to stress ratios) are computed for several modes of failure including crushing, cracking, shear and sliding on horizontal construction joints. The analysis results are compared with field measurements which are taken before and after slot cutting. The effects of including the above-mentioned characteristics and other modeling assumptions on the computed results is discussed herein. Finally, a brief discussion on the recent enhancements to the model is given. These enhancements include the implementation of a more rigorous treatment of concrete creep effects.

  20. Playable stories: Making programming and 3D role-playing game design personally and socially relevant

    Science.gov (United States)

    Ingram-Goble, Adam

    This is an exploratory design study of a novel system for learning programming and 3D role-playing game design as tools for social change. This study was conducted at two sites. Participants in the study were ages 9-14 and worked for up to 15 hours with the platform to learn how to program and design video games with personally or socially relevant narratives. This first study was successful in that students learned to program a narrative game, and they viewed the social problem framing for the practices as an interesting aspect of the experience. The second study provided illustrative examples of how providing less general structure up-front, afforded players the opportunity to produce the necessary structures as needed for their particular design, and therefore had a richer understanding of what those structures represented. This study demonstrates that not only were participants able to use computational thinking skills such as Boolean and conditional logic, planning, modeling, abstraction, and encapsulation, they were able to bridge these skills to social domains they cared about. In particular, participants created stories about socially relevant topics without to explicit pushes by the instructors. The findings also suggest that the rapid uptake, and successful creation of personally and socially relevant narratives may have been facilitated by close alignment between the conceptual tools represented in the platform, and the domain of 3D role-playing games.

  1. Reactor safety issues resolved by the 2D/3D program

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1995-09-01

    The 2D/3D Program studied multidimensional thermal-hydraulics in a PWR core and primary system during the end-of-blowdown and post-blowdown phases of a large-break LOCA (LBLOCA), and during selected small-break LOCA (SBLOCA) transients. The program included tests at the Cylindrical Core Test Facility (CCTF), the Slab Core Test Facility (SCTF), and the Upper Plenum Test Facility (UPTF), and computer analyses using TRAC. Tests at CCTF investigated core thermal-hydraulics and overall system behavior while tests at SCTF concentrated on multidimensional core thermal-hydraulics. The UPTF tests investigated two-phase flow behavior in the downcomer, upper plenum, tie plate region, and primary loops. TRAC analyses evaluated thermal-hydraulic behavior throughout the primary system in tests as well as in PWRs. This report summarizes the test and analysis results in each of the main areas where improved information was obtained in the 2D/3D Program. The discussion is organized in terms of the reactor safety issues investigated. This report was prepared in a coordination among US, Germany and Japan. US and Germany have published the report as NUREG/IA-0127 and GRS-101 respectively. (author).

  2. Plasmonics and the parallel programming problem

    Science.gov (United States)

    Vishkin, Uzi; Smolyaninov, Igor; Davis, Chris

    2007-02-01

    While many parallel computers have been built, it has generally been too difficult to program them. Now, all computers are effectively becoming parallel machines. Biannual doubling in the number of cores on a single chip, or faster, over the coming decade is planned by most computer vendors. Thus, the parallel programming problem is becoming more critical. The only known solution to the parallel programming problem in the theory of computer science is through a parallel algorithmic theory called PRAM. Unfortunately, some of the PRAM theory assumptions regarding the bandwidth between processors and memories did not properly reflect a parallel computer that could be built in previous decades. Reaching memories, or other processors in a multi-processor organization, required off-chip connections through pins on the boundary of each electric chip. Using the number of transistors that is becoming available on chip, on-chip architectures that adequately support the PRAM are becoming possible. However, the bandwidth of off-chip connections remains insufficient and the latency remains too high. This creates a bottleneck at the boundary of the chip for a PRAM-On-Chip architecture. This also prevents scalability to larger "supercomputing" organizations spanning across many processing chips that can handle massive amounts of data. Instead of connections through pins and wires, power-efficient CMOS-compatible on-chip conversion to plasmonic nanowaveguides is introduced for improved latency and bandwidth. Proper incorporation of our ideas offer exciting avenues to resolving the parallel programming problem, and an alternative way for building faster, more useable and much more compact supercomputers.

  3. OpenCL parallel programming development cookbook

    CERN Document Server

    Tay, Raymond

    2013-01-01

    OpenCL Parallel Programming Development Cookbook will provide a set of advanced recipes that can be utilized to optimize existing code. This book is therefore ideal for experienced developers with a working knowledge of C/C++ and OpenCL.This book is intended for software developers who have often wondered what to do with that newly bought CPU or GPU they bought other than using it for playing computer games; this book is also for developers who have a working knowledge of C/C++ and who want to learn how to write parallel programs in OpenCL so that life isn't too boring.

  4. Wireless Rover Meets 3D Design and Product Development

    Science.gov (United States)

    Deal, Walter F., III; Hsiung, Steve C.

    2016-01-01

    Today there are a number of 3D printing technologies that are low cost and within the budgets of middle and high school programs. Educational technology companies offer a variety of 3D printing technologies and parallel curriculum materials to enable technology and engineering teachers to easily add 3D learning activities to their programs.…

  5. Wireless Rover Meets 3D Design and Product Development

    Science.gov (United States)

    Deal, Walter F., III; Hsiung, Steve C.

    2016-01-01

    Today there are a number of 3D printing technologies that are low cost and within the budgets of middle and high school programs. Educational technology companies offer a variety of 3D printing technologies and parallel curriculum materials to enable technology and engineering teachers to easily add 3D learning activities to their programs.…

  6. 冷冻电镜三维重构在CPU-GPU系统中的并行性%Parallelism for cryo-EM 3D reconstruction on CPU-GPU heterogeneous system

    Institute of Scientific and Technical Information of China (English)

    李兴建; 李临川; 谭光明; 张佩珩

    2011-01-01

    It is a challenge to efficiently utilize massive parallelism on both applications and architectures for heterogeneous systems. A practice of accelerating a cryo-EM 3D program was presented on how to exploit and orchestrate parallelism of applications to take advantage of the underlying parallelism exposed at the architecture level. All possible parallelism in cryo-EM 3D was exploited, and a self-adaptive dynamic scheduling algorithm was leveraged to efficiently implement parallelism mapping between the application and architecture. The experiment on a part of dawning nebulae system (32 nodes) confirms that a hierarchical parallelism is an efficient pattern of parallel programming to utilize capabilities of both CPU and GPU on a heterogeneous system. The hybrid CPU-GPU program improves performance by 2. 4 times over the best CPU-only one for certain problem sizes.%为了有效地发掘和利用异构系统在应用和体系结构上的并行性,以冷冻电镜三维重构为例展示如何利用应用程序潜在的并行性.通过分析重构计算所有的并行性,实现了将动态自适应的划分算法用于任务在异构系统上高效的分发.在曙光星云系统的部分节点系统(32节点)上评估并行化的程序性能.实验证明:多层次的并行化是CPU与GPU异构系统上开发并行性的有效模式;CPU-GPU混合程序在给定问题规模上相对单纯CPU程序获得2.4倍加速比.

  7. The 3D elevation program - Precision agriculture and other farm practices

    Science.gov (United States)

    Sugarbaker, Larry J.; Carswell, Jr., William J.

    2016-12-27

    The agriculture industry, including farmers who rely on advanced technologies, increasingly use light detection and ranging (lidar) data for crop management to enhance agricultural productivity. Annually, the combination of greater yields and reduced crop losses is estimated to increase revenue by \\$2 billion for America's farmers when terrain data derived from lidar are available for croplands. Additionally, the Natural Resources Conservation Service (NRCS) estimates that the value of improved services for farmers, through its farm assistance program, would be \\$79 million annually if lidar-derived digital elevation models (DEMs) are made available to the public.The 3D Elevation Program (3DEP) of the U.S. Geological Survey provides the programmatic infrastructure to generate and supply superior, lidar-derived terrain data to the agriculture industry, which would allow farms to refine agricultural practices and produce crops more efficiently. By providing data to users, 3DEP reduces users’ costs and risks, allowing them to concentrate on mission objectives. 3DEP includes (1) data acquisition partnerships that leverage funding, (2) contracts with experienced private mapping firms, (3) technical expertise, lidar data standards and specifications, and (4) most importantly, public access to high-quality 3D elevation data.

  8. Evaluation of single photon and Geiger mode Lidar for the 3D Elevation Program

    Science.gov (United States)

    Stoker, Jason M.; Abdullah, Qassim; Nayegandhi, Amar; Winehouse, Jayna

    2016-01-01

    Data acquired by Harris Corporation’s (Melbourne, FL, USA) Geiger-mode IntelliEarth™ sensor and Sigma Space Corporation’s (Lanham-Seabrook, MD, USA) Single Photon HRQLS sensor were evaluated and compared to accepted 3D Elevation Program (3DEP) data and survey ground control to assess the suitability of these new technologies for the 3DEP. While not able to collect data currently to meet USGS lidar base specification, this is partially due to the fact that the specification was written for linear-mode systems specifically. With little effort on part of the manufacturers of the new lidar systems and the USGS Lidar specifications team, data from these systems could soon serve the 3DEP program and its users. Many of the shortcomings noted in this study have been reported to have been corrected or improved upon in the next generation sensors.

  9. Contributions to computational stereology and parallel programming

    DEFF Research Database (Denmark)

    Rasmusson, Allan

    rotator, even without the need for isotropic sections. To meet the need for computational power to perform image restoration of virtual tissue sections, parallel programming on GPUs has also been part of the project. This has lead to a significant change in paradigm for a previously developed surgical...

  10. Parallel Volunteer Learning during Youth Programs

    Science.gov (United States)

    Lesmeister, Marilyn K.; Green, Jeremy; Derby, Amy; Bothum, Candi

    2012-01-01

    Lack of time is a hindrance for volunteers to participate in educational opportunities, yet volunteer success in an organization is tied to the orientation and education they receive. Meeting diverse educational needs of volunteers can be a challenge for program managers. Scheduling a Volunteer Learning Track for chaperones that is parallel to a…

  11. Parallel GRISYS/Power Challenge System Version 1.0 and 3D Prestack Depth Migration Package

    Institute of Scientific and Technical Information of China (English)

    Zhao Zhenwen

    1995-01-01

    @@ Based on the achievements and experience of seismic data parallel processing made in the past years by Beijing Global Software Corporation (GS) of CNPC, Parallel GRISYS/Power Challenge seismic data processing system version 1.0 has been cooperatively developed and integrated on the Power Challenge computer by GS, SGI (USA) and Shuangyuan Company of Academia Sinica.

  12. Concurrency-based approaches to parallel programming

    Science.gov (United States)

    Kale, L.V.; Chrisochoides, N.; Kohl, J.; Yelick, K.

    1995-01-01

    The inevitable transition to parallel programming can be facilitated by appropriate tools, including languages and libraries. After describing the needs of applications developers, this paper presents three specific approaches aimed at development of efficient and reusable parallel software for irregular and dynamic-structured problems. A salient feature of all three approaches in their exploitation of concurrency within a processor. Benefits of individual approaches such as these can be leveraged by an interoperability environment which permits modules written using different approaches to co-exist in single applications.

  13. Fabrication of 3-D Reconstituted Organoid Arrays by DNA-Programmed Assembly of Cells (DPAC).

    Science.gov (United States)

    Todhunter, Michael E; Weber, Robert J; Farlow, Justin; Jee, Noel Y; Cerchiari, Alec E; Gartner, Zev J

    2016-09-13

    Tissues are the organizational units of function in metazoan organisms. Tissues comprise an assortment of cellular building blocks, soluble factors, and extracellular matrix (ECM) composed into specific three-dimensional (3-D) structures. The capacity to reconstitute tissues in vitro with the structural complexity observed in vivo is key to understanding processes such as morphogenesis, homeostasis, and disease. In this article, we describe DNA-programmed assembly of cells (DPAC), a method to fabricate viable, functional arrays of organoid-like tissues within 3-D ECM gels. In DPAC, dissociated cells are chemically functionalized with degradable oligonucleotide "Velcro," allowing rapid, specific, and reversible cell adhesion to a two-dimensional (2-D) template patterned with complementary DNA. An iterative assembly process builds up organoids, layer-by-layer, from this initial 2-D template and into the third dimension. Cleavage of the DNA releases the completed array of tissues that are captured and fully embedded in ECM gels for culture and observation. DPAC controls the size, shape, composition, and spatial heterogeneity of organoids and permits positioning of constituent cells with single-cell resolution even within cultures several centimeters long. © 2016 by John Wiley & Sons, Inc.

  14. Towards HPC++: A Unified Approach to Parallel Programming in C++

    Science.gov (United States)

    1998-10-30

    Compositional C++ or CC++, is a general purpose parallel programming language designed to support a wide range of parallel programming styles. By...appropriate for parallelizing the range of applications that one would write in C++. CC++ supports the integration of different parallel programming styles

  15. Programming massively parallel processors a hands-on approach

    CERN Document Server

    Kirk, David B

    2010-01-01

    Programming Massively Parallel Processors discusses basic concepts about parallel programming and GPU architecture. ""Massively parallel"" refers to the use of a large number of processors to perform a set of computations in a coordinated parallel way. The book details various techniques for constructing parallel programs. It also discusses the development process, performance level, floating-point format, parallel patterns, and dynamic parallelism. The book serves as a teaching guide where parallel programming is the main topic of the course. It builds on the basics of C programming for CUDA, a parallel programming environment that is supported on NVI- DIA GPUs. Composed of 12 chapters, the book begins with basic information about the GPU as a parallel computer source. It also explains the main concepts of CUDA, data parallelism, and the importance of memory access efficiency using CUDA. The target audience of the book is graduate and undergraduate students from all science and engineering disciplines who ...

  16. Specifying and Executing Optimizations for Parallel Programs

    Directory of Open Access Journals (Sweden)

    William Mansky

    2014-07-01

    Full Text Available Compiler optimizations, usually expressed as rewrites on program graphs, are a core part of all modern compilers. However, even production compilers have bugs, and these bugs are difficult to detect and resolve. The problem only becomes more complex when compiling parallel programs; from the choice of graph representation to the possibility of race conditions, optimization designers have a range of factors to consider that do not appear when dealing with single-threaded programs. In this paper we present PTRANS, a domain-specific language for formal specification of compiler transformations, and describe its executable semantics. The fundamental approach of PTRANS is to describe program transformations as rewrites on control flow graphs with temporal logic side conditions. The syntax of PTRANS allows cleaner, more comprehensible specification of program optimizations; its executable semantics allows these specifications to act as prototypes for the optimizations themselves, so that candidate optimizations can be tested and refined before going on to include them in a compiler. We demonstrate the use of PTRANS to state, test, and refine the specification of a redundant store elimination optimization on parallel programs.

  17. Scaling and performance of a 3-D radiation hydrodynamics code on message-passing parallel computers: final report

    Energy Technology Data Exchange (ETDEWEB)

    Hayes, J C; Norman, M

    1999-10-28

    This report details an investigation into the efficacy of two approaches to solving the radiation diffusion equation within a radiation hydrodynamic simulation. Because leading-edge scientific computing platforms have evolved from large single-node vector processors to parallel aggregates containing tens to thousands of individual CPU's, the ability of an algorithm to maintain high compute efficiency when distributed over a large array of nodes is critically important. The viability of an algorithm thus hinges upon the tripartite question of numerical accuracy, total time to solution, and parallel efficiency.

  18. Timing-Sequence Testing of Parallel Programs

    Institute of Scientific and Technical Information of China (English)

    LIANG Yu; LI Shu; ZHANG Hui; HAN Chengde

    2000-01-01

    Testing of parallel programs involves two parts-testing of controlflow within the processes and testing of timing-sequence.This paper focuses on the latter, particularly on the timing-sequence of message-passing paradigms.Firstly the coarse-grained SYN-sequence model is built up to describe the execution of distributed programs. All of the topics discussed in this paper are based on it. The most direct way to test a program is to run it. A fault-free parallel program should be of both correct computing results and proper SYN-sequence. In order to analyze the validity of observed SYN-sequence, this paper presents the formal specification (Backus Normal Form) of the valid SYN-sequence. Till now there is little work about the testing coverage for distributed programs. Calculating the number of the valid SYN-sequences is the key to coverage problem, while the number of the valid SYN-sequences is terribly large and it is very hard to obtain the combination law among SYN-events. In order to resolve this problem, this paper proposes an efficient testing strategy-atomic SYN-event testing, which is to linearize the SYN-sequence (making it only consist of serial atomic SYN-events) first and then test each atomic SYN-event independently. This paper particularly provides the calculating formula about the number of the valid SYN-sequences for tree-topology atomic SYN-event (broadcast and combine). Furthermore,the number of valid SYN-sequences also,to some degree, mirrors the testability of parallel programs. Taking tree-topology atomic SYN-event as an example, this paper demonstrates the testability and communication speed of the tree-topology atomic SYN-event under different numbers of branches in order to achieve a more satisfactory tradeoff between testability and communication efficiency.

  19. Accepting the T3D

    Energy Technology Data Exchange (ETDEWEB)

    Rich, D.O.; Pope, S.C.; DeLapp, J.G.

    1994-10-01

    In April, a 128 PE Cray T3D was installed at Los Alamos National Laboratory`s Advanced Computing Laboratory as part of the DOE`s High-Performance Parallel Processor Program (H4P). In conjunction with CRI, the authors implemented a 30 day acceptance test. The test was constructed in part to help them understand the strengths and weaknesses of the T3D. In this paper, they briefly describe the H4P and its goals. They discuss the design and implementation of the T3D acceptance test and detail issues that arose during the test. They conclude with a set of system requirements that must be addressed as the T3D system evolves.

  20. Parallel Programming with MatlabMPI

    CERN Document Server

    Kepner, J V

    2001-01-01

    MatlabMPI is a Matlab implementation of the Message Passing Interface (MPI) standard and allows any Matlab program to exploit multiple processors. MatlabMPI currently implements the basic six functions that are the core of the MPI point-to-point communications standard. The key technical innovation of MatlabMPI is that it implements the widely used MPI ``look and feel'' on top of standard Matlab file I/O, resulting in an extremely compact (~100 lines) and ``pure'' implementation which runs anywhere Matlab runs. The performance has been tested on both shared and distributed memory parallel computers. MatlabMPI can match the bandwidth of C based MPI at large message sizes. A test image filtering application using MatlabMPI achieved a speedup of ~70 on a parallel computer.

  1. Array distribution in data-parallel programs

    Science.gov (United States)

    Chatterjee, Siddhartha; Gilbert, John R.; Schreiber, Robert; Sheffler, Thomas J.

    1994-01-01

    We consider distribution at compile time of the array data in a distributed-memory implementation of a data-parallel program written in a language like Fortran 90. We allow dynamic redistribution of data and define a heuristic algorithmic framework that chooses distribution parameters to minimize an estimate of program completion time. We represent the program as an alignment-distribution graph. We propose a divide-and-conquer algorithm for distribution that initially assigns a common distribution to each node of the graph and successively refines this assignment, taking computation, realignment, and redistribution costs into account. We explain how to estimate the effect of distribution on computation cost and how to choose a candidate set of distributions. We present the results of an implementation of our algorithms on several test problems.

  2. XJava: Exploiting Parallelism with Object-Oriented Stream Programming

    Science.gov (United States)

    Otto, Frank; Pankratius, Victor; Tichy, Walter F.

    This paper presents the XJava compiler for parallel programs. It exploits parallelism based on an object-oriented stream programming paradigm. XJava extends Java with new parallel constructs that do not expose programmers to low-level details of parallel programming on shared memory machines. Tasks define composable parallel activities, and new operators allow an easier expression of parallel patterns, such as pipelines, divide and conquer, or master/worker. We also present an automatic run-time mechanism that extends our previous work to automatically map tasks and parallel statements to threads.

  3. Profiling parallel Mercury programs with ThreadScope

    CERN Document Server

    Bone, Paul

    2011-01-01

    The behavior of parallel programs is even harder to understand than the behavior of sequential programs. Parallel programs may suffer from any of the performance problems affecting sequential programs, as well as from several problems unique to parallel systems. Many of these problems are quite hard (or even practically impossible) to diagnose without help from specialized tools. We present a proposal for a tool for profiling the parallel execution of Mercury programs, a proposal whose implementation we have already started. This tool is an adaptation and extension of the ThreadScope profiler that was first built to help programmers visualize the execution of parallel Haskell programs.

  4. Computational modeling of pitching cylinder-type ocean wave energy converters using 3D MPI-parallel simulations

    Science.gov (United States)

    Freniere, Cole; Pathak, Ashish; Raessi, Mehdi

    2016-11-01

    Ocean Wave Energy Converters (WECs) are devices that convert energy from ocean waves into electricity. To aid in the design of WECs, an advanced computational framework has been developed which has advantages over conventional methods. The computational framework simulates the performance of WECs in a virtual wave tank by solving the full Navier-Stokes equations in 3D, capturing the fluid-structure interaction, nonlinear and viscous effects. In this work, we present simulations of the performance of pitching cylinder-type WECs and compare against experimental data. WECs are simulated at both model and full scales. The results are used to determine the role of the Keulegan-Carpenter (KC) number. The KC number is representative of viscous drag behavior on a bluff body in an oscillating flow, and is considered an important indicator of the dynamics of a WEC. Studying the effects of the KC number is important for determining the validity of the Froude scaling and the inviscid potential flow theory, which are heavily relied on in the conventional approaches to modeling WECs. Support from the National Science Foundation is gratefully acknowledged.

  5. Pixel parallel localized driver design for a 128 x 256 pixel array 3D 1Gfps image sensor

    Science.gov (United States)

    Zhang, C.; Dao, V. T. S.; Etoh, T. G.; Charbon, E.

    2017-02-01

    In this paper, a 3D 1Gfps BSI image sensor is proposed, where 128 × 256 pixels are located in the top-tier chip and a 32 × 32 localized driver array in the bottom-tier chip. Pixels are designed with Multiple Collection Gates (MCG), which collects photons selectively with different collection gates being active at intervals of 1ns to achieve 1Gfps. For the drivers, a global PLL is designed, which consists of a ring oscillator with 6-stage current starved differential inverters, achieving a wide frequency tuning range from 40MHz to 360MHz (20ps rms jitter). The drivers are the replicas of the ring oscillator that operates within a PLL. Together with level shifters and XNOR gates, continuous 3.3V pulses are generated with desired pulse width, which is 1/12 of the PLL clock period. The driver array is activated by a START signal, which propagates through a highly balanced clock tree, to activate all the pixels at the same time with virtually negligible skew.

  6. Scalable, incremental learning with MapReduce parallelization for cell detection in high-resolution 3D microscopy data

    KAUST Repository

    Sung, Chul

    2013-08-01

    Accurate estimation of neuronal count and distribution is central to the understanding of the organization and layout of cortical maps in the brain, and changes in the cell population induced by brain disorders. High-throughput 3D microscopy techniques such as Knife-Edge Scanning Microscopy (KESM) are enabling whole-brain survey of neuronal distributions. Data from such techniques pose serious challenges to quantitative analysis due to the massive, growing, and sparsely labeled nature of the data. In this paper, we present a scalable, incremental learning algorithm for cell body detection that can address these issues. Our algorithm is computationally efficient (linear mapping, non-iterative) and does not require retraining (unlike gradient-based approaches) or retention of old raw data (unlike instance-based learning). We tested our algorithm on our rat brain Nissl data set, showing superior performance compared to an artificial neural network-based benchmark, and also demonstrated robust performance in a scenario where the data set is rapidly growing in size. Our algorithm is also highly parallelizable due to its incremental nature, and we demonstrated this empirically using a MapReduce-based implementation of the algorithm. We expect our scalable, incremental learning approach to be widely applicable to medical imaging domains where there is a constant flux of new data. © 2013 IEEE.

  7. Verification and validation of a parallel 3D direct simulation Monte Carlo solver for atmospheric entry applications

    Science.gov (United States)

    Nizenkov, Paul; Noeding, Peter; Konopka, Martin; Fasoulas, Stefanos

    2017-03-01

    The in-house direct simulation Monte Carlo solver PICLas, which enables parallel, three-dimensional simulations of rarefied gas flows, is verified and validated. Theoretical aspects of the method and the employed schemes are briefly discussed. Considered cases include simple reservoir simulations and complex re-entry geometries, which were selected from literature and simulated with PICLas. First, the chemistry module is verified using simple numerical and analytical solutions. Second, simulation results of the rarefied gas flow around a 70° blunted-cone, the REX Free-Flyer as well as multiple points of the re-entry trajectory of the Orion capsule are presented in terms of drag and heat flux. A comparison to experimental measurements as well as other numerical results shows an excellent agreement across the different simulation cases. An outlook on future code development and applications is given.

  8. A Tutorial on Parallel and Concurrent Programming in Haskell

    Science.gov (United States)

    Peyton Jones, Simon; Singh, Satnam

    This practical tutorial introduces the features available in Haskell for writing parallel and concurrent programs. We first describe how to write semi-explicit parallel programs by using annotations to express opportunities for parallelism and to help control the granularity of parallelism for effective execution on modern operating systems and processors. We then describe the mechanisms provided by Haskell for writing explicitly parallel programs with a focus on the use of software transactional memory to help share information between threads. Finally, we show how nested data parallelism can be used to write deterministically parallel programs which allows programmers to use rich data types in data parallel programs which are automatically transformed into flat data parallel versions for efficient execution on multi-core processors.

  9. Programming in Manticore, a Heterogenous Parallel Functional Language

    Science.gov (United States)

    Fluet, Matthew; Bergstrom, Lars; Ford, Nic; Rainey, Mike; Reppy, John; Shaw, Adam; Xiao, Yingqi

    The Manticore project is an effort to design and implement a new functional language for parallel programming. Unlike many earlier parallel languages, Manticore is a heterogeneous language that supports parallelism at multiple levels. Specifically, the Manticore language combines Concurrent ML-style explicit concurrency with fine-grain, implicitly threaded, parallel constructs. These lectures will introduce the Manticore language and explore a variety of programs written to take advantage of heterogeneous parallelism.

  10. Professional WebGL Programming Developing 3D Graphics for the Web

    CERN Document Server

    Anyuru, Andreas

    2012-01-01

    Everything you need to know about developing hardware-accelerated 3D graphics with WebGL! As the newest technology for creating 3D graphics on the web, in both games, applications, and on regular websites, WebGL gives web developers the capability to produce eye-popping graphics. This book teaches you how to use WebGL to create stunning cross-platform apps. The book features several detailed examples that show you how to develop 3D graphics with WebGL, including explanations of code snippets that help you understand the why behind the how. You will also develop a stronger understanding of W

  11. Four styles of parallel and net programming

    Institute of Scientific and Technical Information of China (English)

    Zhiwei XU; Yongqiang HE; Wei LIN; Li ZHA

    2009-01-01

    This paper reviews the programming landscape for parallel and network computing systems, focusing on four styles of concurrent programming models, and example languages/libraries. The four styles correspond to four scales of the targeted systems. At the smallest coprocessor scale, Single Instruction Multiple Thread (SIMT) and Compute Unified Device Architecture (CUDA) are considered. Transactional memory is discussed at the multicore or process scale. The MapReduce style is ex-amined at the datacenter scale. At the Internet scale, Grid Service Markup Language (GSML) is reviewed, which intends to integrate resources distributed across multiple dat-acenters.The four styles are concerned with and emphasize differ-ent issues, which are needed by systems at different scales. This paper discusses issues related to efficiency, ease of use, and expressiveness.

  12. Parallel Programming Strategies for Irregular Adaptive Applications

    Science.gov (United States)

    Biswas, Rupak; Biegel, Bryan (Technical Monitor)

    2001-01-01

    Achieving scalable performance for dynamic irregular applications is eminently challenging. Traditional message-passing approaches have been making steady progress towards this goal; however, they suffer from complex implementation requirements. The use of a global address space greatly simplifies the programming task, but can degrade the performance for such computations. In this work, we examine two typical irregular adaptive applications, Dynamic Remeshing and N-Body, under competing programming methodologies and across various parallel architectures. The Dynamic Remeshing application simulates flow over an airfoil, and refines localized regions of the underlying unstructured mesh. The N-Body experiment models two neighboring Plummer galaxies that are about to undergo a merger. Both problems demonstrate dramatic changes in processor workloads and interprocessor communication with time; thus, dynamic load balancing is a required component.

  13. VERIFICATION OF PARALLEL AUTOMATA-BASED PROGRAMS

    Directory of Open Access Journals (Sweden)

    M. A. Lukin

    2014-01-01

    Full Text Available The paper deals with an interactive method of automatic verification for parallel automata-based programs. The hierarchical state machines can be implemented in different threads and can interact with each other. Verification is done by means of Spin tool and includes automatic Promela model construction, conversion of LTL-formula to Spin format and counterexamples in terms of automata. Interactive verification gives the possibility to decrease verification time and increase the maximum size of verifiable programs. Considered method supports verification of the parallel system for hierarchical automata that interact with each other through messages and shared variables. The feature of automaton model is that each state machine is considered as a new data type and can have an arbitrary bounded number of instances. Each state machine in the system can run a different state machine in a new thread or have nested state machine. This method was implemented in the developed Stater tool. Stater shows correct operation for all test cases.

  14. Extending a serial 3D two-phase CFD code to parallel execution over MPI by using the PETSc library for domain decomposition

    CERN Document Server

    Ervik, Åsmund; Müller, Bernhard

    2014-01-01

    To leverage the last two decades' transition in High-Performance Computing (HPC) towards clusters of compute nodes bound together with fast interconnects, a modern scalable CFD code must be able to efficiently distribute work amongst several nodes using the Message Passing Interface (MPI). MPI can enable very large simulations running on very large clusters, but it is necessary that the bulk of the CFD code be written with MPI in mind, an obstacle to parallelizing an existing serial code. In this work we present the results of extending an existing two-phase 3D Navier-Stokes solver, which was completely serial, to a parallel execution model using MPI. The 3D Navier-Stokes equations for two immiscible incompressible fluids are solved by the continuum surface force method, while the location of the interface is determined by the level-set method. We employ the Portable Extensible Toolkit for Scientific Computing (PETSc) for domain decomposition (DD) in a framework where only a fraction of the code needs to be a...

  15. 三维有限元并行EBE方法%PARALLEL 3-D FINITE ELEMENT ANALYSIS BASED ON EBE METHOD

    Institute of Scientific and Technical Information of China (English)

    刘耀儒; 周维垣; 杨强

    2006-01-01

    采用Jacobi预处理,推导了基于EBE方法的预处理共轭梯度算法,给出了有限元EBE方法在分布存储并行机上的计算过程,可以实现整个三维有限元计算过程的并行化.编制了三维有限元求解的PFEM(Parallel Finite Element Method)程序,并在网络机群系统上实现.采用矩形截面悬臂梁的算例,对PFEM程序进行了数值测试,对串行计算和并行计算的效率进行了分析,最后将PFEM程序应用于二滩拱坝-地基系统的三维有限元数值计算中.结果表明,三维有限元EBE算法在求解过程中不需要集成整体刚度矩阵,有效地减少了对内存的需求,具有很好的并行性,可以有效地进行三维复杂结构的大规模数值分析.

  16. Mars-solar wind interaction: LatHyS, an improved parallel 3-D multispecies hybrid model

    Science.gov (United States)

    Modolo, Ronan; Hess, Sebastien; Mancini, Marco; Leblanc, Francois; Chaufray, Jean-Yves; Brain, David; Leclercq, Ludivine; Esteban-Hernández, Rosa; Chanteur, Gerard; Weill, Philippe; González-Galindo, Francisco; Forget, Francois; Yagi, Manabu; Mazelle, Christian

    2016-07-01

    In order to better represent Mars-solar wind interaction, we present an unprecedented model achieving spatial resolution down to 50 km, a so far unexplored resolution for global kinetic models of the Martian ionized environment. Such resolution approaches the ionospheric plasma scale height. In practice, the model is derived from a first version described in Modolo et al. (2005). An important effort of parallelization has been conducted and is presented here. A better description of the ionosphere was also implemented including ionospheric chemistry, electrical conductivities, and a drag force modeling the ion-neutral collisions in the ionosphere. This new version of the code, named LatHyS (Latmos Hybrid Simulation), is here used to characterize the impact of various spatial resolutions on simulation results. In addition, and following a global model challenge effort, we present the results of simulation run for three cases which allow addressing the effect of the suprathermal corona and of the solar EUV activity on the magnetospheric plasma boundaries and on the global escape. Simulation results showed that global patterns are relatively similar for the different spatial resolution runs, but finest grid runs provide a better representation of the ionosphere and display more details of the planetary plasma dynamic. Simulation results suggest that a significant fraction of escaping O+ ions is originated from below 1200 km altitude.

  17. Multilanguage parallel programming of heterogeneous machines

    Energy Technology Data Exchange (ETDEWEB)

    Bisiani, R.; Forin, A.

    1988-08-01

    The authors designed and implemented a system, Agora, that supports the development of multilanguage parallel applications for heterogeneous machines. Agora hinges on two ideas: the first one is that shared memory can be a suitable abstraction to program concurrent, multilanguage modules running on heterogeneous machines. The second one is that a shared memory abstraction can efficiently supported across different computer architectures that are not connected by a physical shared memory, for example local are network workstations or ensemble machines. Agora has been in use for more than a year. This paper describes the Agora shared memory and its software implementation on both tightly and loosely coupled architectures. Measurements of the current implementation are also included.

  18. Semantic Language Extensions for Implicit Parallel Programming

    Science.gov (United States)

    2013-09-01

    successfully for years on end have always been inspiring. His approach of only demanding the best for and from me, though seemed like a bitter medicine...sequential stage, an important insight behind the DSWP family of transforms. em3d: Electro-magnetic Wave propagation em3d simulates electromagnetic wave

  19. NEPTUNE:并行三维全电磁粒子模拟软件%NEPTUNE:A 3-D Fully Electromagnetic Particle Parallel Software

    Institute of Scientific and Technical Information of China (English)

    陈军; 董烨; 杨温渊; 董志伟

    2009-01-01

    为求解具有复杂几何的高功率微波电磁场问题,本文研制了一个三维全电磁粒子并行软件NEPTUNE.本文介绍了该并行软件的基本结构和采用的一些并行算法.目前,该软件已经成功模拟了多种高功率源器件,并可扩展到数千台处理器核上运行.%We developed a three-dimensional fully electromagnetic particle parallel software based on the parallel adaptive structure mesh application infrastructure, ,to solve the electromagnetic problem in the high power microwave devices with complex geometry. This paper presents the basic numerical method and parallel algorithm used in the parallel program. A typical device with complex geometry is simulated by the parallel program on thousands of processors, and the results show the good scalability. Currently it has been simulated many high power microwave devices successfully.

  20. A Fast Parallel Simulation Code for Interaction between Proto-Planetary Disk and Embedded Proto-Planets: Implementation for 3D Code

    Energy Technology Data Exchange (ETDEWEB)

    Li, Shengtai [Los Alamos National Laboratory; Li, Hui [Los Alamos National Laboratory

    2012-06-14

    sensitive to the position of the planet, we adopt the corotating frame that allows the planet moving only in radial direction if only one planet is present. This code has been extensively tested on a number of problems. For the earthmass planet with constant aspect ratio h = 0.05, the torque calculated using our code matches quite well with the the 3D linear theory results by Tanaka et al. (2002). The code is fully parallelized via message-passing interface (MPI) and has very high parallel efficiency. Several numerical examples for both fixed planet and moving planet are provided to demonstrate the efficacy of the numerical method and code.

  1. A Fast Parallel Simulation Code for Interaction between Proto-Planetary Disk and Embedded Proto-Planets: Implementation for 3D Code

    Energy Technology Data Exchange (ETDEWEB)

    Li, Shengtai [Los Alamos National Laboratory; Li, Hui [Los Alamos National Laboratory

    2012-06-14

    sensitive to the position of the planet, we adopt the corotating frame that allows the planet moving only in radial direction if only one planet is present. This code has been extensively tested on a number of problems. For the earthmass planet with constant aspect ratio h = 0.05, the torque calculated using our code matches quite well with the the 3D linear theory results by Tanaka et al. (2002). The code is fully parallelized via message-passing interface (MPI) and has very high parallel efficiency. Several numerical examples for both fixed planet and moving planet are provided to demonstrate the efficacy of the numerical method and code.

  2. A task-based parallelism and vectorized approach to 3D Method of Characteristics (MOC) reactor simulation for high performance computing architectures

    Science.gov (United States)

    Tramm, John R.; Gunow, Geoffrey; He, Tim; Smith, Kord S.; Forget, Benoit; Siegel, Andrew R.

    2016-05-01

    In this study we present and analyze a formulation of the 3D Method of Characteristics (MOC) technique applied to the simulation of full core nuclear reactors. Key features of the algorithm include a task-based parallelism model that allows independent MOC tracks to be assigned to threads dynamically, ensuring load balancing, and a wide vectorizable inner loop that takes advantage of modern SIMD computer architectures. The algorithm is implemented in a set of highly optimized proxy applications in order to investigate its performance characteristics on CPU, GPU, and Intel Xeon Phi architectures. Speed, power, and hardware cost efficiencies are compared. Additionally, performance bottlenecks are identified for each architecture in order to determine the prospects for continued scalability of the algorithm on next generation HPC architectures.

  3. Data-Parallel Programming in a Multithreaded Environment

    Directory of Open Access Journals (Sweden)

    Matthew Haines

    1997-01-01

    Full Text Available Research on programming distributed memory multiprocessors has resulted in a well-understood programming model, namely data-parallel programming. However, data-parallel programming in a multithreaded environment is far less understood. For example, if multiple threads within the same process belong to different data-parallel computations, then the architecture, compiler, or run-time system must ensure that relative indexing and collective operations are handled properly and efficiently. We introduce a run-time-based solution for data-parallel programming in a distributed memory environment that handles the problems of relative indexing and collective communications among thread groups. As a result, the data-parallel programming model can now be executed in a multithreaded environment, such as a system using threads to support both task and data parallelism.

  4. Automatic Parallelization and Optimization of Programs by Proof Rewriting

    NARCIS (Netherlands)

    Hurlin, C.; Palsberg, J.; Su, Z.

    2009-01-01

    We show how, given a program and its separation logic proof, one can parallelize and optimize this program and transform its proof simultaneously to obtain a proven parallelized and optimized program. To achieve this goal, we present new proof rules for generating proof trees and a rewrite system on

  5. What a Parallel Programming Language Has to Let You Say,

    Science.gov (United States)

    1984-09-01

    RD-fl147 854 WHAT A PARALLEL PROGRAMMING LANGUAGE HAS TO LET YOU SAY 1/1 (U) MASSACHUSETTS INST OF TECH CAMBRIDGE ARTIFICIAL INTELLIGENCE LAB A...What a parallel programming language has to let I* you say 6. PERFORMING ORG. REPORT NUMNER io 8. CONTRACT OR GRANT NUSeR(8e) Alan Bawden/Philip E. Agre...Massachusetts Institute of Technology Artificial Intelligence Laboratory AI Memo 796 September 1984 What a parallel programming language has to let

  6. A Parallel 3D Spectral Difference Method for Solutions of Compressible Navier Stokes Equations on Deforming Grids and Simulations of Vortex Induced Vibration

    Science.gov (United States)

    DeJong, Andrew

    Numerical models of fluid-structure interaction have grown in importance due to increasing interest in environmental energy harvesting, airfoil-gust interactions, and bio-inspired formation flying. Powered by increasingly powerful parallel computers, such models seek to explain the fundamental physics behind the complex, unsteady fluid-structure phenomena. To this end, a high-fidelity computational model based on the high-order spectral difference method on 3D unstructured, dynamic meshes has been developed. The spectral difference method constructs continuous solution fields within each element with a Riemann solver to compute the inviscid fluxes at the element interfaces and an averaging mechanism to compute the viscous fluxes. This method has shown promise in the past as a highly accurate, yet sufficiently fast method for solving unsteady viscous compressible flows. The solver is monolithically coupled to the equations of motion of an elastically mounted 3-degree of freedom rigid bluff body undergoing flow-induced lift, drag, and torque. The mesh is deformed using 4 methods: an analytic function, Laplace equation, biharmonic equation, and a bi-elliptic equation with variable diffusivity. This single system of equations -- fluid and structure -- is advanced through time using a 5-stage, 4th-order Runge-Kutta scheme. Message Passing Interface is used to run the coupled system in parallel on up to 240 processors. The solver is validated against previously published numerical and experimental data for an elastically mounted cylinder. The effect of adding an upstream body and inducing wake galloping is observed.

  7. PDDP: A data parallel programming model. Revision 1

    Energy Technology Data Exchange (ETDEWEB)

    Warren, K.H.

    1995-06-01

    PDDP, the Parallel Data Distribution Preprocessor, is a data parallel programming model for distributed memory parallel computers. PDDP impelments High Performance Fortran compatible data distribution directives and parallelism expressed by the use of Fortran 90 array syntax, the FORALL statement, and the (WRERE?) construct. Distribued data objects belong to a global name space; other data objects are treated as local and replicated on each processor. PDDP allows the user to program in a shared-memory style and generates codes that are portable to a variety of parallel machines. For interprocessor communication, PDDP uses the fastest communication primitives on each platform.

  8. THERM3D -- A boundary element computer program for transient heat conduction problems

    Energy Technology Data Exchange (ETDEWEB)

    Ingber, M.S. [New Mexico Univ., Albuquerque, NM (United States). Dept. of Mechanical Engineering

    1994-02-01

    The computer code THERM3D implements the direct boundary element method (BEM) to solve transient heat conduction problems in arbitrary three-dimensional domains. This particular implementation of the BEM avoids performing time-consuming domain integrations by approximating a ``generalized forcing function`` in the interior of the domain with the use of radial basis functions. An approximate particular solution is then constructed, and the original problem is transformed into a sequence of Laplace problems. The code is capable of handling a large variety of boundary conditions including isothermal, specified flux, convection, radiation, and combined convection and radiation conditions. The computer code is benchmarked by comparisons with analytic and finite element results.

  9. Optimizing FORTRAN Programs for Hierarchical Memory Parallel Processing Systems

    Institute of Scientific and Technical Information of China (English)

    金国华; 陈福接

    1993-01-01

    Parallel loops account for the greatest amount of parallelism in numerical programs.Executing nested loops in parallel with low run-time overhead is thus very important for achieving high performance in parallel processing systems.However,in parallel processing systems with caches or local memories in memory hierarchies,“thrashing problemmay”may arise whenever data move back and forth between the caches or local memories in different processors.Previous techniques can only deal with the rather simple cases with one linear function in the perfactly nested loop.In this paper,we present a parallel program optimizing technique called hybri loop interchange(HLI)for the cases with multiple linear functions and loop-carried data dependences in the nested loop.With HLI we can easily eliminate or reduce the thrashing phenomena without reucing the program parallelism.

  10. Parallel Programming with Matrix Distributed Processing

    CERN Document Server

    Di Pierro, Massimo

    2005-01-01

    Matrix Distributed Processing (MDP) is a C++ library for fast development of efficient parallel algorithms. It constitues the core of FermiQCD. MDP enables programmers to focus on algorithms, while parallelization is dealt with automatically and transparently. Here we present a brief overview of MDP and examples of applications in Computer Science (Cellular Automata), Engineering (PDE Solver) and Physics (Ising Model).

  11. Introduction to 3D Graphics through Excel

    Science.gov (United States)

    Benacka, Jan

    2013-01-01

    The article presents a method of explaining the principles of 3D graphics through making a revolvable and sizable orthographic parallel projection of cuboid in Excel. No programming is used. The method was tried in fourteen 90 minute lessons with 181 participants, which were Informatics teachers, undergraduates of Applied Informatics and gymnasium…

  12. Programming Mechanical and Physicochemical Properties of 3D Hydrogel Cellular Microcultures via Direct Ink Writing.

    Science.gov (United States)

    McCracken, Joselle M; Badea, Adina; Kandel, Mikhail E; Gladman, A Sydney; Wetzel, David J; Popescu, Gabriel; Lewis, Jennifer A; Nuzzo, Ralph G

    2016-05-01

    3D hydrogel scaffolds are widely used in cellular microcultures and tissue engineering. Using direct ink writing, microperiodic poly(2-hydroxyethyl-methacrylate) (pHEMA) scaffolds are created that are then printed, cured, and modified by absorbing 30 kDa protein poly-l-lysine (PLL) to render them biocompliant in model NIH/3T3 fibroblast and MC3T3-E1 preosteoblast cell cultures. Spatial light interference microscopy (SLIM) live cell imaging studies are carried out to quantify cellular motilities for each cell type, substrate, and surface treatment of interest. 3D scaffold mechanics is investigated using atomic force microscopy (AFM), while their absorption kinetics are determined by confocal fluorescence microscopy (CFM) for a series of hydrated hydrogel films prepared from prepolymers with different homopolymer-to-monomer (Mr ) ratios. The observations reveal that the inks with higher Mr values yield relatively more open-mesh gels due to a lower degree of entanglement. The biocompatibility of printed hydrogel scaffolds can be controlled by both PLL content and hydrogel mesh properties.

  13. An interactive parallel programming environment applied in atmospheric science

    Science.gov (United States)

    vonLaszewski, G.

    1996-01-01

    This article introduces an interactive parallel programming environment (IPPE) that simplifies the generation and execution of parallel programs. One of the tasks of the environment is to generate message-passing parallel programs for homogeneous and heterogeneous computing platforms. The parallel programs are represented by using visual objects. This is accomplished with the help of a graphical programming editor that is implemented in Java and enables portability to a wide variety of computer platforms. In contrast to other graphical programming systems, reusable parts of the programs can be stored in a program library to support rapid prototyping. In addition, runtime performance data on different computing platforms is collected in a database. A selection process determines dynamically the software and the hardware platform to be used to solve the problem in minimal wall-clock time. The environment is currently being tested on a Grand Challenge problem, the NASA four-dimensional data assimilation system.

  14. A linear programming approach to reconstructing subcellular structures from confocal images for automated generation of representative 3D cellular models.

    Science.gov (United States)

    Wood, Scott T; Dean, Brian C; Dean, Delphine

    2013-04-01

    This paper presents a novel computer vision algorithm to analyze 3D stacks of confocal images of fluorescently stained single cells. The goal of the algorithm is to create representative in silico model structures that can be imported into finite element analysis software for mechanical characterization. Segmentation of cell and nucleus boundaries is accomplished via standard thresholding methods. Using novel linear programming methods, a representative actin stress fiber network is generated by computing a linear superposition of fibers having minimum discrepancy compared with an experimental 3D confocal image. Qualitative validation is performed through analysis of seven 3D confocal image stacks of adherent vascular smooth muscle cells (VSMCs) grown in 2D culture. The presented method is able to automatically generate 3D geometries of the cell's boundary, nucleus, and representative F-actin network based on standard cell microscopy data. These geometries can be used for direct importation and implementation in structural finite element models for analysis of the mechanics of a single cell to potentially speed discoveries in the fields of regenerative medicine, mechanobiology, and drug discovery.

  15. Deterministic Consistency: A Programming Model for Shared Memory Parallelism

    OpenAIRE

    Aviram, Amittai; Ford, Bryan

    2009-01-01

    The difficulty of developing reliable parallel software is generating interest in deterministic environments, where a given program and input can yield only one possible result. Languages or type systems can enforce determinism in new code, and runtime systems can impose synthetic schedules on legacy parallel code. To parallelize existing serial code, however, we would like a programming model that is naturally deterministic without language restrictions or artificial scheduling. We propose "...

  16. PROP3D: A Program for 3D Euler Unsteady Aerodynamic and Aeroelastic (Flutter and Forced Response) Analysis of Propellers. Version 1.0

    Science.gov (United States)

    Srivastava, R.; Reddy, T. S. R.

    1996-01-01

    This guide describes the input data required, for steady or unsteady aerodynamic and aeroelastic analysis of propellers and the output files generated, in using PROP3D. The aerodynamic forces are obtained by solving three dimensional unsteady, compressible Euler equations. A normal mode structural analysis is used to obtain the aeroelastic equations, which are solved using either time domain or frequency domain solution method. Sample input and output files are included in this guide for steady aerodynamic analysis of single and counter-rotation propellers, and aeroelastic analysis of single-rotation propeller.

  17. Parallel programming practical aspects, models and current limitations

    CERN Document Server

    Tarkov, Mikhail S

    2014-01-01

    Parallel programming is designed for the use of parallel computer systems for solving time-consuming problems that cannot be solved on a sequential computer in a reasonable time. These problems can be divided into two classes: 1. Processing large data arrays (including processing images and signals in real time)2. Simulation of complex physical processes and chemical reactions For each of these classes, prospective methods are designed for solving problems. For data processing, one of the most promising technologies is the use of artificial neural networks. Particles-in-cell method and cellular automata are very useful for simulation. Problems of scalability of parallel algorithms and the transfer of existing parallel programs to future parallel computers are very acute now. An important task is to optimize the use of the equipment (including the CPU cache) of parallel computers. Along with parallelizing information processing, it is essential to ensure the processing reliability by the relevant organization ...

  18. 3-D magnetotelluric inversion including topography using deformed hexahedral edge finite elements and direct solvers parallelized on SMP computers - Part I: forward problem and parameter Jacobians

    Science.gov (United States)

    Kordy, M.; Wannamaker, P.; Maris, V.; Cherkaev, E.; Hill, G.

    2016-01-01

    We have developed an algorithm, which we call HexMT, for 3-D simulation and inversion of magnetotelluric (MT) responses using deformable hexahedral finite elements that permit incorporation of topography. Direct solvers parallelized on symmetric multiprocessor (SMP), single-chassis workstations with large RAM are used throughout, including the forward solution, parameter Jacobians and model parameter update. In Part I, the forward simulator and Jacobian calculations are presented. We use first-order edge elements to represent the secondary electric field (E), yielding accuracy O(h) for E and its curl (magnetic field). For very low frequencies or small material admittivities, the E-field requires divergence correction. With the help of Hodge decomposition, the correction may be applied in one step after the forward solution is calculated. This allows accurate E-field solutions in dielectric air. The system matrix factorization and source vector solutions are computed using the MKL PARDISO library, which shows good scalability through 24 processor cores. The factorized matrix is used to calculate the forward response as well as the Jacobians of electromagnetic (EM) field and MT responses using the reciprocity theorem. Comparison with other codes demonstrates accuracy of our forward calculations. We consider a popular conductive/resistive double brick structure, several synthetic topographic models and the natural topography of Mount Erebus in Antarctica. In particular, the ability of finite elements to represent smooth topographic slopes permits accurate simulation of refraction of EM waves normal to the slopes at high frequencies. Run-time tests of the parallelized algorithm indicate that for meshes as large as 176 × 176 × 70 elements, MT forward responses and Jacobians can be calculated in ˜1.5 hr per frequency. Together with an efficient inversion parameter step described in Part II, MT inversion problems of 200-300 stations are computable with total run times

  19. 3D and Education

    Science.gov (United States)

    Meulien Ohlmann, Odile

    2013-02-01

    Today the industry offers a chain of 3D products. Learning to "read" and to "create in 3D" becomes an issue of education of primary importance. 25 years professional experience in France, the United States and Germany, Odile Meulien set up a personal method of initiation to 3D creation that entails the spatial/temporal experience of the holographic visual. She will present some different tools and techniques used for this learning, their advantages and disadvantages, programs and issues of educational policies, constraints and expectations related to the development of new techniques for 3D imaging. Although the creation of display holograms is very much reduced compared to the creation of the 90ies, the holographic concept is spreading in all scientific, social, and artistic activities of our present time. She will also raise many questions: What means 3D? Is it communication? Is it perception? How the seeing and none seeing is interferes? What else has to be taken in consideration to communicate in 3D? How to handle the non visible relations of moving objects with subjects? Does this transform our model of exchange with others? What kind of interaction this has with our everyday life? Then come more practical questions: How to learn creating 3D visualization, to learn 3D grammar, 3D language, 3D thinking? What for? At what level? In which matter? for whom?

  20. Professional Parallel Programming with C# Master Parallel Extensions with NET 4

    CERN Document Server

    Hillar, Gastón

    2010-01-01

    Expert guidance for those programming today's dual-core processors PCs As PC processors explode from one or two to now eight processors, there is an urgent need for programmers to master concurrent programming. This book dives deep into the latest technologies available to programmers for creating professional parallel applications using C#, .NET 4, and Visual Studio 2010. The book covers task-based programming, coordination data structures, PLINQ, thread pools, asynchronous programming model, and more. It also teaches other parallel programming techniques, such as SIMD and vectorization.Teach

  1. Selecting Simulation Models when Predicting Parallel Program Behaviour

    OpenAIRE

    Broberg, Magnus; Lundberg, Lars; Grahn, Håkan

    2002-01-01

    The use of multiprocessors is an important way to increase the performance of a supercom-puting program. This means that the program has to be parallelized to make use of the multi-ple processors. The parallelization is unfortunately not an easy task. Development tools supporting parallel programs are important. Further, it is the customer that decides the number of processors in the target machine, and as a result the developer has to make sure that the pro-gram runs efficiently on any numbe...

  2. 基于C++ Builder的共焦显微镜三维重建方法%3D reconstruction of parallel confocal microscope based on C++ Builder

    Institute of Scientific and Technical Information of China (English)

    杨召雷; 张运波; 董洪波

    2012-01-01

    主要研究基于C++ Builder的数字微镜器件(Digital Micro-mirror Device,简称DMD)多路并行扫描共焦检测系统三维重建方法,将实验采集的虚拟针孔位图经过拟合后得到三维坐标数据,将其网格化后,采用集成OpenGL的C++Builder开发三维重建软件系统并重建三维图形,实验结果成功还原了被测物表面的微结构.%This paper primarily researched 3D reconstruction of the DMD ( Digital Micro-mirror Device) multi-point parallel confocal inspection system based on C ++ Builder, with the virtual pinhole bitmap collected from the laboratory after fitting, three-dimensional coordinate data was obtained, then made them grid. After the three-dimensional reconstruction software system compiled by C ++ Builder & OpenGL, the three-dimensional graphics were rebuild, and the results showed that the surface of the micro-structure was restored successfully.

  3. The SMC (Short Model Coil) Nb3Sn Program: FE Analysis with 3D Modeling

    CERN Document Server

    Kokkinos, C; Guinchard, M; Karppinen, M; Manil, P; Perez, J C; Regis, F

    2012-01-01

    The SMC (Short Model Coil) project aims at testing superconducting coils in racetrack configuration, wound with Nb3Sn cable. The degradation of the magnetic properties of the cable is studied by applying different levels of pre-stress. It is an essential step in the validation of procedures for the construction of superconducting magnets with high performance conductor. Two SMC assemblies have been completed and cold tested in the frame of a European collaboration between CEA (FR), CERN and STFC (UK), with the technical support from LBNL (US). The second assembly showed remarkable good quench results, reaching a peak field of 12.5T. This paper details the new 3D modeling method of the SMC, implemented using the ANSYS® Workbench environment. Advanced computer-aided-design (CAD) tools are combined with multi-physics Finite Element Analyses (FEA), in the same integrated graphic interface, forming a fully parametric model that enables simulation driven development of the SMC project. The magnetic and structural ...

  4. Directions in parallel programming: HPF, shared virtual memory and object parallelism in pC++

    Science.gov (United States)

    Bodin, Francois; Priol, Thierry; Mehrotra, Piyush; Gannon, Dennis

    1994-01-01

    Fortran and C++ are the dominant programming languages used in scientific computation. Consequently, extensions to these languages are the most popular for programming massively parallel computers. We discuss two such approaches to parallel Fortran and one approach to C++. The High Performance Fortran Forum has designed HPF with the intent of supporting data parallelism on Fortran 90 applications. HPF works by asking the user to help the compiler distribute and align the data structures with the distributed memory modules in the system. Fortran-S takes a different approach in which the data distribution is managed by the operating system and the user provides annotations to indicate parallel control regions. In the case of C++, we look at pC++ which is based on a concurrent aggregate parallel model.

  5. A Performance Analysis Tool for PVM Parallel Programs

    Institute of Scientific and Technical Information of China (English)

    Chen Wang; Yin Liu; Changjun Jiang; Zhaoqing Zhang

    2004-01-01

    In this paper,we introduce the design and implementation of ParaVT,which is a visual performance analysis and parallel debugging tool.In ParaVT,we propose an automated instrumentation mechanism. Based on this mechanism,ParaVT automatically analyzes the performance bottleneck of parallel applications and provides a visual user interface to monitor and analyze the performance of parallel programs.In addition ,it also supports certain extensions.

  6. Architectural Adaptability in Parallel Programming via Control Abstraction

    Science.gov (United States)

    1991-01-01

    Technical Report 359 January 1991 Abstract Parallel programming involves finding the potential parallelism in an application, choos - ing an...during the development of this paper. 34 References [Albert et ai, 1988] Eugene Albert, Kathleen Knobe, Joan D. Lukas, and Guy L. Steele, Jr

  7. Detection of And—Parallelism in Logic Programs

    Institute of Scientific and Technical Information of China (English)

    黄志毅; 胡守仁

    1990-01-01

    In this paper,we present a detection technique of and-parallelism in logic programs.The detection consists of three phases:analysis of entry modes,derivation of exit modes and determination of execution graph expressions.Compared with other techniques[2,4,5],our approach with the compile-time program-level data-dependence analysis of logic programs,can efficiently exploit and-parallelism in logic programs.Two precompilers,based on our technique and DeGroot's approach[3] respectively,have been implemented in SES-PIM system[12],Through compiling and running some typical benchmarks in SES-PIM,we conclude that our technique can,in most cases,exploit as much and-parallelism as the dynamic approach[13]does under“produces-consumer”scheme,and needs less dynamic overhead while exploiting more and parallelism than DeGroot's approach does.

  8. Using Fuzzy Gaussian Inference and Genetic Programming to Classify 3D Human Motions

    Science.gov (United States)

    Khoury, Mehdi; Liu, Honghai

    This research introduces and builds on the concept of Fuzzy Gaussian Inference (FGI) (Khoury and Liu in Proceedings of UKCI, 2008 and IEEE Workshop on Robotic Intelligence in Informationally Structured Space (RiiSS 2009), 2009) as a novel way to build Fuzzy Membership Functions that map to hidden Probability Distributions underlying human motions. This method is now combined with a Genetic Programming Fuzzy rule-based system in order to classify boxing moves from natural human Motion Capture data. In this experiment, FGI alone is able to recognise seven different boxing stances simultaneously with an accuracy superior to a GMM-based classifier. Results seem to indicate that adding an evolutionary Fuzzy Inference Engine on top of FGI improves the accuracy of the classifier in a consistent way.

  9. The 3D structure of the hadrons: recents results and experimental program at Jefferson Lab

    Directory of Open Access Journals (Sweden)

    Muñoz Camacho C.

    2014-04-01

    Full Text Available The understanding of Quantum Chromodynamics (QCD at large distances still remains one of the main outstanding problems of nuclear physics. Studying the internal structure of hadrons provides a way to probe QCD in the non-perturbative domain and can help us unravel the internal structure of the most elementary blocks of matter. Jefferson Lab (JLab has already delivered results on how elementary quarks and gluons create nucleon structure and properties. The upgrade of JLab to 12 GeV will allow the full exploration of the valence-quark structure of nucleons and the extraction of real threedimensional pictures. I will present recent results and review the future experimental program at JLab.

  10. Development of a web-based 3D virtual reality program for hydrogen station

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Younghee; Kim, Jinkyung; Kim, Junghwan; Moon, Il [Department of Chemical and Biomolecular Engineering, Yonsei University, 134 Shinchon-dong Seodaemun-ku, Seoul 120-749 (Korea); Kim, Eun Jung; Kim, Young Gyu [Institute of Gas Safety R and D, Korea Gas Safety Corporation, 322-1 Daeya-Dong Siheung-si, Kyonggi-do 429-712 (Korea)

    2010-03-15

    The hydrogen fueling station is an infrastructure of supplying fuel cell vehicles. It is necessary to guarantee the safety of hydrogen station equipment and operating procedure for decreasing intangible awareness of danger of hydrogen. Among many methods of securing the safety of the hydrogen stations, the virtual experience by dynamic simulation of operating the facilities and equipment is important. Thus, we have developed a virtual reality operator education system, and an interactive hydrogen safety training system. This paper focuses on the development of a virtual reality operator education of the hydrogen fueling station based on simulations of accident scenarios and hypothetical operating experience. The risks to equipment and personnel, associated with the manual operation of hydrogen fueling station demand rigorous personnel instruction. Trainees can practice how to use all necessary equipments and can experience twenty possible accident scenarios. This program also illustrates Emergency Response Plan and Standard Operating Procedure for both emergency and normal operations. (author)

  11. The FORCE: A highly portable parallel programming language

    Science.gov (United States)

    Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

    1989-01-01

    Here, it is explained why the FORCE parallel programming language is easily portable among six different shared-memory microprocessors, and how a two-level macro preprocessor makes it possible to hide low level machine dependencies and to build machine-independent high level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared memory multiprocessor executing them.

  12. The FORCE - A highly portable parallel programming language

    Science.gov (United States)

    Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

    1989-01-01

    This paper explains why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low-level machine dependencies and to build machine-independent high-level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared-memory multiprocessor executing them.

  13. Integrated Task And Data Parallel Programming: Language Design

    Science.gov (United States)

    Grimshaw, Andrew S.; West, Emily A.

    1998-01-01

    his research investigates the combination of task and data parallel language constructs within a single programming language. There are an number of applications that exhibit properties which would be well served by such an integrated language. Examples include global climate models, aircraft design problems, and multidisciplinary design optimization problems. Our approach incorporates data parallel language constructs into an existing, object oriented, task parallel language. The language will support creation and manipulation of parallel classes and objects of both types (task parallel and data parallel). Ultimately, the language will allow data parallel and task parallel classes to be used either as building blocks or managers of parallel objects of either type, thus allowing the development of single and multi-paradigm parallel applications. 1995 Research Accomplishments In February I presented a paper at Frontiers '95 describing the design of the data parallel language subset. During the spring I wrote and defended my dissertation proposal. Since that time I have developed a runtime model for the language subset. I have begun implementing the model and hand-coding simple examples which demonstrate the language subset. I have identified an astrophysical fluid flow application which will validate the data parallel language subset. 1996 Research Agenda Milestones for the coming year include implementing a significant portion of the data parallel language subset over the Legion system. Using simple hand-coded methods, I plan to demonstrate (1) concurrent task and data parallel objects and (2) task parallel objects managing both task and data parallel objects. My next steps will focus on constructing a compiler and implementing the fluid flow application with the language. Concurrently, I will conduct a search for a real-world application exhibiting both task and data parallelism within the same program m. Additional 1995 Activities During the fall I collaborated

  14. Characterizing and Mitigating Work Time Inflation in Task Parallel Programs

    Directory of Open Access Journals (Sweden)

    Stephen L. Olivier

    2013-01-01

    Full Text Available Task parallelism raises the level of abstraction in shared memory parallel programming to simplify the development of complex applications. However, task parallel applications can exhibit poor performance due to thread idleness, scheduling overheads, and work time inflation – additional time spent by threads in a multithreaded computation beyond the time required to perform the same work in a sequential computation. We identify the contributions of each factor to lost efficiency in various task parallel OpenMP applications and diagnose the causes of work time inflation in those applications. Increased data access latency can cause significant work time inflation in NUMA systems. Our locality framework for task parallel OpenMP programs mitigates this cause of work time inflation. Our extensions to the Qthreads library demonstrate that locality-aware scheduling can improve performance up to 3X compared to the Intel OpenMP task scheduler.

  15. Hybrid 3-D rocket trajectory program. Part 1: Formulation and analysis. Part 2: Computer programming and user's instruction. [computerized simulation using three dimensional motion analysis

    Science.gov (United States)

    Huang, L. C. P.; Cook, R. A.

    1973-01-01

    Models utilizing various sub-sets of the six degrees of freedom are used in trajectory simulation. A 3-D model with only linear degrees of freedom is especially attractive, since the coefficients for the angular degrees of freedom are the most difficult to determine and the angular equations are the most time consuming for the computer to evaluate. A computer program is developed that uses three separate subsections to predict trajectories. A launch rail subsection is used until the rocket has left its launcher. The program then switches to a special 3-D section which computes motions in two linear and one angular degrees of freedom. When the rocket trims out, the program switches to the standard, three linear degrees of freedom model.

  16. Retargeting of existing FORTRAN program and development of parallel compilers

    Science.gov (United States)

    Agrawal, Dharma P.

    1988-01-01

    The software models used in implementing the parallelizing compiler for the B-HIVE multiprocessor system are described. The various models and strategies used in the compiler development are: flexible granularity model, which allows a compromise between two extreme granularity models; communication model, which is capable of precisely describing the interprocessor communication timings and patterns; loop type detection strategy, which identifies different types of loops; critical path with coloring scheme, which is a versatile scheduling strategy for any multicomputer with some associated communication costs; and loop allocation strategy, which realizes optimum overlapped operations between computation and communication of the system. Using these models, several sample routines of the AIR3D package are examined and tested. It may be noted that automatically generated codes are highly parallelized to provide the maximized degree of parallelism, obtaining the speedup up to a 28 to 32-processor system. A comparison of parallel codes for both the existing and proposed communication model, is performed and the corresponding expected speedup factors are obtained. The experimentation shows that the B-HIVE compiler produces more efficient codes than existing techniques. Work is progressing well in completing the final phase of the compiler. Numerous enhancements are needed to improve the capabilities of the parallelizing compiler.

  17. Documentation of program AFTBDY to generate coordinate system for 3D after body using body fitted curvilinear coordinates, part 1

    Science.gov (United States)

    Kumar, D.

    1980-01-01

    The computer program AFTBDY generates a body fitted curvilinear coordinate system for a wedge curved after body. This wedge curved after body is being used in an experimental program. The coordinate system generated by AFTBDY is used to solve 3D compressible N.S. equations. The coordinate system in the physical plane is a cartesian x,y,z system, whereas, in the transformed plane a rectangular xi, eta, zeta system is used. The coordinate system generated is such that in the transformed plane coordinate spacing in the xi, eta, zeta direction is constant and equal to unity. The physical plane coordinate lines in the different regions are clustered heavily or sparsely depending on the regions where physical quantities to be solved for by the N.S. equations have high or low gradients. The coordinate distribution in the physical plane is such that x stays constant in eta and zeta direction, whereas, z stays constant in xi and eta direction. The desired distribution in x and z is input to the program. Consequently, only the y-coordinate is solved for by the program AFTBDY.

  18. Program Transformation to Identify List-Based Parallel Skeletons

    Directory of Open Access Journals (Sweden)

    Venkatesh Kannan

    2016-07-01

    Full Text Available Algorithmic skeletons are used as building-blocks to ease the task of parallel programming by abstracting the details of parallel implementation from the developer. Most existing libraries provide implementations of skeletons that are defined over flat data types such as lists or arrays. However, skeleton-based parallel programming is still very challenging as it requires intricate analysis of the underlying algorithm and often uses inefficient intermediate data structures. Further, the algorithmic structure of a given program may not match those of list-based skeletons. In this paper, we present a method to automatically transform any given program to one that is defined over a list and is more likely to contain instances of list-based skeletons. This facilitates the parallel execution of a transformed program using existing implementations of list-based parallel skeletons. Further, by using an existing transformation called distillation in conjunction with our method, we produce transformed programs that contain fewer inefficient intermediate data structures.

  19. Development of massively parallel quantum chemistry program SMASH

    Energy Technology Data Exchange (ETDEWEB)

    Ishimura, Kazuya [Department of Theoretical and Computational Molecular Science, Institute for Molecular Science 38 Nishigo-Naka, Myodaiji, Okazaki, Aichi 444-8585 (Japan)

    2015-12-31

    A massively parallel program for quantum chemistry calculations SMASH was released under the Apache License 2.0 in September 2014. The SMASH program is written in the Fortran90/95 language with MPI and OpenMP standards for parallelization. Frequently used routines, such as one- and two-electron integral calculations, are modularized to make program developments simple. The speed-up of the B3LYP energy calculation for (C{sub 150}H{sub 30}){sub 2} with the cc-pVDZ basis set (4500 basis functions) was 50,499 on 98,304 cores of the K computer.

  20. Web Based Parallel Programming Workshop for Undergraduate Education.

    Science.gov (United States)

    Marcus, Robert L.; Robertson, Douglass

    Central State University (Ohio), under a contract with Nichols Research Corporation, has developed a World Wide web based workshop on high performance computing entitled "IBN SP2 Parallel Programming Workshop." The research is part of the DoD (Department of Defense) High Performance Computing Modernization Program. The research…

  1. Protocol-Based Verification of Message-Passing Parallel Programs

    DEFF Research Database (Denmark)

    López-Acosta, Hugo-Andrés; Eduardo R. B. Marques, Eduardo R. B.; Martins, Francisco;

    2015-01-01

    a protocol language based on a dependent type system for message-passing parallel programs, which includes various communication operators, such as point-to-point messages, broadcast, reduce, array scatter and gather. For the verification of a program against a given protocol, the protocol is first...

  2. Evaluating integration of inland bathymetry in the U.S. Geological Survey 3D Elevation Program, 2014

    Science.gov (United States)

    Miller-Corbett, Cynthia

    2016-09-01

    Inland bathymetry survey collections, survey data types, features, sources, availability, and the effort required to integrate inland bathymetric data into the U.S. Geological Survey 3D Elevation Program are assessed to help determine the feasibility of integrating three-dimensional water feature elevation data into The National Map. Available data from wading, acoustic, light detection and ranging, and combined technique surveys are provided by the U.S. Geological Survey, National Oceanic and Atmospheric Administration, U.S. Army Corps of Engineers, and other sources. Inland bathymetric data accessed through Web-hosted resources or contacts provide useful baseline parameters for evaluating survey types and techniques used for collection and processing, and serve as a basis for comparing survey methods and the quality of results. Historically, boat-mounted acoustic surveys have provided most inland bathymetry data. Light detection and ranging techniques that are beneficial in areas hard to reach by boat, that can collect dense data in shallow water to provide comprehensive coverage, and that can be cost effective for surveying large areas with good water clarity are becoming more common; however, optimal conditions and techniques for collecting and processing light detection and ranging inland bathymetry surveys are not yet well defined.Assessment of site condition parameters important for understanding inland bathymetry survey issues and results, and an evaluation of existing inland bathymetry survey coverage are proposed as steps to develop criteria for implementing a useful and successful inland bathymetry survey plan in the 3D Elevation Program. These survey parameters would also serve as input for an inland bathymetry survey data baseline. Integration and interpolation techniques are important factors to consider in developing a robust plan; however, available survey data are usually in a triangulated irregular network format or other format compatible with

  3. Accelerate Performance on the Parallel Programming Super Highway

    Science.gov (United States)

    2010-04-01

    barriers  associated with  parallel   programming Dataflow languages ought to be considered along with  traditional (imperative) programming solutions 2...dasymptotic con ition (3 GHz) Moore’s Law may still be valid, but the Law of  Thermodynamics is also valid Parallel   Programming  options exist, but...languages can address some major  challenges associated with  parallel   programming Many dataflow languages exist today, and should be  considered along  ith

  4. On program restructuring, scheduling, and communication for parallel processor systems

    Energy Technology Data Exchange (ETDEWEB)

    Polychronopoulos, Constantine D.

    1986-08-01

    This dissertation discusses several software and hardware aspects of program execution on large-scale, high-performance parallel processor systems. The issues covered are program restructuring, partitioning, scheduling and interprocessor communication, synchronization, and hardware design issues of specialized units. All this work was performed focusing on a single goal: to maximize program speedup, or equivalently, to minimize parallel execution time. Parafrase, a Fortran restructuring compiler was used to transform programs in a parallel form and conduct experiments. Two new program restructuring techniques are presented, loop coalescing and subscript blocking. Compile-time and run-time scheduling schemes are covered extensively. Depending on the program construct, these algorithms generate optimal or near-optimal schedules. For the case of arbitrarily nested hybrid loops, two optimal scheduling algorithms for dynamic and static scheduling are presented. Simulation results are given for a new dynamic scheduling algorithm. The performance of this algorithm is compared to that of self-scheduling. Techniques for program partitioning and minimization of interprocessor communication for idealized program models and for real Fortran programs are also discussed. The close relationship between scheduling, interprocessor communication, and synchronization becomes apparent at several points in this work. Finally, the impact of various types of overhead on program speedup and experimental results are presented. 69 refs., 74 figs., 14 tabs.

  5. Optimized Parallel Execution of Declarative Programs on Distributed Memory Multiprocessors

    Institute of Scientific and Technical Information of China (English)

    沈美明; 田新民; 等

    1993-01-01

    In this paper,we focus on the compiling implementation of parlalel logic language PARLOG and functional language ML on distributed memory multiprocessors.Under the graph rewriting framework, a Heterogeneous Parallel Graph Rewriting Execution Model(HPGREM)is presented firstly.Then based on HPGREM,a parallel abstact machine PAM/TGR is described.Furthermore,several optimizing compilation schemes for executing declarative programs on transputer array are proposed.The performance statistics on transputer array demonstrate the effectiveness of our model,parallel abstract machine,optimizing compilation strategies and compiler.

  6. Methodological proposal for the volumetric study of archaeological ceramics through 3D edition free-software programs: the case of the celtiberians cemeteries of the meseta

    Directory of Open Access Journals (Sweden)

    Álvaro Sánchez Climent

    2014-10-01

    Full Text Available Nowadays the free-software programs have been converted into the ideal tools for the archaeological researches, reaching the same level as other commercial programs. For that reason, the 3D modeling tool Blender has reached in the last years a great popularity offering similar characteristics like other commercial 3D editing programs such as 3D Studio Max or AutoCAD. Recently, it has been developed the necessary script for the volumetric calculations of three-dimnesional objects, offering great possibilities to calculate the volume of the archaeological ceramics. In this paper, we present a methodological approach for the volumetric studies with Blender and a study case of funerary urns from several celtiberians cemeteries of the Spanish Meseta. The goal is to demonstrate the great possibilities that the 3D editing free-software tools have in the volumetric studies at the present time.

  7. The parallel programming of voluntary and reflexive saccades.

    Science.gov (United States)

    Walker, Robin; McSorley, Eugene

    2006-06-01

    A novel two-step paradigm was used to investigate the parallel programming of consecutive, stimulus-elicited ('reflexive') and endogenous ('voluntary') saccades. The mean latency of voluntary saccades, made following the first reflexive saccades in two-step conditions, was significantly reduced compared to that of voluntary saccades made in the single-step control trials. The latency of the first reflexive saccades was modulated by the requirement to make a second saccade: first saccade latency increased when a second voluntary saccade was required in the opposite direction to the first saccade, and decreased when a second saccade was required in the same direction as the first reflexive saccade. A second experiment confirmed the basic effect and also showed that a second reflexive saccade may be programmed in parallel with a first voluntary saccade. The results support the view that voluntary and reflexive saccades can be programmed in parallel on a common motor map.

  8. Basic design of parallel computational program for probabilistic structural analysis

    Energy Technology Data Exchange (ETDEWEB)

    Kaji, Yoshiyuki; Arai, Taketoshi [Japan Atomic Energy Research Inst., Tokai, Ibaraki (Japan). Tokai Research Establishment; Gu, Wenwei; Nakamura, Hitoshi

    1999-06-01

    In our laboratory, for `development of damage evaluation method of structural brittle materials by microscopic fracture mechanics and probabilistic theory` (nuclear computational science cross-over research) we examine computational method related to super parallel computation system which is coupled with material strength theory based on microscopic fracture mechanics for latent cracks and continuum structural model to develop new structural reliability evaluation methods for ceramic structures. This technical report is the review results regarding probabilistic structural mechanics theory, basic terms of formula and program methods of parallel computation which are related to principal terms in basic design of computational mechanics program. (author)

  9. The MHOST finite element program: 3-D inelastic analysis methods for hot section components. Volume 1: Theoretical manual

    Science.gov (United States)

    Nakazawa, Shohei

    1991-01-01

    Formulations and algorithms implemented in the MHOST finite element program are discussed. The code uses a novel concept of the mixed iterative solution technique for the efficient 3-D computations of turbine engine hot section components. The general framework of variational formulation and solution algorithms are discussed which were derived from the mixed three field Hu-Washizu principle. This formulation enables the use of nodal interpolation for coordinates, displacements, strains, and stresses. Algorithmic description of the mixed iterative method includes variations for the quasi static, transient dynamic and buckling analyses. The global-local analysis procedure referred to as the subelement refinement is developed in the framework of the mixed iterative solution, of which the detail is presented. The numerically integrated isoparametric elements implemented in the framework is discussed. Methods to filter certain parts of strain and project the element discontinuous quantities to the nodes are developed for a family of linear elements. Integration algorithms are described for linear and nonlinear equations included in MHOST program.

  10. 3D Printing an Octohedron

    OpenAIRE

    Aboufadel, Edward F.

    2014-01-01

    The purpose of this short paper is to describe a project to manufacture a regular octohedron on a 3D printer. We assume that the reader is familiar with the basics of 3D printing. In the project, we use fundamental ideas to calculate the vertices and faces of an octohedron. Then, we utilize the OPENSCAD program to create a virtual 3D model and an STereoLithography (.stl) file that can be used by a 3D printer.

  11. Parallel Implementation of the PHOENIX Generalized Stellar Atmosphere Program; 2, Wavelength Parallelization

    CERN Document Server

    Baron, E A; Hauschildt, Peter H.

    1997-01-01

    We describe an important addition to the parallel implementation of our generalized NLTE stellar atmosphere and radiative transfer computer program PHOENIX. In a previous paper in this series we described data and task parallel algorithms we have developed for radiative transfer, spectral line opacity, and NLTE opacity and rate calculations. These algorithms divided the work spatially or by spectral lines, that is distributing the radial zones, individual spectral lines, or characteristic rays among different processors and employ, in addition task parallelism for logically independent functions (such as atomic and molecular line opacities). For finite, monotonic velocity fields, the radiative transfer equation is an initial value problem in wavelength, and hence each wavelength point depends upon the previous one. However, for sophisticated NLTE models of both static and moving atmospheres needed to accurately describe, e.g., novae and supernovae, the number of wavelength points is very large (200,000--300,0...

  12. Simulation in 3 dimensions of a cycle 18 months for an BWR type reactor using the Nod3D program; Simulacion en 3 dimensiones de un ciclo de 18 meses para un reactor BWR usando el programa Nod3D

    Energy Technology Data Exchange (ETDEWEB)

    Hernandez, N.; Alonso, G. [ININ, A.P. 18-1027, 11801 Mexico D.F. (Mexico)]. E-mail: nhm@nuclear.inin.mx; Valle, E. del [IPN, ESFM, 07738 Mexico D.F. (Mexico)

    2004-07-01

    The development of own codes that you/they allow the simulation in 3 dimensions of the nucleus of a reactor and be of easy maintenance, without the consequent payment of expensive use licenses, it can be a factor that propitiates the technological independence. In the Department of Nuclear Engineering (DIN) of the Superior School of Physics and Mathematics (ESFM) of the National Polytechnic Institute (IPN) a denominated program Nod3D has been developed with the one that one can simulate the operation of a reactor BWR in 3 dimensions calculating the effective multiplication factor (kJJ3, as well as the distribution of the flow neutronic and of the axial and radial profiles of the power, inside a means of well-known characteristics solving the equations of diffusion of neutrons numerically in stationary state and geometry XYZ using the mathematical nodal method RTN0 (Raviart-Thomas-Nedelec of index zero). One of the limitations of the program Nod3D is that it doesn't allow to consider the burnt of the fuel in an independent way considering feedback, this makes it in an implicit way considering the effective sections in each step of burnt and these sections are obtained of the code Core Master LEND. However even given this limitation, the results obtained in the simulation of a cycle of typical operation of a reactor of the type BWR are similar to those reported by the code Core Master LENDS. The results of the keJ - that were obtained with the program Nod3D they were compared with the results of the code Core Master LEND, presenting a difference smaller than 0.2% (200 pcm), and in the case of the axial profile of power, the maxim differs it was of 2.5%. (Author)

  13. Parallel Libraries to support High-Level Programming

    DEFF Research Database (Denmark)

    Larsen, Morten Nørgaard

    so is not a simple task and for many non-computer scientists, like chemists and physicists writing programs for simulating their experiments, the task can easily become overwhelming. During the last decades, a lot of research efforts have been put into how to create tools that will simplify writing......The development of computer architectures during the last ten years have forced programmers to move towards writing parallel programs instead of sequential ones. The homogenous multi-core architectures from the major CPU producers like Intel and AMD has led this trend, but the introduction......, the general increase in the usage of graphic cards for general-purpose programming (GPGPU) have meant that programmers today must be able to write parallel programs that cannot only utilize small number computational cores but perhaps hundreds or even thousands. However, most programmers will agree that doing...

  14. Center for Programming Models for Scalable Parallel Computing

    Energy Technology Data Exchange (ETDEWEB)

    John Mellor-Crummey

    2008-02-29

    Rice University's achievements as part of the Center for Programming Models for Scalable Parallel Computing include: (1) design and implemention of cafc, the first multi-platform CAF compiler for distributed and shared-memory machines, (2) performance studies of the efficiency of programs written using the CAF and UPC programming models, (3) a novel technique to analyze explicitly-parallel SPMD programs that facilitates optimization, (4) design, implementation, and evaluation of new language features for CAF, including communication topologies, multi-version variables, and distributed multithreading to simplify development of high-performance codes in CAF, and (5) a synchronization strength reduction transformation for automatically replacing barrier-based synchronization with more efficient point-to-point synchronization. The prototype Co-array Fortran compiler cafc developed in this project is available as open source software from http://www.hipersoft.rice.edu/caf.

  15. GeoGebra 3D from the perspectives of elementary pre-service mathematics teachers who are familiar with a number of software programs

    Directory of Open Access Journals (Sweden)

    Serdal Baltaci

    2015-01-01

    Full Text Available Each new version of the GeoGebra dynamic mathematics software goes through updates and innovations. One of these innovations is the GeoGebra 5.0 version. This version aims to facilitate 3D instruction by offering opportunities for students to analyze 3D objects. While scanning the previous studies of GeoGebra 3D, it is seen that they mainly focus on the visualization of a problem in daily life and the dimensions of the evaluation of the process of problem solving with various variables. Therefore, this research problem was determined to reveal the opinions of pre-service elementary mathematics teachers who can use multiple software programs very well, about the usability of GeoGebra 3D. Compared to other studies conducted in this field, this study is thought to add a new dimension to the literature on GeoGebra 3D because the participants in the study had received training in using the Derive, Cabri, Cabri 3D, GeoGebra and GeoGebra 3D programs and had developed activities throughout their undergraduate programs and in some cases they were held responsible for those programs in their exams. In this research, we used the method of case study. The participants consisted of five elementary pre-service mathematics teachers who were enrolled in fourth year courses. We employed semi-structured interviews to collect data. It is concluded that pre-service elementary mathematics teachers expressed a great deal of opinions about the positive contribution of the GeoGebra 3D dynamic mathematics software.

  16. MELD: A Logical Approach to Distributed and Parallel Programming

    Science.gov (United States)

    2012-03-01

    USA: ACM, 1974, pp. 249–264. [14] M. Isard , M. Budiu, Y. Yu, A. Birrell, and D. Fetterly, “Dryad: Distributed data-parallel programs from sequential...TR-2006-140. [Online]. Available: http://budiu.info/work/eurosys07.pdf [15] Y. Yu, M. Isard , D. Fetterly, M. Budiu, Ú . Erlingsson, P. K. Gunda

  17. Vyukový program pro demonstraci metod texturování 3D objektů

    OpenAIRE

    Kukla, Michal

    2008-01-01

    Práca sa zaoberá vývojom aplikácie pre demonštrovanie metód texturovania 3D objektov. Práca obsahuje teoreticky rozobrané jednotlivé metódy textúrovania, ktoré tvoria základ pre návrh obsahu demonštrácie. Požiadavky na aplikáciu sú analyzované a aplikované pre vytvorenie návrhu aplikácie. Postupy pri implementácii jednotlivých metód textúrovania a popis ovládacích prvkov aplikácie sú popísané v závere práce. Program môže slúžiť ako demonštračný  pri výučbe alebo tiež ako študijný materiá...

  18. GRID2D/3D: A computer program for generating grid systems in complex-shaped two- and three-dimensional spatial domains. Part 1: Theory and method

    Science.gov (United States)

    Shih, T. I.-P.; Bailey, R. T.; Nguyen, H. L.; Roelke, R. J.

    1990-01-01

    An efficient computer program, called GRID2D/3D was developed to generate single and composite grid systems within geometrically complex two- and three-dimensional (2- and 3-D) spatial domains that can deform with time. GRID2D/3D generates single grid systems by using algebraic grid generation methods based on transfinite interpolation in which the distribution of grid points within the spatial domain is controlled by stretching functions. All single grid systems generated by GRID2D/3D can have grid lines that are continuous and differentiable everywhere up to the second-order. Also, grid lines can intersect boundaries of the spatial domain orthogonally. GRID2D/3D generates composite grid systems by patching together two or more single grid systems. The patching can be discontinuous or continuous. For continuous composite grid systems, the grid lines are continuous and differentiable everywhere up to the second-order except at interfaces where different single grid systems meet. At interfaces where different single grid systems meet, the grid lines are only differentiable up to the first-order. For 2-D spatial domains, the boundary curves are described by using either cubic or tension spline interpolation. For 3-D spatial domains, the boundary surfaces are described by using either linear Coon's interpolation, bi-hyperbolic spline interpolation, or a new technique referred to as 3-D bi-directional Hermite interpolation. Since grid systems generated by algebraic methods can have grid lines that overlap one another, GRID2D/3D contains a graphics package for evaluating the grid systems generated. With the graphics package, the user can generate grid systems in an interactive manner with the grid generation part of GRID2D/3D. GRID2D/3D is written in FORTRAN 77 and can be run on any IBM PC, XT, or AT compatible computer. In order to use GRID2D/3D on workstations or mainframe computers, some minor modifications must be made in the graphics part of the program; no

  19. A Parallel Product-Convolution approach for representing the depth varying Point Spread Functions in 3D widefield microscopy based on principal component analysis.

    Science.gov (United States)

    Arigovindan, Muthuvel; Shaevitz, Joshua; McGowan, John; Sedat, John W; Agard, David A

    2010-03-29

    We address the problem of computational representation of image formation in 3D widefield fluorescence microscopy with depth varying spherical aberrations. We first represent 3D depth-dependent point spread functions (PSFs) as a weighted sum of basis functions that are obtained by principal component analysis (PCA) of experimental data. This representation is then used to derive an approximating structure that compactly expresses the depth variant response as a sum of few depth invariant convolutions pre-multiplied by a set of 1D depth functions, where the convolving functions are the PCA-derived basis functions. The model offers an efficient and convenient trade-off between complexity and accuracy. For a given number of approximating PSFs, the proposed method results in a much better accuracy than the strata based approximation scheme that is currently used in the literature. In addition to yielding better accuracy, the proposed methods automatically eliminate the noise in the measured PSFs.

  20. Modelling parallel programs and multiprocessor architectures with AXE

    Science.gov (United States)

    Yan, Jerry C.; Fineman, Charles E.

    1991-01-01

    AXE, An Experimental Environment for Parallel Systems, was designed to model and simulate for parallel systems at the process level. It provides an integrated environment for specifying computation models, multiprocessor architectures, data collection, and performance visualization. AXE is being used at NASA-Ames for developing resource management strategies, parallel problem formulation, multiprocessor architectures, and operating system issues related to the High Performance Computing and Communications Program. AXE's simple, structured user-interface enables the user to model parallel programs and machines precisely and efficiently. Its quick turn-around time keeps the user interested and productive. AXE models multicomputers. The user may easily modify various architectural parameters including the number of sites, connection topologies, and overhead for operating system activities. Parallel computations in AXE are represented as collections of autonomous computing objects known as players. Their use and behavior is described. Performance data of the multiprocessor model can be observed on a color screen. These include CPU and message routing bottlenecks, and the dynamic status of the software.

  1. A Large-Grain Parallel Programming Environment for Non-Programmers

    OpenAIRE

    Lewis, Ted

    1994-01-01

    1994 International Conference on Parallel Processing Banger is a parallel programming environment used by non-professional programmers to write explicitly parallel large-grain parallel programs. The goals of Banger are: 1. extreme ease of use, 2. immediate feedback, and 3. machine-independence. Banger is based on three principles: 1. separation of parallel programming-in-the-large from sequential programming-in-the-small, 2. separation of programming environment from target machine ...

  2. Heterogeneous Multicore Parallel Programming for Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    Francois Bodin

    2009-01-01

    Full Text Available Hybrid parallel multicore architectures based on graphics processing units (GPUs can provide tremendous computing power. Current NVIDIA and AMD Graphics Product Group hardware display a peak performance of hundreds of gigaflops. However, exploiting GPUs from existing applications is a difficult task that requires non-portable rewriting of the code. In this paper, we present HMPP, a Heterogeneous Multicore Parallel Programming workbench with compilers, developed by CAPS entreprise, that allows the integration of heterogeneous hardware accelerators in a unintrusive manner while preserving the legacy code.

  3. Advanced parallel programming models research and development opportunities.

    Energy Technology Data Exchange (ETDEWEB)

    Wen, Zhaofang.; Brightwell, Ronald Brian

    2004-07-01

    There is currently a large research and development effort within the high-performance computing community on advanced parallel programming models. This research can potentially have an impact on parallel applications, system software, and computing architectures in the next several years. Given Sandia's expertise and unique perspective in these areas, particularly on very large-scale systems, there are many areas in which Sandia can contribute to this effort. This technical report provides a survey of past and present parallel programming model research projects and provides a detailed description of the Partitioned Global Address Space (PGAS) programming model. The PGAS model may offer several improvements over the traditional distributed memory message passing model, which is the dominant model currently being used at Sandia. This technical report discusses these potential benefits and outlines specific areas where Sandia's expertise could contribute to current research activities. In particular, we describe several projects in the areas of high-performance networking, operating systems and parallel runtime systems, compilers, application development, and performance evaluation.

  4. Efficient Thread Labeling for Monitoring Programs with Nested Parallelism

    Science.gov (United States)

    Ha, Ok-Kyoon; Kim, Sun-Sook; Jun, Yong-Kee

    It is difficult and cumbersome to detect data races occurred in an execution of parallel programs. Any on-the-fly race detection techniques using Lamport's happened-before relation needs a thread labeling scheme for generating unique identifiers which maintain logical concurrency information for the parallel threads. NR labeling is an efficient thread labeling scheme for the fork-join program model with nested parallelism, because its efficiency depends only on the nesting depth for every fork and join operation. This paper presents an improved NR labeling, called e-NR labeling, in which every thread generates its label by inheriting the pointer to its ancestor list from the parent threads or by updating the pointer in a constant amount of time and space. This labeling is more efficient than the NR labeling, because its efficiency does not depend on the nesting depth for every fork and join operation. Some experiments were performed with OpenMP programs having nesting depths of three or four and maximum parallelisms varying from 10,000 to 1,000,000. The results show that e-NR is 5 times faster than NR labeling and 4.3 times faster than OS labeling in the average time for creating and maintaining the thread labels. In average space required for labeling, it is 3.5 times smaller than NR labeling and 3 times smaller than OS labeling.

  5. Feedback Driven Annotation and Refactoring of Parallel Programs

    DEFF Research Database (Denmark)

    Larsen, Per

    This thesis combines programmer knowledge and feedback to improve modeling and optimization of software. The research is motivated by two observations. First, there is a great need for automatic analysis of software for embedded systems - to expose and model parallelism inherent in programs. Second......, some program properties are beyond reach of such analysis for theoretical and practical reasons - but can be described by programmers. Three aspects are explored. The first is annotation of the source code. Two annotations are introduced. These allow more accurate modeling of parallelism...... are not effective unless programmers are told how and when they are benecial. A prototype compilation feedback system was developed in collaboration with IBM Haifa Research Labs. It reports issues that prevent further analysis to the programmer. Performance evaluation shows that three programs performes signicantly...

  6. 3维全电磁粒子软件NEPTUNE中的并行计算方法%Parallelization methods in 3D fully electromagnetic code NEPTUNE

    Institute of Scientific and Technical Information of China (English)

    陈军; 莫则尧; 董烨; 杨温渊; 董志伟

    2011-01-01

    NEPTUNE is a three-dimensional fully parallel electromagnetic code to solve electromagnetic problem in high power microwaveC HPM) devices with complex geometry. This paper introduces the following three parallelization methods used in the code. For massively computation, the "block-patch" two level parallel domain decomposition strategy is provided to scale the computation size to thousands of processor cores. Based on the geometry information, the mesh is reconfigured using the adaptive technology to get rid of invalid grid cells, and thus the storage amount and parallel execution time decrease sharply. On the basis of traditional Boris' successive over relaxation (SOR) iteration method, a parallel Poisson solver on irregular domains is provided with red and black ordering technology and geometry constraints. With the above methods, NEPTUNE can get 51. 8% parallel efficiency on 1 024 cores when simulating MILO devices.%介绍了NEPTUNE软件采用的一些并行计算方法:采用“块-网格片”二层并行区域分解方法,使计算规模能够扩展到上千个处理器核.基于复杂几何特征采用自适应技术并行生成结构网格,在原有规则区域的基础上剔除无效网格,大幅降低了存储量和并行执行时间.在经典的Boris和SOR迭代方法基础上,采用红黑排序和几何约束,提出了非规则区域上的Poisson方程并行求解方法.采用这些方法后,当使用NEP-TUNE软件模拟MILO器件时,可在1024个处理器核上获得51.8%的并行效率.

  7. Sisal 3.2: functional language for scientific parallel programming

    Science.gov (United States)

    Kasyanov, Victor

    2013-05-01

    Sisal 3.2 is a new input language of system of functional programming (SFP) which is under development at the Institute of Informatics Systems in Novosibirsk as an interactive visual environment for supporting of scientific parallel programming. This paper contains an overview of Sisal 3.2 and a description of its new features compared with previous versions of the SFP input language such as the multidimensional array support, new abstractions like parametric types and generalised procedures, more flexible user-defined reductions, improved interoperability with other programming languages and specification of several optimising source text annotations.

  8. 3D photoacoustic imaging

    Science.gov (United States)

    Carson, Jeffrey J. L.; Roumeliotis, Michael; Chaudhary, Govind; Stodilka, Robert Z.; Anastasio, Mark A.

    2010-06-01

    Our group has concentrated on development of a 3D photoacoustic imaging system for biomedical imaging research. The technology employs a sparse parallel detection scheme and specialized reconstruction software to obtain 3D optical images using a single laser pulse. With the technology we have been able to capture 3D movies of translating point targets and rotating line targets. The current limitation of our 3D photoacoustic imaging approach is its inability ability to reconstruct complex objects in the field of view. This is primarily due to the relatively small number of projections used to reconstruct objects. However, in many photoacoustic imaging situations, only a few objects may be present in the field of view and these objects may have very high contrast compared to background. That is, the objects have sparse properties. Therefore, our work had two objectives: (i) to utilize mathematical tools to evaluate 3D photoacoustic imaging performance, and (ii) to test image reconstruction algorithms that prefer sparseness in the reconstructed images. Our approach was to utilize singular value decomposition techniques to study the imaging operator of the system and evaluate the complexity of objects that could potentially be reconstructed. We also compared the performance of two image reconstruction algorithms (algebraic reconstruction and l1-norm techniques) at reconstructing objects of increasing sparseness. We observed that for a 15-element detection scheme, the number of measureable singular vectors representative of the imaging operator was consistent with the demonstrated ability to reconstruct point and line targets in the field of view. We also observed that the l1-norm reconstruction technique, which is known to prefer sparseness in reconstructed images, was superior to the algebraic reconstruction technique. Based on these findings, we concluded (i) that singular value decomposition of the imaging operator provides valuable insight into the capabilities of

  9. Programming Massively Parallel Architectures using MARTE: a Case Study

    CERN Document Server

    Rodrigues, Wendell; Dekeyser, Jean-Luc

    2011-01-01

    Nowadays, several industrial applications are being ported to parallel architectures. These applications take advantage of the potential parallelism provided by multiple core processors. Many-core processors, especially the GPUs(Graphics Processing Unit), have led the race of floating-point performance since 2003. While the performance improvement of general- purpose microprocessors has slowed significantly, the GPUs have continued to improve relentlessly. As of 2009, the ratio between many-core GPUs and multicore CPUs for peak floating-point calculation throughput is about 10 times. However, as parallel programming requires a non-trivial distribution of tasks and data, developers find it hard to implement their applications effectively. Aiming to improve the use of many-core processors, this work presents an case-study using UML and MARTE profile to specify and generate OpenCL code for intensive signal processing applications. Benchmark results show us the viability of the use of MDE approaches to generate G...

  10. Testing New Programming Paradigms with NAS Parallel Benchmarks

    Science.gov (United States)

    Jin, H.; Frumkin, M.; Schultz, M.; Yan, J.

    2000-01-01

    Over the past decade, high performance computing has evolved rapidly, not only in hardware architectures but also with increasing complexity of real applications. Technologies have been developing to aim at scaling up to thousands of processors on both distributed and shared memory systems. Development of parallel programs on these computers is always a challenging task. Today, writing parallel programs with message passing (e.g. MPI) is the most popular way of achieving scalability and high performance. However, writing message passing programs is difficult and error prone. Recent years new effort has been made in defining new parallel programming paradigms. The best examples are: HPF (based on data parallelism) and OpenMP (based on shared memory parallelism). Both provide simple and clear extensions to sequential programs, thus greatly simplify the tedious tasks encountered in writing message passing programs. HPF is independent of memory hierarchy, however, due to the immaturity of compiler technology its performance is still questionable. Although use of parallel compiler directives is not new, OpenMP offers a portable solution in the shared-memory domain. Another important development involves the tremendous progress in the internet and its associated technology. Although still in its infancy, Java promisses portability in a heterogeneous environment and offers possibility to "compile once and run anywhere." In light of testing these new technologies, we implemented new parallel versions of the NAS Parallel Benchmarks (NPBs) with HPF and OpenMP directives, and extended the work with Java and Java-threads. The purpose of this study is to examine the effectiveness of alternative programming paradigms. NPBs consist of five kernels and three simulated applications that mimic the computation and data movement of large scale computational fluid dynamics (CFD) applications. We started with the serial version included in NPB2.3. Optimization of memory and cache usage

  11. CaKernel – A Parallel Application Programming Framework for Heterogenous Computing Architectures

    Directory of Open Access Journals (Sweden)

    Marek Blazewicz

    2011-01-01

    Full Text Available With the recent advent of new heterogeneous computing architectures there is still a lack of parallel problem solving environments that can help scientists to use easily and efficiently hybrid supercomputers. Many scientific simulations that use structured grids to solve partial differential equations in fact rely on stencil computations. Stencil computations have become crucial in solving many challenging problems in various domains, e.g., engineering or physics. Although many parallel stencil computing approaches have been proposed, in most cases they solve only particular problems. As a result, scientists are struggling when it comes to the subject of implementing a new stencil-based simulation, especially on high performance hybrid supercomputers. In response to the presented need we extend our previous work on a parallel programming framework for CUDA – CaCUDA that now supports OpenCL. We present CaKernel – a tool that simplifies the development of parallel scientific applications on hybrid systems. CaKernel is built on the highly scalable and portable Cactus framework. In the CaKernel framework, Cactus manages the inter-process communication via MPI while CaKernel manages the code running on Graphics Processing Units (GPUs and interactions between them. As a non-trivial test case we have developed a 3D CFD code to demonstrate the performance and scalability of the automatically generated code.

  12. A new computer program for topological, visual analysis of 3D particle configurations based on visual representation of radial distribution function peaks as bonds

    CERN Document Server

    Metere, Alfredo; Dzugutov, Mikhail

    2015-01-01

    We present a new program able to perform unique visual analysis on generic particle systems: PASYVAT (PArticle SYstem Visual Analysis Tool). More specifically, it can perform a selection of multiple interparticle distance ranges from a radial distribution function (RDF) plot and display them in 3D as bonds. This software can be used with any data set representing a system of particles in 3D. In this manuscript the reader will find a description of the program and its internal structure, with emphasis on its applicability in the study of certain particle configurations, obtained from classical molecular dynamics simulation in condensed matter physics.

  13. Performance Evaluation Methodologies and Tools for Massively Parallel Programs

    Science.gov (United States)

    Yan, Jerry C.; Sarukkai, Sekhar; Tucker, Deanne (Technical Monitor)

    1994-01-01

    The need for computing power has forced a migration from serial computation on a single processor to parallel processing on multiprocessors. However, without effective means to monitor (and analyze) program execution, tuning the performance of parallel programs becomes exponentially difficult as program complexity and machine size increase. The recent introduction of performance tuning tools from various supercomputer vendors (Intel's ParAide, TMC's PRISM, CSI'S Apprentice, and Convex's CXtrace) seems to indicate the maturity of performance tool technologies and vendors'/customers' recognition of their importance. However, a few important questions remain: What kind of performance bottlenecks can these tools detect (or correct)? How time consuming is the performance tuning process? What are some important technical issues that remain to be tackled in this area? This workshop reviews the fundamental concepts involved in analyzing and improving the performance of parallel and heterogeneous message-passing programs. Several alternative strategies will be contrasted, and for each we will describe how currently available tuning tools (e.g., AIMS, ParAide, PRISM, Apprentice, CXtrace, ATExpert, Pablo, IPS-2)) can be used to facilitate the process. We will characterize the effectiveness of the tools and methodologies based on actual user experiences at NASA Ames Research Center. Finally, we will discuss their limitations and outline recent approaches taken by vendors and the research community to address them.

  14. Scientific programming on massively parallel processor CP-PACS

    Energy Technology Data Exchange (ETDEWEB)

    Boku, Taisuke [Tsukuba Univ., Ibaraki (Japan). Inst. of Information Sciences and Electronics

    1998-03-01

    The massively parallel processor CP-PACS takes various problems of calculation physics as the object, and it has been designed so that its architecture has been devised to do various numerical processings. In this report, the outline of the CP-PACS and the example of programming in the Kernel CG benchmark in NAS Parallel Benchmarks, version 1, are shown, and the pseudo vector processing mechanism and the parallel processing tuning of scientific and technical computation utilizing the three-dimensional hyper crossbar net, which are two great features of the architecture of the CP-PACS are described. As for the CP-PACS, the PUs based on RISC processor and added with pseudo vector processor are used. Pseudo vector processing is realized as the loop processing by scalar command. The features of the connection net of PUs are explained. The algorithm of the NPB version 1 Kernel CG is shown. The part that takes the time for processing most in the main loop is the product of matrix and vector (matvec), and the parallel processing of the matvec is explained. The time for the computation by the CPU is determined. As the evaluation of the performance, the evaluation of the time for execution, the short vector processing of pseudo vector processor based on slide window, and the comparison with other parallel computers are reported. (K.I.)

  15. [3D finite-element study on displacement of craniofacial complex with retractive forces parallel to the occlusion plane on the maxilla of rhesus monkeys].

    Science.gov (United States)

    Huang, Jiaxin; Zhang, Xiaorong; Yao, Ji

    2014-04-01

    To construct a 3D finite-element model of the craniofacial complex with the original DICOM data of CT and to investigate the preliminary biomechanical characteristics with different directions and magnitudes of retractive forces to the maxilla of rhesus monkeys. A male rhesus monkey with mixed dentition was used. Spiral CT was performed to establish a 3D finite-element model of the craniofacial complex. The ANSYS 12.1 software was used to analyze craniofacial complex displacement. Each landmark showed larger displacement with increasing force value. The displacement values and force size exhibited a linear relationship. In the x-axis direction, all displacements were small. In the y-axis direction, all displacements showed significantly higher changes with increasing force value displacement. In the z-axis direction, the A-point and ANS point moved downward, but PNS moved upward. Loading retractive force resultes in an apparent backward and clockwise rotation on the maxilla with no obvious effects on the width of the upper jaw.

  16. Automatic array alignment in data-parallel programs

    Science.gov (United States)

    Chatterjee, Siddhartha; Gilbert, John R.; Schreiber, Robert; Teng, Shang-Hua

    1993-01-01

    FORTRAN 90 and other data-parallel languages express parallelism in the form of operations on data aggregates such as arrays. Misalignment of the operands of an array operation can reduce program performance on a distributed-memory parallel machine by requiring nonlocal data accesses. Determining array alignments that reduce communication is therefore a key issue in compiling such languages. We present a framework for the automatic determination of array alignments in array-based, data-parallel languages. Our language model handles array sectioning, reductions, spreads, transpositions, and masked operations. We decompose alignment functions into three constituents: axis, stride, and offset. For each of these subproblems, we show how to solve the alignment problem for a basic block of code, possibly containing common subexpressions. Alignments are generated for all array objects in the code, both named program variables and intermediate results. We assign computation to processors by virtue of explicit alignment of all temporaries; the resulting work assignment is in general better than that provided by the 'owner-computes' rule. Finally, we present some ideas for dealing with control flow, replication, and dynamic alignments that depend on loop induction variables.

  17. Programming N-Cubes with a Graphical Parallel Programming Environment Versus an Extended Sequential Language.

    Science.gov (United States)

    1986-11-01

    parallel programming environment and language Poker. Our example programs, an implementation of a Cholesky algorithm for a banded matrix, were written in both languages and compiled into object codes that ran on the Cosmic Cube. However the program written in Poker is shorter, faster and easier to write, easier to debug, and portable without changes to other parallel computer architectures. The Poker program was slower than the program written directly in Cosmic Cube C, however the experiments provided insights into changes that make Poker programs nearly as fast.

  18. On the utility of threads for data parallel programming

    Science.gov (United States)

    Fahringer, Thomas; Haines, Matthew; Mehrotra, Piyush

    1995-01-01

    Threads provide a useful programming model for asynchronous behavior because of their ability to encapsulate units of work that can then be scheduled for execution at runtime, based on the dynamic state of a system. Recently, the threaded model has been applied to the domain of data parallel scientific codes, and initial reports indicate that the threaded model can produce performance gains over non-threaded approaches, primarily through the use of overlapping useful computation with communication latency. However, overlapping computation with communication is possible without the benefit of threads if the communication system supports asynchronous primitives, and this comparison has not been made in previous papers. This paper provides a critical look at the utility of lightweight threads as applied to data parallel scientific programming.

  19. Final Report: Center for Programming Models for Scalable Parallel Computing

    Energy Technology Data Exchange (ETDEWEB)

    Mellor-Crummey, John [William Marsh Rice University

    2011-09-13

    As part of the Center for Programming Models for Scalable Parallel Computing, Rice University collaborated with project partners in the design, development and deployment of language, compiler, and runtime support for parallel programming models to support application development for the “leadership-class” computer systems at DOE national laboratories. Work over the course of this project has focused on the design, implementation, and evaluation of a second-generation version of Coarray Fortran. Research and development efforts of the project have focused on the CAF 2.0 language, compiler, runtime system, and supporting infrastructure. This has involved working with the teams that provide infrastructure for CAF that we rely on, implementing new language and runtime features, producing an open source compiler that enabled us to evaluate our ideas, and evaluating our design and implementation through the use of benchmarks. The report details the research, development, findings, and conclusions from this work.

  20. Fast 3D coronary artery contrast-enhanced magnetic resonance angiography with magnetization transfer contrast, fat suppression and parallel imaging as applied on an anthropomorphic moving heart phantom.

    Science.gov (United States)

    Irwan, Roy; Rüssel, Iris K; Sijens, Paul E

    2006-09-01

    A magnetic resonance sequence for high-resolution imaging of coronary arteries in a very short acquisition time is presented. The technique is based on fast low-angle shot and uses fat saturation and magnetization transfer contrast prepulses to improve image contrast. GeneRalized Autocalibrating Partially Parallel Acquisitions (GRAPPA) is implemented to shorten acquisition time. The sequence was tested on a moving anthropomorphic silicone heart phantom where the coronary arteries were filled with a gadolinium contrast agent solution, and imaging was performed at varying heart rates using GRAPPA. The clinical relevance of the phantom was validated by comparing the myocardial relaxation times of the phantom's homogeneous silicone cardiac wall to those of humans. Signal-to-noise ratio and contrast-to-noise ratio were higher when parallel imaging was used, possibly benefiting from the acquisition of one partition per heartbeat. Another advantage of parallel imaging for visualizing the coronary arteries is that the entire heart can be imaged within a few breath-holds.

  1. VPC - A Proposal for a Vector Parallel C Programming Language.

    Science.gov (United States)

    1987-10-30

    181 B. Kernighan and D. Ritchie. Th~e C Programming Language. Prentice-11all, 1978. [91 B. Kernighan and R. Pike. The Unix Programming Environment...designed to be an extended version of the C language as defined by Kernighan and Ritchie (Ref. 8). Rather than taking the approach of extending...basis. Unix is a trademark of AT&T Bell Laboratories. e .’r % 7% The Vector Parallel C Language 3 tion calls that activate the FX/8’s proprietary

  2. 3D Elevation Program—Virtual USA in 3D

    Science.gov (United States)

    Lukas, Vicki; Stoker, J.M.

    2016-04-14

    The U.S. Geological Survey (USGS) 3D Elevation Program (3DEP) uses a laser system called ‘lidar’ (light detection and ranging) to create a virtual reality map of the Nation that is very accurate. 3D maps have many uses with new uses being discovered all the time.  

  3. MLP: A Parallel Programming Alternative to MPI for New Shared Memory Parallel Systems

    Science.gov (United States)

    Taft, James R.

    1999-01-01

    Recent developments at the NASA AMES Research Center's NAS Division have demonstrated that the new generation of NUMA based Symmetric Multi-Processing systems (SMPs), such as the Silicon Graphics Origin 2000, can successfully execute legacy vector oriented CFD production codes at sustained rates far exceeding processing rates possible on dedicated 16 CPU Cray C90 systems. This high level of performance is achieved via shared memory based Multi-Level Parallelism (MLP). This programming approach, developed at NAS and outlined below, is distinct from the message passing paradigm of MPI. It offers parallelism at both the fine and coarse grained level, with communication latencies that are approximately 50-100 times lower than typical MPI implementations on the same platform. Such latency reductions offer the promise of performance scaling to very large CPU counts. The method draws on, but is also distinct from, the newly defined OpenMP specification, which uses compiler directives to support a limited subset of multi-level parallel operations. The NAS MLP method is general, and applicable to a large class of NASA CFD codes.

  4. A parallel domain decomposition-based implicit method for the Cahn-Hilliard-Cook phase-field equation in 3D

    KAUST Repository

    Zheng, Xiang

    2015-03-01

    We present a numerical algorithm for simulating the spinodal decomposition described by the three dimensional Cahn-Hilliard-Cook (CHC) equation, which is a fourth-order stochastic partial differential equation with a noise term. The equation is discretized in space and time based on a fully implicit, cell-centered finite difference scheme, with an adaptive time-stepping strategy designed to accelerate the progress to equilibrium. At each time step, a parallel Newton-Krylov-Schwarz algorithm is used to solve the nonlinear system. We discuss various numerical and computational challenges associated with the method. The numerical scheme is validated by a comparison with an explicit scheme of high accuracy (and unreasonably high cost). We present steady state solutions of the CHC equation in two and three dimensions. The effect of the thermal fluctuation on the spinodal decomposition process is studied. We show that the existence of the thermal fluctuation accelerates the spinodal decomposition process and that the final steady morphology is sensitive to the stochastic noise. We also show the evolution of the energies and statistical moments. In terms of the parallel performance, it is found that the implicit domain decomposition approach scales well on supercomputers with a large number of processors. © 2015 Elsevier Inc.

  5. MAP3D: a media processor approach for high-end 3D graphics

    Science.gov (United States)

    Darsa, Lucia; Stadnicki, Steven; Basoglu, Chris

    1999-12-01

    Equator Technologies, Inc. has used a software-first approach to produce several programmable and advanced VLIW processor architectures that have the flexibility to run both traditional systems tasks and an array of media-rich applications. For example, Equator's MAP1000A is the world's fastest single-chip programmable signal and image processor targeted for digital consumer and office automation markets. The Equator MAP3D is a proposal for the architecture of the next generation of the Equator MAP family. The MAP3D is designed to achieve high-end 3D performance and a variety of customizable special effects by combining special graphics features with high performance floating-point and media processor architecture. As a programmable media processor, it offers the advantages of a completely configurable 3D pipeline--allowing developers to experiment with different algorithms and to tailor their pipeline to achieve the highest performance for a particular application. With the support of Equator's advanced C compiler and toolkit, MAP3D programs can be written in a high-level language. This allows the compiler to successfully find and exploit any parallelism in a programmer's code, thus decreasing the time to market of a given applications. The ability to run an operating system makes it possible to run concurrent applications in the MAP3D chip, such as video decoding while executing the 3D pipelines, so that integration of applications is easily achieved--using real-time decoded imagery for texturing 3D objects, for instance. This novel architecture enables an affordable, integrated solution for high performance 3D graphics.

  6. An informal introduction to program transformation and parallel processors

    Energy Technology Data Exchange (ETDEWEB)

    Hopkins, K.W. [Southwest Baptist Univ., Bolivar, MO (United States)

    1994-08-01

    In the summer of 1992, I had the opportunity to participate in a Faculty Research Program at Argonne National Laboratory. I worked under Dr. Jim Boyle on a project transforming code written in pure functional Lisp to Fortran code to run on distributed-memory parallel processors. To perform this project, I had to learn three things: the transformation system, the basics of distributed-memory parallel machines, and the Lisp programming language. Each of these topics in computer science was unfamiliar to me as a mathematician, but I found that they (especially parallel processing) are greatly impacting many fields of mathematics and science. Since most mathematicians have some exposure to computers, but.certainly are not computer scientists, I felt it was appropriate to write a paper summarizing my introduction to these areas and how they can fit together. This paper is not meant to be a full explanation of the topics, but an informal introduction for the ``mathematical layman.`` I place myself in that category as well as my previous use of computers was as a classroom demonstration tool.

  7. Parallelizing Deadlock Resolution in Symbolic Synthesis of Distributed Programs

    Directory of Open Access Journals (Sweden)

    Fuad Abujarad

    2009-12-01

    Full Text Available Previous work has shown that there are two major complexity barriers in the synthesis of fault-tolerant distributed programs: (1 generation of fault-span, the set of states reachable in the presence of faults, and (2 resolving deadlock states, from where the program has no outgoing transitions. Of these, the former closely resembles with model checking and, hence, techniques for efficient verification are directly applicable to it. Hence, we focus on expediting the latter with the use of multi-core technology. We present two approaches for parallelization by considering different design choices. The first approach is based on the computation of equivalence classes of program transitions (called group computation that are needed due to the issue of distribution (i.e., inability of processes to atomically read and write all program variables. We show that in most cases the speedup of this approach is close to the ideal speedup and in some cases it is superlinear. The second approach uses traditional technique of partitioning deadlock states among multiple threads. However, our experiments show that the speedup for this approach is small. Consequently, our analysis demonstrates that a simple approach of parallelizing the group computation is likely to be the effective method for using multi-core computing in the context of deadlock resolution.

  8. Advanced Programming Platform for efficient use of Data Parallel Hardware

    CERN Document Server

    Cabellos, Luis

    2012-01-01

    Graphics processing units (GPU) had evolved from a specialized hardware capable to render high quality graphics in games to a commodity hardware for effective processing blocks of data in a parallel schema. This evolution is particularly interesting for scientific groups, which traditionally use mainly CPU as a work horse, and now can profit of the arrival of GPU hardware to HPC clusters. This new GPU hardware promises a boost in peak performance, but it is not trivial to use. In this article a programming platform designed to promote a direct use of this specialized hardware is presented. This platform includes a visual editor of parallel data flows and it is oriented to the execution in distributed clusters with GPUs. Examples of application in two characteristic problems, Fast Fourier Transform and Image Compression, are also shown.

  9. Automatic Performance Debugging of SPMD-style Parallel Programs

    CERN Document Server

    Liu, Xu; Zhan, Kunlin; Shi, Weisong; Yuan, Lin; Meng, Dan; Wang, Lei

    2011-01-01

    The simple program and multiple data (SPMD) programming model is widely used for both high performance computing and Cloud computing. In this paper, we design and implement an innovative system, AutoAnalyzer, that automates the process of debugging performance problems of SPMD-style parallel programs, including data collection, performance behavior analysis, locating bottlenecks, and uncovering their root causes. AutoAnalyzer is unique in terms of two features: first, without any apriori knowledge, it automatically locates bottlenecks and uncovers their root causes for performance optimization; second, it is lightweight in terms of the size of performance data to be collected and analyzed. Our contributions are three-fold: first, we propose two effective clustering algorithms to investigate the existence of performance bottlenecks that cause process behavior dissimilarity or code region behavior disparity, respectively; meanwhile, we present two searching algorithms to locate bottlenecks; second, on a basis o...

  10. SIMULATION PROCESS OF REMOVING NON-METALLIC INCLUSIONS IN ALUMINUM ALLOYS USING THE PROGRAM FLOW-3D

    Directory of Open Access Journals (Sweden)

    N. V. Sletova

    2013-01-01

    Full Text Available The perspective materials for making fining preparations for the silumins are the calcium and strontium carbonates from the environmental safety point of view are shown. Principle possibility of using dispersed carbonates in the fining mixtures is confirmed by late inoculation process research using simulation FLOW-3D.The high efficiency of the fining mixture with the inoculants effect is confirmed by the industrial tests

  11. NavP: Structured and Multithreaded Distributed Parallel Programming

    Science.gov (United States)

    Pan, Lei

    2007-01-01

    We present Navigational Programming (NavP) -- a distributed parallel programming methodology based on the principles of migrating computations and multithreading. The four major steps of NavP are: (1) Distribute the data using the data communication pattern in a given algorithm; (2) Insert navigational commands for the computation to migrate and follow large-sized distributed data; (3) Cut the sequential migrating thread and construct a mobile pipeline; and (4) Loop back for refinement. NavP is significantly different from the current prevailing Message Passing (MP) approach. The advantages of NavP include: (1) NavP is structured distributed programming and it does not change the code structure of an original algorithm. This is in sharp contrast to MP as MP implementations in general do not resemble the original sequential code; (2) NavP implementations are always competitive with the best MPI implementations in terms of performance. Approaches such as DSM or HPF have failed to deliver satisfying performance as of today in contrast, even if they are relatively easy to use compared to MP; (3) NavP provides incremental parallelization, which is beyond the reach of MP; and (4) NavP is a unifying approach that allows us to exploit both fine- (multithreading on shared memory) and coarse- (pipelined tasks on distributed memory) grained parallelism. This is in contrast to the currently popular hybrid use of MP+OpenMP, which is known to be complex to use. We present experimental results that demonstrate the effectiveness of NavP.

  12. 三维有限元并行计算及其工程应用%PARALLEL 3D FINITE ELEMENT ANALYSIS AND ITS APPLICATION TO HYDRAULIC ENGINEERING

    Institute of Scientific and Technical Information of China (English)

    刘耀儒; 周维垣; 杨强; 陈新

    2005-01-01

    针对水利工程中的大型复杂三维结构对大规模数值计算的需求,基于J-PCG方法(Jacobi预处理共轭梯度法),建立了有限元EBE(element-by-element)方法在分布存储并行机上的计算算法.该算法不用考虑网格拓扑结构和单元的排序,同时不形成整体刚度矩阵,而且避免了对复杂的三维结构进行区域分解.采用上述算法编制了三维有限元并行求解的PFEM(parallel finite element method)程序,并在网络机群系统上实现,然后将其应用到二滩拱坝-地基系统和水布娅地下洞室的三维有限元数值计算中.数值计算结果表明,三维有限元并行EBE方法非常适合于水利工程中三维复杂结构的大规模数值计算.

  13. A Parallel 3D Model for the Multi-Species Low Energy Beam Transport System of the RIA Prototype ECR Ion Source VENUS

    CERN Document Server

    Qiang, Ji; Todd, Damon

    2005-01-01

    The driver linac of the proposed Rare Isotope Accelerator (RIA) requires a great variety of high intensity, high charge state ion beams. In order to design and optimize the low energy beam line optics of the RIA front end, we have developed a new parallel three-dimensional model to simulate the low energy, multi-species beam transport from the ECR ion source extraction region to the focal plane of the analyzing magnet. A multi-section overlapped computational domain has been used to break the original transport system into a number of independent subsystems. Within each subsystem, macro-particle tracking is used to obtain the charge density distribution in this subdomain. The three-dimensional Poisson equation is solved within the subdomain and particle tracking is repeated until the solution converges. Two new Poisson solvers based on a combination of the spectral method and the multigrid method have been developed to solve the Poisson equation in cylindrical coordinates for the beam extraction region and in...

  14. PLOT3D user's manual

    Science.gov (United States)

    Walatka, Pamela P.; Buning, Pieter G.; Pierce, Larry; Elson, Patricia A.

    1990-01-01

    PLOT3D is a computer graphics program designed to visualize the grids and solutions of computational fluid dynamics. Seventy-four functions are available. Versions are available for many systems. PLOT3D can handle multiple grids with a million or more grid points, and can produce varieties of model renderings, such as wireframe or flat shaded. Output from PLOT3D can be used in animation programs. The first part of this manual is a tutorial that takes the reader, keystroke by keystroke, through a PLOT3D session. The second part of the manual contains reference chapters, including the helpfile, data file formats, advice on changing PLOT3D, and sample command files.

  15. 3D video

    CERN Document Server

    Lucas, Laurent; Loscos, Céline

    2013-01-01

    While 3D vision has existed for many years, the use of 3D cameras and video-based modeling by the film industry has induced an explosion of interest for 3D acquisition technology, 3D content and 3D displays. As such, 3D video has become one of the new technology trends of this century.The chapters in this book cover a large spectrum of areas connected to 3D video, which are presented both theoretically and technologically, while taking into account both physiological and perceptual aspects. Stepping away from traditional 3D vision, the authors, all currently involved in these areas, provide th

  16. 3D Animation Essentials

    CERN Document Server

    Beane, Andy

    2012-01-01

    The essential fundamentals of 3D animation for aspiring 3D artists 3D is everywhere--video games, movie and television special effects, mobile devices, etc. Many aspiring artists and animators have grown up with 3D and computers, and naturally gravitate to this field as their area of interest. Bringing a blend of studio and classroom experience to offer you thorough coverage of the 3D animation industry, this must-have book shows you what it takes to create compelling and realistic 3D imagery. Serves as the first step to understanding the language of 3D and computer graphics (CG)Covers 3D anim

  17. Parallelization and checkpointing of GPU applications through program transformation

    Energy Technology Data Exchange (ETDEWEB)

    Solano-Quinde, Lizandro Damian [Iowa State Univ., Ames, IA (United States)

    2012-01-01

    GPUs have emerged as a powerful tool for accelerating general-purpose applications. The availability of programming languages that makes writing general-purpose applications for running on GPUs tractable have consolidated GPUs as an alternative for accelerating general purpose applications. Among the areas that have benefited from GPU acceleration are: signal and image processing, computational fluid dynamics, quantum chemistry, and, in general, the High Performance Computing (HPC) Industry. In order to continue to exploit higher levels of parallelism with GPUs, multi-GPU systems are gaining popularity. In this context, single-GPU applications are parallelized for running in multi-GPU systems. Furthermore, multi-GPU systems help to solve the GPU memory limitation for applications with large application memory footprint. Parallelizing single-GPU applications has been approached by libraries that distribute the workload at runtime, however, they impose execution overhead and are not portable. On the other hand, on traditional CPU systems, parallelization has been approached through application transformation at pre-compile time, which enhances the application to distribute the workload at application level and does not have the issues of library-based approaches. Hence, a parallelization scheme for GPU systems based on application transformation is needed. Like any computing engine of today, reliability is also a concern in GPUs. GPUs are vulnerable to transient and permanent failures. Current checkpoint/restart techniques are not suitable for systems with GPUs. Checkpointing for GPU systems present new and interesting challenges, primarily due to the natural differences imposed by the hardware design, the memory subsystem architecture, the massive number of threads, and the limited amount of synchronization among threads. Therefore, a checkpoint/restart technique suitable for GPU systems is needed. The goal of this work is to exploit higher levels of parallelism and

  18. Energy consumption model over parallel programs implemented on multicore architectures

    Directory of Open Access Journals (Sweden)

    Ricardo Isidro-Ramirez

    2015-06-01

    Full Text Available In High Performance Computing, energy consump-tion is becoming an important aspect to consider. Due to the high costs that represent energy production in all countries it holds an important role and it seek to find ways to save energy. It is reflected in some efforts to reduce the energy requirements of hardware components and applications. Some options have been appearing in order to scale down energy use and, con-sequently, scale up energy efficiency. One of these strategies is the multithread programming paradigm, whose purpose is to produce parallel programs able to use the full amount of computing resources available in a microprocessor. That energy saving strategy focuses on efficient use of multicore processors that are found in various computing devices, like mobile devices. Actually, as a growing trend, multicore processors are found as part of various specific purpose computers since 2003, from High Performance Computing servers to mobile devices. However, it is not clear how multiprogramming affects energy efficiency. This paper presents an analysis of different types of multicore-based architectures used in computing, and then a valid model is presented. Based on Amdahl’s Law, a model that considers different scenarios of energy use in multicore architectures it is proposed. Some interesting results were found from experiments with the developed algorithm, that it was execute of a parallel and sequential way. A lower limit of energy consumption was found in a type of multicore architecture and this behavior was observed experimentally.

  19. An empirical study of FORTRAN programs for parallelizing compilers

    Science.gov (United States)

    Shen, Zhiyu; Li, Zhiyuan; Yew, Pen-Chung

    1990-01-01

    Some results are reported from an empirical study of program characteristics that are important in parallelizing compiler writers, especially in the area of data dependence analysis and program transformations. The state of the art in data dependence analysis and some parallel execution techniques are examined. The major findings are included. Many subscripts contain symbolic terms with unknown values. A few methods of determining their values at compile time are evaluated. Array references with coupled subscripts appear quite frequently; these subscripts must be handled simultaneously in a dependence test, rather than being handled separately as in current test algorithms. Nonzero coefficients of loop indexes in most subscripts are found to be simple: they are either 1 or -1. This allows an exact real-valued test to be as accurate as an exact integer-valued test for one-dimensional or two-dimensional arrays. Dependencies with uncertain distance are found to be rather common, and one of the main reasons is the frequent appearance of symbolic terms with unknown values.

  20. CUDA programs for GPU computing of Swendsen-Wang multi-cluster spin flip algorithm: 2D and 3D Ising, Potts, and XY models

    CERN Document Server

    Komura, Yukihiro

    2014-01-01

    We present sample CUDA programs for the GPU computing of the Swendsen-Wang multi-cluster spin flip algorithm. We deal with the classical spin models; the Ising model, the $q$-state Potts model, and the classical XY model. As for the lattice, both the 2D (square) lattice and the 3D (simple cubic) lattice are treated. We already reported the idea of the GPU implementation for 2D models [Comput. Phys. Commun. 183 (2012) 1155-1161]. We here explain the details of sample programs, and discuss the performance of the present GPU implementation for the 3D Ising and XY models. We also show the calculated results of the moment ratio for these models, and discuss phase transitions.

  1. Poker on the Cosmic Cube: The First Retargetable Parallel Programming Language and Environment.

    Science.gov (United States)

    1986-06-01

    parallel programming environment, to new parallel architectures. The specifics are illustrated by describing the retarget of Poker to CalTech’s Cosmic Cube. Poker requires only three features from the target architecture: MIMD operation, message passing inter-process communication, and a sequential language (e.g. C) for the processor elements. In return Poker gives the new architecture a complete parallel programming environment which will compile Poker parallel programs without modification, into efficient object code for the new architecture.

  2. A Parallel Vector Machine for the PM Programming Language

    Science.gov (United States)

    Bellerby, Tim

    2016-04-01

    PM is a new programming language which aims to make the writing of computational geoscience models on parallel hardware accessible to scientists who are not themselves expert parallel programmers. It is based around the concept of communicating operators: language constructs that enable variables local to a single invocation of a parallelised loop to be viewed as if they were arrays spanning the entire loop domain. This mechanism enables different loop invocations (which may or may not be executing on different processors) to exchange information in a manner that extends the successful Communicating Sequential Processes idiom from single messages to collective communication. Communicating operators avoid the additional synchronisation mechanisms, such as atomic variables, required when programming using the Partitioned Global Address Space (PGAS) paradigm. Using a single loop invocation as the fundamental unit of concurrency enables PM to uniformly represent different levels of parallelism from vector operations through shared memory systems to distributed grids. This paper describes an implementation of PM based on a vectorised virtual machine. On a single processor node, concurrent operations are implemented using masked vector operations. Virtual machine instructions operate on vectors of values and may be unmasked, masked using a Boolean field, or masked using an array of active vector cell locations. Conditional structures (such as if-then-else or while statement implementations) calculate and apply masks to the operations they control. A shift in mask representation from Boolean to location-list occurs when active locations become sufficiently sparse. Parallel loops unfold data structures (or vectors of data structures for nested loops) into vectors of values that may additionally be distributed over multiple computational nodes and then split into micro-threads compatible with the size of the local cache. Inter-node communication is accomplished using

  3. Parallelizing dynamic sequential programs using polyhedral process networks

    NARCIS (Netherlands)

    Nadezhkin, Dmitry

    2012-01-01

    The Polyhedral Process Network (PPN) is a suitable parallel model of computation (MoC) used to specify embedded streaming applications in a parallel form facilitating the efficient mapping onto embedded parallel execution platforms. Unfortunately, specifying an application using a parallel MoC is a

  4. 基于 VIC 3D 技术的重组竹I 型断裂参数确定方法%Determination of mode-I fracture parameters of parallel strand bamboo based on VIC-3D technology

    Institute of Scientific and Technical Information of China (English)

    杨蕾; 周爱萍; 黄东升; 何晨

    2016-01-01

    Fabricated via industrial process using raw bamboo,Parallel strand bamboo (PSB)is a high strength composite which has been used in building constructions in recent years.Microvoids are inevitably left in PSB composite owing to the dimensional inconsistence of bamboo fibers;hence fracture due to mocrovids coalescence and expanding maybe a major failure mode of PSB components.This paper amid at studying on the Mode I fracture properties based on LEFM theory and Iwrin’s energy theory. Wedge splitting test was employed as test method.The length of crack expending was determined by three dimensional virtual image correlation (VIC-3D)global-field deformation acquisition system.Fracture toughness of PSB and R-curves were obtained.The results showed that the VIC-3D global-field acquisition system can precisely determine the crack tip and the complicated calculation for crack length determination may be avoided.Good agreement may be achieved between the result of this method and compliance approach,which indicated that the proposed method in this study is efficient.%重组竹是原竹经工业化制造而成的一种高强复合材料,近年来被用于建筑结构。由于重组竹内部存在许多微裂纹,断裂破坏是重组竹构件的主要失效模式。本文基于线弹性断裂理论和 Irwin 能量原理,采用 VIC 3D 全场变形测量技术确定裂纹扩展长度,通过重组竹楔形试件断裂试验,研究重组竹 I 型裂纹的断裂性能,给出重组竹的断裂韧度和 R 曲线。试验表明:VIC 3D 全场变形测量技术可准确确定裂纹尖端,避免了等效柔度法的复杂运算。该方法的结构与等效柔度法高度一致,是研究重组竹断裂性能的一种有效测量手段。

  5. EUROPEANA AND 3D

    Directory of Open Access Journals (Sweden)

    D. Pletinckx

    2012-09-01

    Full Text Available The current 3D hype creates a lot of interest in 3D. People go to 3D movies, but are we ready to use 3D in our homes, in our offices, in our communication? Are we ready to deliver real 3D to a general public and use interactive 3D in a meaningful way to enjoy, learn, communicate? The CARARE project is realising this for the moment in the domain of monuments and archaeology, so that real 3D of archaeological sites and European monuments will be available to the general public by 2012. There are several aspects to this endeavour. First of all is the technical aspect of flawlessly delivering 3D content over all platforms and operating systems, without installing software. We have currently a working solution in PDF, but HTML5 will probably be the future. Secondly, there is still little knowledge on how to create 3D learning objects, 3D tourist information or 3D scholarly communication. We are still in a prototype phase when it comes to integrate 3D objects in physical or virtual museums. Nevertheless, Europeana has a tremendous potential as a multi-facetted virtual museum. Finally, 3D has a large potential to act as a hub of information, linking to related 2D imagery, texts, video, sound. We describe how to create such rich, explorable 3D objects that can be used intuitively by the generic Europeana user and what metadata is needed to support the semantic linking.

  6. Open-MP与并行程序设计%Open-MP and Parallel Programming

    Institute of Scientific and Technical Information of China (English)

    陈崚; 陈宏建; 秦玲

    2003-01-01

    The application programming interface Open-MP for the shared memory parallel computer system and its characteristics are illustrated. We also compare Open-MP with parallel programming tool MPI.To overcome the disadvantage of large overhead in Open-MP program,several optimization methods in Open-MP programming are presented to increase the efficiency of its execution.

  7. 基于Simulink/SimMechanics的三自由度并联机器人控制系统仿真%Simulation of Control System for 3D of Parallel Robot based on Simulink/SimMechanics

    Institute of Scientific and Technical Information of China (English)

    胡峰; 骆德渊; 雷霆; 柯辉

    2012-01-01

    Aiming at the problem that the control system of parallel robot is more complicated, compared to the traditional series robot, the technology of virtual simulation was investigated on the control strategy of parallel robot Taking 3D0F Delta Parallel Robot for example, in order to conveniently and rapidly achieve the control system simulation , using Simulink for simulation platform and combining with SimMechanics link, the method of modeling which translates CAD assemblies of Pro/E into SimMechanics model was presented, after that the PID controller model was designed. The experimental results show that it can provide the efficient and significant simulation platform to research the control strategy of parallel robot.%针对并联机器人控制系统比传统串联机器人更加复杂的问题,将虚拟仿真技术应用到并联机器人控制策略的研究上.以三自由度Delta并联机器人为例,为便捷高效实现其控制系统仿真,利用Simulink为仿真平台,结合SimMechanics Link接口软件,提出了三维Pro/E模型转换成SimMechanics模型的建模方法建立机械系统模型,并设计PID控制器模型进行仿真分析.结果表明,该方法为并联机器人控制策略的研究提供了高效的仿真平台,便于展开针对并联机器人特点的各种控制策略的研究.

  8. Parallel functional programming in Sisal: Fictions, facts, and future

    Energy Technology Data Exchange (ETDEWEB)

    McGraw, J.R.

    1993-07-01

    This paper provides a status report on the progress of research and development on the functional language Sisal. This project focuses on providing a highly effective method of writing large scientific applications that can efficiently execute on a spectrum of different multiprocessors. The paper includes sections on the language definition, compilation strategies, and programming techniques intended for readers with little or no background with Sisal. The section on performance presents our most recent results on execution speed for shared-memory multiprocessors, our findings using Sisal to develop codes, and our experiences migrating the same source code to different machines. For large programs, the execution performance of Sisal (with minimal supporting advice from the programmer) usually exceeds that of the best available automatic, vector/parallel Fortran compilers. Our evidence also indicates that Sisal programs tend to be shorter in length, faster to write, and dearer to understand than equivalent algorithms in Fortran. The paper concludes with a substantial discussion of common criticisms of the language and our plans for addressing them. Most notably, efficient implementations for distributed memory machines are lacking; an issue we plan to remedy.

  9. A New Approach to Improve Cognition, Muscle Strength, and Postural Balance in Community-Dwelling Elderly with a 3-D Virtual Reality Kayak Program.

    Science.gov (United States)

    Park, Junhyuck; Yim, JongEun

    2016-01-01

    Aging is usually accompanied with deterioration of physical abilities, such as muscular strength, sensory sensitivity, and functional capacity. Recently, intervention methods with virtual reality have been introduced, providing an enjoyable therapy for elderly. The aim of this study was to investigate whether a 3-D virtual reality kayak program could improve the cognitive function, muscle strength, and balance of community-dwelling elderly. Importantly, kayaking involves most of the upper body musculature and needs the balance control. Seventy-two participants were randomly allocated into the kayak program group (n = 36) and the control group (n = 36). The two groups were well matched with respect to general characteristics at baseline. The participants in both groups performed a conventional exercise program for 30 min, and then the 3-D virtual reality kayak program was performed in the kayak program group for 20 min, two times a week for 6 weeks. Cognitive function was measured using the Montreal Cognitive Assessment. Muscle strength was measured using the arm curl and handgrip strength tests. Standing and sitting balance was measured using the Good Balance system. The post-test was performed in the same manner as the pre-test; the overall outcomes such as cognitive function (p balance (standing and sitting balance, p balance of elderly.

  10. Ex-vessel neutron dosimetry analysis for westinghouse 4-loop XL pressurized water reactor plant using the RadTrack{sup TM} Code System with the 3D parallel discrete ordinates code RAPTOR-M3G

    Energy Technology Data Exchange (ETDEWEB)

    Chen, J.; Alpan, F. A.; Fischer, G.A.; Fero, A.H. [Westinghouse Electric Company, Nuclear Services, Radiation Engineering and Analysis, 1000 Westinghouse Dr., Cranberry Township, PA 16066-5228 (United States)

    2011-07-01

    Traditional two-dimensional (2D)/one-dimensional (1D) SYNTHESIS methodology has been widely used to calculate fast neutron (>1.0 MeV) fluence exposure to reactor pressure vessel in the belt-line region. However, it is expected that this methodology cannot provide accurate fast neutron fluence calculation at elevations far above or below the active core region. A three-dimensional (3D) parallel discrete ordinates calculation for ex-vessel neutron dosimetry on a Westinghouse 4-Loop XL Pressurized Water Reactor has been done. It shows good agreement between the calculated results and measured results. Furthermore, the results show very different fast neutron flux values at some of the former plate locations and elevations above and below an active core than those calculated by a 2D/1D SYNTHESIS method. This indicates that for certain irregular reactor internal structures, where the fast neutron flux has a very strong local effect, it is required to use a 3D transport method to calculate accurate fast neutron exposure. (authors)

  11. A 3D elasto-plastic FEM program developed for reservoir Geomechanics simulations: Introduction and case studies

    Directory of Open Access Journals (Sweden)

    Amin Chamani

    2013-06-01

    Full Text Available The development of yielded or failure zone due to an engineering construction is a subject of study in different disciplines. In Petroleum engineering, depletion from and injection of gas into a porous rock can cause development of a yield zone around the reservoir. Studying this phenomenon requires elasto-plastic analysis of geomaterial, in this case the porous rocks. In this study, which is a continuation of a previous study investigating the elastic behaviour of geomaterial, the elasto-plastic responses of geomaterial were studied. A 3D finite element code (FEM was developed, which can consider different constitutive models. The code features were explained and some case studies were presented to validate the output results of the code. The numerical model was, then, applied to study the development of the plastic zone around a horizontal porous formation subjected to the injection of gas. The model is described in detail and the results are presented. It was observed that by reducing the cohesion of rocks the extension of the plastic zone increased. Comparing to the elastic model, the ability to estimate the extension of the yield and failure zone is the main advantage of an elasto-plastic model.

  12. GRID2D/3D: A computer program for generating grid systems in complex-shaped two- and three-dimensional spatial domains. Part 2: User's manual and program listing

    Science.gov (United States)

    Bailey, R. T.; Shih, T. I.-P.; Nguyen, H. L.; Roelke, R. J.

    1990-01-01

    An efficient computer program, called GRID2D/3D, was developed to generate single and composite grid systems within geometrically complex two- and three-dimensional (2- and 3-D) spatial domains that can deform with time. GRID2D/3D generates single grid systems by using algebraic grid generation methods based on transfinite interpolation in which the distribution of grid points within the spatial domain is controlled by stretching functions. All single grid systems generated by GRID2D/3D can have grid lines that are continuous and differentiable everywhere up to the second-order. Also, grid lines can intersect boundaries of the spatial domain orthogonally. GRID2D/3D generates composite grid systems by patching together two or more single grid systems. The patching can be discontinuous or continuous. For continuous composite grid systems, the grid lines are continuous and differentiable everywhere up to the second-order except at interfaces where different single grid systems meet. At interfaces where different single grid systems meet, the grid lines are only differentiable up to the first-order. For 2-D spatial domains, the boundary curves are described by using either cubic or tension spline interpolation. For 3-D spatial domains, the boundary surfaces are described by using either linear Coon's interpolation, bi-hyperbolic spline interpolation, or a new technique referred to as 3-D bi-directional Hermite interpolation. Since grid systems generated by algebraic methods can have grid lines that overlap one another, GRID2D/3D contains a graphics package for evaluating the grid systems generated. With the graphics package, the user can generate grid systems in an interactive manner with the grid generation part of GRID2D/3D. GRID2D/3D is written in FORTRAN 77 and can be run on any IBM PC, XT, or AT compatible computer. In order to use GRID2D/3D on workstations or mainframe computers, some minor modifications must be made in the graphics part of the program; no

  13. The Rochester Checkers Player: Multi-Model Parallel Programming for Animate Vision

    Science.gov (United States)

    1991-06-01

    parallel programming is likely to serve for all tasks, however. Early vision algorithms are intensely data parallel, often utilizing fine-grain parallel computations that share an image, while cognition algorithms decompose naturally by function, often consisting of loosely-coupled, coarse-grain parallel units. A typical animate vision application will likely consist of many tasks, each of which may require a different parallel programming model, and all of which must cooperate to achieve the desired behavior. These multi-model programs require an

  14. MoldaNet: a network distributed molecular graphics and modelling program that integrates secure signed applet and Java 3D technologies.

    Science.gov (United States)

    Yoshida, H; Rzepa, H S; Tonge, A P

    1998-06-01

    MoldaNet is a molecular graphics and modelling program that integrates several new Java technologies, including authentication as a Secure Signed Applet, and implementation of Java 3D classes to enable access to hardware graphics acceleration. It is the first example of a novel class of Internet-based distributed computational chemistry tool designed to eliminate the need for user pre-installation of software on their client computer other than a standard Internet browser. The creation of a properly authenticated tool using a signed digital X.509 certificate permits the user to employ MoldaNet to read and write the files to a local file store; actions that are normally disallowed in Java applets. The modularity of the Java language also allows straightforward inclusion of Java3D and Chemical Markup Language classes in MoldaNet to permit the user to filter their model into 3D model descriptors such as VRML97 or CML for saving on local disk. The implications for both distance-based training environments and chemical commerce are noted.

  15. IZDELAVA TISKALNIKA 3D

    OpenAIRE

    Brdnik, Lovro

    2015-01-01

    Diplomsko delo analizira trenutno stanje 3D tiskalnikov na trgu. Prikazan je razvoj in principi delovanja 3D tiskalnikov. Predstavljeni so tipi 3D tiskalnikov, njihove prednosti in slabosti. Podrobneje je predstavljena zgradba in delovanje koračnih motorjev. Opravljene so meritve koračnih motorjev. Opisana je programska oprema za rokovanje s 3D tiskalniki in komponente, ki jih potrebujemo za izdelavo. Diploma se oklepa vprašanja, ali je izdelava 3D tiskalnika bolj ekonomična kot pa naložba v ...

  16. Hip2Norm: an object-oriented cross-platform program for 3D analysis of hip joint morphology using 2D pelvic radiographs.

    Science.gov (United States)

    Zheng, G; Tannast, M; Anderegg, C; Siebenrock, K A; Langlotz, F

    2007-07-01

    We developed an object-oriented cross-platform program to perform three-dimensional (3D) analysis of hip joint morphology using two-dimensional (2D) anteroposterior (AP) pelvic radiographs. Landmarks extracted from 2D AP pelvic radiographs and optionally an additional lateral pelvic X-ray were combined with a cone beam projection model to reconstruct 3D hip joints. Since individual pelvic orientation can vary considerably, a method for standardizing pelvic orientation was implemented to determine the absolute tilt/rotation. The evaluation of anatomically morphologic differences was achieved by reconstructing the projected acetabular rim and the measured hip parameters as if obtained in a standardized neutral orientation. The program had been successfully used to interactively objectify acetabular version in hips with femoro-acetabular impingement or developmental dysplasia. Hip(2)Norm is written in object-oriented programming language C++ using cross-platform software Qt (TrollTech, Oslo, Norway) for graphical user interface (GUI) and is transportable to any platform.

  17. Parallel Programming Methodologies for Non-Uniform Structured Problems in Materials Science

    Science.gov (United States)

    1993-10-01

    COVERED 1 10/93 _ Interim 12/01/92 - 09/30/93 4. TITLE AND SUBTITLE 5. FUNDING NUMBERS Parallel Programming Methodologies for Non-Uniform Structured...Dear Dr. van Tilborg, Enclosed you will find the annual report for " Parallel Programming Methodolo- gies for Non-Uniform Structured Problems in...Quincy Street Arlington, VA 22217-5660 Dear Dr. van Tilborg, Enclosed you will find the annual report for " Parallel Programming Methodolo- gies for Non

  18. 并行程序设计语言发展现状%Current Development of Parallel Programming Language

    Institute of Scientific and Technical Information of China (English)

    韩卫; 郝红宇; 代丽

    2003-01-01

    In this paper we introduce the history of the parallel programming language and list some of currently parallel programming languages. Then according to the classified principle. We analyze some of the representative parallel programming languages in detail. Finally, we show a further feature to the parallel programming language.

  19. 3-D Hybrid Kinetic Modeling of the Interaction Between the Solar Wind and Lunar-like Exospheric Pickup Ions in Case of Oblique/ Quasi-Parallel/Parallel Upstream Magnetic Field

    Science.gov (United States)

    Lipatov, A. S.; Farrell, W. M.; Cooper, J. F.; Sittler, E. C., Jr.; Hartle, R. E.

    2015-01-01

    The interactions between the solar wind and Moon-sized objects are determined by a set of the solar wind parameters and plasma environment of the space objects. The orientation of upstream magnetic field is one of the key factors which determines the formation and structure of bow shock wave/Mach cone or Alfven wing near the obstacle. The study of effects of the direction of the upstream magnetic field on lunar-like plasma environment is the main subject of our investigation in this paper. Photoionization, electron-impact ionization and charge exchange are included in our hybrid model. The computational model includes the self-consistent dynamics of the light (hydrogen (+), helium (+)) and heavy (sodium (+)) pickup ions. The lunar interior is considered as a weakly conducting body. Our previous 2013 lunar work, as reported in this journal, found formation of a triple structure of the Mach cone near the Moon in the case of perpendicular upstream magnetic field. Further advances in modeling now reveal the presence of strong wave activity in the upstream solar wind and plasma wake in the cases of quasiparallel and parallel upstream magnetic fields. However, little wave activity is found for the opposite case with a perpendicular upstream magnetic field. The modeling does not show a formation of the Mach cone in the case of theta(Sub B,U) approximately equal to 0 degrees.

  20. Development of an Artificial Intelligence Programming Course and Unity3d Based Framework to Motivate Learning in Artistic Minded Students

    DEFF Research Database (Denmark)

    Reng, Lars

    2012-01-01

    between technical and artistic minded students is, however, increased once the students reach the sixth semester. The complex algorithms of the artificial intelligence course seemed to demotivate the artistic minded students even before the course began. This paper will present the extensive changes made...... to the sixth semester artificial intelligence programming course, in order to provide a highly motivating direct visual feedback, and thereby remove the steep initial learning curve for artistic minded students. The framework was developed with close dialog to both the game industry and experienced master...

  1. Development of an Artificial Intelligence Programming Course and Unity3d Based Framework to Motivate Learning in Artistic Minded Students

    DEFF Research Database (Denmark)

    Reng, Lars

    2012-01-01

    between technical and artistic minded students is, however, increased once the students reach the sixth semester. The complex algorithms of the artificial intelligence course seemed to demotivate the artistic minded students even before the course began. This paper will present the extensive changes made...... to the sixth semester artificial intelligence programming course, in order to provide a highly motivating direct visual feedback, and thereby remove the steep initial learning curve for artistic minded students. The framework was developed with close dialog to both the game industry and experienced master...

  2. Processor Allocation for Optimistic Parallelization of Irregular Programs

    CERN Document Server

    Versaci, Francesco

    2012-01-01

    Optimistic parallelization is a promising approach for the parallelization of irregular algorithms: potentially interfering tasks are launched dynamically, and the runtime system detects conflicts between concurrent activities, aborting and rolling back conflicting tasks. However, parallelism in irregular algorithms is very complex. In a regular algorithm like dense matrix multiplication, the amount of parallelism can usually be expressed as a function of the problem size, so it is reasonably straightforward to determine how many processors should be allocated to execute a regular algorithm of a certain size (this is called the processor allocation problem). In contrast, parallelism in irregular algorithms can be a function of input parameters, and the amount of parallelism can vary dramatically during the execution of the irregular algorithm. Therefore, the processor allocation problem for irregular algorithms is very difficult. In this paper, we describe the first systematic strategy for addressing this pro...

  3. Exploiting variability for energy optimization of parallel programs

    Energy Technology Data Exchange (ETDEWEB)

    Lavrijsen, Wim [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Iancu, Costin [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); de Jong, Wibe [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Chen, Xin [Georgia Inst. of Technology, Atlanta, GA (United States); Schwan, Karsten [Georgia Inst. of Technology, Atlanta, GA (United States)

    2016-04-18

    Here in this paper we present optimizations that use DVFS mechanisms to reduce the total energy usage in scientific applications. Our main insight is that noise is intrinsic to large scale parallel executions and it appears whenever shared resources are contended. The presence of noise allows us to identify and manipulate any program regions amenable to DVFS. When compared to previous energy optimizations that make per core decisions using predictions of the running time, our scheme uses a qualitative approach to recognize the signature of executions amenable to DVFS. By recognizing the "shape of variability" we can optimize codes with highly dynamic behavior, which pose challenges to all existing DVFS techniques. We validate our approach using offline and online analyses for one-sided and two-sided communication paradigms. We have applied our methods to NWChem, and we show best case improvements in energy use of 12% at no loss in performance when using online optimizations running on 720 Haswell cores with one-sided communication. With NWChem on MPI two-sided and offline analysis, capturing the initialization, we find energy savings of up to 20%, with less than 1% performance cost.

  4. MulticoreBSP for C : A high-performance library for shared-memory parallel programming

    NARCIS (Netherlands)

    Yzelman, A. N.; Bisseling, R. H.; Roose, D.; Meerbergen, K.

    2014-01-01

    The bulk synchronous parallel (BSP) model, as well as parallel programming interfaces based on BSP, classically target distributed-memory parallel architectures. In earlier work, Yzelman and Bisseling designed a MulticoreBSP for Java library specifically for shared-memory architectures. In the prese

  5. 3D immersive and interactive learning

    CERN Document Server

    Cai, Yiyu

    2014-01-01

    This book reviews innovative uses of 3D for immersive and interactive learning, covering gifted programs, normal stream and special needs education. Reports on curriculum-based 3D learning in classrooms, and co-curriculum-based 3D student research projects.

  6. TEHNOLOGIJE 3D TISKALNIKOV

    OpenAIRE

    Kolar, Nataša

    2016-01-01

    Diplomsko delo predstavi razvoj tiskanja skozi čas. Podrobneje so opisani 3D tiskalniki, ki uporabljajo različne tehnologije 3D tiskanja. Predstavljene so različne tehnologije 3D tiskanja, njihova uporaba in narejeni prototipi oz. končni izdelki. Diplomsko delo opiše celoten postopek, od zamisli, priprave podatkov in tiskalnika do izdelave prototipa oz. končnega izdelka.

  7. KT3D_H2O: a program for kriging water level data using hydrologic drift terms.

    Science.gov (United States)

    Karanovic, Marinko; Tonkin, Matthew; Wilson, David

    2009-01-01

    It is often necessary to estimate the zone of contribution to, or the capture zone developed by, pumped wells: for example, when evaluating pump-and-treat remedies and when developing wellhead protection areas for supply wells. Tonkin and Larson (2002) and Brochu and Marcotte (2003) describe a mapping-based method for estimating the capture zone of pumped wells, developed by combining universal kriging (kriging with a trend) with analytical expressions that describe the response of the potentiometric surface to certain applied stresses. This Methods Note describes (a) expansions to the technique described by Tonkin and Larson (2002); (b) the concept of the capture frequency map (CFM), a technique that combines information from multiple capture zone maps into a single depiction of capture; (c) the development of a graphical user interface to facilitate the use of the methods described; and (d) the integration of these programs within the MapWindow geographic information system environment. An example application is presented that illustrates ground water level contours, capture zones, and a CFM prepared using the methods and software described.

  8. HPF: a data parallel programming interface for large-scale numerical simulations

    Energy Technology Data Exchange (ETDEWEB)

    Seo, Yoshiki; Suehiro, Kenji; Murai, Hitoshi [NEC Corp., Tokyo (Japan)

    1998-03-01

    HPF (High Performance Fortran) is a data parallel language designed for programming on distributed memory parallel systems. The first draft of HPF1.0 was defined in 1993 as a de facto standard language. Recently, relatively reliable HPF compilers have become available on several distributed memory parallel systems. Many projects to parallelize real world programs have started mainly in the U.S. and Europe, and the weak and strong points in the current HPF have been made clear. In this paper, major data transfer patterns required to parallelize numerical simulations, such as SHIFT, matrix transposition, reduction, GATHER/SCATTER and irregular communication, and the programming methods to implement them with HPF are described. The problems in the current HPF interface for developing efficient parallel programs and recent activities to deal with them is presented as well. (author)

  9. VPython: Python plus Animations in Stereo 3D

    Science.gov (United States)

    Sherwood, Bruce

    2004-03-01

    Python is a modern object-oriented programming language. VPython (http://vpython.org) is a combination of Python (http://python.org), the Numeric module from LLNL (http://www.pfdubois.com/numpy), and the Visual module created by David Scherer, all of which have been under continuous development as open source projects. VPython makes it easy to write programs that generate real-time, navigable 3D animations. The Visual module includes a set of 3D objects (sphere, cylinder, arrow, etc.), tools for creating other shapes, and support for vector algebra. The 3D renderer runs in a parallel thread, and animations are produced as a side effect of computations, freeing the programmer to concentrate on the physics. Applications include educational and research visualization. In the Fall of 2003 Hugh Fisher at the Australian National University, John Zelle at Wartburg College, and I contributed to a new stereo capability of VPython. By adding a single statement to an existing VPython program, animations can be viewed in true stereo 3D. One can choose several modes: active shutter glasses, passive polarized glasses, or colored glasses (e.g. red-cyan). The talk will demonstrate the new stereo capability and discuss the pros and cons of various schemes for display of stereo 3D for a large audience. Supported in part by NSF grant DUE-0237132.

  10. OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems.

    Science.gov (United States)

    Stone, John E; Gohara, David; Shi, Guochun

    2010-05-01

    We provide an overview of the key architectural features of recent microprocessor designs and describe the programming model and abstractions provided by OpenCL, a new parallel programming standard targeting these architectures.

  11. 3D virtuel udstilling

    DEFF Research Database (Denmark)

    Tournay, Bruno; Rüdiger, Bjarne

    2006-01-01

    3d digital model af Arkitektskolens gård med virtuel udstilling af afgangsprojekter fra afgangen sommer 2006. 10 s.......3d digital model af Arkitektskolens gård med virtuel udstilling af afgangsprojekter fra afgangen sommer 2006. 10 s....

  12. A Verified Integration of Imperative Parallel Programming Paradigms in an Object-Oriented Language

    OpenAIRE

    Sivilotti, Paul

    1993-01-01

    CC++ is a parallel object-oriented programming language that uses parallel composition, atomic functions, and single- assignment variables to express concurrency. We show that this programming paradigm is equivalent to several traditional imperative communication and synchronization models, namely: semaphores, monitors, and asynchronous channels. A collection of libraries which integrates these traditional models with CC++ is specified, implemented, and formally verified.

  13. FastScript3D - A Companion to Java 3D

    Science.gov (United States)

    Koenig, Patti

    2005-01-01

    FastScript3D is a computer program, written in the Java 3D(TM) programming language, that establishes an alternative language that helps users who lack expertise in Java 3D to use Java 3D for constructing three-dimensional (3D)-appearing graphics. The FastScript3D language provides a set of simple, intuitive, one-line text-string commands for creating, controlling, and animating 3D models. The first word in a string is the name of a command; the rest of the string contains the data arguments for the command. The commands can also be used as an aid to learning Java 3D. Developers can extend the language by adding custom text-string commands. The commands can define new 3D objects or load representations of 3D objects from files in formats compatible with such other software systems as X3D. The text strings can be easily integrated into other languages. FastScript3D facilitates communication between scripting languages [which enable programming of hyper-text markup language (HTML) documents to interact with users] and Java 3D. The FastScript3D language can be extended and customized on both the scripting side and the Java 3D side.

  14. Martian terrain - 3D

    Science.gov (United States)

    1997-01-01

    This area of terrain near the Sagan Memorial Station was taken on Sol 3 by the Imager for Mars Pathfinder (IMP). 3D glasses are necessary to identify surface detail.The IMP is a stereo imaging system with color capability provided by 24 selectable filters -- twelve filters per 'eye.' It stands 1.8 meters above the Martian surface, and has a resolution of two millimeters at a range of two meters.Mars Pathfinder is the second in NASA's Discovery program of low-cost spacecraft with highly focused science goals. The Jet Propulsion Laboratory, Pasadena, CA, developed and manages the Mars Pathfinder mission for NASA's Office of Space Science, Washington, D.C. JPL is an operating division of the California Institute of Technology (Caltech). The Imager for Mars Pathfinder (IMP) was developed by the University of Arizona Lunar and Planetary Laboratory under contract to JPL. Peter Smith is the Principal Investigator.Click below to see the left and right views individually. [figure removed for brevity, see original site] Left [figure removed for brevity, see original site] Right

  15. Blender 3D cookbook

    CERN Document Server

    Valenza, Enrico

    2015-01-01

    This book is aimed at the professionals that already have good 3D CGI experience with commercial packages and have now decided to try the open source Blender and want to experiment with something more complex than the average tutorials on the web. However, it's also aimed at the intermediate Blender users who simply want to go some steps further.It's taken for granted that you already know how to move inside the Blender interface, that you already have 3D modeling knowledge, and also that of basic 3D modeling and rendering concepts, for example, edge-loops, n-gons, or samples. In any case, it'

  16. A simple and efficient explicit parallelization of logic programs using low-level threading primitives

    CERN Document Server

    Saha, Diptikalyan

    2009-01-01

    In this work, we present an automatic way to parallelize logic programs for finding all the answers to queries using a transformation to low level threading primitives. Although much work has been done in parallelization of logic programming more than a decade ago (e.g., Aurora, Muse, YapOR), the current state of parallelizing logic programs is still very poor. This work presents a way for parallelism of tabled logic programs in XSB Prolog under the well founded semantics. An important contribution of this work relies in merging answer-tables from multiple children threads without incurring copying or full-sharing and synchronization of data-structures. The implementation of the parent-children shared answer-tables surpasses in efficiency all the other data-structures currently implemented for completion of answers in parallelization using multi-threading. The transformation and its lower-level answer merging predicates were implemented as an extension to the XSB system.

  17. 3D Digital Modelling

    DEFF Research Database (Denmark)

    Hundebøl, Jesper

    wave of new building information modelling tools demands further investigation, not least because of industry representatives' somewhat coarse parlance: Now the word is spreading -3D digital modelling is nothing less than a revolution, a shift of paradigm, a new alphabet... Research qeustions. Based...... on empirical probes (interviews, observations, written inscriptions) within the Danish construction industry this paper explores the organizational and managerial dynamics of 3D Digital Modelling. The paper intends to - Illustrate how the network of (non-)human actors engaged in the promotion (and arrest) of 3......D Modelling (in Denmark) stabilizes - Examine how 3D Modelling manifests itself in the early design phases of a construction project with a view to discuss the effects hereof for i.a. the management of the building process. Structure. The paper introduces a few, basic methodological concepts...

  18. DELTA 3D PRINTER

    Directory of Open Access Journals (Sweden)

    ȘOVĂILĂ Florin

    2016-07-01

    Full Text Available 3D printing is a very used process in industry, the generic name being “rapid prototyping”. The essential advantage of a 3D printer is that it allows the designers to produce a prototype in a very short time, which is tested and quickly remodeled, considerably reducing the required time to get from the prototype phase to the final product. At the same time, through this technique we can achieve components with very precise forms, complex pieces that, through classical methods, could have been accomplished only in a large amount of time. In this paper, there are presented the stages of a 3D model execution, also the physical achievement after of a Delta 3D printer after the model.

  19. Professional Papervision3D

    CERN Document Server

    Lively, Michael

    2010-01-01

    Professional Papervision3D describes how Papervision3D works and how real world applications are built, with a clear look at essential topics such as building websites and games, creating virtual tours, and Adobe's Flash 10. Readers learn important techniques through hands-on applications, and build on those skills as the book progresses. The companion website contains all code examples, video step-by-step explanations, and a collada repository.

  20. AE3D

    Energy Technology Data Exchange (ETDEWEB)

    2016-06-20

    AE3D solves for the shear Alfven eigenmodes and eigenfrequencies in a torodal magnetic fusion confinement device. The configuration can be either 2D (e.g. tokamak, reversed field pinch) or 3D (e.g. stellarator, helical reversed field pinch, tokamak with ripple). The equations solved are based on a reduced MHD model and sound wave coupling effects are not currently included.

  1. Feasibility of 3D Partially Parallel Acquisition DCE MRI in Pulmonary Parenchyma Perfusion%三维并行采集动态增强MRI在肺实质局部灌注中的应用研究

    Institute of Scientific and Technical Information of China (English)

    夏艺; 范丽; 刘士远; 管宇; 徐雪原; 于红; 肖湘生

    2012-01-01

    目的 评价3D并行采集动态对比增强MRI(dynamic contrast-enhanced MRI,DCE-MRI)技术对肺实质局部灌注成像的可行性.资料与方法 采用GE 1.5 T MRI系统,对10名健康志愿者及47例肺部疾病患者行灌注成像;评价肺灌注图像的均匀度,若存在灌注异常区域则计算其与正常肺组织的信号强度之比( RSI).结果 DCE-MRI可以清楚地显示肺实质灌注情况:10名健康志愿者的灌注图像较均匀,未见灌注缺损区.10例肺动脉栓塞( pulmonary embolism,PE)共出现12个楔形灌注缺损区,其中1例双侧PE出现3个灌注缺损区;12例侵犯邻近肺动脉的肺癌,在相应供血区均出现灌注缺损;RSI经单样本t检验差异具有明显的统计学意义(t=-24.74,P<0.05);另25例(20例未侵犯邻近肺动脉的肺癌和5例炎性病变)在对比剂首过肺实质强化达峰值时,病灶局部均呈低信号改变.结论 3D并行采集DCE-MRI技术可在单次屏气状态下完成动态多期扫描,获得全肺的容积灌注成像数据,对MR肺灌注图像采用半量化分析可明显区分出灌注异常区与灌注正常区.%Objective To assess the feasibility of 3 D partially parallel acquisition dynamic contrast enhanced (DCE) MRI in pulmonary parenchyma perfusion. Materials and Methods Ten healthy volunteers and 47 patients with lung disease performed perfusion imaging on a clinical 1. 5-T GE Excite HD whole body system. The homogeneity of perfusion images were assessed. In case of perfusion abnormality, the signal intensity ratio ( RSI) of perfusion abnormality and normal lung were calculated. Results Pulmonary parenchyma perfusion was well depicted with DCE-MRI. The perfusion images of healthy volunteers were homogeneous. 12 wedge shaped perfusion defects were visualized in 10 patients with pulmonary embolisms. 12 perfusion defects were also showed in 12 patients with lung cancer infiltrating the pulmonary artery. There was significant difference in RSI (t = - 24

  2. Declarative Parallel Programming in Spreadsheet End-User Development

    DEFF Research Database (Denmark)

    Biermann, Florian

    2016-01-01

    Spreadsheets are first-order functional languages and are widely used in research and industry as a tool to conveniently perform all kinds of computations. Because cells on a spreadsheet are immutable, there are possibilities for implicit parallelization of spreadsheet computations. In this liter......Spreadsheets are first-order functional languages and are widely used in research and industry as a tool to conveniently perform all kinds of computations. Because cells on a spreadsheet are immutable, there are possibilities for implicit parallelization of spreadsheet computations...

  3. Machine and Collection Abstractions for User-Implemented Data-Parallel Programming

    Directory of Open Access Journals (Sweden)

    Magne Haveraaen

    2000-01-01

    Full Text Available Data parallelism has appeared as a fruitful approach to the parallelisation of compute-intensive programs. Data parallelism has the advantage of mimicking the sequential (and deterministic structure of programs as opposed to task parallelism, where the explicit interaction of processes has to be programmed. In data parallelism data structures, typically collection classes in the form of large arrays, are distributed on the processors of the target parallel machine. Trying to extract distribution aspects from conventional code often runs into problems with a lack of uniformity in the use of the data structures and in the expression of data dependency patterns within the code. Here we propose a framework with two conceptual classes, Machine and Collection. The Machine class abstracts hardware communication and distribution properties. This gives a programmer high-level access to the important parts of the low-level architecture. The Machine class may readily be used in the implementation of a Collection class, giving the programmer full control of the parallel distribution of data, as well as allowing normal sequential implementation of this class. Any program using such a collection class will be parallelisable, without requiring any modification, by choosing between sequential and parallel versions at link time. Experiments with a commercial application, built using the Sophus library which uses this approach to parallelisation, show good parallel speed-ups, without any adaptation of the application program being needed.

  4. Molecular dynamics simulation on a network of workstations using a machine-independent parallel programming language.

    OpenAIRE

    1991-01-01

    Molecular dynamics simulations investigate local and global motion in molecules. Several parallel computing approaches have been taken to attack the most computationally expensive phase of molecular simulations, the evaluation of long range interactions. This paper develops a straightforward but effective algorithm for molecular dynamics simulations using the machine-independent parallel programming language, Linda. The algorithm was run both on a shared memory parallel computer and on a netw...

  5. The Effect of Parallel Programming Languages on the Performance and Energy Consumption of HPC Applications

    Directory of Open Access Journals (Sweden)

    Muhammad Aqib

    2016-02-01

    Full Text Available Big and complex applications need many resources and long computation time to execute sequentially. In this scenario, all application's processes are handled in sequential fashion even if they are independent of each other. In high- performance computing environment, multiple processors are available to running applications in parallel. So mutually independent blocks of codes could run in parallel. This approach not only increases the efficiency of the system without affecting the results but also saves a significant amount of energy. Many parallel programming models or APIs like Open MPI, Open MP, CUDA, etc. are available to running multiple instructions in parallel. In this paper, the efficiency and energy consumption of two known tasks i.e. matrix multiplication and quicksort are analyzed using different parallel programming models and a multiprocessor machine. The obtained results, which can be generalized, outline the effect of choosing a programming model on the efficiency and energy consumption when running different codes on different machines.

  6. CRBLASTER: a fast parallel-processing program for cosmic ray rejection

    Science.gov (United States)

    Mighell, Kenneth J.

    2008-08-01

    Many astronomical image-analysis programs are based on algorithms that can be described as being embarrassingly parallel, where the analysis of one subimage generally does not affect the analysis of another subimage. Yet few parallel-processing astrophysical image-analysis programs exist that can easily take full advantage of todays fast multi-core servers costing a few thousands of dollars. A major reason for the shortage of state-of-the-art parallel-processing astrophysical image-analysis codes is that the writing of parallel codes has been perceived to be difficult. I describe a new fast parallel-processing image-analysis program called crblaster which does cosmic ray rejection using van Dokkum's L.A.Cosmic algorithm. crblaster is written in C using the industry standard Message Passing Interface (MPI) library. Processing a single 800×800 HST WFPC2 image takes 1.87 seconds using 4 processes on an Apple Xserve with two dual-core 3.0-GHz Intel Xeons; the efficiency of the program running with the 4 processors is 82%. The code can be used as a software framework for easy development of parallel-processing image-anlaysis programs using embarrassing parallel algorithms; the biggest required modification is the replacement of the core image processing function with an alternative image-analysis function based on a single-processor algorithm. I describe the design, implementation and performance of the program.

  7. General design method for 3-dimensional, potential flow fields. Part 2: Computer program DIN3D1 for simple, unbranched ducts

    Science.gov (United States)

    Stanitz, J. D.

    1985-01-01

    The general design method for three-dimensional, potential, incompressible or subsonic-compressible flow developed in part 1 of this report is applied to the design of simple, unbranched ducts. A computer program, DIN3D1, is developed and five numerical examples are presented: a nozzle, two elbows, an S-duct, and the preliminary design of a side inlet for turbomachines. The two major inputs to the program are the upstream boundary shape and the lateral velocity distribution on the duct wall. As a result of these inputs, boundary conditions are overprescribed and the problem is ill posed. However, it appears that there are degrees of compatibility between these two major inputs and that, for reasonably compatible inputs, satisfactory solutions can be obtained. By not prescribing the shape of the upstream boundary, the problem presumably becomes well posed, but it is not clear how to formulate a practical design method under this circumstance. Nor does it appear desirable, because the designer usually needs to retain control over the upstream (or downstream) boundary shape. The problem is further complicated by the fact that, unlike the two-dimensional case, and irrespective of the upstream boundary shape, some prescribed lateral velocity distributions do not have proper solutions.

  8. Exploiting Vector and Multicore Parallelsim for Recursive, Data- and Task-Parallel Programs

    Energy Technology Data Exchange (ETDEWEB)

    Ren, Bin; Krishnamoorthy, Sriram; Agrawal, Kunal; Kulkarni, Milind

    2017-01-26

    Modern hardware contains parallel execution resources that are well-suited for data-parallelism-vector units-and task parallelism-multicores. However, most work on parallel scheduling focuses on one type of hardware or the other. In this work, we present a scheduling framework that allows for a unified treatment of task- and data-parallelism. Our key insight is an abstraction, task blocks, that uniformly handles data-parallel iterations and task-parallel tasks, allowing them to be scheduled on vector units or executed independently as multicores. Our framework allows us to define schedulers that can dynamically select between executing task- blocks on vector units or multicores. We show that these schedulers are asymptotically optimal, and deliver the maximum amount of parallelism available in computation trees. To evaluate our schedulers, we develop program transformations that can convert mixed data- and task-parallel pro- grams into task block-based programs. Using a prototype instantiation of our scheduling framework, we show that, on an 8-core system, we can simultaneously exploit vector and multicore parallelism to achieve 14×-108× speedup over sequential baselines.

  9. 3D Projection Installations

    DEFF Research Database (Denmark)

    Halskov, Kim; Johansen, Stine Liv; Bach Mikkelsen, Michelle

    2014-01-01

    Three-dimensional projection installations are particular kinds of augmented spaces in which a digital 3-D model is projected onto a physical three-dimensional object, thereby fusing the digital content and the physical object. Based on interaction design research and media studies, this article...... contributes to the understanding of the distinctive characteristics of such a new medium, and identifies three strategies for designing 3-D projection installations: establishing space; interplay between the digital and the physical; and transformation of materiality. The principal empirical case, From...... Fingerplan to Loop City, is a 3-D projection installation presenting the history and future of city planning for the Copenhagen area in Denmark. The installation was presented as part of the 12th Architecture Biennale in Venice in 2010....

  10. 3D Spectroscopic Instrumentation

    CERN Document Server

    Bershady, Matthew A

    2009-01-01

    In this Chapter we review the challenges of, and opportunities for, 3D spectroscopy, and how these have lead to new and different approaches to sampling astronomical information. We describe and categorize existing instruments on 4m and 10m telescopes. Our primary focus is on grating-dispersed spectrographs. We discuss how to optimize dispersive elements, such as VPH gratings, to achieve adequate spectral resolution, high throughput, and efficient data packing to maximize spatial sampling for 3D spectroscopy. We review and compare the various coupling methods that make these spectrographs ``3D,'' including fibers, lenslets, slicers, and filtered multi-slits. We also describe Fabry-Perot and spatial-heterodyne interferometers, pointing out their advantages as field-widened systems relative to conventional, grating-dispersed spectrographs. We explore the parameter space all these instruments sample, highlighting regimes open for exploitation. Present instruments provide a foil for future development. We give an...

  11. Radiochromic 3D Detectors

    Science.gov (United States)

    Oldham, Mark

    2015-01-01

    Radiochromic materials exhibit a colour change when exposed to ionising radiation. Radiochromic film has been used for clinical dosimetry for many years and increasingly so recently, as films of higher sensitivities have become available. The two principle advantages of radiochromic dosimetry include greater tissue equivalence (radiologically) and the lack of requirement for development of the colour change. In a radiochromic material, the colour change arises direct from ionising interactions affecting dye molecules, without requiring any latent chemical, optical or thermal development, with important implications for increased accuracy and convenience. It is only relatively recently however, that 3D radiochromic dosimetry has become possible. In this article we review recent developments and the current state-of-the-art of 3D radiochromic dosimetry, and the potential for a more comprehensive solution for the verification of complex radiation therapy treatments, and 3D dose measurement in general.

  12. GeoGebra 3D from the Perspectives of Elementary Pre-Service Mathematics Teachers Who Are Familiar with a Number of Software Programs

    Science.gov (United States)

    Baltaci, Serdal; Yildiz, Avni

    2015-01-01

    Each new version of the GeoGebra dynamic mathematics software goes through updates and innovations. One of these innovations is the GeoGebra 5.0 version. This version aims to facilitate 3D instruction by offering opportunities for students to analyze 3D objects. While scanning the previous studies of GeoGebra 3D, it is seen that they mainly focus…

  13. Cost Reduction Through the Use of Additive Manufacturing (3d Printing) and Collaborative Product Life Cycle Management Technologies to Enhance the Navy’s Maintenance Programs

    Science.gov (United States)

    2013-09-01

    Objet (Israel), Polymers , Prototyping 3D Systems (US), Solidscape (US) 3D Systems (US), Polymers , Metals, Prototyping, ExOne (US), Casting Molds...Voxeljet (Germany) Direct Part Stratasys (US), Bits from Bytes, RepRap Polymers Prototyping EOS (Germany), 3D Systems (US), Polymers , Prototyping, Arcam...and typical markets Vat Photopolymerization Material Jetting Binder Jetting Material Extrusion Powder Bed Fusion Sheet Lamination Directed Energy

  14. Exploration Of Deep Learning Algorithms Using Openacc Parallel Programming Model

    KAUST Repository

    Hamam, Alwaleed A.

    2017-03-13

    Deep learning is based on a set of algorithms that attempt to model high level abstractions in data. Specifically, RBM is a deep learning algorithm that used in the project to increase it\\'s time performance using some efficient parallel implementation by OpenACC tool with best possible optimizations on RBM to harness the massively parallel power of NVIDIA GPUs. GPUs development in the last few years has contributed to growing the concept of deep learning. OpenACC is a directive based ap-proach for computing where directives provide compiler hints to accelerate code. The traditional Restricted Boltzmann Ma-chine is a stochastic neural network that essentially perform a binary version of factor analysis. RBM is a useful neural net-work basis for larger modern deep learning model, such as Deep Belief Network. RBM parameters are estimated using an efficient training method that called Contrastive Divergence. Parallel implementation of RBM is available using different models such as OpenMP, and CUDA. But this project has been the first attempt to apply OpenACC model on RBM.

  15. Interaktiv 3D design

    DEFF Research Database (Denmark)

    Villaume, René Domine; Ørstrup, Finn Rude

    2002-01-01

    Projektet undersøger potentialet for interaktiv 3D design via Internettet. Arkitekt Jørn Utzons projekt til Espansiva blev udviklet som et byggesystem med det mål, at kunne skabe mangfoldige planmuligheder og mangfoldige facade- og rumudformninger. Systemets bygningskomponenter er digitaliseret som...... 3D elementer og gjort tilgængelige. Via Internettet er det nu muligt at sammenstille og afprøve en uendelig  række bygningstyper som  systemet blev tænkt og udviklet til....

  16. Computer simulation program for parallel SITAN. [Sandia Inertia Terrain-Aided Navigation, in FORTRAN

    Energy Technology Data Exchange (ETDEWEB)

    Andreas, R.D.; Sheives, T.C.

    1980-11-01

    This computer program simulates the operation of parallel SITAN using digitized terrain data. An actual trajectory is modeled including the effects of inertial navigation errors and radar altimeter measurements.

  17. F-Nets and Software Cabling: Deriving a Formal Model and Language for Portable Parallel Programming

    Science.gov (United States)

    DiNucci, David C.; Saini, Subhash (Technical Monitor)

    1998-01-01

    Parallel programming is still being based upon antiquated sequence-based definitions of the terms "algorithm" and "computation", resulting in programs which are architecture dependent and difficult to design and analyze. By focusing on obstacles inherent in existing practice, a more portable model is derived here, which is then formalized into a model called Soviets which utilizes a combination of imperative and functional styles. This formalization suggests more general notions of algorithm and computation, as well as insights into the meaning of structured programming in a parallel setting. To illustrate how these principles can be applied, a very-high-level graphical architecture-independent parallel language, called Software Cabling, is described, with many of the features normally expected from today's computer languages (e.g. data abstraction, data parallelism, and object-based programming constructs).

  18. Resolutions of the Coulomb operator: VIII. Parallel implementation using the modern programming language X10.

    Science.gov (United States)

    Limpanuparb, Taweetham; Milthorpe, Josh; Rendell, Alistair P

    2014-10-30

    Use of the modern parallel programming language X10 for computing long-range Coulomb and exchange interactions is presented. By using X10, a partitioned global address space language with support for task parallelism and the explicit representation of data locality, the resolution of the Ewald operator can be parallelized in a straightforward manner including use of both intranode and internode parallelism. We evaluate four different schemes for dynamic load balancing of integral calculation using X10's work stealing runtime, and report performance results for long-range HF energy calculation of large molecule/high quality basis running on up to 1024 cores of a high performance cluster machine.

  19. 3D Wire 2015

    DEFF Research Database (Denmark)

    Jordi, Moréton; F, Escribano; J. L., Farias

    This document is a general report on the implementation of gamification in 3D Wire 2015 event. As the second gamification experience in this event, we have delved deeply in the previous objectives (attracting public areas less frequented exhibition in previous years and enhance networking) and ha......, improves socialization and networking, improves media impact, improves fun factor and improves encouragement of the production team....

  20. Shaping 3-D boxes

    DEFF Research Database (Denmark)

    Stenholt, Rasmus; Madsen, Claus B.

    2011-01-01

    Enabling users to shape 3-D boxes in immersive virtual environments is a non-trivial problem. In this paper, a new family of techniques for creating rectangular boxes of arbitrary position, orientation, and size is presented and evaluated. These new techniques are based solely on position data...

  1. Tangible 3D Modelling

    DEFF Research Database (Denmark)

    Hejlesen, Aske K.; Ovesen, Nis

    2012-01-01

    This paper presents an experimental approach to teaching 3D modelling techniques in an Industrial Design programme. The approach includes the use of tangible free form models as tools for improving the overall learning. The paper is based on lecturer and student experiences obtained through facil...

  2. Massively parallel code named NEPTUNE for 3D fully electromagnetic and PIC simulations%3维全电磁粒子模拟大规模并行程序NEPTUNE

    Institute of Scientific and Technical Information of China (English)

    董烨; 董志伟; 周海京; 陈虹; 莫则尧; 陈军; 杨温渊; 赵强; 夏芳; 肖丽; 马彦; 廖丽; 孙会芳

    2011-01-01

    介绍了自主编制的3维全电磁粒子模拟大规模并行程序NEPTUNE的基本情况.该程序具备对多种典型高功率微波源器件的3维模拟能力,可以在数百乃至上千个CPU上稳定运行.该程序使用时域有限差分(FDTD)方法更新计算电磁场,采用Buneman-Boris算法更新粒子运动状态,运用质点网格法(PIC)处理粒子与电磁场的耦合关系,最后利用Boris方法求解泊松方程对电场散度进行修正,以确保计算精度.该程序初步具备复杂几何结构建模能力,可以对典型高功率微波器件中常见的一些复杂结构,如任意边界形状的轴对称几何体、正交投影面几何体,慢波结构、耦合孔洞、金属线和曲面薄膜等进行几何建模.该程序将理想导体边界、外加波边界、粒子发射与吸收边界及完全匹配层边界等物理边界应用于几何边界上,实现了数值计算的封闭求解.最后以算例的形式,介绍了使用NEPTUNE程序对磁绝缘线振荡器、相对论返波管、虚阴极振荡器及相对论速调管等典型高功率微波源器件进行的模拟计算情况,验证了模拟计算结果的可靠性,同时给出了并行效率的分布情况.%A massively parallel code named NEPTUNE for 3D fully electromagnetic and particle-in-cell(PIC) simulations is introduced , which can run on the Linux system with hundreds or even thousands of CPUs. NEPTUNE is capable of three-dimensional simulation of various typical high power microwave( HPM) devices. In NEPTUNE code, electromagnetic fields are updated by using finite-difference time-domain (FDTD) method to solve Maxwell equations and particles are moved by using Buneman-Boris method to solve the relativistic Newton-Lorentz equation. The electromagnetic fields and particles are coupled by using linear weighing interpolation PIC method, and the electric field components are corrected by using Boris method to solve the Poisson e-quation in order to ensure charge

  3. On the Performance of the Python Programming Language for Serial and Parallel Scientific Computations

    Directory of Open Access Journals (Sweden)

    Xing Cai

    2005-01-01

    Full Text Available This article addresses the performance of scientific applications that use the Python programming language. First, we investigate several techniques for improving the computational efficiency of serial Python codes. Then, we discuss the basic programming techniques in Python for parallelizing serial scientific applications. It is shown that an efficient implementation of the array-related operations is essential for achieving good parallel performance, as for the serial case. Once the array-related operations are efficiently implemented, probably using a mixed-language implementation, good serial and parallel performance become achievable. This is confirmed by a set of numerical experiments. Python is also shown to be well suited for writing high-level parallel programs.

  4. Concurrent extensions to the FORTRAN language for parallel programming of computational fluid dynamics algorithms

    Science.gov (United States)

    Weeks, Cindy Lou

    1986-01-01

    Experiments were conducted at NASA Ames Research Center to define multi-tasking software requirements for multiple-instruction, multiple-data stream (MIMD) computer architectures. The focus was on specifying solutions for algorithms in the field of computational fluid dynamics (CFD). The program objectives were to allow researchers to produce usable parallel application software as soon as possible after acquiring MIMD computer equipment, to provide researchers with an easy-to-learn and easy-to-use parallel software language which could be implemented on several different MIMD machines, and to enable researchers to list preferred design specifications for future MIMD computer architectures. Analysis of CFD algorithms indicated that extensions of an existing programming language, adaptable to new computer architectures, provided the best solution to meeting program objectives. The CoFORTRAN Language was written in response to these objectives and to provide researchers a means to experiment with parallel software solutions to CFD algorithms on machines with parallel architectures.

  5. Parallel Libraries to support High-Level Programming

    DEFF Research Database (Denmark)

    Larsen, Morten Nørgaard

    model requires the programmer to think a bit differently, but at the same time the implemented algorithms will perform very well, as shown by the initial tests presented. In the second part of this thesis, I will change focus from the CELL-BE architecture to the more traditionally x86 architecture...... of the more exotic though short-lived heterogeneous CELL Broadband Engine (CELL-BE) architecture added to this shift. Furthermore, the use of cluster computers made of commodity hardware and specialized supercomputers have greatly increased in both industry as well as in the academic world. Finally...... as they would be a single machine. In between is a number of tools helping the programmers handle communication, share data, run loops in parallel, handle algorithms mining huge amounts of data etc. Even though most of them do a good job performance-wise, almost all of them require that the programmers learn...

  6. Unoriented 3d TFTs

    CERN Document Server

    Bhardwaj, Lakshya

    2016-01-01

    This paper generalizes two facts about oriented 3d TFTs to the unoriented case. On one hand, it is known that oriented 3d TFTs having a topological boundary condition admit a state-sum construction known as the Turaev-Viro construction. This is related to the string-net construction of fermionic phases of matter. We show how Turaev-Viro construction can be generalized to unoriented 3d TFTs. On the other hand, it is known that the "fermionic" versions of oriented TFTs, known as Spin-TFTs, can be constructed in terms of "shadow" TFTs which are ordinary oriented TFTs with an anomalous Z_2 1-form symmetry. We generalize this correspondence to Pin+ TFTs by showing that they can be constructed in terms of ordinary unoriented TFTs with anomalous Z_2 1-form symmetry having a mixed anomaly with time-reversal symmetry. The corresponding Pin+ TFT does not have any anomaly for time-reversal symmetry however and hence it can be unambiguously defined on a non-orientable manifold. In case a Pin+ TFT admits a topological bou...

  7. Using CLIPS in the domain of knowledge-based massively parallel programming

    Science.gov (United States)

    Dvorak, Jiri J.

    1994-01-01

    The Program Development Environment (PDE) is a tool for massively parallel programming of distributed-memory architectures. Adopting a knowledge-based approach, the PDE eliminates the complexity introduced by parallel hardware with distributed memory and offers complete transparency in respect of parallelism exploitation. The knowledge-based part of the PDE is realized in CLIPS. Its principal task is to find an efficient parallel realization of the application specified by the user in a comfortable, abstract, domain-oriented formalism. A large collection of fine-grain parallel algorithmic skeletons, represented as COOL objects in a tree hierarchy, contains the algorithmic knowledge. A hybrid knowledge base with rule modules and procedural parts, encoding expertise about application domain, parallel programming, software engineering, and parallel hardware, enables a high degree of automation in the software development process. In this paper, important aspects of the implementation of the PDE using CLIPS and COOL are shown, including the embedding of CLIPS with C++-based parts of the PDE. The appropriateness of the chosen approach and of the CLIPS language for knowledge-based software engineering are discussed.

  8. Managing Algorithmic Skeleton Nesting Requirements in Realistic Image Processing Applications: The Case of the SKiPPER-II Parallel Programming Environment's Operating Model

    Directory of Open Access Journals (Sweden)

    Duculty Florent

    2005-01-01

    Full Text Available SKiPPER is a SKeleton-based Parallel Programming EnviRonment being developed since 1996 and running at LASMEA Laboratory, the Blaise-Pascal University, France. The main goal of the project was to demonstrate the applicability of skeleton-based parallel programming techniques to the fast prototyping of reactive vision applications. This paper deals with the special features embedded in the latest version of the project: algorithmic skeleton nesting capabilities and a fully dynamic operating model. Throughout the case study of a complete and realistic image processing application, in which we have pointed out the requirement for skeleton nesting, we are presenting the operating model of this feature. The work described here is one of the few reported experiments showing the application of skeleton nesting facilities for the parallelisation of a realistic application, especially in the area of image processing. The image processing application we have chosen is a 3D face-tracking algorithm from appearance.

  9. Protocol-Based Verification of Message-Passing Parallel Programs

    DEFF Research Database (Denmark)

    López-Acosta, Hugo-Andrés; Eduardo R. B. Marques, Eduardo R. B.; Martins, Francisco

    2015-01-01

    translated into a representation read by VCC, a software verifier for C. We successfully verified several MPI programs in a running time that is independent of the number of processes or other input parameters. This contrasts with alternative techniques, notably model checking and runtime verification...

  10. Guide to development of a scalar massive parallel programming on Paragon

    Energy Technology Data Exchange (ETDEWEB)

    Ueshima, Yutaka; Arakawa, Takuya; Sasaki, Akira [Japan Atomic Energy Research Inst., Neyagawa, Osaka (Japan). Kansai Research Establishment; Yokota, Hisasi

    1998-10-01

    Parallel calculations using more than hundred computers had begun in Japan only several years ago. The Intel Paragon XP/S 15GP256 , 75MP834 were introduced as pioneers in Japan Atomic Energy Research Institute (JAERI) to pursue massive parallel simulations for advanced photon and fusion researches. Recently, large number of parallel programs have been transplanted or newly produced to perform the parallel calculations with those computers. However, these programs are developed based on software technologies for conventional super computer, therefore they sometimes cause troubles in the massive parallel computing. In principle, when programs are developed under different computer and operating system (OS), prudent directions and knowledge are needed. However, integration of knowledge and standardization of environment are quite difficult because number of Paragon system and Paragon`s users are very small in Japan. Therefore, we summarized information which was got through the process of development of a massive parallel program in the Paragon XP/S 75MP834. (author)

  11. Parallelized Solution to Semidefinite Programmings in Quantum Complexity Theory

    CERN Document Server

    Wu, Xiaodi

    2010-01-01

    In this paper we present an equilibrium value based framework for solving SDPs via the multiplicative weight update method which is different from the one in Kale's thesis \\cite{Kale07}. One of the main advantages of the new framework is that we can guarantee the convertibility from approximate to exact feasibility in a much more general class of SDPs than previous result. Another advantage is the design of the oracle which is necessary for applying the multiplicative weight update method is much simplified in general cases. This leads to an alternative and easier solutions to the SDPs used in the previous results \\class{QIP(2)}$\\subseteq$\\class{PSPACE} \\cite{JainUW09} and \\class{QMAM}=\\class{PSPACE} \\cite{JainJUW09}. Furthermore, we provide a generic form of SDPs which can be solved in the similar way. By parallelizing every step in our solution, we are able to solve a class of SDPs in \\class{NC}. Although our motivation is from quantum computing, our result will also apply directly to any SDP which satisfie...

  12. Programming a massively parallel, computation universal system: static behavior

    Energy Technology Data Exchange (ETDEWEB)

    Lapedes, A.; Farber, R.

    1986-01-01

    In previous work by the authors, the ''optimum finding'' properties of Hopfield neural nets were applied to the nets themselves to create a ''neural compiler.'' This was done in such a way that the problem of programming the attractors of one neural net (called the Slave net) was expressed as an optimization problem that was in turn solved by a second neural net (the Master net). In this series of papers that approach is extended to programming nets that contain interneurons (sometimes called ''hidden neurons''), and thus deals with nets capable of universal computation. 22 refs.

  13. Parallelizing Deadlock Resolution in Symbolic Synthesis of Distributed Programs

    Science.gov (United States)

    2008-01-01

    follows. In Sections 2 and 3, we present precise defini- tions for distributed programs, specifications, and fault- tolerance. We formally state the...Subsequently, experimental results and analysis are presented in Section 6. Related work is discussed in Section 7. Finally, we conclude in Section...infinite com- putation by stuttering at sl. On the other hand, if there exists a state sd such that there is no outgoing transition (or a self-loop

  14. Concurrent Programming Using Actors: Exploiting Large-Scale Parallelism,

    Science.gov (United States)

    1985-10-07

    ORGANIZATION NAME AND ADDRESS 10. PROGRAM ELEMENT. PROJECT. TASK* Artificial Inteligence Laboratory AREA Is WORK UNIT NUMBERS 545 Technology Square...D-R162 422 CONCURRENT PROGRMMIZNG USING f"OS XL?ITP TEH l’ LARGE-SCALE PARALLELISH(U) NASI AC E Al CAMBRIDGE ARTIFICIAL INTELLIGENCE L. G AGHA ET AL...RESOLUTION TEST CHART N~ATIONAL BUREAU OF STANDA.RDS - -96 A -E. __ _ __ __’ .,*- - -- •. - MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL

  15. 3D and beyond

    Science.gov (United States)

    Fung, Y. C.

    1995-05-01

    This conference on physiology and function covers a wide range of subjects, including the vasculature and blood flow, the flow of gas, water, and blood in the lung, the neurological structure and function, the modeling, and the motion and mechanics of organs. Many technologies are discussed. I believe that the list would include a robotic photographer, to hold the optical equipment in a precisely controlled way to obtain the images for the user. Why are 3D images needed? They are to achieve certain objectives through measurements of some objects. For example, in order to improve performance in sports or beauty of a person, we measure the form, dimensions, appearance, and movements.

  16. 3D Reconstruction of NMR Images

    Directory of Open Access Journals (Sweden)

    Peter Izak

    2007-01-01

    Full Text Available This paper introduces experiment of 3D reconstruction NMR images scanned from magnetic resonance device. There are described methods which can be used for 3D reconstruction magnetic resonance images in biomedical application. The main idea is based on marching cubes algorithm. For this task was chosen sophistication method by program Vision Assistant, which is a part of program LabVIEW.

  17. Discrete Method of Images for 3D Radio Propagation Modeling

    Science.gov (United States)

    Novak, Roman

    2016-09-01

    Discretization by rasterization is introduced into the method of images (MI) in the context of 3D deterministic radio propagation modeling as a way to exploit spatial coherence of electromagnetic propagation for fine-grained parallelism. Traditional algebraic treatment of bounding regions and surfaces is replaced by computer graphics rendering of 3D reflections and double refractions while building the image tree. The visibility of reception points and surfaces is also resolved by shader programs. The proposed rasterization is shown to be of comparable run time to that of the fundamentally parallel shooting and bouncing rays. The rasterization does not affect the signal evaluation backtracking step, thus preserving its advantage over the brute force ray-tracing methods in terms of accuracy. Moreover, the rendering resolution may be scaled back for a given level of scenario detail with only marginal impact on the image tree size. This allows selection of scene optimized execution parameters for faster execution, giving the method a competitive edge. The proposed variant of MI can be run on any GPU that supports real-time 3D graphics.

  18. Describing, using 'recognition cones'. [parallel-series model with English-like computer program

    Science.gov (United States)

    Uhr, L.

    1973-01-01

    A parallel-serial 'recognition cone' model is examined, taking into account the model's ability to describe scenes of objects. An actual program is presented in an English-like language. The concept of a 'description' is discussed together with possible types of descriptive information. Questions regarding the level and the variety of detail are considered along with approaches for improving the serial representations of parallel systems.

  19. Method for resource control in parallel environments using program organization and run-time support

    Science.gov (United States)

    Ekanadham, Kattamuri (Inventor); Moreira, Jose Eduardo (Inventor); Naik, Vijay Krishnarao (Inventor)

    2001-01-01

    A system and method for dynamic scheduling and allocation of resources to parallel applications during the course of their execution. By establishing well-defined interactions between an executing job and the parallel system, the system and method support dynamic reconfiguration of processor partitions, dynamic distribution and redistribution of data, communication among cooperating applications, and various other monitoring actions. The interactions occur only at specific points in the execution of the program where the aforementioned operations can be performed efficiently.

  20. Task scheduling of parallel programs to optimize communications for cluster of SMPs

    Institute of Scientific and Technical Information of China (English)

    郑纬民; 杨博; 林伟坚; 李志光

    2001-01-01

    This paper discusses the compile time task scheduling of parallel program running on cluster of SMP workstations. Firstly, the problem is stated formally and transformed into a graph partition problem and proved to be NP-Complete. A heuristic algorithm MMP-Solver is then proposed to solve the problem. Experiment result shows that the task scheduling can reduce communication overhead of parallel applications greatly and MMP-Solver outperforms the existing algorithms.

  1. Efficient Parallelization of the Stochastic Dual Dynamic Programming Algorithm Applied to Hydropower Scheduling

    Directory of Open Access Journals (Sweden)

    Arild Helseth

    2015-12-01

    Full Text Available Stochastic dual dynamic programming (SDDP has become a popular algorithm used in practical long-term scheduling of hydropower systems. The SDDP algorithm is computationally demanding, but can be designed to take advantage of parallel processing. This paper presents a novel parallel scheme for the SDDP algorithm, where the stage-wise synchronization point traditionally used in the backward iteration of the SDDP algorithm is partially relaxed. The proposed scheme was tested on a realistic model of a Norwegian water course, proving that the synchronization point relaxation significantly improves parallel efficiency.

  2. 3D Surgical Simulation

    Science.gov (United States)

    Cevidanes, Lucia; Tucker, Scott; Styner, Martin; Kim, Hyungmin; Chapuis, Jonas; Reyes, Mauricio; Proffit, William; Turvey, Timothy; Jaskolka, Michael

    2009-01-01

    This paper discusses the development of methods for computer-aided jaw surgery. Computer-aided jaw surgery allows us to incorporate the high level of precision necessary for transferring virtual plans into the operating room. We also present a complete computer-aided surgery (CAS) system developed in close collaboration with surgeons. Surgery planning and simulation include construction of 3D surface models from Cone-beam CT (CBCT), dynamic cephalometry, semi-automatic mirroring, interactive cutting of bone and bony segment repositioning. A virtual setup can be used to manufacture positioning splints for intra-operative guidance. The system provides further intra-operative assistance with the help of a computer display showing jaw positions and 3D positioning guides updated in real-time during the surgical procedure. The CAS system aids in dealing with complex cases with benefits for the patient, with surgical practice, and for orthodontic finishing. Advanced software tools for diagnosis and treatment planning allow preparation of detailed operative plans, osteotomy repositioning, bone reconstructions, surgical resident training and assessing the difficulties of the surgical procedures prior to the surgery. CAS has the potential to make the elaboration of the surgical plan a more flexible process, increase the level of detail and accuracy of the plan, yield higher operative precision and control, and enhance documentation of cases. Supported by NIDCR DE017727, and DE018962 PMID:20816308

  3. TOWARDS: 3D INTERNET

    Directory of Open Access Journals (Sweden)

    Ms. Swapnali R. Ghadge

    2013-08-01

    Full Text Available In today’s ever-shifting media landscape, it can be a complex task to find effective ways to reach your desired audience. As traditional media such as television continue to lose audience share, one venue in particular stands out for its ability to attract highly motivated audiences and for its tremendous growth potential the 3D Internet. The concept of '3D Internet' has recently come into the spotlight in the R&D arena, catching the attention of many people, and leading to a lot of discussions. Basically, one can look into this matter from a few different perspectives: visualization and representation of information, and creation and transportation of information, among others. All of them still constitute research challenges, as no products or services are yet available or foreseen for the near future. Nevertheless, one can try to envisage the directions that can be taken towards achieving this goal. People who take part in virtual worlds stay online longer with a heightened level of interest. To take advantage of that interest, diverse businesses and organizations have claimed an early stake in this fast-growing market. They include technology leaders such as IBM, Microsoft, and Cisco, companies such as BMW, Toyota, Circuit City, Coca Cola, and Calvin Klein, and scores of universities, including Harvard, Stanford and Penn State.

  4. Generalized recovery algorithm for 3D super-resolution microscopy using rotating point spread functions

    Science.gov (United States)

    Shuang, Bo; Wang, Wenxiao; Shen, Hao; Tauzin, Lawrence J.; Flatebo, Charlotte; Chen, Jianbo; Moringo, Nicholas A.; Bishop, Logan D. C.; Kelly, Kevin F.; Landes, Christy F.

    2016-08-01

    Super-resolution microscopy with phase masks is a promising technique for 3D imaging and tracking. Due to the complexity of the resultant point spread functions, generalized recovery algorithms are still missing. We introduce a 3D super-resolution recovery algorithm that works for a variety of phase masks generating 3D point spread functions. A fast deconvolution process generates initial guesses, which are further refined by least squares fitting. Overfitting is suppressed using a machine learning determined threshold. Preliminary results on experimental data show that our algorithm can be used to super-localize 3D adsorption events within a porous polymer film and is useful for evaluating potential phase masks. Finally, we demonstrate that parallel computation on graphics processing units can reduce the processing time required for 3D recovery. Simulations reveal that, through desktop parallelization, the ultimate limit of real-time processing is possible. Our program is the first open source recovery program for generalized 3D recovery using rotating point spread functions.

  5. CRBLASTER: A Fast Parallel-Processing Program for Cosmic Ray Rejection in Space-Based Observations

    Science.gov (United States)

    Mighell, K.

    Many astronomical image analysis tasks are based on algorithms that can be described as being embarrassingly parallel - where the analysis of one subimage generally does not affect the analysis of another subimage. Yet few parallel-processing astrophysical image-analysis programs exist that can easily take full advantage of today's fast multi-core servers costing a few thousands of dollars. One reason for the shortage of state-of-the-art parallel processing astrophysical image-analysis codes is that the writing of parallel codes has been perceived to be difficult. I describe a new fast parallel-processing image-analysis program called CRBLASTER which does cosmic ray rejection using van Dokkum's L.A.Cosmic algorithm. CRBLASTER is written in C using the industry standard Message Passing Interface library. Processing a single 800 x 800 Hubble Space Telescope Wide-Field Planetary Camera 2 (WFPC2) image takes 1.9 seconds using 4 processors on an Apple Xserve with two dual-core 3.0-GHz Intel Xeons; the efficiency of the program running with the 4 cores is 82%. The code has been designed to be used as a software framework for the easy development of parallel-processing image-analysis programs using embarrassing parallel algorithms; all that needs to be done is to replace the core image processing task (in this case the C function that performs the L.A.Cosmic algorithm) with an alternative image analysis task based on a single processor algorithm. I describe the design and implementation of the program and then discuss how it could possibly be used to quickly do time-critical analysis applications such as those involved with space surveillance or do complex calibration tasks as part of the pipeline processing of images from large focal plane arrays.

  6. A Tool for Performance Modeling of Parallel Programs

    Directory of Open Access Journals (Sweden)

    J.A. González

    2003-01-01

    Full Text Available Current performance prediction analytical models try to characterize the performance behavior of actual machines through a small set of parameters. In practice, substantial deviations are observed. These differences are due to factors as memory hierarchies or network latency. A natural approach is to associate a different proportionality constant with each basic block, and analogously, to associate different latencies and bandwidths with each "communication block". Unfortunately, to use this approach implies that the evaluation of parameters must be done for each algorithm. This is a heavy task, implying experiment design, timing, statistics, pattern recognition and multi-parameter fitting algorithms. Software support is required. We present a compiler that takes as source a C program annotated with complexity formulas and produces as output an instrumented code. The trace files obtained from the execution of the resulting code are analyzed with an interactive interpreter, giving us, among other information, the values of those parameters.

  7. Remote Memory Access: A Case for Portable, Efficient and Library Independent Parallel Programming

    Directory of Open Access Journals (Sweden)

    Alexandros V. Gerbessiotis

    2004-01-01

    Full Text Available In this work we make a strong case for remote memory access (RMA as the effective way to program a parallel computer by proposing a framework that supports RMA in a library independent, simple and intuitive way. If one uses our approach the parallel code one writes will run transparently under MPI-2 enabled libraries but also bulk-synchronous parallel libraries. The advantage of using RMA is code simplicity, reduced programming complexity, and increased efficiency. We support the latter claims by implementing under this framework a collection of benchmark programs consisting of a communication and synchronization performance assessment program, a dense matrix multiplication algorithm, and two variants of a parallel radix-sort algorithm and examine their performance on a LINUX-based PC cluster under three different RMA enabled libraries: LAM MPI, BSPlib, and PUB. We conclude that implementations of such parallel algorithms using RMA communication primitives lead to code that is as efficient as the message-passing equivalent code and in the case of radix-sort substantially more efficient. In addition our work can be used as a comparative study of the relevant capabilities of the three libraries.

  8. 3D-kompositointi

    OpenAIRE

    Piirainen, Jere

    2015-01-01

    Opinnäytetyössä käydään läpi yleisimpiä 3D-kompositointiin liittyviä tekniikoita sekä kompositointiin käytettyjä ohjelmia ja liitännäisiä. Työssä esitellään myös kompositoinnin juuret 1800-luvun lopulta aina nykyaikaiseen digitaaliseen kompositointiin asti. Kompositointi on yksinkertaisimmillaan usean kuvan liittämistä saumattomasti yhdeksi uskottavaksi kokonaisuudeksi. Vaikka prosessi vaatii visuaalista silmää, vaatii se myös paljon teknistä osaamista. Tämän lisäksi perusymmärrys kamera...

  9. Shaping 3-D boxes

    DEFF Research Database (Denmark)

    Stenholt, Rasmus; Madsen, Claus B.

    2011-01-01

    Enabling users to shape 3-D boxes in immersive virtual environments is a non-trivial problem. In this paper, a new family of techniques for creating rectangular boxes of arbitrary position, orientation, and size is presented and evaluated. These new techniques are based solely on position data......, making them different from typical, existing box shaping techniques. The basis of the proposed techniques is a new algorithm for constructing a full box from just three of its corners. The evaluation of the new techniques compares their precision and completion times in a 9 degree-of-freedom (Do......F) docking experiment against an existing technique, which requires the user to perform the rotation and scaling of the box explicitly. The precision of the users' box construction is evaluated by a novel error metric measuring the difference between two boxes. The results of the experiment strongly indicate...

  10. 3D Turtle Graphics” by using a 3D Printer

    Directory of Open Access Journals (Sweden)

    Yasusi Kanada

    2015-04-01

    Full Text Available When creating shapes by using a 3D printer, usually, a static (declarative model designed by using a 3D CAD system is translated to a CAM program and it is sent to the printer. However, widely-used FDM-type 3D printers input a dynamical (procedural program that describes control of motions of the print head and extrusion of the filament. If the program is expressed by using a programming language or a library in a straight manner, solids can be created by a method similar to turtle graphics. An open-source library that enables “turtle 3D printing” method was described by Python and tested. Although this method currently has a problem that it cannot print in the air; however, if this problem is solved by an appropriate method, shapes drawn by 3D turtle graphics freely can be embodied by this method.

  11. High performance parallelism pearls 2 multicore and many-core programming approaches

    CERN Document Server

    Jeffers, Jim

    2015-01-01

    High Performance Parallelism Pearls Volume 2 offers another set of examples that demonstrate how to leverage parallelism. Similar to Volume 1, the techniques included here explain how to use processors and coprocessors with the same programming - illustrating the most effective ways to combine Xeon Phi coprocessors with Xeon and other multicore processors. The book includes examples of successful programming efforts, drawn from across industries and domains such as biomed, genetics, finance, manufacturing, imaging, and more. Each chapter in this edited work includes detailed explanations of t

  12. Scalable parallel programming for high performance seismic simulation on petascale heterogeneous supercomputers

    Science.gov (United States)

    Zhou, Jun

    The 1994 Northridge earthquake in Los Angeles, California, killed 57 people, injured over 8,700 and caused an estimated $20 billion in damage. Petascale simulations are needed in California and elsewhere to provide society with a better understanding of the rupture and wave dynamics of the largest earthquakes at shaking frequencies required to engineer safe structures. As the heterogeneous supercomputing infrastructures are becoming more common, numerical developments in earthquake system research are particularly challenged by the dependence on the accelerator elements to enable "the Big One" simulations with higher frequency and finer resolution. Reducing time to solution and power consumption are two primary focus area today for the enabling technology of fault rupture dynamics and seismic wave propagation in realistic 3D models of the crust's heterogeneous structure. This dissertation presents scalable parallel programming techniques for high performance seismic simulation running on petascale heterogeneous supercomputers. A real world earthquake simulation code, AWP-ODC, one of the most advanced earthquake codes to date, was chosen as the base code in this research, and the testbed is based on Titan at Oak Ridge National Laboraratory, the world's largest hetergeneous supercomputer. The research work is primarily related to architecture study, computation performance tuning and software system scalability. An earthquake simulation workflow has also been developed to support the efficient production sets of simulations. The highlights of the technical development are an aggressive performance optimization focusing on data locality and a notable data communication model that hides the data communication latency. This development results in the optimal computation efficiency and throughput for the 13-point stencil code on heterogeneous systems, which can be extended to general high-order stencil codes. Started from scratch, the hybrid CPU/GPU version of AWP

  13. Empirical valence bond models for reactive potential energy surfaces: a parallel multilevel genetic program approach.

    Science.gov (United States)

    Bellucci, Michael A; Coker, David F

    2011-07-28

    We describe a new method for constructing empirical valence bond potential energy surfaces using a parallel multilevel genetic program (PMLGP). Genetic programs can be used to perform an efficient search through function space and parameter space to find the best functions and sets of parameters that fit energies obtained by ab initio electronic structure calculations. Building on the traditional genetic program approach, the PMLGP utilizes a hierarchy of genetic programming on two different levels. The lower level genetic programs are used to optimize coevolving populations in parallel while the higher level genetic program (HLGP) is used to optimize the genetic operator probabilities of the lower level genetic programs. The HLGP allows the algorithm to dynamically learn the mutation or combination of mutations that most effectively increase the fitness of the populations, causing a significant increase in the algorithm's accuracy and efficiency. The algorithm's accuracy and efficiency is tested against a standard parallel genetic program with a variety of one-dimensional test cases. Subsequently, the PMLGP is utilized to obtain an accurate empirical valence bond model for proton transfer in 3-hydroxy-gamma-pyrone in gas phase and protic solvent.

  14. Analysis of linear measurements on 3D surface models using CBCT data segmentation obtained by automatic standard pre-set thresholds in two segmentation software programs: an in vitro study.

    Science.gov (United States)

    Poleti, Marcelo Lupion; Fernandes, Thais Maria Freire; Pagin, Otávio; Moretti, Marcela Rodrigues; Rubira-Bullen, Izabel Regina Fischer

    2016-01-01

    The aim of this in vitro study was to evaluate the reliability and accuracy of linear measurements on three-dimensional (3D) surface models obtained by standard pre-set thresholds in two segmentation software programs. Ten mandibles with 17 silica markers were scanned for 0.3-mm voxels in the i-CAT Classic (Imaging Sciences International, Hatfield, PA, USA). Twenty linear measurements were carried out by two observers two times on the 3D surface models: the Dolphin Imaging 11.5 (Dolphin Imaging & Management Solutions, Chatsworth, CA, USA), using two filters(Translucent and Solid-1), and in the InVesalius 3.0.0 (Centre for Information Technology Renato Archer, Campinas, SP, Brazil). The physical measurements were made by another observer two times using a digital caliper on the dry mandibles. Excellent intra- and inter-observer reliability for the markers, physical measurements, and 3D surface models were found (intra-class correlation coefficient (ICC) and Pearson's r ≥ 0.91). The linear measurements on 3D surface models by Dolphin and InVesalius software programs were accurate (Dolphin Solid-1 > InVesalius > Dolphin Translucent). The highest absolute and percentage errors were obtained for the variable R1-R1 (1.37 mm) and MF-AC (2.53 %) in the Dolphin Translucent and InVesalius software, respectively. Linear measurements on 3D surface models obtained by standard pre-set thresholds in the Dolphin and InVesalius software programs are reliable and accurate compared with physical measurements. Studies that evaluate the reliability and accuracy of the 3D models are necessary to ensure error predictability and to establish diagnosis, treatment plan, and prognosis in a more realistic way.

  15. Parallel programming of saccades during natural scene viewing: evidence from eye movement positions.

    Science.gov (United States)

    Wu, Esther X W; Gilani, Syed Omer; van Boxtel, Jeroen J A; Amihai, Ido; Chua, Fook Kee; Yen, Shih-Cheng

    2013-10-24

    Previous studies have shown that saccade plans during natural scene viewing can be programmed in parallel. This evidence comes mainly from temporal indicators, i.e., fixation durations and latencies. In the current study, we asked whether eye movement positions recorded during scene viewing also reflect parallel programming of saccades. As participants viewed scenes in preparation for a memory task, their inspection of the scene was suddenly disrupted by a transition to another scene. We examined whether saccades after the transition were invariably directed immediately toward the center or were contingent on saccade onset times relative to the transition. The results, which showed a dissociation in eye movement behavior between two groups of saccades after the scene transition, supported the parallel programming account. Saccades with relatively long onset times (>100 ms) after the transition were directed immediately toward the center of the scene, probably to restart scene exploration. Saccades with short onset times (programming of saccades during scene viewing. Additionally, results from the analyses of intersaccadic intervals were also consistent with the parallel programming hypothesis.

  16. 3D Membrane Imaging and Porosity Visualization

    KAUST Repository

    Sundaramoorthi, Ganesh

    2016-03-03

    Ultrafiltration asymmetric porous membranes were imaged by two microscopy methods, which allow 3D reconstruction: Focused Ion Beam and Serial Block Face Scanning Electron Microscopy. A new algorithm was proposed to evaluate porosity and average pore size in different layers orthogonal and parallel to the membrane surface. The 3D-reconstruction enabled additionally the visualization of pore interconnectivity in different parts of the membrane. The method was demonstrated for a block copolymer porous membrane and can be extended to other membranes with application in ultrafiltration, supports for forward osmosis, etc, offering a complete view of the transport paths in the membrane.

  17. Grid Service Framework:Supporting Multi-Models Parallel Grid Programming

    Institute of Scientific and Technical Information of China (English)

    邓倩妮; 陆鑫达

    2004-01-01

    Web service is a grid computing technology that promises greater ease-of-use and interoperability than previous distributed computing technologies. This paper proposed Group Service Framework, a grid computing platform based on Microsoft. NET that use web service to: (1) locate and harness volunteer computing resources for different applications, and (2) support multi-models such as Master/Slave, Divide and Conquer, Phase Parallel and so forth parallel programming paradigms in Grid environment, (3) allocate data and balance load dynamically and transparently for grid computing application. The Grid Service Framework based on Microsoft. NET was used to implement several simple parallel computing applications. The results show that the proposed Group Service Framework is suitable for generic parallel numerical computing.

  18. Wireless 3D Chocolate Printer

    Directory of Open Access Journals (Sweden)

    FROILAN G. DESTREZA

    2014-02-01

    Full Text Available This study is for the BSHRM Students of Batangas State University (BatStateU ARASOF for the researchers believe that the Wireless 3D Chocolate Printer would be helpful in their degree program especially on making creative, artistic, personalized and decorative chocolate designs. The researchers used the Prototyping model as procedural method for the successful development and implementation of the hardware and software. This method has five phases which are the following: quick plan, quick design, prototype construction, delivery and feedback and communication. This study was evaluated by the BSHRM Students and the assessment of the respondents regarding the software and hardware application are all excellent in terms of Accuracy, Effecitveness, Efficiency, Maintainability, Reliability and User-friendliness. Also, the overall level of acceptability of the design project as evaluated by the respondents is excellent. With regard to the observation about the best raw material to use in 3D printing, the chocolate is good to use as the printed material is slightly distorted,durable and very easy to prepare; the icing is also good to use as the printed material is not distorted and is very durable but consumes time to prepare; the flour is not good as the printed material is distorted, not durable but it is easy to prepare. The computation of the economic viability level of 3d printer with reference to ROI is 37.14%. The recommendation of the researchers in the design project are as follows: adding a cooling system so that the raw material will be more durable, development of a more simplified version and improving the extrusion process wherein the user do not need to stop the printing process just to replace the empty syringe with a new one.

  19. Design for scalability in 3D computer graphics architectures

    DEFF Research Database (Denmark)

    Holten-Lund, Hans Erik

    2002-01-01

    This thesis describes useful methods and techniques for designing scalable hybrid parallel rendering architectures for 3D computer graphics. Various techniques for utilizing parallelism in a pipelines system are analyzed. During the Ph.D study a prototype 3D graphics architecture named Hybris has...... been developed. Hybris is a prototype rendering architeture which can be tailored to many specific 3D graphics applications and implemented in various ways. Parallel software implementations for both single and multi-processor Windows 2000 system have been demonstrated. Working hardware...... as a case study and an application of the Hybris graphics architecture....

  20. Intraoral 3D scanner

    Science.gov (United States)

    Kühmstedt, Peter; Bräuer-Burchardt, Christian; Munkelt, Christoph; Heinze, Matthias; Palme, Martin; Schmidt, Ingo; Hintersehr, Josef; Notni, Gunther

    2007-09-01

    Here a new set-up of a 3D-scanning system for CAD/CAM in dental industry is proposed. The system is designed for direct scanning of the dental preparations within the mouth. The measuring process is based on phase correlation technique in combination with fast fringe projection in a stereo arrangement. The novelty in the approach is characterized by the following features: A phase correlation between the phase values of the images of two cameras is used for the co-ordinate calculation. This works contrary to the usage of only phase values (phasogrammetry) or classical triangulation (phase values and camera image co-ordinate values) for the determination of the co-ordinates. The main advantage of the method is that the absolute value of the phase at each point does not directly determine the coordinate. Thus errors in the determination of the co-ordinates are prevented. Furthermore, using the epipolar geometry of the stereo-like arrangement the phase unwrapping problem of fringe analysis can be solved. The endoscope like measurement system contains one projection and two camera channels for illumination and observation of the object, respectively. The new system has a measurement field of nearly 25mm × 15mm. The user can measure two or three teeth at one time. So the system can by used for scanning of single tooth up to bridges preparations. In the paper the first realization of the intraoral scanner is described.

  1. 3D Printing and 3D Bioprinting in Pediatrics.

    Science.gov (United States)

    Vijayavenkataraman, Sanjairaj; Fuh, Jerry Y H; Lu, Wen Feng

    2017-07-13

    Additive manufacturing, commonly referred to as 3D printing, is a technology that builds three-dimensional structures and components layer by layer. Bioprinting is the use of 3D printing technology to fabricate tissue constructs for regenerative medicine from cell-laden bio-inks. 3D printing and bioprinting have huge potential in revolutionizing the field of tissue engineering and regenerative medicine. This paper reviews the application of 3D printing and bioprinting in the field of pediatrics.

  2. Architecture-Adaptive Computing Environment: A Tool for Teaching Parallel Programming

    Science.gov (United States)

    Dorband, John E.; Aburdene, Maurice F.

    2002-01-01

    Recently, networked and cluster computation have become very popular. This paper is an introduction to a new C based parallel language for architecture-adaptive programming, aCe C. The primary purpose of aCe (Architecture-adaptive Computing Environment) is to encourage programmers to implement applications on parallel architectures by providing them the assurance that future architectures will be able to run their applications with a minimum of modification. A secondary purpose is to encourage computer architects to develop new types of architectures by providing an easily implemented software development environment and a library of test applications. This new language should be an ideal tool to teach parallel programming. In this paper, we will focus on some fundamental features of aCe C.

  3. 3D printing for dummies

    CERN Document Server

    Hausman, Kalani Kirk

    2014-01-01

    Get started printing out 3D objects quickly and inexpensively! 3D printing is no longer just a figment of your imagination. This remarkable technology is coming to the masses with the growing availability of 3D printers. 3D printers create 3-dimensional layered models and they allow users to create prototypes that use multiple materials and colors.  This friendly-but-straightforward guide examines each type of 3D printing technology available today and gives artists, entrepreneurs, engineers, and hobbyists insight into the amazing things 3D printing has to offer. You'll discover methods for

  4. Parallel Implementation of a Semidefinite Programming Solver based on CSDP in a distributed memory cluster

    NARCIS (Netherlands)

    Ivanov, I.D.; de Klerk, E.

    2007-01-01

    In this paper we present the algorithmic framework and practical aspects of implementing a parallel version of a primal-dual semidefinite programming solver on a distributed memory computer cluster. Our implementation is based on the CSDP solver and uses a message passing interface (MPI), and the Sc

  5. All-pairs Shortest Path Algorithm based on MPI+CUDA Distributed Parallel Programming Model

    Directory of Open Access Journals (Sweden)

    Qingshuang Wu

    2013-12-01

    Full Text Available In view of the problem that computing shortest paths in a graph is a complex and time-consuming process, and the traditional algorithm that rely on the CPU as computing unit solely can't meet the demand of real-time processing, in this paper, we present an all-pairs shortest paths algorithm using MPI+CUDA hybrid programming model, which can take use of the overwhelming computing power of the GPU cluster to speed up the processing. This proposed algorithm can combine the advantages of MPI and CUDA programming model, and can realize two-level parallel computing. In the cluster-level, we take use of the MPI programming model to achieve a coarse-grained parallel computing between the computational nodes of the GPU cluster. In the node-level, we take use of the CUDA programming model to achieve a GPU-accelerated fine grit parallel computing in each computational node internal. The experimental results show that the MPI+CUDA-based parallel algorithm can take full advantage of the powerful computing capability of the GPU cluster, and can achieve about hundreds of time speedup; The whole algorithm has good computing performance, reliability and scalability, and it is able to meet the demand of real-time processing of massive spatial shortest path analysis

  6. Parallel Implementation of a Semidefinite Programming Solver based on CSDP in a distributed memory cluster

    NARCIS (Netherlands)

    Ivanov, I.D.; de Klerk, E.

    2007-01-01

    In this paper we present the algorithmic framework and practical aspects of implementing a parallel version of a primal-dual semidefinite programming solver on a distributed memory computer cluster. Our implementation is based on the CSDP solver and uses a message passing interface (MPI), and the Sc

  7. Convex quadratic programming relaxations for parallel machine scheduling with controllable processing times subject to release times

    Institute of Scientific and Technical Information of China (English)

    ZHANG Feng; CHEN Feng; TANG Guochun

    2004-01-01

    Scheduling unrelated parallel machines with controllable processing times subject to release times is investigated. Based on the convex quadratic programming relaxation and the randomized rounding strategy, a 2-approximation algorithm is obtained for a special case with the all-or-none property and then a 3-approximation algorithm is presented for general problem.

  8. Cellular Microcultures: Programming Mechanical and Physicochemical Properties of 3D Hydrogel Cellular Microcultures via Direct Ink Writing (Adv. Healthcare Mater. 9/2016).

    Science.gov (United States)

    McCracken, Joselle M; Badea, Adina; Kandel, Mikhail E; Gladman, A Sydney; Wetzel, David J; Popescu, Gabriel; Lewis, Jennifer A; Nuzzo, Ralph G

    2016-05-01

    R. Nuzzo and co-workers show on page 1025 how compositional differences in hydrogels are used to tune their cellular compliance by controlling their polymer mesh properties and subsequent uptake of the protein poly-l-lysine (green spheres in circled inset). The cover image shows pyramid micro-scaffolds prepared using direct ink writing (DIW) that differentially direct fibroblast and preosteoblast growth in 3D, depending on cell motility and surface treatment.

  9. Dynamic Frames Based Generation of 3D Scenes and Applications

    OpenAIRE

    Kvesić, Anton; Radošević, Danijel; Orehovački, Tihomir

    2015-01-01

    Modern graphic/programming tools like Unity enables the possibility of creating 3D scenes as well as making 3D scene based program applications, including full physical model, motion, sounds, lightning effects etc. This paper deals with the usage of dynamic frames based generator in the automatic generation of 3D scene and related source code. The suggested model enables the possibility to specify features of the 3D scene in a form of textual specification, as well as exporting such features ...

  10. 3D game environments create professional 3D game worlds

    CERN Document Server

    Ahearn, Luke

    2008-01-01

    The ultimate resource to help you create triple-A quality art for a variety of game worlds; 3D Game Environments offers detailed tutorials on creating 3D models, applying 2D art to 3D models, and clear concise advice on issues of efficiency and optimization for a 3D game engine. Using Photoshop and 3ds Max as his primary tools, Luke Ahearn explains how to create realistic textures from photo source and uses a variety of techniques to portray dynamic and believable game worlds.From a modern city to a steamy jungle, learn about the planning and technological considerations for 3D modelin

  11. Salient Local 3D Features for 3D Shape Retrieval

    CERN Document Server

    Godil, Afzal

    2011-01-01

    In this paper we describe a new formulation for the 3D salient local features based on the voxel grid inspired by the Scale Invariant Feature Transform (SIFT). We use it to identify the salient keypoints (invariant points) on a 3D voxelized model and calculate invariant 3D local feature descriptors at these keypoints. We then use the bag of words approach on the 3D local features to represent the 3D models for shape retrieval. The advantages of the method are that it can be applied to rigid as well as to articulated and deformable 3D models. Finally, this approach is applied for 3D Shape Retrieval on the McGill articulated shape benchmark and then the retrieval results are presented and compared to other methods.

  12. MOM3D/EM-ANIMATE - MOM3D WITH ANIMATION CODE

    Science.gov (United States)

    Shaeffer, J. F.

    1994-01-01

    compare surface-current distribution due to various initial excitation directions or electric field orientations. The program can accept up to 50 planes of field data consisting of a grid of 100 by 100 field points. These planes of data are user selectable and can be viewed individually or concurrently. With these preset limits, the program requires 55 megabytes of core memory to run. These limits can be changed in the header files to accommodate the available core memory of an individual workstation. An estimate of memory required can be made as follows: approximate memory in bytes equals (number of nodes times number of surfaces times 14 variables times bytes per word, typically 4 bytes per floating point) plus (number of field planes times number of nodes per plane times 21 variables times bytes per word). This gives the approximate memory size required to store the field and surface-current data. The total memory size is approximately 400,000 bytes plus the data memory size. The animation calculations are performed in real time at any user set time step. For Silicon Graphics Workstations that have multiple processors, this program has been optimized to perform these calculations on multiple processors to increase animation rates. The optimized program uses the SGI PFA (Power FORTRAN Accelerator) library. On single processor machines, the parallelization directives are seen as comments to the program and will have no effect on compilation or execution. MOM3D and EM-ANIMATE are written in FORTRAN 77 for interactive or batch execution on SGI series computers running IRIX 3.0 or later. The RAM requirements for these programs vary with the size of the problem being solved. A minimum of 30Mb of RAM is required for execution of EM-ANIMATE; however, the code may be modified to accommodate the available memory of an individual workstation. For EM-ANIMATE, twenty-four bit, double-buffered color capability is suggested, but not required. Sample executables and sample input and

  13. LDRD final report on massively-parallel linear programming : the parPCx system.

    Energy Technology Data Exchange (ETDEWEB)

    Parekh, Ojas (Emory University, Atlanta, GA); Phillips, Cynthia Ann; Boman, Erik Gunnar

    2005-02-01

    This report summarizes the research and development performed from October 2002 to September 2004 at Sandia National Laboratories under the Laboratory-Directed Research and Development (LDRD) project ''Massively-Parallel Linear Programming''. We developed a linear programming (LP) solver designed to use a large number of processors. LP is the optimization of a linear objective function subject to linear constraints. Companies and universities have expended huge efforts over decades to produce fast, stable serial LP solvers. Previous parallel codes run on shared-memory systems and have little or no distribution of the constraint matrix. We have seen no reports of general LP solver runs on large numbers of processors. Our parallel LP code is based on an efficient serial implementation of Mehrotra's interior-point predictor-corrector algorithm (PCx). The computational core of this algorithm is the assembly and solution of a sparse linear system. We have substantially rewritten the PCx code and based it on Trilinos, the parallel linear algebra library developed at Sandia. Our interior-point method can use either direct or iterative solvers for the linear system. To achieve a good parallel data distribution of the constraint matrix, we use a (pre-release) version of a hypergraph partitioner from the Zoltan partitioning library. We describe the design and implementation of our new LP solver called parPCx and give preliminary computational results. We summarize a number of issues related to efficient parallel solution of LPs with interior-point methods including data distribution, numerical stability, and solving the core linear system using both direct and iterative methods. We describe a number of applications of LP specific to US Department of Energy mission areas and we summarize our efforts to integrate parPCx (and parallel LP solvers in general) into Sandia's massively-parallel integer programming solver PICO (Parallel Interger and

  14. Speedup properties of phases in the execution profile of distributed parallel programs

    Energy Technology Data Exchange (ETDEWEB)

    Carlson, B.M. [Toronto Univ., ON (Canada). Computer Systems Research Institute; Wagner, T.D.; Dowdy, L.W. [Vanderbilt Univ., Nashville, TN (United States). Dept. of Computer Science; Worley, P.H. [Oak Ridge National Lab., TN (United States)

    1992-08-01

    The execution profile of a distributed-memory parallel program specifies the number of busy processors as a function of time. Periods of homogeneous processor utilization are manifested in many execution profiles. These periods can usually be correlated with the algorithms implemented in the underlying parallel code. Three families of methods for smoothing execution profile data are presented. These approaches simplify the problem of detecting end points of periods of homogeneous utilization. These periods, called phases, are then examined in isolation, and their speedup characteristics are explored. A specific workload executed on an Intel iPSC/860 is used for validation of the techniques described.

  15. Method, systems, and computer program products for implementing function-parallel network firewall

    Science.gov (United States)

    Fulp, Errin W.; Farley, Ryan J.

    2011-10-11

    Methods, systems, and computer program products for providing function-parallel firewalls are disclosed. According to one aspect, a function-parallel firewall includes a first firewall node for filtering received packets using a first portion of a rule set including a plurality of rules. The first portion includes less than all of the rules in the rule set. At least one second firewall node filters packets using a second portion of the rule set. The second portion includes at least one rule in the rule set that is not present in the first portion. The first and second portions together include all of the rules in the rule set.

  16. Programming Environment for a High-Performance Parallel Supercomputer with Intelligent Communication

    Directory of Open Access Journals (Sweden)

    A. Gunzinger

    1996-01-01

    Full Text Available At the Electronics Laboratory of the Swiss Federal Institute of Technology (ETH in Zürich, the high-performance parallel supercomputer MUSIC (MUlti processor System with Intelligent Communication has been developed. As applications like neural network simulation and molecular dynamics show, the Electronics Laboratory supercomputer is absolutely on par with those of conventional supercomputers, but electric power requirements are reduced by a factor of 1,000, weight is reduced by a factor of 400, and price is reduced by a factor of 100. Software development is a key issue of such parallel systems. This article focuses on the programming environment of the MUSIC system and on its applications.

  17. Teaching Scientific Computing: A Model-Centered Approach to Pipeline and Parallel Programming with C

    Directory of Open Access Journals (Sweden)

    Vladimiras Dolgopolovas

    2015-01-01

    Full Text Available The aim of this study is to present an approach to the introduction into pipeline and parallel computing, using a model of the multiphase queueing system. Pipeline computing, including software pipelines, is among the key concepts in modern computing and electronics engineering. The modern computer science and engineering education requires a comprehensive curriculum, so the introduction to pipeline and parallel computing is the essential topic to be included in the curriculum. At the same time, the topic is among the most motivating tasks due to the comprehensive multidisciplinary and technical requirements. To enhance the educational process, the paper proposes a novel model-centered framework and develops the relevant learning objects. It allows implementing an educational platform of constructivist learning process, thus enabling learners’ experimentation with the provided programming models, obtaining learners’ competences of the modern scientific research and computational thinking, and capturing the relevant technical knowledge. It also provides an integral platform that allows a simultaneous and comparative introduction to pipelining and parallel computing. The programming language C for developing programming models and message passing interface (MPI and OpenMP parallelization tools have been chosen for implementation.

  18. Academic training: From Evolution Theory to Parallel and Distributed Genetic Programming

    CERN Multimedia

    2007-01-01

    2006-2007 ACADEMIC TRAINING PROGRAMME LECTURE SERIES 15, 16 March From 11:00 to 12:00 - Main Auditorium, bldg. 500 From Evolution Theory to Parallel and Distributed Genetic Programming F. FERNANDEZ DE VEGA / Univ. of Extremadura, SP Lecture No. 1: From Evolution Theory to Evolutionary Computation Evolutionary computation is a subfield of artificial intelligence (more particularly computational intelligence) involving combinatorial optimization problems, which are based to some degree on the evolution of biological life in the natural world. In this tutorial we will review the source of inspiration for this metaheuristic and its capability for solving problems. We will show the main flavours within the field, and different problems that have been successfully solved employing this kind of techniques. Lecture No. 2: Parallel and Distributed Genetic Programming The successful application of Genetic Programming (GP, one of the available Evolutionary Algorithms) to optimization problems has encouraged an ...

  19. Parallel programming of exogenous and endogenous components in the antisaccade task.

    Science.gov (United States)

    Massen, Cristina

    2004-04-01

    In the antisaccade task subjects are required to suppress the reflexive tendency to look at a peripherally presented stimulus and to perform a saccade in the opposite direction instead. The present studies aimed at investigating the inhibitory mechanisms responsible for successful performance in this task, testing a hypothesis of parallel programming of exogenous and endogenous components: A reflexive saccade to the stimulus is automatically programmed and competes with the concurrently established voluntary programme to look in the opposite direction. The experiments followed the logic of selectively manipulating the speed of processing of these components and testing the prediction that a selective slowing of the exogenous component should result in a reduced error rate in this task, while a selective slowing of the endogenous component should have the opposite effect. The results provide evidence for the hypothesis of parallel programming and are discussed in the context of alternative accounts of antisaccade performance.

  20. Calculation of illumination conditions at the lunar south pole - parallel programming approach

    Science.gov (United States)

    Figuera, R. Marco; Gläser, P.; Oberst, J.; De Rosa, D.

    2014-04-01

    In this paper we present a parallel programming approach to evaluate illumination conditions at the lunar south pole. Due to the small inclination (1.54°) of the lunar rotational axis with respect to the ecliptic plane and the topography of the lunar south pole, which allows long illumination periods, the study of illumination conditions is of great importance. Several tests were conducted in order to check the viability of the study and to optimize the tool used to calculate such illumination. First results using a simulated case study showed a reduction of the computation time in the order of 8-12 times using parallel programming in the Graphic Processing Unit (GPU) in comparison with sequential programming in the Central Processing Unit (CPU).

  1. Time domain topology optimization of 3D nanophotonic devices

    DEFF Research Database (Denmark)

    Elesin, Yuriy; Lazarov, Boyan Stefanov; Jensen, Jakob Søndergaard;

    2014-01-01

    We present an efficient parallel topology optimization framework for design of large scale 3D nanophotonic devices. The code shows excellent scalability and is demonstrated for optimization of broadband frequency splitter, waveguide intersection, photonic crystal-based waveguide and nanowire...

  2. Algorithmic differentiation of pragma-defined parallel regions differentiating computer programs containing OpenMP

    CERN Document Server

    Förster, Michael

    2014-01-01

    Numerical programs often use parallel programming techniques such as OpenMP to compute the program's output values as efficient as possible. In addition, derivative values of these output values with respect to certain input values play a crucial role. To achieve code that computes not only the output values simultaneously but also the derivative values, this work introduces several source-to-source transformation rules. These rules are based on a technique called algorithmic differentiation. The main focus of this work lies on the important reverse mode of algorithmic differentiation. The inh

  3. Fixed-dimensional parallel linear programming via relative {Epsilon}-approximations

    Energy Technology Data Exchange (ETDEWEB)

    Goodrich, M.T.

    1996-12-31

    We show that linear programming in IR{sup d} can be solved deterministically in O((log log n){sup d}) time using linear work in the PRAM model of computation, for any fixed constant d. Our method is developed for the CRCW variant of the PRAM parallel computation model, and can be easily implemented to run in O(log n(log log n){sup d-1}) time using linear work on an EREW PRAM. A key component in these algorithms is a new, efficient parallel method for constructing E-nets and E-approximations (which have wide applicability in computational geometry). In addition, we introduce a new deterministic set approximation for range spaces with finite VC-exponent, which we call the {delta}-relative {epsilon}-approximation, and we show how such approximations can be efficiently constructed in parallel.

  4. Molecular dynamics simulation on a network of workstations using a machine-independent parallel programming language.

    Science.gov (United States)

    Shifman, M A; Windemuth, A; Schulten, K; Miller, P L

    1992-04-01

    Molecular dynamics simulations investigate local and global motion in molecules. Several parallel computing approaches have been taken to attack the most computationally expensive phase of molecular simulations, the evaluation of long range interactions. This paper reviews these approaches and develops a straightforward but effective algorithm using the machine-independent parallel programming language, Linda. The algorithm was run both on a shared memory parallel computer and on a network of high performance Unix workstations. Performance benchmarks were performed on both systems using two proteins. This algorithm offers a portable cost-effective alternative for molecular dynamics simulations. In view of the increasing numbers of networked workstations, this approach could help make molecular dynamics simulations more easily accessible to the research community.

  5. Beam and Truss Finite Element Verification for DYNA3D

    Energy Technology Data Exchange (ETDEWEB)

    Rathbun, H J

    2007-07-16

    The explicit finite element (FE) software program DYNA3D has been developed at Lawrence Livermore National Laboratory (LLNL) to simulate the dynamic behavior of structures, systems, and components. This report focuses on verification of beam and truss element formulations in DYNA3D. An efficient protocol has been developed to verify the accuracy of these structural elements by generating a set of representative problems for which closed-form quasi-static steady-state analytical reference solutions exist. To provide as complete coverage as practically achievable, problem sets are developed for each beam and truss element formulation (and their variants) in all modes of loading and physical orientation. Analyses with loading in the elastic and elastic-plastic regimes are performed. For elastic loading, the FE results are within 1% of the reference solutions for all cases. For beam element bending and torsion loading in the plastic regime, the response is heavily dependent on the numerical integration rule chosen, with higher refinement yielding greater accuracy (agreement to within 1%). Axial loading in the plastic regime produces accurate results (agreement to within 0.01%) for all integration rules and element formulations. Truss elements are also verified to provide accurate results (within 0.01%) for elastic and elastic-plastic loading. A sample problem to verify beam element response in ParaDyn, the parallel version DYNA3D, is also presented.

  6. Run-Time and Compiler Support for Programming in Adaptive Parallel Environments

    Directory of Open Access Journals (Sweden)

    Guy Edjlali

    1997-01-01

    Full Text Available For better utilization of computing resources, it is important to consider parallel programming environments in which the number of available processors varies at run-time. In this article, we discuss run-time support for data-parallel programming in such an adaptive environment. Executing programs in an adaptive environment requires redistributing data when the number of processors changes, and also requires determining new loop bounds and communication patterns for the new set of processors. We have developed a run-time library to provide this support. We discuss how the run-time library can be used by compilers of high-performance Fortran (HPF-like languages to generate code for an adaptive environment. We present performance results for a Navier-Stokes solver and a multigrid template run on a network of workstations and an IBM SP-2. Our experiments show that if the number of processors is not varied frequently, the cost of data redistribution is not significant compared to the time required for the actual computation. Overall, our work establishes the feasibility of compiling HPF for a network of nondedicated workstations, which are likely to be an important resource for parallel programming in the future.

  7. Towards Interactive Visual Exploration of Parallel Programs using a Domain-Specific Language

    KAUST Repository

    Klein, Tobias

    2016-04-19

    The use of GPUs and the massively parallel computing paradigm have become wide-spread. We describe a framework for the interactive visualization and visual analysis of the run-time behavior of massively parallel programs, especially OpenCL kernels. This facilitates understanding a program\\'s function and structure, finding the causes of possible slowdowns, locating program bugs, and interactively exploring and visually comparing different code variants in order to improve performance and correctness. Our approach enables very specific, user-centered analysis, both in terms of the recording of the run-time behavior and the visualization itself. Instead of having to manually write instrumented code to record data, simple code annotations tell the source-to-source compiler which code instrumentation to generate automatically. The visualization part of our framework then enables the interactive analysis of kernel run-time behavior in a way that can be very specific to a particular problem or optimization goal, such as analyzing the causes of memory bank conflicts or understanding an entire parallel algorithm.

  8. Scoops3D: software to analyze 3D slope stability throughout a digital landscape

    Science.gov (United States)

    Reid, Mark E.; Christian, Sarah B.; Brien, Dianne L.; Henderson, Scott T.

    2015-01-01

    The computer program, Scoops3D, evaluates slope stability throughout a digital landscape represented by a digital elevation model (DEM). The program uses a three-dimensional (3D) method of columns approach to assess the stability of many (typically millions) potential landslides within a user-defined size range. For each potential landslide (or failure), Scoops3D assesses the stability of a rotational, spherical slip surface encompassing many DEM cells using a 3D version of either Bishop’s simplified method or the Ordinary (Fellenius) method of limit-equilibrium analysis. Scoops3D has several options for the user to systematically and efficiently search throughout an entire DEM, thereby incorporating the effects of complex surface topography. In a thorough search, each DEM cell is included in multiple potential failures, and Scoops3D records the lowest stability (factor of safety) for each DEM cell, as well as the size (volume or area) associated with each of these potential landslides. It also determines the least-stable potential failure for the entire DEM. The user has a variety of options for building a 3D domain, including layers or full 3D distributions of strength and pore-water pressures, simplistic earthquake loading, and unsaturated suction conditions. Results from Scoops3D can be readily incorporated into a geographic information system (GIS) or other visualization software. This manual includes information on the theoretical basis for the slope-stability analysis, requirements for constructing and searching a 3D domain, a detailed operational guide (including step-by-step instructions for using the graphical user interface [GUI] software, Scoops3D-i) and input/output file specifications, practical considerations for conducting an analysis, results of verification tests, and multiple examples illustrating the capabilities of Scoops3D. Easy-to-use software installation packages are available for the Windows or Macintosh operating systems; these packages

  9. Scoops3D: software to analyze 3D slope stability throughout a digital landscape

    Science.gov (United States)

    Reid, Mark E.; Christian, Sarah B.; Brien, Dianne L.; Henderson, Scott T.

    2015-01-01

    The computer program, Scoops3D, evaluates slope stability throughout a digital landscape represented by a digital elevation model (DEM). The program uses a three-dimensional (3D) method of columns approach to assess the stability of many (typically millions) potential landslides within a user-defined size range. For each potential landslide (or failure), Scoops3D assesses the stability of a rotational, spherical slip surface encompassing many DEM cells using a 3D version of either Bishop’s simplified method or the Ordinary (Fellenius) method of limit-equilibrium analysis. Scoops3D has several options for the user to systematically and efficiently search throughout an entire DEM, thereby incorporating the effects of complex surface topography. In a thorough search, each DEM cell is included in multiple potential failures, and Scoops3D records the lowest stability (factor of safety) for each DEM cell, as well as the size (volume or area) associated with each of these potential landslides. It also determines the least-stable potential failure for the entire DEM. The user has a variety of options for building a 3D domain, including layers or full 3D distributions of strength and pore-water pressures, simplistic earthquake loading, and unsaturated suction conditions. Results from Scoops3D can be readily incorporated into a geographic information system (GIS) or other visualization software. This manual includes information on the theoretical basis for the slope-stability analysis, requirements for constructing and searching a 3D domain, a detailed operational guide (including step-by-step instructions for using the graphical user interface [GUI] software, Scoops3D-i) and input/output file specifications, practical considerations for conducting an analysis, results of verification tests, and multiple examples illustrating the capabilities of Scoops3D. Easy-to-use software installation packages are available for the Windows or Macintosh operating systems; these packages

  10. Research on Parallelization of Multiphase Space Numerical Simulation%多相空间数值模拟并行化研究

    Institute of Scientific and Technical Information of China (English)

    陆林生; 董超群; 李志辉

    2003-01-01

    This paper researches on parellelization of multiphase space numerical simulation with the case of the gaskinetic algorithm for 3-D flows. It focuses on the techniques of domain decomposition methods, vector reduction andboundary processing parallel optimization. The HPF parallel program design has been developed by the Parallel Pro-gramming Concept Design (PPCDS). The preferable parallel efficiency has been found by the HPF program in highperformance computer with massive scale parallel computing.

  11. BSP模型下的并行程序设计与开发%Design and Development of Parallel Programs on Bulk Synchronous Parallel Model

    Institute of Scientific and Technical Information of China (English)

    赖树华; 陆朝俊; 孙永强

    2001-01-01

    The Bulk Synchronous Parallel (BSP) model was simply introduced, and the advantage of the parapllel program's design and development on BSP model was discussed. Then it analysed how to design and develop the parallel programs on BSP model and summarized several principles the developer must comply with. At last a useful parallel programming method based on the BSP model was presented: the two phase method of BSP parallel program design. An example was given to illustrate how to make use of the above method and the BSP performance prediction tool.%介绍了BSP(Bulk Synchronous Parallel)模型,讨论了在该模型下进行并行程序设计的优点、并行算法的分析和设计方法及其必须遵守的原则.以两矩阵的乘法为例说明了如何借助BSP并行程序性能预测工具,利用两阶段BSP并行程序设计方法进行BSP并行程序的设计和开发.

  12. Holography of 3d-3d correspondence at Large N

    OpenAIRE

    Gang, Dongmin; Kim, Nakwoo; Lee, Sangmin

    2014-01-01

    We study the physics of multiple M5-branes compactified on a hyperbolic 3-manifold. On the one hand, it leads to the 3d-3d correspondence which maps an N = 2 $$ \\mathcal{N}=2 $$ superconformal field theory to a pure Chern-Simons theory on the 3-manifold. On the other hand, it leads to a warped AdS 4 geometry in M-theory holographically dual to the superconformal field theory. Combining the holographic duality and the 3d-3d correspondence, we propose a conjecture for the large N limit of the p...

  13. The (parallel) approximability of non-boolean satisfiability problems and restricted integer programming

    Science.gov (United States)

    Serna, Maria; Trevisan, Luca; Xhafa, Fatos

    We present parallel approximation algorithms for maximization problems expressible by integer linear programs of a restricted syntactic form introduced by Barland et al. [BKT96]. One of our motivations was to show whether the approximation results in the framework of Barland et al. holds in the parallel setting. Our results are a confirmation of this, and thus we have a new common framework for both computational settings. Also, we prove almost tight non-approximability results, thus solving a main open question of Barland et al. We obtain the results through the constraint satisfaction problem over multi-valued domains, for which we show non-approximability results and develop parallel approximation algorithms. Our parallel approximation algorithms are based on linear programming and random rounding; they are better than previously known sequential algorithms. The non-approximability results are based on new recent progress in the fields of Probabilistically Checkable Proofs and Multi-Prover One-Round Proof Systems [Raz95, Hås97, AS97, RS97].

  14. What Is an Attractive Body? Using an Interactive 3D Program to Create the Ideal Body for You and Your Partner

    Science.gov (United States)

    Crossley, Kara L.; Cornelissen, Piers L.; Tovée, Martin J.

    2012-01-01

    What is the ideal body size and shape that we want for ourselves and our partners? What are the important physical features in this ideal? And do both genders agree on what is an attractive body? To answer these questions we used a 3D interactive software system which allows our participants to produce a photorealistic, virtual male or female body. Forty female and forty male heterosexual Caucasian observers (females mean age 19.10 years, s.d. 1.01; 40 males mean age 19.84, s.d. 1.66) set their own ideal size and shape, and the size and shape of their ideal partner using the DAZ studio image manipulation programme. In this programme the shape and size of a 3D body can be altered along 94 independent dimensions, allowing each participant to create the exact size and shape of the body they want. The volume (and thus the weight assuming a standard density) and the circumference of the bust, waist and hips of these 3D models can then be measured. The ideal female body set by women (BMI = 18.9, WHR = 0.70, WCR = 0.67) was very similar to the ideal partner set by men, particularly in their BMI (BMI = 18.8, WHR = 0.73, WCR = 0.69). This was a lower BMI than the actual BMI of 39 of the 40 women. The ideal male body set by the men (BMI = 25.9, WHR = 0.87, WCR = 0.74) was very similar to the ideal partner set by the women (BMI = 24.5, WHR = 0.86, WCR = 0.77). This was a lower BMI than the actual BMI of roughly half of the men and a higher BMI than the other half. The results suggest a consistent preference for an ideal male and female body size and shape across both genders. The results also suggest that both BMI and torso shape are important components for the creation of the ideal body. PMID:23209791

  15. From functional programming to multicore parallelism: A case study based on Presburger Arithmetic

    DEFF Research Database (Denmark)

    Dung, Phan Anh; Hansen, Michael Reichhardt

    2011-01-01

    The overall goal of this work is studying parallelization of functional programs with the specific case study of decision procedures for Presburger Arithmetic (PA). PA is a first order theory of integers accepting addition as its only operation. Whereas it has wide applications in different areas......, we are interested in using PA in connection with the Duration Calculus Model Checker (DCMC) [5]. There are effective decision procedures for PA including Cooper’s algorithm and the Omega Test; however, their complexity is extremely high with doubly exponential lower bound and triply exponential upper...... in the SMT-solver Z3 [8] which has the capability of solving Presburger formulas. Functional programming is well-suited for the domain of decision procedures, and its immutability feature helps to reduce parallelization effort. While Haskell has progressed with a lot of parallelismrelated research [6], we...

  16. HipMatch: an object-oriented cross-platform program for accurate determination of cup orientation using 2D-3D registration of single standard X-ray radiograph and a CT volume.

    Science.gov (United States)

    Zheng, Guoyan; Zhang, Xuan; Steppacher, Simon D; Murphy, Stephen B; Siebenrock, Klaus A; Tannast, Moritz

    2009-09-01

    The widely used procedure of evaluation of cup orientation following total hip arthroplasty using single standard anteroposterior (AP) radiograph is known inaccurate, largely due to the wide variability in individual pelvic orientation relative to X-ray plate. 2D-3D image registration methods have been introduced for an accurate determination of the post-operative cup alignment with respect to an anatomical reference extracted from the CT data. Although encouraging results have been reported, their extensive usage in clinical routine is still limited. This may be explained by their requirement of a CAD model of the prosthesis, which is often difficult to be organized from the manufacturer due to the proprietary issue, and by their requirement of either multiple radiographs or a radiograph-specific calibration, both of which are not available for most retrospective studies. To address these issues, we developed and validated an object-oriented cross-platform program called "HipMatch" where a hybrid 2D-3D registration scheme combining an iterative landmark-to-ray registration with a 2D-3D intensity-based registration was implemented to estimate a rigid transformation between a pre-operative CT volume and the post-operative X-ray radiograph for a precise estimation of cup alignment. No CAD model of the prosthesis is required. Quantitative and qualitative results evaluated on cadaveric and clinical datasets are given, which indicate the robustness and the accuracy of the program. HipMatch is written in object-oriented programming language C++ using cross-platform software Qt (TrollTech, Oslo, Norway), VTK, and Coin3D and is transportable to any platform.

  17. Class Notes: Programming Parallel Algorithms CS 15-840B (Fall 1992)

    Science.gov (United States)

    1993-02-01

    840: Programming Parallel Algorithms Lecture #15 Scribe: Bob Wheeler Thursday, 6 Nov 92 Overview * Connected components (continued). * Minimum spanning...Sriram Sethuraman Singular value decomposition Ken Tew EEG analysis Eric Thayer Speech recognition Xuemei Wang & Bob Wheeler Matrix operations Matt...Computing, 14(4):862-874, 1985. [33] L. W. Tucker, C. R. Feynman , and D. M. Fritzsche. Object recognition using the Connection Machine. Proceedings CVPR 󈨜

  18. 3D Spectroscopy in Astronomy

    Science.gov (United States)

    Mediavilla, Evencio; Arribas, Santiago; Roth, Martin; Cepa-Nogué, Jordi; Sánchez, Francisco

    2011-09-01

    Preface; Acknowledgements; 1. Introductory review and technical approaches Martin M. Roth; 2. Observational procedures and data reduction James E. H. Turner; 3. 3D Spectroscopy instrumentation M. A. Bershady; 4. Analysis of 3D data Pierre Ferruit; 5. Science motivation for IFS and galactic studies F. Eisenhauer; 6. Extragalactic studies and future IFS science Luis Colina; 7. Tutorials: how to handle 3D spectroscopy data Sebastian F. Sánchez, Begona García-Lorenzo and Arlette Pécontal-Rousset.

  19. Spherical 3D isotropic wavelets

    Science.gov (United States)

    Lanusse, F.; Rassat, A.; Starck, J.-L.

    2012-04-01

    Context. Future cosmological surveys will provide 3D large scale structure maps with large sky coverage, for which a 3D spherical Fourier-Bessel (SFB) analysis in spherical coordinates is natural. Wavelets are particularly well-suited to the analysis and denoising of cosmological data, but a spherical 3D isotropic wavelet transform does not currently exist to analyse spherical 3D data. Aims: The aim of this paper is to present a new formalism for a spherical 3D isotropic wavelet, i.e. one based on the SFB decomposition of a 3D field and accompany the formalism with a public code to perform wavelet transforms. Methods: We describe a new 3D isotropic spherical wavelet decomposition based on the undecimated wavelet transform (UWT) described in Starck et al. (2006). We also present a new fast discrete spherical Fourier-Bessel transform (DSFBT) based on both a discrete Bessel transform and the HEALPIX angular pixelisation scheme. We test the 3D wavelet transform and as a toy-application, apply a denoising algorithm in wavelet space to the Virgo large box cosmological simulations and find we can successfully remove noise without much loss to the large scale structure. Results: We have described a new spherical 3D isotropic wavelet transform, ideally suited to analyse and denoise future 3D spherical cosmological surveys, which uses a novel DSFBT. We illustrate its potential use for denoising using a toy model. All the algorithms presented in this paper are available for download as a public code called MRS3D at http://jstarck.free.fr/mrs3d.html

  20. 3D IBFV : Hardware-Accelerated 3D Flow Visualization

    NARCIS (Netherlands)

    Telea, Alexandru; Wijk, Jarke J. van

    2003-01-01

    We present a hardware-accelerated method for visualizing 3D flow fields. The method is based on insertion, advection, and decay of dye. To this aim, we extend the texture-based IBFV technique for 2D flow visualization in two main directions. First, we decompose the 3D flow visualization problem in a

  1. Dynamic programming in parallel boundary detection with application to ultrasound intima-media segmentation.

    Science.gov (United States)

    Zhou, Yuan; Cheng, Xinyao; Xu, Xiangyang; Song, Enmin

    2013-12-01

    Segmentation of carotid artery intima-media in longitudinal ultrasound images for measuring its thickness to predict cardiovascular diseases can be simplified as detecting two nearly parallel boundaries within a certain distance range, when plaque with irregular shapes is not considered. In this paper, we improve the implementation of two dynamic programming (DP) based approaches to parallel boundary detection, dual dynamic programming (DDP) and piecewise linear dual dynamic programming (PL-DDP). Then, a novel DP based approach, dual line detection (DLD), which translates the original 2-D curve position to a 4-D parameter space representing two line segments in a local image segment, is proposed to solve the problem while maintaining efficiency and rotation invariance. To apply the DLD to ultrasound intima-media segmentation, it is imbedded in a framework that employs an edge map obtained from multiplication of the responses of two edge detectors with different scales and a coupled snake model that simultaneously deforms the two contours for maintaining parallelism. The experimental results on synthetic images and carotid arteries of clinical ultrasound images indicate improved performance of the proposed DLD compared to DDP and PL-DDP, with respect to accuracy and efficiency.

  2. A 3-D Contextual Classifier

    DEFF Research Database (Denmark)

    Larsen, Rasmus

    1997-01-01

    . This includes the specification of a Gaussian distribution for the pixel values as well as a prior distribution for the configuration of class variables within the cross that is m ade of a pixel and its four nearest neighbours. We will extend this algorithm to 3-D, i.e. we will specify a simultaneous Gaussian...... distr ibution for a pixel and its 6 nearest 3-D neighbours, and generalise the class variable configuration distribution within the 3-D cross. The algorithm is tested on a synthetic 3-D multivariate dataset....

  3. 3D Bayesian contextual classifiers

    DEFF Research Database (Denmark)

    Larsen, Rasmus

    2000-01-01

    We extend a series of multivariate Bayesian 2-D contextual classifiers to 3-D by specifying a simultaneous Gaussian distribution for the feature vectors as well as a prior distribution of the class variables of a pixel and its 6 nearest 3-D neighbours.......We extend a series of multivariate Bayesian 2-D contextual classifiers to 3-D by specifying a simultaneous Gaussian distribution for the feature vectors as well as a prior distribution of the class variables of a pixel and its 6 nearest 3-D neighbours....

  4. Using 3D in Visualization

    DEFF Research Database (Denmark)

    Wood, Jo; Kirschenbauer, Sabine; Döllner, Jürgen

    2005-01-01

    to display 3D imagery. The extra cartographic degree of freedom offered by using 3D is explored and offered as a motivation for employing 3D in visualization. The use of VR and the construction of virtual environments exploit navigational and behavioral realism, but become most usefil when combined...... with abstracted representations embedded in a 3D space. The interactions between development of geovisualization, the technology used to implement it and the theory surrounding cartographic representation are explored. The dominance of computing technologies, driven particularly by the gaming industry...

  5. Interactive 3D multimedia content

    CERN Document Server

    Cellary, Wojciech

    2012-01-01

    The book describes recent research results in the areas of modelling, creation, management and presentation of interactive 3D multimedia content. The book describes the current state of the art in the field and identifies the most important research and design issues. Consecutive chapters address these issues. These are: database modelling of 3D content, security in 3D environments, describing interactivity of content, searching content, visualization of search results, modelling mixed reality content, and efficient creation of interactive 3D content. Each chapter is illustrated with example a

  6. 3D for Graphic Designers

    CERN Document Server

    Connell, Ellery

    2011-01-01

    Helping graphic designers expand their 2D skills into the 3D space The trend in graphic design is towards 3D, with the demand for motion graphics, animation, photorealism, and interactivity rapidly increasing. And with the meteoric rise of iPads, smartphones, and other interactive devices, the design landscape is changing faster than ever.2D digital artists who need a quick and efficient way to join this brave new world will want 3D for Graphic Designers. Readers get hands-on basic training in working in the 3D space, including product design, industrial design and visualization, modeling, ani

  7. 3-D printers for libraries

    CERN Document Server

    Griffey, Jason

    2014-01-01

    As the maker movement continues to grow and 3-D printers become more affordable, an expanding group of hobbyists is keen to explore this new technology. In the time-honored tradition of introducing new technologies, many libraries are considering purchasing a 3-D printer. Jason Griffey, an early enthusiast of 3-D printing, has researched the marketplace and seen several systems first hand at the Consumer Electronics Show. In this report he introduces readers to the 3-D printing marketplace, covering such topics asHow fused deposition modeling (FDM) printing workBasic terminology such as build

  8. Documentation of a computer program to simulate lake-aquifer interaction using the MODFLOW ground water flow model and the MOC3D solute-transport model

    Science.gov (United States)

    Merritt, Michael L.; Konikow, Leonard F.

    2000-01-01

    Heads and flow patterns in surficial aquifers can be strongly influenced by the presence of stationary surface-water bodies (lakes) that are in direct contact, vertically and laterally, with the aquifer. Conversely, lake stages can be significantly affected by the volume of water that seeps through the lakebed that separates the lake from the aquifer. For these reasons, a set of computer subroutines called the Lake Package (LAK3) was developed to represent lake/aquifer interaction in numerical simulations using the U.S. Geological Survey three-dimensional, finite-difference, modular ground-water flow model MODFLOW and the U.S. Geological Survey three-dimensional method-of-characteristics solute-transport model MOC3D. In the Lake Package described in this report, a lake is represented as a volume of space within the model grid which consists of inactive cells extending downward from the upper surface of the grid. Active model grid cells bordering this space, representing the adjacent aquifer, exchange water with the lake at a rate determined by the relative heads and by conductances that are based on grid cell dimensions, hydraulic conductivities of the aquifer material, and user-specified leakance distributions that represent the resistance to flow through the material of the lakebed. Parts of the lake may become ?dry? as upper layers of the model are dewatered, with a concomitant reduction in lake surface area, and may subsequently rewet when aquifer heads rise. An empirical approximation has been encoded to simulate the rewetting of a lake that becomes completely dry. The variations of lake stages are determined by independent water budgets computed for each lake in the model grid. This lake budget process makes the package a simulator of the response of lake stage to hydraulic stresses applied to the aquifer. Implementation of a lake water budget requires input of parameters including those representing the rate of lake atmospheric recharge and evaporation

  9. CLUSTEREASY:A Program for Simulating Scalar Field Evolution on Parallel Computers

    CERN Document Server

    Felder, Gary N

    2007-01-01

    We describe a new, parallel programming version of the scalar field simulation program LATTICEEASY. The new C++ program, CLUSTEREASY, can simulate arbitrary scalar field models on distributed-memory clusters. The speed and memory requirements scale well with the number of processors. As with the serial version of LATTICEEASY, CLUSTEREASY can run simulations in one, two, or three dimensions, with or without expansion of the universe, with customizable parameters and output. The program and its full documentation are available on the LATTICEEASY website at http://www.science.smith.edu/departments/Physics/fstaff/gfelder/latticeeasy/. In this paper we provide a brief overview of what CLUSTEREASY does and the ways in which it does and doesn't differ from the serial version of LATTICEEASY.

  10. Novel Scalable 3-D MT Inverse Solver

    Science.gov (United States)

    Kuvshinov, A. V.; Kruglyakov, M.; Geraskin, A.

    2016-12-01

    We present a new, robust and fast, three-dimensional (3-D) magnetotelluric (MT) inverse solver. As a forward modelling engine a highly-scalable solver extrEMe [1] is used. The (regularized) inversion is based on an iterative gradient-type optimization (quasi-Newton method) and exploits adjoint sources approach for fast calculation of the gradient of the misfit. The inverse solver is able to deal with highly detailed and contrasting models, allows for working (separately or jointly) with any type of MT (single-site and/or inter-site) responses, and supports massive parallelization. Different parallelization strategies implemented in the code allow for optimal usage of available computational resources for a given problem set up. To parameterize an inverse domain a mask approach is implemented, which means that one can merge any subset of forward modelling cells in order to account for (usually) irregular distribution of observation sites. We report results of 3-D numerical experiments aimed at analysing the robustness, performance and scalability of the code. In particular, our computational experiments carried out at different platforms ranging from modern laptops to high-performance clusters demonstrate practically linear scalability of the code up to thousands of nodes. 1. Kruglyakov, M., A. Geraskin, A. Kuvshinov, 2016. Novel accurate and scalable 3-D MT forward solver based on a contracting integral equation method, Computers and Geosciences, in press.

  11. Identifying positioning-based attacks against 3D printed objects and the 3D printing process

    Science.gov (United States)

    Straub, Jeremy

    2017-05-01

    Zeltmann, et al. demonstrated that structural integrity and other quality damage to objects can be caused by changing its position on a 3D printer's build plate. On some printers, for example, object surfaces and support members may be stronger when oriented parallel to the X or Y axis. The challenge presented by the need to assure 3D printed object orientation is that this can be altered in numerous places throughout the system. This paper considers attack scenarios and discusses where attacks that change printing orientation can occur in the process. An imaging-based solution to combat this problem is presented.

  12. High performance parallel computers for science: New developments at the Fermilab advanced computer program

    Energy Technology Data Exchange (ETDEWEB)

    Nash, T.; Areti, H.; Atac, R.; Biel, J.; Cook, A.; Deppe, J.; Edel, M.; Fischler, M.; Gaines, I.; Hance, R.

    1988-08-01

    Fermilab's Advanced Computer Program (ACP) has been developing highly cost effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor, has been developed for theoretical physics. Each $4000 node is a FORTRAN or C programmable pipelined 20 MFlops (peak), 10 MByte single board computer. These are plugged into a 16 port crossbar switch crate which handles both inter and intra crate communication. The crates are connected in a hypercube. Site oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256 node, 5 GFlop, system is under construction. 10 refs., 7 figs.

  13. The FORCE: A portable parallel programming language supporting computational structural mechanics

    Science.gov (United States)

    Jordan, Harry F.; Benten, Muhammad S.; Brehm, Juergen; Ramanan, Aruna

    1989-01-01

    This project supports the conversion of codes in Computational Structural Mechanics (CSM) to a parallel form which will efficiently exploit the computational power available from multiprocessors. The work is a part of a comprehensive, FORTRAN-based system to form a basis for a parallel version of the NICE/SPAR combination which will form the CSM Testbed. The software is macro-based and rests on the force methodology developed by the principal investigator in connection with an early scientific multiprocessor. Machine independence is an important characteristic of the system so that retargeting it to the Flex/32, or any other multiprocessor on which NICE/SPAR might be imnplemented, is well supported. The principal investigator has experience in producing parallel software for both full and sparse systems of linear equations using the force macros. Other researchers have used the Force in finite element programs. It has been possible to rapidly develop software which performs at maximum efficiency on a multiprocessor. The inherent machine independence of the system also means that the parallelization will not be limited to a specific multiprocessor.

  14. Supernova Remnant in 3-D

    Science.gov (United States)

    2009-01-01

    of the wavelength shift is related to the speed of motion, one can determine how fast the debris are moving in either direction. Because Cas A is the result of an explosion, the stellar debris is expanding radially outwards from the explosion center. Using simple geometry, the scientists were able to construct a 3-D model using all of this information. A program called 3-D Slicer modified for astronomical use by the Astronomical Medicine Project at Harvard University in Cambridge, Mass. was used to display and manipulate the 3-D model. Commercial software was then used to create the 3-D fly-through. The blue filaments defining the blast wave were not mapped using the Doppler effect because they emit a different kind of light synchrotron radiation that does not emit light at discrete wavelengths, but rather in a broad continuum. The blue filaments are only a representation of the actual filaments observed at the blast wave. This visualization shows that there are two main components to this supernova remnant: a spherical component in the outer parts of the remnant and a flattened (disk-like) component in the inner region. The spherical component consists of the outer layer of the star that exploded, probably made of helium and carbon. These layers drove a spherical blast wave into the diffuse gas surrounding the star. The flattened component that astronomers were unable to map into 3-D prior to these Spitzer observations consists of the inner layers of the star. It is made from various heavier elements, not all shown in the visualization, such as oxygen, neon, silicon, sulphur, argon and iron. High-velocity plumes, or jets, of this material are shooting out from the explosion in the plane of the disk-like component mentioned above. Plumes of silicon appear in the northeast and southwest, while those of iron are seen in the southeast and north. These jets were already known and Doppler velocity measurements have been made for these structures, but their orientation and

  15. Perception of 3D spatial relations for 3D displays

    Science.gov (United States)

    Rosen, Paul; Pizlo, Zygmunt; Hoffmann, Christoph; Popescu, Voicu S.

    2004-05-01

    We test perception of 3D spatial relations in 3D images rendered by a 3D display (Perspecta from Actuality Systems) and compare it to that of a high-resolution flat panel display. 3D images provide the observer with such depth cues as motion parallax and binocular disparity. Our 3D display is a device that renders a 3D image by displaying, in rapid succession, radial slices through the scene on a rotating screen. The image is contained in a glass globe and can be viewed from virtually any direction. In the psychophysical experiment several families of 3D objects are used as stimuli: primitive shapes (cylinders and cuboids), and complex objects (multi-story buildings, cars, and pieces of furniture). Each object has at least one plane of symmetry. On each trial an object or its "distorted" version is shown at an arbitrary orientation. The distortion is produced by stretching an object in a random direction by 40%. This distortion must eliminate the symmetry of an object. The subject's task is to decide whether or not the presented object is distorted under several viewing conditions (monocular/binocular, with/without motion parallax, and near/far). The subject's performance is measured by the discriminability d', which is a conventional dependent variable in signal detection experiments.

  16. LFP: A PC-program for ligand-field analysis of 3d(n) ions in Oh and lower symmetries

    Science.gov (United States)

    Kurzak

    2000-05-01

    A modular and efficient version of PC program for calculating ligand-field parameters, i.e. crystal-field model (CFM) and angular overlap model (AOM) is presented. The LFP program is designed to calculate the ligand-field parameters of low symmetry transition metal complexes. It is based on the general method for the analysis of central ion states distortion using group theory. The program has not closed form. It will be extended, according to spectroscopic studies in our laboratory. It is written in FORTRAN language.

  17. 3D Printing for Bricks

    OpenAIRE

    ECT Team, Purdue

    2015-01-01

    Building Bytes, by Brian Peters, is a project that uses desktop 3D printers to print bricks for architecture. Instead of using an expensive custom-made printer, it uses a normal standard 3D printer which is available for everyone and makes it more accessible and also easier for fabrication.

  18. Market study: 3-D eyetracker

    Science.gov (United States)

    1977-01-01

    A market study of a proposed version of a 3-D eyetracker for initial use at NASA's Ames Research Center was made. The commercialization potential of a simplified, less expensive 3-D eyetracker was ascertained. Primary focus on present and potential users of eyetrackers, as well as present and potential manufacturers has provided an effective means of analyzing the prospects for commercialization.

  19. Spherical 3D Isotropic Wavelets

    CERN Document Server

    Lanusse, F; Starck, J -L

    2011-01-01

    Future cosmological surveys will provide 3D large scale structure maps with large sky coverage, for which a 3D Spherical Fourier-Bessel (SFB) analysis in is natural. Wavelets are particularly well-suited to the analysis and denoising of cosmological data, but a spherical 3D isotropic wavelet transform does not currently exist to analyse spherical 3D data. The aim of this paper is to present a new formalism for a spherical 3D isotropic wavelet, i.e. one based on the Fourier-Bessel decomposition of a 3D field and accompany the formalism with a public code to perform wavelet transforms. We describe a new 3D isotropic spherical wavelet decomposition based on the undecimated wavelet transform (UWT) described in Starck et al. 2006. We also present a new fast Discrete Spherical Fourier-Bessel Transform (DSFBT) based on both a discrete Bessel Transform and the HEALPIX angular pixelisation scheme. We test the 3D wavelet transform and as a toy-application, apply a denoising algorithm in wavelet space to the Virgo large...

  20. Improvement of 3D Scanner

    Institute of Scientific and Technical Information of China (English)

    2003-01-01

    The disadvantage remaining in 3D scanning system and its reasons are discussed. A new host-and-slave structure with high speed image acquisition and processing system is proposed to quicken the image processing and improve the performance of 3D scanning system.

  1. R3D-2-MSA: the RNA 3D structure-to-multiple sequence alignment server.

    Science.gov (United States)

    Cannone, Jamie J; Sweeney, Blake A; Petrov, Anton I; Gutell, Robin R; Zirbel, Craig L; Leontis, Neocles

    2015-07-01

    The RNA 3D Structure-to-Multiple Sequence Alignment Server (R3D-2-MSA) is a new web service that seamlessly links RNA three-dimensional (3D) structures to high-quality RNA multiple sequence alignments (MSAs) from diverse biological sources. In this first release, R3D-2-MSA provides manual and programmatic access to curated, representative ribosomal RNA sequence alignments from bacterial, archaeal, eukaryal and organellar ribosomes, using nucleotide numbers from representative atomic-resolution 3D structures. A web-based front end is available for manual entry and an Application Program Interface for programmatic access. Users can specify up to five ranges of nucleotides and 50 nucleotide positions per range. The R3D-2-MSA server maps these ranges to the appropriate columns of the corresponding MSA and returns the contents of the columns, either for display in a web browser or in JSON format for subsequent programmatic use. The browser output page provides a 3D interactive display of the query, a full list of sequence variants with taxonomic information and a statistical summary of distinct sequence variants found. The output can be filtered and sorted in the browser. Previous user queries can be viewed at any time by resubmitting the output URL, which encodes the search and re-generates the results. The service is freely available with no login requirement at http://rna.bgsu.edu/r3d-2-msa.

  2. Advanced 3-D Ultrasound Imaging

    DEFF Research Database (Denmark)

    Rasmussen, Morten Fischer

    to produce high quality 3-D images. Because of the large matrix transducers with integrated custom electronics, these systems are extremely expensive. The relatively low price of ultrasound scanners is one of the factors for the widespread use of ultrasound imaging. The high price tag on the high quality 3-D......The main purpose of the PhD project was to develop methods that increase the 3-D ultrasound imaging quality available for the medical personnel in the clinic. Acquiring a 3-D volume gives the medical doctor the freedom to investigate the measured anatomy in any slice desirable after the scan has...... been completed. This allows for precise measurements of organs dimensions and makes the scan more operator independent. Real-time 3-D ultrasound imaging is still not as widespread in use in the clinics as 2-D imaging. A limiting factor has traditionally been the low image quality achievable using...

  3. Using 3D in Visualization

    DEFF Research Database (Denmark)

    Wood, Jo; Kirschenbauer, Sabine; Döllner, Jürgen

    2005-01-01

    The notion of three-dimensionality is applied to five stages of the visualization pipeline. While 3D visulization is most often associated with the visual mapping and representation of data, this chapter also identifies its role in the management and assembly of data, and in the media used...... to display 3D imagery. The extra cartographic degree of freedom offered by using 3D is explored and offered as a motivation for employing 3D in visualization. The use of VR and the construction of virtual environments exploit navigational and behavioral realism, but become most usefil when combined...... with abstracted representations embedded in a 3D space. The interactions between development of geovisualization, the technology used to implement it and the theory surrounding cartographic representation are explored. The dominance of computing technologies, driven particularly by the gaming industry...

  4. 3D printing in dentistry.

    Science.gov (United States)

    Dawood, A; Marti Marti, B; Sauret-Jackson, V; Darwood, A

    2015-12-01

    3D printing has been hailed as a disruptive technology which will change manufacturing. Used in aerospace, defence, art and design, 3D printing is becoming a subject of great interest in surgery. The technology has a particular resonance with dentistry, and with advances in 3D imaging and modelling technologies such as cone beam computed tomography and intraoral scanning, and with the relatively long history of the use of CAD CAM technologies in dentistry, it will become of increasing importance. Uses of 3D printing include the production of drill guides for dental implants, the production of physical models for prosthodontics, orthodontics and surgery, the manufacture of dental, craniomaxillofacial and orthopaedic implants, and the fabrication of copings and frameworks for implant and dental restorations. This paper reviews the types of 3D printing technologies available and their various applications in dentistry and in maxillofacial surgery.

  5. 3D vision system assessment

    Science.gov (United States)

    Pezzaniti, J. Larry; Edmondson, Richard; Vaden, Justin; Hyatt, Bryan; Chenault, David B.; Kingston, David; Geulen, Vanilynmae; Newell, Scott; Pettijohn, Brad

    2009-02-01

    In this paper, we report on the development of a 3D vision system consisting of a flat panel stereoscopic display and auto-converging stereo camera and an assessment of the system's use for robotic driving, manipulation, and surveillance operations. The 3D vision system was integrated onto a Talon Robot and Operator Control Unit (OCU) such that direct comparisons of the performance of a number of test subjects using 2D and 3D vision systems were possible. A number of representative scenarios were developed to determine which tasks benefited most from the added depth perception and to understand when the 3D vision system hindered understanding of the scene. Two tests were conducted at Fort Leonard Wood, MO with noncommissioned officers ranked Staff Sergeant and Sergeant First Class. The scenarios; the test planning, approach and protocols; the data analysis; and the resulting performance assessment of the 3D vision system are reported.

  6. Performance Evaluation of Parallel Message Passing and Thread Programming Model on Multicore Architectures

    CERN Document Server

    Hasta, D T

    2010-01-01

    The current trend of multicore architectures on shared memory systems underscores the need of parallelism. While there are some programming model to express parallelism, thread programming model has become a standard to support these system such as OpenMP, and POSIX threads. MPI (Message Passing Interface) which remains the dominant model used in high-performance computing today faces this challenge. Previous version of MPI which is MPI-1 has no shared memory concept, and Current MPI version 2 which is MPI-2 has a limited support for shared memory systems. In this research, MPI-2 version of MPI will be compared with OpenMP to see how well does MPI perform on multicore / SMP (Symmetric Multiprocessor) machines. Comparison between OpenMP for thread programming model and MPI for message passing programming model will be conducted on multicore shared memory machine architectures to see who has a better performance in terms of speed and throughput. Application used to assess the scalability of the evaluated parall...

  7. The neural basis of parallel saccade programming: an fMRI study.

    Science.gov (United States)

    Hu, Yanbo; Walker, Robin

    2011-11-01

    The neural basis of parallel saccade programming was examined in an event-related fMRI study using a variation of the double-step saccade paradigm. Two double-step conditions were used: one enabled the second saccade to be partially programmed in parallel with the first saccade while in a second condition both saccades had to be prepared serially. The intersaccadic interval, observed in the parallel programming (PP) condition, was significantly reduced compared with latency in the serial programming (SP) condition and also to the latency of single saccades in control conditions. The fMRI analysis revealed greater activity (BOLD response) in the frontal and parietal eye fields for the PP condition compared with the SP double-step condition and when compared with the single-saccade control conditions. By contrast, activity in the supplementary eye fields was greater for the double-step condition than the single-step condition but did not distinguish between the PP and SP requirements. The role of the frontal eye fields in PP may be related to the advanced temporal preparation and increased salience of the second saccade goal that may mediate activity in other downstream structures, such as the superior colliculus. The parietal lobes may be involved in the preparation for spatial remapping, which is required in double-step conditions. The supplementary eye fields appear to have a more general role in planning saccade sequences that may be related to error monitoring and the control over the execution of the correct sequence of responses.

  8. R3D Align: global pairwise alignment of RNA 3D structures using local superpositions

    Science.gov (United States)

    Rahrig, Ryan R.; Leontis, Neocles B.; Zirbel, Craig L.

    2010-01-01

    Motivation: Comparing 3D structures of homologous RNA molecules yields information about sequence and structural variability. To compare large RNA 3D structures, accurate automatic comparison tools are needed. In this article, we introduce a new algorithm and web server to align large homologous RNA structures nucleotide by nucleotide using local superpositions that accommodate the flexibility of RNA molecules. Local alignments are merged to form a global alignment by employing a maximum clique algorithm on a specially defined graph that we call the ‘local alignment’ graph. Results: The algorithm is implemented in a program suite and web server called ‘R3D Align’. The R3D Align alignment of homologous 3D structures of 5S, 16S and 23S rRNA was compared to a high-quality hand alignment. A full comparison of the 16S alignment with the other state-of-the-art methods is also provided. The R3D Align program suite includes new diagnostic tools for the structural evaluation of RNA alignments. The R3D Align alignments were compared to those produced by other programs and were found to be the most accurate, in comparison with a high quality hand-crafted alignment and in conjunction with a series of other diagnostics presented. The number of aligned base pairs as well as measures of geometric similarity are used to evaluate the accuracy of the alignments. Availability: R3D Align is freely available through a web server http://rna.bgsu.edu/R3DAlign. The MATLAB source code of the program suite is also freely available for download at that location. Supplementary information: Supplementary data are available at Bioinformatics online. Contact: r-rahrig@onu.edu PMID:20929913

  9. Development of visual 3D virtual environment for control software

    Science.gov (United States)

    Hirose, Michitaka; Myoi, Takeshi; Amari, Haruo; Inamura, Kohei; Stark, Lawrence

    1991-01-01

    Virtual environments for software visualization may enable complex programs to be created and maintained. A typical application might be for control of regional electric power systems. As these encompass broader computer networks than ever, construction of such systems becomes very difficult. Conventional text-oriented environments are useful in programming individual processors. However, they are obviously insufficient to program a large and complicated system, that includes large numbers of computers connected to each other; such programming is called 'programming in the large.' As a solution for this problem, the authors are developing a graphic programming environment wherein one can visualize complicated software in virtual 3D world. One of the major features of the environment is the 3D representation of concurrent process. 3D representation is used to supply both network-wide interprocess programming capability (capability for 'programming in the large') and real-time programming capability. The authors' idea is to fuse both the block diagram (which is useful to check relationship among large number of processes or processors) and the time chart (which is useful to check precise timing for synchronization) into a single 3D space. The 3D representation gives us a capability for direct and intuitive planning or understanding of complicated relationship among many concurrent processes. To realize the 3D representation, a technology to enable easy handling of virtual 3D object is a definite necessity. Using a stereo display system and a gesture input device (VPL DataGlove), our prototype of the virtual workstation has been implemented. The workstation can supply the 'sensation' of the virtual 3D space to a programmer. Software for the 3D programming environment is implemented on the workstation. According to preliminary assessments, a 50 percent reduction of programming effort is achieved by using the virtual 3D environment. The authors expect that the 3D

  10. ADT-3D Tumor Detection Assistant in 3D

    Directory of Open Access Journals (Sweden)

    Jaime Lazcano Bello

    2008-12-01

    Full Text Available The present document describes ADT-3D (Three-Dimensional Tumor Detector Assistant, a prototype application developed to assist doctors diagnose, detect and locate tumors in the brain by using CT scan. The reader may find on this document an introduction to tumor detection; ADT-3D main goals; development details; description of the product; motivation for its development; result’s study; and areas of applicability.

  11. Improved CUDA programs for GPU computing of Swendsen-Wang multi-cluster spin flip algorithm: 2D and 3D Ising, Potts, and XY models

    Science.gov (United States)

    Komura, Yukihiro; Okabe, Yutaka

    2016-03-01

    We present new versions of sample CUDA programs for the GPU computing of the Swendsen-Wang multi-cluster spin flip algorithm. In this update, we add the method of GPU-based cluster-labeling algorithm without the use of conventional iteration (Komura, 2015) to those programs. For high-precision calculations, we also add a random-number generator in the cuRAND library. Moreover, we fix several bugs and remove the extra usage of shared memory in the kernel functions.

  12. Unassisted 3D camera calibration

    Science.gov (United States)

    Atanassov, Kalin; Ramachandra, Vikas; Nash, James; Goma, Sergio R.

    2012-03-01

    With the rapid growth of 3D technology, 3D image capture has become a critical part of the 3D feature set on mobile phones. 3D image quality is affected by the scene geometry as well as on-the-device processing. An automatic 3D system usually assumes known camera poses accomplished by factory calibration using a special chart. In real life settings, pose parameters estimated by factory calibration can be negatively impacted by movements of the lens barrel due to shaking, focusing, or camera drop. If any of these factors displaces the optical axes of either or both cameras, vertical disparity might exceed the maximum tolerable margin and the 3D user may experience eye strain or headaches. To make 3D capture more practical, one needs to consider unassisted (on arbitrary scenes) calibration. In this paper, we propose an algorithm that relies on detection and matching of keypoints between left and right images. Frames containing erroneous matches, along with frames with insufficiently rich keypoint constellations, are detected and discarded. Roll, pitch yaw , and scale differences between left and right frames are then estimated. The algorithm performance is evaluated in terms of the remaining vertical disparity as compared to the maximum tolerable vertical disparity.

  13. Bioprinting of 3D hydrogels.

    Science.gov (United States)

    Stanton, M M; Samitier, J; Sánchez, S

    2015-08-07

    Three-dimensional (3D) bioprinting has recently emerged as an extension of 3D material printing, by using biocompatible or cellular components to build structures in an additive, layer-by-layer methodology for encapsulation and culture of cells. These 3D systems allow for cell culture in a suspension for formation of highly organized tissue or controlled spatial orientation of cell environments. The in vitro 3D cellular environments simulate the complexity of an in vivo environment and natural extracellular matrices (ECM). This paper will focus on bioprinting utilizing hydrogels as 3D scaffolds. Hydrogels are advantageous for cell culture as they are highly permeable to cell culture media, nutrients, and waste products generated during metabolic cell processes. They have the ability to be fabricated in customized shapes with various material properties with dimensions at the micron scale. 3D hydrogels are a reliable method for biocompatible 3D printing and have applications in tissue engineering, drug screening, and organ on a chip models.

  14. DNA origami design of 3D nanostructures

    DEFF Research Database (Denmark)

    Andersen, Ebbe Sloth; Nielsen, Morten Muhlig

    2009-01-01

    , several dedicated 3D editors for computer-aided design of DNA structures have been developed [4-7]. However, many of these tools are not efficient for designing DNA origami structures that requires the design of more than 200 unique DNA strands to be folded along a scaffold strand into a defined 3D shape......Structural DNA nanotechnology has been heavily dependent on the development of dedicated software tools for the design of unique helical junctions, to define unique sticky-ends for tile assembly, and for predicting the products of the self-assembly reaction of multiple DNA strands [1-3]. Recently...... [8]. We have recently developed a semi-automated DNA origami software package [9] that uses a 2D sequence editor in conjunction with several automated tools to facilitate the design process. Here we extend the use of the program for designing DNA origami structures in 3D and show the application...

  15. Tuotekehitysprojekti: 3D-tulostin

    OpenAIRE

    Pihlajamäki, Janne

    2011-01-01

    Opinnäytetyössä tutustuttiin 3D-tulostamisen teknologiaan. Työssä käytiin läpi 3D-tulostimesta tehty tuotekehitysprojekti. Sen lisäksi esiteltiin yleisellä tasolla tuotekehitysprosessi ja syntyneiden tulosten mahdollisia suojausmenetelmiä. Tavoitteena tässä työssä oli kehittää markkinoilta jo löytyvää kotitulostin-tasoista 3D-laiteteknologiaa lähemmäksi ammattilaistason ratkaisua. Tavoitteeseen pyrittiin keskittymällä parantamaan laitteella saavutettavaa tulostustarkkuutta ja -nopeutt...

  16. Color 3D Reverse Engineering

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    This paper presents a principle and a method of col or 3D laser scanning measurement. Based on the fundamental monochrome 3D measureme nt study, color information capture, color texture mapping, coordinate computati on and other techniques are performed to achieve color 3D measurement. The syste m is designed and composed of a line laser light emitter, one color CCD camera, a motor-driven rotary filter, a circuit card and a computer. Two steps in captu ring object's images in the measurement process: Firs...

  17. Exploration of 3D Printing

    OpenAIRE

    Lin, Zeyu

    2014-01-01

    3D printing technology is introduced and defined in this Thesis. Some methods of 3D printing are illustrated and their principles are explained with pictures. Most of the essential parts are presented with pictures and their effects are explained within the whole system. Problems on Up! Plus 3D printer are solved and a DIY product is made with this machine. The processes of making product are recorded and the items which need to be noticed during the process are the highlight in this th...

  18. Handbook of 3D integration

    CERN Document Server

    Garrou , Philip; Ramm , Peter

    2014-01-01

    Edited by key figures in 3D integration and written by top authors from high-tech companies and renowned research institutions, this book covers the intricate details of 3D process technology.As such, the main focus is on silicon via formation, bonding and debonding, thinning, via reveal and backside processing, both from a technological and a materials science perspective. The last part of the book is concerned with assessing and enhancing the reliability of the 3D integrated devices, which is a prerequisite for the large-scale implementation of this emerging technology. Invaluable reading fo

  19. Miniaturized 3D microscope imaging system

    Science.gov (United States)

    Lan, Yung-Sung; Chang, Chir-Weei; Sung, Hsin-Yueh; Wang, Yen-Chang; Chang, Cheng-Yi

    2015-05-01

    We designed and assembled a portable 3-D miniature microscopic image system with the size of 35x35x105 mm3 . By integrating a microlens array (MLA) into the optical train of a handheld microscope, the biological specimen's image will be captured for ease of use in a single shot. With the light field raw data and program, the focal plane can be changed digitally and the 3-D image can be reconstructed after the image was taken. To localize an object in a 3-D volume, an automated data analysis algorithm to precisely distinguish profundity position is needed. The ability to create focal stacks from a single image allows moving or specimens to be recorded. Applying light field microscope algorithm to these focal stacks, a set of cross sections will be produced, which can be visualized using 3-D rendering. Furthermore, we have developed a series of design rules in order to enhance the pixel using efficiency and reduce the crosstalk between each microlens for obtain good image quality. In this paper, we demonstrate a handheld light field microscope (HLFM) to distinguish two different color fluorescence particles separated by a cover glass in a 600um range, show its focal stacks, and 3-D position.

  20. The Mathematical Method of Simplifying Level of Detail Model in 3D Game Programming%3D游戏程序设计中简化模型细节层次的数学方法

    Institute of Scientific and Technical Information of China (English)

    张云苑

    2013-01-01

    应用真实的角色和场景图形的表现,再现真实世界的3D游戏已成为电子游戏的主流产品。通过程序设计算法简化模型细节层次是实时3D图形学应用于游戏角色和场景图形表现的一项技术。该文论述了不同层次细节模型简化的数学方法,通过简化游戏中的角色或场景图形的复杂度,以满足实时3D渲染要求,达到真实世界3D游戏的表现效果。%The 3D video games which applying the performance of real characters and scenes graphs to produce the real world have become a mainstream products in the electronic game. Using algorithmic programming to simplify the level of detail model is a technique of applying the real-time 3D graphics into the gaming characters and scenes graphs. This article discusses mathe-matical approaches of different simplified level of details models, by simplifying the complexity of the game's role or the scene graph, it can meet real-time 3D rendering requirements and achieve real-world 3D gaming performance results .

  1. Facing competition: Neural mechanisms underlying parallel programming of antisaccades and prosaccades.

    Science.gov (United States)

    Talanow, Tobias; Kasparbauer, Anna-Maria; Steffens, Maria; Meyhöfer, Inga; Weber, Bernd; Smyrnis, Nikolaos; Ettinger, Ulrich

    2016-08-01

    The antisaccade task is a prominent tool to investigate the response inhibition component of cognitive control. Recent theoretical accounts explain performance in terms of parallel programming of exogenous and endogenous saccades, linked to the horse race metaphor. Previous studies have tested the hypothesis of competing saccade signals at the behavioral level by selectively slowing the programming of endogenous or exogenous processes e.g. by manipulating the probability of antisaccades in an experimental block. To gain a better understanding of inhibitory control processes in parallel saccade programming, we analyzed task-related eye movements and blood oxygenation level dependent (BOLD) responses obtained using functional magnetic resonance imaging (fMRI) at 3T from 16 healthy participants in a mixed antisaccade and prosaccade task. The frequency of antisaccade trials was manipulated across blocks of high (75%) and low (25%) antisaccade frequency. In blocks with high antisaccade frequency, antisaccade latencies were shorter and error rates lower whilst prosaccade latencies were longer and error rates were higher. At the level of BOLD, activations in the task-related saccade network (left inferior parietal lobe, right inferior parietal sulcus, left precentral gyrus reaching into left middle frontal gyrus and inferior frontal junction) and deactivations in components of the default mode network (bilateral temporal cortex, ventromedial prefrontal cortex) compensated increased cognitive control demands. These findings illustrate context dependent mechanisms underlying the coordination of competing decision signals in volitional gaze control.

  2. Eighth SIAM conference on parallel processing for scientific computing: Final program and abstracts

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-12-31

    This SIAM conference is the premier forum for developments in parallel numerical algorithms, a field that has seen very lively and fruitful developments over the past decade, and whose health is still robust. Themes for this conference were: combinatorial optimization; data-parallel languages; large-scale parallel applications; message-passing; molecular modeling; parallel I/O; parallel libraries; parallel software tools; parallel compilers; particle simulations; problem-solving environments; and sparse matrix computations.

  3. Conducting polymer 3D microelectrodes

    DEFF Research Database (Denmark)

    Sasso, Luigi; Vazquez, Patricia; Vedarethinam, Indumathi

    2010-01-01

    Conducting polymer 3D microelectrodes have been fabricated for possible future neurological applications. A combination of micro-fabrication techniques and chemical polymerization methods has been used to create pillar electrodes in polyaniline and polypyrrole. The thin polymer films obtained...

  4. 3-D Vector Flow Imaging

    DEFF Research Database (Denmark)

    Holbek, Simon

    For the last decade, the field of ultrasonic vector flow imaging has gotten an increasingly attention, as the technique offers a variety of new applications for screening and diagnostics of cardiovascular pathologies. The main purpose of this PhD project was therefore to advance the field of 3-D...... ultrasonic vector flow estimation and bring it a step closer to a clinical application. A method for high frame rate 3-D vector flow estimation in a plane using the transverse oscillation method combined with a 1024 channel 2-D matrix array is presented. The proposed method is validated both through phantom......, if this significant reduction in the element count can still provide precise and robust 3-D vector flow estimates in a plane. The study concludes that the RC array is capable of estimating precise 3-D vector flow both in a plane and in a volume, despite the low channel count. However, some inherent new challenges...

  5. PSH3D fast Poisson solver for petascale DNS

    Science.gov (United States)

    Adams, Darren; Dodd, Michael; Ferrante, Antonino

    2016-11-01

    Direct numerical simulation (DNS) of high Reynolds number, Re >= O (105) , turbulent flows requires computational meshes >= O (1012) grid points, and, thus, the use of petascale supercomputers. DNS often requires the solution of a Helmholtz (or Poisson) equation for pressure, which constitutes the bottleneck of the solver. We have developed a parallel solver of the Helmholtz equation in 3D, PSH3D. The numerical method underlying PSH3D combines a parallel 2D Fast Fourier transform in two spatial directions, and a parallel linear solver in the third direction. For computational meshes up to 81923 grid points, our numerical results show that PSH3D scales up to at least 262k cores of Cray XT5 (Blue Waters). PSH3D has a peak performance 6 × faster than 3D FFT-based methods when used with the 'partial-global' optimization, and for a 81923 mesh solves the Poisson equation in 1 sec using 128k cores. Also, we have verified that the use of PSH3D with the 'partial-global' optimization in our DNS solver does not reduce the accuracy of the numerical solution of the incompressible Navier-Stokes equations.

  6. Program Suite for Conceptual Designing of Parallel Mechanism-Based Robots and Machine Tools

    Directory of Open Access Journals (Sweden)

    Slobodan Tabaković

    2013-06-01

    This paper describes the categorization of criteria for the conceptual design of parallel mechanism‐based robots or machine tools, resulting from workspace analysis as well as the procedure of their defining. Furthermore, it also presents the designing methodology that was implemented into the program for the creation of a robot or machine tool space model and the optimization of the resulting solution. For verification of the criteria and the programme suite, three common (conceptually different mechanisms with a similar mechanical structure and kinematic characteristics were used.

  7. 3D Face Apperance Model

    DEFF Research Database (Denmark)

    Lading, Brian; Larsen, Rasmus; Astrom, K

    2006-01-01

    We build a 3D face shape model, including inter- and intra-shape variations, derive the analytical Jacobian of its resulting 2D rendered image, and show example of its fitting performance with light, pose, id, expression and texture variations......We build a 3D face shape model, including inter- and intra-shape variations, derive the analytical Jacobian of its resulting 2D rendered image, and show example of its fitting performance with light, pose, id, expression and texture variations...

  8. 3D Face Appearance Model

    DEFF Research Database (Denmark)

    Lading, Brian; Larsen, Rasmus; Åström, Kalle

    2006-01-01

    We build a 3d face shape model, including inter- and intra-shape variations, derive the analytical jacobian of its resulting 2d rendered image, and show example of its fitting performance with light, pose, id, expression and texture variations.}......We build a 3d face shape model, including inter- and intra-shape variations, derive the analytical jacobian of its resulting 2d rendered image, and show example of its fitting performance with light, pose, id, expression and texture variations.}...

  9. Main: TATCCAYMOTIFOSRAMY3D [PLACE

    Lifescience Database Archive (English)

    Full Text Available TATCCAYMOTIFOSRAMY3D S000256 01-August-2006 (last modified) kehi TATCCAY motif foun...d in rice (O.s.) RAmy3D alpha-amylase gene promoter; Y=T/C; a GATA motif as its antisense sequence; TATCCAY ...motif and G motif (see S000130) are responsible for sugar repression (Toyofuku et al. 1998); GATA; amylase; sugar; repression; rice (Oryza sativa) TATCCAY ...

  10. bioWeb3D: an online webGL 3D data visualisation tool.

    Science.gov (United States)

    Pettit, Jean-Baptiste; Marioni, John C

    2013-06-07

    Data visualization is critical for interpreting biological data. However, in practice it can prove to be a bottleneck for non trained researchers; this is especially true for three dimensional (3D) data representation. Whilst existing software can provide all necessary functionalities to represent and manipulate biological 3D datasets, very few are easily accessible (browser based), cross platform and accessible to non-expert users. An online HTML5/WebGL based 3D visualisation tool has been developed to allow biologists to quickly and easily view interactive and customizable three dimensional representations of their data along with multiple layers of information. Using the WebGL library Three.js written in Javascript, bioWeb3D allows the simultaneous visualisation of multiple large datasets inputted via a simple JSON, XML or CSV file, which can be read and analysed locally thanks to HTML5 capabilities. Using basic 3D representation techniques in a technologically innovative context, we provide a program that is not intended to compete with professional 3D representation software, but that instead enables a quick and intuitive representation of reasonably large 3D datasets.

  11. GPU-based rapid reconstruction of cellular 3D refractive index maps from tomographic phase microscopy (Conference Presentation)

    Science.gov (United States)

    Dardikman, Gili; Shaked, Natan T.

    2016-03-01

    We present highly parallel and efficient algorithms for real-time reconstruction of the quantitative three-dimensional (3-D) refractive-index maps of biological cells without labeling, as obtained from the interferometric projections acquired by tomographic phase microscopy (TPM). The new algorithms are implemented on the graphic processing unit (GPU) of the computer using CUDA programming environment. The reconstruction process includes two main parts. First, we used parallel complex wave-front reconstruction of the TPM-based interferometric projections acquired at various angles. The complex wave front reconstructions are done on the GPU in parallel, while minimizing the calculation time of the Fourier transforms and phase unwrapping needed. Next, we implemented on the GPU in parallel the 3-D refractive index map retrieval using the TPM filtered-back projection algorithm. The incorporation of algorithms that are inherently parallel with a programming environment such as Nvidia's CUDA makes it possible to obtain real-time processing rate, and enables high-throughput platform for label-free, 3-D cell visualization and diagnosis.

  12. Full Parallel Implementation of an All-Electron Four-Component Dirac-Kohn-Sham Program.

    Science.gov (United States)

    Rampino, Sergio; Belpassi, Leonardo; Tarantelli, Francesco; Storchi, Loriano

    2014-09-09

    A full distributed-memory implementation of the Dirac-Kohn-Sham (DKS) module of the program BERTHA (Belpassi et al., Phys. Chem. Chem. Phys. 2011, 13, 12368-12394) is presented, where the self-consistent field (SCF) procedure is replicated on all the parallel processes, each process working on subsets of the global matrices. The key feature of the implementation is an efficient procedure for switching between two matrix distribution schemes, one (integral-driven) optimal for the parallel computation of the matrix elements and another (block-cyclic) optimal for the parallel linear algebra operations. This approach, making both CPU-time and memory scalable with the number of processors used, virtually overcomes at once both time and memory barriers associated with DKS calculations. Performance, portability, and numerical stability of the code are illustrated on the basis of test calculations on three gold clusters of increasing size, an organometallic compound, and a perovskite model. The calculations are performed on a Beowulf and a BlueGene/Q system.

  13. Forward ramp in 3D

    Science.gov (United States)

    1997-01-01

    Mars Pathfinder's forward rover ramp can be seen successfully unfurled in this image, taken in stereo by the Imager for Mars Pathfinder (IMP) on Sol 3. 3D glasses are necessary to identify surface detail. This ramp was not used for the deployment of the microrover Sojourner, which occurred at the end of Sol 2. When this image was taken, Sojourner was still latched to one of the lander's petals, waiting for the command sequence that would execute its descent off of the lander's petal.The image helped Pathfinder scientists determine whether to deploy the rover using the forward or backward ramps and the nature of the first rover traverse. The metallic object at the lower left of the image is the lander's low-gain antenna. The square at the end of the ramp is one of the spacecraft's magnetic targets. Dust that accumulates on the magnetic targets will later be examined by Sojourner's Alpha Proton X-Ray Spectrometer instrument for chemical analysis. At right, a lander petal is visible.The IMP is a stereo imaging system with color capability provided by 24 selectable filters -- twelve filters per 'eye.' It stands 1.8 meters above the Martian surface, and has a resolution of two millimeters at a range of two meters.Mars Pathfinder is the second in NASA's Discovery program of low-cost spacecraft with highly focused science goals. The Jet Propulsion Laboratory, Pasadena, CA, developed and manages the Mars Pathfinder mission for NASA's Office of Space Science, Washington, D.C. JPL is an operating division of the California Institute of Technology (Caltech). The Imager for Mars Pathfinder (IMP) was developed by the University of Arizona Lunar and Planetary Laboratory under contract to JPL. Peter Smith is the Principal Investigator.Click below to see the left and right views individually. [figure removed for brevity, see original site] Left [figure removed for brevity, see original site] Right

  14. A Model for Managing 3D Printing Services in Academic Libraries

    Science.gov (United States)

    Scalfani, Vincent F.; Sahib, Josh

    2013-01-01

    The appearance of 3D printers in university libraries opens many opportunities for advancing outreach, teaching, and research programs. The University of Alabama (UA) Libraries recently adopted 3D printing technology and maintains an open access 3D Printing Studio. The Studio consists of a 3D printer, multiple 3D design workstations, and other…

  15. A Model for Managing 3D Printing Services in Academic Libraries

    Science.gov (United States)

    Scalfani, Vincent F.; Sahib, Josh

    2013-01-01

    The appearance of 3D printers in university libraries opens many opportunities for advancing outreach, teaching, and research programs. The University of Alabama (UA) Libraries recently adopted 3D printing technology and maintains an open access 3D Printing Studio. The Studio consists of a 3D printer, multiple 3D design workstations, and other…

  16. 3D fold growth in transpression

    Science.gov (United States)

    Frehner, Marcel

    2016-12-01

    Geological folds in transpression are inherently 3D structures; hence their growth and rotation behavior is studied using 3D numerical finite-element simulations. Upright single-layer buckle folds in Newtonian materials are considered, which grow from an initial point-like perturbation due to a combination of in-plane shortening and shearing (i.e., transpression). The resulting fold growth exhibits three components: (1) fold amplification (vertical), (2) fold elongation (parallel to fold axis), and (3) sequential fold growth (perpendicular to axial plane) of new anti- and synforms adjacent to the initial fold. Generally, the fold growth rates are smaller for shearing-dominated than for shortening-dominated transpression. In spite of the growth rate, the folding behavior is very similar for the different convergence angles. The two lateral directions always exhibit similar growth rates implying that the bulk fold structure occupies an increasing roughly circular area. Fold axes are always parallel to the major horizontal principal strain axis (λ→max, i.e., long axis of the horizontal finite strain ellipse), which is initially also parallel to the major horizontal instantaneous stretching axis (ISA→max). After initiation, the fold axes rotate together with λ→max. Sequential folds appearing later do not initiate parallel to ISA→max, but parallel to λ→max, i.e. parallel to the already existing folds, and also rotate with λ→max. Therefore, fold axes do not correspond to passive material lines and hinge migration takes place as a consequence. The fold axis orientation parallel to λ→max is independent of convergence angle and viscosity ratio. Therefore, a triangular relationship between convergence angle, amount of shortening, and fold axis orientation exists. If two of these values are known, the third can be determined. This relationship is applied to the Zagros fold-and-thrust-belt to estimate the degree of strain partitioning between the Simply

  17. Pembuatan Aplikasi Catalog 3D Desain Rumah Sebagai Sarana Promosi Dengan Menggunakan Unity 3D

    Directory of Open Access Journals (Sweden)

    Siryantini Nurul Adnin

    2016-03-01

    Full Text Available This study incorporate AR into a technology home Catalog sales, thus Catalog home is becoming more real with 3D objects in it. This research aims to produce an application that can display a 3D model of a house that can help buyers to know well the home to be purchased, and will simplify the home seller as a media campaign to consumers. 3D objects used to develop two kinds of Software that Sweet Home 3D and Blender, whereas to create application in programming used Unity 3D Software using the C # programming language. Application home design Catalog is made through several stages of design 3D objects, Marker workmanship and application design. The end result consists of two forms, namely in the form of physical (in the form of print media Catalog that contains a marker on some pages and Augmented Reality applications based on Android in the form of .apk which is then installed on Smartphones, where the two are complementary.

  18. MPML3D: Scripting Agents for the 3D Internet.

    Science.gov (United States)

    Prendinger, Helmut; Ullrich, Sebastian; Nakasone, Arturo; Ishizuka, Mitsuru

    2011-05-01

    The aim of this paper is two-fold. First, it describes a scripting language for specifying communicative behavior and interaction of computer-controlled agents ("bots") in the popular three-dimensional (3D) multiuser online world of "Second Life" and the emerging "OpenSimulator" project. While tools for designing avatars and in-world objects in Second Life exist, technology for nonprogrammer content creators of scenarios involving scripted agents is currently missing. Therefore, we have implemented new client software that controls bots based on the Multimodal Presentation Markup Language 3D (MPML3D), a highly expressive XML-based scripting language for controlling the verbal and nonverbal behavior of interacting animated agents. Second, the paper compares Second Life and OpenSimulator platforms and discusses the merits and limitations of each from the perspective of agent control. Here, we also conducted a small study that compares the network performance of both platforms.

  19. A comparison of distributed memory and virtual shared memory parallel programming models

    Energy Technology Data Exchange (ETDEWEB)

    Keane, J.A. [Univ. of Manchester (United Kingdom). Dept. of Computer Science; Grant, A.J. [Univ. of Manchester (United Kingdom). Computer Graphics Unit; Xu, M.Q. [Argonne National Lab., IL (United States)

    1993-04-01

    The virtues of the different parallel programming models, shared memory and distributed memory, have been much debated. Conventionally the debate could be reduced to programming convenience on the one hand, and high salability factors on the other. More recently the debate has become somewhat blurred with the provision of virtual shared memory models built on machines with physically distributed memory. The intention of such models/machines is to provide scalable shared memory, i.e. to provide both programmer convenience and high salability. In this paper, the different models are considered from experiences gained with a number of system ranging from applications in both commerce and science to languages and operating systems. Case studies are introduced as appropriate.

  20. 3D-mallinnus ja 3D-animaatiot biovoimalaitoksesta

    OpenAIRE

    Hiltula, Tytti

    2014-01-01

    Opinnäytetyössä tehtiin biovoimalaitoksen piirustuksista 3D-mallinnus ja animaatiot. Työn tarkoituksena oli saada valmiiksi Recwell Oy:lle markkinointiin tarkoitetut kuva- ja videomateriaalit. Työssä perehdyttiin 3D-mallintamisen perustietoihin ja lähtökohtiin sekä animaation laatimiseen. Työ laadittiin kokonaisuudessaan AutoCAD-ohjelmalla, ja työn aikana tutustuttiin huolellisesti myös ohjelman käyttöohjeisiin. Piirustusten mitoituksessa huomattiin jo alkuvaiheessa suuria puutteita, ...