WorldWideScience

Sample records for 3-d parallel program

  1. 3-D parallel program for numerical calculation of gas dynamics problems with heat conductivity on distributed memory computational systems (CS)

    Sofronov, I.D.; Voronin, B.L.; Butnev, O.I. [VNIIEF (Russian Federation)] [and others]

    1997-12-31

    The aim of this work is to develop a 3D parallel program for the numerical calculation of gas dynamics problems with heat conductivity on distributed memory computational systems (CS), such that the numerical results do not depend on the number of processors involved. Two fundamentally different approaches to structuring massively parallel computations have been developed. The first approach uses a 3D data matrix decomposition that is reconstructed during each temporal cycle; it extends parallelization algorithms developed for multiprocessor CS with shared memory. The second approach uses a 3D data matrix decomposition that is not reconstructed during a temporal cycle. The program was developed on the 8-processor CS MP-3 built at VNIIEF and was adapted to the massively parallel CS Meiko-2 at LLNL through the joint efforts of VNIIEF and LLNL staff. A large number of numerical experiments have been carried out with up to 256 processors, and the parallelization efficiency has been evaluated as a function of the number of processors and their parameters.

  2. Shared Memory Parallelism for 3D Cartesian Discrete Ordinates Solver

    Moustafa, Salli; Dutka-Malen, Ivan; Plagne, Laurent; Ponçot, Angélique; Ramet, Pierre

    2014-06-01

    This paper describes the design and the performance of DOMINO, a 3D Cartesian SN solver that implements two nested levels of parallelism (multicore+SIMD) on shared memory computation nodes. DOMINO is written in C++, a multi-paradigm programming language that enables the use of powerful and generic parallel programming tools such as Intel TBB and Eigen. These two libraries allow us to combine multi-thread parallelism with vector operations in an efficient and yet portable way. As a result, DOMINO can exploit the full power of modern multi-core processors and is able to tackle very large simulations, which usually require large HPC clusters, using a single computing node. For example, DOMINO solves a 3D full core PWR eigenvalue problem involving 26 energy groups, 288 angular directions (S16), 46 × 10^6 spatial cells and 1 × 10^12 DoFs within 11 hours on a single 32-core SMP node. This represents a sustained performance of 235 GFlops and 40.74% of the SMP node peak performance for the DOMINO sweep implementation. The very high Flops/Watt ratio of DOMINO makes it a very interesting building block for a future many-nodes nuclear simulation tool.

  3. Parallel Processor for 3D Recovery from Optical Flow

    Jose Hugo Barron-Zambrano

    2009-01-01

    3D recovery from motion has received considerable attention in computer vision in recent years. The main problem lies in the number of operations and memory accesses that most existing techniques require when translated to hardware or software implementations. This paper proposes a parallel processor for 3D recovery from optical flow. Its main features are maximum reuse of data and a low number of clock cycles to calculate the optical flow, along with the precision with which 3D recovery is achieved. The results of the proposed architecture as well as those from processor synthesis are presented.

  4. CALTRANS: A parallel, deterministic, 3D neutronics code

    Carson, L.; Ferguson, J.; Rogers, J.

    1994-04-01

    Our efforts to parallelize the deterministic solution of the neutron transport equation have culminated in a new neutronics code, CALTRANS, which has full 3D capability. In this article, we describe the layout and algorithms of CALTRANS and present performance measurements of the code on a variety of platforms. Explicit implementations of the parallel algorithms of CALTRANS using both the function calls of the Parallel Virtual Machine software package (PVM 3.2) and the Meiko CS-2 tagged message passing library (based on the Intel NX/2 interface) are provided in appendices.

  5. Programs Lucky and Lucky_C - 3D parallel transport codes for the multi-group transport equation solution for XYZ geometry by Pm Sn method

    Moriakov, A. [Russian Research Centre, Kurchatov Institute, Moscow (Russian Federation)]; Vasyukhno, V.; Netecha, M.; Khacheresov, G. [Research and Development Institute of Power Engineering, Moscow (Russian Federation)]

    2003-07-01

    Powerful supercomputers are available today. The MBC-1000M is a Russian supercomputer that can be used through remote access. The programs LUCKY and LUCKY_C were created for multiprocessor systems. Their algorithms were designed specifically for such computers and use MPI (Message Passing Interface) for exchanges between processors. LUCKY solves shielding problems by the multigroup discrete ordinates method; LUCKY_C solves criticality problems by the same method. Only orthogonal XYZ geometry is available. With sufficiently small spatial steps for approximating the discrete operator, this geometry can serve as a universal one for describing complex geometrical structures. Cross section libraries with Legendre polynomial expansions up to the P8 approximation are used, with nuclear data in GIT format. The programming language is Fortran-90. 'Vector' processors can be used, giving a time gain of up to 30 times; unfortunately, the MBC-1000M does not have such processors. Nevertheless, a sufficient parallel efficiency was obtained with 'space' (LUCKY) and 'space and energy' (LUCKY_C) parallelization. The AUTOCAD program is used to check the geometry after the input data have been processed. The programs have a powerful geometry module that can represent any geometry. Output results can be processed by graphics programs on a personal computer. (authors)

  6. 3-D Visualization on Workspace of Parallel Manipulators

    Tanaka, Yoshito; Yokomichi, Isao; Ishii, Junko; Makino, Toshiaki

    In parallel mechanisms, the form and volume of the workspace change with the attitude of the platform. This paper presents a method for finding the workspace of 6-DOF parallel mechanisms and for visualizing it in 3D. The workspace is taken to be the movable range of the central point of the platform when it moves with a given orientation. To find the workspace, a geometric analysis based on inverse kinematics is used. 2D plots of the calculations are compared with measurements from position sensors, and the test results agree well with the simulation results. The workspace variations are demonstrated in 3D and 2D plots for prototype mechanisms. The workspace plots are created with OpenGL and Visual C++ from an implementation of the algorithm. An application module is developed that displays the workspace of the mechanism as 3D images. The effectiveness and practicability of 3D workspace visualization are demonstrated on 6-DOF parallel mechanisms.

  7. Parallel Optimization of 3D Cardiac Electrophysiological Model Using GPU

    Yong Xia

    2015-01-01

    Large-scale 3D virtual heart model simulations are highly demanding in computational resources. This poses a major challenge for traditional CPU-based computing resources, which either cannot meet the overall computational demand or are not easily available because of their cost. The GPU as a parallel computing environment therefore provides an alternative for solving the large-scale computational problems of whole heart modeling. In this study, using a 3D sheep atrial model as a test bed, we developed a GPU-based simulation algorithm to simulate the conduction of electrical excitation waves in the 3D atria. In the GPU algorithm, a multicellular tissue model was split into two components: one is the single cell model (ordinary differential equation) and the other is the diffusion term of the monodomain model (partial differential equation). This decoupling enabled realization of the GPU parallel algorithm. Furthermore, several optimization strategies were proposed based on the features of the virtual heart model, which enabled a 200-fold speedup compared with a CPU implementation. In conclusion, an optimized GPU algorithm has been developed that provides an economical and powerful platform for 3D whole heart simulations.
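
    The split into a per-cell ODE step and a diffusion step can be illustrated with a minimal sketch in plain Python (not GPU code). Everything here is an assumption made for illustration and is not taken from the paper: the function names, the 1D cable, the cubic stand-in for the ionic current, the explicit Euler updates and the parameter values.

```python
def cubic_current(v):
    """Hypothetical stand-in for the single-cell ionic current."""
    return v * (v - 0.1) * (v - 1.0)


def reaction_step(v, dt, i_ion):
    """ODE part: every cell is updated independently (dV/dt = -I_ion(V))."""
    return [x - dt * i_ion(x) for x in v]


def diffusion_step(v, dt, d, dx):
    """PDE part: explicit 1D diffusion with no-flux boundaries.

    Each output cell needs only its two neighbours from the old state,
    so all cells could be updated concurrently.
    Explicit Euler is stable only if dt * d / dx**2 <= 0.5.
    """
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else v[i]       # mirrored ghost cell
        right = v[i + 1] if i < n - 1 else v[i]  # mirrored ghost cell
        out.append(v[i] + dt * d * (left - 2.0 * v[i] + right) / dx ** 2)
    return out


def split_step(v, dt, d, dx, i_ion=cubic_current):
    """One time step: reaction first, then diffusion (first-order splitting)."""
    return diffusion_step(reaction_step(v, dt, i_ion), dt, d, dx)


# Usage: a stimulated cell at the left end of a short resting cable.
state = [1.0, 0.0, 0.0, 0.0]
for _ in range(10):
    state = split_step(state, dt=0.1, d=1.0, dx=1.0)
```

    Because neither step reads values written during the same step, each one maps naturally onto one thread per cell, which is the property the decoupling is meant to expose.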

  8. Parallel OSEM Reconstruction Algorithm for Fully 3-D SPECT on a Beowulf Cluster.

    Rong, Zhou; Tianyu, Ma; Yongjie, Jin

    2005-01-01

    To improve the computation speed of the ordered subset expectation maximization (OSEM) algorithm for fully 3-D single photon emission computed tomography (SPECT) reconstruction, an experimental Beowulf-type cluster was built and several parallel reconstruction schemes were described. We implemented a single-program-multiple-data (SPMD) parallel 3-D OSEM reconstruction algorithm based on the message passing interface (MPI) and tested it with different numbers of calculating processors and different voxel grid sizes in reconstruction (64×64×64 and 128×128×128). Performance of parallelization was evaluated in terms of the speedup factor and parallel efficiency. This parallel implementation methodology is expected to help make fully 3-D OSEM algorithms more feasible in clinical SPECT studies.
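
    The two performance measures named above have standard definitions: speedup S = T_1 / T_p and parallel efficiency E = S / p, where T_1 is the serial run time and T_p the run time on p processors. A minimal sketch follows; the function names and the timings are hypothetical and are not results from the paper.

```python
def speedup(t_serial, t_parallel):
    """Speedup factor S = T_1 / T_p."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Parallel efficiency E = S / p."""
    if n_procs < 1:
        raise ValueError("need at least one processor")
    return speedup(t_serial, t_parallel) / n_procs


# Hypothetical wall-clock times in seconds, keyed by processor count.
timings = {1: 120.0, 2: 64.0, 4: 36.0, 8: 24.0}
table = {
    p: (speedup(timings[1], t), efficiency(timings[1], t, p))
    for p, t in timings.items()
}
```

    An efficiency that falls as p grows, as in these made-up numbers, usually points to communication or load imbalance taking a larger share of the run time.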

  9. Parallel tempering and 3D spin glass models

    Papakonstantinou, T.; Malakis, A.

    2014-03-01

    We review parallel tempering schemes and examine their main ingredients for accuracy and efficiency. We discuss two selection methods of temperatures and some alternatives for the exchange of replicas, including all-pair exchange methods. We measure specific heat errors and round-trip efficiency using the two-dimensional (2D) Ising model, and also test the efficiency for the ground state production in 3D spin glass models. We find that the optimization of the GS problem is highly influenced by the choice of the temperature range of the PT process. Finally, we present numerical evidence concerning the universality aspects of an anisotropic case of the 3D spin-glass model.
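
    For readers unfamiliar with replica exchange, the standard Metropolis rule accepts a swap between replicas i and j with probability min(1, exp((β_i − β_j)(E_i − E_j))). The sketch below applies it to adjacent temperature pairs only; it does not implement the all-pair exchange variants mentioned in the abstract, and the function names and data layout are assumptions for illustration.

```python
import math
import random


def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis acceptance probability for exchanging two replicas."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_adjacent_swaps(betas, energies, configs, rng, offset=0):
    """Try swaps between neighbours (k, k+1) for k = offset, offset+2, ...

    Configurations and their energies move between temperature slots in
    place; the inverse temperatures stay fixed. Returns the number of
    accepted swaps.
    """
    accepted = 0
    for k in range(offset, len(betas) - 1, 2):
        p = swap_probability(betas[k], betas[k + 1], energies[k], energies[k + 1])
        if rng.random() < p:
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
            configs[k], configs[k + 1] = configs[k + 1], configs[k]
            accepted += 1
    return accepted


# Usage: the colder replica (larger beta) holds the higher energy,
# so the swap is always accepted.
betas = [1.0, 0.5]
energies = [3.0, 1.0]
configs = ["a", "b"]
n_accepted = attempt_adjacent_swaps(betas, energies, configs, random.Random(0))
```

    Alternating `offset` between 0 and 1 on successive sweeps lets every neighbouring pair attempt an exchange.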

  10. Glnemo2: Interactive Visualization 3D Program

    Lambert, Jean-Charles

    2011-10-01

    Glnemo2 is an interactive 3D visualization program developed in C++ using the OpenGL library and Nokia QT 4.X API. It displays in 3D the particle positions of the different components of an N-body snapshot. It quickly gives a lot of information about the data (shape, density area, formation of structures such as spirals, bars, or peanuts). It allows for in/out zooms, rotations, changes of scale, translations, selection of different groups of particles and plots in different blending colors. It can color particles according to their density or temperature, play with the density threshold, trace orbits, display different time steps, take automatic screenshots to make movies, select particles using the mouse, and fly over a simulation using a given camera path. All these features are accessible from a very intuitive graphical user interface. Glnemo2 supports a wide range of input file formats (Nemo, Gadget 1 and 2, phiGrape, Ramses, list of files, realtime gyrfalcON simulation), which are automatically detected at loading time without user intervention. Glnemo2 uses a plugin mechanism to load the data, so that it is easy to add a new file reader. It is powered by a 3D engine which uses the latest OpenGL technology, such as shaders (GLSL), vertex buffer objects and frame buffer objects, and takes into account the power of the graphics card in order to accelerate the rendering. With a fast GPU, millions of particles can be rendered in real time. Glnemo2 runs on Linux, Windows (using the MinGW compiler), and Mac OS X, thanks to the QT4 API.

  11. 3D data denoising via Nonlocal Means filter by using parallel GPU strategies.

    Cuomo, Salvatore; De Michele, Pasquale; Piccialli, Francesco

    2014-01-01

    The Nonlocal Means (NLM) algorithm is widely considered a state-of-the-art denoising filter in many research fields. Its high computational complexity has led researchers to develop parallel programming approaches and to use massively parallel architectures such as GPUs. In recent years, GPU devices have made it possible to achieve reasonable running times by filtering 3D datasets slice by slice with a 2D NLM algorithm. In our approach we design and implement a fully 3D Nonlocal Means parallel approach, adopting different algorithm mapping strategies on a GPU architecture and a multi-GPU framework, in order to demonstrate its high applicability and scalability. The experimental results support the usability of our approach in a wide range of application scenarios such as magnetic resonance imaging (MRI) or video sequence denoising.

  12. DPGL: The Direct3D9-based Parallel Graphics Library for Multi-display Environment

    Zhen Liu; Jiao-Ying Shi

    2007-01-01

    The emergence of high performance 3D graphics cards has opened the way to using PC clusters for high performance multi-display environments. To exploit the rendering ability of PC clusters, appropriate parallel rendering algorithms and parallel graphics library interfaces are needed. Given the rapid development of Direct3D, we propose DPGL, the Direct3D9-based parallel graphics library in the D3DPR parallel rendering system, which implements Direct3D9 interfaces so that existing Direct3D9 applications can be parallelized without modification. Based on an analysis of the parallelism in the Direct3D9 rendering pipeline, we briefly introduce the D3DPR parallel rendering system. DPGL is the fundamental component of D3DPR. After presenting the three-layer architecture of DPGL, we discuss rendering resource interception and management. Finally, we describe the design and implementation of DPGL in detail, including the rendering command interception layer, the rendering command interpretation layer and the rendering resource parallelization layer.

  13. Developing Parallel Programs

    Ranjan Sen

    2012-09-01

    Parallel programming is an extension of sequential programming; today, it is becoming the mainstream paradigm in day-to-day information processing. Its aim is to build the fastest programs on parallel computers. The methodologies for developing a parallel program can be put into integrated frameworks. Development focuses on algorithms, languages, and how the program is deployed on the parallel computer.

  14. Introduction to parallel programming

    Brawer, Steven

    1989-01-01

    Introduction to Parallel Programming focuses on the techniques, processes, methodologies, and approaches involved in parallel programming. The book first offers information on Fortran, hardware and operating system models, and processes, shared memory, and simple parallel programs. Discussions focus on processes and processors, joining processes, shared memory, time-sharing with multiple processors, hardware, loops, passing arguments in function/subroutine calls, program structure, and arithmetic expressions. The text then elaborates on basic parallel programming techniques, barriers and race

  15. Study of improved ray tracing parallel algorithm for CGH of 3D objects on GPU

    Cong, Bin; Jiang, Xiaoyu; Yao, Jun; Zhao, Kai

    2014-11-01

    An improved parallel algorithm for computing holograms of three-dimensional objects is presented. Based on the physical characteristics and mathematical properties of the original ray tracing algorithm for computer generated holograms (CGH), and using transform approximation and numerical analysis methods, we extract the parts of the ray tracing algorithm that are suitable for parallelization and implement them on a graphics processing unit (GPU). Through proper design of the parallel numerical procedure, we process the two-dimensional slices of the three-dimensional object in parallel with CUDA. Based on the experiments, an effective method for handling the occlusion problem in ray tracing is proposed, together with a method for generating holograms of 3D objects with the additive property. Our results indicate that the improved algorithm effectively shortens the computing time. Depending on the number of spatial object points and hologram pixels, the speedup over the original ray tracing algorithm ranges from 20 to 70 times.

  16. On the Parallel Design and Analysis for 3-D ADI Telegraph Problem with MPI

    Simon Uzezi Ewedafe

    2014-05-01

    In this paper we describe the 3-D Telegraph Equation (3-DTEL), solved with the Alternating Direction Implicit (ADI) method on the Geranium Cadcam Cluster (GCC) using Message Passing Interface (MPI) parallel software. The algorithm is presented using the Single Program Multiple Data (SPMD) technique. The implementation is discussed in terms of parallel design and analysis with a Domain Decomposition (DD) strategy. The 3-DTEL with the ADI scheme is implemented on the GCC cluster with the objective of evaluating the overhead it introduces and its ability to exploit the inherent parallelism of the computation. Results of the parallel experiments are presented. The speedup and efficiency measured for different block sizes agree with the theoretical analysis.

  17. Parallel Simulation of 3-D Turbulent Flow Through Hydraulic Machinery

    徐宇; 吴玉林

    2003-01-01

    Parallel calculational methods were used to analyze incompressible turbulent flow through hydraulic machinery. Two parallel methods were used to simulate the complex flow field. The space decomposition method divides the computational domain into several sub-ranges. Parallel discrete event simulation divides the whole task into several parts according to their functions. The simulation results were compared with the serial simulation results and particle image velocimetry (PIV) experimental results. The results give the distribution and configuration of the complex vortices and illustrate the effectiveness of the parallel algorithms for numerical simulation of turbulent flows.
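
    As an illustration of the space decomposition idea, the sketch below splits a 1D range of grid cells into contiguous sub-ranges of nearly equal size, one per worker. It is a generic block partition under assumed names, not the authors' decomposition, which the abstract does not specify.

```python
def block_ranges(n_cells, n_parts):
    """Split range(n_cells) into n_parts contiguous (start, stop) blocks.

    Block sizes differ by at most one cell; the first n_cells % n_parts
    blocks receive the extra cell.
    """
    if n_parts < 1:
        raise ValueError("need at least one part")
    base, extra = divmod(n_cells, n_parts)
    ranges = []
    start = 0
    for rank in range(n_parts):
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


# Usage: 10 cells along one axis shared among 3 workers.
parts = block_ranges(10, 3)
```

    For a 3D domain the same function can be applied to each axis separately, giving every worker a box-shaped sub-domain.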

  18. Parallel Hall effect from 3D single-component metamaterials

    Kern, Christian; Wegener, Martin

    2015-01-01

    We propose a class of three-dimensional metamaterial architectures composed of a single doped semiconductor (e.g., n-Si) in air or vacuum that lead to unusual effective behavior of the classical Hall effect. Using an anisotropic structure, we numerically demonstrate a Hall voltage that is parallel, rather than orthogonal, to the external static magnetic-field vector ("parallel Hall effect"). The sign of this parallel Hall voltage can be determined by a structure parameter. Together with the previously demonstrated positive or negative orthogonal Hall voltage, we demonstrate four different sign combinations

  19. Parallel magnetohydrodynamics on the Cray T3D

    Meijer, P. M.; Poedts, S.; Goedbloed, J. P.

    1996-01-01

    The equations of magnetohydrodynamics (MHD) are discussed in the framework of parallel computing. Both linear and nonlinear MHD models are addressed. Special attention is given to the parallelisation of the kernels of the existing sequential MHD codes. These kernels involve matrix-vector multiplica

  20. A Parallel Sweeping Preconditioner for Heterogeneous 3D Helmholtz Equations

    Poulson, Jack

    2013-05-02

    A parallelization of a sweeping preconditioner for three-dimensional Helmholtz equations without large cavities is introduced and benchmarked for several challenging velocity models. The setup and application costs of the sequential preconditioner are shown to be O(γ^2 N^(4/3)) and O(γ N log N), where γ(ω) denotes the modestly frequency-dependent number of grid points per perfectly matched layer. Several computational and memory improvements are introduced relative to using black-box sparse-direct solvers for the auxiliary problems, and competitive runtimes and iteration counts are reported for high-frequency problems distributed over thousands of cores. Two open-source packages are released along with this paper: Parallel Sweeping Preconditioner (PSP) and the underlying distributed multifrontal solver, Clique. © 2013 Society for Industrial and Applied Mathematics.

  1. Parallel deterministic neutronics with AMR in 3D

    Clouse, C.; Ferguson, J.; Hendrickson, C. [Lawrence Livermore National Lab., CA (United States)]

    1997-12-31

    AMTRAN, a three-dimensional Sn neutronics code with adaptive mesh refinement (AMR), has been parallelized over spatial domains and energy groups and runs on the Meiko CS-2 with MPI message passing. Block refined AMR is used with linear finite element representations for the fluxes, which allows for a straightforward interpretation of fluxes at block interfaces with zoning differences. The load balancing algorithm assumes 8 spatial domains, which minimizes idle time among processors.

  2. A STUDY ON USING 3D VISUALIZATION AND SIMULATION PROGRAM (OPTITEX 3D) ON LEATHER APPAREL

    Ork Nilay

    2016-05-01

    Leather is a luxury garment material. Design, material, labor, fitting and time costs strongly affect the production cost of consumer leather goods. 3D visualization and simulation programs, which are becoming popular in the textile industry, can be used to save material, labor and time in leather apparel. However, these programs have had very limited use in the leather industry because leather material databases are not as comprehensive as those for textiles. In this research, the material properties of leather and textile fabric were first determined using both textile and leather physical test methods, then interpreted and entered into the program. Detailed measurements of a test subject's body were taken with a 3D body scanner, and an avatar was designed according to these measurements. A prototype dress was then made, using a computer-aided design (CAD) program to design the patterns. After pattern making, the OptiTex 3D visualization and simulation program was used to visualize and simulate the dresses. In addition, the leather and cotton fabric dresses were sewn in real life, and the virtual and real dresses were compared and discussed. 3D virtual prototyping appears promising for future manufacturing technologies: it allows the fit of garments to be evaluated simply and quickly, fills the gap between 3D pattern design and manufacturing, and provides virtual demonstrations to customers.

  3. Parallel programming with PCN

    Foster, I.; Tuecke, S.

    1991-12-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous FTP from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (cf. Appendix A).

  4. Programming structure into 3D nanomaterials

    Dara Van Gough

    2009-06-01

    Programming three dimensional nanostructures into materials is becoming increasingly important given the need for ever more highly functional solids. Applications for materials with complex programmed structures include solar energy harvesting, energy storage, molecular separation, sensors, pharmaceutical agent delivery, nanoreactors and advanced optical devices. Here we discuss examples of molecular and optical routes to program the structure of three-dimensional nanomaterials with exquisite control over nanomorphology and the resultant properties and conclude with a discussion of the opportunities and challenges of such an approach.

  5. Parallel processing for efficient 3D slope stability modelling

    Marchesini, Ivan; Mergili, Martin; Alvioli, Massimiliano; Metz, Markus; Schneider-Muntau, Barbara; Rossi, Mauro; Guzzetti, Fausto

    2014-05-01

    We test the performance of the GIS-based, three-dimensional slope stability model r.slope.stability. The model was developed as a C- and python-based raster module of the GRASS GIS software. It considers the three-dimensional geometry of the sliding surface, adopting a modification of the model proposed by Hovland (1977), and revised and extended by Xie and co-workers (2006). Given a terrain elevation map and a set of relevant thematic layers, the model evaluates the stability of slopes for a large number of randomly selected potential slip surfaces, ellipsoidal or truncated in shape. Any single raster cell may be intersected by multiple sliding surfaces, each associated with a value of the factor of safety, FS. For each pixel, the minimum value of FS and the depth of the associated slip surface are stored. This information is used to obtain a spatial overview of the potentially unstable slopes in the study area. We test the model in the Collazzone area, Umbria, central Italy, an area known to be susceptible to landslides of different type and size. Availability of a comprehensive and detailed landslide inventory map allowed for a critical evaluation of the model results. The r.slope.stability code automatically splits the study area into a defined number of tiles, with proper overlap in order to provide the same statistical significance for the entire study area. The tiles are then processed in parallel by a given number of processors, exploiting a multi-purpose computing environment at CNR IRPI, Perugia. The map of the FS is obtained collecting the individual results, taking the minimum values on the overlapping cells. This procedure significantly reduces the processing time. We show how the gain in terms of processing time depends on the tile dimensions and on the number of cores.
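
    The tiling scheme described above (overlapping tiles processed independently, then merged by taking the minimum FS on shared cells) can be sketched as follows. This is an illustrative outline in plain Python, not code from r.slope.stability; the function names, the tile layout and the sample values are assumptions.

```python
import math


def tile_bounds(n, tile, overlap):
    """(start, stop) index ranges covering range(n), adjacent ranges
    sharing `overlap` cells."""
    if tile <= overlap:
        raise ValueError("tile size must exceed the overlap")
    bounds = []
    start = 0
    while True:
        stop = min(start + tile, n)
        bounds.append((start, stop))
        if stop == n:
            return bounds
        start = stop - overlap


def merge_min(shape, tiles):
    """Combine per-tile rasters into one map, keeping the minimum value
    wherever tiles overlap.

    `tiles` is an iterable of ((row0, col0), block) pairs, where `block`
    is a list of rows and (row0, col0) is its upper-left corner.
    """
    rows, cols = shape
    merged = [[math.inf] * cols for _ in range(rows)]
    for (row0, col0), block in tiles:
        for i, row in enumerate(block):
            for j, value in enumerate(row):
                r, c = row0 + i, col0 + j
                merged[r][c] = min(merged[r][c], value)
    return merged


# Usage with made-up factor-of-safety values: two tiles of a 1 x 4
# raster overlap on column 2.
tiles = [((0, 0), [[2.0, 2.0, 2.0]]), ((0, 2), [[1.5, 3.0]])]
fs_map = merge_min((1, 4), tiles)
```

    Since each tile depends only on its own input window, the per-tile computation can be handed to separate worker processes, and only the inexpensive merge has to run serially.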

  6. Time efficient 3-D electromagnetic modeling on massively parallel computers

    Alumbaugh, D.L.; Newman, G.A.

    1995-08-01

    A numerical modeling algorithm has been developed to simulate the electromagnetic response of a three dimensional earth to a dipole source for frequencies ranging from 100 to 100MHz. The numerical problem is formulated in terms of a frequency-domain modified vector Helmholtz equation for the scattered electric fields. The resulting differential equation is approximated using a staggered finite difference grid, which results in a linear system of equations for which the matrix is sparse and complex symmetric. The system of equations is solved using a preconditioned quasi-minimum-residual method. Dirichlet boundary conditions are employed at the edges of the mesh by setting the tangential electric fields equal to zero. At frequencies less than 1MHz, normal grid stretching is employed to mitigate unwanted reflections off the grid boundaries. For frequencies greater than this, absorbing boundary conditions must be employed by making the stretching parameters of the modified vector Helmholtz equation complex, which introduces loss at the boundaries. To allow for faster calculation of realistic models, the original serial version of the code has been modified to run on a massively parallel architecture. This modification involves three distinct tasks: (1) mapping the finite difference stencil to a processor stencil, which allows the necessary information to be exchanged between processors that contain adjacent nodes in the model; (2) determining the most efficient method to input the model, which is accomplished by dividing the input into "global" and "local" data and then reading the two sets in differently; and (3) deciding how to output the data, which is an inherently nonparallel process.

  7. Parallel Programming Paradigms

    1987-07-01

    Parallel Programming Paradigms. Philip Arne Nelson, Department of Computer... ...8416878 and by the Office of Naval Research Contracts No. N00014-86-K-0264 and No. N00014-85-K-0328.

  8. ERROR ANALYSIS OF 3D DETECTING SYSTEM BASED ON WHOLE-FIELD PARALLEL CONFOCAL MICROSCOPE

    Wang Yonghong; Yu Xiaofen

    2005-01-01

    The factors that influence the characteristics of a multi-beam parallel confocal system are discussed in comparison with traditional scanning confocal microscopy, and the error factors in the multi-beam parallel confocal system are analyzed. The construction and working principle of the non-scanning 3D detecting system are introduced, and experimental results confirm the effect of these factors on the detecting system.

  9. Parallel Programming with Intel Parallel Studio XE

    Blair-Chappell, Stephen

    2012-01-01

    Optimize code for multi-core processors with Intel's Parallel Studio. Parallel programming is rapidly becoming a "must-know" skill for developers. Yet, where to start? This teach-yourself tutorial is an ideal starting point for developers who already know Windows C and C++ and are eager to add parallelism to their code. With a focus on applying tools, techniques, and language extensions to implement parallelism, this essential resource teaches you how to write programs for multicore and leverage the power of multicore in your programs. Sharing hands-on case studies and real-world examples, the

  10. Parallel Finite Element Solution of 3D Rayleigh-Benard-Marangoni Flows

    Carey, G. F.; McLay, R.; Bicken, G.; Barth, B.; Pehlivanov, A.

    1999-01-01

    A domain decomposition strategy and parallel gradient-type iterative solution scheme have been developed and implemented for computation of complex 3D viscous flow problems involving heat transfer and surface tension effects. Details of the implementation issues are described together with associated performance and scalability studies. Representative Rayleigh-Benard and microgravity Marangoni flow calculations and performance results on the Cray T3D and T3E are presented. The work is currently being extended to tightly-coupled parallel "Beowulf-type" PC clusters and we present some preliminary performance results on this platform. We also describe progress on related work on hierarchic data extraction for visualization.

  11. VPython: Writing Real-time 3D Physics Programs

    Chabay, Ruth

    2001-06-01

    VPython (http://cil.andrew.cmu.edu/projects/visual) combines the Python programming language with an innovative 3D graphics module called Visual, developed by David Scherer. Designed to make 3D physics simulations accessible to novice programmers, VPython allows the programmer to write a purely computational program without any graphics code, and produces an interactive realtime 3D graphical display. In a program 3D objects are created and their positions modified by computational algorithms. Running in a separate thread, the Visual module monitors the positions of these objects and renders them many times per second. Using the mouse, one can zoom and rotate to navigate through the scene. After one hour of instruction, students in an introductory physics course at Carnegie Mellon University, including those who have never programmed before, write programs in VPython to model the behavior of physical systems and to visualize fields in 3D. The Numeric array processing module allows the construction of more sophisticated simulations and models as well. VPython is free and open source. The Visual module is based on OpenGL, and runs on Windows, Linux, and Macintosh.

  12. Reconstruction for Time-Domain In Vivo EPR 3D Multigradient Oximetric Imaging—A Parallel Processing Perspective

    Christopher D. Dharmaraj

    2009-01-01

    Three-dimensional Oximetric Electron Paramagnetic Resonance Imaging using the Single Point Imaging modality generates unpaired spin density and oxygen images that can readily distinguish between normal and tumor tissues in small animals. It is also possible with fast imaging to track the changes in tissue oxygenation in response to the oxygen content in the breathing air. However, this involves dealing with gigabytes of data for each 3D oximetric imaging experiment involving digital band pass filtering and background noise subtraction, followed by 3D Fourier reconstruction. This process is rather slow in a conventional uniprocessor system. This paper presents a parallelization framework using OpenMP runtime support and parallel MATLAB to execute such computationally intensive programs. The Intel compiler is used to develop a parallel C++ code based on OpenMP. The code is executed on four Dual-Core AMD Opteron shared memory processors, to reduce the computational burden of the filtration task significantly. The results show that the parallel code for filtration has achieved a speed up factor of 46.66 as against the equivalent serial MATLAB code. In addition, a parallel MATLAB code has been developed to perform 3D Fourier reconstruction. Speedup factors of 4.57 and 4.25 have been achieved during the reconstruction process and oximetry computation, for a data set with 23×23×23 gradient steps. The execution time has been computed for both the serial and parallel implementations using different dimensions of the data and presented for comparison. The reported system has been designed to be easily accessible even from low-cost personal computers through local internet (NIHnet). The experimental results demonstrate that the parallel computing provides a source of high computational power to obtain biophysical parameters from 3D EPR oximetric imaging, almost in real-time.

  13. Reconstruction for time-domain in vivo EPR 3D multigradient oximetric imaging--a parallel processing perspective.

    Dharmaraj, Christopher D; Thadikonda, Kishan; Fletcher, Anthony R; Doan, Phuc N; Devasahayam, Nallathamby; Matsumoto, Shingo; Johnson, Calvin A; Cook, John A; Mitchell, James B; Subramanian, Sankaran; Krishna, Murali C

    2009-01-01

    Three-dimensional Oximetric Electron Paramagnetic Resonance Imaging using the Single Point Imaging modality generates unpaired spin density and oxygen images that can readily distinguish between normal and tumor tissues in small animals. It is also possible with fast imaging to track the changes in tissue oxygenation in response to the oxygen content in the breathing air. However, this involves dealing with gigabytes of data for each 3D oximetric imaging experiment involving digital band pass filtering and background noise subtraction, followed by 3D Fourier reconstruction. This process is rather slow in a conventional uniprocessor system. This paper presents a parallelization framework using OpenMP runtime support and parallel MATLAB to execute such computationally intensive programs. The Intel compiler is used to develop a parallel C++ code based on OpenMP. The code is executed on four Dual-Core AMD Opteron shared memory processors, to reduce the computational burden of the filtration task significantly. The results show that the parallel code for filtration has achieved a speed up factor of 46.66 as against the equivalent serial MATLAB code. In addition, a parallel MATLAB code has been developed to perform 3D Fourier reconstruction. Speedup factors of 4.57 and 4.25 have been achieved during the reconstruction process and oximetry computation, for a data set with 23 x 23 x 23 gradient steps. The execution time has been computed for both the serial and parallel implementations using different dimensions of the data and presented for comparison. The reported system has been designed to be easily accessible even from low-cost personal computers through local internet (NIHnet). The experimental results demonstrate that the parallel computing provides a source of high computational power to obtain biophysical parameters from 3D EPR oximetric imaging, almost in real-time.

  14. RELAP5-3D Developer Guidelines and Programming Practices

    Dr. George L Mesina

    2014-03-01

    Our ultimate goal is to create and maintain RELAP5-3D as the best software tool available to analyze nuclear power plants. This begins with writing excellent code and requires thorough testing. This document covers development of RELAP5-3D software, the behavior of the RELAP5-3D program that must be maintained, and code testing. RELAP5-3D must perform in a manner consistent with previous code versions with backward compatibility for the sake of the users. Thus file operations, code termination, input and output must remain consistent in form and content while adding appropriate new files, input and output as new features are developed. As computer hardware, operating systems, and other software change, RELAP5-3D must adapt and maintain performance. The code must be thoroughly tested to ensure that it continues to perform robustly on the supported platforms. The coding must be written in a consistent manner that makes the program easy to read to reduce the time and cost of development, maintenance and error resolution. The programming guidelines presented here are intended to institutionalize a consistent way of writing FORTRAN code for the RELAP5-3D computer program that will minimize errors and rework. A common format and organization of program units creates a unifying look and feel to the code. This in turn increases readability and reduces time required for maintenance, development and debugging. It also aids new programmers in reading and understanding the program. Therefore, when undertaking development of the RELAP5-3D computer program, the programmer must write computer code that follows these guidelines. This set of programming guidelines creates a framework of good programming practices, such as initialization, structured programming, and vector-friendly coding. It sets out formatting rules for lines of code, such as indentation, capitalization, spacing, etc. It creates limits on program units, such as subprograms, functions, and modules. It

  15. Programming standards for effective S-3D game development

    Schneider, Neil; Matveev, Alexander

    2008-02-01

    When a video game is in development, more often than not it is being rendered in three dimensions - complete with volumetric depth. It's the PC monitor that is taking this three-dimensional information, and artificially displaying it in a flat, two-dimensional format. Stereoscopic drivers take the three-dimensional information captured from DirectX and OpenGL calls and properly display it with a unique left and right sided view for each eye so a proper stereoscopic 3D image can be seen by the gamer. The two-dimensional limitation of how information is displayed on screen has encouraged programming short-cuts and work-arounds that stifle this stereoscopic 3D effect, and the purpose of this guide is to outline techniques to get the best of both worlds. While the programming requirements do not significantly add to the game development time, following these guidelines will greatly enhance your customer's stereoscopic 3D experience, increase your likelihood of earning Meant to be Seen certification, and give you instant cost-free access to the industry's most valued consumer base. While this outline is mostly based on NVIDIA's programming guide and iZ3D resources, it is designed to work with all stereoscopic 3D hardware solutions and is not proprietary in any way.

  16. Parallel programming with MPI

    Tatebe, Osamu [Electrotechnical Lab., Tsukuba, Ibaraki (Japan)]

    1998-03-01

    MPI is a practical, portable, efficient and flexible standard for message passing, which has been implemented on most MPPs and networks of workstations by machine vendors, universities and national laboratories. To achieve efficiency as well as portability, MPI avoids specifying how operations will take place and avoids superfluous work; it is also designed to encourage overlapping communication and computation to hide communication latencies. This presentation briefly explains the MPI standard, and comments on efficient parallel programming to improve performance. (author)

  17. 2D/3D Program work summary report

    NONE

    1995-09-01

    The 2D/3D Program was carried out by Germany, Japan and the United States to investigate the thermal-hydraulics of a PWR large-break LOCA. A contributory approach was utilized in which each country contributed significant effort to the program and all three countries shared the research results. Germany constructed and operated the Upper Plenum Test Facility (UPTF), and Japan constructed and operated the Cylindrical Core Test Facility (CCTF) and the Slab Core Test Facility (SCTF). The US contribution consisted of provision of advanced instrumentation to each of the three test facilities, and assessment of the TRAC computer code against the test results. Evaluations of the test results were carried out in all three countries. This report summarizes the 2D/3D Program in terms of the contributing efforts of the participants, and was prepared in coordination among the three countries. The US and Germany have published the report as NUREG/IA-0126 and GRS-100, respectively. (author).

  18. Fast 3D Variable-FOV Reconstruction for Parallel Imaging with Localized Sensitivities

    Can, Yiğit Baran; Çukur, Tolga

    2016-01-01

    Several successful iterative approaches have recently been proposed for parallel-imaging reconstructions of variable-density (VD) acquisitions, but they often induce substantial computational burden for non-Cartesian data. Here we propose a generalized variable-FOV PILS reconstruction for 3D VD Cartesian and non-Cartesian data. The proposed method separates k-space into non-intersecting annuli based on sampling density, and sets the 3D reconstruction FOV for each annulus based on the respective sampling density. The variable-FOV method is compared against conventional gridding, PILS, and ESPIRiT reconstructions. Results indicate that the proposed method yields better artifact suppression compared to gridding and PILS, and improves noise conditioning relative to ESPIRiT, enabling fast and high-quality reconstructions of 3D datasets.
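    The annulus idea can be illustrated with a small sketch. It is not the authors' algorithm: the annulus edges, the per-annulus sample spacings, and the function names are hypothetical, and the only relation assumed is the standard Nyquist one, in which a k-space sample spacing of Δk supports a field of view of 1/Δk.

```python
import bisect


def assign_annuli(radii, edges):
    """Map each k-space radius to the annulus [edges[i], edges[i+1]) containing it."""
    return [bisect.bisect_right(edges, r) - 1 for r in radii]


def supported_fov(delta_k):
    """Nyquist relation: sample spacing delta_k (1/mm) supports FOV = 1/delta_k (mm)."""
    return 1.0 / delta_k


# Hypothetical variable-density trajectory: spacing doubles in each outer annulus.
edges = [0.0, 0.1, 0.2, 0.3]        # annulus boundaries in 1/mm (assumed)
spacing = [0.004, 0.008, 0.016]     # sample spacing per annulus in 1/mm (assumed)

fov_per_annulus = [supported_fov(dk) for dk in spacing]
labels = assign_annuli([0.05, 0.15, 0.29], edges)

for i, fov in enumerate(fov_per_annulus):
    print(f"annulus {i}: reconstruction FOV of about {fov:.1f} mm")
```

    Densely sampled inner annuli support a large FOV, while sparsely sampled outer annuli support only a smaller one, which is the motivation for choosing the FOV per annulus.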

  19. Gust Acoustics Computation with a Space-Time CE/SE Parallel 3D Solver

    Wang, X. Y.; Himansu, A.; Chang, S. C.; Jorgenson, P. C. E.; Reddy, D. R. (Technical Monitor)

    2002-01-01

    The benchmark Problem 2 in Category 3 of the Third Computational Aero-Acoustics (CAA) Workshop is solved using the space-time conservation element and solution element (CE/SE) method. This problem concerns the unsteady response of an isolated finite-span swept flat-plate airfoil bounded by two parallel walls to an incident gust. The acoustic field generated by the interaction of the gust with the flat-plate airfoil is computed by solving the 3D (three-dimensional) Euler equations in the time domain using a parallel version of a 3D CE/SE solver. The effect of the gust orientation on the far-field directivity is studied. Numerical solutions are presented and compared with analytical solutions, showing a reasonable agreement.

  20. Stiffness Analysis of 3-d.o.f. Overconstrained Translational Parallel Manipulators

    Pashkevich, Anatoly; Wenger, Philippe

    2008-01-01

    The paper presents a new stiffness modelling method for overconstrained parallel manipulators, which is applied to 3-d.o.f. translational mechanisms. It is based on a multidimensional lumped-parameter model that replaces the link flexibility by localized 6-d.o.f. virtual springs. In contrast to other works, the method includes a FEA-based link stiffness evaluation and employs a new solution strategy of the kinetostatic equations, which allows computing the stiffness matrix for the overconstrained architectures and for the singular manipulator postures. The advantages of the developed technique are confirmed by application examples, which deal with comparative stiffness analysis of two translational parallel manipulators.

  1. Architectural Adaptability in Parallel Programming

    1991-05-01

    AD-A247 516 Architectural Adaptability in Parallel Programming Lawrence Alan Crowl Technical Report 381 May 1991 92-06322 UNIVERSITY OF ROC R...COMPUTER SCIENCE Best Available Copy Architectural Adaptability in Parallel Programming by Lawrence Alan Crowl Submitted in Partial Fulfillment of the...in the development of their programs. In applying abstraction to parallel programming, we can use abstractions to represent potential parallelism

  2. An improved parallel SPH approach to solve 3D transient generalized Newtonian free surface flows

    Ren, Jinlian; Jiang, Tao; Lu, Weigang; Li, Gang

    2016-08-01

    In this paper, a corrected parallel smoothed particle hydrodynamics (C-SPH) method is proposed to simulate 3D generalized Newtonian free surface flows with low Reynolds number; in particular, 3D viscous jet buckling problems are investigated. The proposed C-SPH method is obtained by coupling an improved SPH method based on the incompressible condition with the traditional SPH (TSPH): the improved SPH with a diffusive term and a first-order kernel gradient correction scheme is used in the interior of the fluid domain, and the TSPH is used near the free surface. Thus the C-SPH method possesses the advantages of both methods. In addition, an effective and convenient boundary treatment is presented to deal with 3D multiple-boundary problems, and the MPI parallelization technique with a dynamic cell-based neighbor particle searching method is used to improve the computational efficiency. The validity and the merits of the C-SPH are first verified by solving several benchmarks and comparing with other results. Then the viscous jet folding/coiling based on the Cross model is simulated by the C-SPH method and compared with other experimental or numerical results. In particular, the influences of macroscopic parameters on the flow are discussed. All the numerical results agree well with available data, and show that the C-SPH method has higher accuracy and better stability than other particle methods for solving 3D moving free surface flows.

  3. Spatial Parallelism of a 3D Finite Difference, Velocity-Stress Elastic Wave Propagation Code

    Minkoff, Susan E.

    1999-12-09

    Finite difference methods for solving the wave equation more accurately capture the physics of waves propagating through the earth than asymptotic solution methods. Unfortunately, finite difference simulations for 3D elastic wave propagation are expensive. We model waves in a 3D isotropic elastic earth. The wave equation solution consists of three velocity components and six stresses. The partial derivatives are discretized using 2nd-order in time and 4th-order in space staggered finite difference operators. Staggered schemes allow one to obtain additional accuracy (via centered finite differences) without requiring additional storage. The serial code is most unique in its ability to model a number of different types of seismic sources. The parallel implementation uses the MPI library, thus allowing for portability between platforms. Spatial parallelism provides a highly efficient strategy for parallelizing finite difference simulations. In this implementation, one can decompose the global problem domain into one-, two-, and three-dimensional processor decompositions with 3D decompositions generally producing the best parallel speedup. Because I/O is handled largely outside of the time-step loop (the most expensive part of the simulation) we have opted for straight-forward broadcast and reduce operations to handle I/O. The majority of the communication in the code consists of passing subdomain face information to neighboring processors for use as "ghost cells". When this communication is balanced against computation by allocating subdomains of reasonable size, we observe excellent scaled speedup. Allocating subdomains of size 25 x 25 x 25 on each node, we achieve efficiencies of 94% on 128 processors. Numerical examples for both a layered earth model and a homogeneous medium with a high-velocity blocky inclusion illustrate the accuracy of the parallel code.

  4. Spatial parallelism of a 3D finite difference, velocity-stress elastic wave propagation code

    Minkoff, S.E.

    1999-12-01

    Finite difference methods for solving the wave equation more accurately capture the physics of waves propagating through the earth than asymptotic solution methods. Unfortunately, finite difference simulations for 3D elastic wave propagation are expensive. The authors model waves in a 3D isotropic elastic earth. The wave equation solution consists of three velocity components and six stresses. The partial derivatives are discretized using 2nd-order in time and 4th-order in space staggered finite difference operators. Staggered schemes allow one to obtain additional accuracy (via centered finite differences) without requiring additional storage. The serial code is most unique in its ability to model a number of different types of seismic sources. The parallel implementation uses the MPI library, thus allowing for portability between platforms. Spatial parallelism provides a highly efficient strategy for parallelizing finite difference simulations. In this implementation, one can decompose the global problem domain into one-, two-, and three-dimensional processor decompositions with 3D decompositions generally producing the best parallel speedup. Because I/O is handled largely outside of the time-step loop (the most expensive part of the simulation) the authors have opted for straight-forward broadcast and reduce operations to handle I/O. The majority of the communication in the code consists of passing subdomain face information to neighboring processors for use as ghost cells. When this communication is balanced against computation by allocating subdomains of reasonable size, they observe excellent scaled speedup. Allocating subdomains of size 25 x 25 x 25 on each node, they achieve efficiencies of 94% on 128 processors. Numerical examples for both a layered earth model and a homogeneous medium with a high-velocity blocky inclusion illustrate the accuracy of the parallel code.
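    The balance between communication and computation described above can be made concrete with a short sketch. It is illustrative only and not taken from the paper: the function names, the 8 x 4 x 4 processor grid, the 200 x 100 x 100 global grid, and the two-cell halo width (a plausible choice for a 4th-order stencil) are all assumptions.

```python
from itertools import product


def split_axis(n, parts):
    """Split n cells into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(n, parts)
    return [base + (1 if i < extra else 0) for i in range(parts)]


def decompose(global_shape, proc_grid):
    """Return {(pi, pj, pk): (nx, ny, nz)}, the local subdomain shape per processor."""
    sizes = [split_axis(n, p) for n, p in zip(global_shape, proc_grid)]
    return {
        idx: tuple(sizes[d][idx[d]] for d in range(3))
        for idx in product(*(range(p) for p in proc_grid))
    }


def ghost_cells(shape, halo):
    """Ghost cells in the six face slabs of an interior subdomain (edges and corners ignored)."""
    nx, ny, nz = shape
    return 2 * halo * (nx * ny + ny * nz + nx * nz)


# Assumed layout: 128 processors as an 8 x 4 x 4 grid, 25^3 cells each.
subdomains = decompose((200, 100, 100), (8, 4, 4))
shape = subdomains[(0, 0, 0)]
interior = shape[0] * shape[1] * shape[2]
halo_cells = ghost_cells(shape, halo=2)  # halo width 2 is an assumption
print(f"{len(subdomains)} subdomains of shape {shape}")
print(f"ghost/interior cell ratio: {halo_cells / interior:.2f}")
```

    Under these assumptions each 25 x 25 x 25 subdomain exchanges 7,500 ghost cells against 15,625 interior cells. If efficiency is taken as speedup divided by processor count, the reported 94% on 128 processors corresponds to a scaled speedup of roughly 120.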

  5. The 3D Elevation Program and America's infrastructure

    Lukas, Vicki; Carswell, Jr., William J.

    2016-11-07

    Infrastructure—the physical framework of transportation, energy, communications, water supply, and other systems—and construction management—the overall planning, coordination, and control of a project from beginning to end—are critical to the Nation’s prosperity. The American Society of Civil Engineers has warned that, despite the importance of the Nation’s infrastructure, it is in fair to poor condition and needs sizable and urgent investments to maintain and modernize it, and to ensure that it is sustainable and resilient. Three-dimensional (3D) light detection and ranging (lidar) elevation data provide valuable productivity, safety, and cost-saving benefits to infrastructure improvement projects and associated construction management. By providing data to users, the 3D Elevation Program (3DEP) of the U.S. Geological Survey reduces users’ costs and risks and allows them to concentrate on their mission objectives. 3DEP includes (1) data acquisition partnerships that leverage funding, (2) contracts with experienced private mapping firms, (3) technical expertise, lidar data standards, and specifications, and (4) most important, public access to high-quality 3D elevation data. The size and breadth of improvements for the Nation’s infrastructure and construction management needs call for an efficient, systematic approach to acquiring foundational 3D elevation data. The 3DEP approach to national data coverage will yield large cost savings over individual project-by-project acquisitions and will ensure that data are accessible for other critical applications.

  6. Performance analysis of high quality parallel preconditioners applied to 3D finite element structural analysis

    Kolotilina, L.; Nikishin, A.; Yeremin, A. [and others]

    1994-12-31

    The solution of large systems of linear equations is a crucial bottleneck when performing 3D finite element analysis of structures. Also, in many cases the reliability and robustness of iterative solution strategies, and their efficiency when exploiting hardware resources, fully determine the scope of industrial applications which can be solved on a particular computer platform. This is especially true for modern vector/parallel supercomputers with large vector length and for modern massively parallel supercomputers. Preconditioned iterative methods have been successfully applied to industrial class finite element analysis of structures. The construction and application of high quality preconditioners constitutes a high percentage of the total solution time. Parallel implementation of high quality preconditioners on such architectures is a formidable challenge. Two common types of existing preconditioners are the implicit preconditioners and the explicit preconditioners. The implicit preconditioners (e.g. incomplete factorizations of several types) are generally high quality but require solution of lower and upper triangular systems of equations per iteration which are difficult to parallelize without deteriorating the convergence rate. The explicit type of preconditionings (e.g. polynomial preconditioners or Jacobi-like preconditioners) require sparse matrix-vector multiplications and can be parallelized but their preconditioning qualities are less than desirable. The authors present results of numerical experiments with Factorized Sparse Approximate Inverses (FSAI) for symmetric positive definite linear systems. These are high quality preconditioners that possess a large resource of parallelism by construction without increasing the serial complexity.

  7. Late gadolinium enhancement cardiac imaging on a 3T scanner with parallel RF transmission technique: prospective comparison of 3D-PSIR and 3D-IR

    Schultz, Anthony [Nouvel Hopital Civil, Strasbourg University Hospital, Radiology Department, Strasbourg Cedex (France); Nouvel Hopital Civil, Service de Radiologie, Strasbourg Cedex (France); Caspar, Thibault [Nouvel Hopital Civil, Strasbourg University Hospital, Cardiology Department, Strasbourg Cedex (France); Schaeffer, Mickael [Nouvel Hopital Civil, Strasbourg University Hospital, Public Health and Biostatistics Department, Strasbourg Cedex (France); Labani, Aissam; Jeung, Mi-Young; El Ghannudi, Soraya; Roy, Catherine [Nouvel Hopital Civil, Strasbourg University Hospital, Radiology Department, Strasbourg Cedex (France); Ohana, Mickael [Nouvel Hopital Civil, Strasbourg University Hospital, Radiology Department, Strasbourg Cedex (France); Universite de Strasbourg / CNRS, UMR 7357, iCube Laboratory, Illkirch (France)

    2016-06-15

    To qualitatively and quantitatively compare different late gadolinium enhancement (LGE) sequences acquired at 3T with a parallel RF transmission technique. One hundred and sixty participants prospectively enrolled underwent a 3T cardiac MRI with 3 different LGE sequences: 3D Phase-Sensitive Inversion-Recovery (3D-PSIR) acquired 5 minutes after injection, 3D Inversion-Recovery (3D-IR) at 9 minutes and 3D-PSIR at 13 minutes. All LGE-positive patients were qualitatively evaluated both independently and blindly by two radiologists using a 4-level scale, and quantitatively assessed with measurement of contrast-to-noise ratio and LGE maximal surface. Statistical analyses were calculated under a Bayesian paradigm using MCMC methods. Fifty patients (70 % men, 56yo ± 19) exhibited LGE (62 % were post-ischemic, 30 % related to cardiomyopathy and 8 % post-myocarditis). Early and late 3D-PSIR were superior to 3D-IR sequences (global quality, estimated coefficient IR > early-PSIR: -2.37 CI = [-3.46; -1.38], prob(coef > 0) = 0 % and late-PSIR > IR: 3.12 CI = [0.62; 4.41], prob(coef > 0) = 100 %), LGE surface estimated coefficient IR > early-PSIR: -0.09 CI = [-1.11; -0.74], prob(coef > 0) = 0 % and late-PSIR > IR: 0.96 CI = [0.77; 1.15], prob(coef > 0) = 100 %. Probabilities for late PSIR being superior to early PSIR concerning global quality and CNR were over 90 %, regardless of the aetiological subgroup. In 3T cardiac MRI acquired with parallel RF transmission technique, 3D-PSIR is qualitatively and quantitatively superior to 3D-IR. (orig.)

  8. Parallel Imaging of 3D Surface Profile with Space-Division Multiplexing

    Hyung Seok Lee

    2016-01-01

    We have developed a modified optical frequency domain imaging (OFDI) system that performs parallel imaging of three-dimensional (3D) surface profiles by using the space division multiplexing (SDM) method with dual-area swept sourced beams. We have also demonstrated that 3D surface information for two different areas can be obtained at the same time with only one camera by our method. In this study, double fields of view (FOVs) of 11.16 mm × 5.92 mm were achieved within 0.5 s. The height range for each FOV was 460 µm, and the axial and transverse resolutions were 3.6 and 5.52 µm, respectively.

  9. Status of the 3D Elevation Program, 2015

    Sugarbaker, Larry J.; Eldridge, Diane F.; Jason, Allyson L.; Lukas, Vicki; Saghy, David L.; Stoker, Jason M.; Thunen, Diana R.

    2017-01-18

    The 3D Elevation Program (3DEP) is a cooperative activity to collect light detection and ranging (lidar) data for the conterminous United States, Hawaii, and U.S. territories; and interferometric synthetic aperture radar (IfSAR) elevation data for Alaska during an 8-year period. The U.S. Geological Survey (USGS) and partner organizations acquire high-quality three-dimensional elevation data for the United States and its territories that support requirements beyond what could be realized if agencies independently pursued lidar and IfSAR data collection activities. Data collection rates have been increasing as a growing number of State and Federal agencies participate in cooperative data acquisition projects. USGS and partner agencies expanded data collection, completed the initial product delivery systems and implemented changes to the program governance to include a restructuring of the 3DEP working group and formalizing the relationship to the Federal Geographic Data Committee during the final year (2015) of program preparation.

  10. Computer Assisted Parallel Program Generation

    Kawata, Shigeo

    2015-01-01

    Parallel computation is widely employed in scientific research, engineering activities and product development. Writing a parallel program is not always a simple task, depending on the problem being solved. Large-scale scientific computing, huge data analyses and precise visualizations, for example, require parallel computation, and parallel computing in turn requires parallelization techniques. In this chapter, support for parallel program generation is discussed, and a computer-assisted parallel program generation system, P-NCAS, is introduced. Computer-assisted problem solving is one of the key methods for promoting innovation in science and engineering, and contributes to enriching our society and our life toward a programming-free environment in computing science. Research on problem solving environments (PSEs) started in the 1970s to enhance programming power. P-NCAS is one of the PSEs; the PSE concept provides an integrated human-friendly computational software and hardware system to solve a target ...

  11. Parallel 3-D Space Charge Calculations in the Unified Accelerator Library

    D'Imperio, N.L.; Luccio, A.U.; Malitsky, N.

    2006-06-26

    The paper presents the integration of the SIMBAD space charge module in the UAL framework. SIMBAD is a Particle-in-Cell (PIC) code. Its 3-D Parallel approach features an optimized load balancing scheme based on a genetic algorithm. The UAL framework enhances the SIMBAD standalone version with the interactive ROOT-based analysis environment and an open catalog of accelerator algorithms. The composite package addresses complex high intensity beam dynamics and has been developed as part of the FAIR SIS 100 project.

  12. Parallel implementation of 3D FFT with volumetric decomposition schemes for efficient molecular dynamics simulations

    Jung, Jaewoon; Kobayashi, Chigusa; Imamura, Toshiyuki; Sugita, Yuji

    2016-03-01

    Three-dimensional Fast Fourier Transform (3D FFT) plays an important role in a wide variety of computer simulations and data analyses, including molecular dynamics (MD) simulations. In this study, we develop hybrid (MPI+OpenMP) parallelization schemes of 3D FFT based on two new volumetric decompositions, mainly for the particle mesh Ewald (PME) calculation in MD simulations. In one scheme, (1d_Alltoall), five all-to-all communications in one dimension are carried out, and in the other, (2d_Alltoall), one two-dimensional all-to-all communication is combined with two all-to-all communications in one dimension. 2d_Alltoall is similar to the conventional volumetric decomposition scheme. We performed benchmark tests of 3D FFT for the systems with different grid sizes using a large number of processors on the K computer in RIKEN AICS. The two schemes show comparable performances, and are better than existing 3D FFTs. The performances of 1d_Alltoall and 2d_Alltoall depend on the supercomputer network system and number of processors in each dimension. There is enough leeway for users to optimize performance for their conditions. In the PME method, short-range real-space interactions as well as long-range reciprocal-space interactions are calculated. Our volumetric decomposition schemes are particularly useful when used in conjunction with the recently developed midpoint cell method for short-range interactions, due to the same decompositions of real and reciprocal spaces. The 1d_Alltoall scheme of 3D FFT takes 4.7 ms to simulate one MD cycle for a virus system containing more than 1 million atoms using 32,768 cores on the K computer.
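    The need for all-to-all communication follows from the separable structure of the 3D transform: it can be computed as 1D transforms along each axis in turn, and in a distributed setting each pass needs the data along that axis to be local to one process. The sketch below is a serial illustration of that separability only, using a naive O(n^2) DFT in place of an FFT; it does not reproduce the 1d_Alltoall or 2d_Alltoall schemes, and all function names are hypothetical.

```python
import cmath
from itertools import product


def dft1d(x):
    """Naive O(n^2) discrete Fourier transform of a 1D sequence."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def dft3d_by_axes(a):
    """3D DFT of a nested list a[x][y][z], computed as three passes of 1D DFTs."""
    nx, ny, nz = len(a), len(a[0]), len(a[0][0])
    # Pass 1: transform along z (builds a new nested list, input is not modified).
    a = [[dft1d(a[x][y]) for y in range(ny)] for x in range(nx)]
    # Pass 2: transform along y.
    for x, z in product(range(nx), range(nz)):
        col = dft1d([a[x][y][z] for y in range(ny)])
        for y in range(ny):
            a[x][y][z] = col[y]
    # Pass 3: transform along x.
    for y, z in product(range(ny), range(nz)):
        col = dft1d([a[x][y][z] for x in range(nx)])
        for x in range(nx):
            a[x][y][z] = col[x]
    return a


def dft3d_direct(a):
    """Reference implementation: evaluate the triple sum directly."""
    nx, ny, nz = len(a), len(a[0]), len(a[0][0])
    out = [[[0j] * nz for _ in range(ny)] for _ in range(nx)]
    for kx, ky, kz in product(range(nx), range(ny), range(nz)):
        out[kx][ky][kz] = sum(
            a[x][y][z]
            * cmath.exp(-2j * cmath.pi * (kx * x / nx + ky * y / ny + kz * z / nz))
            for x, y, z in product(range(nx), range(ny), range(nz))
        )
    return out


def max_abs_diff(a, b):
    """Largest elementwise difference between two equally shaped nested lists."""
    return max(
        abs(a[x][y][z] - b[x][y][z])
        for x, y, z in product(range(len(a)), range(len(a[0])), range(len(a[0][0])))
    )


grid = [[[float(x + 2 * y + 3 * z) for z in range(2)] for y in range(3)] for x in range(2)]
print("max difference:", max_abs_diff(dft3d_by_axes(grid), dft3d_direct(grid)))
```

    The two results agree to rounding error, which is why a parallel 3D FFT can be organized as local 1D transforms separated by data redistribution steps.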

  13. The 3D Elevation Program initiative: a call for action

    Sugarbaker, Larry J.; Constance, Eric W.; Heidemann, Hans Karl; Jason, Allyson L.; Lukas, Vicki; Saghy, David L.; Stoker, Jason M.

    2014-01-01

    The 3D Elevation Program (3DEP) initiative is accelerating the rate of three-dimensional (3D) elevation data collection in response to a call for action to address a wide range of urgent needs nationwide. It began in 2012 with the recommendation to collect (1) high-quality light detection and ranging (lidar) data for the conterminous United States (CONUS), Hawaii, and the U.S. territories and (2) interferometric synthetic aperture radar (ifsar) data for Alaska. Specifications were created for collecting 3D elevation data, and the data management and delivery systems are being modernized. The National Elevation Dataset (NED) will be completely refreshed with new elevation data products and services. The call for action requires broad support from a large partnership community committed to the achievement of national 3D elevation data coverage. The initiative is being led by the U.S. Geological Survey (USGS) and includes many partners—Federal agencies and State, Tribal, and local governments—who will work together to build on existing programs to complete the national collection of 3D elevation data in 8 years. Private sector firms, under contract to the Government, will continue to collect the data and provide essential technology solutions for the Government to manage and deliver these data and services. The 3DEP governance structure includes (1) an executive forum established in May 2013 to have oversight functions and (2) a multiagency coordinating committee based upon the committee structure already in place under the National Digital Elevation Program (NDEP). The 3DEP initiative is based on the results of the National Enhanced Elevation Assessment (NEEA) that was funded by NDEP agencies and completed in 2011. The study, led by the USGS, identified more than 600 requirements for enhanced (3D) elevation data to address mission-critical information requirements of 34 Federal agencies, all 50 States, and a sample of private sector companies and Tribal and local

  14. Simulating Growth Kinetics in a Data-Parallel 3D Lattice Photobioreactor

    A. V. Husselmann

    2013-01-01

    Though there have been many attempts to address growth kinetics in algal photobioreactors, surprisingly few have attempted an agent-based modelling (ABM) approach. ABM has been heralded as a method of practical scientific inquiry into systems of a complex nature and has been applied liberally in a range of disciplines including ecology, physics, social science, and microbiology with special emphasis on pathogenic bacterial growth. We bring together agent-based simulation with the Photosynthetic Factory (PSF) model, as well as certain key bioreactor characteristics in a visual 3D, parallel computing fashion. Despite being at small scale, the simulation gives excellent visual cues on the dynamics of such a reactor, and we further investigate the model in a variety of ways. Our parallel implementation of the simulation on graphical processing units provides key advantages, which we also briefly discuss. We also provide some performance data, along with particular effort in visualisation, using volumetric and isosurface rendering.

  15. Assessing the performance of a parallel MATLAB-based 3D convection code

    Kirkpatrick, G. J.; Hasenclever, J.; Phipps Morgan, J.; Shi, C.

    2008-12-01

    We are currently building 2D and 3D MATLAB-based parallel finite element codes for mantle convection and melting. The codes use the MATLAB implementation of core MPI commands (e.g. Send, Receive, Broadcast) for message passing between computational subdomains. We have found that code development and algorithm testing are much faster in MATLAB than in our previous work coding in C or FORTRAN; this code was built from scratch with only 12 man-months of effort. The one extra cost w.r.t. C coding on a Beowulf cluster is the cost of the parallel MATLAB license for a >4-core cluster. Here we present some preliminary results on the efficiency of MPI messaging in MATLAB on a small 4-machine, 16-core, 32 GB RAM Intel Q6600 processor-based cluster. Our code implements fully parallelized preconditioned conjugate gradients with a multigrid preconditioner. Our parallel viscous flow solver is currently 20% slower for a 1,000,000 DOF problem on a single core in 2D than the direct-solve MILAMIN MATLAB viscous flow solver. We have tested both continuous and discontinuous pressure formulations. We test with various configurations of network hardware, CPU speeds, and memory using our own and MATLAB's built-in cluster profiler. So far we have only explored relatively small (up to 1.6 GB RAM) test problems. We find that with our current code and Intel memory controller bandwidth limitations we can only get ~2.3 times the performance out of 4 cores compared with 1 core per machine. Even for these small problems the code runs faster with message passing between 4 machines with one core each than 1 machine with 4 cores and internal messaging (1.29x slower), or 1 core (2.15x slower). It surprised us that for 2D ~1 GB-sized problems with only 3 multigrid levels, the direct solve on the coarsest mesh consumes comparable time to the iterative solve on the finest mesh - a penalty that is greatly reduced either by using a 4th multigrid level or by using an iterative solve at the coarsest grid level. We plan to
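The preconditioned conjugate gradient iteration named in this abstract can be sketched as follows. This is a minimal serial illustration in pure Python, not the authors' MATLAB code: it assumes a small dense symmetric positive-definite matrix and swaps the multigrid preconditioner for a simple Jacobi (diagonal) preconditioner. The function name `pcg` and the test matrix are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(A, v):
    """Dense matrix-vector product for a list-of-lists matrix."""
    return [dot(row, v) for row in A]


def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A.

    Assumption: Jacobi preconditioning (M = diag(A)) stands in for the
    multigrid preconditioner mentioned in the abstract.
    """
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]

    x = [0.0] * n
    r = list(b)                                   # r = b - A x with x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]    # z = M^-1 r
    p = list(z)
    rz = dot(r, z)

    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:                # converged on residual norm
            break
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)                   # step length along p
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz                        # keeps directions A-conjugate
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x


# Hypothetical 3x3 SPD system used only to exercise the routine.
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = pcg(A, b)
```

In a parallel setting, the matrix-vector product and the inner products are the steps that would need communication between subdomains; the rest of the iteration is local.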

  16. Porting a 3D-model for the transport of reactive air pollutants to the parallel machine T3D

    Kessler, C.; Blom, J.G.; Verwer, J.G.

    1995-01-01

    Air pollution forecasting puts a high demand on the memory and the floating point performance of modern computers. For this kind of problem, massively parallel computers are very promising, although the software tools and the I/O facilities on those machines are still under-developed. This report de

  17. Introducing ZEUS-MP A 3D, Parallel, Multiphysics Code for Astrophysical Fluid Dynamics

    Norman, M L

    2000-01-01

    We describe ZEUS-MP: a Multi-Physics, Massively-Parallel, Message-Passing code for astrophysical fluid dynamics simulations in 3 dimensions. ZEUS-MP is a follow-on to the sequential ZEUS-2D and ZEUS-3D codes developed and disseminated by the Laboratory for Computational Astrophysics (lca.ncsa.uiuc.edu) at NCSA. V1.0 released 1/1/2000 includes the following physics modules: ideal hydrodynamics, ideal MHD, and self-gravity. Future releases will include flux-limited radiation diffusion, thermal heat conduction, two-temperature plasma, and heating and cooling functions. The covariant equations are cast on a moving Eulerian grid with Cartesian, cylindrical, and spherical polar coordinates currently supported. Parallelization is done by domain decomposition and implemented in F77 and MPI. The code is portable across a wide range of platforms from networks of workstations to massively parallel processors. Some parallel performance results are presented as well as an application to turbulent star formation.
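The domain decomposition mentioned in this abstract can be illustrated with a generic block partition. The sketch below is not ZEUS-MP's F77/MPI implementation, which the abstract does not show. It assumes a regular 3D grid split over a Cartesian process grid, and the names `block_range` and `subdomain` are hypothetical.

```python
def block_range(n_cells, n_parts, index):
    """Half-open [start, stop) range of cells owned by part `index` on one axis.

    Cells are split as evenly as possible; the first `n_cells % n_parts`
    parts receive one extra cell.
    """
    base, extra = divmod(n_cells, n_parts)
    start = index * base + min(index, extra)
    stop = start + base + (1 if index < extra else 0)
    return start, stop


def subdomain(shape, proc_grid, coords):
    """Per-axis index ranges for the process at `coords` in the process grid."""
    return tuple(
        block_range(n, p, c) for n, p, c in zip(shape, proc_grid, coords)
    )


# Hypothetical example: a 10 x 8 x 6 grid on a 4 x 2 x 1 process grid.
shape = (10, 8, 6)
proc_grid = (4, 2, 1)
owned = subdomain(shape, proc_grid, (1, 1, 0))
print(owned)  # ((3, 6), (4, 8), (0, 6))
```

In a message-passing code each process would then exchange boundary (ghost) cells with its neighbours along each axis; that exchange is omitted here.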

  18. Approach of generating parallel programs from parallelized algorithm design strategies

    WAN Jian-yi; LI Xiao-ying

    2008-01-01

    Today, parallel programming is dominated by message passing libraries, such as message passing interface (MPI). This article intends to simplify parallel programming by generating parallel programs from parallelized algorithm design strategies. It uses skeletons to abstract parallelized algorithm design strategies, as well as parallel architectures. Starting from problem specification, an abstract parallel abstract programming language+ (Apla+) program is generated from parallelized algorithm design strategies and problem-specific function definitions. By combining with parallel architectures, the implicit parallelism inside the parallelized algorithm design strategies is exploited. With implementation and transformation, a C++ and parallel virtual machine (CPPVM) parallel program is finally generated. The parallelized branch and bound (B&B) algorithm design strategy and the parallelized divide and conquer (D & C) algorithm design strategy are studied in this article as examples. The article also illustrates the approach with a case study.

  19. Parallel implementation of 3D protein structure similarity searches using a GPU and the CUDA.

    Mrozek, Dariusz; Brożek, Miłosz; Małysiak-Mrozek, Bożena

    2014-02-01

    Searching for similar 3D protein structures is one of the primary processes employed in the field of structural bioinformatics. However, the computational complexity of this process means that it is constantly necessary to search for new methods that can perform such a process faster and more efficiently. Finding molecular substructures that complex protein structures have in common is still a challenging task, especially when entire databases containing tens or even hundreds of thousands of protein structures must be scanned. Graphics processing units (GPUs) and general purpose graphics processing units (GPGPUs) can perform many time-consuming and computationally demanding processes much more quickly than a classical CPU can. In this paper, we describe the GPU-based implementation of the CASSERT algorithm for 3D protein structure similarity searching. This algorithm is based on the two-phase alignment of protein structures when matching fragments of the compared proteins. The GPU (GeForce GTX 560Ti: 384 cores, 2GB RAM) implementation of CASSERT ("GPU-CASSERT") parallelizes both alignment phases and yields an average 180-fold increase in speed over its CPU-based, single-core implementation on an Intel Xeon E5620 (2.40GHz, 4 cores). In this paper, we show that massive parallelization of the 3D structure similarity search process on many-core GPU devices can reduce the execution time of the process, allowing it to be performed in real time. GPU-CASSERT is available at: http://zti.polsl.pl/dmrozek/science/gpucassert/cassert.htm.

  20. The development of laser-plasma interaction program LAP3D on thousands of processors

    Xiaoyan Hu

    2015-08-01

    Modeling laser-plasma interaction (LPI) processes at the scale of real-size experiments is recognized as a challenging task. To explore the influence of various instabilities in LPI processes, a three-dimensional laser and plasma code (LAP3D) has been developed, which includes filamentation, stimulated Brillouin backscattering (SBS), stimulated Raman backscattering (SRS), non-local heat transport and plasma flow computation modules. In this program, a second-order upwind scheme is applied to solve the plasma equations, which are represented by an Euler fluid model. An operator splitting method is used for solving the equations of light wave propagation, where the fast Fourier transform (FFT) is applied to compute the diffraction operator and coordinate translation is used to solve the acoustic wave equation. The coupled terms of the different physics processes are computed by a second-order interpolation algorithm. In order to simulate LPI processes well on massively parallel computers, several parallel techniques are used, such as the coupled parallel algorithm of FFT and fluid numerical computation, the load balance algorithm, and the data transfer algorithm. The phenomena of filamentation, SBS and SRS have now been studied successfully in low-density plasma with LAP3D. Scalability of the program is demonstrated with a parallel efficiency above 50% on about ten thousand processors.

  1. Patterns For Parallel Programming

    Mattson, Timothy G; Massingill, Berna L

    2005-01-01

    From grids and clusters to next-generation game consoles, parallel computing is going mainstream. Innovations such as Hyper-Threading Technology, HyperTransport Technology, and multicore microprocessors from IBM, Intel, and Sun are accelerating the movement's growth. Only one thing is missing: programmers with the skills to meet the soaring demand for parallel software.

  2. A parallel 3-D discrete wavelet transform architecture using pipelined lifting scheme approach for video coding

    Hegde, Ganapathi; Vaya, Pukhraj

    2013-10-01

    This article presents a parallel architecture for 3-D discrete wavelet transform (3-DDWT). The proposed design is based on the 1-D pipelined lifting scheme. The architecture is fully scalable beyond the present coherent Daubechies filter bank (9, 7). This 3-DDWT architecture has advantages such as no group of pictures restriction and reduced memory referencing. It offers low power consumption, low latency and high throughput. The computing technique is based on the concept that lifting scheme minimises the storage requirement. The application specific integrated circuit implementation of the proposed architecture is done by synthesising it using 65 nm Taiwan Semiconductor Manufacturing Company standard cell library. It offers a speed of 486 MHz with a power consumption of 2.56 mW. This architecture is suitable for real-time video compression even with large frame dimensions.
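The lifting scheme that this architecture builds on can be shown with a much simpler filter. The sketch below assumes an integer Haar transform (split, predict, update) on a 1-D signal of even length; it is not the (9, 7) filter bank or the pipelined hardware design described in the abstract, and the function names are hypothetical.

```python
def haar_lift_forward(signal):
    """One level of the integer Haar wavelet transform via lifting.

    Assumes len(signal) is even. Returns (approximation, detail).
    """
    even, odd = signal[0::2], signal[1::2]                  # split
    detail = [o - e for e, o in zip(even, odd)]             # predict odd from even
    approx = [e + (d >> 1) for e, d in zip(even, detail)]   # update even samples
    return approx, detail


def haar_lift_inverse(approx, detail):
    """Undo the lifting steps in reverse order; reconstruction is exact."""
    even = [s - (d >> 1) for s, d in zip(approx, detail)]   # undo update
    odd = [d + e for e, d in zip(even, detail)]             # undo predict
    out = [0] * (2 * len(even))
    out[0::2], out[1::2] = even, odd                        # merge
    return out


samples = [5, 7, 3, 1, 8, 8, 2, 9]
approx, detail = haar_lift_forward(samples)
print(approx, detail)  # [6, 2, 8, 5] [2, -2, 0, 7]
assert haar_lift_inverse(approx, detail) == samples
```

Each lifting step overwrites one half of the samples using only the other half, which is why a lifting implementation can work in place with little extra storage. A 3-D transform applies such 1-D passes along each of the three axes in turn.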

  3. 3-D readout-electronics packaging for high-bandwidth massively paralleled imager

    Kwiatkowski, Kris; Lyke, James

    2007-12-18

    Dense, massively parallel signal processing electronics are co-packaged behind associated sensor pixels. Microchips containing a linear or bilinear arrangement of photo-sensors, together with associated complex electronics, are integrated into a simple 3-D structure (a "mirror cube"). An array of photo-sensitive cells are disposed on a stacked CMOS chip's surface at a 45° angle from light reflecting mirror surfaces formed on a neighboring CMOS chip surface. Image processing electronics are held within the stacked CMOS chip layers. Electrical connections couple each of said stacked CMOS chip layers and a distribution grid, the connections for distributing power and signals to components associated with each stacked CMOS chip layer.

  4. Hybrid Parallel Bundle Adjustment for 3D Scene Reconstruction with Massive Points

    Xin Liu; Wei Gao; Zhan-Yi Hu

    2012-01-01

    Bundle adjustment (BA) is a crucial but time consuming step in 3D reconstruction. In this paper, we intend to tackle a special class of BA problems where the reconstructed 3D points are much more numerous than the camera parameters, called Massive-Points BA (MPBA) problems. This is often the case when high-resolution images are used. We present a design and implementation of a new bundle adjustment algorithm for efficiently solving the MPBA problems. The use of hardware parallelism, the multi-core CPUs as well as GPUs, is explored. By careful memory-usage design, the graphic-memory limitation is effectively alleviated. Several modern acceleration strategies for bundle adjustment, such as the mixed-precision arithmetics, the embedded point iteration, and the preconditioned conjugate gradients, are explored and compared. By using several high-resolution image datasets, we generate a variety of MPBA problems, with which the performance of five bundle adjustment algorithms is evaluated. The experimental results show that our algorithm is up to 40 times faster than classical Sparse Bundle Adjustment, while maintaining comparable precision.

  5. Parallel programming with Python

    Palach, Jan

    2014-01-01

    A fast, easy-to-follow and clear tutorial to help you develop parallel computing systems using Python. Along with explaining the fundamentals, the book will also introduce you to slightly advanced concepts and will help you in implementing these techniques in the real world. If you are an experienced Python programmer and are willing to utilize the available computing resources by parallelizing applications in a simple way, then this book is for you. You are required to have a basic knowledge of Python development to get the most out of this book.

  6. Parallel 3-d simulations for porous media models in soil mechanics

    Wieners, C.; Ammann, M.; Diebels, S.; Ehlers, W.

    Numerical simulations in 3-d for porous media models in soil mechanics are a difficult task for the engineering modelling as well as for the numerical realization. Here, we present a general numerical scheme for the simulation of two-phase models in combination with a material model via the stress response with a specialized parallel saddle point solver. Therefore, we give a brief introduction into the theoretical background of the Theory of Porous Media and constitute a two-phase model consisting of a porous solid skeleton saturated by a viscous pore-fluid. The material behaviour of the skeleton is assumed to be elasto-viscoplastic. The governing equations are transferred to a weak formulation suitable for the application of the finite element method. Introducing a formulation in terms of the stress response, we define a clear interface between the assembling process and the parallel solver modules. We demonstrate the efficiency of this approach by challenging numerical experiments realized on the Linux Cluster in Chemnitz.

  7. Practical parallel programming

    Bauer, Barr E

    2014-01-01

    This is the book that will teach programmers to write faster, more efficient code for parallel processors. The reader is introduced to a vast array of procedures and paradigms on which actual coding may be based. Examples and real-life simulations using these devices are presented in C and FORTRAN.

  8. The 3D Elevation Program: summary for Virginia

    Carswell, William J.

    2013-01-01

    Elevation data are essential to a broad range of applications, including forest resources management, wildlife and habitat management, national security, recreation, and many others. For the Commonwealth of Virginia, elevation data are critical for urban and regional planning, natural resources conservation, flood risk management, agriculture and precision farming, resource mining, infrastructure and construction management, and other business uses. Today, high-quality light detection and ranging (lidar) data are the sources for creating elevation models and other elevation datasets. Federal, State, and local agencies work in partnership to (1) replace data, on a national basis, that are (on average) 30 years old and of lower quality and (2) provide coverage where publicly accessible data do not exist. A joint goal of State and Federal partners is to acquire consistent, statewide coverage to support existing and emerging applications enabled by lidar data. The new 3D Elevation Program (3DEP) initiative, managed by the U.S. Geological Survey (USGS), responds to the growing need for high-quality topographic data and a wide range of other three-dimensional representations of the Nation’s natural and constructed features.

  9. The 3D Elevation Program: summary for Texas

    Carswell, William J.

    2013-01-01

    Elevation data are essential to a broad range of applications, including forest resources management, wildlife and habitat management, national security, recreation, and many others. For the State of Texas, elevation data are critical for natural resources conservation; wildfire management, planning, and response; flood risk management; agriculture and precision farming; infrastructure and construction management; water supply and quality; and other business uses. Today, high-quality light detection and ranging (lidar) data are the source for creating elevation models and other elevation datasets. Federal, State, and local agencies work in partnership to (1) replace data, on a national basis, that are (on average) 30 years old and of lower quality and (2) provide coverage where publicly accessible data do not exist. A joint goal of State and Federal partners is to acquire consistent, statewide coverage to support existing and emerging applications enabled by lidar data. The new 3D Elevation Program (3DEP) initiative, managed by the U.S. Geological Survey (USGS), responds to the growing need for high-quality topographic data and a wide range of other three-dimensional representations of the Nation’s natural and constructed features.

  10. High-Level Parallel Programming.

    parallel programming languages. These issues were evaluated via the utilization of a language called UC. UC is a programming language aimed at balancing notational simplicity with execution efficiency and portability. UC accomplishes this by separating the programming task from the efficiency issues. This report gives a description of the language, its current implementation, its verification methodology and its use in designing various

  11. 3D Kirchhoff depth migration algorithm: A new scalable approach for parallelization on multicore CPU based cluster

    Rastogi, Richa; Londhe, Ashutosh; Srivastava, Abhishek; Sirasala, Kirannmayi M.; Khonde, Kiran

    2017-03-01

    In this article, a new scalable 3D Kirchhoff depth migration algorithm is presented on a state-of-the-art multicore CPU based cluster. Parallelization of 3D Kirchhoff depth migration is challenging due to its high demand of compute time, memory, storage and I/O along with the need of their effective management. The most resource intensive modules of the algorithm are traveltime calculations and migration summation, which exhibit an inherent trade-off between compute time and other resources. The parallelization strategy of the algorithm largely depends on the storage of calculated traveltimes and its feeding mechanism to the migration process. The presented work is an extension of our previous work, wherein a 3D Kirchhoff depth migration application for a multicore CPU based parallel system had been developed. Recently, we have worked on improving parallel performance of this application by re-designing the parallelization approach. The new algorithm is capable of efficiently migrating both prestack and poststack 3D data. It exhibits flexibility for migrating a large number of traces within the available node memory and with minimal requirement of storage, I/O and inter-node communication. The resultant application is tested using 3D Overthrust data on PARAM Yuva II, which is a Xeon E5-2670 based multicore CPU cluster with 16 cores/node and 64 GB shared memory. Parallel performance of the algorithm is studied using different numerical experiments and the scalability results show striking improvement over its previous version. An impressive 49.05X speedup with 76.64% efficiency is achieved for 3D prestack data and 32.00X speedup with 50.00% efficiency for 3D poststack data, using 64 nodes. The results also demonstrate the effectiveness and robustness of the improved algorithm with high scalability and efficiency on a multicore CPU cluster.
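The speedup and efficiency figures quoted in this abstract are consistent with the usual definition of parallel efficiency as speedup divided by the number of workers. The short check below assumes that definition and counts the 64 nodes as the workers; the function name is hypothetical.

```python
def parallel_efficiency(speedup, workers):
    """Parallel efficiency as the fraction of ideal linear speedup achieved."""
    return speedup / workers


nodes = 64
prestack = parallel_efficiency(49.05, nodes)   # 3D prestack data
poststack = parallel_efficiency(32.00, nodes)  # 3D poststack data

print(f"prestack:  {prestack:.2%}")   # prestack:  76.64%
print(f"poststack: {poststack:.2%}")  # poststack: 50.00%
```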

  12. Writing parallel programs that work

    CERN. Geneva

    2012-01-01

    Serial algorithms typically run inefficiently on parallel machines. This may sound like an obvious statement, but it is the root cause of why parallel programming is considered to be difficult. The current state of the computer industry is still that almost all programs in existence are serial. This talk will describe the techniques used in the Intel Parallel Studio to provide a developer with the tools necessary to understand the behaviors and limitations of the existing serial programs. Once the limitations are known the developer can refactor the algorithms and reanalyze the resulting programs with the tools in the Intel Parallel Studio to create parallel programs that work. About the speaker Paul Petersen is a Sr. Principal Engineer in the Software and Solutions Group (SSG) at Intel. He received a Ph.D. degree in Computer Science from the University of Illinois in 1993. After UIUC, he was employed at Kuck and Associates, Inc. (KAI) working on auto-parallelizing compiler (KAP), and was involved in th...

  13. Parallel computing simulation of electrical excitation and conduction in the 3D human heart.

    Di Yu; Dongping Du; Hui Yang; Yicheng Tu

    2014-01-01

    A correctly beating heart is important to ensure adequate circulation of blood throughout the body. Normal heart rhythm is produced by the orchestrated conduction of electrical signals throughout the heart. Cardiac electrical activity results from a series of complex biochemical-mechanical reactions, which involve transportation and bio-distribution of ionic flows through a variety of biological ion channels. Cardiac arrhythmias are caused by the direct alteration of ion channel activity that results in changes in the action potential (AP) waveform. In this work, we developed a whole-heart simulation model with the use of massive parallel computing with GPGPU and OpenGL. The simulation algorithm was implemented in several different versions for the purpose of comparison, including one conventional CPU version and two GPU versions based on the Nvidia CUDA platform. OpenGL was utilized for the visualization / interaction platform because it is open source, lightweight and universally supported by various operating systems. The experimental results show that the GPU-based simulation outperforms the conventional CPU-based approach and significantly improves the speed of simulation. By adopting modern computer architecture, this present investigation enables real-time simulation and visualization of electrical excitation and conduction in the large and complicated 3D geometry of a real-world human heart.

  14. A Heterogeneous Parallel Programming Capability

    1990-11-30

    the various implementations of Express attempted to address only the first of these issues - providing a portable, standard platform for parallel ... programming on a wide variety of different systems. Each implementation, however, was independent, but allowed programs to execute on a single

  15. New 3D parallel GILD electromagnetic modeling and nonlinear inversion using global magnetic integral and local differential equation

    Xie, G.; Li, J.; Majer, E.; Zuo, D.

    1998-07-01

    This paper describes a new 3D parallel GILD electromagnetic (EM) modeling and nonlinear inversion algorithm. The algorithm consists of: (a) a new magnetic integral equation instead of the electric integral equation to solve the electromagnetic forward modeling and inverse problem; (b) a collocation finite element method for solving the magnetic integral and a Galerkin finite element method for the magnetic differential equations; (c) a nonlinear regularizing optimization method to make the inversion stable and of high resolution; and (d) a new parallel 3D modeling and inversion using a global integral and local differential domain decomposition technique (GILD). The new 3D nonlinear electromagnetic inversion has been tested with synthetic data and field data. The authors obtained very good imaging for the synthetic data and reasonable subsurface EM imaging for the field data. The parallel algorithm has high parallel efficiency over 90% and can be a parallel solver for elliptic, parabolic, and hyperbolic modeling and inversion. The parallel GILD algorithm can be extended to develop a high resolution and large scale seismic and hydrology modeling and inversion in the massively parallel computer.

  16. Influence of intrinsic and extrinsic forces on 3D stress distribution using CUDA programming

    Räss, Ludovic; Omlin, Samuel; Podladchikov, Yuri

    2013-04-01

    In order to have a better understanding of the influence of buoyancy (intrinsic) and boundary (extrinsic) forces in a nonlinear rheology due to a power law fluid, some basics need to be explored through 3D numerical calculation. As a first approach, the already studied Stokes setup of a rising sphere will be used to calibrate the 3D model. Far field horizontal tectonic stress is applied to the sphere, which generates a vertical acceleration, buoyancy driven. This simple and known setup allows some benchmarking performed through systematic runs. The relative importance of intrinsic and extrinsic forces producing the wide variety of rates and styles of deformation, including absence of deformation and generating 3D stress patterns, will be determined. The relation between vertical motion and power law exponent will also be explored. The goal of these investigations will be to run models having topography and density structure from geophysical imaging as input, and 3D stress field as output. The stress distribution in the Swiss Alps and Plateau and its implications for risk analysis is one of the perspectives for this research. In fact, proximity of the stress to the failure is fundamental for risk assessment. Sensitivity of this to the accurate topography representation can then be evaluated. The developed 3D numerical codes, tuned for a mid-sized cluster, need to be optimized, especially while running good resolution in full 3D. Therefore, two widely used computing platforms, MATLAB and FORTRAN 90, are explored. We start with an easily adaptable and as short as possible MATLAB code, which is then upgraded in order to reach higher performance in simulation times and resolution. A significant speedup using the rising NVIDIA CUDA technology and resources is also possible. Programming in C-CUDA, creating some synchronization feature, and comparing the results with previous runs, helps us to investigate the new speedup possibilities allowed through GPU parallel computing. These codes

  17. PROGRAM MODULE FOR 3-DIMENSIONAL (3D) VISUALIZATION OF FABRIC

    Drole, Blaž

    2012-01-01

    Development of new designs in the textile industry is work that requires a great deal of precision and the need for end design visualisation. Although computers have been used for many years now, we were technologically limited to a two-dimensional display when visualising the final result. With the advances in computer technology, especially graphics cards, we are now able to display complex weaves in three-dimensional (3D) space even on personal computers. This greatly simplifies the visualis...

  18. TOMO3D: 3-D joint refraction and reflection traveltime tomography parallel code for active-source seismic data—synthetic test

    Meléndez, A.; Korenaga, J.; Sallarès, V.; Miniussi, A.; Ranero, C. R.

    2015-10-01

    We present a new 3-D traveltime tomography code (TOMO3D) for the modelling of active-source seismic data that uses the arrival times of both refracted and reflected seismic phases to derive the velocity distribution and the geometry of reflecting boundaries in the subsurface. This code is based on its popular 2-D version TOMO2D from which it inherited the methods to solve the forward and inverse problems. The traveltime calculations are done using a hybrid ray-tracing technique combining the graph and bending methods. The LSQR algorithm is used to perform the iterative regularized inversion to improve the initial velocity and depth models. In order to cope with an increased computational demand due to the incorporation of the third dimension, the forward problem solver, which takes most of the run time (~90 per cent in the test presented here), has been parallelized with a combination of multi-processing and message passing interface standards. This parallelization distributes the ray-tracing and traveltime calculations among available computational resources. The code's performance is illustrated with a realistic synthetic example, including a checkerboard anomaly and two reflectors, which simulates the geometry of a subduction zone. The code is designed to invert for a single reflector at a time. A data-driven layer-stripping strategy is proposed for cases involving multiple reflectors, and it is tested for the successive inversion of the two reflectors. Layers are bound by consecutive reflectors, and an initial velocity model for each inversion step incorporates the results from previous steps. This strategy poses simpler inversion problems at each step, allowing the recovery of strong velocity discontinuities that would otherwise be smoothed.

  19. Experiences Using Hybrid MPI/OpenMP in the Real World: Parallelization of a 3D CFD Solver for Multi-Core Node Clusters

    Gabriele Jost

    2010-01-01

    Today most systems in high-performance computing (HPC) feature a hierarchical hardware design: shared-memory nodes with several multi-core CPUs are connected via a network infrastructure. When parallelizing an application for these architectures it seems natural to employ a hierarchical programming model such as combining MPI and OpenMP. Nevertheless, there is the general lore that pure MPI outperforms the hybrid MPI/OpenMP approach. In this paper, we describe the hybrid MPI/OpenMP parallelization of IR3D (Incompressible Realistic 3-D) code, a full-scale real-world application, which simulates the environmental effects on the evolution of vortices trailing behind control surfaces of underwater vehicles. We discuss performance, scalability and limitations of the pure MPI version of the code on a variety of hardware platforms and show how the hybrid approach can help to overcome certain limitations.

  20. Reactor Dosimetry Applications Using RAPTOR-M3G: A New Parallel 3-D Radiation Transport Code

    Longoni, Gianluca; Anderson, Stanwood L.

    2009-08-01

    The numerical solution of the Linearized Boltzmann Equation (LBE) via the Discrete Ordinates method (SN) requires extensive computational resources for large 3-D neutron and gamma transport applications due to the concurrent discretization of the angular, spatial, and energy domains. This paper will discuss the development of RAPTOR-M3G (RApid Parallel Transport Of Radiation - Multiple 3D Geometries), a new 3-D parallel radiation transport code, and its application to the calculation of ex-vessel neutron dosimetry responses in the cavity of a commercial 2-loop Pressurized Water Reactor (PWR). RAPTOR-M3G is based on domain decomposition algorithms, where the spatial and angular domains are allocated and processed on multi-processor computer architectures. As compared to traditional single-processor applications, this approach reduces the computational load as well as the memory requirement per processor, yielding an efficient solution methodology for large 3-D problems. Measured neutron dosimetry responses in the reactor cavity air gap will be compared to the RAPTOR-M3G predictions. This paper is organized as follows: Section 1 discusses the RAPTOR-M3G methodology; Section 2 describes the 2-loop PWR model and the numerical results obtained. Section 3 addresses the parallel performance of the code, and Section 4 concludes this paper with final remarks and future work.

  1. Development of a stereo vision measurement system for a 3D three-axial pneumatic parallel mechanism robot arm.

    Chiang, Mao-Hsiung; Lin, Hao-Ting; Hou, Chien-Lun

    2011-01-01

    In this paper, a stereo vision 3D position measurement system for a three-axial pneumatic parallel mechanism robot arm is presented. The stereo vision 3D position measurement system aims to measure the 3D trajectories of the end-effector of the robot arm. To track the end-effector of the robot arm, the circle detection algorithm is used to detect the desired target and the SAD algorithm is used to track the moving target and to search the corresponding target location along the conjugate epipolar line in the stereo pair. After camera calibration, both intrinsic and extrinsic parameters of the stereo rig can be obtained, so images can be rectified according to the camera parameters. Thus, through the epipolar rectification, the stereo matching process is reduced to a horizontal search along the conjugate epipolar line. Finally, 3D trajectories of the end-effector are computed by stereo triangulation. The experimental results show that the stereo vision 3D position measurement system proposed in this paper can successfully track and measure the fifth-order polynomial trajectory and sinusoidal trajectory of the end-effector of the three-axial pneumatic parallel mechanism robot arm.
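
    The horizontal SAD (sum of absolute differences) search and the triangulation step described in this abstract can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes an already rectified grayscale pair stored as nested lists, a square matching window, and the standard pinhole relation depth = focal length x baseline / disparity. All function and parameter names are hypothetical.

```python
def sad(left, right, row, col_l, col_r, half):
    """Sum of absolute differences between two (2*half+1)^2 windows."""
    total = 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            total += abs(left[row + dr][col_l + dc] - right[row + dr][col_r + dc])
    return total


def best_disparity(left, right, row, col, half=1, max_disp=4):
    """Search along the same row of the right image (the epipolar line
    after rectification) for the shift d with the lowest SAD cost."""
    best_d, best_cost = 0, None
    for d in range(max_disp + 1):
        col_r = col - d
        if col_r - half < 0:  # window would leave the image
            break
        cost = sad(left, right, row, col, col_r, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def depth_from_disparity(disparity, focal_px, baseline):
    """Triangulated depth for a rectified pair (pinhole model)."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline / disparity


if __name__ == "__main__":
    # Synthetic pair: the left image is the right image shifted by 2 pixels.
    right = [[3 * c * c + 7 * r for c in range(10)] for r in range(5)]
    left = [[right[r][c - 2] if c >= 2 else 0 for c in range(10)] for r in range(5)]
    d = best_disparity(left, right, row=2, col=6)
    print(d, depth_from_disparity(d, focal_px=800.0, baseline=0.5))
```

    Restricting the search to one row is what rectification buys: without it, the candidate positions would lie on a slanted epipolar line and the window comparison would need a 2D search.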

  2. Parallel rapid relaxation inversion of 3D magnetotelluric data

    林昌洪; 谭捍东; 佟拓

    2009-01-01

    We implement a parallel algorithm using MPI (Message Passing Interface) to speed up the rapid relaxation inversion for 3D magnetotelluric data. We test the parallel rapid relaxation algorithm with synthetic and real data. The execution efficiency of the algorithm for several different situations is also compared. The results indicate that the parallel rapid relaxation algorithm for 3D magnetotelluric inversion is effective. This parallel algorithm, implemented on a common PC, promotes the practical application of 3D magnetotelluric inversion and can be suitable for other geophysical 3D modeling and inversion.

  3. Parallel Programming Archetypes in Combinatorics and Optimization

    1995-06-12

    A parallel programming archetype is a language independent program design strategy. We describe two archetypes in combinatorics and optimization...the systematic design of efficient sequential and parallel programs. The research whose results are presented in this document is part of the ongoing project on Parallel Programming Archetype.

  4. Graphics-Based Parallel Programming Tools

    1991-09-01

    AD-A254 406. Final Report. Graphics-Based Parallel Programming Tools. Janice E. Cuny, Principal Investigator, Department of...suggest parallel (either because we use a parallel graph rewriting mechanism or because we apply our results to parallel programming), we interpret it to...was to provide support for the explicit representation of graphs for use within a parallel programming environment. In our environment, we view a

  5. C# game programming cookbook for Unity 3D

    Murray, Jeff W

    2014-01-01

    Making Games the Modular Way Important Programming Concepts Building the Core Game Framework Controllers and Managers Building the Core Framework Scripts Player Structure Game-Specific Player Controller Dealing with Input Player Manager User Data Manager Recipes: Common Components Introduction The Timer Class Spawn Scripts Set Gravity Pretend Friction-Friction Simulation to Prevent Slipping Around Cameras Input Scripts Automatic Self-Destruction Script Automatic Object Spinner Scene Manager Building Player Movement Controllers Shoot 'Em Up Spaceship Humanoid Character Wheeled Vehicle Weapon Systems

  6. A Parallel Programming Model With Sequential Semantics

    1996-01-01

    Parallel programming is more difficult than sequential programming in part because of the complexity of reasoning, testing, and debugging in the...context of concurrency. In the thesis, we present and investigate a parallel programming model that provides direct control of parallelism in a notation

  7. Field Verification Program, Coastal Flooding and Storm Protection Program. Preliminary User’s Manual 3-D Mathematical Model of Coastal, Estuarine, and Lake Currents (CELC3D).

    1984-04-01

    on any other virtual machine, e.g., the CDC Cyber 203, or other non-virtual machine with sufficient memory. The CELC3D program solves the mean...Princeton, NJ, 287 pp; also WES Technical Report (in press), U.S. Army Eng. Waterways Experiment Station, Vicksburg, MS. Sheng, Y.P., H. Segur, and W.S

  8. Focusing optics of a parallel beam CCD optical tomography apparatus for 3D radiation gel dosimetry.

    Krstajić, Nikola; Doran, Simon J

    2006-04-21

    Optical tomography of gel dosimeters is a promising and cost-effective avenue for quality control of radiotherapy treatments such as intensity-modulated radiotherapy (IMRT). Systems based on a laser coupled to a photodiode have so far shown the best results within the context of optical scanning of radiosensitive gels, but are very slow (approximately 9 min per slice) and poorly suited to measurements that require many slices. Here, we describe a fast, three-dimensional (3D) optical computed tomography (optical-CT) apparatus, based on a broad, collimated beam, obtained from a high power LED and detected by a charge-coupled device (CCD). The main advantages of such a system are (i) an acquisition speed approximately two orders of magnitude higher than a laser-based system when 3D data are required, and (ii) a greater simplicity of design. This paper advances our previous work by introducing a new design of focusing optics, which take information from a suitably positioned focal plane and project an image onto the CCD. An analysis of the ray optics is presented, which explains the roles of telecentricity, focusing, acceptance angle and depth-of-field (DOF) in the formation of projections. A discussion of the approximation involved in measuring the line integrals required for filtered backprojection reconstruction is given. Experimental results demonstrate (i) the effect on projections of changing the position of the focal plane of the apparatus, (ii) how to measure the acceptance angle of the optics, and (iii) the ability of the new scanner to image both absorbing and scattering gel phantoms. The quality of reconstructed images is very promising and suggests that the new apparatus may be useful in a clinical setting for fast and accurate 3D dosimetry.

  9. Massive parallelization of a 3D finite difference electromagnetic forward solution using domain decomposition methods on multiple CUDA enabled GPUs

    Schultz, A.

    2010-12-01

    3D forward solvers lie at the core of inverse formulations used to image the variation of electrical conductivity within the Earth's interior. This property is associated with variations in temperature, composition, phase, presence of volatiles, and in specific settings, the presence of groundwater, geothermal resources, oil/gas or minerals. The high cost of 3D solutions has been a stumbling block to wider adoption of 3D methods. Parallel algorithms for modeling frequency domain 3D EM problems have not achieved wide scale adoption, with emphasis on fairly coarse grained parallelism using MPI and similar approaches. The communications bandwidth as well as the latency required to send and receive network communication packets is a limiting factor in implementing fine grained parallel strategies, inhibiting wide adoption of these algorithms. Leading Graphics Processor Unit (GPU) companies now produce GPUs with hundreds of GPU processor cores per die. The footprint, in silicon, of the GPU's restricted instruction set is much smaller than the general purpose instruction set required of a CPU. Consequently, the density of processor cores on a GPU can be much greater than on a CPU. GPUs also have local memory, registers and high speed communication with host CPUs, usually through PCIe type interconnects. The extremely low cost and high computational power of GPUs provides the EM geophysics community with an opportunity to achieve fine grained (i.e. massive) parallelization of codes on low cost hardware. The current generation of GPUs (e.g. NVidia Fermi) provides 3 billion transistors per chip die, with nearly 500 processor cores and up to 6 GB of fast (DDR5) GPU memory. This latest generation of GPU supports fast hardware double precision (64 bit) floating point operations of the type required for frequency domain EM forward solutions. Each Fermi GPU board can sustain nearly 1 TFLOP in double precision, and multiple boards can be installed in the host computer system. 
We

  10. High-Performance Computation of Distributed-Memory Parallel 3D Voronoi and Delaunay Tessellation

    Peterka, Tom; Morozov, Dmitriy; Phillips, Carolyn

    2014-11-14

    Computing a Voronoi or Delaunay tessellation from a set of points is a core part of the analysis of many simulated and measured datasets: N-body simulations, molecular dynamics codes, and LIDAR point clouds are just a few examples. Such computational geometry methods are common in data analysis and visualization; but as the scale of simulations and observations surpasses billions of particles, the existing serial and shared-memory algorithms no longer suffice. A distributed-memory scalable parallel algorithm is the only feasible approach. The primary contribution of this paper is a new parallel Delaunay and Voronoi tessellation algorithm that automatically determines which neighbor points need to be exchanged among the subdomains of a spatial decomposition. Other contributions include periodic and wall boundary conditions, comparison of our method using two popular serial libraries, and application to numerous science datasets.

  11. Calibration of 3-d.o.f. Translational Parallel Manipulators Using Leg Observations

    Pashkevich, Anatoly; Wenger, Philippe; Gomolitsky, Roman

    2009-01-01

    The paper proposes a novel approach for the geometrical model calibration of quasi-isotropic parallel kinematic mechanisms of the Orthoglide family. It is based on the observations of the manipulator leg parallelism during motions between the specific test postures and employs a low-cost measuring system composed of standard comparator indicators attached to the universal magnetic stands. They are sequentially used for measuring the deviation of the relevant leg location while the manipulator moves the TCP along the Cartesian axes. Using the measured differences, the developed algorithm estimates the joint offsets and the leg lengths that are treated as the most essential parameters. Validity of the proposed calibration technique is confirmed by the experimental results.

  12. Structured Parallel Programming Patterns for Efficient Computation

    McCool, Michael; Robison, Arch

    2012-01-01

    Programming is now parallel programming. Much as structured programming revolutionized traditional serial programming decades ago, a new kind of structured programming, based on patterns, is relevant to parallel programming today. Parallel computing experts and industry insiders Michael McCool, Arch Robison, and James Reinders describe how to design and implement maintainable and efficient parallel algorithms using a pattern-based approach. They present both theory and practice, and give detailed concrete examples using multiple programming models. Examples are primarily given using two of th

  13. Comparison of parallel and spiral tagged MRI geometries in estimation of 3-D myocardial strains

    Tustison, Nicholas J.; Amini, Amir A.

    2005-04-01

    Research involving the quantification of left ventricular myocardial strain from cardiac tagged magnetic resonance imaging (MRI) is extensive. Two different imaging geometries are commonly employed by these methodologies to extract longitudinal deformation. We refer to these imaging geometries as either parallel or spiral. In the spiral configuration, four long-axis tagged image slices which intersect along the long-axis of the left ventricle are collected and in the parallel configuration, contiguous tagged long-axis images spanning the width of the left ventricle between the lateral wall and the septum are collected. Despite the number of methodologies using either or both imaging configurations, to date, no comparison has been made to determine which geometry results in more accurate estimation of strains. Using previously published work in which left ventricular myocardial strain is calculated from 4-D anatomical NURBS models, we compare the strain calculated from these two imaging geometries in both simulated tagged MR images for which ground truth strain is available as well as in in vivo data. It is shown that strains calculated using the parallel imaging protocol are more accurate than those calculated using the spiral protocol.

  14. 3D seismic modeling and reverse‐time migration with the parallel Fourier method using non‐blocking collective communications

    Chu, Chunlei

    2009-01-01

    The major performance bottleneck of the parallel Fourier method on distributed memory systems is the network communication cost. In this study, we investigate the potential of using non‐blocking all‐to‐all communications to solve this problem by overlapping computation and communication. We present the runtime comparison of a 3D seismic modeling problem with the Fourier method using non‐blocking and blocking calls, respectively, on a Linux cluster. The data demonstrate that a performance improvement of up to 40% can be achieved by simply changing blocking all‐to‐all communication calls to non‐blocking ones to introduce the overlapping capability. A 3D reverse‐time migration result is also presented as an extension to the modeling work based on non‐blocking collective communications.

  15. Design and Sensitivity Analysis Simulation of a Novel 3D Force Sensor Based on a Parallel Mechanism

    Eileen Chih-Ying Yang

    2016-12-01

    Automated force measurement is one of the most important technologies in realizing intelligent automation systems. However, while many methods are available for micro-force sensing, measuring large three-dimensional (3D) forces and loads remains a significant challenge. Accordingly, the present study proposes a novel 3D force sensor based on a parallel mechanism. The transformation function and sensitivity index of the proposed sensor are analytically derived. The simulation results show that the sensor has a larger effective measuring capability than traditional force sensors. Moreover, the sensor has a greater measurement sensitivity for horizontal forces than for vertical forces over most of the measurable force region. In other words, compared to traditional force sensors, the proposed sensor is more sensitive to shear forces than normal forces.

  16. The ParaScope parallel programming environment

    Cooper, Keith D.; Hall, Mary W.; Hood, Robert T.; Kennedy, Ken; Mckinley, Kathryn S.; Mellor-Crummey, John M.; Torczon, Linda; Warren, Scott K.

    1993-01-01

    The ParaScope parallel programming environment, developed to support scientific programming of shared-memory multiprocessors, includes a collection of tools that use global program analysis to help users develop and debug parallel programs. This paper focuses on ParaScope's compilation system, its parallel program editor, and its parallel debugging system. The compilation system extends the traditional single-procedure compiler by providing a mechanism for managing the compilation of complete programs. Thus, ParaScope can support both traditional single-procedure optimization and optimization across procedure boundaries. The ParaScope editor brings both compiler analysis and user expertise to bear on program parallelization. It assists the knowledgeable user by displaying and managing analysis and by providing a variety of interactive program transformations that are effective in exposing parallelism. The debugging system detects and reports timing-dependent errors, called data races, in execution of parallel programs. The system combines static analysis, program instrumentation, and run-time reporting to provide a mechanical system for isolating errors in parallel program executions. Finally, we describe a new project to extend ParaScope to support programming in FORTRAN D, a machine-independent parallel programming language intended for use with both distributed-memory and shared-memory parallel computers.

  17. Sparse Approximations of the Schur Complement for Parallel Algebraic Hybrid Solvers in 3D

    L. Giraud; A. Haidar; Y. Saad

    2010-01-01

    In this paper we study the computational performance of variants of an algebraic additive Schwarz preconditioner for the Schur complement for the solution of large sparse linear systems. In earlier works, the local Schur complements were computed exactly using a sparse direct solver. The robustness of the preconditioner comes at the price of this memory and time intensive computation that is the main bottleneck of the approach for tackling huge problems. In this work we investigate the use of sparse approximation of the dense local Schur complements. These approximations are computed using a partial incomplete LU factorization. Such a numerical calculation is the core of the multi-level incomplete factorization such as the one implemented in pARMS. The numerical and computing performance of the new numerical scheme is illustrated on a set of large 3D convection-diffusion problems; preliminary experiments on linear systems arising from structural mechanics are also reported.

  18. About Parallel Programming: Paradigms, Parallel Execution and Collaborative Systems

    Loredana MOCEAN

    2009-01-01

    In recent years, efforts have been made to delineate a stable and unified framework in which the problems of logical parallel processing can find solutions, at least at the level of imperative languages. The results obtained so far do not match the effort invested. This paper aims to make a small contribution to these efforts. We propose an overview of parallel programming, parallel execution and collaborative systems.

  19. Parallel Programming in the Age of Ubiquitous Parallelism

    Pingali, Keshav

    2014-04-01

    Multicore and manycore processors are now ubiquitous, but parallel programming remains as difficult as it was 30-40 years ago. During this time, our community has explored many promising approaches including functional and dataflow languages, logic programming, and automatic parallelization using program analysis and restructuring, but none of these approaches has succeeded except in a few niche application areas. In this talk, I will argue that these problems arise largely from the computation-centric foundations and abstractions that we currently use to think about parallelism. In their place, I will propose a novel data-centric foundation for parallel programming called the operator formulation in which algorithms are described in terms of actions on data. The operator formulation shows that a generalized form of data-parallelism called amorphous data-parallelism is ubiquitous even in complex, irregular graph applications such as mesh generation/refinement/partitioning and SAT solvers. Regular algorithms emerge as a special case of irregular ones, and many application-specific optimization techniques can be generalized to a broader context. The operator formulation also leads to a structural analysis of algorithms called TAO-analysis that provides implementation guidelines for exploiting parallelism efficiently. Finally, I will describe a system called Galois based on these ideas for exploiting amorphous data-parallelism on multicores and GPUs

  20. A parallel fast multipole BEM and its applications to large-scale analysis of 3-D fiber-reinforced composites

    Ting Lei; Zhenhan Yao; Haitao Wang; Pengbo Wang

    2006-01-01

    In this paper, an adaptive boundary element method (BEM) is presented for solving 3-D elasticity problems. The numerical scheme is accelerated by the new version of the fast multipole method (FMM) and parallelized on distributed memory architectures. The resulting solver is applied to the study of a representative volume element (RVE) for short fiber-reinforced composites with complex inclusion geometry. Numerical examples performed on a 32-processor cluster show that the proposed method is both accurate and efficient, and can solve problems of large size that are challenging to existing state-of-the-art domain methods.

  1. Massively Parallel Linear Stability Analysis with P_ARPACK for 3D Fluid Flow Modeled with MPSalsa

    Lehoucq, R.B.; Salinger, A.G.

    1998-10-13

    We are interested in the stability of three-dimensional fluid flows to small disturbances. One computational approach is to solve a sequence of large sparse generalized eigenvalue problems for the leading modes that arise from discretizing the differential equations modeling the flow. The modes of interest are the eigenvalues of largest real part and their associated eigenvectors. We discuss our work to develop an efficient and reliable eigensolver for use by the massively parallel simulation code MPSalsa. MPSalsa allows simulation of complex 3D fluid flow, heat transfer, and mass transfer with detailed bulk fluid and surface chemical reaction kinetics.

  2. A 3D hybrid grid generation technique and a multigrid/parallel algorithm based on anisotropic agglomeration approach

    Zhang Laiping; Zhao Zhong; Chang Xinghua; He Xin

    2013-01-01

    A hybrid grid generation technique and a multigrid/parallel algorithm are presented in this paper for turbulent flow simulations over three-dimensional (3D) complex geometries. The hybrid grid generation technique is based on an agglomeration method of anisotropic tetrahedrons. Firstly, the complex computational domain is covered by pure tetrahedral grids, in which anisotropic tetrahedrons are adopted to discretize the boundary layer and isotropic tetrahedrons the outer field. Then, the anisotropic tetrahedrons in the boundary layer are agglomerated to generate prismatic grids. The agglomeration method can improve the grid quality in the boundary layer and reduce the grid quantity to enhance the numerical accuracy and efficiency. In order to accelerate the convergence history, a multigrid/parallel algorithm is also developed based on the anisotropic agglomeration approach. The numerical results demonstrate the excellent accelerating capability of this multigrid method.

  3. Introducing ZEUS-MP: a 3D, parallel, multiphysics code for astrophysical fluid dynamics

    Michael L. Norman

    2000-01-01

    We describe ZEUS-MP: a Multi-Physics, Massively-Parallel, Message-Passing code for three-dimensional simulations of astrophysical fluid dynamics. ZEUS-MP is the successor to the ZEUS-2D and ZEUS-3D codes, developed and disseminated by the Laboratory for Computational Astrophysics (lca.ncsa.uiuc.edu) of the NCSA. Version V1.0, released on 1/1/2000, contains the following modules: ideal hydrodynamics, ideal MHD and self-gravity. Upcoming versions will include flux-limited radiation diffusion, heat conduction, two-temperature plasma, and heating and cooling functions. The covariant equations are advanced on a moving Eulerian mesh in Cartesian, cylindrical and spherical polar coordinates. Parallelization is done by domain decomposition and is implemented in F77 and MPI. The code is portable across a wide range of platforms, from networks of workstations to massively parallel processors. Some parallel efficiency results are presented together with an application to turbulent star formation.

  4. Fast and Precise 3D Computation of Capacitance of Parallel Narrow Beam MEMS Structures

    Majumdar, N

    2007-01-01

    Efficient design and performance of electrically actuated MEMS devices necessitate accurate estimation of electrostatic forces on the MEMS structures. This in turn requires thorough study of the capacitance of the structures and finally the charge density distribution on the various surfaces of a device. In this work, nearly exact BEM solutions have been provided in order to estimate these properties of a parallel narrow beam structure found in MEMS devices. The effect of three-dimensionality, which is an important aspect for these structures, and associated fringe fields have been studied in detail. A reasonably large parameter space has been covered in order to follow the variation of capacitance with various geometric factors. The present results have been compared with those obtained using empirical parametrized expressions keeping in view the requirement of the speed of computation. The limitations of the empirical expressions have been pointed out and possible approaches of their improvement have been d...

  5. A Case Study of a Hybrid Parallel 3D Surface Rendering Graphics Architecture

    Holten-Lund, Hans Erik; Madsen, Jan; Pedersen, Steen

    1997-01-01

    This paper presents a case study in the design strategy used in building a graphics computer, for drawing very complex 3D geometric surfaces. The goal is to build a PC based computer system capable of handling surfaces built from about 2 million triangles, and to be able to render a perspective view...... of these on a computer display at interactive frame rates, i.e. processing around 50 million triangles per second. The paper presents a hardware/software architecture called HPGA (Hybrid Parallel Graphics Architecture) which is likely to be able to carry out this task. The case study focuses on techniques to increase...... the clock frequency as well as the parallelism of the system. This paper focuses on the back-end graphics pipeline, which is responsible for rasterizing triangles... with a practically linear increase in performance. A pure software implementation of the proposed architecture is currently able to process 300...

  6. Parallel deconvolution of large 3D images obtained by confocal laser scanning microscopy.

    Pawliczek, Piotr; Romanowska-Pawliczek, Anna; Soltys, Zbigniew

    2010-03-01

    Various deconvolution algorithms are often used for restoration of digital images. Image deconvolution is especially needed for the correction of three-dimensional images obtained by confocal laser scanning microscopy. Such images suffer from distortions, particularly in the Z dimension. As a result, reliable automatic segmentation of these images may be difficult or even impossible. Effective deconvolution algorithms are memory-intensive and time-consuming. In this work, we propose a parallel version of the well-known Richardson-Lucy deconvolution algorithm developed for a system with distributed memory and implemented with the use of Message Passing Interface (MPI). It enables significantly more rapid deconvolution of two-dimensional and three-dimensional images by efficiently splitting the computation across multiple computers. The implementation of this algorithm can be used on professional clusters provided by computing centers as well as on simple networks of ordinary PC machines.
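
    For orientation, the serial Richardson-Lucy update that such a parallel scheme distributes can be sketched in one dimension as below. This is a minimal illustration under stated assumptions, not the authors' code: it uses zero-padded "same" convolution, a flat initial estimate, and a small `eps` guard against division by zero, and it omits the MPI data splitting described in the abstract. The function names are hypothetical.

```python
def convolve_same(signal, kernel):
    """Zero-padded convolution returning an output as long as `signal`."""
    n, k = len(signal), len(kernel)
    half = k // 2
    out = [0.0] * n
    for i in range(n):
        acc = 0.0
        for j in range(k):
            idx = i + half - j
            if 0 <= idx < n:
                acc += kernel[j] * signal[idx]
        out[i] = acc
    return out


def richardson_lucy(observed, psf, iterations=50, eps=1e-12):
    """Iterate estimate <- estimate * ((observed / (estimate * psf)) * psf_flipped),
    where * between signals denotes convolution."""
    mean = sum(observed) / len(observed)
    estimate = [mean] * len(observed)      # flat, positive starting guess
    psf_flipped = psf[::-1]
    for _ in range(iterations):
        blurred = convolve_same(estimate, psf)
        ratio = [o / max(b, eps) for o, b in zip(observed, blurred)]
        correction = convolve_same(ratio, psf_flipped)
        estimate = [e * c for e, c in zip(estimate, correction)]
    return estimate


if __name__ == "__main__":
    truth = [0.0] * 11
    truth[5] = 10.0                        # a single bright point
    psf = [0.25, 0.5, 0.25]                # normalized blur kernel
    observed = convolve_same(truth, psf)
    restored = richardson_lucy(observed, psf)
    print([round(v, 3) for v in observed])
    print([round(v, 3) for v in restored])
```

    Because each update needs only the current estimate in a neighbourhood as wide as the PSF, an image can in principle be cut into blocks with overlapping borders and processed on separate machines, which is the kind of splitting the abstract describes.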

  7. Adobe Flash 11 Stage3D (Molehill) Game Programming Beginner's Guide

    Kaitila, Christer

    2011-01-01

    Written in an informal and friendly manner, the style and approach of this book will take you on an exciting adventure. Piece by piece, detailed examples help you along the way by providing real-world game code required to make a complete 3D video game. Each chapter builds upon the experience and achievements earned in the last, culminating in the ultimate prize - your game! If you ever wanted to make your own 3D game in Flash, then this book is for you. This book is a perfect introduction to 3D game programming in Adobe Molehill for complete beginners. You do not need to know anything about S

  8. Genetic Parallel Programming: design and implementation.

    Cheang, Sin Man; Leung, Kwong Sak; Lee, Kin Hong

    2006-01-01

    This paper presents a novel Genetic Parallel Programming (GPP) paradigm for evolving parallel programs running on a Multi-Arithmetic-Logic-Unit (Multi-ALU) Processor (MAP). The MAP is a Multiple Instruction-streams, Multiple Data-streams (MIMD), general-purpose register machine that can be implemented on modern Very Large-Scale Integrated Circuits (VLSIs) in order to evaluate genetic programs at high speed. For human programmers, writing parallel programs is more difficult than writing sequential programs. However, experimental results show that GPP evolves parallel programs with less computational effort than that of their sequential counterparts. It creates a new approach to evolving a feasible problem solution in parallel program form and then serializes it into a sequential program if required. The effectiveness and efficiency of GPP are investigated using a suite of 14 well-studied benchmark problems. Experimental results show that GPP speeds up evolution substantially.

  9. Requirements for Data-Parallel Programming Environments

    1994-04-22

    fully automatic techniques would be insufficient by themselves to support general parallel programming, even in the limited domain of scientific...computation. In other words, in an effective parallel programming system, the programmer would have to provide additional information to help the system...convey an understanding of the tools and strategies that will be needed to adequately support efficient, machine-independent, data-parallel programming.

  10. Parallel Programming Environment for OpenMP

    Insung Park

    2001-01-01

    We present our effort to provide a comprehensive parallel programming environment for the OpenMP parallel directive language. This environment includes a parallel programming methodology for the OpenMP programming model and a set of tools (Ursa Minor and InterPol) that support this methodology. Our toolset provides automated and interactive assistance to parallel programmers in time-consuming tasks of the proposed methodology. The features provided by our tools include performance and program structure visualization, interactive optimization, support for performance modeling, and performance advising for finding and correcting performance problems. The presented evaluation demonstrates that our environment offers significant support in general parallel tuning efforts and that the toolset facilitates many common tasks in OpenMP parallel programming in an efficient manner.

  11. PDDP, A Data Parallel Programming Model

    Karen H. Warren

    1996-01-01

    PDDP, the parallel data distribution preprocessor, is a data parallel programming model for distributed memory parallel computers. PDDP implements High Performance Fortran-compatible data distribution directives and parallelism expressed by the use of Fortran 90 array syntax, the FORALL statement, and the WHERE construct. Distributed data objects belong to a global name space; other data objects are treated as local and replicated on each processor. PDDP allows the user to program in a shared memory style and generates codes that are portable to a variety of parallel machines. For interprocessor communication, PDDP uses the fastest communication primitives on each platform.

  12. iPhone 3D Programming Developing Graphical Applications with OpenGL ES

    Rideout, Philip

    2010-01-01

    What does it take to build an iPhone app with stunning 3D graphics? This book will show you how to apply OpenGL graphics programming techniques to any device running the iPhone OS -- including the iPad and iPod Touch -- with no iPhone development or 3D graphics experience required. iPhone 3D Programming provides clear step-by-step instructions, as well as lots of practical advice, for using the iPhone SDK and OpenGL. You'll build several graphics programs -- progressing from simple to more complex examples -- that focus on lighting, textures, blending, augmented reality, optimization for pe

  13. Development of a 3D Parallel Mechanism Robot Arm with Three Vertical-Axial Pneumatic Actuators Combined with a Stereo Vision System

    Hao-Ting Lin

    2011-12-01

    Full Text Available This study aimed to develop a novel 3D parallel mechanism robot driven by three vertical-axial pneumatic actuators with a stereo vision system for path tracking control. The mechanical system and the control system are the primary novel parts for developing a 3D parallel mechanism robot. In the mechanical system, a 3D parallel mechanism robot contains three serial chains, a fixed base, a movable platform and a pneumatic servo system. The parallel mechanism is designed and analyzed first for realizing a 3D motion in the X-Y-Z coordinate system of the robot’s end-effector. The inverse kinematics and the forward kinematics of the parallel mechanism robot are investigated by using the Denavit-Hartenberg notation (D-H notation) coordinate system. The pneumatic actuators in the three vertical motion axes are modeled. In the control system, the Fourier series-based adaptive sliding-mode controller with H∞ tracking performance is used to design the path tracking controllers of the three vertical servo pneumatic actuators for realizing 3D path tracking control of the end-effector. Three optical linear scales are used to measure the position of the three pneumatic actuators. The 3D position of the end-effector is then calculated from the measured positions of the three pneumatic actuators by means of the kinematics. However, the calculated 3D position of the end-effector cannot account for the manufacturing and assembly tolerance of the joints and the parallel mechanism, so errors exist between the actual position and the calculated 3D position of the end-effector. In order to improve this situation, sensor collaboration is developed in this paper. A stereo vision system is used to collaborate with the three position sensors of the pneumatic actuators. The stereo vision system combining two CCDs serves to measure the actual 3D position of the end-effector and calibrate the error between the actual and the calculated 3D position of the end

  14. Development of a 3D parallel mechanism robot arm with three vertical-axial pneumatic actuators combined with a stereo vision system.

    Chiang, Mao-Hsiung; Lin, Hao-Ting

    2011-01-01

    This study aimed to develop a novel 3D parallel mechanism robot driven by three vertical-axial pneumatic actuators with a stereo vision system for path tracking control. The mechanical system and the control system are the primary novel parts for developing a 3D parallel mechanism robot. In the mechanical system, a 3D parallel mechanism robot contains three serial chains, a fixed base, a movable platform and a pneumatic servo system. The parallel mechanism is designed and analyzed first for realizing a 3D motion in the X-Y-Z coordinate system of the robot's end-effector. The inverse kinematics and the forward kinematics of the parallel mechanism robot are investigated by using the Denavit-Hartenberg notation (D-H notation) coordinate system. The pneumatic actuators in the three vertical motion axes are modeled. In the control system, the Fourier series-based adaptive sliding-mode controller with H(∞) tracking performance is used to design the path tracking controllers of the three vertical servo pneumatic actuators for realizing 3D path tracking control of the end-effector. Three optical linear scales are used to measure the position of the three pneumatic actuators. The 3D position of the end-effector is then calculated from the measured positions of the three pneumatic actuators by means of the kinematics. However, the calculated 3D position of the end-effector cannot account for the manufacturing and assembly tolerance of the joints and the parallel mechanism, so errors exist between the actual position and the calculated 3D position of the end-effector. In order to improve this situation, sensor collaboration is developed in this paper. A stereo vision system is used to collaborate with the three position sensors of the pneumatic actuators. The stereo vision system combining two CCDs serves to measure the actual 3D position of the end-effector and calibrate the error between the actual and the calculated 3D position of the end-effector. Furthermore, to

  15. Parallelized 3D CSEM modeling using edge-based finite element with total field formulation and unstructured mesh

    Cai, Hongzhu; Hu, Xiangyun; Li, Jianhui; Endo, Masashi; Xiong, Bin

    2017-02-01

    We solve the 3D controlled-source electromagnetic (CSEM) problem using the edge-based finite element method. The modeling domain is discretized using an unstructured tetrahedral mesh. We adopt the total field formulation for the quasi-static variant of Maxwell's equations, which saves the computational cost of calculating the primary field. We adopt a new boundary condition which approximates the total field on the boundary by the primary field corresponding to the layered earth approximation of the complicated conductivity model. The primary field on the modeling boundary is calculated using the fast Hankel transform. By using this new type of boundary condition, the computation cost can be reduced significantly and the modeling accuracy can be improved. We allow the conductivity to be anisotropic. We solve the finite element system of equations using a parallelized multifrontal solver which works efficiently for multiple-source and large-scale electromagnetic modeling.

  16. Parallel programming with PCN. Revision 1

    Foster, I.; Tuecke, S.

    1991-12-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous FTP from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (cf. Appendix A).

  17. Synthetic models of distributed memory parallel programs

    Poplawski, D.A. (Michigan Technological Univ., Houghton, MI (USA). Dept. of Computer Science)

    1990-09-01

    This paper deals with the construction and use of simple synthetic programs that model the behavior of more complex, real parallel programs. Synthetic programs can be used in many ways: to construct an easily ported suite of benchmark programs, to experiment with alternate parallel implementations of a program without actually writing them, and to predict the behavior and performance of an algorithm on a new or hypothetical machine. Synthetic programs are constructed easily from scratch, from existing programs, and can even be constructed using nothing but information obtained from traces of the real program's execution.

  18. Parallel programming characteristics of a DSP-based parallel system

    GAO Shu; GUO Qing-ping

    2006-01-01

    This paper first introduces the structure and working principle of a DSP-based parallel system, its parallel accelerating board and the SHARC DSP chip. It then investigates the system's programming characteristics, especially the mode of communication, discusses how to design parallel algorithms, and presents a domain-decomposition-based complete multi-grid parallel algorithm with virtual boundary forecast (VBF) for solving many large-scale and complicated heat problems. Finally, the Mandelbrot set and a non-linear heat transfer equation of a ceramic/metal composite material are taken as examples to illustrate the implementation of the proposed algorithm. The results show that the solutions are highly efficient and have linear speedup.
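
    Domain decomposition for the Mandelbrot example can be sketched in a few lines of Python. This is only an illustration of splitting the image into row bands that are computed independently; it is not the multi-grid algorithm with virtual boundary forecast described in the record, and the function names, the viewing window and the iteration limit are all assumptions. Threads are used purely to show the structure; in CPython they will not give a real speedup for this CPU-bound loop.

```python
from concurrent.futures import ThreadPoolExecutor


def escape_count(c, max_iter):
    """Number of iterations of z -> z*z + c before |z| exceeds 2 (capped at max_iter)."""
    z = 0j
    for i in range(max_iter):
        if abs(z) > 2.0:
            return i
        z = z * z + c
    return max_iter


def compute_band(rows, width, height, max_iter):
    """Compute the rows of one subdomain; no data from other bands is needed."""
    band = []
    for row in rows:
        y = -1.5 + 3.0 * row / (height - 1)
        band.append([
            escape_count(complex(-2.0 + 3.0 * col / (width - 1), y), max_iter)
            for col in range(width)
        ])
    return band


def mandelbrot_parallel(width, height, workers, max_iter=50):
    """Split the image into `workers` contiguous row bands and compute them concurrently."""
    bounds = [height * k // workers for k in range(workers + 1)]
    bands = [range(bounds[k], bounds[k + 1]) for k in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda rows: compute_band(rows, width, height, max_iter), bands)
        # pool.map preserves band order, so concatenation restores the full image.
        return [row for part in parts for row in part]
```

    Since every pixel is independent, the image is the same for any number of bands; problems such as the heat equation additionally need boundary values exchanged between neighbouring subdomains.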

  19. Experiences in Data-Parallel Programming

    Terry W. Clark

    1997-01-01

    Full Text Available To efficiently parallelize a scientific application with a data-parallel compiler requires certain structural properties in the source program, and conversely, the absence of others. A recent parallelization effort of ours reinforced this observation and motivated this correspondence. Specifically, we have transformed a Fortran 77 version of GROMOS, a popular dusty-deck program for molecular dynamics, into Fortran D, a data-parallel dialect of Fortran. During this transformation we have encountered a number of difficulties that probably are neither limited to this particular application nor do they seem likely to be addressed by improved compiler technology in the near future. Our experience with GROMOS suggests a number of points to keep in mind when developing software that may at some time in its life cycle be parallelized with a data-parallel compiler. This note presents some guidelines for engineering data-parallel applications that are compatible with Fortran D or High Performance Fortran compilers.

  20. Massively Parallel Finite Element Programming

    Heister, Timo

    2010-01-01

    Today's large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data structures and algorithms to take advantage of the power of high performance computers in generic finite element codes. Existing generic finite element libraries often restrict the parallelization to parallel linear algebra routines. This is a limiting factor when solving on more than a few hundred cores. We describe routines for distributed storage of all major components coupled with efficient, scalable algorithms. We give an overview of our effort to enable the modern and generic finite element library deal.II to take advantage of the power of large clusters. In particular, we describe the construction of a distributed mesh and develop algorithms to fully parallelize the finite element calculation. Numerical results demonstrate good scalability. © 2010 Springer-Verlag.

  1. Productive Parallel Programming: The PCN Approach

    Ian Foster

    1992-01-01

    Full Text Available We describe the PCN programming system, focusing on those features designed to improve the productivity of scientists and engineers using parallel supercomputers. These features include a simple notation for the concise specification of concurrent algorithms, the ability to incorporate existing Fortran and C code into parallel applications, facilities for reusing parallel program components, a portable toolkit that allows applications to be developed on a workstation or small parallel computer and run unchanged on supercomputers, and integrated debugging and performance analysis tools. We survey representative scientific applications and identify problem classes for which PCN has proved particularly useful.

  2. A survey of parallel programming tools

    Cheng, Doreen Y.

    1991-01-01

    This survey examines 39 parallel programming tools. Focus is placed on those tool capabilities needed for parallel scientific programming rather than for general computer science. The tools are classified with current and future needs of the Numerical Aerodynamic Simulator (NAS) in mind: existing and anticipated NAS supercomputers and workstations; operating systems; programming languages; and applications. They are divided into four categories: suggested acquisitions; tools already brought in; tools worth tracking; and tools eliminated from further consideration at this time.

  3. Playable Stories: Making Programming and 3D Role-Playing Game Design Personally and Socially Relevant

    Ingram-Goble, Adam

    2013-01-01

    This is an exploratory design study of a novel system for learning programming and 3D role-playing game design as tools for social change. This study was conducted at two sites. Participants in the study were ages 9-14 and worked for up to 15 hours with the platform to learn how to program and design video games with personally or socially…

  4. GPU-based, parallel-line, omni-directional integration of measured acceleration field to obtain the 3D pressure distribution

    Wang, Jin; Zhang, Cao; Katz, Joseph

    2016-11-01

    A PIV based method to reconstruct the volumetric pressure field by direct integration of the 3D material acceleration directions has been developed. Extending the 2D virtual-boundary omni-directional method (Omni2D, Liu & Katz, 2013), the new 3D parallel-line omni-directional method (Omni3D) integrates the material acceleration along parallel lines aligned in multiple directions. Their angles are set by a spherical virtual grid. The integration is parallelized on a Tesla K40c GPU, which reduced the computing time from three hours to one minute for a single realization. To validate its performance, this method is utilized to calculate the 3D pressure fields in isotropic turbulence and channel flow using the JHU DNS Databases (http://turbulence.pha.jhu.edu). Both integration of the DNS acceleration and of acceleration from synthetic 3D particles are tested. Results are compared to other methods, e.g. the solution to the Pressure Poisson Equation (PPE; e.g. Ghaemi et al., 2012) with Bernoulli based Dirichlet boundary conditions, and the Omni2D method. The error in Omni3D prediction is uniformly low, and its sensitivity to acceleration errors is local. It agrees with the PPE/Bernoulli prediction away from the Dirichlet boundary. The Omni3D method is also applied to experimental data obtained using tomographic PIV, and results are correlated with deformation of a compliant wall. ONR.

  5. 2D/3D Program work summary report, [January 1988--December 1992]

    Damerell, P. S.; Simons, J. W. [eds., MPR Associates, Washington, DC (United States)]

    1993-06-01

    The 2D/3D Program was carried out by Germany, Japan and the United States to investigate the thermal-hydraulics of a PWR large-break LOCA. A contributory approach was utilized in which each country contributed significant effort to the program and all three countries shared the research results. Germany constructed and operated the Upper Plenum Test Facility (UPTF), and Japan constructed and operated the Cylindrical Core Test Facility (CCTF) and the Slab Core Test Facility (SCTF). The US contribution consisted of provision of advanced instrumentation to each of the three test facilities, and assessment of the TRAC computer code against the test results. Evaluations of the test results were carried out in all three countries. This report summarizes the 2D/3D Program in terms of the contributing efforts of the participants.

  6. Integrated Task and Data Parallel Programming

    Grimshaw, A. S.

    1998-01-01

    This research investigates the combination of task and data parallel language constructs within a single programming language. There are a number of applications that exhibit properties which would be well served by such an integrated language. Examples include global climate models, aircraft design problems, and multidisciplinary design optimization problems. Our approach incorporates data parallel language constructs into an existing, object oriented, task parallel language. The language will support creation and manipulation of parallel classes and objects of both types (task parallel and data parallel). Ultimately, the language will allow data parallel and task parallel classes to be used either as building blocks or managers of parallel objects of either type, thus allowing the development of single and multi-paradigm parallel applications. 1995 Research Accomplishments: In February I presented a paper at Frontiers 1995 describing the design of the data parallel language subset. During the spring I wrote and defended my dissertation proposal. Since that time I have developed a runtime model for the language subset. I have begun implementing the model and hand-coding simple examples which demonstrate the language subset. I have identified an astrophysical fluid flow application which will validate the data parallel language subset. 1996 Research Agenda: Milestones for the coming year include implementing a significant portion of the data parallel language subset over the Legion system. Using simple hand-coded methods, I plan to demonstrate (1) concurrent task and data parallel objects and (2) task parallel objects managing both task and data parallel objects. My next steps will focus on constructing a compiler and implementing the fluid flow application with the language. Concurrently, I will conduct a search for a real-world application exhibiting both task and data parallelism within the same program. Additional 1995 Activities: During the fall I collaborated

  7. The PISCES 2 parallel programming environment

    Pratt, Terrence W.

    1987-01-01

    PISCES 2 is a programming environment for scientific and engineering computations on MIMD parallel computers. It is currently implemented on a flexible FLEX/32 at NASA Langley, a 20-processor machine with both shared and local memories. The environment provides an extended Fortran for applications programming, a configuration environment for setting up a run on the parallel machine, and a run-time environment for monitoring and controlling program execution. This paper describes the overall design of the system and its implementation on the FLEX/32. Emphasis is placed on several novel aspects of the design: the use of a carefully defined virtual machine, programmer control of the mapping of virtual machine to actual hardware, forces for medium-granularity parallelism, and windows for parallel distribution of data. Some preliminary measurements of storage use are included.

  8. Automatic Parallelization Tool: Classification of Program Code for Parallel Computing

    Mustafa Basthikodi

    2016-04-01

    Full Text Available Performance growth of single-core processors has come to a halt in the past decade, but was re-enabled by the introduction of parallelism in processors. Multicore frameworks along with Graphics Processing Units have broadly enhanced parallelism. Several compilers have been updated to address the emerging challenges of synchronization and threading. Appropriate program and algorithm classifications greatly help software engineers find opportunities for effective parallelization. In the present work we investigate current species for the classification of algorithms; related work on classification is discussed along with a comparison of the issues that challenge the classification. A set of algorithms is chosen whose structures match different issues and which perform a given task. We have tested these algorithms using existing automatic species extraction tools along with the Bones compiler. We have added functionality to the existing tool, providing a more detailed characterization. The contributions of our work include support for pointer arithmetic, conditional and incremental statements, user-defined types, constants and mathematical functions. With this, we can retain significant data which is not captured by the original species of algorithms. We implemented the new theories in the tool, enabling automatic characterization of program code.

  9. DVR3D: a program suite for the calculation of rotation-vibration spectra of triatomic molecules

    Tennyson, Jonathan; Kostin, Maxim A.; Barletta, Paolo; Harris, Gregory J.; Polyansky, Oleg L.; Ramanlal, Jayesh; Zobov, Nikolai F.

    2004-11-01

    from: CPC Program Library, Queen's University of Belfast, N. Ireland
    Reference in CPC to previous version: 86 (1995) 175
    Catalogue identifier of previous version: ADAK
    Authors of previous version: J. Tennyson, J.R. Henderson and N.G. Fulton
    Does the new version supersede the original program?: DVR3DRJZ supersedes DVR3DRJ
    Computer: PC running Linux
    Installation: desktop
    Other machines on which program tested: Compaq running Tru64 Unix; SGI Origin 2000, Sunfire V750 and V880 systems running SunOS, IBM p690 Regatta running AIX
    Programming language used in the new version: Fortran 90
    Memory required to execute: case dependent
    No. of lines in distributed program, including test data, etc.: 4203
    No. of bytes in distributed program, including test data, etc.: 30 087
    Has code been vectorised or parallelised?: The code has been extensively vectorised. A parallel version of the code, PDVR3D, has been developed [1]; contact the first author for details
    Additional keywords: perpendicular embedding
    Distribution format: gz
    Nature of physical problem: DVR3DRJZ calculates the bound vibrational or Coriolis decoupled rotational-vibrational states of a triatomic system in body-fixed Jacobi (scattering) or Radau coordinates [2]
    Method of solution: All coordinates are treated in a discrete variable representation (DVR). The angular coordinate uses a DVR based on (associated) Legendre polynomials and the radial coordinates utilise a DVR based on either Morse oscillator-like or spherical oscillator functions. Intermediate diagonalisation and truncation is performed on the hierarchical expression of the Hamiltonian operator to yield the final secular problem. DVR3DRJ provides the vibrational wavefunctions necessary for ROTLEV3, ROTLEV3B or ROTLEV3Z to calculate rotationally excited states, DIPOLE3 to calculate rotational-vibrational transition strengths and XPECT3 to compute expectation values
    Restrictions on the complexity of the problem: (1) The size of the final Hamiltonian matrix that can

  10. Semantic Language Extensions for Implicit Parallel Programming

    2013-09-01

    stimulating discussions on everything from nature to nurture, the joint work sessions at Frist/Small World/ Starbucks and squash games. Your competitive...importance of computational power, and performance enhancing strategies in use. This field study was conducted through personal interviews with 114...livelocks. Despite this effort, manual concurrency control and a fixed choice of parallelization strategy often result in parallel programs with poor

  11. Wireless Rover Meets 3D Design and Product Development

    Deal, Walter F., III; Hsiung, Steve C.

    2016-01-01

    Today there are a number of 3D printing technologies that are low cost and within the budgets of middle and high school programs. Educational technology companies offer a variety of 3D printing technologies and parallel curriculum materials to enable technology and engineering teachers to easily add 3D learning activities to their programs.…

  12. OpenCL parallel programming development cookbook

    Tay, Raymond

    2013-01-01

    OpenCL Parallel Programming Development Cookbook will provide a set of advanced recipes that can be utilized to optimize existing code. This book is therefore ideal for experienced developers with a working knowledge of C/C++ and OpenCL. This book is intended for software developers who have often wondered what to do with that newly bought CPU or GPU other than using it for playing computer games; this book is also for developers who have a working knowledge of C/C++ and who want to learn how to write parallel programs in OpenCL so that life isn't too boring.

  13. Plasmonics and the parallel programming problem

    Vishkin, Uzi; Smolyaninov, Igor; Davis, Chris

    2007-02-01

    While many parallel computers have been built, it has generally been too difficult to program them. Now, all computers are effectively becoming parallel machines. Biannual doubling in the number of cores on a single chip, or faster, over the coming decade is planned by most computer vendors. Thus, the parallel programming problem is becoming more critical. The only known solution to the parallel programming problem in the theory of computer science is through a parallel algorithmic theory called PRAM. Unfortunately, some of the PRAM theory assumptions regarding the bandwidth between processors and memories did not properly reflect a parallel computer that could be built in previous decades. Reaching memories, or other processors in a multi-processor organization, required off-chip connections through pins on the boundary of each electric chip. Using the number of transistors that is becoming available on chip, on-chip architectures that adequately support the PRAM are becoming possible. However, the bandwidth of off-chip connections remains insufficient and the latency remains too high. This creates a bottleneck at the boundary of the chip for a PRAM-On-Chip architecture. This also prevents scalability to larger "supercomputing" organizations spanning across many processing chips that can handle massive amounts of data. Instead of connections through pins and wires, power-efficient CMOS-compatible on-chip conversion to plasmonic nanowaveguides is introduced for improved latency and bandwidth. Proper incorporation of our ideas offers exciting avenues to resolving the parallel programming problem, and an alternative way for building faster, more usable and much more compact supercomputers.
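
    For readers unfamiliar with the PRAM model, a textbook PRAM-style algorithm may help show why memory bandwidth matters. The sketch below is a sequential Python simulation of a step-synchronous inclusive prefix sum (the Hillis-Steele scan); it is a standard example chosen here for illustration and does not come from the record, and the function name is hypothetical. In each round, every "processor" i reads two cells of the shared array at once, which is exactly the kind of access pattern that assumes high processor-memory bandwidth.

```python
def pram_prefix_sum(values):
    """Simulate a synchronous PRAM inclusive prefix sum; returns (sums, rounds)."""
    current = list(values)
    n = len(current)
    step = 1
    rounds = 0
    while step < n:
        # One synchronous round: all reads come from `current`,
        # all writes go to a fresh array, as if n processors acted in lockstep.
        current = [
            current[i] + current[i - step] if i >= step else current[i]
            for i in range(n)
        ]
        step *= 2
        rounds += 1
    return current, rounds
```

    With n processors the number of rounds grows like log2(n), but only if all n simultaneous reads per round can actually be served by the memory system.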

  14. Detecting drug use in adolescents using a 3D simulation program

    Luis Iribarne

    2010-11-01

    Full Text Available This work presents a new 3D simulation program, called Mii School, and its application to the detection of problem behaviours appearing in school settings. We begin by describing some of the main features of the Mii School program. Then, we present the results of a study in which adolescents responded to Mii School simulations involving the consumption of alcoholic drinks, cigarettes, cannabis, cocaine, and MDMA (ecstasy). We established a “risk profile” based on the observed response patterns. We also present results concerning user satisfaction with the program and the extent to which users felt that the simulated scenes were realistic. Lastly, we discuss the usefulness of Mii School as a tool for assessing drug use in school settings.

  15. Reactor safety issues resolved by the 2D/3D program

    NONE

    1995-09-01

    The 2D/3D Program studied multidimensional thermal-hydraulics in a PWR core and primary system during the end-of-blowdown and post-blowdown phases of a large-break LOCA (LBLOCA), and during selected small-break LOCA (SBLOCA) transients. The program included tests at the Cylindrical Core Test Facility (CCTF), the Slab Core Test Facility (SCTF), and the Upper Plenum Test Facility (UPTF), and computer analyses using TRAC. Tests at CCTF investigated core thermal-hydraulics and overall system behavior while tests at SCTF concentrated on multidimensional core thermal-hydraulics. The UPTF tests investigated two-phase flow behavior in the downcomer, upper plenum, tie plate region, and primary loops. TRAC analyses evaluated thermal-hydraulic behavior throughout the primary system in tests as well as in PWRs. This report summarizes the test and analysis results in each of the main areas where improved information was obtained in the 2D/3D Program. The discussion is organized in terms of the reactor safety issues investigated. This report was prepared in coordination among the US, Germany and Japan. The US and Germany have published the report as NUREG/IA-0127 and GRS-101, respectively. (author).

  16. Playable stories: Making programming and 3D role-playing game design personally and socially relevant

    Ingram-Goble, Adam

    This is an exploratory design study of a novel system for learning programming and 3D role-playing game design as tools for social change. This study was conducted at two sites. Participants in the study were ages 9-14 and worked for up to 15 hours with the platform to learn how to program and design video games with personally or socially relevant narratives. This first study was successful in that students learned to program a narrative game, and they viewed the social problem framing for the practices as an interesting aspect of the experience. The second study provided illustrative examples of how providing less general structure up-front afforded players the opportunity to produce the necessary structures as needed for their particular design, so that they had a richer understanding of what those structures represented. This study demonstrates that not only were participants able to use computational thinking skills such as Boolean and conditional logic, planning, modeling, abstraction, and encapsulation, they were able to bridge these skills to social domains they cared about. In particular, participants created stories about socially relevant topics without explicit pushes by the instructors. The findings also suggest that the rapid uptake and successful creation of personally and socially relevant narratives may have been facilitated by close alignment between the conceptual tools represented in the platform and the domain of 3D role-playing games.

  17. Modeling of AAR affected structures using the GROW3D FEA program

    Curtis, D.D. [Acres International Limited, Niagara Falls, Ontario (Canada)]

    1995-12-31

    The objective of this paper is to present a rational and practical methodology for finite element stress analysis of AAR affected structures. The methodology is presented using case history studies which illustrate the practical application of the GROW3D program. GROW3D uses an anisotropic expansion strain function and concrete properties which simulate the following key characteristics of AAR affected concrete: (1) concrete growth expansion rates dependent on the stress vectors at each point; (2) concrete growth rate variation due to changes in moisture content and temperature; and (3) time-dependent, enhanced creep behavior. GROW3D has been applied to several hydropower structures and case histories from the Mactaquac Generating Station are presented herein. Mactaquac is selected because extensive instrumentation data before and after remedial measures have been used to calibrate and test the model. The results of analyses of three different structures are given, i.e., the intake, diversion sluiceway and powerhouse. The analysis results are used to identify potential structural problems and the need and timing of remedial measures. The output from GROW3D includes displacement rates, total displacements, global stresses and local factors of safety. The local factors of safety (or strength to stress ratios) are computed for several modes of failure including crushing, cracking, shear and sliding on horizontal construction joints. The analysis results are compared with field measurements which are taken before and after slot cutting. The effects of including the above-mentioned characteristics and other modeling assumptions on the computed results are discussed herein. Finally, a brief discussion on the recent enhancements to the model is given. These enhancements include the implementation of a more rigorous treatment of concrete creep effects.

  18. Parallelism for cryo-EM 3D reconstruction on CPU-GPU heterogeneous system

    李兴建; 李临川; 谭光明; 张佩珩

    2011-01-01

    It is a challenge to efficiently utilize massive parallelism on both applications and architectures for heterogeneous systems. A practice of accelerating a cryo-EM 3D program was presented on how to exploit and orchestrate parallelism of applications to take advantage of the underlying parallelism exposed at the architecture level. All possible parallelism in cryo-EM 3D was exploited, and a self-adaptive dynamic scheduling algorithm was leveraged to efficiently implement parallelism mapping between the application and architecture. The experiment on a part of the Dawning Nebulae system (32 nodes) confirms that hierarchical parallelism is an efficient pattern of parallel programming to utilize capabilities of both CPU and GPU on a heterogeneous system. The hybrid CPU-GPU program improves performance by 2.4 times over the best CPU-only one for certain problem sizes.
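
    The general idea behind dynamic scheduling on devices of unequal speed can be illustrated with a small, deterministic Python simulation. This is a generic self-scheduling sketch, not the self-adaptive algorithm of the record: the function name, the worker names and the relative speeds are hypothetical. Whichever worker becomes free first takes the next task, so a faster device automatically ends up with more work.

```python
import heapq


def dynamic_schedule(task_costs, worker_speeds):
    """Simulate greedy self-scheduling; returns (assignment, makespan).

    task_costs    -- work units per task, handed out in order
    worker_speeds -- mapping of worker name to work units per time unit
    """
    # Min-heap of (time at which the worker is next free, worker name).
    free_at = [(0.0, name) for name in sorted(worker_speeds)]
    heapq.heapify(free_at)
    assignment = {name: [] for name in worker_speeds}
    for task_id, cost in enumerate(task_costs):
        t, name = heapq.heappop(free_at)  # earliest-free worker takes the task
        assignment[name].append(task_id)
        heapq.heappush(free_at, (t + cost / worker_speeds[name], name))
    makespan = max(t for t, _ in free_at)
    return assignment, makespan
```

    For example, with six equal tasks of cost 4 and an assumed GPU twice as fast as the CPU, `dynamic_schedule([4] * 6, {"cpu": 1.0, "gpu": 2.0})` gives the GPU four tasks and the CPU two, and both finish at time 8, whereas a static half-and-half split would leave the GPU idle while the CPU finishes at time 12.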

  19. The 3D elevation program - Precision agriculture and other farm practices

    Sugarbaker, Larry J.; Carswell, Jr., William J.

    2016-12-27

    The agriculture industry, including farmers who rely on advanced technologies, increasingly uses light detection and ranging (lidar) data for crop management to enhance agricultural productivity. Annually, the combination of greater yields and reduced crop losses is estimated to increase revenue by $2 billion for America's farmers when terrain data derived from lidar are available for croplands. Additionally, the Natural Resources Conservation Service (NRCS) estimates that the value of improved services for farmers, through its farm assistance program, would be $79 million annually if lidar-derived digital elevation models (DEMs) are made available to the public. The 3D Elevation Program (3DEP) of the U.S. Geological Survey provides the programmatic infrastructure to generate and supply superior, lidar-derived terrain data to the agriculture industry, which would allow farms to refine agricultural practices and produce crops more efficiently. By providing data to users, 3DEP reduces users’ costs and risks, allowing them to concentrate on mission objectives. 3DEP includes (1) data acquisition partnerships that leverage funding, (2) contracts with experienced private mapping firms, (3) technical expertise, lidar data standards and specifications, and (4) most importantly, public access to high-quality 3D elevation data.

  20. Contributions to computational stereology and parallel programming

    Rasmusson, Allan

    rotator, even without the need for isotropic sections. To meet the need for computational power to perform image restoration of virtual tissue sections, parallel programming on GPUs has also been part of the project. This has led to a significant change in paradigm for a previously developed surgical...

  1. Parallel Volunteer Learning during Youth Programs

    Lesmeister, Marilyn K.; Green, Jeremy; Derby, Amy; Bothum, Candi

    2012-01-01

    Lack of time is a hindrance for volunteers to participate in educational opportunities, yet volunteer success in an organization is tied to the orientation and education they receive. Meeting diverse educational needs of volunteers can be a challenge for program managers. Scheduling a Volunteer Learning Track for chaperones that is parallel to a…

  2. Evaluation of single photon and Geiger mode Lidar for the 3D Elevation Program

    Stoker, Jason M.; Abdullah, Qassim; Nayegandhi, Amar; Winehouse, Jayna

    2016-01-01

    Data acquired by Harris Corporation’s (Melbourne, FL, USA) Geiger-mode IntelliEarth™ sensor and Sigma Space Corporation’s (Lanham-Seabrook, MD, USA) Single Photon HRQLS sensor were evaluated and compared to accepted 3D Elevation Program (3DEP) data and survey ground control to assess the suitability of these new technologies for the 3DEP. These systems cannot currently collect data that meet the USGS lidar base specification, but this is partly because the specification was written specifically for linear-mode systems. With little effort on the part of the manufacturers of the new lidar systems and the USGS lidar specifications team, data from these systems could soon serve the 3DEP program and its users. Many of the shortcomings noted in this study have been reported to have been corrected or improved upon in the next generation sensors.

  3. Concurrency-based approaches to parallel programming

    Kale, L.V.; Chrisochoides, N.; Kohl, J.; Yelick, K.

    1995-01-01

    The inevitable transition to parallel programming can be facilitated by appropriate tools, including languages and libraries. After describing the needs of applications developers, this paper presents three specific approaches aimed at development of efficient and reusable parallel software for irregular and dynamic-structured problems. A salient feature of all three approaches is their exploitation of concurrency within a processor. Benefits of individual approaches such as these can be leveraged by an interoperability environment which permits modules written using different approaches to co-exist in single applications.

  4. Parallel GRISYS/Power Challenge System Version 1.0 and 3D Prestack Depth Migration Package

    Zhao Zhenwen

    1995-01-01

    Based on the achievements and experience of seismic data parallel processing made in the past years by Beijing Global Software Corporation (GS) of CNPC, Parallel GRISYS/Power Challenge seismic data processing system version 1.0 has been cooperatively developed and integrated on the Power Challenge computer by GS, SGI (USA) and Shuangyuan Company of Academia Sinica.

  5. Towards HPC++: A Unified Approach to Parallel Programming in C++

    1998-10-30

    Compositional C++ or CC++, is a general purpose parallel programming language designed to support a wide range of parallel programming styles. By...appropriate for parallelizing the range of applications that one would write in C++. CC++ supports the integration of different parallel programming styles

  6. Persistent Monitoring of Urban Infrasound Phenomenology. Report 1: Modeling an Urban Environment for Acoustical Analyses using the 3-D Finite-Difference Time-Domain Program PSTOP3D

    2015-08-01

    by PSTOP3D to save 3D output relative to topography without auxiliary structures. Finished reading indexed material properties from mctp_geo.dat.1...using the 3-D Finite-Difference Time-Domain Program PSTOP3D Michael E. Pace Information Technology Laboratory U.S. Army Engineer Research and... Read topography into Global Mapper ...................................................................... 15 3.2.2 Create clip region for topography

  7. Fabrication of 3-D Reconstituted Organoid Arrays by DNA-Programmed Assembly of Cells (DPAC).

    Todhunter, Michael E; Weber, Robert J; Farlow, Justin; Jee, Noel Y; Cerchiari, Alec E; Gartner, Zev J

    2016-09-13

    Tissues are the organizational units of function in metazoan organisms. Tissues comprise an assortment of cellular building blocks, soluble factors, and extracellular matrix (ECM) composed into specific three-dimensional (3-D) structures. The capacity to reconstitute tissues in vitro with the structural complexity observed in vivo is key to understanding processes such as morphogenesis, homeostasis, and disease. In this article, we describe DNA-programmed assembly of cells (DPAC), a method to fabricate viable, functional arrays of organoid-like tissues within 3-D ECM gels. In DPAC, dissociated cells are chemically functionalized with degradable oligonucleotide "Velcro," allowing rapid, specific, and reversible cell adhesion to a two-dimensional (2-D) template patterned with complementary DNA. An iterative assembly process builds up organoids, layer-by-layer, from this initial 2-D template and into the third dimension. Cleavage of the DNA releases the completed array of tissues that are captured and fully embedded in ECM gels for culture and observation. DPAC controls the size, shape, composition, and spatial heterogeneity of organoids and permits positioning of constituent cells with single-cell resolution even within cultures several centimeters long. © 2016 by John Wiley & Sons, Inc.

  8. Programming massively parallel processors a hands-on approach

    Kirk, David B

    2010-01-01

    Programming Massively Parallel Processors discusses basic concepts about parallel programming and GPU architecture. "Massively parallel" refers to the use of a large number of processors to perform a set of computations in a coordinated parallel way. The book details various techniques for constructing parallel programs. It also discusses the development process, performance level, floating-point format, parallel patterns, and dynamic parallelism. The book serves as a teaching guide where parallel programming is the main topic of the course. It builds on the basics of C programming for CUDA, a parallel programming environment that is supported on NVIDIA GPUs. Composed of 12 chapters, the book begins with basic information about the GPU as a parallel computer source. It also explains the main concepts of CUDA, data parallelism, and the importance of memory access efficiency using CUDA. The target audience of the book is graduate and undergraduate students from all science and engineering disciplines who ...

  9. Specifying and Executing Optimizations for Parallel Programs

    William Mansky

    2014-07-01

    Compiler optimizations, usually expressed as rewrites on program graphs, are a core part of all modern compilers. However, even production compilers have bugs, and these bugs are difficult to detect and resolve. The problem only becomes more complex when compiling parallel programs; from the choice of graph representation to the possibility of race conditions, optimization designers have a range of factors to consider that do not appear when dealing with single-threaded programs. In this paper we present PTRANS, a domain-specific language for formal specification of compiler transformations, and describe its executable semantics. The fundamental approach of PTRANS is to describe program transformations as rewrites on control flow graphs with temporal logic side conditions. The syntax of PTRANS allows cleaner, more comprehensible specification of program optimizations; its executable semantics allows these specifications to act as prototypes for the optimizations themselves, so that candidate optimizations can be tested and refined before going on to include them in a compiler. We demonstrate the use of PTRANS to state, test, and refine the specification of a redundant store elimination optimization on parallel programs.

  10. Timing-Sequence Testing of Parallel Programs

    LIANG Yu; LI Shu; ZHANG Hui; HAN Chengde

    2000-01-01

    Testing of parallel programs involves two parts: testing of control flow within the processes and testing of timing-sequence. This paper focuses on the latter, particularly on the timing-sequence of message-passing paradigms. First, the coarse-grained SYN-sequence model is built to describe the execution of distributed programs; all of the topics discussed in this paper are based on it. The most direct way to test a program is to run it. A fault-free parallel program should have both correct computing results and a proper SYN-sequence. In order to analyze the validity of an observed SYN-sequence, this paper presents the formal specification (Backus Normal Form) of the valid SYN-sequence. To date there has been little work on testing coverage for distributed programs. Calculating the number of valid SYN-sequences is the key to the coverage problem, but that number is extremely large and it is very hard to obtain the combination law among SYN-events. To resolve this problem, this paper proposes an efficient testing strategy, atomic SYN-event testing, which first linearizes the SYN-sequence (making it consist only of serial atomic SYN-events) and then tests each atomic SYN-event independently. This paper provides in particular the formula for the number of valid SYN-sequences for tree-topology atomic SYN-events (broadcast and combine). Furthermore, the number of valid SYN-sequences also, to some degree, mirrors the testability of parallel programs. Taking the tree-topology atomic SYN-event as an example, this paper demonstrates the testability and communication speed of the tree-topology atomic SYN-event under different numbers of branches in order to achieve a more satisfactory tradeoff between testability and communication efficiency.

  11. Scaling and performance of a 3-D radiation hydrodynamics code on message-passing parallel computers: final report

    Hayes, J C; Norman, M

    1999-10-28

    This report details an investigation into the efficacy of two approaches to solving the radiation diffusion equation within a radiation hydrodynamic simulation. Because leading-edge scientific computing platforms have evolved from large single-node vector processors to parallel aggregates containing tens to thousands of individual CPUs, the ability of an algorithm to maintain high compute efficiency when distributed over a large array of nodes is critically important. The viability of an algorithm thus hinges upon the tripartite question of numerical accuracy, total time to solution, and parallel efficiency.

  12. Accepting the T3D

    Rich, D.O.; Pope, S.C.; DeLapp, J.G.

    1994-10-01

    In April, a 128 PE Cray T3D was installed at Los Alamos National Laboratory's Advanced Computing Laboratory as part of the DOE's High-Performance Parallel Processor Program (H4P). In conjunction with CRI, the authors implemented a 30 day acceptance test. The test was constructed in part to help them understand the strengths and weaknesses of the T3D. In this paper, they briefly describe the H4P and its goals. They discuss the design and implementation of the T3D acceptance test and detail issues that arose during the test. They conclude with a set of system requirements that must be addressed as the T3D system evolves.

  13. Parallel Programming with MatlabMPI

    Kepner, J V

    2001-01-01

    MatlabMPI is a Matlab implementation of the Message Passing Interface (MPI) standard and allows any Matlab program to exploit multiple processors. MatlabMPI currently implements the basic six functions that are the core of the MPI point-to-point communications standard. The key technical innovation of MatlabMPI is that it implements the widely used MPI "look and feel" on top of standard Matlab file I/O, resulting in an extremely compact (~100 lines) and "pure" implementation which runs anywhere Matlab runs. The performance has been tested on both shared and distributed memory parallel computers. MatlabMPI can match the bandwidth of C based MPI at large message sizes. A test image filtering application using MatlabMPI achieved a speedup of ~70 on a parallel computer.
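
    The idea of layering point-to-point message passing on ordinary file I/O can be sketched in a few lines. The following is a minimal illustration in Python, not MatlabMPI's code: the function names `send` and `recv`, the file naming scheme, and the use of a separate lock file to signal a completed write are all assumptions made for this sketch, and it presumes every process can see the same directory.

```python
import os
import pickle
import tempfile
import time


def _paths(comm_dir, src, dst, tag):
    # Hypothetical naming scheme: one buffer file and one lock file per message.
    base = os.path.join(comm_dir, f"p{src}_to_p{dst}_tag{tag}")
    return base + ".buf", base + ".lock"


def send(comm_dir, src, dst, tag, obj):
    """Write the payload, then create the lock file to mark it complete."""
    buf_path, lock_path = _paths(comm_dir, src, dst, tag)
    with open(buf_path, "wb") as f:
        pickle.dump(obj, f)
    # The lock file appears only after the payload is fully written,
    # so a receiver never reads a half-written buffer.
    with open(lock_path, "w"):
        pass


def recv(comm_dir, src, dst, tag, timeout=5.0, poll=0.01):
    """Poll for the lock file, read the payload, then clean up both files."""
    buf_path, lock_path = _paths(comm_dir, src, dst, tag)
    deadline = time.monotonic() + timeout
    while not os.path.exists(lock_path):
        if time.monotonic() > deadline:
            raise TimeoutError(f"no message from {src} to {dst} with tag {tag}")
        time.sleep(poll)
    with open(buf_path, "rb") as f:
        obj = pickle.load(f)
    os.remove(lock_path)
    os.remove(buf_path)
    return obj


if __name__ == "__main__":
    shared = tempfile.mkdtemp()          # stands in for a shared file system
    send(shared, 0, 1, 42, [1.0, 2.0, 3.0])
    print(recv(shared, 0, 1, 42))
```

    Because the only shared resource is a directory, such a scheme runs wherever the file system is visible, at the cost of file-system latency on small messages.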

  14. Array distribution in data-parallel programs

    Chatterjee, Siddhartha; Gilbert, John R.; Schreiber, Robert; Sheffler, Thomas J.

    1994-01-01

    We consider distribution at compile time of the array data in a distributed-memory implementation of a data-parallel program written in a language like Fortran 90. We allow dynamic redistribution of data and define a heuristic algorithmic framework that chooses distribution parameters to minimize an estimate of program completion time. We represent the program as an alignment-distribution graph. We propose a divide-and-conquer algorithm for distribution that initially assigns a common distribution to each node of the graph and successively refines this assignment, taking computation, realignment, and redistribution costs into account. We explain how to estimate the effect of distribution on computation cost and how to choose a candidate set of distributions. We present the results of an implementation of our algorithms on several test problems.
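
    To make "distribution parameters" concrete, the sketch below shows one common family of array distributions, block-cyclic, in which a block size and a processor count decide which processor owns each element. It is an illustrative assumption, not the authors' algorithm: the names `owner` and `local_indices` are hypothetical, and the heuristic search over distributions described in the abstract is not reproduced.

```python
def owner(i, nprocs, block):
    """Processor that owns element i under a block-cyclic distribution."""
    return (i // block) % nprocs


def local_indices(rank, n, nprocs, block):
    """Global indices of an n-element array that are stored on processor `rank`."""
    return [i for i in range(n) if owner(i, nprocs, block) == rank]


if __name__ == "__main__":
    n, nprocs = 8, 2
    # block = n / nprocs gives a pure block distribution,
    # block = 1 gives a pure cyclic distribution.
    for block in (4, 2, 1):
        print(block, [local_indices(r, n, nprocs, block) for r in range(nprocs)])
```

    The block size is the kind of parameter a compile-time cost model can vary: larger blocks favor nearest-neighbor locality, smaller blocks favor load balance.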

  15. Profiling parallel Mercury programs with ThreadScope

    Bone, Paul

    2011-01-01

    The behavior of parallel programs is even harder to understand than the behavior of sequential programs. Parallel programs may suffer from any of the performance problems affecting sequential programs, as well as from several problems unique to parallel systems. Many of these problems are quite hard (or even practically impossible) to diagnose without help from specialized tools. We present a proposal for a tool for profiling the parallel execution of Mercury programs, a proposal whose implementation we have already started. This tool is an adaptation and extension of the ThreadScope profiler that was first built to help programmers visualize the execution of parallel Haskell programs.

  16. Parallel Programming Strategies for Irregular Adaptive Applications

    Biswas, Rupak; Biegel, Bryan (Technical Monitor)

    2001-01-01

    Achieving scalable performance for dynamic irregular applications is eminently challenging. Traditional message-passing approaches have been making steady progress towards this goal; however, they suffer from complex implementation requirements. The use of a global address space greatly simplifies the programming task, but can degrade the performance for such computations. In this work, we examine two typical irregular adaptive applications, Dynamic Remeshing and N-Body, under competing programming methodologies and across various parallel architectures. The Dynamic Remeshing application simulates flow over an airfoil, and refines localized regions of the underlying unstructured mesh. The N-Body experiment models two neighboring Plummer galaxies that are about to undergo a merger. Both problems demonstrate dramatic changes in processor workloads and interprocessor communication with time; thus, dynamic load balancing is a required component.

  17. Four styles of parallel and net programming

    Zhiwei XU; Yongqiang HE; Wei LIN; Li ZHA

    2009-01-01

    This paper reviews the programming landscape for parallel and network computing systems, focusing on four styles of concurrent programming models, and example languages/libraries. The four styles correspond to four scales of the targeted systems. At the smallest coprocessor scale, Single Instruction Multiple Thread (SIMT) and Compute Unified Device Architecture (CUDA) are considered. Transactional memory is discussed at the multicore or process scale. The MapReduce style is examined at the datacenter scale. At the Internet scale, Grid Service Markup Language (GSML) is reviewed, which intends to integrate resources distributed across multiple datacenters. The four styles are concerned with and emphasize different issues, which are needed by systems at different scales. This paper discusses issues related to efficiency, ease of use, and expressiveness.

  18. VERIFICATION OF PARALLEL AUTOMATA-BASED PROGRAMS

    M. A. Lukin

    2014-01-01

    The paper deals with an interactive method of automatic verification for parallel automata-based programs. The hierarchical state machines can be implemented in different threads and can interact with each other. Verification is done by means of the Spin tool and includes automatic Promela model construction, conversion of LTL formulas to Spin format, and counterexamples in terms of automata. Interactive verification makes it possible to decrease verification time and increase the maximum size of verifiable programs. The considered method supports verification of a parallel system of hierarchical automata that interact with each other through messages and shared variables. A feature of the automaton model is that each state machine is considered as a new data type and can have an arbitrary bounded number of instances. Each state machine in the system can run a different state machine in a new thread or have a nested state machine. This method was implemented in the developed Stater tool. Stater shows correct operation for all test cases.

  19. Professional WebGL Programming Developing 3D Graphics for the Web

    Anyuru, Andreas

    2012-01-01

    Everything you need to know about developing hardware-accelerated 3D graphics with WebGL! As the newest technology for creating 3D graphics on the web, in both games, applications, and on regular websites, WebGL gives web developers the capability to produce eye-popping graphics. This book teaches you how to use WebGL to create stunning cross-platform apps. The book features several detailed examples that show you how to develop 3D graphics with WebGL, including explanations of code snippets that help you understand the why behind the how. You will also develop a stronger understanding of W

  20. Scalable, incremental learning with MapReduce parallelization for cell detection in high-resolution 3D microscopy data

    Sung, Chul

    2013-08-01

    Accurate estimation of neuronal count and distribution is central to the understanding of the organization and layout of cortical maps in the brain, and changes in the cell population induced by brain disorders. High-throughput 3D microscopy techniques such as Knife-Edge Scanning Microscopy (KESM) are enabling whole-brain survey of neuronal distributions. Data from such techniques pose serious challenges to quantitative analysis due to the massive, growing, and sparsely labeled nature of the data. In this paper, we present a scalable, incremental learning algorithm for cell body detection that can address these issues. Our algorithm is computationally efficient (linear mapping, non-iterative) and does not require retraining (unlike gradient-based approaches) or retention of old raw data (unlike instance-based learning). We tested our algorithm on our rat brain Nissl data set, showing superior performance compared to an artificial neural network-based benchmark, and also demonstrated robust performance in a scenario where the data set is rapidly growing in size. Our algorithm is also highly parallelizable due to its incremental nature, and we demonstrated this empirically using a MapReduce-based implementation of the algorithm. We expect our scalable, incremental learning approach to be widely applicable to medical imaging domains where there is a constant flux of new data. © 2013 IEEE.

  1. Computational modeling of pitching cylinder-type ocean wave energy converters using 3D MPI-parallel simulations

    Freniere, Cole; Pathak, Ashish; Raessi, Mehdi

    2016-11-01

    Ocean Wave Energy Converters (WECs) are devices that convert energy from ocean waves into electricity. To aid in the design of WECs, an advanced computational framework has been developed which has advantages over conventional methods. The computational framework simulates the performance of WECs in a virtual wave tank by solving the full Navier-Stokes equations in 3D, capturing the fluid-structure interaction, nonlinear and viscous effects. In this work, we present simulations of the performance of pitching cylinder-type WECs and compare against experimental data. WECs are simulated at both model and full scales. The results are used to determine the role of the Keulegan-Carpenter (KC) number. The KC number is representative of viscous drag behavior on a bluff body in an oscillating flow, and is considered an important indicator of the dynamics of a WEC. Studying the effects of the KC number is important for determining the validity of the Froude scaling and the inviscid potential flow theory, which are heavily relied on in the conventional approaches to modeling WECs. Support from the National Science Foundation is gratefully acknowledged.
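
    For reference, the Keulegan-Carpenter number mentioned above is conventionally defined as follows. The symbols are the standard textbook ones rather than notation taken from this abstract: $U_m$ is the amplitude of the oscillating flow velocity, $T$ the oscillation period, $D$ the characteristic length of the body (here, presumably the cylinder diameter), and $A$ the displacement amplitude of a sinusoidal oscillation.

```latex
\mathrm{KC} = \frac{U_m T}{D},
\qquad
\text{and for sinusoidal motion with } U_m = \frac{2\pi A}{T}:
\quad
\mathrm{KC} = \frac{2\pi A}{D}.
```

    Small KC means the fluid excursion is short compared with the body size, so inertial forces dominate; large KC means drag and flow separation become important.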

  2. Verification and validation of a parallel 3D direct simulation Monte Carlo solver for atmospheric entry applications

    Nizenkov, Paul; Noeding, Peter; Konopka, Martin; Fasoulas, Stefanos

    2017-03-01

    The in-house direct simulation Monte Carlo solver PICLas, which enables parallel, three-dimensional simulations of rarefied gas flows, is verified and validated. Theoretical aspects of the method and the employed schemes are briefly discussed. Considered cases include simple reservoir simulations and complex re-entry geometries, which were selected from literature and simulated with PICLas. First, the chemistry module is verified using simple numerical and analytical solutions. Second, simulation results of the rarefied gas flow around a 70° blunted-cone, the REX Free-Flyer as well as multiple points of the re-entry trajectory of the Orion capsule are presented in terms of drag and heat flux. A comparison to experimental measurements as well as other numerical results shows an excellent agreement across the different simulation cases. An outlook on future code development and applications is given.

  3. Extending a serial 3D two-phase CFD code to parallel execution over MPI by using the PETSc library for domain decomposition

    Ervik, Åsmund; Müller, Bernhard

    2014-01-01

    To leverage the last two decades' transition in High-Performance Computing (HPC) towards clusters of compute nodes bound together with fast interconnects, a modern scalable CFD code must be able to efficiently distribute work amongst several nodes using the Message Passing Interface (MPI). MPI can enable very large simulations running on very large clusters, but it is necessary that the bulk of the CFD code be written with MPI in mind, an obstacle to parallelizing an existing serial code. In this work we present the results of extending an existing two-phase 3D Navier-Stokes solver, which was completely serial, to a parallel execution model using MPI. The 3D Navier-Stokes equations for two immiscible incompressible fluids are solved by the continuum surface force method, while the location of the interface is determined by the level-set method. We employ the Portable Extensible Toolkit for Scientific Computing (PETSc) for domain decomposition (DD) in a framework where only a fraction of the code needs to be a...

  4. Mars-solar wind interaction: LatHyS, an improved parallel 3-D multispecies hybrid model

    Modolo, Ronan; Hess, Sebastien; Mancini, Marco; Leblanc, Francois; Chaufray, Jean-Yves; Brain, David; Leclercq, Ludivine; Esteban-Hernández, Rosa; Chanteur, Gerard; Weill, Philippe; González-Galindo, Francisco; Forget, Francois; Yagi, Manabu; Mazelle, Christian

    2016-07-01

    In order to better represent Mars-solar wind interaction, we present an unprecedented model achieving spatial resolution down to 50 km, a so far unexplored resolution for global kinetic models of the Martian ionized environment. Such resolution approaches the ionospheric plasma scale height. In practice, the model is derived from a first version described in Modolo et al. (2005). An important effort of parallelization has been conducted and is presented here. A better description of the ionosphere was also implemented including ionospheric chemistry, electrical conductivities, and a drag force modeling the ion-neutral collisions in the ionosphere. This new version of the code, named LatHyS (Latmos Hybrid Simulation), is here used to characterize the impact of various spatial resolutions on simulation results. In addition, and following a global model challenge effort, we present the results of simulation runs for three cases which allow addressing the effect of the suprathermal corona and of the solar EUV activity on the magnetospheric plasma boundaries and on the global escape. Simulation results showed that global patterns are relatively similar for the different spatial resolution runs, but the finest grid runs provide a better representation of the ionosphere and display more details of the planetary plasma dynamics. Simulation results suggest that a significant fraction of escaping O+ ions originates from below 1200 km altitude.

  5. Parallel 3-D finite element analysis based on EBE method

    刘耀儒; 周维垣; 杨强

    2006-01-01

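    Using Jacobi preconditioning, a preconditioned conjugate gradient algorithm based on the element-by-element (EBE) method is derived, and the computational procedure of the finite element EBE method on distributed-memory parallel machines is given, which allows the entire 3D finite element computation to be parallelized. A 3D finite element solver, PFEM (Parallel Finite Element Method), was written and implemented on a networked cluster system. PFEM was tested numerically on a cantilever beam with rectangular cross-section, and the efficiency of serial and parallel computation was analyzed. Finally, PFEM was applied to the 3D finite element computation of the Ertan arch dam-foundation system. The results show that the 3D finite element EBE algorithm does not need to assemble the global stiffness matrix during the solution, which effectively reduces memory requirements; it has good parallelism and can be used effectively for large-scale numerical analysis of complex 3D structures.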
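
    The central point of the record above, a Jacobi-preconditioned conjugate gradient solve that never assembles the global stiffness matrix, can be sketched as follows. This is a serial, pure-Python illustration under stated assumptions, not the PFEM program: the function names, the representation of an element as a `(dofs, ke)` pair, and the toy spring-chain test problem are all hypothetical. In a distributed version, the element loops would run over each processor's local elements and be followed by a sum across processors.

```python
def ebe_matvec(elements, x):
    """y = K x computed element by element, without forming K."""
    y = [0.0] * len(x)
    for dofs, ke in elements:
        xe = [x[d] for d in dofs]                      # gather element values
        for a, da in enumerate(dofs):                  # scatter-add element product
            y[da] += sum(ke[a][b] * xe[b] for b in range(len(dofs)))
    return y


def ebe_diagonal(elements, n):
    """Diagonal of K accumulated from element diagonals (Jacobi preconditioner)."""
    diag = [0.0] * n
    for dofs, ke in elements:
        for a, da in enumerate(dofs):
            diag[da] += ke[a][a]
    return diag


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def jacobi_pcg(elements, f, tol=1e-10, max_iter=1000):
    """Solve K u = f for symmetric positive definite K given only its elements."""
    n = len(f)
    inv_diag = [1.0 / d for d in ebe_diagonal(elements, n)]
    x = [0.0] * n
    r = list(f)                                        # residual for x = 0
    z = [m * ri for m, ri in zip(inv_diag, r)]         # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    f_norm = dot(f, f) ** 0.5 or 1.0
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * f_norm:
            break
        kp = ebe_matvec(elements, p)
        alpha = rz / dot(p, kp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ki for ri, ki in zip(r, kp)]
        z = [m * ri for m, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


if __name__ == "__main__":
    # Toy problem: three nodes joined by unit springs, node 0 tied to ground
    # by another unit spring, unit load on node 2.
    spring = [[1.0, -1.0], [-1.0, 1.0]]
    elements = [((0,), [[1.0]]), ((0, 1), spring), ((1, 2), spring)]
    print(jacobi_pcg(elements, [0.0, 0.0, 1.0]))
```

    Only element matrices and a handful of vectors are stored, which is where the memory saving described in the abstract comes from.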

  6. Data-Parallel Programming in a Multithreaded Environment

    Matthew Haines

    1997-01-01

    Full Text Available Research on programming distributed memory multiprocessors has resulted in a well-understood programming model, namely data-parallel programming. However, data-parallel programming in a multithreaded environment is far less understood. For example, if multiple threads within the same process belong to different data-parallel computations, then the architecture, compiler, or run-time system must ensure that relative indexing and collective operations are handled properly and efficiently. We introduce a run-time-based solution for data-parallel programming in a distributed memory environment that handles the problems of relative indexing and collective communications among thread groups. As a result, the data-parallel programming model can now be executed in a multithreaded environment, such as a system using threads to support both task and data parallelism.

  7. NEPTUNE: A 3-D Fully Electromagnetic Particle Parallel Software

    陈军; 董烨; 杨温渊; 董志伟

    2009-01-01

    We developed NEPTUNE, a three-dimensional fully electromagnetic particle parallel software package based on the parallel adaptive structured mesh application infrastructure, to solve electromagnetic problems in high power microwave devices with complex geometry. This paper presents the basic numerical method and the parallel algorithms used in the program. A typical device with complex geometry is simulated on thousands of processors, and the results show good scalability. To date the software has successfully simulated many high power microwave devices.

  8. What a Parallel Programming Language Has to Let You Say,

    1984-09-01

    RD-fl147 854 WHAT A PARALLEL PROGRAMMING LANGUAGE HAS TO LET YOU SAY (U) MASSACHUSETTS INST OF TECH CAMBRIDGE ARTIFICIAL INTELLIGENCE LAB A...What a parallel programming language has to let you say. Alan Bawden/Philip E. Agre...Massachusetts Institute of Technology Artificial Intelligence Laboratory AI Memo 796 September 1984 What a parallel programming language has to let

  9. A Fast Parallel Simulation Code for Interaction between Proto-Planetary Disk and Embedded Proto-Planets: Implementation for 3D Code

    Li, Shengtai [Los Alamos National Laboratory; Li, Hui [Los Alamos National Laboratory

    2012-06-14

    sensitive to the position of the planet, we adopt the corotating frame, which allows the planet to move only in the radial direction if only one planet is present. This code has been extensively tested on a number of problems. For the Earth-mass planet with constant aspect ratio h = 0.05, the torque calculated using our code matches quite well with the 3D linear theory results by Tanaka et al. (2002). The code is fully parallelized via message-passing interface (MPI) and has very high parallel efficiency. Several numerical examples for both fixed planet and moving planet are provided to demonstrate the efficacy of the numerical method and code.

  10. A task-based parallelism and vectorized approach to 3D Method of Characteristics (MOC) reactor simulation for high performance computing architectures

    Tramm, John R.; Gunow, Geoffrey; He, Tim; Smith, Kord S.; Forget, Benoit; Siegel, Andrew R.

    2016-05-01

    In this study we present and analyze a formulation of the 3D Method of Characteristics (MOC) technique applied to the simulation of full core nuclear reactors. Key features of the algorithm include a task-based parallelism model that allows independent MOC tracks to be assigned to threads dynamically, ensuring load balancing, and a wide vectorizable inner loop that takes advantage of modern SIMD computer architectures. The algorithm is implemented in a set of highly optimized proxy applications in order to investigate its performance characteristics on CPU, GPU, and Intel Xeon Phi architectures. Speed, power, and hardware cost efficiencies are compared. Additionally, performance bottlenecks are identified for each architecture in order to determine the prospects for continued scalability of the algorithm on next generation HPC architectures.
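
    The two levels of parallelism described above, independent tracks handed out to threads dynamically and a wide inner loop over energy groups, can be sketched as follows. This is a toy illustration under stated assumptions rather than the authors' proxy applications: the function names and data layout are hypothetical, the per-segment update is the standard flat-source attenuation formula, and a Python thread pool only demonstrates the scheduling pattern and will not give real speedup for this pure-Python arithmetic.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def sweep_track(segments, sigma_t, source, psi_in):
    """Attenuate the angular flux along one track, for all energy groups.

    segments: list of (length, region) pairs crossed by the track
    sigma_t[region][g], source[region][g]: total cross section and source
    """
    psi = list(psi_in)
    for length, region in segments:
        # Inner loop over energy groups: the part a compiled code would vectorize.
        for g in range(len(psi)):
            att = math.exp(-sigma_t[region][g] * length)
            psi[g] = psi[g] * att + source[region][g] / sigma_t[region][g] * (1.0 - att)
    return psi


def sweep_all(tracks, sigma_t, source, psi_in, workers=4):
    """Each track is an independent task; idle workers pull the next one."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda t: sweep_track(t, sigma_t, source, psi_in), tracks))


if __name__ == "__main__":
    sigma_t = [[1.0, 2.0]]                 # one region, two energy groups
    source = [[0.0, 0.0]]                  # pure absorber
    tracks = [[(1.0, 0)], [(0.5, 0), (0.5, 0)]]
    print(sweep_all(tracks, sigma_t, source, [1.0, 1.0]))
```

    Handing out whole tracks as tasks lets tracks of very different lengths balance across threads without a static partition.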

  11. PDDP: A data parallel programming model. Revision 1

    Warren, K.H.

    1995-06-01

    PDDP, the Parallel Data Distribution Preprocessor, is a data parallel programming model for distributed memory parallel computers. PDDP implements High Performance Fortran compatible data distribution directives and parallelism expressed by the use of Fortran 90 array syntax, the FORALL statement, and the WHERE construct. Distributed data objects belong to a global name space; other data objects are treated as local and replicated on each processor. PDDP allows the user to program in a shared-memory style and generates codes that are portable to a variety of parallel machines. For interprocessor communication, PDDP uses the fastest communication primitives on each platform.

  12. Optimizing FORTRAN Programs for Hierarchical Memory Parallel Processing Systems

    金国华; 陈福接

    1993-01-01

    Parallel loops account for the greatest amount of parallelism in numerical programs. Executing nested loops in parallel with low run-time overhead is thus very important for achieving high performance in parallel processing systems. However, in parallel processing systems with caches or local memories in memory hierarchies, a “thrashing problem” may arise whenever data move back and forth between the caches or local memories in different processors. Previous techniques can only deal with the rather simple cases with one linear function in the perfectly nested loop. In this paper, we present a parallel program optimizing technique called hybrid loop interchange (HLI) for the cases with multiple linear functions and loop-carried data dependences in the nested loop. With HLI we can easily eliminate or reduce the thrashing phenomena without reducing the program parallelism.
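
    As background for the abstract above, the sketch below shows plain loop interchange on a perfectly nested loop without loop-carried dependences. It is not the HLI technique itself, which the abstract does not specify; the function names and the array contents are assumptions chosen for illustration.

```python
def fill_original(n_rows, n_cols):
    """Outer loop over rows: each outer iteration touches one whole row."""
    a = [[0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            a[i][j] = i * n_cols + j
    return a


def fill_interchanged(n_rows, n_cols):
    """Loops swapped: each outer iteration now touches one whole column.

    The interchange is legal here because no iteration reads a value
    written by another iteration (no loop-carried dependence).
    """
    a = [[0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        for i in range(n_rows):
            a[i][j] = i * n_cols + j
    return a


# Both orders compute the same array; only the access pattern differs.
print(fill_original(2, 3) == fill_interchanged(2, 3))
```

    Which loop is outermost determines which data each processor touches when the outer loop is run in parallel, which is why the choice matters for cache or local-memory traffic.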

  13. A Parallel 3D Spectral Difference Method for Solutions of Compressible Navier Stokes Equations on Deforming Grids and Simulations of Vortex Induced Vibration

    DeJong, Andrew

    Numerical models of fluid-structure interaction have grown in importance due to increasing interest in environmental energy harvesting, airfoil-gust interactions, and bio-inspired formation flying. Powered by increasingly powerful parallel computers, such models seek to explain the fundamental physics behind the complex, unsteady fluid-structure phenomena. To this end, a high-fidelity computational model based on the high-order spectral difference method on 3D unstructured, dynamic meshes has been developed. The spectral difference method constructs continuous solution fields within each element with a Riemann solver to compute the inviscid fluxes at the element interfaces and an averaging mechanism to compute the viscous fluxes. This method has shown promise in the past as a highly accurate, yet sufficiently fast method for solving unsteady viscous compressible flows. The solver is monolithically coupled to the equations of motion of an elastically mounted 3-degree of freedom rigid bluff body undergoing flow-induced lift, drag, and torque. The mesh is deformed using 4 methods: an analytic function, Laplace equation, biharmonic equation, and a bi-elliptic equation with variable diffusivity. This single system of equations -- fluid and structure -- is advanced through time using a 5-stage, 4th-order Runge-Kutta scheme. Message Passing Interface is used to run the coupled system in parallel on up to 240 processors. The solver is validated against previously published numerical and experimental data for an elastically mounted cylinder. The effect of adding an upstream body and inducing wake galloping is observed.

  14. Introduction to 3D Graphics through Excel

    Benacka, Jan

    2013-01-01

    The article presents a method of explaining the principles of 3D graphics by building a rotatable and resizable orthographic parallel projection of a cuboid in Excel. No programming is used. The method was tried in fourteen 90-minute lessons with 181 participants, who were Informatics teachers, undergraduates of Applied Informatics and gymnasium…
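
    The article itself works in Excel without programming; purely as an illustration of the underlying idea, a rotatable, resizable orthographic parallel projection of a cuboid might be sketched in Python as follows. The function names, the rotation order (about y, then about x), and the sample dimensions are assumptions.

```python
import math


def cuboid_vertices(a, b, c):
    """Eight corners of an a x b x c cuboid centred at the origin."""
    return [(sx * a / 2, sy * b / 2, sz * c / 2)
            for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)]


def rotate(p, yaw, pitch):
    """Rotate p about the y axis by yaw, then about the x axis by pitch (radians)."""
    x, y, z = p
    x, z = (x * math.cos(yaw) + z * math.sin(yaw),
            -x * math.sin(yaw) + z * math.cos(yaw))
    y, z = (y * math.cos(pitch) - z * math.sin(pitch),
            y * math.sin(pitch) + z * math.cos(pitch))
    return (x, y, z)


def project(p):
    """Orthographic parallel projection onto the screen plane: drop z."""
    return (p[0], p[1])


def project_cuboid(a, b, c, yaw, pitch):
    """Screen coordinates of all eight corners after rotation."""
    return [project(rotate(v, yaw, pitch)) for v in cuboid_vertices(a, b, c)]


if __name__ == "__main__":
    for point in project_cuboid(2, 1, 3, math.radians(30), math.radians(20)):
        print(point)
```

    With both angles at zero, the front and back faces coincide on screen, so only four distinct points remain.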

  15. THERM3D -- A boundary element computer program for transient heat conduction problems

    Ingber, M.S. [New Mexico Univ., Albuquerque, NM (United States). Dept. of Mechanical Engineering]

    1994-02-01

    The computer code THERM3D implements the direct boundary element method (BEM) to solve transient heat conduction problems in arbitrary three-dimensional domains. This particular implementation of the BEM avoids performing time-consuming domain integrations by approximating a "generalized forcing function" in the interior of the domain with the use of radial basis functions. An approximate particular solution is then constructed, and the original problem is transformed into a sequence of Laplace problems. The code is capable of handling a large variety of boundary conditions including isothermal, specified flux, convection, radiation, and combined convection and radiation conditions. The computer code is benchmarked by comparisons with analytic and finite element results.

  16. Parallel Programming with Matrix Distributed Processing

    Di Pierro, Massimo

    2005-01-01

    Matrix Distributed Processing (MDP) is a C++ library for fast development of efficient parallel algorithms. It constitutes the core of FermiQCD. MDP enables programmers to focus on algorithms, while parallelization is dealt with automatically and transparently. Here we present a brief overview of MDP and examples of applications in Computer Science (Cellular Automata), Engineering (PDE Solver) and Physics (Ising Model).

  17. An interactive parallel programming environment applied in atmospheric science

    vonLaszewski, G.

    1996-01-01

    This article introduces an interactive parallel programming environment (IPPE) that simplifies the generation and execution of parallel programs. One of the tasks of the environment is to generate message-passing parallel programs for homogeneous and heterogeneous computing platforms. The parallel programs are represented by using visual objects. This is accomplished with the help of a graphical programming editor that is implemented in Java and enables portability to a wide variety of computer platforms. In contrast to other graphical programming systems, reusable parts of the programs can be stored in a program library to support rapid prototyping. In addition, runtime performance data on different computing platforms is collected in a database. A selection process determines dynamically the software and the hardware platform to be used to solve the problem in minimal wall-clock time. The environment is currently being tested on a Grand Challenge problem, the NASA four-dimensional data assimilation system.

  18. Declarative Parallel Programming in Spreadsheet End-User Development

    Biermann, Florian

    2016-01-01

    In this literature study, we provide an overview of the publications on spreadsheet end-user programming and declarative array programming to inform further research on parallel programming in spreadsheets. Our results show that there is a clear overlap between spreadsheet programming and array programming and we...

  19. Deterministic Consistency: A Programming Model for Shared Memory Parallelism

    Aviram, Amittai; Ford, Bryan

    2009-01-01

    The difficulty of developing reliable parallel software is generating interest in deterministic environments, where a given program and input can yield only one possible result. Languages or type systems can enforce determinism in new code, and runtime systems can impose synthetic schedules on legacy parallel code. To parallelize existing serial code, however, we would like a programming model that is naturally deterministic without language restrictions or artificial scheduling. We propose "...

  20. Programming Mechanical and Physicochemical Properties of 3D Hydrogel Cellular Microcultures via Direct Ink Writing.

    McCracken, Joselle M; Badea, Adina; Kandel, Mikhail E; Gladman, A Sydney; Wetzel, David J; Popescu, Gabriel; Lewis, Jennifer A; Nuzzo, Ralph G

    2016-05-01

    3D hydrogel scaffolds are widely used in cellular microcultures and tissue engineering. Using direct ink writing, microperiodic poly(2-hydroxyethyl-methacrylate) (pHEMA) scaffolds are created that are then printed, cured, and modified by absorbing 30 kDa protein poly-l-lysine (PLL) to render them biocompliant in model NIH/3T3 fibroblast and MC3T3-E1 preosteoblast cell cultures. Spatial light interference microscopy (SLIM) live cell imaging studies are carried out to quantify cellular motilities for each cell type, substrate, and surface treatment of interest. 3D scaffold mechanics is investigated using atomic force microscopy (AFM), while their absorption kinetics are determined by confocal fluorescence microscopy (CFM) for a series of hydrated hydrogel films prepared from prepolymers with different homopolymer-to-monomer (Mr) ratios. The observations reveal that the inks with higher Mr values yield relatively more open-mesh gels due to a lower degree of entanglement. The biocompatibility of printed hydrogel scaffolds can be controlled by both PLL content and hydrogel mesh properties.

  1. A linear programming approach to reconstructing subcellular structures from confocal images for automated generation of representative 3D cellular models.

    Wood, Scott T; Dean, Brian C; Dean, Delphine

    2013-04-01

    This paper presents a novel computer vision algorithm to analyze 3D stacks of confocal images of fluorescently stained single cells. The goal of the algorithm is to create representative in silico model structures that can be imported into finite element analysis software for mechanical characterization. Segmentation of cell and nucleus boundaries is accomplished via standard thresholding methods. Using novel linear programming methods, a representative actin stress fiber network is generated by computing a linear superposition of fibers having minimum discrepancy compared with an experimental 3D confocal image. Qualitative validation is performed through analysis of seven 3D confocal image stacks of adherent vascular smooth muscle cells (VSMCs) grown in 2D culture. The presented method is able to automatically generate 3D geometries of the cell's boundary, nucleus, and representative F-actin network based on standard cell microscopy data. These geometries can be used for direct importation and implementation in structural finite element models for analysis of the mechanics of a single cell to potentially speed discoveries in the fields of regenerative medicine, mechanobiology, and drug discovery.

  2. PROP3D: A Program for 3D Euler Unsteady Aerodynamic and Aeroelastic (Flutter and Forced Response) Analysis of Propellers. Version 1.0

    Srivastava, R.; Reddy, T. S. R.

    1996-01-01

    This guide describes the input data required for steady or unsteady aerodynamic and aeroelastic analysis of propellers, and the output files generated, when using PROP3D. The aerodynamic forces are obtained by solving the three-dimensional unsteady, compressible Euler equations. A normal mode structural analysis is used to obtain the aeroelastic equations, which are solved using either a time domain or a frequency domain solution method. Sample input and output files are included in this guide for steady aerodynamic analysis of single- and counter-rotation propellers, and aeroelastic analysis of a single-rotation propeller.

  3. Selecting Simulation Models when Predicting Parallel Program Behaviour

    Broberg, Magnus; Lundberg, Lars; Grahn, Håkan

    2002-01-01

    The use of multiprocessors is an important way to increase the performance of a supercomputing program. This means that the program has to be parallelized to make use of the multiple processors. The parallelization is unfortunately not an easy task. Development tools supporting parallel programs are important. Further, it is the customer that decides the number of processors in the target machine, and as a result the developer has to make sure that the program runs efficiently on any numbe...

  4. Professional Parallel Programming with C#: Master Parallel Extensions with .NET 4

    Hillar, Gastón

    2010-01-01

    Expert guidance for those programming today's dual-core processor PCs. As PC processors explode from one or two to now eight processors, there is an urgent need for programmers to master concurrent programming. This book dives deep into the latest technologies available to programmers for creating professional parallel applications using C#, .NET 4, and Visual Studio 2010. The book covers task-based programming, coordination data structures, PLINQ, thread pools, asynchronous programming model, and more. It also teaches other parallel programming techniques, such as SIMD and vectorization. Teach

  5. 3D and Education

    Meulien Ohlmann, Odile

    2013-02-01

    Today the industry offers a chain of 3D products. Learning to "read" and to "create in 3D" is becoming an educational issue of primary importance. Drawing on 25 years of professional experience in France, the United States and Germany, Odile Meulien has set up a personal method of initiation to 3D creation that entails the spatial/temporal experience of the holographic visual. She will present some of the tools and techniques used for this learning, their advantages and disadvantages, programs and issues of educational policy, and the constraints and expectations related to the development of new techniques for 3D imaging. Although the creation of display holograms is much reduced compared with the 1990s, the holographic concept is spreading in all scientific, social, and artistic activities of our present time. She will also raise many questions: What does 3D mean? Is it communication? Is it perception? How do seeing and not seeing interfere? What else has to be taken into consideration to communicate in 3D? How should the non-visible relations of moving objects with subjects be handled? Does this transform our model of exchange with others? What kind of interaction does this have with our everyday life? Then come more practical questions: How does one learn to create 3D visualization, to learn 3D grammar, 3D language, 3D thinking? What for? At what level? In which subject? For whom?

  6. 3-D magnetotelluric inversion including topography using deformed hexahedral edge finite elements and direct solvers parallelized on SMP computers - Part I: forward problem and parameter Jacobians

    Kordy, M.; Wannamaker, P.; Maris, V.; Cherkaev, E.; Hill, G.

    2016-01-01

    We have developed an algorithm, which we call HexMT, for 3-D simulation and inversion of magnetotelluric (MT) responses using deformable hexahedral finite elements that permit incorporation of topography. Direct solvers parallelized on symmetric multiprocessor (SMP), single-chassis workstations with large RAM are used throughout, including the forward solution, parameter Jacobians and model parameter update. In Part I, the forward simulator and Jacobian calculations are presented. We use first-order edge elements to represent the secondary electric field (E), yielding accuracy O(h) for E and its curl (magnetic field). For very low frequencies or small material admittivities, the E-field requires divergence correction. With the help of Hodge decomposition, the correction may be applied in one step after the forward solution is calculated. This allows accurate E-field solutions in dielectric air. The system matrix factorization and source vector solutions are computed using the MKL PARDISO library, which shows good scalability through 24 processor cores. The factorized matrix is used to calculate the forward response as well as the Jacobians of electromagnetic (EM) field and MT responses using the reciprocity theorem. Comparison with other codes demonstrates accuracy of our forward calculations. We consider a popular conductive/resistive double brick structure, several synthetic topographic models and the natural topography of Mount Erebus in Antarctica. In particular, the ability of finite elements to represent smooth topographic slopes permits accurate simulation of refraction of EM waves normal to the slopes at high frequencies. Run-time tests of the parallelized algorithm indicate that for meshes as large as 176 × 176 × 70 elements, MT forward responses and Jacobians can be calculated in ~1.5 hr per frequency. Together with an efficient inversion parameter step described in Part II, MT inversion problems of 200-300 stations are computable with total run times

  7. The SMC (Short Model Coil) Nb3Sn Program: FE Analysis with 3D Modeling

    Kokkinos, C; Guinchard, M; Karppinen, M; Manil, P; Perez, J C; Regis, F

    2012-01-01

    The SMC (Short Model Coil) project aims at testing superconducting coils in racetrack configuration, wound with Nb3Sn cable. The degradation of the magnetic properties of the cable is studied by applying different levels of pre-stress. It is an essential step in the validation of procedures for the construction of superconducting magnets with high performance conductor. Two SMC assemblies have been completed and cold tested in the frame of a European collaboration between CEA (FR), CERN and STFC (UK), with technical support from LBNL (US). The second assembly showed remarkably good quench results, reaching a peak field of 12.5 T. This paper details the new 3D modeling method of the SMC, implemented using the ANSYS® Workbench environment. Advanced computer-aided-design (CAD) tools are combined with multi-physics Finite Element Analyses (FEA), in the same integrated graphic interface, forming a fully parametric model that enables simulation driven development of the SMC project. The magnetic and structural ...

  8. 3D reconstruction of parallel confocal microscope based on C++ Builder

    杨召雷; 张运波; 董洪波

    2012-01-01

    This paper studies a 3D reconstruction method for a DMD (Digital Micro-mirror Device) multi-point parallel scanning confocal inspection system, based on C++ Builder. The virtual pinhole bitmaps collected in the experiment are fitted to obtain three-dimensional coordinate data, which are then gridded. A 3D reconstruction software system is developed with C++ Builder and integrated OpenGL and used to rebuild the three-dimensional graphics. The experimental results show that the surface microstructure of the measured object was restored successfully.

  9. Directions in parallel programming: HPF, shared virtual memory and object parallelism in pC++

    Bodin, Francois; Priol, Thierry; Mehrotra, Piyush; Gannon, Dennis

    1994-01-01

    Fortran and C++ are the dominant programming languages used in scientific computation. Consequently, extensions to these languages are the most popular for programming massively parallel computers. We discuss two such approaches to parallel Fortran and one approach to C++. The High Performance Fortran Forum has designed HPF with the intent of supporting data parallelism on Fortran 90 applications. HPF works by asking the user to help the compiler distribute and align the data structures with the distributed memory modules in the system. Fortran-S takes a different approach in which the data distribution is managed by the operating system and the user provides annotations to indicate parallel control regions. In the case of C++, we look at pC++ which is based on a concurrent aggregate parallel model.

  10. A Performance Analysis Tool for PVM Parallel Programs

    Chen Wang; Yin Liu; Changjun Jiang; Zhaoqing Zhang

    2004-01-01

    In this paper, we introduce the design and implementation of ParaVT, which is a visual performance analysis and parallel debugging tool. In ParaVT, we propose an automated instrumentation mechanism. Based on this mechanism, ParaVT automatically analyzes the performance bottleneck of parallel applications and provides a visual user interface to monitor and analyze the performance of parallel programs. In addition, it also supports certain extensions.

  11. Architectural Adaptability in Parallel Programming via Control Abstraction

    1991-01-01

    Technical Report 359 January 1991 Abstract Parallel programming involves finding the potential parallelism in an application, choosing an...during the development of this paper. 34 References [Albert et al., 1988] Eugene Albert, Kathleen Knobe, Joan D. Lukas, and Guy L. Steele, Jr

  12. Detection of And-Parallelism in Logic Programs

    黄志毅; 胡守仁

    1990-01-01

    In this paper, we present a detection technique of and-parallelism in logic programs. The detection consists of three phases: analysis of entry modes, derivation of exit modes and determination of execution graph expressions. Compared with other techniques [2,4,5], our approach, with the compile-time program-level data-dependence analysis of logic programs, can efficiently exploit and-parallelism in logic programs. Two precompilers, based on our technique and DeGroot's approach [3] respectively, have been implemented in the SES-PIM system [12]. Through compiling and running some typical benchmarks in SES-PIM, we conclude that our technique can, in most cases, exploit as much and-parallelism as the dynamic approach [13] does under the “producer-consumer” scheme, and needs less dynamic overhead while exploiting more and-parallelism than DeGroot's approach does.

  13. The FORCE: A highly portable parallel programming language

    Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

    1989-01-01

    Here, it is explained why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low-level machine dependencies and to build machine-independent high-level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared-memory multiprocessor executing them.

  14. The FORCE - A highly portable parallel programming language

    Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

    1989-01-01

    This paper explains why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low-level machine dependencies and to build machine-independent high-level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared-memory multiprocessor executing them.

  15. Integrated Task And Data Parallel Programming: Language Design

    Grimshaw, Andrew S.; West, Emily A.

    1998-01-01

    This research investigates the combination of task and data parallel language constructs within a single programming language. There are a number of applications that exhibit properties which would be well served by such an integrated language. Examples include global climate models, aircraft design problems, and multidisciplinary design optimization problems. Our approach incorporates data parallel language constructs into an existing, object oriented, task parallel language. The language will support creation and manipulation of parallel classes and objects of both types (task parallel and data parallel). Ultimately, the language will allow data parallel and task parallel classes to be used either as building blocks or managers of parallel objects of either type, thus allowing the development of single and multi-paradigm parallel applications. 1995 Research Accomplishments In February I presented a paper at Frontiers '95 describing the design of the data parallel language subset. During the spring I wrote and defended my dissertation proposal. Since that time I have developed a runtime model for the language subset. I have begun implementing the model and hand-coding simple examples which demonstrate the language subset. I have identified an astrophysical fluid flow application which will validate the data parallel language subset. 1996 Research Agenda Milestones for the coming year include implementing a significant portion of the data parallel language subset over the Legion system. Using simple hand-coded methods, I plan to demonstrate (1) concurrent task and data parallel objects and (2) task parallel objects managing both task and data parallel objects. My next steps will focus on constructing a compiler and implementing the fluid flow application with the language. Concurrently, I will conduct a search for a real-world application exhibiting both task and data parallelism within the same program. Additional 1995 Activities During the fall I collaborated

  16. Characterizing and Mitigating Work Time Inflation in Task Parallel Programs

    Stephen L. Olivier

    2013-01-01

    Task parallelism raises the level of abstraction in shared memory parallel programming to simplify the development of complex applications. However, task parallel applications can exhibit poor performance due to thread idleness, scheduling overheads, and work time inflation: additional time spent by threads in a multithreaded computation beyond the time required to perform the same work in a sequential computation. We identify the contributions of each factor to lost efficiency in various task parallel OpenMP applications and diagnose the causes of work time inflation in those applications. Increased data access latency can cause significant work time inflation in NUMA systems. Our locality framework for task parallel OpenMP programs mitigates this cause of work time inflation. Our extensions to the Qthreads library demonstrate that locality-aware scheduling can improve performance up to 3X compared to the Intel OpenMP task scheduler.
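
    The definition of work time inflation given above can be turned into a small calculation. The sketch below assumes that per-thread work times have already been measured with idle time and scheduling overhead excluded; the function name and the sample timings are hypothetical, not data from the paper.

```python
def work_time_inflation(thread_work_times, sequential_time):
    """Extra time spent doing the work itself in the parallel run.

    thread_work_times: seconds each thread spent executing task bodies
                       (idle time and scheduling overhead excluded).
    sequential_time:   seconds the same work takes in a sequential run.
    """
    return sum(thread_work_times) - sequential_time


# Made-up timings: four threads, 10 s of sequential work.
inflation = work_time_inflation([3.0, 3.5, 2.5, 3.0], 10.0)
print(inflation)  # 2.0 seconds of inflated work time
```

    A positive value means the threads collectively spent longer on the work than the sequential run did, for example because of slower remote memory accesses on a NUMA system.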

  17. The 3D structure of the hadrons: recent results and experimental program at Jefferson Lab

    Muñoz Camacho C.

    2014-04-01

    The understanding of Quantum Chromodynamics (QCD) at large distances still remains one of the main outstanding problems of nuclear physics. Studying the internal structure of hadrons provides a way to probe QCD in the non-perturbative domain and can help us unravel the internal structure of the most elementary blocks of matter. Jefferson Lab (JLab) has already delivered results on how elementary quarks and gluons create nucleon structure and properties. The upgrade of JLab to 12 GeV will allow the full exploration of the valence-quark structure of nucleons and the extraction of real three-dimensional pictures. I will present recent results and review the future experimental program at JLab.

  18. Using Fuzzy Gaussian Inference and Genetic Programming to Classify 3D Human Motions

    Khoury, Mehdi; Liu, Honghai

    This research introduces and builds on the concept of Fuzzy Gaussian Inference (FGI) (Khoury and Liu in Proceedings of UKCI, 2008 and IEEE Workshop on Robotic Intelligence in Informationally Structured Space (RiiSS 2009), 2009) as a novel way to build Fuzzy Membership Functions that map to hidden Probability Distributions underlying human motions. This method is now combined with a Genetic Programming Fuzzy rule-based system in order to classify boxing moves from natural human Motion Capture data. In this experiment, FGI alone is able to recognise seven different boxing stances simultaneously with an accuracy superior to a GMM-based classifier. Results seem to indicate that adding an evolutionary Fuzzy Inference Engine on top of FGI improves the accuracy of the classifier in a consistent way.

  19. Hybrid 3-D rocket trajectory program. Part 1: Formulation and analysis. Part 2: Computer programming and user's instruction. [computerized simulation using three dimensional motion analysis]

    Huang, L. C. P.; Cook, R. A.

    1973-01-01

    Models utilizing various sub-sets of the six degrees of freedom are used in trajectory simulation. A 3-D model with only linear degrees of freedom is especially attractive, since the coefficients for the angular degrees of freedom are the most difficult to determine and the angular equations are the most time consuming for the computer to evaluate. A computer program is developed that uses three separate subsections to predict trajectories. A launch rail subsection is used until the rocket has left its launcher. The program then switches to a special 3-D section which computes motions in two linear and one angular degrees of freedom. When the rocket trims out, the program switches to the standard, three linear degrees of freedom model.

  20. Retargeting of existing FORTRAN program and development of parallel compilers

    Agrawal, Dharma P.

    1988-01-01

    The software models used in implementing the parallelizing compiler for the B-HIVE multiprocessor system are described. The various models and strategies used in the compiler development are: flexible granularity model, which allows a compromise between two extreme granularity models; communication model, which is capable of precisely describing the interprocessor communication timings and patterns; loop type detection strategy, which identifies different types of loops; critical path with coloring scheme, which is a versatile scheduling strategy for any multicomputer with some associated communication costs; and loop allocation strategy, which realizes optimum overlapped operations between computation and communication of the system. Using these models, several sample routines of the AIR3D package are examined and tested. It may be noted that the automatically generated codes are highly parallelized to maximize the degree of parallelism, obtaining speedup on systems of up to 28 to 32 processors. A comparison of parallel codes for both the existing and proposed communication models is performed and the corresponding expected speedup factors are obtained. The experimentation shows that the B-HIVE compiler produces more efficient codes than existing techniques. Work is progressing well in completing the final phase of the compiler. Numerous enhancements are needed to improve the capabilities of the parallelizing compiler.

  1. Program Transformation to Identify List-Based Parallel Skeletons

    Venkatesh Kannan

    2016-07-01

    Algorithmic skeletons are used as building-blocks to ease the task of parallel programming by abstracting the details of parallel implementation from the developer. Most existing libraries provide implementations of skeletons that are defined over flat data types such as lists or arrays. However, skeleton-based parallel programming is still very challenging as it requires intricate analysis of the underlying algorithm and often uses inefficient intermediate data structures. Further, the algorithmic structure of a given program may not match those of list-based skeletons. In this paper, we present a method to automatically transform any given program to one that is defined over a list and is more likely to contain instances of list-based skeletons. This facilitates the parallel execution of a transformed program using existing implementations of list-based parallel skeletons. Further, by using an existing transformation called distillation in conjunction with our method, we produce transformed programs that contain fewer inefficient intermediate data structures.
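
    To make the notion of a list-based skeleton concrete, here is a minimal sketch of a map skeleton composed with a reduce skeleton. It does not implement the paper's program transformation or distillation; `par_map`, `map_reduce`, and `square` are hypothetical names, and a thread pool stands in for whatever parallel backend a real skeleton library would use.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def par_map(f, xs, workers=4):
    """Map skeleton: apply f to every list element independently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(f, xs))


def map_reduce(f, op, xs, init):
    """Map skeleton followed by a reduce skeleton over the mapped list."""
    return reduce(op, par_map(f, xs), init)


def square(x):
    return x * x


# A program written over a flat list fits the skeletons directly.
sum_of_squares = map_reduce(square, operator.add, [1, 2, 3, 4], 0)
print(sum_of_squares)  # 30
```

    The developer supplies only `square` and the combining operator; how the list elements are distributed across workers stays hidden inside the skeleton.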

  2. Documentation of program AFTBDY to generate coordinate system for 3D after body using body fitted curvilinear coordinates, part 1

    Kumar, D.

    1980-01-01

    The computer program AFTBDY generates a body fitted curvilinear coordinate system for a wedge curved after body. This wedge curved after body is being used in an experimental program. The coordinate system generated by AFTBDY is used to solve 3D compressible N.S. equations. The coordinate system in the physical plane is a cartesian x,y,z system, whereas, in the transformed plane a rectangular xi, eta, zeta system is used. The coordinate system generated is such that in the transformed plane coordinate spacing in the xi, eta, zeta direction is constant and equal to unity. The physical plane coordinate lines in the different regions are clustered heavily or sparsely depending on the regions where physical quantities to be solved for by the N.S. equations have high or low gradients. The coordinate distribution in the physical plane is such that x stays constant in eta and zeta direction, whereas, z stays constant in xi and eta direction. The desired distribution in x and z is input to the program. Consequently, only the y-coordinate is solved for by the program AFTBDY.

  3. Web Based Parallel Programming Workshop for Undergraduate Education.

    Marcus, Robert L.; Robertson, Douglass

    Central State University (Ohio), under a contract with Nichols Research Corporation, has developed a World Wide Web-based workshop on high performance computing entitled "IBM SP2 Parallel Programming Workshop." The research is part of the DoD (Department of Defense) High Performance Computing Modernization Program. The research…

  4. Protocol-Based Verification of Message-Passing Parallel Programs

    López-Acosta, Hugo-Andrés; Marques, Eduardo R. B.; Martins, Francisco

    2015-01-01

    a protocol language based on a dependent type system for message-passing parallel programs, which includes various communication operators, such as point-to-point messages, broadcast, reduce, array scatter and gather. For the verification of a program against a given protocol, the protocol is first...

  5. Development of massively parallel quantum chemistry program SMASH

    Ishimura, Kazuya [Department of Theoretical and Computational Molecular Science, Institute for Molecular Science 38 Nishigo-Naka, Myodaiji, Okazaki, Aichi 444-8585 (Japan)]

    2015-12-31

    A massively parallel program for quantum chemistry calculations, SMASH, was released under the Apache License 2.0 in September 2014. The SMASH program is written in the Fortran90/95 language with the MPI and OpenMP standards for parallelization. Frequently used routines, such as one- and two-electron integral calculations, are modularized to simplify program development. The speed-up of the B3LYP energy calculation for (C150H30)2 with the cc-pVDZ basis set (4500 basis functions) was 50,499 on 98,304 cores of the K computer.
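
    The reported speed-up can be turned into a parallel efficiency by dividing it by the core count. The short sketch below does that arithmetic with the two figures quoted in the abstract; the function name `parallel_efficiency` is only illustrative, and the abstract itself does not state an efficiency value.

```python
def parallel_efficiency(speedup, cores):
    """Parallel efficiency = achieved speed-up / number of cores used."""
    return speedup / cores


# Figures quoted in the abstract: speed-up 50,499 on 98,304 cores.
eff = parallel_efficiency(50_499, 98_304)
print(f"parallel efficiency: {eff:.1%}")  # roughly 51%
```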

  6. Accelerate Performance on the Parallel Programming Super Highway

    2010-04-01

    barriers associated with parallel programming Dataflow languages ought to be considered along with traditional (imperative) programming solutions 2...asymptotic condition (3 GHz) Moore's Law may still be valid, but the Law of Thermodynamics is also valid Parallel Programming options exist, but...languages can address some major challenges associated with parallel programming Many dataflow languages exist today, and should be considered along with

  7. Evaluating integration of inland bathymetry in the U.S. Geological Survey 3D Elevation Program, 2014

    Miller-Corbett, Cynthia

    2016-09-01

    Inland bathymetry survey collections, survey data types, features, sources, availability, and the effort required to integrate inland bathymetric data into the U.S. Geological Survey 3D Elevation Program are assessed to help determine the feasibility of integrating three-dimensional water feature elevation data into The National Map. Available data from wading, acoustic, light detection and ranging, and combined technique surveys are provided by the U.S. Geological Survey, National Oceanic and Atmospheric Administration, U.S. Army Corps of Engineers, and other sources. Inland bathymetric data accessed through Web-hosted resources or contacts provide useful baseline parameters for evaluating survey types and techniques used for collection and processing, and serve as a basis for comparing survey methods and the quality of results. Historically, boat-mounted acoustic surveys have provided most inland bathymetry data. Light detection and ranging techniques that are beneficial in areas hard to reach by boat, that can collect dense data in shallow water to provide comprehensive coverage, and that can be cost effective for surveying large areas with good water clarity are becoming more common; however, optimal conditions and techniques for collecting and processing light detection and ranging inland bathymetry surveys are not yet well defined. Assessment of site condition parameters important for understanding inland bathymetry survey issues and results, and an evaluation of existing inland bathymetry survey coverage are proposed as steps to develop criteria for implementing a useful and successful inland bathymetry survey plan in the 3D Elevation Program. These survey parameters would also serve as input for an inland bathymetry survey data baseline. Integration and interpolation techniques are important factors to consider in developing a robust plan; however, available survey data are usually in a triangulated irregular network format or other format compatible with

  8. On program restructuring, scheduling, and communication for parallel processor systems

    Polychronopoulos, Constantine D.

    1986-08-01

    This dissertation discusses several software and hardware aspects of program execution on large-scale, high-performance parallel processor systems. The issues covered are program restructuring, partitioning, scheduling and interprocessor communication, synchronization, and hardware design issues of specialized units. All this work was performed focusing on a single goal: to maximize program speedup, or equivalently, to minimize parallel execution time. Parafrase, a Fortran restructuring compiler, was used to transform programs into a parallel form and conduct experiments. Two new program restructuring techniques are presented, loop coalescing and subscript blocking. Compile-time and run-time scheduling schemes are covered extensively. Depending on the program construct, these algorithms generate optimal or near-optimal schedules. For the case of arbitrarily nested hybrid loops, two optimal scheduling algorithms for dynamic and static scheduling are presented. Simulation results are given for a new dynamic scheduling algorithm. The performance of this algorithm is compared to that of self-scheduling. Techniques for program partitioning and minimization of interprocessor communication for idealized program models and for real Fortran programs are also discussed. The close relationship between scheduling, interprocessor communication, and synchronization becomes apparent at several points in this work. Finally, the impact of various types of overhead on program speedup and experimental results are presented. 69 refs., 74 figs., 14 tabs.
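
    Loop coalescing, as the term is commonly understood, rewrites a nest of loops as a single loop whose index is decoded back into the original indices, so that one flat iteration space can be handed to a scheduler. The sketch below illustrates that general idea only; the abstract does not give the dissertation's exact formulation, and the function names are hypothetical.

```python
def nested(n, m):
    """Original form: a doubly nested loop over an n x m iteration space."""
    out = []
    for i in range(n):
        for j in range(m):
            out.append((i, j))
    return out


def coalesced(n, m):
    """Coalesced form: one loop of n*m iterations; (i, j) is recovered from k."""
    out = []
    for k in range(n * m):
        i, j = divmod(k, m)  # i = k // m, j = k % m
        out.append((i, j))
    return out
```

    Both functions visit the same index pairs in the same order, which is what makes the flat loop a drop-in replacement when the iterations are independent.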

  9. Optimized Parallel Execution of Declarative Programs on Distributed Memory Multiprocessors

    沈美明; 田新民; et al.

    1993-01-01

    In this paper, we focus on the compiling implementation of the parallel logic language PARLOG and the functional language ML on distributed memory multiprocessors. Under the graph rewriting framework, a Heterogeneous Parallel Graph Rewriting Execution Model (HPGREM) is first presented. Then, based on HPGREM, a parallel abstract machine PAM/TGR is described. Furthermore, several optimizing compilation schemes for executing declarative programs on a transputer array are proposed. The performance statistics on the transputer array demonstrate the effectiveness of our model, parallel abstract machine, optimizing compilation strategies and compiler.

  10. Methodological proposal for the volumetric study of archaeological ceramics through 3D edition free-software programs: the case of the celtiberians cemeteries of the meseta

    Álvaro Sánchez Climent

    2014-10-01

    Free-software programs have become ideal tools for archaeological research, reaching the same level as commercial programs. For that reason, the 3D modeling tool Blender has gained great popularity in recent years, offering features similar to those of commercial 3D editing programs such as 3D Studio Max or AutoCAD. Recently, the script necessary for volumetric calculations on three-dimensional objects has been developed, offering great possibilities for calculating the volume of archaeological ceramics. In this paper, we present a methodological approach to volumetric studies with Blender and a case study of funerary urns from several Celtiberian cemeteries of the Spanish Meseta. The goal is to demonstrate the great potential that free-software 3D editing tools currently have for volumetric studies.

  11. The parallel programming of voluntary and reflexive saccades.

    Walker, Robin; McSorley, Eugene

    2006-06-01

    A novel two-step paradigm was used to investigate the parallel programming of consecutive, stimulus-elicited ('reflexive') and endogenous ('voluntary') saccades. The mean latency of voluntary saccades, made following the first reflexive saccades in two-step conditions, was significantly reduced compared to that of voluntary saccades made in the single-step control trials. The latency of the first reflexive saccades was modulated by the requirement to make a second saccade: first saccade latency increased when a second voluntary saccade was required in the opposite direction to the first saccade, and decreased when a second saccade was required in the same direction as the first reflexive saccade. A second experiment confirmed the basic effect and also showed that a second reflexive saccade may be programmed in parallel with a first voluntary saccade. The results support the view that voluntary and reflexive saccades can be programmed in parallel on a common motor map.

  12. Protocol-Based Verification of Message-Passing Parallel Programs

    López-Acosta, Hugo-Andrés; Marques, Eduardo R. B.; Martins, Francisco

    2015-01-01

    We present ParTypes, a type-based methodology for the verification of Message Passing Interface (MPI) programs written in the C programming language. The aim is to statically verify programs against protocol specifications, enforcing properties such as fidelity and absence of deadlocks. We develop...... a protocol language based on a dependent type system for message-passing parallel programs, which includes various communication operators, such as point-to-point messages, broadcast, reduce, array scatter and gather. For the verification of a program against a given protocol, the protocol is first...

  13. The MHOST finite element program: 3-D inelastic analysis methods for hot section components. Volume 1: Theoretical manual

    Nakazawa, Shohei

    1991-01-01

    Formulations and algorithms implemented in the MHOST finite element program are discussed. The code uses a novel concept of the mixed iterative solution technique for the efficient 3-D computations of turbine engine hot section components. The general framework of the variational formulation and solution algorithms, derived from the mixed three-field Hu-Washizu principle, is discussed. This formulation enables the use of nodal interpolation for coordinates, displacements, strains, and stresses. The algorithmic description of the mixed iterative method includes variations for the quasi-static, transient dynamic and buckling analyses. The global-local analysis procedure referred to as the subelement refinement is developed in the framework of the mixed iterative solution, and its details are presented. The numerically integrated isoparametric elements implemented in the framework are discussed. Methods to filter certain parts of strain and project the element discontinuous quantities to the nodes are developed for a family of linear elements. Integration algorithms are described for linear and nonlinear equations included in the MHOST program.

  14. 3D Printing an Octohedron

    Aboufadel, Edward F.

    2014-01-01

    The purpose of this short paper is to describe a project to manufacture a regular octahedron on a 3D printer. We assume that the reader is familiar with the basics of 3D printing. In the project, we use fundamental ideas to calculate the vertices and faces of an octahedron. Then, we utilize the OPENSCAD program to create a virtual 3D model and an STereoLithography (.stl) file that can be used by a 3D printer.
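
    One simple way to obtain the vertices and faces of a regular octahedron is to place the six vertices on the coordinate axes and form one triangle per octant. The sketch below follows that construction; it is an illustration under that assumption and not necessarily the calculation used in the paper. Faces are wound counterclockwise as seen from outside, so a tool that expects the opposite winding would need each triple reversed.

```python
from itertools import product


def octahedron():
    """Vertices on the axes at distance 1; one triangular face per octant."""
    # Index layout: 0:+x, 1:-x, 2:+y, 3:-y, 4:+z, 5:-z
    vertices = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    faces = []
    for sx, sy, sz in product((1, -1), repeat=3):
        a = 0 if sx > 0 else 1
        b = 2 if sy > 0 else 3
        c = 4 if sz > 0 else 5
        # Swap two vertices in odd-sign octants to keep the normal pointing outward.
        faces.append((a, b, c) if sx * sy * sz > 0 else (a, c, b))
    return vertices, faces


def edges(faces):
    """Collect the undirected edges shared by the triangular faces."""
    found = set()
    for f in faces:
        for i in range(3):
            found.add(frozenset((f[i], f[(i + 1) % 3])))
    return found


vertices, faces = octahedron()
```

    With this layout every edge has squared length 2, and the counts satisfy Euler's formula V - E + F = 6 - 12 + 8 = 2.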

  15. Parallel Implementation of the PHOENIX Generalized Stellar Atmosphere Program; 2, Wavelength Parallelization

    Baron, E A; Hauschildt, Peter H.

    1997-01-01

    We describe an important addition to the parallel implementation of our generalized NLTE stellar atmosphere and radiative transfer computer program PHOENIX. In a previous paper in this series we described data and task parallel algorithms we have developed for radiative transfer, spectral line opacity, and NLTE opacity and rate calculations. These algorithms divide the work spatially or by spectral lines, that is, they distribute the radial zones, individual spectral lines, or characteristic rays among different processors, and employ, in addition, task parallelism for logically independent functions (such as atomic and molecular line opacities). For finite, monotonic velocity fields, the radiative transfer equation is an initial value problem in wavelength, and hence each wavelength point depends upon the previous one. However, for sophisticated NLTE models of both static and moving atmospheres needed to accurately describe, e.g., novae and supernovae, the number of wavelength points is very large (200,000--300,0...

  16. Center for Programming Models for Scalable Parallel Computing

    John Mellor-Crummey

    2008-02-29

    Rice University's achievements as part of the Center for Programming Models for Scalable Parallel Computing include: (1) design and implementation of cafc, the first multi-platform CAF compiler for distributed and shared-memory machines, (2) performance studies of the efficiency of programs written using the CAF and UPC programming models, (3) a novel technique to analyze explicitly-parallel SPMD programs that facilitates optimization, (4) design, implementation, and evaluation of new language features for CAF, including communication topologies, multi-version variables, and distributed multithreading to simplify development of high-performance codes in CAF, and (5) a synchronization strength reduction transformation for automatically replacing barrier-based synchronization with more efficient point-to-point synchronization. The prototype Co-array Fortran compiler cafc developed in this project is available as open source software from http://www.hipersoft.rice.edu/caf.

  17. GeoGebra 3D from the perspectives of elementary pre-service mathematics teachers who are familiar with a number of software programs

    Serdal Baltaci

    2015-01-01

    Each new version of the GeoGebra dynamic mathematics software brings updates and innovations. One of these is GeoGebra 5.0, which aims to facilitate 3D instruction by offering opportunities for students to analyze 3D objects. A review of previous studies of GeoGebra 3D shows that they mainly focus on the visualization of a problem in daily life and on evaluating the problem-solving process with various variables. Therefore, this study set out to reveal the opinions of pre-service elementary mathematics teachers who can use multiple software programs very well about the usability of GeoGebra 3D. Compared to other studies conducted in this field, this study is thought to add a new dimension to the literature on GeoGebra 3D because the participants had received training in using the Derive, Cabri, Cabri 3D, GeoGebra and GeoGebra 3D programs, had developed activities throughout their undergraduate programs, and in some cases were held responsible for those programs in their exams. In this research, we used the case study method. The participants consisted of five elementary pre-service mathematics teachers who were enrolled in fourth year courses. We employed semi-structured interviews to collect data. It is concluded that pre-service elementary mathematics teachers expressed many opinions about the positive contribution of the GeoGebra 3D dynamic mathematics software.

  18. MELD: A Logical Approach to Distributed and Parallel Programming

    2012-03-01

    USA: ACM, 1974, pp. 249–264. [14] M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly, “Dryad: Distributed data-parallel programs from sequential...TR-2006-140. [Online]. Available: http://budiu.info/work/eurosys07.pdf [15] Y. Yu, M. Isard, D. Fetterly, M. Budiu, Ú. Erlingsson, P. K. Gunda

  19. Simulation in 3 dimensions of a cycle 18 months for an BWR type reactor using the Nod3D program; Simulacion en 3 dimensiones de un ciclo de 18 meses para un reactor BWR usando el programa Nod3D

    Hernandez, N.; Alonso, G. [ININ, A.P. 18-1027, 11801 Mexico D.F. (Mexico)]. E-mail: nhm@nuclear.inin.mx; Valle, E. del [IPN, ESFM, 07738 Mexico D.F. (Mexico)]

    2004-07-01

    Developing in-house codes that allow three-dimensional simulation of a reactor core and are easy to maintain, without the cost of expensive licenses, can foster technological independence. In the Department of Nuclear Engineering (DIN) of the Superior School of Physics and Mathematics (ESFM) of the National Polytechnic Institute (IPN), a program called Nod3D has been developed that simulates the operation of a BWR in three dimensions, calculating the effective multiplication factor (k_eff) as well as the neutron flux distribution and the axial and radial power profiles in a medium of known characteristics, by numerically solving the steady-state neutron diffusion equations in XYZ geometry using the RTN0 (Raviart-Thomas-Nedelec of index zero) nodal method. One limitation of Nod3D is that it cannot treat fuel burnup independently with feedback; it does so implicitly, using the cross sections at each burnup step, which are obtained from the code Core Master LEND. Even with this limitation, the results obtained in simulating a typical operating cycle of a BWR are similar to those reported by Core Master LEND. The k_eff values obtained with Nod3D were compared with those of Core Master LEND, showing a difference smaller than 0.2% (200 pcm); for the axial power profile, the maximum difference was 2.5%. (Author)

  20. A Large-Grain Parallel Programming Environment for Non-Programmers

    Lewis, Ted

    1994-01-01

    1994 International Conference on Parallel Processing. Banger is a parallel programming environment used by non-professional programmers to write explicitly parallel large-grain parallel programs. The goals of Banger are: 1. extreme ease of use, 2. immediate feedback, and 3. machine-independence. Banger is based on three principles: 1. separation of parallel programming-in-the-large from sequential programming-in-the-small, 2. separation of programming environment from target machine ...

  1. GRID2D/3D: A computer program for generating grid systems in complex-shaped two- and three-dimensional spatial domains. Part 1: Theory and method

    Shih, T. I.-P.; Bailey, R. T.; Nguyen, H. L.; Roelke, R. J.

    1990-01-01

    An efficient computer program, called GRID2D/3D, was developed to generate single and composite grid systems within geometrically complex two- and three-dimensional (2- and 3-D) spatial domains that can deform with time. GRID2D/3D generates single grid systems by using algebraic grid generation methods based on transfinite interpolation in which the distribution of grid points within the spatial domain is controlled by stretching functions. All single grid systems generated by GRID2D/3D can have grid lines that are continuous and differentiable everywhere up to the second-order. Also, grid lines can intersect boundaries of the spatial domain orthogonally. GRID2D/3D generates composite grid systems by patching together two or more single grid systems. The patching can be discontinuous or continuous. For continuous composite grid systems, the grid lines are continuous and differentiable everywhere up to the second-order except at interfaces where different single grid systems meet. At interfaces where different single grid systems meet, the grid lines are only differentiable up to the first-order. For 2-D spatial domains, the boundary curves are described by using either cubic or tension spline interpolation. For 3-D spatial domains, the boundary surfaces are described by using either linear Coons interpolation, bi-hyperbolic spline interpolation, or a new technique referred to as 3-D bi-directional Hermite interpolation. Since grid systems generated by algebraic methods can have grid lines that overlap one another, GRID2D/3D contains a graphics package for evaluating the grid systems generated. With the graphics package, the user can generate grid systems in an interactive manner with the grid generation part of GRID2D/3D. GRID2D/3D is written in FORTRAN 77 and can be run on any IBM PC, XT, or AT compatible computer. In order to use GRID2D/3D on workstations or mainframe computers, some minor modifications must be made in the graphics part of the program; no
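
    To make the idea of algebraic grid generation concrete, the sketch below implements the standard linear transfinite interpolation formula for a 2-D domain bounded by four curves. It is a generic textbook form written in Python, not code from GRID2D/3D (which is FORTRAN 77); the function names are hypothetical, and the stretching functions mentioned in the abstract are omitted, so grid points are spaced uniformly in the parameters.

```python
def tfi_point(bottom, top, left, right, xi, eta):
    """Linear transfinite interpolation at parameters (xi, eta) in [0, 1]^2.

    Each boundary is a function of one parameter returning an (x, y) pair.
    The corner term removes the part counted twice by the two blends.
    """
    p00, p10 = bottom(0.0), bottom(1.0)
    p01, p11 = top(0.0), top(1.0)
    b, t, l, r = bottom(xi), top(xi), left(eta), right(eta)
    return tuple(
        (1 - eta) * b[k] + eta * t[k] + (1 - xi) * l[k] + xi * r[k]
        - ((1 - xi) * (1 - eta) * p00[k] + xi * (1 - eta) * p10[k]
           + (1 - xi) * eta * p01[k] + xi * eta * p11[k])
        for k in range(2)
    )


def tfi_grid(bottom, top, left, right, ni, nj):
    """Grid of ni x nj points (ni, nj >= 2), uniformly spaced in xi and eta."""
    return [
        [tfi_point(bottom, top, left, right, i / (ni - 1), j / (nj - 1))
         for i in range(ni)]
        for j in range(nj)
    ]
```

    For a unit square the interpolation returns the parameters themselves, and for a curved boundary the first grid row reproduces that boundary curve, provided the four curves meet at the corners.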

  2. Modelling parallel programs and multiprocessor architectures with AXE

    Yan, Jerry C.; Fineman, Charles E.

    1991-01-01

    AXE, An Experimental Environment for Parallel Systems, was designed to model and simulate parallel systems at the process level. It provides an integrated environment for specifying computation models, multiprocessor architectures, data collection, and performance visualization. AXE is being used at NASA-Ames for developing resource management strategies, parallel problem formulation, multiprocessor architectures, and operating system issues related to the High Performance Computing and Communications Program. AXE's simple, structured user-interface enables the user to model parallel programs and machines precisely and efficiently. Its quick turn-around time keeps the user interested and productive. AXE models multicomputers. The user may easily modify various architectural parameters including the number of sites, connection topologies, and overhead for operating system activities. Parallel computations in AXE are represented as collections of autonomous computing objects known as players. Their use and behavior are described. Performance data of the multiprocessor model can be observed on a color screen. These include CPU and message routing bottlenecks, and the dynamic status of the software.

  3. Educational program for demonstrating 3D object texturing methods; Vyukový program pro demonstraci metod texturování 3D objektů

    Kukla, Michal

    2008-01-01

    The thesis deals with the development of an application for demonstrating methods of texturing 3D objects. It contains a theoretical analysis of the individual texturing methods, which form the basis for designing the content of the demonstration. The requirements for the application are analyzed and applied to create the application design. The procedures for implementing the individual texturing methods and a description of the application's controls are given at the end of the thesis. The program can serve as a demonstration tool in teaching or also as study material...

  4. Advanced parallel programming models research and development opportunities.

    Wen, Zhaofang; Brightwell, Ronald Brian

    2004-07-01

    There is currently a large research and development effort within the high-performance computing community on advanced parallel programming models. This research can potentially have an impact on parallel applications, system software, and computing architectures in the next several years. Given Sandia's expertise and unique perspective in these areas, particularly on very large-scale systems, there are many areas in which Sandia can contribute to this effort. This technical report provides a survey of past and present parallel programming model research projects and provides a detailed description of the Partitioned Global Address Space (PGAS) programming model. The PGAS model may offer several improvements over the traditional distributed memory message passing model, which is the dominant model currently being used at Sandia. This technical report discusses these potential benefits and outlines specific areas where Sandia's expertise could contribute to current research activities. In particular, we describe several projects in the areas of high-performance networking, operating systems and parallel runtime systems, compilers, application development, and performance evaluation.

  5. Heterogeneous Multicore Parallel Programming for Graphics Processing Units

    Francois Bodin

    2009-01-01

    Hybrid parallel multicore architectures based on graphics processing units (GPUs) can provide tremendous computing power. Current NVIDIA and AMD Graphics Product Group hardware displays a peak performance of hundreds of gigaflops. However, exploiting GPUs from existing applications is a difficult task that requires non-portable rewriting of the code. In this paper, we present HMPP, a Heterogeneous Multicore Parallel Programming workbench with compilers, developed by CAPS entreprise, that allows the integration of heterogeneous hardware accelerators in an unintrusive manner while preserving the legacy code.

  6. Feedback Driven Annotation and Refactoring of Parallel Programs

    Larsen, Per

    This thesis combines programmer knowledge and feedback to improve modeling and optimization of software. The research is motivated by two observations. First, there is a great need for automatic analysis of software for embedded systems - to expose and model parallelism inherent in programs. Second......, some program properties are beyond reach of such analysis for theoretical and practical reasons - but can be described by programmers. Three aspects are explored. The first is annotation of the source code. Two annotations are introduced. These allow more accurate modeling of parallelism...... are not effective unless programmers are told how and when they are beneficial. A prototype compilation feedback system was developed in collaboration with IBM Haifa Research Labs. It reports issues that prevent further analysis to the programmer. Performance evaluation shows that three programs perform significantly...

  7. Sisal 3.2: functional language for scientific parallel programming

    Kasyanov, Victor

    2013-05-01

    Sisal 3.2 is a new input language of the system of functional programming (SFP), which is under development at the Institute of Informatics Systems in Novosibirsk as an interactive visual environment for supporting scientific parallel programming. This paper contains an overview of Sisal 3.2 and a description of its new features compared with previous versions of the SFP input language, such as multidimensional array support, new abstractions like parametric types and generalised procedures, more flexible user-defined reductions, improved interoperability with other programming languages, and the specification of several optimising source text annotations.

  8. CaKernel – A Parallel Application Programming Framework for Heterogenous Computing Architectures

    Marek Blazewicz

    2011-01-01

    With the recent advent of new heterogeneous computing architectures, there is still a lack of parallel problem solving environments that can help scientists use hybrid supercomputers easily and efficiently. Many scientific simulations that use structured grids to solve partial differential equations in fact rely on stencil computations. Stencil computations have become crucial in solving many challenging problems in various domains, e.g., engineering or physics. Although many parallel stencil computing approaches have been proposed, in most cases they solve only particular problems. As a result, scientists struggle when it comes to implementing a new stencil-based simulation, especially on high performance hybrid supercomputers. In response to this need, we extend our previous work on a parallel programming framework for CUDA – CaCUDA that now supports OpenCL. We present CaKernel – a tool that simplifies the development of parallel scientific applications on hybrid systems. CaKernel is built on the highly scalable and portable Cactus framework. In the CaKernel framework, Cactus manages the inter-process communication via MPI while CaKernel manages the code running on Graphics Processing Units (GPUs) and interactions between them. As a non-trivial test case we have developed a 3D CFD code to demonstrate the performance and scalability of the automatically generated code.
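
    For readers unfamiliar with the term, a stencil computation updates each grid point from a fixed pattern of neighbouring points. The sketch below shows one Jacobi sweep with a 5-point stencil on a 2-D structured grid in plain Python; it illustrates the pattern only and does not use or imitate the CaKernel or Cactus APIs. The function name `jacobi_step` is hypothetical.

```python
def jacobi_step(u):
    """One Jacobi sweep of a 5-point stencil; boundary values are left unchanged.

    Every interior update reads only the previous grid, so all interior points
    are independent and could be computed in parallel (for example, one GPU
    thread per point).
    """
    n, m = len(u), len(u[0])
    new = [row[:] for row in u]
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])
    return new
```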

  9. Performance Evaluation Methodologies and Tools for Massively Parallel Programs

    Yan, Jerry C.; Sarukkai, Sekhar; Tucker, Deanne (Technical Monitor)

    1994-01-01

    The need for computing power has forced a migration from serial computation on a single processor to parallel processing on multiprocessors. However, without effective means to monitor (and analyze) program execution, tuning the performance of parallel programs becomes exponentially difficult as program complexity and machine size increase. The recent introduction of performance tuning tools from various supercomputer vendors (Intel's ParAide, TMC's PRISM, CSI'S Apprentice, and Convex's CXtrace) seems to indicate the maturity of performance tool technologies and vendors'/customers' recognition of their importance. However, a few important questions remain: What kind of performance bottlenecks can these tools detect (or correct)? How time consuming is the performance tuning process? What are some important technical issues that remain to be tackled in this area? This workshop reviews the fundamental concepts involved in analyzing and improving the performance of parallel and heterogeneous message-passing programs. Several alternative strategies will be contrasted, and for each we will describe how currently available tuning tools (e.g., AIMS, ParAide, PRISM, Apprentice, CXtrace, ATExpert, Pablo, IPS-2) can be used to facilitate the process. We will characterize the effectiveness of the tools and methodologies based on actual user experiences at NASA Ames Research Center. Finally, we will discuss their limitations and outline recent approaches taken by vendors and the research community to address them.

  10. A new computer program for topological, visual analysis of 3D particle configurations based on visual representation of radial distribution function peaks as bonds

    Metere, Alfredo; Dzugutov, Mikhail

    2015-01-01

    We present a new program able to perform unique visual analysis on generic particle systems: PASYVAT (PArticle SYstem Visual Analysis Tool). More specifically, it can perform a selection of multiple interparticle distance ranges from a radial distribution function (RDF) plot and display them in 3D as bonds. This software can be used with any data set representing a system of particles in 3D. In this manuscript the reader will find a description of the program and its internal structure, with emphasis on its applicability in the study of certain particle configurations, obtained from classical molecular dynamics simulation in condensed matter physics.
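
    The core selection step described above, keeping particle pairs whose separation falls inside chosen distance ranges, can be sketched in a few lines. This is an illustration of the idea rather than PASYVAT's implementation: the function name `bonds_in_ranges` is hypothetical, periodic boundary conditions are ignored, and the all-pairs loop is only practical for small systems.

```python
from itertools import combinations
from math import dist


def bonds_in_ranges(positions, ranges):
    """Return index pairs (i, j) whose distance lies in any (lo, hi) range.

    The ranges would typically be picked around peaks of the radial
    distribution function.
    """
    bonds = []
    for (i, p), (j, q) in combinations(enumerate(positions), 2):
        d = dist(p, q)
        if any(lo <= d <= hi for lo, hi in ranges):
            bonds.append((i, j))
    return bonds
```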

  11. Programming N-Cubes with a Graphical Parallel Programming Environment Versus an Extended Sequential Language.

    1986-11-01

    parallel programming environment and language Poker. Our example programs, an implementation of a Cholesky algorithm for a banded matrix, were written in both languages and compiled into object codes that ran on the Cosmic Cube. However, the program written in Poker is shorter, faster and easier to write, easier to debug, and portable without changes to other parallel computer architectures. The Poker program was slower than the program written directly in Cosmic Cube C; however, the experiments provided insights into changes that make Poker programs nearly as fast.

  12. Scientific programming on massively parallel processor CP-PACS

    Boku, Taisuke [Tsukuba Univ., Ibaraki (Japan). Inst. of Information Sciences and Electronics

    1998-03-01

    The massively parallel processor CP-PACS targets a range of problems in computational physics, and its architecture was designed to support various kinds of numerical processing. This report outlines the CP-PACS and gives a programming example based on the Kernel CG benchmark of the NAS Parallel Benchmarks (NPB), version 1. It describes two major features of the CP-PACS architecture: the pseudo vector processing mechanism, and the tuning of parallel scientific and technical computations that use the three-dimensional hyper crossbar network. The PUs of the CP-PACS are based on a RISC processor extended with a pseudo vector processor. Pseudo vector processing is realized as loop processing with scalar instructions. The features of the network connecting the PUs are explained. The algorithm of the NPB version 1 Kernel CG is shown. The most time-consuming part of the main loop is the matrix-vector product (matvec), and its parallel processing is explained. The CPU computation time is determined. For the performance evaluation, the report covers execution time, short vector processing on the pseudo vector processor based on the slide window, and a comparison with other parallel computers. (K.I.)

  13. 3D photoacoustic imaging

    Carson, Jeffrey J. L.; Roumeliotis, Michael; Chaudhary, Govind; Stodilka, Robert Z.; Anastasio, Mark A.

    2010-06-01

    Our group has concentrated on development of a 3D photoacoustic imaging system for biomedical imaging research. The technology employs a sparse parallel detection scheme and specialized reconstruction software to obtain 3D optical images using a single laser pulse. With the technology we have been able to capture 3D movies of translating point targets and rotating line targets. The current limitation of our 3D photoacoustic imaging approach is its inability to reconstruct complex objects in the field of view. This is primarily due to the relatively small number of projections used to reconstruct objects. However, in many photoacoustic imaging situations, only a few objects may be present in the field of view and these objects may have very high contrast compared to background. That is, the objects have sparse properties. Therefore, our work had two objectives: (i) to utilize mathematical tools to evaluate 3D photoacoustic imaging performance, and (ii) to test image reconstruction algorithms that prefer sparseness in the reconstructed images. Our approach was to utilize singular value decomposition techniques to study the imaging operator of the system and evaluate the complexity of objects that could potentially be reconstructed. We also compared the performance of two image reconstruction algorithms (algebraic reconstruction and l1-norm techniques) at reconstructing objects of increasing sparseness. We observed that for a 15-element detection scheme, the number of measurable singular vectors representative of the imaging operator was consistent with the demonstrated ability to reconstruct point and line targets in the field of view. We also observed that the l1-norm reconstruction technique, which is known to prefer sparseness in reconstructed images, was superior to the algebraic reconstruction technique. 
Based on these findings, we concluded (i) that singular value decomposition of the imaging operator provides valuable insight into the capabilities of

  14. Parallelization methods in 3D fully electromagnetic code NEPTUNE

    陈军; 莫则尧; 董烨; 杨温渊; 董志伟

    2011-01-01

    NEPTUNE is a three-dimensional, fully parallel electromagnetic code for solving electromagnetic problems in high power microwave (HPM) devices with complex geometry. This paper introduces three parallelization methods used in the code. For massive computations, a two-level "block-patch" parallel domain decomposition strategy scales the computation to thousands of processor cores. Based on the geometry information, the structured mesh is generated in parallel with an adaptive technique that discards invalid grid cells, which sharply reduces the storage requirement and the parallel execution time. Building on the classical Boris and successive over-relaxation (SOR) iteration methods, a parallel Poisson solver for irregular domains is provided using red-black ordering and geometry constraints. With these methods, NEPTUNE achieves 51.8% parallel efficiency on 1024 cores when simulating MILO devices.
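
    The red-black SOR idea mentioned in this abstract can be illustrated with a minimal sketch. This is not NEPTUNE's solver: the function name `red_black_sor`, the 2D grid, the boolean `mask` marking valid cells, and the relaxation factor are all assumptions made for illustration. Cells of one color depend only on cells of the other color, so each half-sweep could be updated in parallel.

```python
def red_black_sor(f, mask, h=1.0, omega=1.5, sweeps=200):
    """Hypothetical sketch: solve -laplace(u) = f on a 2D grid.

    mask[j][i] is True for valid cells; masked-out cells and the outer
    border are held at u = 0 (a stand-in for the geometry constraints).
    """
    ny, nx = len(f), len(f[0])
    u = [[0.0] * nx for _ in range(ny)]
    for _ in range(sweeps):
        for color in (0, 1):  # 0 = "red" cells, 1 = "black" cells
            for j in range(1, ny - 1):
                for i in range(1, nx - 1):
                    if (i + j) % 2 != color or not mask[j][i]:
                        continue
                    # Gauss-Seidel value from the four neighbours
                    gs = 0.25 * (u[j][i - 1] + u[j][i + 1]
                                 + u[j - 1][i] + u[j + 1][i]
                                 + h * h * f[j][i])
                    # Over-relaxed update
                    u[j][i] += omega * (gs - u[j][i])
    return u
```

    In a distributed setting, each half-sweep would typically be followed by an exchange of boundary values between neighbouring subdomains; that step is omitted here.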

  15. Automatic array alignment in data-parallel programs

    Chatterjee, Siddhartha; Gilbert, John R.; Schreiber, Robert; Teng, Shang-Hua

    1993-01-01

    FORTRAN 90 and other data-parallel languages express parallelism in the form of operations on data aggregates such as arrays. Misalignment of the operands of an array operation can reduce program performance on a distributed-memory parallel machine by requiring nonlocal data accesses. Determining array alignments that reduce communication is therefore a key issue in compiling such languages. We present a framework for the automatic determination of array alignments in array-based, data-parallel languages. Our language model handles array sectioning, reductions, spreads, transpositions, and masked operations. We decompose alignment functions into three constituents: axis, stride, and offset. For each of these subproblems, we show how to solve the alignment problem for a basic block of code, possibly containing common subexpressions. Alignments are generated for all array objects in the code, both named program variables and intermediate results. We assign computation to processors by virtue of explicit alignment of all temporaries; the resulting work assignment is in general better than that provided by the 'owner-computes' rule. Finally, we present some ideas for dealing with control flow, replication, and dynamic alignments that depend on loop induction variables.

  16. On the utility of threads for data parallel programming

    Fahringer, Thomas; Haines, Matthew; Mehrotra, Piyush

    1995-01-01

    Threads provide a useful programming model for asynchronous behavior because of their ability to encapsulate units of work that can then be scheduled for execution at runtime, based on the dynamic state of a system. Recently, the threaded model has been applied to the domain of data parallel scientific codes, and initial reports indicate that the threaded model can produce performance gains over non-threaded approaches, primarily through the use of overlapping useful computation with communication latency. However, overlapping computation with communication is possible without the benefit of threads if the communication system supports asynchronous primitives, and this comparison has not been made in previous papers. This paper provides a critical look at the utility of lightweight threads as applied to data parallel scientific programming.

  17. Final Report: Center for Programming Models for Scalable Parallel Computing

    Mellor-Crummey, John [William Marsh Rice University]

    2011-09-13

    As part of the Center for Programming Models for Scalable Parallel Computing, Rice University collaborated with project partners in the design, development and deployment of language, compiler, and runtime support for parallel programming models to support application development for the “leadership-class” computer systems at DOE national laboratories. Work over the course of this project has focused on the design, implementation, and evaluation of a second-generation version of Coarray Fortran. Research and development efforts of the project have focused on the CAF 2.0 language, compiler, runtime system, and supporting infrastructure. This has involved working with the teams that provide infrastructure for CAF that we rely on, implementing new language and runtime features, producing an open source compiler that enabled us to evaluate our ideas, and evaluating our design and implementation through the use of benchmarks. The report details the research, development, findings, and conclusions from this work.

  18. Fast 3D coronary artery contrast-enhanced magnetic resonance angiography with magnetization transfer contrast, fat suppression and parallel imaging as applied on an anthropomorphic moving heart phantom.

    Irwan, Roy; Rüssel, Iris K; Sijens, Paul E

    2006-09-01

    A magnetic resonance sequence for high-resolution imaging of coronary arteries in a very short acquisition time is presented. The technique is based on fast low-angle shot and uses fat saturation and magnetization transfer contrast prepulses to improve image contrast. GeneRalized Autocalibrating Partially Parallel Acquisitions (GRAPPA) is implemented to shorten acquisition time. The sequence was tested on a moving anthropomorphic silicone heart phantom where the coronary arteries were filled with a gadolinium contrast agent solution, and imaging was performed at varying heart rates using GRAPPA. The clinical relevance of the phantom was validated by comparing the myocardial relaxation times of the phantom's homogeneous silicone cardiac wall to those of humans. Signal-to-noise ratio and contrast-to-noise ratio were higher when parallel imaging was used, possibly benefiting from the acquisition of one partition per heartbeat. Another advantage of parallel imaging for visualizing the coronary arteries is that the entire heart can be imaged within a few breath-holds.

  19. A New Approach to Improve Cognition, Muscle Strength, and Postural Balance in Community-Dwelling Elderly with a 3-D Virtual Reality Kayak Program.

    Park, Junhyuck; Yim, JongEun

    2016-01-01

    Aging is usually accompanied with deterioration of physical abilities, such as muscular strength, sensory sensitivity, and functional capacity. Recently, intervention methods with virtual reality have been introduced, providing an enjoyable therapy for elderly. The aim of this study was to investigate whether a 3-D virtual reality kayak program could improve the cognitive function, muscle strength, and balance of community-dwelling elderly. Importantly, kayaking involves most of the upper body musculature and needs the balance control. Seventy-two participants were randomly allocated into the kayak program group (n = 36) and the control group (n = 36). The two groups were well matched with respect to general characteristics at baseline. The participants in both groups performed a conventional exercise program for 30 min, and then the 3-D virtual reality kayak program was performed in the kayak program group for 20 min, two times a week for 6 weeks. Cognitive function was measured using the Montreal Cognitive Assessment. Muscle strength was measured using the arm curl and handgrip strength tests. Standing and sitting balance was measured using the Good Balance system. The post-test was performed in the same manner as the pre-test; the overall outcomes such as cognitive function (p kayak program group compared to the control group. We propose that the 3-D virtual reality kayak program is a promising intervention method for improving the cognitive function, muscle strength, and balance of elderly.

  20. 3D Elevation Program—Virtual USA in 3D

    Lukas, Vicki; Stoker, J.M.

    2016-04-14

    The U.S. Geological Survey (USGS) 3D Elevation Program (3DEP) uses a laser system called ‘lidar’ (light detection and ranging) to create a virtual reality map of the Nation that is very accurate. 3D maps have many uses with new uses being discovered all the time.  

  1. MLP: A Parallel Programming Alternative to MPI for New Shared Memory Parallel Systems

    Taft, James R.

    1999-01-01

    Recent developments at the NASA AMES Research Center's NAS Division have demonstrated that the new generation of NUMA based Symmetric Multi-Processing systems (SMPs), such as the Silicon Graphics Origin 2000, can successfully execute legacy vector oriented CFD production codes at sustained rates far exceeding processing rates possible on dedicated 16 CPU Cray C90 systems. This high level of performance is achieved via shared memory based Multi-Level Parallelism (MLP). This programming approach, developed at NAS and outlined below, is distinct from the message passing paradigm of MPI. It offers parallelism at both the fine and coarse grained level, with communication latencies that are approximately 50-100 times lower than typical MPI implementations on the same platform. Such latency reductions offer the promise of performance scaling to very large CPU counts. The method draws on, but is also distinct from, the newly defined OpenMP specification, which uses compiler directives to support a limited subset of multi-level parallel operations. The NAS MLP method is general, and applicable to a large class of NASA CFD codes.

  2. An informal introduction to program transformation and parallel processors

    Hopkins, K.W. [Southwest Baptist Univ., Bolivar, MO (United States)]

    1994-08-01

    In the summer of 1992, I had the opportunity to participate in a Faculty Research Program at Argonne National Laboratory. I worked under Dr. Jim Boyle on a project transforming code written in pure functional Lisp to Fortran code to run on distributed-memory parallel processors. To perform this project, I had to learn three things: the transformation system, the basics of distributed-memory parallel machines, and the Lisp programming language. Each of these topics in computer science was unfamiliar to me as a mathematician, but I found that they (especially parallel processing) are greatly impacting many fields of mathematics and science. Since most mathematicians have some exposure to computers, but certainly are not computer scientists, I felt it was appropriate to write a paper summarizing my introduction to these areas and how they can fit together. This paper is not meant to be a full explanation of the topics, but an informal introduction for the "mathematical layman." I place myself in that category as well, since my previous use of computers was as a classroom demonstration tool.

  3. Advanced Programming Platform for efficient use of Data Parallel Hardware

    Cabellos, Luis

    2012-01-01

    Graphics processing units (GPUs) have evolved from specialized hardware for rendering high-quality graphics in games into commodity hardware for efficiently processing blocks of data in parallel. This evolution is particularly interesting for scientific groups, which have traditionally used the CPU as their workhorse and can now profit from the arrival of GPU hardware in HPC clusters. This new GPU hardware promises a boost in peak performance, but it is not trivial to use. This article presents a programming platform designed to promote direct use of this specialized hardware. The platform includes a visual editor for parallel data flows and is oriented toward execution on distributed clusters with GPUs. Examples of its application to two characteristic problems, Fast Fourier Transform and image compression, are also shown.

  4. A parallel domain decomposition-based implicit method for the Cahn-Hilliard-Cook phase-field equation in 3D

    Zheng, Xiang

    2015-03-01

    We present a numerical algorithm for simulating the spinodal decomposition described by the three dimensional Cahn-Hilliard-Cook (CHC) equation, which is a fourth-order stochastic partial differential equation with a noise term. The equation is discretized in space and time based on a fully implicit, cell-centered finite difference scheme, with an adaptive time-stepping strategy designed to accelerate the progress to equilibrium. At each time step, a parallel Newton-Krylov-Schwarz algorithm is used to solve the nonlinear system. We discuss various numerical and computational challenges associated with the method. The numerical scheme is validated by a comparison with an explicit scheme of high accuracy (and unreasonably high cost). We present steady state solutions of the CHC equation in two and three dimensions. The effect of the thermal fluctuation on the spinodal decomposition process is studied. We show that the existence of the thermal fluctuation accelerates the spinodal decomposition process and that the final steady morphology is sensitive to the stochastic noise. We also show the evolution of the energies and statistical moments. In terms of the parallel performance, it is found that the implicit domain decomposition approach scales well on supercomputers with a large number of processors. © 2015 Elsevier Inc.

  5. NavP: Structured and Multithreaded Distributed Parallel Programming

    Pan, Lei

    2007-01-01

    We present Navigational Programming (NavP) -- a distributed parallel programming methodology based on the principles of migrating computations and multithreading. The four major steps of NavP are: (1) Distribute the data using the data communication pattern in a given algorithm; (2) Insert navigational commands for the computation to migrate and follow large-sized distributed data; (3) Cut the sequential migrating thread and construct a mobile pipeline; and (4) Loop back for refinement. NavP is significantly different from the current prevailing Message Passing (MP) approach. The advantages of NavP include: (1) NavP is structured distributed programming and it does not change the code structure of an original algorithm. This is in sharp contrast to MP, as MP implementations in general do not resemble the original sequential code; (2) NavP implementations are always competitive with the best MPI implementations in terms of performance. In contrast, approaches such as DSM or HPF have so far failed to deliver satisfactory performance, even though they are relatively easy to use compared to MP; (3) NavP provides incremental parallelization, which is beyond the reach of MP; and (4) NavP is a unifying approach that allows us to exploit both fine- (multithreading on shared memory) and coarse- (pipelined tasks on distributed memory) grained parallelism. This is in contrast to the currently popular hybrid use of MP+OpenMP, which is known to be complex to use. We present experimental results that demonstrate the effectiveness of NavP.

  6. PLOT3D user's manual

    Walatka, Pamela P.; Buning, Pieter G.; Pierce, Larry; Elson, Patricia A.

    1990-01-01

    PLOT3D is a computer graphics program designed to visualize the grids and solutions of computational fluid dynamics. Seventy-four functions are available. Versions are available for many systems. PLOT3D can handle multiple grids with a million or more grid points, and can produce varieties of model renderings, such as wireframe or flat shaded. Output from PLOT3D can be used in animation programs. The first part of this manual is a tutorial that takes the reader, keystroke by keystroke, through a PLOT3D session. The second part of the manual contains reference chapters, including the helpfile, data file formats, advice on changing PLOT3D, and sample command files.

  7. Parallelization and checkpointing of GPU applications through program transformation

    Solano-Quinde, Lizandro Damian [Iowa State Univ., Ames, IA (United States)]

    2012-01-01

    GPUs have emerged as a powerful tool for accelerating general-purpose applications. The availability of programming languages that make writing general-purpose applications for GPUs tractable has consolidated GPUs as an alternative for accelerating general-purpose applications. Among the areas that have benefited from GPU acceleration are signal and image processing, computational fluid dynamics, quantum chemistry, and, in general, the High Performance Computing (HPC) industry. In order to continue to exploit higher levels of parallelism with GPUs, multi-GPU systems are gaining popularity. In this context, single-GPU applications are parallelized for running on multi-GPU systems. Furthermore, multi-GPU systems help to overcome the GPU memory limitation for applications with a large memory footprint. Parallelizing single-GPU applications has been approached with libraries that distribute the workload at runtime; however, they impose execution overhead and are not portable. On the other hand, on traditional CPU systems, parallelization has been approached through application transformation at pre-compile time, which enhances the application to distribute the workload at application level and does not have the issues of library-based approaches. Hence, a parallelization scheme for GPU systems based on application transformation is needed. As with any computing engine of today, reliability is also a concern in GPUs. GPUs are vulnerable to transient and permanent failures. Current checkpoint/restart techniques are not suitable for systems with GPUs. Checkpointing for GPU systems presents new and interesting challenges, primarily due to the natural differences imposed by the hardware design, the memory subsystem architecture, the massive number of threads, and the limited amount of synchronization among threads. Therefore, a checkpoint/restart technique suitable for GPU systems is needed. 
The goal of this work is to exploit higher levels of parallelism and

  8. 3D Animation Essentials

    Beane, Andy

    2012-01-01

    The essential fundamentals of 3D animation for aspiring 3D artists. 3D is everywhere: video games, movie and television special effects, mobile devices, etc. Many aspiring artists and animators have grown up with 3D and computers, and naturally gravitate to this field as their area of interest. Bringing a blend of studio and classroom experience to offer you thorough coverage of the 3D animation industry, this must-have book shows you what it takes to create compelling and realistic 3D imagery. Serves as the first step to understanding the language of 3D and computer graphics (CG). Covers 3D anim

  9. 3D video

    Lucas, Laurent; Loscos, Céline

    2013-01-01

    While 3D vision has existed for many years, the use of 3D cameras and video-based modeling by the film industry has induced an explosion of interest for 3D acquisition technology, 3D content and 3D displays. As such, 3D video has become one of the new technology trends of this century. The chapters in this book cover a large spectrum of areas connected to 3D video, which are presented both theoretically and technologically, while taking into account both physiological and perceptual aspects. Stepping away from traditional 3D vision, the authors, all currently involved in these areas, provide th

  10. A Parallel 3D Model for the Multi-Species Low Energy Beam Transport System of the RIA Prototype ECR Ion Source VENUS

    Qiang, Ji; Todd, Damon

    2005-01-01

    The driver linac of the proposed Rare Isotope Accelerator (RIA) requires a great variety of high intensity, high charge state ion beams. In order to design and optimize the low energy beam line optics of the RIA front end, we have developed a new parallel three-dimensional model to simulate the low energy, multi-species beam transport from the ECR ion source extraction region to the focal plane of the analyzing magnet. A multi-section overlapped computational domain has been used to break the original transport system into a number of independent subsystems. Within each subsystem, macro-particle tracking is used to obtain the charge density distribution in this subdomain. The three-dimensional Poisson equation is solved within the subdomain and particle tracking is repeated until the solution converges. Two new Poisson solvers based on a combination of the spectral method and the multigrid method have been developed to solve the Poisson equation in cylindrical coordinates for the beam extraction region and in...

  11. Parallel 3D finite element analysis and its application to hydraulic engineering

    刘耀儒; 周维垣; 杨强; 陈新

    2005-01-01

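    To meet the demand for large-scale numerical computation posed by large, complex three-dimensional structures in hydraulic engineering, a computational algorithm for the finite element EBE (element-by-element) method on distributed-memory parallel machines is established, based on the J-PCG method (Jacobi-preconditioned conjugate gradient method). The algorithm does not need to consider the mesh topology or the ordering of the elements, does not form the global stiffness matrix, and avoids domain decomposition of complex three-dimensional structures. Using this algorithm, a program for parallel three-dimensional finite element solution, PFEM (parallel finite element method), was written and implemented on a networked cluster system, and then applied to three-dimensional finite element computations of the Ertan arch dam-foundation system and the Shuibuya underground caverns. The numerical results show that the parallel three-dimensional finite element EBE method is well suited to large-scale numerical computation of complex three-dimensional structures in hydraulic engineering.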
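
    A minimal sketch of the J-PCG idea combined with an element-by-element (EBE) matrix-vector product may help make the abstract concrete. It is not the PFEM program: the function names, the `(dofs, ke)` element format, and the way fixed degrees of freedom are handled are assumptions for illustration. The point it shows is that the global stiffness matrix is never assembled; both the product and the Jacobi preconditioner are accumulated from element matrices.

```python
def ebe_matvec(elements, x, fixed):
    """y = K x accumulated element by element; K is never assembled."""
    y = [0.0] * len(x)
    for dofs, ke in elements:
        xe = [0.0 if d in fixed else x[d] for d in dofs]
        for a, da in enumerate(dofs):
            if da not in fixed:
                y[da] += sum(ke[a][b] * xe[b] for b in range(len(dofs)))
    return y

def ebe_diagonal(elements, n):
    """Diagonal of K, also accumulated from the element matrices."""
    diag = [0.0] * n
    for dofs, ke in elements:
        for a, da in enumerate(dofs):
            diag[da] += ke[a][a]
    return diag

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def jacobi_pcg(elements, b, fixed, tol=1e-10, max_iter=1000):
    """Jacobi-preconditioned CG; dofs in `fixed` are held at zero."""
    n = len(b)
    diag = ebe_diagonal(elements, n)
    x = [0.0] * n
    r = [0.0 if i in fixed else b[i] for i in range(n)]
    z = [r[i] / diag[i] for i in range(n)]
    p = z[:]
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        ap = ebe_matvec(elements, p, fixed)
        alpha = rz / dot(p, ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * ap[i] for i in range(n)]
        z = [r[i] / diag[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x
```

    As a usage example under the same assumptions, a chain of four unit-stiffness bar elements fixed at node 0 and loaded with a unit force at node 4 can be written as `elements = [((i, i + 1), [[1.0, -1.0], [-1.0, 1.0]]) for i in range(4)]` and solved with `jacobi_pcg(elements, [0, 0, 0, 0, 1.0], {0})`. On a distributed-memory machine the loop over elements would be split among processes, with a reduction for the dot products.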

  12. Advanced 3-D Ultrasound Imaging

    Rasmussen, Morten Fischer

    The main purpose of the PhD project was to develop methods that increase the 3-D ultrasound imaging quality available for the medical personnel in the clinic. Acquiring a 3-D volume gives the medical doctor the freedom to investigate the measured anatomy in any slice desirable after the scan has...... been completed. This allows for precise measurements of organs dimensions and makes the scan more operator independent. Real-time 3-D ultrasound imaging is still not as widespread in use in the clinics as 2-D imaging. A limiting factor has traditionally been the low image quality achievable using...... Field II simulations and measurements with the ultrasound research scanner SARUS and a 3.5MHz 1024 element 2-D transducer array. In all investigations, 3-D synthetic aperture imaging achieved a smaller main-lobe, lower sidelobes, higher contrast, and better signal to noise ratio than parallel...

  13. Energy consumption model over parallel programs implemented on multicore architectures

    Ricardo Isidro-Ramirez

    2015-06-01

    In High Performance Computing, energy consumption is becoming an important aspect to consider. Because of the high cost of energy production in all countries, finding ways to save energy plays an important role. This is reflected in efforts to reduce the energy requirements of hardware components and applications. Several options have appeared to scale down energy use and, consequently, scale up energy efficiency. One of these strategies is the multithreaded programming paradigm, whose purpose is to produce parallel programs able to use the full amount of computing resources available in a microprocessor. That energy-saving strategy focuses on the efficient use of multicore processors, which are found in various computing devices, such as mobile devices. As a growing trend since 2003, multicore processors have been part of various special-purpose computers, from High Performance Computing servers to mobile devices. However, it is not clear how multiprogramming affects energy efficiency. This paper presents an analysis of different types of multicore-based architectures used in computing, and then a valid model is presented. Based on Amdahl's Law, a model that considers different scenarios of energy use in multicore architectures is proposed. Some interesting results were found from experiments with the developed algorithm, which was executed in both parallel and sequential form. A lower limit of energy consumption was found for one type of multicore architecture, and this behavior was observed experimentally.
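
    The abstract does not state the model itself, so the following is only a sketch of how an Amdahl-based energy estimate could look. The assumptions are ours, not the paper's: a parallel fraction `p`, `n` identical cores, an active-core power `p_active`, an idle-core power `p_idle`, and a serial phase in which one core works while the others idle.

```python
def amdahl_speedup(p, n):
    """Amdahl's Law: speedup with parallel fraction p on n cores."""
    return 1.0 / ((1.0 - p) + p / n)

def energy(p, n, t1=1.0, p_active=1.0, p_idle=0.3):
    """Hypothetical energy estimate for a run that takes t1 on one core.

    Serial phase: one core active, n - 1 cores idle.
    Parallel phase: all n cores active.
    """
    serial_time = (1.0 - p) * t1
    parallel_time = p * t1 / n
    serial_power = p_active + (n - 1) * p_idle
    parallel_power = n * p_active
    return serial_time * serial_power + parallel_time * parallel_power
```

    Under these assumptions the parallel phase always costs `p * t1 * p_active` regardless of `n`, so any growth in energy with core count comes from idle cores during the serial phase.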

  14. An empirical study of FORTRAN programs for parallelizing compilers

    Shen, Zhiyu; Li, Zhiyuan; Yew, Pen-Chung

    1990-01-01

    Some results are reported from an empirical study of program characteristics that are important to writers of parallelizing compilers, especially in the area of data dependence analysis and program transformations. The state of the art in data dependence analysis and some parallel execution techniques are examined. The major findings are as follows. Many subscripts contain symbolic terms with unknown values. A few methods of determining their values at compile time are evaluated. Array references with coupled subscripts appear quite frequently; these subscripts must be handled simultaneously in a dependence test, rather than being handled separately as in current test algorithms. Nonzero coefficients of loop indexes in most subscripts are found to be simple: they are either 1 or -1. This allows an exact real-valued test to be as accurate as an exact integer-valued test for one-dimensional or two-dimensional arrays. Dependencies with uncertain distance are found to be rather common, and one of the main reasons is the frequent appearance of symbolic terms with unknown values.
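
    For readers unfamiliar with dependence tests, the classic GCD test is one simple integer-valued example. It is offered here as background, not as one of the tests evaluated in the study; the function name and argument order are assumptions. For references `A[a*i + c1]` and `A[b*j + c2]`, a dependence is possible only if `gcd(a, b)` divides `c2 - c1` (loop bounds are ignored).

```python
from math import gcd

def gcd_test(a, c1, b, c2):
    """Return True if a*i + c1 == b*j + c2 may have an integer solution.

    Loop bounds are ignored, so True means "dependence not ruled out".
    """
    g = gcd(a, b)
    if g == 0:  # both coefficients are zero: compare the constants
        return c1 == c2
    return (c2 - c1) % g == 0
```

    When the coefficients are 1 or -1, as the study reports for most subscripts, `gcd(a, b)` is 1 and this test can never rule a dependence out, which is why tests that also use the loop bounds matter in practice.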

  15. Poker on the Cosmic Cube: The First Retargetable Parallel Programming Language and Environment.

    1986-06-01

    parallel programming environment, to new parallel architectures. The specifics are illustrated by describing the retarget of Poker to CalTech’s Cosmic Cube. Poker requires only three features from the target architecture: MIMD operation, message passing inter-process communication, and a sequential language (e.g. C) for the processor elements. In return Poker gives the new architecture a complete parallel programming environment which will compile Poker parallel programs without modification, into efficient object code for the new architecture.

  16. CUDA programs for GPU computing of Swendsen-Wang multi-cluster spin flip algorithm: 2D and 3D Ising, Potts, and XY models

    Komura, Yukihiro

    2014-01-01

    We present sample CUDA programs for the GPU computing of the Swendsen-Wang multi-cluster spin flip algorithm. We deal with the classical spin models; the Ising model, the $q$-state Potts model, and the classical XY model. As for the lattice, both the 2D (square) lattice and the 3D (simple cubic) lattice are treated. We already reported the idea of the GPU implementation for 2D models [Comput. Phys. Commun. 183 (2012) 1155-1161]. We here explain the details of sample programs, and discuss the performance of the present GPU implementation for the 3D Ising and XY models. We also show the calculated results of the moment ratio for these models, and discuss phase transitions.
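
    As a point of reference for the algorithm named in this abstract, here is a minimal serial sketch of one Swendsen-Wang update for the 2D Ising model. It is plain Python, not the authors' CUDA code; the function name, the union-find labeling, and the coupling J = 1 are assumptions. Bonds between equal neighbouring spins are activated with probability 1 - exp(-2*beta), and each resulting cluster is flipped with probability 1/2.

```python
import math
import random

def swendsen_wang_step(spins, beta, rng):
    """One Swendsen-Wang update of an L x L periodic Ising lattice (J = 1)."""
    L = len(spins)
    parent = list(range(L * L))

    def find(a):
        # Union-find root lookup with path halving
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    p_bond = 1.0 - math.exp(-2.0 * beta)
    for y in range(L):
        for x in range(L):
            # Right and down neighbours cover every bond once (for L >= 3)
            for nx, ny in (((x + 1) % L, y), (x, (y + 1) % L)):
                if spins[y][x] == spins[ny][nx] and rng.random() < p_bond:
                    ra, rb = find(y * L + x), find(ny * L + nx)
                    if ra != rb:
                        parent[ra] = rb

    flip = {}  # one coin toss per cluster root
    for y in range(L):
        for x in range(L):
            root = find(y * L + x)
            if root not in flip:
                flip[root] = rng.random() < 0.5
            if flip[root]:
                spins[y][x] = -spins[y][x]
    return spins
```

    A call such as `swendsen_wang_step(spins, 0.44, random.Random(0))` performs one update in place. The cluster labeling is the part that is hard to parallelize, and it is typically where GPU implementations differ most from a serial version like this one.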

  17. A Parallel Vector Machine for the PM Programming Language

    Bellerby, Tim

    2016-04-01

    PM is a new programming language which aims to make the writing of computational geoscience models on parallel hardware accessible to scientists who are not themselves expert parallel programmers. It is based around the concept of communicating operators: language constructs that enable variables local to a single invocation of a parallelised loop to be viewed as if they were arrays spanning the entire loop domain. This mechanism enables different loop invocations (which may or may not be executing on different processors) to exchange information in a manner that extends the successful Communicating Sequential Processes idiom from single messages to collective communication. Communicating operators avoid the additional synchronisation mechanisms, such as atomic variables, required when programming using the Partitioned Global Address Space (PGAS) paradigm. Using a single loop invocation as the fundamental unit of concurrency enables PM to uniformly represent different levels of parallelism from vector operations through shared memory systems to distributed grids. This paper describes an implementation of PM based on a vectorised virtual machine. On a single processor node, concurrent operations are implemented using masked vector operations. Virtual machine instructions operate on vectors of values and may be unmasked, masked using a Boolean field, or masked using an array of active vector cell locations. Conditional structures (such as if-then-else or while statement implementations) calculate and apply masks to the operations they control. A shift in mask representation from Boolean to location-list occurs when active locations become sufficiently sparse. Parallel loops unfold data structures (or vectors of data structures for nested loops) into vectors of values that may additionally be distributed over multiple computational nodes and then split into micro-threads compatible with the size of the local cache. Inter-node communication is accomplished using

  18. Open-MP and Parallel Programming

    陈崚; 陈宏建; 秦玲

    2003-01-01

    The application programming interface Open-MP for shared memory parallel computer systems and its characteristics are illustrated. We also compare Open-MP with the parallel programming tool MPI. To overcome the disadvantage of large overhead in Open-MP programs, several optimization methods in Open-MP programming are presented to increase the efficiency of execution.

  19. Parallel functional programming in Sisal: Fictions, facts, and future

    McGraw, J.R.

    1993-07-01

    This paper provides a status report on the progress of research and development on the functional language Sisal. This project focuses on providing a highly effective method of writing large scientific applications that can efficiently execute on a spectrum of different multiprocessors. The paper includes sections on the language definition, compilation strategies, and programming techniques intended for readers with little or no background with Sisal. The section on performance presents our most recent results on execution speed for shared-memory multiprocessors, our findings using Sisal to develop codes, and our experiences migrating the same source code to different machines. For large programs, the execution performance of Sisal (with minimal supporting advice from the programmer) usually exceeds that of the best available automatic, vector/parallel Fortran compilers. Our evidence also indicates that Sisal programs tend to be shorter in length, faster to write, and clearer to understand than equivalent algorithms in Fortran. The paper concludes with a substantial discussion of common criticisms of the language and our plans for addressing them. Most notably, efficient implementations for distributed memory machines are lacking; an issue we plan to remedy.

  20. Determination of mode-I fracture parameters of parallel strand bamboo based on VIC-3D technology

    杨蕾; 周爱萍; 黄东升; 何晨

    2016-01-01

    Parallel strand bamboo (PSB), fabricated from raw bamboo by an industrial process, is a high-strength composite that has been used in building construction in recent years. Microvoids are inevitably left in the PSB composite owing to the dimensional inconsistency of bamboo fibers; hence fracture due to the coalescence and expansion of microvoids may be a major failure mode of PSB components. This paper aims to study the Mode I fracture properties based on LEFM theory and Irwin's energy theory. The wedge splitting test was employed as the test method. The crack extension length was determined with a three-dimensional virtual image correlation (VIC-3D) global-field deformation acquisition system. The fracture toughness of PSB and R-curves were obtained. The results showed that the VIC-3D global-field acquisition system can precisely locate the crack tip, so the complicated calculation otherwise needed to determine crack length may be avoided. Good agreement was achieved between the results of this method and those of the compliance approach, which indicates that the proposed method is efficient.

  1. Simulation of a Control System for a 3-DOF Parallel Robot Based on Simulink/SimMechanics

    胡峰; 骆德渊; 雷霆; 柯辉

    2012-01-01

    Because the control system of a parallel robot is more complicated than that of a traditional serial robot, virtual simulation technology was applied to the study of parallel robot control strategies. Taking a 3-DOF Delta parallel robot as an example, and in order to simulate the control system conveniently and rapidly, Simulink was used as the simulation platform in combination with SimMechanics Link. A modeling method that translates Pro/E CAD assemblies into a SimMechanics model was presented, after which a PID controller model was designed. The experimental results show that the method provides an efficient simulation platform for research on control strategies for parallel robots.

  2. Ex-vessel neutron dosimetry analysis for Westinghouse 4-loop XL pressurized water reactor plant using the RadTrack™ Code System with the 3D parallel discrete ordinates code RAPTOR-M3G

    Chen, J.; Alpan, F. A.; Fischer, G.A.; Fero, A.H. [Westinghouse Electric Company, Nuclear Services, Radiation Engineering and Analysis, 1000 Westinghouse Dr., Cranberry Township, PA 16066-5228 (United States)]

    2011-07-01

    Traditional two-dimensional (2D)/one-dimensional (1D) SYNTHESIS methodology has been widely used to calculate fast neutron (>1.0 MeV) fluence exposure to reactor pressure vessel in the belt-line region. However, it is expected that this methodology cannot provide accurate fast neutron fluence calculation at elevations far above or below the active core region. A three-dimensional (3D) parallel discrete ordinates calculation for ex-vessel neutron dosimetry on a Westinghouse 4-Loop XL Pressurized Water Reactor has been done. It shows good agreement between the calculated results and measured results. Furthermore, the results show very different fast neutron flux values at some of the former plate locations and elevations above and below an active core than those calculated by a 2D/1D SYNTHESIS method. This indicates that for certain irregular reactor internal structures, where the fast neutron flux has a very strong local effect, it is required to use a 3D transport method to calculate accurate fast neutron exposure. (authors)

  3. The Rochester Checkers Player: Multi-Model Parallel Programming for Animate Vision

    1991-06-01

    parallel programming is likely to serve for all tasks, however. Early vision algorithms are intensely data parallel, often utilizing fine-grain parallel computations that share an image, while cognition algorithms decompose naturally by function, often consisting of loosely-coupled, coarse-grain parallel units. A typical animate vision application will likely consist of many tasks, each of which may require a different parallel programming model, and all of which must cooperate to achieve the desired behavior. These multi-model programs require an

  4. GRID2D/3D: A computer program for generating grid systems in complex-shaped two- and three-dimensional spatial domains. Part 2: User's manual and program listing

    Bailey, R. T.; Shih, T. I.-P.; Nguyen, H. L.; Roelke, R. J.

    1990-01-01

    An efficient computer program, called GRID2D/3D, was developed to generate single and composite grid systems within geometrically complex two- and three-dimensional (2- and 3-D) spatial domains that can deform with time. GRID2D/3D generates single grid systems by using algebraic grid generation methods based on transfinite interpolation in which the distribution of grid points within the spatial domain is controlled by stretching functions. All single grid systems generated by GRID2D/3D can have grid lines that are continuous and differentiable everywhere up to the second-order. Also, grid lines can intersect boundaries of the spatial domain orthogonally. GRID2D/3D generates composite grid systems by patching together two or more single grid systems. The patching can be discontinuous or continuous. For continuous composite grid systems, the grid lines are continuous and differentiable everywhere up to the second-order except at interfaces where different single grid systems meet. At interfaces where different single grid systems meet, the grid lines are only differentiable up to the first-order. For 2-D spatial domains, the boundary curves are described by using either cubic or tension spline interpolation. For 3-D spatial domains, the boundary surfaces are described by using either linear Coon's interpolation, bi-hyperbolic spline interpolation, or a new technique referred to as 3-D bi-directional Hermite interpolation. Since grid systems generated by algebraic methods can have grid lines that overlap one another, GRID2D/3D contains a graphics package for evaluating the grid systems generated. With the graphics package, the user can generate grid systems in an interactive manner with the grid generation part of GRID2D/3D. GRID2D/3D is written in FORTRAN 77 and can be run on any IBM PC, XT, or AT compatible computer. In order to use GRID2D/3D on workstations or mainframe computers, some minor modifications must be made in the graphics part of the program; no
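
    Algebraic grid generation by transfinite interpolation with a stretching function, as named in this abstract, can be sketched in two dimensions as follows. This is not the GRID2D/3D FORTRAN 77 code: the function names, the tanh stretching function, and the callable-boundary interface are assumptions made for illustration, and the sketch uses the standard linear (first-order) transfinite formula rather than the smoother variants the abstract describes.

```python
import math
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Curve = Callable[[float], Point]  # maps a parameter in [0, 1] to a boundary point


def stretch(t: float, beta: float = 0.0) -> float:
    """Hypothetical stretching function; beta = 0 keeps uniform spacing,
    beta > 0 clusters grid lines toward t = 1."""
    if beta == 0.0:
        return t
    return math.tanh(beta * t) / math.tanh(beta)


def tfi_grid(bottom: Curve, top: Curve, left: Curve, right: Curve,
             ni: int, nj: int, beta: float = 0.0) -> List[List[Point]]:
    """Fill the interior of four boundary curves by linear transfinite
    interpolation. Returns grid[j][i]; requires ni >= 2 and nj >= 2."""
    p00, p10 = bottom(0.0), bottom(1.0)
    p01, p11 = top(0.0), top(1.0)
    grid = []
    for j in range(nj):
        v = stretch(j / (nj - 1), beta)
        row = []
        for i in range(ni):
            u = i / (ni - 1)
            b, t, l, r = bottom(u), top(u), left(v), right(v)
            point = tuple(
                (1 - v) * b[k] + v * t[k] + (1 - u) * l[k] + u * r[k]
                - ((1 - u) * (1 - v) * p00[k] + u * (1 - v) * p10[k]
                   + (1 - u) * v * p01[k] + u * v * p11[k])
                for k in range(2)
            )
            row.append(point)
        grid.append(row)
    return grid
```

    The corner-correction term is what makes the interior grid reproduce all four boundary curves exactly, provided the curves meet at the corners.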

  5. MoldaNet: a network distributed molecular graphics and modelling program that integrates secure signed applet and Java 3D technologies.

    Yoshida, H; Rzepa, H S; Tonge, A P

    1998-06-01

    MoldaNet is a molecular graphics and modelling program that integrates several new Java technologies, including authentication as a Secure Signed Applet, and implementation of Java 3D classes to enable access to hardware graphics acceleration. It is the first example of a novel class of Internet-based distributed computational chemistry tool designed to eliminate the need for user pre-installation of software on their client computer other than a standard Internet browser. The creation of a properly authenticated tool using a signed digital X.509 certificate permits the user to employ MoldaNet to read and write the files to a local file store; actions that are normally disallowed in Java applets. The modularity of the Java language also allows straightforward inclusion of Java3D and Chemical Markup Language classes in MoldaNet to permit the user to filter their model into 3D model descriptors such as VRML97 or CML for saving on local disk. The implications for both distance-based training environments and chemical commerce are noted.

  6. Parallel Programming Methodologies for Non-Uniform Structured Problems in Materials Science

    1993-10-01

    Interim report, 10/93, covering 12/01/92 - 09/30/93. Title and subtitle: Parallel Programming Methodologies for Non-Uniform Structured... Dear Dr. van Tilborg, Enclosed you will find the annual report for "Parallel Programming Methodologies for Non-Uniform Structured Problems in... Quincy Street, Arlington, VA 22217-5660

  7. Current Development of Parallel Programming Languages

    韩卫; 郝红宇; 代丽

    2003-01-01

    In this paper we introduce the history of parallel programming languages and list some of the languages currently in use. Then, according to the classification principle, we analyze some representative parallel programming languages in detail. Finally, we discuss the future prospects of parallel programming languages.

  8. Hip2Norm: an object-oriented cross-platform program for 3D analysis of hip joint morphology using 2D pelvic radiographs.

    Zheng, G; Tannast, M; Anderegg, C; Siebenrock, K A; Langlotz, F

    2007-07-01

    We developed an object-oriented cross-platform program to perform three-dimensional (3D) analysis of hip joint morphology using two-dimensional (2D) anteroposterior (AP) pelvic radiographs. Landmarks extracted from 2D AP pelvic radiographs and optionally an additional lateral pelvic X-ray were combined with a cone beam projection model to reconstruct 3D hip joints. Since individual pelvic orientation can vary considerably, a method for standardizing pelvic orientation was implemented to determine the absolute tilt/rotation. The evaluation of anatomically morphologic differences was achieved by reconstructing the projected acetabular rim and the measured hip parameters as if obtained in a standardized neutral orientation. The program had been successfully used to interactively objectify acetabular version in hips with femoro-acetabular impingement or developmental dysplasia. Hip(2)Norm is written in object-oriented programming language C++ using cross-platform software Qt (TrollTech, Oslo, Norway) for graphical user interface (GUI) and is transportable to any platform.

  9. Building a 3D printer

    Brdnik, Lovro

    2015-01-01

    The thesis analyzes the current state of 3D printers on the market. The development and operating principles of 3D printers are presented. The types of 3D printers and their advantages and disadvantages are introduced. The construction and operation of stepper motors are presented in more detail, and measurements of stepper motors are carried out. The software for operating 3D printers and the components needed to build one are described. The thesis centers on the question of whether building a 3D printer is more economical than investing in ...

  10. MulticoreBSP for C : A high-performance library for shared-memory parallel programming

    Yzelman, A. N.; Bisseling, R. H.; Roose, D.; Meerbergen, K.

    2014-01-01

    The bulk synchronous parallel (BSP) model, as well as parallel programming interfaces based on BSP, classically target distributed-memory parallel architectures. In earlier work, Yzelman and Bisseling designed a MulticoreBSP for Java library specifically for shared-memory architectures. In the prese

  11. Processor Allocation for Optimistic Parallelization of Irregular Programs

    Versaci, Francesco

    2012-01-01

    Optimistic parallelization is a promising approach for the parallelization of irregular algorithms: potentially interfering tasks are launched dynamically, and the runtime system detects conflicts between concurrent activities, aborting and rolling back conflicting tasks. However, parallelism in irregular algorithms is very complex. In a regular algorithm like dense matrix multiplication, the amount of parallelism can usually be expressed as a function of the problem size, so it is reasonably straightforward to determine how many processors should be allocated to execute a regular algorithm of a certain size (this is called the processor allocation problem). In contrast, parallelism in irregular algorithms can be a function of input parameters, and the amount of parallelism can vary dramatically during the execution of the irregular algorithm. Therefore, the processor allocation problem for irregular algorithms is very difficult. In this paper, we describe the first systematic strategy for addressing this pro...

  12. Development of an Artificial Intelligence Programming Course and Unity3d Based Framework to Motivate Learning in Artistic Minded Students

    Reng, Lars

    2012-01-01

    between technical and artistic minded students is, however, increased once the students reach the sixth semester. The complex algorithms of the artificial intelligence course seemed to demotivate the artistic minded students even before the course began. This paper will present the extensive changes made to the sixth semester artificial intelligence programming course, in order to provide a highly motivating direct visual feedback, and thereby remove the steep initial learning curve for artistic minded students. The framework was developed with close dialog to both the game industry and experienced master...

  13. 3-D Hybrid Kinetic Modeling of the Interaction Between the Solar Wind and Lunar-like Exospheric Pickup Ions in Case of Oblique/ Quasi-Parallel/Parallel Upstream Magnetic Field

    Lipatov, A. S.; Farrell, W. M.; Cooper, J. F.; Sittler, E. C., Jr.; Hartle, R. E.

    2015-01-01

    The interactions between the solar wind and Moon-sized objects are determined by a set of the solar wind parameters and plasma environment of the space objects. The orientation of upstream magnetic field is one of the key factors which determines the formation and structure of bow shock wave/Mach cone or Alfven wing near the obstacle. The study of effects of the direction of the upstream magnetic field on lunar-like plasma environment is the main subject of our investigation in this paper. Photoionization, electron-impact ionization and charge exchange are included in our hybrid model. The computational model includes the self-consistent dynamics of the light (H+, He+) and heavy (Na+) pickup ions. The lunar interior is considered as a weakly conducting body. Our previous 2013 lunar work, as reported in this journal, found formation of a triple structure of the Mach cone near the Moon in the case of perpendicular upstream magnetic field. Further advances in modeling now reveal the presence of strong wave activity in the upstream solar wind and plasma wake in the cases of quasiparallel and parallel upstream magnetic fields. However, little wave activity is found for the opposite case with a perpendicular upstream magnetic field. The modeling does not show a formation of the Mach cone in the case of θ_{B,U} ≈ 0°.

  14. HPF: a data parallel programming interface for large-scale numerical simulations

    Seo, Yoshiki; Suehiro, Kenji; Murai, Hitoshi [NEC Corp., Tokyo (Japan)]

    1998-03-01

    HPF (High Performance Fortran) is a data parallel language designed for programming on distributed memory parallel systems. The first draft of HPF1.0 was defined in 1993 as a de facto standard language. Recently, relatively reliable HPF compilers have become available on several distributed memory parallel systems. Many projects to parallelize real world programs have started mainly in the U.S. and Europe, and the weak and strong points in the current HPF have been made clear. In this paper, major data transfer patterns required to parallelize numerical simulations, such as SHIFT, matrix transposition, reduction, GATHER/SCATTER and irregular communication, and the programming methods to implement them with HPF are described. The problems in the current HPF interface for developing efficient parallel programs and recent activities to deal with them are presented as well. (author)
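
    The data transfer patterns listed in this abstract can be written down as plain array operations. The sketch below is Python, not HPF, and says nothing about how an HPF compiler distributes the data or generates communication; the function names are hypothetical, and ordinary lists stand in for distributed arrays.

```python
from typing import List, Sequence


def shift(a: Sequence[float], k: int) -> List[float]:
    """Circular shift: element i of the result is a[(i + k) % n]."""
    n = len(a)
    k %= n
    return list(a[k:]) + list(a[:k])


def transpose(m: Sequence[Sequence[float]]) -> List[List[float]]:
    """Matrix transposition: rows become columns."""
    return [list(col) for col in zip(*m)]


def reduce_sum(a: Sequence[float]) -> float:
    """Reduction: combine all elements into a single value."""
    return sum(a)


def gather(a: Sequence[float], index: Sequence[int]) -> List[float]:
    """GATHER: read elements through an index vector."""
    return [a[i] for i in index]


def scatter(target: Sequence[float], index: Sequence[int],
            values: Sequence[float]) -> List[float]:
    """SCATTER: write values through an index vector (irregular access)."""
    out = list(target)
    for i, v in zip(index, values):
        out[i] = v
    return out


def stencil_average(a: Sequence[float]) -> List[float]:
    """Periodic three-point average built from two shifts, the typical
    use of SHIFT in grid-based simulations."""
    left, right = shift(a, -1), shift(a, 1)
    return [(l + c + r) / 3 for l, c, r in zip(left, a, right)]
```

    On a distributed memory machine, each of these turns into a different communication pattern: nearest-neighbour exchange for the shift, all-to-all for the transpose, and data-dependent messages for gather and scatter.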

  15. OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems.

    Stone, John E; Gohara, David; Shi, Guochun

    2010-05-01

    We provide an overview of the key architectural features of recent microprocessor designs and describe the programming model and abstractions provided by OpenCL, a new parallel programming standard targeting these architectures.

  16. VPython: Python plus Animations in Stereo 3D

    Sherwood, Bruce

    2004-03-01

    Python is a modern object-oriented programming language. VPython (http://vpython.org) is a combination of Python (http://python.org), the Numeric module from LLNL (http://www.pfdubois.com/numpy), and the Visual module created by David Scherer, all of which have been under continuous development as open source projects. VPython makes it easy to write programs that generate real-time, navigable 3D animations. The Visual module includes a set of 3D objects (sphere, cylinder, arrow, etc.), tools for creating other shapes, and support for vector algebra. The 3D renderer runs in a parallel thread, and animations are produced as a side effect of computations, freeing the programmer to concentrate on the physics. Applications include educational and research visualization. In the Fall of 2003 Hugh Fisher at the Australian National University, John Zelle at Wartburg College, and I contributed to a new stereo capability of VPython. By adding a single statement to an existing VPython program, animations can be viewed in true stereo 3D. One can choose several modes: active shutter glasses, passive polarized glasses, or colored glasses (e.g. red-cyan). The talk will demonstrate the new stereo capability and discuss the pros and cons of various schemes for display of stereo 3D for a large audience. Supported in part by NSF grant DUE-0237132.

  17. 3D printer technologies

    Kolar, Nataša

    2016-01-01

    The thesis presents the development of printing through time. 3D printers that use different 3D printing technologies are described in more detail. The various 3D printing technologies, their uses, and the prototypes or final products made with them are presented. The thesis describes the entire process, from the idea, through preparation of the data and the printer, to production of the prototype or final product.

  18. KT3D_H2O: a program for kriging water level data using hydrologic drift terms.

    Karanovic, Marinko; Tonkin, Matthew; Wilson, David

    2009-01-01

    It is often necessary to estimate the zone of contribution to, or the capture zone developed by, pumped wells: for example, when evaluating pump-and-treat remedies and when developing wellhead protection areas for supply wells. Tonkin and Larson (2002) and Brochu and Marcotte (2003) describe a mapping-based method for estimating the capture zone of pumped wells, developed by combining universal kriging (kriging with a trend) with analytical expressions that describe the response of the potentiometric surface to certain applied stresses. This Methods Note describes (a) expansions to the technique described by Tonkin and Larson (2002); (b) the concept of the capture frequency map (CFM), a technique that combines information from multiple capture zone maps into a single depiction of capture; (c) the development of a graphical user interface to facilitate the use of the methods described; and (d) the integration of these programs within the MapWindow geographic information system environment. An example application is presented that illustrates ground water level contours, capture zones, and a CFM prepared using the methods and software described.
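
    One plausible reading of the capture frequency map (CFM) is a per-cell fraction: for each grid cell, the share of the input capture zone maps in which that cell lies inside the capture zone. The abstract says only that the CFM combines multiple capture zone maps into a single depiction, so the averaging rule and the function name below are assumptions, not the KT3D_H2O implementation.

```python
from typing import List, Sequence

Grid = Sequence[Sequence[int]]  # 1 = cell captured, 0 = not captured


def capture_frequency_map(zone_maps: Sequence[Grid]) -> List[List[float]]:
    """Return, per cell, the fraction of maps in which the cell is captured."""
    if not zone_maps:
        raise ValueError("at least one capture zone map is required")
    rows, cols = len(zone_maps[0]), len(zone_maps[0][0])
    for m in zone_maps:
        if len(m) != rows or any(len(r) != cols for r in m):
            raise ValueError("all capture zone maps must share one grid shape")
    n = len(zone_maps)
    return [[sum(m[r][c] for m in zone_maps) / n for c in range(cols)]
            for r in range(rows)]
```

    With four maps, for example from four water level measurement events, a cell captured in three of them gets the value 0.75.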

  19. 3D virtual exhibition

    Tournay, Bruno; Rüdiger, Bjarne

    2006-01-01

    3D digital model of the School of Architecture's courtyard with a virtual exhibition of graduation projects from the summer 2006 graduation. 10 pp.

  20. Martian terrain - 3D

    1997-01-01

    This area of terrain near the Sagan Memorial Station was taken on Sol 3 by the Imager for Mars Pathfinder (IMP). 3D glasses are necessary to identify surface detail. The IMP is a stereo imaging system with color capability provided by 24 selectable filters -- twelve filters per 'eye.' It stands 1.8 meters above the Martian surface, and has a resolution of two millimeters at a range of two meters. Mars Pathfinder is the second in NASA's Discovery program of low-cost spacecraft with highly focused science goals. The Jet Propulsion Laboratory, Pasadena, CA, developed and manages the Mars Pathfinder mission for NASA's Office of Space Science, Washington, D.C. JPL is an operating division of the California Institute of Technology (Caltech). The Imager for Mars Pathfinder (IMP) was developed by the University of Arizona Lunar and Planetary Laboratory under contract to JPL. Peter Smith is the Principal Investigator.

  1. A simple and efficient explicit parallelization of logic programs using low-level threading primitives

    Saha, Diptikalyan

    2009-01-01

    In this work, we present an automatic way to parallelize logic programs for finding all the answers to queries using a transformation to low level threading primitives. Although much work was done on the parallelization of logic programming more than a decade ago (e.g., Aurora, Muse, YapOR), the current state of parallelizing logic programs is still very poor. This work presents a way to parallelize tabled logic programs in XSB Prolog under the well founded semantics. An important contribution of this work lies in merging answer-tables from multiple children threads without incurring copying or full-sharing and synchronization of data-structures. The implementation of the parent-children shared answer-tables surpasses in efficiency all the other data-structures currently implemented for completion of answers in parallelization using multi-threading. The transformation and its lower-level answer merging predicates were implemented as an extension to the XSB system.

  2. Blender 3D cookbook

    Valenza, Enrico

    2015-01-01

    This book is aimed at the professionals that already have good 3D CGI experience with commercial packages and have now decided to try the open source Blender and want to experiment with something more complex than the average tutorials on the web. However, it's also aimed at the intermediate Blender users who simply want to go some steps further. It's taken for granted that you already know how to move inside the Blender interface, that you already have 3D modeling knowledge, and also that of basic 3D modeling and rendering concepts, for example, edge-loops, n-gons, or samples. In any case, it'

  3. Machine and Collection Abstractions for User-Implemented Data-Parallel Programming

    Magne Haveraaen

    2000-01-01

    Data parallelism has appeared as a fruitful approach to the parallelisation of compute-intensive programs. Data parallelism has the advantage of mimicking the sequential (and deterministic) structure of programs as opposed to task parallelism, where the explicit interaction of processes has to be programmed. In data parallelism data structures, typically collection classes in the form of large arrays, are distributed on the processors of the target parallel machine. Trying to extract distribution aspects from conventional code often runs into problems with a lack of uniformity in the use of the data structures and in the expression of data dependency patterns within the code. Here we propose a framework with two conceptual classes, Machine and Collection. The Machine class abstracts hardware communication and distribution properties. This gives a programmer high-level access to the important parts of the low-level architecture. The Machine class may readily be used in the implementation of a Collection class, giving the programmer full control of the parallel distribution of data, as well as allowing normal sequential implementation of this class. Any program using such a collection class will be parallelisable, without requiring any modification, by choosing between sequential and parallel versions at link time. Experiments with a commercial application, built using the Sophus library which uses this approach to parallelisation, show good parallel speed-ups, without any adaptation of the application program being needed.
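
    The division of labour between the two conceptual classes can be sketched as below. This is an illustration only, not the Sophus library or the authors' interface: the class and method names are hypothetical, Python threads stand in for the processors of a parallel machine, and the sequential or parallel version is picked at run time rather than at link time.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


class SequentialMachine:
    """Machine abstraction with a single worker: chunks run one after another."""
    n_workers = 1

    def run(self, func: Callable, chunks: Sequence[list]) -> List[list]:
        return [func(chunk) for chunk in chunks]


class ThreadedMachine:
    """Machine abstraction that hands each chunk to a worker thread."""

    def __init__(self, n_workers: int) -> None:
        self.n_workers = n_workers

    def run(self, func: Callable, chunks: Sequence[list]) -> List[list]:
        with ThreadPoolExecutor(max_workers=self.n_workers) as pool:
            return list(pool.map(func, chunks))


class Collection:
    """Array-like collection whose data distribution is decided by a Machine."""

    def __init__(self, data: Sequence, machine) -> None:
        self.machine = machine
        n, size = machine.n_workers, len(data)
        # Block distribution: one contiguous chunk per worker.
        self.chunks = [list(data[size * k // n: size * (k + 1) // n])
                       for k in range(n)]

    def map(self, func: Callable) -> "Collection":
        result = Collection([], self.machine)
        result.chunks = self.machine.run(
            lambda chunk: [func(x) for x in chunk], self.chunks)
        return result

    def to_list(self) -> list:
        return [x for chunk in self.chunks for x in chunk]


def application(machine) -> list:
    """Application code written once against Collection, unaware of the Machine."""
    return Collection(range(10), machine).map(lambda x: x * x).to_list()
```

    Calling `application(SequentialMachine())` and `application(ThreadedMachine(3))` runs the same application code and returns the same list; only the Machine passed in differs.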

  4. 3D Digital Modelling

    Hundebøl, Jesper

    wave of new building information modelling tools demands further investigation, not least because of industry representatives' somewhat coarse parlance: Now the word is spreading - 3D digital modelling is nothing less than a revolution, a shift of paradigm, a new alphabet... Research questions. Based on empirical probes (interviews, observations, written inscriptions) within the Danish construction industry this paper explores the organizational and managerial dynamics of 3D Digital Modelling. The paper intends to - Illustrate how the network of (non-)human actors engaged in the promotion (and arrest) of 3D Modelling (in Denmark) stabilizes - Examine how 3D Modelling manifests itself in the early design phases of a construction project with a view to discuss the effects hereof for i.a. the management of the building process. Structure. The paper introduces a few, basic methodological concepts...

  5. Professional Papervision3D

    Lively, Michael

    2010-01-01

    Professional Papervision3D describes how Papervision3D works and how real world applications are built, with a clear look at essential topics such as building websites and games, creating virtual tours, and Adobe's Flash 10. Readers learn important techniques through hands-on applications, and build on those skills as the book progresses. The companion website contains all code examples, video step-by-step explanations, and a collada repository.

  6. AE3D

    2016-06-20

    AE3D solves for the shear Alfven eigenmodes and eigenfrequencies in a toroidal magnetic fusion confinement device. The configuration can be either 2D (e.g. tokamak, reversed field pinch) or 3D (e.g. stellarator, helical reversed field pinch, tokamak with ripple). The equations solved are based on a reduced MHD model and sound wave coupling effects are not currently included.

  7. Feasibility of 3D Partially Parallel Acquisition DCE MRI in Pulmonary Parenchyma Perfusion

    夏艺; 范丽; 刘士远; 管宇; 徐雪原; 于红; 肖湘生

    2012-01-01

    Objective: To assess the feasibility of 3D partially parallel acquisition dynamic contrast-enhanced MRI (DCE-MRI) for imaging regional pulmonary parenchyma perfusion. Materials and Methods: Ten healthy volunteers and 47 patients with lung disease underwent perfusion imaging on a clinical 1.5-T GE Excite HD whole-body system. The homogeneity of the perfusion images was assessed. Where a perfusion abnormality was present, the signal intensity ratio (RSI) of the abnormal region to normal lung was calculated. Results: Pulmonary parenchyma perfusion was well depicted with DCE-MRI. The perfusion images of the 10 healthy volunteers were homogeneous, with no perfusion defects. Twelve wedge-shaped perfusion defects were visualized in 10 patients with pulmonary embolism (PE), including one patient with bilateral PE who showed three defects. Perfusion defects were also seen in the corresponding supply territories of all 12 patients with lung cancer infiltrating the adjacent pulmonary artery. A one-sample t test showed a statistically significant difference in RSI (t = -24.74, P < 0.05). In the other 25 patients (20 with lung cancer not infiltrating the adjacent pulmonary artery and 5 with inflammatory lesions), the lesions appeared locally hypointense when first-pass enhancement of the lung parenchyma reached its peak. Conclusion: 3D partially parallel acquisition DCE-MRI can complete dynamic multiphase scanning within a single breath-hold and acquire volumetric perfusion data for the whole lung; semi-quantitative analysis of MR lung perfusion images can clearly distinguish regions of abnormal perfusion from normally perfused regions.

  8. Molecular dynamics simulation on a network of workstations using a machine-independent parallel programming language.

    1991-01-01

    Molecular dynamics simulations investigate local and global motion in molecules. Several parallel computing approaches have been taken to attack the most computationally expensive phase of molecular simulations, the evaluation of long range interactions. This paper develops a straightforward but effective algorithm for molecular dynamics simulations using the machine-independent parallel programming language, Linda. The algorithm was run both on a shared memory parallel computer and on a netw...

  9. CRBLASTER: a fast parallel-processing program for cosmic ray rejection

    Mighell, Kenneth J.

    2008-08-01

    Many astronomical image-analysis programs are based on algorithms that can be described as being embarrassingly parallel, where the analysis of one subimage generally does not affect the analysis of another subimage. Yet few parallel-processing astrophysical image-analysis programs exist that can easily take full advantage of today's fast multi-core servers costing a few thousand dollars. A major reason for the shortage of state-of-the-art parallel-processing astrophysical image-analysis codes is that the writing of parallel codes has been perceived to be difficult. I describe a new fast parallel-processing image-analysis program called CRBLASTER which does cosmic ray rejection using van Dokkum's L.A.Cosmic algorithm. CRBLASTER is written in C using the industry standard Message Passing Interface (MPI) library. Processing a single 800×800 HST WFPC2 image takes 1.87 seconds using 4 processes on an Apple Xserve with two dual-core 3.0-GHz Intel Xeons; the efficiency of the program running with the 4 processors is 82%. The code can be used as a software framework for easy development of parallel-processing image-analysis programs using embarrassingly parallel algorithms; the biggest required modification is the replacement of the core image processing function with an alternative image-analysis function based on a single-processor algorithm. I describe the design, implementation and performance of the program.
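
    The framework idea in this abstract (split the image into subimages, run a single-processor function on each, reassemble) can be sketched as follows. This is not CRBLASTER: it uses Python threads instead of C with MPI, the per-pixel clipping kernel is a stand-in for L.A.Cosmic, and all function names are hypothetical. In CPython, threads illustrate the structure but should not be expected to speed up a pure-Python kernel.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

Image = List[List[float]]


def split_rows(image: Image, n_parts: int) -> List[Image]:
    """Split an image into n_parts contiguous row bands of near-equal height."""
    n = len(image)
    bounds = [n * k // n_parts for k in range(n_parts + 1)]
    return [image[bounds[k]:bounds[k + 1]] for k in range(n_parts)]


def clip_band(band: Image, limit: float) -> Image:
    """Stand-in single-processor kernel: clip pixel values above limit.
    Replace this function to plug in a different per-subimage analysis."""
    return [[min(p, limit) for p in row] for row in band]


def process_parallel(image: Image, n_workers: int, limit: float) -> Image:
    """Process each band independently, then stitch the bands back together."""
    bands = split_rows(image, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(lambda b: clip_band(b, limit), bands))
    return [row for band in results for row in band]


def parallel_efficiency(t_serial: float, t_parallel: float, n_procs: int) -> float:
    """Efficiency = speedup / number of processes."""
    return t_serial / (t_parallel * n_procs)
```

    By this definition, the reported 82% efficiency on 4 processes corresponds to a speedup of about 0.82 × 4 = 3.28 over one process. The stand-in kernel looks at one pixel at a time, so bands need no overlap; a kernel that reads neighbouring pixels would presumably need the bands to share some edge rows.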

  10. The Effect of Parallel Programming Languages on the Performance and Energy Consumption of HPC Applications

    Muhammad Aqib

    2016-02-01

    Big and complex applications need many resources and long computation times when executed sequentially. In this scenario, all of an application's processes are handled sequentially even if they are independent of each other. In a high-performance computing environment, multiple processors are available to run applications in parallel, so mutually independent blocks of code can run in parallel. This approach not only increases the efficiency of the system without affecting the results but also saves a significant amount of energy. Many parallel programming models or APIs, such as Open MPI, OpenMP, and CUDA, are available for running multiple instructions in parallel. In this paper, the efficiency and energy consumption of two well-known tasks, matrix multiplication and quicksort, are analyzed using different parallel programming models and a multiprocessor machine. The obtained results, which can be generalized, outline the effect of choosing a programming model on the efficiency and energy consumption when running different codes on different machines.

  11. Exploiting Vector and Multicore Parallelism for Recursive, Data- and Task-Parallel Programs

    Ren, Bin; Krishnamoorthy, Sriram; Agrawal, Kunal; Kulkarni, Milind

    2017-01-26

    Modern hardware contains parallel execution resources that are well-suited for data parallelism (vector units) and task parallelism (multicores). However, most work on parallel scheduling focuses on one type of hardware or the other. In this work, we present a scheduling framework that allows for a unified treatment of task- and data-parallelism. Our key insight is an abstraction, task blocks, that uniformly handles data-parallel iterations and task-parallel tasks, allowing them to be scheduled on vector units or executed independently on multicores. Our framework allows us to define schedulers that can dynamically select between executing task blocks on vector units or multicores. We show that these schedulers are asymptotically optimal, and deliver the maximum amount of parallelism available in computation trees. To evaluate our schedulers, we develop program transformations that can convert mixed data- and task-parallel programs into task block-based programs. Using a prototype instantiation of our scheduling framework, we show that, on an 8-core system, we can simultaneously exploit vector and multicore parallelism to achieve 14×-108× speedup over sequential baselines.

  12. GeoGebra 3D from the Perspectives of Elementary Pre-Service Mathematics Teachers Who Are Familiar with a Number of Software Programs

    Baltaci, Serdal; Yildiz, Avni

    2015-01-01

    Each new version of the GeoGebra dynamic mathematics software goes through updates and innovations. One of these innovations is the GeoGebra 5.0 version. This version aims to facilitate 3D instruction by offering opportunities for students to analyze 3D objects. While scanning the previous studies of GeoGebra 3D, it is seen that they mainly focus…

  13. Cost Reduction Through the Use of Additive Manufacturing (3d Printing) and Collaborative Product Life Cycle Management Technologies to Enhance the Navy’s Maintenance Programs

    2013-09-01

    Objet (Israel), Polymers, Prototyping 3D Systems (US), Solidscape (US) 3D Systems (US), Polymers, Metals, Prototyping, ExOne (US), Casting Molds...Voxeljet (Germany) Direct Part Stratasys (US), Bits from Bytes, RepRap Polymers Prototyping EOS (Germany), 3D Systems (US), Polymers, Prototyping, Arcam...and typical markets: Vat Photopolymerization, Material Jetting, Binder Jetting, Material Extrusion, Powder Bed Fusion, Sheet Lamination, Directed Energy

  14. F-Nets and Software Cabling: Deriving a Formal Model and Language for Portable Parallel Programming

    DiNucci, David C.; Saini, Subhash (Technical Monitor)

    1998-01-01

    Parallel programming is still based upon antiquated sequence-based definitions of the terms "algorithm" and "computation", resulting in programs which are architecture dependent and difficult to design and analyze. By focusing on obstacles inherent in existing practice, a more portable model is derived here, which is then formalized into a model called F-Nets which utilizes a combination of imperative and functional styles. This formalization suggests more general notions of algorithm and computation, as well as insights into the meaning of structured programming in a parallel setting. To illustrate how these principles can be applied, a very-high-level graphical architecture-independent parallel language, called Software Cabling, is described, with many of the features normally expected from today's computer languages (e.g. data abstraction, data parallelism, and object-based programming constructs).

  15. Resolutions of the Coulomb operator: VIII. Parallel implementation using the modern programming language X10.

    Limpanuparb, Taweetham; Milthorpe, Josh; Rendell, Alistair P

    2014-10-30

    Use of the modern parallel programming language X10 for computing long-range Coulomb and exchange interactions is presented. By using X10, a partitioned global address space language with support for task parallelism and the explicit representation of data locality, the resolution of the Ewald operator can be parallelized in a straightforward manner including use of both intranode and internode parallelism. We evaluate four different schemes for dynamic load balancing of integral calculation using X10's work stealing runtime, and report performance results for long-range HF energy calculation of large molecule/high quality basis running on up to 1024 cores of a high performance cluster machine.

  16. Radiochromic 3D Detectors

    Oldham, Mark

    2015-01-01

    Radiochromic materials exhibit a colour change when exposed to ionising radiation. Radiochromic film has been used for clinical dosimetry for many years and increasingly so recently, as films of higher sensitivities have become available. The two principal advantages of radiochromic dosimetry are greater tissue equivalence (radiologically) and the lack of any requirement for development of the colour change. In a radiochromic material, the colour change arises directly from ionising interactions affecting dye molecules, without requiring any latent chemical, optical or thermal development, with important implications for increased accuracy and convenience. It is only relatively recently, however, that 3D radiochromic dosimetry has become possible. In this article we review recent developments and the current state-of-the-art of 3D radiochromic dosimetry, and the potential for a more comprehensive solution for the verification of complex radiation therapy treatments, and 3D dose measurement in general.

  17. 3D Spectroscopic Instrumentation

    Bershady, Matthew A

    2009-01-01

    In this Chapter we review the challenges of, and opportunities for, 3D spectroscopy, and how these have led to new and different approaches to sampling astronomical information. We describe and categorize existing instruments on 4m and 10m telescopes. Our primary focus is on grating-dispersed spectrographs. We discuss how to optimize dispersive elements, such as VPH gratings, to achieve adequate spectral resolution, high throughput, and efficient data packing to maximize spatial sampling for 3D spectroscopy. We review and compare the various coupling methods that make these spectrographs "3D," including fibers, lenslets, slicers, and filtered multi-slits. We also describe Fabry-Perot and spatial-heterodyne interferometers, pointing out their advantages as field-widened systems relative to conventional, grating-dispersed spectrographs. We explore the parameter space all these instruments sample, highlighting regimes open for exploitation. Present instruments provide a foil for future development. We give an...

  18. 3D Projection Installations

    Halskov, Kim; Johansen, Stine Liv; Bach Mikkelsen, Michelle

    2014-01-01

    Three-dimensional projection installations are particular kinds of augmented spaces in which a digital 3-D model is projected onto a physical three-dimensional object, thereby fusing the digital content and the physical object. Based on interaction design research and media studies, this article...... contributes to the understanding of the distinctive characteristics of such a new medium, and identifies three strategies for designing 3-D projection installations: establishing space; interplay between the digital and the physical; and transformation of materiality. The principal empirical case, From...... Fingerplan to Loop City, is a 3-D projection installation presenting the history and future of city planning for the Copenhagen area in Denmark. The installation was presented as part of the 12th Architecture Biennale in Venice in 2010....

  19. Interaktiv 3D design

    Villaume, René Domine; Ørstrup, Finn Rude

    2002-01-01

    The project investigates the potential of interactive 3D design via the Internet. Architect Jørn Utzon's project for Espansiva was developed as a building system with the goal of enabling a wide variety of floor plans and a wide variety of facade and spatial designs. The system's building components have been digitized as...... 3D elements and made available. Via the Internet it is now possible to assemble and try out an endless range of the building types that the system was conceived and developed for....

  20. Tangible 3D Modelling

    Hejlesen, Aske K.; Ovesen, Nis

    2012-01-01

    This paper presents an experimental approach to teaching 3D modelling techniques in an Industrial Design programme. The approach includes the use of tangible free form models as tools for improving the overall learning. The paper is based on lecturer and student experiences obtained through...

  1. Shaping 3-D boxes

    Stenholt, Rasmus; Madsen, Claus B.

    2011-01-01

    Enabling users to shape 3-D boxes in immersive virtual environments is a non-trivial problem. In this paper, a new family of techniques for creating rectangular boxes of arbitrary position, orientation, and size is presented and evaluated. These new techniques are based solely on position data...

  2. 3D Wire 2015

    Jordi, Moréton; F, Escribano; J. L., Farias

    This document is a general report on the implementation of gamification at the 3D Wire 2015 event. As this is the second gamification experience at the event, we have delved more deeply into the previous objectives (attracting the public to exhibition areas less frequented in previous years, and enhancing networking) and ha......, improves socialization and networking, improves media impact, improves fun factor and improves encouragement of the production team....

  3. On the Performance of the Python Programming Language for Serial and Parallel Scientific Computations

    Xing Cai

    2005-01-01

    This article addresses the performance of scientific applications that use the Python programming language. First, we investigate several techniques for improving the computational efficiency of serial Python codes. Then, we discuss the basic programming techniques in Python for parallelizing serial scientific applications. It is shown that an efficient implementation of the array-related operations is essential for achieving good parallel performance, as for the serial case. Once the array-related operations are efficiently implemented, probably using a mixed-language implementation, good serial and parallel performance become achievable. This is confirmed by a set of numerical experiments. Python is also shown to be well suited for writing high-level parallel programs.

  4. Concurrent extensions to the FORTRAN language for parallel programming of computational fluid dynamics algorithms

    Weeks, Cindy Lou

    1986-01-01

    Experiments were conducted at NASA Ames Research Center to define multi-tasking software requirements for multiple-instruction, multiple-data stream (MIMD) computer architectures. The focus was on specifying solutions for algorithms in the field of computational fluid dynamics (CFD). The program objectives were to allow researchers to produce usable parallel application software as soon as possible after acquiring MIMD computer equipment, to provide researchers with an easy-to-learn and easy-to-use parallel software language which could be implemented on several different MIMD machines, and to enable researchers to list preferred design specifications for future MIMD computer architectures. Analysis of CFD algorithms indicated that extensions of an existing programming language, adaptable to new computer architectures, provided the best solution to meeting program objectives. The CoFORTRAN Language was written in response to these objectives and to provide researchers a means to experiment with parallel software solutions to CFD algorithms on machines with parallel architectures.

  5. Massively parallel code named NEPTUNE for 3D fully electromagnetic and PIC simulations

    董烨; 董志伟; 周海京; 陈虹; 莫则尧; 陈军; 杨温渊; 赵强; 夏芳; 肖丽; 马彦; 廖丽; 孙会芳

    2011-01-01

    A massively parallel, independently developed code named NEPTUNE for 3D fully electromagnetic and particle-in-cell (PIC) simulations is introduced, which can run stably on Linux systems with hundreds or even thousands of CPUs. NEPTUNE is capable of three-dimensional simulation of various typical high power microwave (HPM) source devices. In the NEPTUNE code, electromagnetic fields are updated by using the finite-difference time-domain (FDTD) method to solve the Maxwell equations, and particles are moved by using the Buneman-Boris method to solve the relativistic Newton-Lorentz equation. The electromagnetic fields and particles are coupled by using the linear-weighting interpolation PIC method, and the divergence of the electric field is corrected by using the Boris method to solve the Poisson equation, in order to ensure computational accuracy. The code has a preliminary capability for modeling complex geometries and can build geometric models of structures commonly found in typical HPM devices, such as axisymmetric bodies with arbitrary boundary shapes, bodies defined by orthogonal projection surfaces, slow-wave structures, coupling holes, metal wires, and curved thin films. Physical boundaries, including perfect conductor boundaries, externally applied wave boundaries, particle emission and absorption boundaries, and perfectly matched layer boundaries, are applied on the geometric boundaries so that the numerical problem is closed. Finally, worked examples present NEPTUNE simulations of typical HPM source devices, namely a magnetically insulated line oscillator, a relativistic backward wave oscillator, a virtual cathode oscillator, and a relativistic klystron; these verify the reliability of the simulation results and report the distribution of parallel efficiency.
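
    The Boris-type particle update named in the abstract can be illustrated with a short sketch. This is the textbook non-relativistic Boris rotation, offered as a simplification under stated assumptions: NEPTUNE itself is described as solving the relativistic Newton-Lorentz equation, and the function names and argument layout here are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as sequences."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def boris_push(v, E, B, q_over_m, dt):
    """Advance velocity v by one step dt (non-relativistic Boris scheme).

    The electric impulse is split into two half kicks around a rotation
    about the magnetic field; the rotation leaves |v| unchanged.
    """
    half = 0.5 * q_over_m * dt
    v_minus = [v[i] + half * E[i] for i in range(3)]     # first half kick
    t = [half * B[i] for i in range(3)]                   # rotation vector
    t2 = sum(c * c for c in t)
    s = [2.0 * c / (1.0 + t2) for c in t]
    v_prime = [v_minus[i] + c for i, c in enumerate(cross(v_minus, t))]
    v_plus = [v_minus[i] + c for i, c in enumerate(cross(v_prime, s))]
    return [v_plus[i] + half * E[i] for i in range(3)]    # second half kick
```

    In a pure magnetic field the scheme preserves the particle speed to rounding error, which is one reason Boris-type pushers are widely used in PIC codes.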

  6. Parallel Libraries to support High-Level Programming

    Larsen, Morten Nørgaard

    model requires the programmer to think a bit differently, but at the same time the implemented algorithms will perform very well, as shown by the initial tests presented. In the second part of this thesis, I will change focus from the CELL-BE architecture to the more traditionally x86 architecture...... of the more exotic though short-lived heterogeneous CELL Broadband Engine (CELL-BE) architecture added to this shift. Furthermore, the use of cluster computers made of commodity hardware and specialized supercomputers have greatly increased in both industry as well as in the academic world. Finally...... as they would be a single machine. In between is a number of tools helping the programmers handle communication, share data, run loops in parallel, handle algorithms mining huge amounts of data etc. Even though most of them do a good job performance-wise, almost all of them require that the programmers learn...

  7. Managing Algorithmic Skeleton Nesting Requirements in Realistic Image Processing Applications: The Case of the SKiPPER-II Parallel Programming Environment's Operating Model

    Duculty Florent

    2005-01-01

    SKiPPER is a SKeleton-based Parallel Programming EnviRonment being developed since 1996 and running at LASMEA Laboratory, the Blaise-Pascal University, France. The main goal of the project was to demonstrate the applicability of skeleton-based parallel programming techniques to the fast prototyping of reactive vision applications. This paper deals with the special features embedded in the latest version of the project: algorithmic skeleton nesting capabilities and a fully dynamic operating model. Through the case study of a complete and realistic image processing application, in which we have pointed out the requirement for skeleton nesting, we present the operating model of this feature. The work described here is one of the few reported experiments showing the application of skeleton nesting facilities for the parallelisation of a realistic application, especially in the area of image processing. The image processing application we have chosen is a 3D face-tracking algorithm from appearance.

  8. Using CLIPS in the domain of knowledge-based massively parallel programming

    Dvorak, Jiri J.

    1994-01-01

    The Program Development Environment (PDE) is a tool for massively parallel programming of distributed-memory architectures. Adopting a knowledge-based approach, the PDE eliminates the complexity introduced by parallel hardware with distributed memory and offers complete transparency in respect of parallelism exploitation. The knowledge-based part of the PDE is realized in CLIPS. Its principal task is to find an efficient parallel realization of the application specified by the user in a comfortable, abstract, domain-oriented formalism. A large collection of fine-grain parallel algorithmic skeletons, represented as COOL objects in a tree hierarchy, contains the algorithmic knowledge. A hybrid knowledge base with rule modules and procedural parts, encoding expertise about application domain, parallel programming, software engineering, and parallel hardware, enables a high degree of automation in the software development process. In this paper, important aspects of the implementation of the PDE using CLIPS and COOL are shown, including the embedding of CLIPS with C++-based parts of the PDE. The appropriateness of the chosen approach and of the CLIPS language for knowledge-based software engineering are discussed.

  9. Guide to development of a scalar massive parallel programming on Paragon

    Ueshima, Yutaka; Arakawa, Takuya; Sasaki, Akira [Japan Atomic Energy Research Inst., Neyagawa, Osaka (Japan). Kansai Research Establishment; Yokota, Hisasi

    1998-10-01

    Parallel calculations using more than a hundred computers began in Japan only several years ago. The Intel Paragon XP/S 15GP256 and 75MP834 were introduced as pioneers at the Japan Atomic Energy Research Institute (JAERI) to pursue massively parallel simulations for advanced photon and fusion research. Recently, a large number of parallel programs have been ported or newly written to perform parallel calculations on those computers. However, these programs were developed using software technologies for conventional supercomputers, so they sometimes cause trouble in massively parallel computing. In principle, when programs are developed for a different computer and operating system (OS), careful guidance and knowledge are needed. However, integrating knowledge and standardizing the environment are quite difficult because the number of Paragon systems and Paragon users in Japan is very small. Therefore, we have summarized the information gained through the development of a massively parallel program on the Paragon XP/S 75MP834. (author)

  10. Unoriented 3d TFTs

    Bhardwaj, Lakshya

    2016-01-01

    This paper generalizes two facts about oriented 3d TFTs to the unoriented case. On one hand, it is known that oriented 3d TFTs having a topological boundary condition admit a state-sum construction known as the Turaev-Viro construction. This is related to the string-net construction of fermionic phases of matter. We show how the Turaev-Viro construction can be generalized to unoriented 3d TFTs. On the other hand, it is known that the "fermionic" versions of oriented TFTs, known as Spin-TFTs, can be constructed in terms of "shadow" TFTs which are ordinary oriented TFTs with an anomalous Z_2 1-form symmetry. We generalize this correspondence to Pin+ TFTs by showing that they can be constructed in terms of ordinary unoriented TFTs with anomalous Z_2 1-form symmetry having a mixed anomaly with time-reversal symmetry. The corresponding Pin+ TFT does not have any anomaly for time-reversal symmetry, however, and hence it can be unambiguously defined on a non-orientable manifold. In case a Pin+ TFT admits a topological bou...

  11. Parallelized Solution to Semidefinite Programmings in Quantum Complexity Theory

    Wu, Xiaodi

    2010-01-01

    In this paper we present an equilibrium-value-based framework for solving SDPs via the multiplicative weight update method, which is different from the one in Kale's thesis [Kale07]. One of the main advantages of the new framework is that we can guarantee the convertibility from approximate to exact feasibility in a much more general class of SDPs than previous results. Another advantage is that the design of the oracle, which is necessary for applying the multiplicative weight update method, is much simplified in general cases. This leads to alternative and easier solutions to the SDPs used in the previous results QIP(2) ⊆ PSPACE [JainUW09] and QMAM = PSPACE [JainJUW09]. Furthermore, we provide a generic form of SDPs which can be solved in a similar way. By parallelizing every step in our solution, we are able to solve a class of SDPs in NC. Although our motivation is from quantum computing, our result will also apply directly to any SDP which satisfie...
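
    For readers unfamiliar with the multiplicative weight update method, the sketch below shows its simplest scalar form: approximating the equilibrium value of a small zero-sum matrix game. This is a deliberately swapped-in toy setting, not the paper's matrix-valued framework for SDPs; the function name, the step-size choice, and the assumption that losses lie in [-1, 1] are all illustrative.

```python
import math


def mwu_game_value(A, rounds=2000):
    """Approximate the value of a zero-sum game with loss matrix A.

    A[i][j] is the row player's loss (assumed in [-1, 1]) when row i meets
    column j. The row player runs multiplicative weights; the column player
    answers each round with a best response.
    """
    n, m = len(A), len(A[0])
    eta = math.sqrt(math.log(n) / rounds)   # assumed step size
    p = [1.0 / n] * n                        # row player's mixed strategy
    total = 0.0
    for _ in range(rounds):
        # Expected loss of the current strategy against each column.
        col_loss = [sum(p[i] * A[i][j] for i in range(n)) for j in range(m)]
        j = max(range(m), key=col_loss.__getitem__)   # adversary's reply
        total += col_loss[j]
        # Multiplicative update, then renormalise to a distribution.
        w = [p[i] * math.exp(-eta * A[i][j]) for i in range(n)]
        norm = sum(w)
        p = [x / norm for x in w]
    return total / rounds
```

    Because the column player always best-responds, the returned average can only overestimate the game value, and the gap shrinks as the number of rounds grows.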

  12. Programming a massively parallel, computation universal system: static behavior

    Lapedes, A.; Farber, R.

    1986-01-01

    In previous work by the authors, the "optimum finding" properties of Hopfield neural nets were applied to the nets themselves to create a "neural compiler." This was done in such a way that the problem of programming the attractors of one neural net (called the Slave net) was expressed as an optimization problem that was in turn solved by a second neural net (the Master net). In this series of papers that approach is extended to programming nets that contain interneurons (sometimes called "hidden neurons"), and thus deals with nets capable of universal computation. 22 refs.

  13. Parallelizing Deadlock Resolution in Symbolic Synthesis of Distributed Programs

    2008-01-01

    follows. In Sections 2 and 3, we present precise definitions for distributed programs, specifications, and fault-tolerance. We formally state the...Subsequently, experimental results and analysis are presented in Section 6. Related work is discussed in Section 7. Finally, we conclude in Section...infinite computation by stuttering at sl. On the other hand, if there exists a state sd such that there is no outgoing transition (or a self-loop

  14. Concurrent Programming Using Actors: Exploiting Large-Scale Parallelism

    1985-10-07

    Report documentation page (OCR residue): Massachusetts Institute of Technology, Artificial Intelligence Laboratory, 545 Technology Square, Cambridge; G. Agha et al.

  15. Task scheduling of parallel programs to optimize communications for cluster of SMPs

    郑纬民; 杨博; 林伟坚; 李志光

    2001-01-01

    This paper discusses compile-time task scheduling of parallel programs running on clusters of SMP workstations. Firstly, the problem is stated formally, transformed into a graph partition problem, and proved to be NP-complete. A heuristic algorithm, MMP-Solver, is then proposed to solve the problem. Experimental results show that the task scheduling can greatly reduce the communication overhead of parallel applications and that MMP-Solver outperforms the existing algorithms.

  16. Describing, using 'recognition cones'. [parallel-series model with English-like computer program

    Uhr, L.

    1973-01-01

    A parallel-serial 'recognition cone' model is examined, taking into account the model's ability to describe scenes of objects. An actual program is presented in an English-like language. The concept of a 'description' is discussed together with possible types of descriptive information. Questions regarding the level and the variety of detail are considered along with approaches for improving the serial representations of parallel systems.

  17. Efficient Parallelization of the Stochastic Dual Dynamic Programming Algorithm Applied to Hydropower Scheduling

    Arild Helseth

    2015-12-01

    Stochastic dual dynamic programming (SDDP) has become a popular algorithm used in practical long-term scheduling of hydropower systems. The SDDP algorithm is computationally demanding, but can be designed to take advantage of parallel processing. This paper presents a novel parallel scheme for the SDDP algorithm, where the stage-wise synchronization point traditionally used in the backward iteration of the SDDP algorithm is partially relaxed. The proposed scheme was tested on a realistic model of a Norwegian water course, proving that the synchronization point relaxation significantly improves parallel efficiency.

  18. 3D Reconstruction of NMR Images

    Peter Izak

    2007-01-01

    This paper presents an experiment in the 3D reconstruction of NMR images acquired with a magnetic resonance device. It describes methods that can be used for 3D reconstruction of magnetic resonance images in biomedical applications. The main idea is based on the marching cubes algorithm. For this task, a sophisticated method from the Vision Assistant program, which is part of LabVIEW, was chosen.

  19. Discrete Method of Images for 3D Radio Propagation Modeling

    Novak, Roman

    2016-09-01

    Discretization by rasterization is introduced into the method of images (MI) in the context of 3D deterministic radio propagation modeling as a way to exploit spatial coherence of electromagnetic propagation for fine-grained parallelism. Traditional algebraic treatment of bounding regions and surfaces is replaced by computer graphics rendering of 3D reflections and double refractions while building the image tree. The visibility of reception points and surfaces is also resolved by shader programs. The proposed rasterization is shown to be of comparable run time to that of the fundamentally parallel shooting and bouncing rays. The rasterization does not affect the signal evaluation backtracking step, thus preserving its advantage over the brute force ray-tracing methods in terms of accuracy. Moreover, the rendering resolution may be scaled back for a given level of scenario detail with only marginal impact on the image tree size. This allows selection of scene optimized execution parameters for faster execution, giving the method a competitive edge. The proposed variant of MI can be run on any GPU that supports real-time 3D graphics.
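
    The building block of any method-of-images solver is mirroring a source across a reflecting plane; the length of a single-bounce path is then the straight-line distance from the mirrored source to the receiver. The sketch below shows only that classical algebraic step, not the rasterized GPU variant the abstract proposes. The function names are hypothetical, and the plane is assumed infinite, so no check is made that the reflection point lies on a finite surface.

```python
import math


def reflect_point(p, normal, d):
    """Mirror point p across the plane {x : normal . x = d}."""
    nn = sum(c * c for c in normal)
    # Twice the signed offset of p from the plane, in units of `normal`.
    k = 2.0 * (sum(a * b for a, b in zip(p, normal)) - d) / nn
    return tuple(a - k * b for a, b in zip(p, normal))


def reflected_path_length(tx, rx, normal, d):
    """Length of the single-bounce path from tx to rx via the plane."""
    return math.dist(reflect_point(tx, normal, d), rx)


# Transmitter 1 m and receiver 2 m above the ground plane z = 0,
# separated horizontally by 4 m.
length = reflected_path_length((0.0, 0.0, 1.0), (4.0, 0.0, 2.0),
                               (0.0, 0.0, 1.0), 0.0)
```

    Higher-order reflections repeat the same mirroring on the image points, which is how the image tree mentioned in the abstract grows.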

  20. Generalized recovery algorithm for 3D super-resolution microscopy using rotating point spread functions

    Shuang, Bo; Wang, Wenxiao; Shen, Hao; Tauzin, Lawrence J.; Flatebo, Charlotte; Chen, Jianbo; Moringo, Nicholas A.; Bishop, Logan D. C.; Kelly, Kevin F.; Landes, Christy F.

    2016-08-01

    Super-resolution microscopy with phase masks is a promising technique for 3D imaging and tracking. Due to the complexity of the resultant point spread functions, generalized recovery algorithms are still missing. We introduce a 3D super-resolution recovery algorithm that works for a variety of phase masks generating 3D point spread functions. A fast deconvolution process generates initial guesses, which are further refined by least squares fitting. Overfitting is suppressed using a machine learning determined threshold. Preliminary results on experimental data show that our algorithm can be used to super-localize 3D adsorption events within a porous polymer film and is useful for evaluating potential phase masks. Finally, we demonstrate that parallel computation on graphics processing units can reduce the processing time required for 3D recovery. Simulations reveal that, through desktop parallelization, the ultimate limit of real-time processing is possible. Our program is the first open source recovery program for generalized 3D recovery using rotating point spread functions.

  1. CRBLASTER: A Fast Parallel-Processing Program for Cosmic Ray Rejection in Space-Based Observations

    Mighell, K.

    Many astronomical image analysis tasks are based on algorithms that can be described as being embarrassingly parallel - where the analysis of one subimage generally does not affect the analysis of another subimage. Yet few parallel-processing astrophysical image-analysis programs exist that can easily take full advantage of today's fast multi-core servers costing a few thousands of dollars. One reason for the shortage of state-of-the-art parallel processing astrophysical image-analysis codes is that the writing of parallel codes has been perceived to be difficult. I describe a new fast parallel-processing image-analysis program called CRBLASTER which does cosmic ray rejection using van Dokkum's L.A.Cosmic algorithm. CRBLASTER is written in C using the industry standard Message Passing Interface library. Processing a single 800 x 800 Hubble Space Telescope Wide-Field Planetary Camera 2 (WFPC2) image takes 1.9 seconds using 4 processors on an Apple Xserve with two dual-core 3.0-GHz Intel Xeons; the efficiency of the program running with the 4 cores is 82%. The code has been designed to be used as a software framework for the easy development of parallel-processing image-analysis programs using embarrassingly parallel algorithms; all that needs to be done is to replace the core image processing task (in this case the C function that performs the L.A.Cosmic algorithm) with an alternative image analysis task based on a single processor algorithm. I describe the design and implementation of the program and then discuss how it could possibly be used to quickly do time-critical analysis applications such as those involved with space surveillance or do complex calibration tasks as part of the pipeline processing of images from large focal plane arrays.
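
    The framework idea described here (split the image into subimages, run a single-processor task on each, reassemble) can be sketched in a few lines. This is an illustration only: CRBLASTER is written in C with MPI, whereas the sketch uses a Python thread pool as a stand-in, and `clean_band` is a hypothetical placeholder task, not the L.A.Cosmic algorithm.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def split_rows(image, n_parts):
    """Split a list-of-rows image into n_parts contiguous row bands."""
    rows = len(image)
    bounds = [rows * k // n_parts for k in range(n_parts + 1)]
    return [image[bounds[k]:bounds[k + 1]] for k in range(n_parts)]


def clean_band(band, threshold=5.0):
    """Placeholder per-subimage task: reset pixels far above the band median."""
    med = median(p for row in band for p in row)
    return [[med if p - med > threshold else p for p in row] for row in band]


def process_image(image, n_workers=4, task=clean_band):
    """Apply `task` to each band independently, then stitch the bands back."""
    bands = [b for b in split_rows(image, n_workers) if b]  # skip empty bands
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        cleaned = list(pool.map(task, bands))                # order preserved
    return [row for band in cleaned for row in band]
```

    Swapping in a different analysis only requires passing another function as `task`, which mirrors the replace-the-core-function design the abstract describes.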

  2. 3D and beyond

    Fung, Y. C.

    1995-05-01

    This conference on physiology and function covers a wide range of subjects, including the vasculature and blood flow, the flow of gas, water, and blood in the lung, the neurological structure and function, the modeling, and the motion and mechanics of organs. Many technologies are discussed. I believe that the list would include a robotic photographer, to hold the optical equipment in a precisely controlled way to obtain the images for the user. Why are 3D images needed? They are to achieve certain objectives through measurements of some objects. For example, in order to improve performance in sports or beauty of a person, we measure the form, dimensions, appearance, and movements.

  3. A Tool for Performance Modeling of Parallel Programs

    J.A. González

    2003-01-01

    Current performance prediction analytical models try to characterize the performance behavior of actual machines through a small set of parameters. In practice, substantial deviations are observed. These differences are due to factors such as memory hierarchies or network latency. A natural approach is to associate a different proportionality constant with each basic block, and analogously, to associate different latencies and bandwidths with each "communication block". Unfortunately, using this approach implies that the evaluation of parameters must be done for each algorithm. This is a heavy task, implying experiment design, timing, statistics, pattern recognition and multi-parameter fitting algorithms. Software support is required. We present a compiler that takes as source a C program annotated with complexity formulas and produces as output an instrumented code. The trace files obtained from the execution of the resulting code are analyzed with an interactive interpreter, giving us, among other information, the values of those parameters.
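
    The per-block modeling idea can be made concrete with a small sketch: each compute block gets its own proportionality constant fitted from timings, and each communication block gets its own latency and per-word cost. Everything here is an assumption for illustration (the function names, the through-the-origin least-squares fit, and the linear communication cost); it does not reproduce the tool's annotation language or its fitting procedure.

```python
def fit_constant(sizes, times, complexity):
    """Least-squares constant c in the model time ~ c * complexity(n)."""
    f = [complexity(n) for n in sizes]
    return sum(t * fi for t, fi in zip(times, f)) / sum(fi * fi for fi in f)


def predict_time(n, compute_blocks, comm_blocks):
    """Predicted run time for problem size n.

    compute_blocks: list of (constant, complexity_fn) pairs, one per block.
    comm_blocks: list of (latency, per_word_cost, words_fn) triples.
    """
    t = sum(c * g(n) for c, g in compute_blocks)
    t += sum(lat + per_word * words(n) for lat, per_word, words in comm_blocks)
    return t


# Synthetic timings for a block annotated as O(n^2); not measured data.
sizes = [10, 20, 40]
times = [2e-6 * n * n for n in sizes]
c = fit_constant(sizes, times, lambda n: n * n)
```

    With one fitted constant per block, a different memory-hierarchy behavior in each block shows up as a different constant instead of as unexplained deviation from a single global parameter.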

  4. Remote Memory Access: A Case for Portable, Efficient and Library Independent Parallel Programming

    Alexandros V. Gerbessiotis

    2004-01-01

    In this work we make a strong case for remote memory access (RMA) as the effective way to program a parallel computer by proposing a framework that supports RMA in a library independent, simple and intuitive way. If one uses our approach, the parallel code one writes will run transparently not only under MPI-2 enabled libraries but also under bulk-synchronous parallel libraries. The advantages of using RMA are code simplicity, reduced programming complexity, and increased efficiency. We support the latter claims by implementing under this framework a collection of benchmark programs consisting of a communication and synchronization performance assessment program, a dense matrix multiplication algorithm, and two variants of a parallel radix-sort algorithm, and we examine their performance on a LINUX-based PC cluster under three different RMA enabled libraries: LAM MPI, BSPlib, and PUB. We conclude that implementations of such parallel algorithms using RMA communication primitives lead to code that is as efficient as the message-passing equivalent code and in the case of radix-sort substantially more efficient. In addition our work can be used as a comparative study of the relevant capabilities of the three libraries.

  5. 3D Surgical Simulation

    Cevidanes, Lucia; Tucker, Scott; Styner, Martin; Kim, Hyungmin; Chapuis, Jonas; Reyes, Mauricio; Proffit, William; Turvey, Timothy; Jaskolka, Michael

    2009-01-01

    This paper discusses the development of methods for computer-aided jaw surgery. Computer-aided jaw surgery allows us to incorporate the high level of precision necessary for transferring virtual plans into the operating room. We also present a complete computer-aided surgery (CAS) system developed in close collaboration with surgeons. Surgery planning and simulation include construction of 3D surface models from Cone-beam CT (CBCT), dynamic cephalometry, semi-automatic mirroring, interactive cutting of bone and bony segment repositioning. A virtual setup can be used to manufacture positioning splints for intra-operative guidance. The system provides further intra-operative assistance with the help of a computer display showing jaw positions and 3D positioning guides updated in real-time during the surgical procedure. The CAS system aids in dealing with complex cases with benefits for the patient, with surgical practice, and for orthodontic finishing. Advanced software tools for diagnosis and treatment planning allow preparation of detailed operative plans, osteotomy repositioning, bone reconstructions, surgical resident training and assessing the difficulties of the surgical procedures prior to the surgery. CAS has the potential to make the elaboration of the surgical plan a more flexible process, increase the level of detail and accuracy of the plan, yield higher operative precision and control, and enhance documentation of cases. Supported by NIDCR DE017727, and DE018962 PMID:20816308

  6. TOWARDS: 3D INTERNET

    Ms. Swapnali R. Ghadge

    2013-08-01

    In today’s ever-shifting media landscape, it can be a complex task to find effective ways to reach your desired audience. As traditional media such as television continue to lose audience share, one venue in particular stands out for its ability to attract highly motivated audiences and for its tremendous growth potential: the 3D Internet. The concept of '3D Internet' has recently come into the spotlight in the R&D arena, catching the attention of many people, and leading to a lot of discussions. Basically, one can look into this matter from a few different perspectives: visualization and representation of information, and creation and transportation of information, among others. All of them still constitute research challenges, as no products or services are yet available or foreseen for the near future. Nevertheless, one can try to envisage the directions that can be taken towards achieving this goal. People who take part in virtual worlds stay online longer with a heightened level of interest. To take advantage of that interest, diverse businesses and organizations have claimed an early stake in this fast-growing market. They include technology leaders such as IBM, Microsoft, and Cisco, companies such as BMW, Toyota, Circuit City, Coca Cola, and Calvin Klein, and scores of universities, including Harvard, Stanford and Penn State.

  7. High performance parallelism pearls 2 multicore and many-core programming approaches

    Jeffers, Jim

    2015-01-01

    High Performance Parallelism Pearls Volume 2 offers another set of examples that demonstrate how to leverage parallelism. Similar to Volume 1, the techniques included here explain how to use processors and coprocessors with the same programming - illustrating the most effective ways to combine Xeon Phi coprocessors with Xeon and other multicore processors. The book includes examples of successful programming efforts, drawn from across industries and domains such as biomed, genetics, finance, manufacturing, imaging, and more. Each chapter in this edited work includes detailed explanations of t

  8. Empirical valence bond models for reactive potential energy surfaces: a parallel multilevel genetic program approach.

    Bellucci, Michael A; Coker, David F

    2011-07-28

    We describe a new method for constructing empirical valence bond potential energy surfaces using a parallel multilevel genetic program (PMLGP). Genetic programs can be used to perform an efficient search through function space and parameter space to find the best functions and sets of parameters that fit energies obtained by ab initio electronic structure calculations. Building on the traditional genetic program approach, the PMLGP utilizes a hierarchy of genetic programming on two different levels. The lower level genetic programs are used to optimize coevolving populations in parallel while the higher level genetic program (HLGP) is used to optimize the genetic operator probabilities of the lower level genetic programs. The HLGP allows the algorithm to dynamically learn the mutation or combination of mutations that most effectively increase the fitness of the populations, causing a significant increase in the algorithm's accuracy and efficiency. The algorithm's accuracy and efficiency is tested against a standard parallel genetic program with a variety of one-dimensional test cases. Subsequently, the PMLGP is utilized to obtain an accurate empirical valence bond model for proton transfer in 3-hydroxy-gamma-pyrone in gas phase and protic solvent.

  9. Scalable parallel programming for high performance seismic simulation on petascale heterogeneous supercomputers

    Zhou, Jun

    The 1994 Northridge earthquake in Los Angeles, California, killed 57 people, injured over 8,700 and caused an estimated $20 billion in damage. Petascale simulations are needed in California and elsewhere to provide society with a better understanding of the rupture and wave dynamics of the largest earthquakes at shaking frequencies required to engineer safe structures. As heterogeneous supercomputing infrastructures are becoming more common, numerical developments in earthquake system research are particularly challenged by the dependence on the accelerator elements to enable "the Big One" simulations with higher frequency and finer resolution. Reducing time to solution and power consumption are two primary focus areas today for the enabling technology of fault rupture dynamics and seismic wave propagation in realistic 3D models of the crust's heterogeneous structure. This dissertation presents scalable parallel programming techniques for high performance seismic simulation running on petascale heterogeneous supercomputers. A real world earthquake simulation code, AWP-ODC, one of the most advanced earthquake codes to date, was chosen as the base code in this research, and the testbed is based on Titan at Oak Ridge National Laboratory, the world's largest heterogeneous supercomputer. The research work is primarily related to architecture study, computation performance tuning and software system scalability. An earthquake simulation workflow has also been developed to support the efficient production sets of simulations. The highlights of the technical development are an aggressive performance optimization focusing on data locality and a notable data communication model that hides the data communication latency. This development results in the optimal computation efficiency and throughput for the 13-point stencil code on heterogeneous systems, which can be extended to general high-order stencil codes. Started from scratch, the hybrid CPU/GPU version of AWP

  10. “3D Turtle Graphics” by using a 3D Printer

    Yasusi Kanada

    2015-04-01

    When creating shapes by using a 3D printer, usually, a static (declarative) model designed by using a 3D CAD system is translated to a CAM program and sent to the printer. However, widely used FDM-type 3D printers take as input a dynamic (procedural) program that describes control of the motions of the print head and extrusion of the filament. If the program is expressed directly by using a programming language or a library, solids can be created by a method similar to turtle graphics. An open-source library that enables the “turtle 3D printing” method was written in Python and tested. This method currently has the problem that it cannot print in the air; if this problem is solved by an appropriate method, shapes drawn freely by 3D turtle graphics can be embodied by this method.

  11. Parallel programming of saccades during natural scene viewing: evidence from eye movement positions.

    Wu, Esther X W; Gilani, Syed Omer; van Boxtel, Jeroen J A; Amihai, Ido; Chua, Fook Kee; Yen, Shih-Cheng

    2013-10-24

    Previous studies have shown that saccade plans during natural scene viewing can be programmed in parallel. This evidence comes mainly from temporal indicators, i.e., fixation durations and latencies. In the current study, we asked whether eye movement positions recorded during scene viewing also reflect parallel programming of saccades. As participants viewed scenes in preparation for a memory task, their inspection of the scene was suddenly disrupted by a transition to another scene. We examined whether saccades after the transition were invariably directed immediately toward the center or were contingent on saccade onset times relative to the transition. The results, which showed a dissociation in eye movement behavior between two groups of saccades after the scene transition, supported the parallel programming account. Saccades with relatively long onset times (>100 ms) after the transition were directed immediately toward the center of the scene, probably to restart scene exploration. Saccades with short onset times (programming of saccades during scene viewing. Additionally, results from the analyses of intersaccadic intervals were also consistent with the parallel programming hypothesis.

  12. Grid Service Framework: Supporting Multi-Models Parallel Grid Programming

    邓倩妮; 陆鑫达

    2004-01-01

    Web service is a grid computing technology that promises greater ease-of-use and interoperability than previous distributed computing technologies. This paper proposes Group Service Framework, a grid computing platform based on Microsoft .NET that uses web services to: (1) locate and harness volunteer computing resources for different applications, (2) support multiple parallel programming paradigms in a Grid environment, such as Master/Slave, Divide and Conquer, Phase Parallel and so forth, and (3) allocate data and balance load dynamically and transparently for grid computing applications. The Grid Service Framework based on Microsoft .NET was used to implement several simple parallel computing applications. The results show that the proposed Group Service Framework is suitable for generic parallel numerical computing.

  13. 3D Membrane Imaging and Porosity Visualization

    Sundaramoorthi, Ganesh

    2016-03-03

    Ultrafiltration asymmetric porous membranes were imaged by two microscopy methods, which allow 3D reconstruction: Focused Ion Beam and Serial Block Face Scanning Electron Microscopy. A new algorithm was proposed to evaluate porosity and average pore size in different layers orthogonal and parallel to the membrane surface. The 3D-reconstruction enabled additionally the visualization of pore interconnectivity in different parts of the membrane. The method was demonstrated for a block copolymer porous membrane and can be extended to other membranes with application in ultrafiltration, supports for forward osmosis, etc, offering a complete view of the transport paths in the membrane.

  14. Architecture-Adaptive Computing Environment: A Tool for Teaching Parallel Programming

    Dorband, John E.; Aburdene, Maurice F.

    2002-01-01

    Recently, networked and cluster computation have become very popular. This paper is an introduction to a new C based parallel language for architecture-adaptive programming, aCe C. The primary purpose of aCe (Architecture-adaptive Computing Environment) is to encourage programmers to implement applications on parallel architectures by providing them the assurance that future architectures will be able to run their applications with a minimum of modification. A secondary purpose is to encourage computer architects to develop new types of architectures by providing an easily implemented software development environment and a library of test applications. This new language should be an ideal tool to teach parallel programming. In this paper, we will focus on some fundamental features of aCe C.

  15. Convex quadratic programming relaxations for parallel machine scheduling with controllable processing times subject to release times

    ZHANG Feng; CHEN Feng; TANG Guochun

    2004-01-01

    Scheduling unrelated parallel machines with controllable processing times subject to release times is investigated. Based on the convex quadratic programming relaxation and the randomized rounding strategy, a 2-approximation algorithm is obtained for a special case with the all-or-none property and then a 3-approximation algorithm is presented for the general problem.

  16. All-pairs Shortest Path Algorithm based on MPI+CUDA Distributed Parallel Programming Model

    Qingshuang Wu

    2013-12-01

    Computing shortest paths in a graph is a complex and time-consuming process, and traditional algorithms that rely solely on the CPU as the computing unit cannot meet the demand of real-time processing. In this paper, we present an all-pairs shortest paths algorithm using the MPI+CUDA hybrid programming model, which can make use of the computing power of a GPU cluster to speed up the processing. The proposed algorithm combines the advantages of the MPI and CUDA programming models and realizes two-level parallel computing. At the cluster level, we use the MPI programming model to achieve coarse-grained parallel computing between the computational nodes of the GPU cluster. At the node level, we use the CUDA programming model to achieve GPU-accelerated fine-grained parallel computing inside each computational node. The experimental results show that the MPI+CUDA-based parallel algorithm can take full advantage of the powerful computing capability of the GPU cluster and can achieve a speedup of about hundreds of times. The whole algorithm has good computing performance, reliability and scalability, and it is able to meet the demand of real-time processing of massive spatial shortest path analysis
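
    For orientation, the all-pairs shortest path problem can be stated with a short serial baseline. The sketch below uses the Floyd-Warshall algorithm, which is an assumption made for illustration: the abstract does not name the underlying algorithm, and nothing here reproduces the MPI or CUDA levels. The function name and the example graph are hypothetical.

```python
import math

INF = math.inf


def floyd_warshall(weights):
    """Serial all-pairs shortest paths on a dense weight matrix (INF means no edge)."""
    n = len(weights)
    dist = [row[:] for row in weights]  # copy so the input matrix is not modified
    for k in range(n):                  # allow vertex k as an intermediate stop
        for i in range(n):
            dik = dist[i][k]
            if dik == INF:
                continue                # no path i -> k, so k cannot improve row i
            for j in range(n):
                candidate = dik + dist[k][j]
                if candidate < dist[i][j]:
                    dist[i][j] = candidate
    return dist


# Hypothetical 4-vertex directed graph.
graph = [
    [0, 3, INF, 7],
    [8, 0, 2, INF],
    [5, INF, 0, 1],
    [2, INF, INF, 0],
]
shortest = floyd_warshall(graph)
for row in shortest:
    print(row)
```

    For a fixed k, the updates for different rows i are independent of one another, which is the kind of structure a two-level scheme could plausibly split across nodes and GPU threads; how the paper actually partitions the work is not stated in the abstract.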

  17. Parallel Implementation of a Semidefinite Programming Solver based on CSDP in a distributed memory cluster

    Ivanov, I.D.; de Klerk, E.

    2007-01-01

    In this paper we present the algorithmic framework and practical aspects of implementing a parallel version of a primal-dual semidefinite programming solver on a distributed memory computer cluster. Our implementation is based on the CSDP solver and uses a message passing interface (MPI), and the Sc

  18. Wireless 3D Chocolate Printer

    FROILAN G. DESTREZA

    2014-02-01

    This study is for the BSHRM Students of Batangas State University (BatStateU ARASOF), for the researchers believe that the Wireless 3D Chocolate Printer would be helpful in their degree program, especially in making creative, artistic, personalized and decorative chocolate designs. The researchers used the Prototyping model as the procedural method for the successful development and implementation of the hardware and software. This method has five phases, which are the following: quick plan, quick design, prototype construction, delivery and feedback, and communication. This study was evaluated by the BSHRM Students, and the assessments of the respondents regarding the software and hardware application are all excellent in terms of Accuracy, Effectiveness, Efficiency, Maintainability, Reliability and User-friendliness. Also, the overall level of acceptability of the design project as evaluated by the respondents is excellent. With regard to the observation about the best raw material to use in 3D printing, the chocolate is good to use as the printed material is slightly distorted, durable and very easy to prepare; the icing is also good to use as the printed material is not distorted and is very durable but consumes time to prepare; the flour is not good as the printed material is distorted, not durable but it is easy to prepare. The computation of the economic viability level of the 3D printer with reference to ROI is 37.14%. The recommendations of the researchers for the design project are as follows: adding a cooling system so that the raw material will be more durable, development of a more simplified version and improving the extrusion process so that the user does not need to stop the printing process just to replace the empty syringe with a new one.

  19. 3D printing for dummies

    Hausman, Kalani Kirk

    2014-01-01

    Get started printing out 3D objects quickly and inexpensively! 3D printing is no longer just a figment of your imagination. This remarkable technology is coming to the masses with the growing availability of 3D printers. 3D printers create 3-dimensional layered models and they allow users to create prototypes that use multiple materials and colors.  This friendly-but-straightforward guide examines each type of 3D printing technology available today and gives artists, entrepreneurs, engineers, and hobbyists insight into the amazing things 3D printing has to offer. You'll discover methods for

  20. Intraoral 3D scanner

    Kühmstedt, Peter; Bräuer-Burchardt, Christian; Munkelt, Christoph; Heinze, Matthias; Palme, Martin; Schmidt, Ingo; Hintersehr, Josef; Notni, Gunther

    2007-09-01

    Here a new set-up of a 3D-scanning system for CAD/CAM in the dental industry is proposed. The system is designed for direct scanning of the dental preparations within the mouth. The measuring process is based on phase correlation technique in combination with fast fringe projection in a stereo arrangement. The novelty in the approach is characterized by the following features: A phase correlation between the phase values of the images of two cameras is used for the co-ordinate calculation. This works contrary to the usage of only phase values (phasogrammetry) or classical triangulation (phase values and camera image co-ordinate values) for the determination of the co-ordinates. The main advantage of the method is that the absolute value of the phase at each point does not directly determine the co-ordinate. Thus errors in the determination of the co-ordinates are prevented. Furthermore, using the epipolar geometry of the stereo-like arrangement the phase unwrapping problem of fringe analysis can be solved. The endoscope-like measurement system contains one projection and two camera channels for illumination and observation of the object, respectively. The new system has a measurement field of nearly 25mm × 15mm. The user can measure two or three teeth at one time. So the system can be used for scanning of single tooth up to bridges preparations. In the paper the first realization of the intraoral scanner is described.

  1. Dynamic Frames Based Generation of 3D Scenes and Applications

    Kvesić, Anton; Radošević, Danijel; Orehovački, Tihomir

    2015-01-01

    Modern graphic/programming tools like Unity enable the creation of 3D scenes as well as 3D scene based program applications, including full physical model, motion, sounds, lighting effects etc. This paper deals with the usage of dynamic frames based generator in the automatic generation of 3D scene and related source code. The suggested model enables the possibility to specify features of the 3D scene in a form of textual specification, as well as exporting such features ...

  2. Cellular Microcultures: Programming Mechanical and Physicochemical Properties of 3D Hydrogel Cellular Microcultures via Direct Ink Writing (Adv. Healthcare Mater. 9/2016).

    McCracken, Joselle M; Badea, Adina; Kandel, Mikhail E; Gladman, A Sydney; Wetzel, David J; Popescu, Gabriel; Lewis, Jennifer A; Nuzzo, Ralph G

    2016-05-01

    R. Nuzzo and co-workers show on page 1025 how compositional differences in hydrogels are used to tune their cellular compliance by controlling their polymer mesh properties and subsequent uptake of the protein poly-l-lysine (green spheres in circled inset). The cover image shows pyramid micro-scaffolds prepared using direct ink writing (DIW) that differentially direct fibroblast and preosteoblast growth in 3D, depending on cell motility and surface treatment.

  3. LB3D: a protein three-dimensional substructure search program based on the lower bound of a root mean square deviation value.

    Terashi, Genki; Shibuya, Tetsuo; Takeda-Shitaka, Mayuko

    2012-05-01

    Searching for protein structure-function relationships using three-dimensional (3D) structural coordinates represents a fundamental approach for determining the function of proteins with unknown functions. Since protein structure databases are rapidly growing in size, the development of a fast search method to find similar protein substructures by comparison of protein 3D structures is essential. In this article, we present a novel protein 3D structure search method to find all substructures with root mean square deviations (RMSDs) to the query structure that are lower than a given threshold value. Our new algorithm runs in O(m + N/m^0.5) time, after O(N log N) preprocessing, where N is the database size and m is the query length. The new method is 1.8-41.6 times faster than the practically best known O(N) algorithm, according to computational experiments using a huge database (i.e., >20,000,000 C-alpha coordinates).
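
    To make the search problem concrete, the sketch below scans every window of a coordinate list and reports those whose RMSD to the query is lower than a threshold. It is a naive baseline only, and it rests on two stated simplifications: the RMSD is computed on the coordinates as given, with no optimal superposition, and none of the lower-bound pruning of LB3D is reproduced. All names and the toy coordinates are hypothetical.

```python
import math


def rmsd(a, b):
    """Root mean square deviation between two equal-length lists of (x, y, z) points.

    Simplification: no rotation or translation is applied before comparing.
    """
    if len(a) != len(b) or not a:
        raise ValueError("point lists must be non-empty and of equal length")
    squared = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(squared / len(a))


def naive_substructure_search(database, query, threshold):
    """Return start indices of all windows whose RMSD to the query is below threshold."""
    m = len(query)
    return [
        start
        for start in range(len(database) - m + 1)
        if rmsd(database[start:start + m], query) < threshold
    ]


# Toy data standing in for C-alpha coordinates.
database = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 1.0, 0.0), (4.0, 2.0, 0.0)]
query = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 1.0, 0.0)]
hits = naive_substructure_search(database, query, threshold=0.5)
print(hits)
```

    This scan costs on the order of N·m coordinate comparisons, which is the kind of cost a lower-bound-based method is meant to avoid.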

  4. LDRD final report on massively-parallel linear programming : the parPCx system.

    Parekh, Ojas (Emory University, Atlanta, GA); Phillips, Cynthia Ann; Boman, Erik Gunnar

    2005-02-01

    This report summarizes the research and development performed from October 2002 to September 2004 at Sandia National Laboratories under the Laboratory-Directed Research and Development (LDRD) project "Massively-Parallel Linear Programming". We developed a linear programming (LP) solver designed to use a large number of processors. LP is the optimization of a linear objective function subject to linear constraints. Companies and universities have expended huge efforts over decades to produce fast, stable serial LP solvers. Previous parallel codes run on shared-memory systems and have little or no distribution of the constraint matrix. We have seen no reports of general LP solver runs on large numbers of processors. Our parallel LP code is based on an efficient serial implementation of Mehrotra's interior-point predictor-corrector algorithm (PCx). The computational core of this algorithm is the assembly and solution of a sparse linear system. We have substantially rewritten the PCx code and based it on Trilinos, the parallel linear algebra library developed at Sandia. Our interior-point method can use either direct or iterative solvers for the linear system. To achieve a good parallel data distribution of the constraint matrix, we use a (pre-release) version of a hypergraph partitioner from the Zoltan partitioning library. We describe the design and implementation of our new LP solver called parPCx and give preliminary computational results. We summarize a number of issues related to efficient parallel solution of LPs with interior-point methods including data distribution, numerical stability, and solving the core linear system using both direct and iterative methods. We describe a number of applications of LP specific to US Department of Energy mission areas and we summarize our efforts to integrate parPCx (and parallel LP solvers in general) into Sandia's massively-parallel integer programming solver PICO (Parallel Integer and

  5. Teaching Scientific Computing: A Model-Centered Approach to Pipeline and Parallel Programming with C

    Vladimiras Dolgopolovas

    2015-01-01

    The aim of this study is to present an approach to the introduction to pipeline and parallel computing, using a model of the multiphase queueing system. Pipeline computing, including software pipelines, is among the key concepts in modern computing and electronics engineering. Modern computer science and engineering education requires a comprehensive curriculum, so the introduction to pipeline and parallel computing is an essential topic to be included in the curriculum. At the same time, the topic is among the most motivating tasks due to the comprehensive multidisciplinary and technical requirements. To enhance the educational process, the paper proposes a novel model-centered framework and develops the relevant learning objects. It allows implementing an educational platform of constructivist learning process, thus enabling learners’ experimentation with the provided programming models, obtaining learners’ competences of the modern scientific research and computational thinking, and capturing the relevant technical knowledge. It also provides an integral platform that allows a simultaneous and comparative introduction to pipelining and parallel computing. The programming language C, for developing programming models, and the message passing interface (MPI) and OpenMP parallelization tools have been chosen for implementation.

  6. Method, systems, and computer program products for implementing function-parallel network firewall

    Fulp, Errin W.; Farley, Ryan J.

    2011-10-11

    Methods, systems, and computer program products for providing function-parallel firewalls are disclosed. According to one aspect, a function-parallel firewall includes a first firewall node for filtering received packets using a first portion of a rule set including a plurality of rules. The first portion includes less than all of the rules in the rule set. At least one second firewall node filters packets using a second portion of the rule set. The second portion includes at least one rule in the rule set that is not present in the first portion. The first and second portions together include all of the rules in the rule set.
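
    The rule-set split described above can be sketched as follows. The partitioning constraints come from the abstract (each portion holds fewer than all rules, and the portions together cover the rule set); everything else is an assumption made for illustration: the first-match policy, the way node results are combined by picking the lowest rule index, the default action, and all names.

```python
from collections import namedtuple

# Hypothetical rule record: position in the original rule set, a predicate, an action.
Rule = namedtuple("Rule", "index matches action")


def partition_rules(rules, n_nodes):
    """Deal the rules round-robin so that each firewall node holds only a portion."""
    return [rules[i::n_nodes] for i in range(n_nodes)]


def node_first_match(portion, packet):
    """What a single node would do: return the first rule in its portion that matches."""
    for rule in portion:
        if rule.matches(packet):
            return rule
    return None


def filter_packet(portions, packet, default="deny"):
    """Combine per-node results; the match with the lowest original index wins (assumed)."""
    matched = [node_first_match(portion, packet) for portion in portions]
    candidates = [rule for rule in matched if rule is not None]
    if not candidates:
        return default
    return min(candidates, key=lambda rule: rule.index).action


rules = [
    Rule(0, lambda p: p["port"] == 22 and p["src"].startswith("10."), "accept"),
    Rule(1, lambda p: p["port"] == 22, "deny"),
    Rule(2, lambda p: p["port"] in (80, 443), "accept"),
]
portions = partition_rules(rules, 2)

print(filter_packet(portions, {"src": "10.0.0.5", "port": 22}))
print(filter_packet(portions, {"src": "192.168.1.9", "port": 22}))
```

    In this sketch the per-node loop runs sequentially; in a function-parallel arrangement each portion would be evaluated on its own node at the same time.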

  7. Speedup properties of phases in the execution profile of distributed parallel programs

    Carlson, B.M. [Toronto Univ., ON (Canada). Computer Systems Research Institute]; Wagner, T.D.; Dowdy, L.W. [Vanderbilt Univ., Nashville, TN (United States). Dept. of Computer Science]; Worley, P.H. [Oak Ridge National Lab., TN (United States)]

    1992-08-01

    The execution profile of a distributed-memory parallel program specifies the number of busy processors as a function of time. Periods of homogeneous processor utilization are manifested in many execution profiles. These periods can usually be correlated with the algorithms implemented in the underlying parallel code. Three families of methods for smoothing execution profile data are presented. These approaches simplify the problem of detecting end points of periods of homogeneous utilization. These periods, called phases, are then examined in isolation, and their speedup characteristics are explored. A specific workload executed on an Intel iPSC/860 is used for validation of the techniques described.
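
    The idea of smoothing a profile and then reading off phases can be illustrated with a small sketch. The abstract does not say which three families of smoothing methods are used, so the moving median below is a stand-in chosen for illustration, and the profile data, window size, and function names are hypothetical.

```python
from itertools import groupby
from statistics import median_low


def smooth(profile, window=3):
    """Moving median over the busy-processor counts (window is truncated at the ends)."""
    half = window // 2
    return [
        median_low(profile[max(0, i - half): i + half + 1])
        for i in range(len(profile))
    ]


def phases(smoothed):
    """Split a smoothed profile into (start, end, busy_processors) runs of equal value."""
    result = []
    start = 0
    for busy, run in groupby(smoothed):
        length = len(list(run))
        result.append((start, start + length, busy))
        start += length
    return result


# Hypothetical profile: number of busy processors sampled at equal time steps.
profile = [1, 1, 8, 8, 7, 8, 8, 8, 2, 2, 8, 2, 2]
smoothed = smooth(profile)
detected = phases(smoothed)
print(smoothed)
print(detected)
```

    The short dips and spikes in the raw samples are absorbed by the smoothing, so the end points of each period of homogeneous utilization fall where the smoothed value changes.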

  8. Programming Environment for a High-Performance Parallel Supercomputer with Intelligent Communication

    A. Gunzinger

    1996-01-01

    At the Electronics Laboratory of the Swiss Federal Institute of Technology (ETH) in Zürich, the high-performance parallel supercomputer MUSIC (MUlti processor System with Intelligent Communication) has been developed. As applications like neural network simulation and molecular dynamics show, the performance of the Electronics Laboratory supercomputer is on par with that of conventional supercomputers, but electric power requirements are reduced by a factor of 1,000, weight is reduced by a factor of 400, and price is reduced by a factor of 100. Software development is a key issue of such parallel systems. This article focuses on the programming environment of the MUSIC system and on its applications.

  9. Academic training: From Evolution Theory to Parallel and Distributed Genetic Programming

    2007-01-01

    2006-2007 ACADEMIC TRAINING PROGRAMME LECTURE SERIES 15, 16 March From 11:00 to 12:00 - Main Auditorium, bldg. 500 From Evolution Theory to Parallel and Distributed Genetic Programming F. FERNANDEZ DE VEGA / Univ. of Extremadura, SP Lecture No. 1: From Evolution Theory to Evolutionary Computation Evolutionary computation is a subfield of artificial intelligence (more particularly computational intelligence) involving combinatorial optimization problems, which are based to some degree on the evolution of biological life in the natural world. In this tutorial we will review the source of inspiration for this metaheuristic and its capability for solving problems. We will show the main flavours within the field, and different problems that have been successfully solved employing this kind of techniques. Lecture No. 2: Parallel and Distributed Genetic Programming The successful application of Genetic Programming (GP, one of the available Evolutionary Algorithms) to optimization problems has encouraged an ...

  10. Parallel programming of exogenous and endogenous components in the antisaccade task.

    Massen, Cristina

    2004-04-01

    In the antisaccade task subjects are required to suppress the reflexive tendency to look at a peripherally presented stimulus and to perform a saccade in the opposite direction instead. The present studies aimed at investigating the inhibitory mechanisms responsible for successful performance in this task, testing a hypothesis of parallel programming of exogenous and endogenous components: A reflexive saccade to the stimulus is automatically programmed and competes with the concurrently established voluntary programme to look in the opposite direction. The experiments followed the logic of selectively manipulating the speed of processing of these components and testing the prediction that a selective slowing of the exogenous component should result in a reduced error rate in this task, while a selective slowing of the endogenous component should have the opposite effect. The results provide evidence for the hypothesis of parallel programming and are discussed in the context of alternative accounts of antisaccade performance.

  11. Calculation of illumination conditions at the lunar south pole - parallel programming approach

    Figuera, R. Marco; Gläser, P.; Oberst, J.; De Rosa, D.

    2014-04-01

    In this paper we present a parallel programming approach to evaluate illumination conditions at the lunar south pole. Due to the small inclination (1.54°) of the lunar rotational axis with respect to the ecliptic plane and the topography of the lunar south pole, which allows long illumination periods, the study of illumination conditions is of great importance. Several tests were conducted in order to check the viability of the study and to optimize the tool used to calculate such illumination. First results from a simulated case study showed that parallel programming on the Graphics Processing Unit (GPU) reduced the computation time by a factor of 8-12 compared with sequential programming on the Central Processing Unit (CPU).

  12. Salient Local 3D Features for 3D Shape Retrieval

    Godil, Afzal

    2011-01-01

    In this paper we describe a new formulation for the 3D salient local features based on the voxel grid inspired by the Scale Invariant Feature Transform (SIFT). We use it to identify the salient keypoints (invariant points) on a 3D voxelized model and calculate invariant 3D local feature descriptors at these keypoints. We then use the bag of words approach on the 3D local features to represent the 3D models for shape retrieval. The advantages of the method are that it can be applied to rigid as well as to articulated and deformable 3D models. Finally, this approach is applied for 3D Shape Retrieval on the McGill articulated shape benchmark and then the retrieval results are presented and compared to other methods.

  13. Algorithmic differentiation of pragma-defined parallel regions differentiating computer programs containing OpenMP

    Förster, Michael

    2014-01-01

    Numerical programs often use parallel programming techniques such as OpenMP to compute the program's output values as efficiently as possible. In addition, derivative values of these output values with respect to certain input values play a crucial role. To achieve code that computes not only the output values simultaneously but also the derivative values, this work introduces several source-to-source transformation rules. These rules are based on a technique called algorithmic differentiation. The main focus of this work lies on the important reverse mode of algorithmic differentiation. The inh

  14. Research on Parallelization of Multiphase Space Numerical Simulation

    陆林生; 董超群; 李志辉

    2003-01-01

    This paper studies the parallelization of multiphase space numerical simulation, using the gas-kinetic algorithm for 3-D flows as a case study. It focuses on domain decomposition methods, vector reduction, and parallel optimization of boundary processing. The HPF parallel program design was developed with the Parallel Programming Concept Design (PPCDS). The HPF program achieved good parallel efficiency on a high-performance computer with massively parallel computing.

  15. Time domain topology optimization of 3D nanophotonic devices

    Elesin, Yuriy; Lazarov, Boyan Stefanov; Jensen, Jakob Søndergaard;

    2014-01-01

    We present an efficient parallel topology optimization framework for design of large scale 3D nanophotonic devices. The code shows excellent scalability and is demonstrated for optimization of broadband frequency splitter, waveguide intersection, photonic crystal-based waveguide and nanowire...

  16. Molecular dynamics simulation on a network of workstations using a machine-independent parallel programming language.

    Shifman, M A; Windemuth, A; Schulten, K; Miller, P L

    1992-04-01

    Molecular dynamics simulations investigate local and global motion in molecules. Several parallel computing approaches have been taken to attack the most computationally expensive phase of molecular simulations, the evaluation of long range interactions. This paper reviews these approaches and develops a straightforward but effective algorithm using the machine-independent parallel programming language, Linda. The algorithm was run both on a shared memory parallel computer and on a network of high performance Unix workstations. Performance benchmarks were performed on both systems using two proteins. This algorithm offers a portable cost-effective alternative for molecular dynamics simulations. In view of the increasing numbers of networked workstations, this approach could help make molecular dynamics simulations more easily accessible to the research community.

  17. Fixed-dimensional parallel linear programming via relative $\varepsilon$-approximations

    Goodrich, M.T.

    1996-12-31

    We show that linear programming in $\mathbb{R}^d$ can be solved deterministically in $O((\log\log n)^d)$ time using linear work in the PRAM model of computation, for any fixed constant $d$. Our method is developed for the CRCW variant of the PRAM parallel computation model, and can be easily implemented to run in $O(\log n\,(\log\log n)^{d-1})$ time using linear work on an EREW PRAM. A key component in these algorithms is a new, efficient parallel method for constructing $\varepsilon$-nets and $\varepsilon$-approximations (which have wide applicability in computational geometry). In addition, we introduce a new deterministic set approximation for range spaces with finite VC-exponent, which we call the $\delta$-relative $\varepsilon$-approximation, and we show how such approximations can be efficiently constructed in parallel.

  18. Run-Time and Compiler Support for Programming in Adaptive Parallel Environments

    Guy Edjlali

    1997-01-01

    For better utilization of computing resources, it is important to consider parallel programming environments in which the number of available processors varies at run-time. In this article, we discuss run-time support for data-parallel programming in such an adaptive environment. Executing programs in an adaptive environment requires redistributing data when the number of processors changes, and also requires determining new loop bounds and communication patterns for the new set of processors. We have developed a run-time library to provide this support. We discuss how the run-time library can be used by compilers of High Performance Fortran (HPF)-like languages to generate code for an adaptive environment. We present performance results for a Navier-Stokes solver and a multigrid template run on a network of workstations and an IBM SP-2. Our experiments show that if the number of processors is not varied frequently, the cost of data redistribution is not significant compared to the time required for the actual computation. Overall, our work establishes the feasibility of compiling HPF for a network of nondedicated workstations, which are likely to be an important resource for parallel programming in the future.
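
    As a rough illustration of what "new loop bounds" and "redistributing data" can mean, the sketch below assumes a one-dimensional array with a plain block distribution. The function names `block_bounds` and `redistribution_plan` are hypothetical and are not taken from the run-time library described in the article.

```python
def block_bounds(n, nprocs, rank):
    """Half-open loop bounds [lo, hi) owned by `rank` under a block distribution."""
    base, extra = divmod(n, nprocs)
    # The first `extra` ranks each own one additional element.
    lo = rank * base + min(rank, extra)
    hi = lo + base + (1 if rank < extra else 0)
    return lo, hi


def redistribution_plan(n, old_p, new_p):
    """List the (src, dst, lo, hi) index ranges that must move between ranks
    when the processor count changes from old_p to new_p."""
    plan = []
    for src in range(old_p):
        s_lo, s_hi = block_bounds(n, old_p, src)
        for dst in range(new_p):
            d_lo, d_hi = block_bounds(n, new_p, dst)
            lo, hi = max(s_lo, d_lo), min(s_hi, d_hi)
            # Overlaps that stay on the same rank need no communication.
            if lo < hi and src != dst:
                plan.append((src, dst, lo, hi))
    return plan


if __name__ == "__main__":
    # Shrinking from 4 to 2 processors for a 10-element array.
    for transfer in redistribution_plan(10, 4, 2):
        print(transfer)
```

    The length of the plan gives a crude measure of redistribution cost, which is the quantity the abstract compares against computation time.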

  19. Towards Interactive Visual Exploration of Parallel Programs using a Domain-Specific Language

    Klein, Tobias

    2016-04-19

    The use of GPUs and the massively parallel computing paradigm have become wide-spread. We describe a framework for the interactive visualization and visual analysis of the run-time behavior of massively parallel programs, especially OpenCL kernels. This facilitates understanding a program's function and structure, finding the causes of possible slowdowns, locating program bugs, and interactively exploring and visually comparing different code variants in order to improve performance and correctness. Our approach enables very specific, user-centered analysis, both in terms of the recording of the run-time behavior and the visualization itself. Instead of having to manually write instrumented code to record data, simple code annotations tell the source-to-source compiler which code instrumentation to generate automatically. The visualization part of our framework then enables the interactive analysis of kernel run-time behavior in a way that can be very specific to a particular problem or optimization goal, such as analyzing the causes of memory bank conflicts or understanding an entire parallel algorithm.

  20. Beam and Truss Finite Element Verification for DYNA3D

    Rathbun, H J

    2007-07-16

    The explicit finite element (FE) software program DYNA3D has been developed at Lawrence Livermore National Laboratory (LLNL) to simulate the dynamic behavior of structures, systems, and components. This report focuses on verification of beam and truss element formulations in DYNA3D. An efficient protocol has been developed to verify the accuracy of these structural elements by generating a set of representative problems for which closed-form quasi-static steady-state analytical reference solutions exist. To provide as complete coverage as practically achievable, problem sets are developed for each beam and truss element formulation (and their variants) in all modes of loading and physical orientation. Analyses with loading in the elastic and elastic-plastic regimes are performed. For elastic loading, the FE results are within 1% of the reference solutions for all cases. For beam element bending and torsion loading in the plastic regime, the response is heavily dependent on the numerical integration rule chosen, with higher refinement yielding greater accuracy (agreement to within 1%). Axial loading in the plastic regime produces accurate results (agreement to within 0.01%) for all integration rules and element formulations. Truss elements are also verified to provide accurate results (within 0.01%) for elastic and elastic-plastic loading. A sample problem to verify beam element response in ParaDyn, the parallel version of DYNA3D, is also presented.

  1. Design and Development of Parallel Programs on Bulk Synchronous Parallel Model

    赖树华; 陆朝俊; 孙永强

    2001-01-01

    The Bulk Synchronous Parallel (BSP) model is briefly introduced, and the advantages of designing and developing parallel programs on it are discussed. The paper then analyses how to design and develop parallel programs on the BSP model and summarizes several principles that the developer must comply with. Finally, a useful parallel programming method based on the BSP model is presented: the two-phase method of BSP parallel program design. The multiplication of two matrices is used as an example to illustrate how to apply this method together with the BSP performance prediction tool.
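
    For orientation, performance prediction under BSP usually rests on the standard cost model, in which one superstep costs w + h*g + l: w is the largest local computation, h the largest number of words any processor sends or receives, g the communication gap per word, and l the barrier synchronization latency. The sketch below only evaluates that textbook formula; it is not the prediction tool mentioned in the abstract, and the sample numbers are hypothetical.

```python
def superstep_cost(w, h, g, l):
    """Cost of one BSP superstep under the standard model: w + h*g + l."""
    return w + h * g + l


def program_cost(supersteps, g, l):
    """Total predicted cost of a BSP program.

    supersteps: list of (w, h) pairs, one per superstep.
    g, l: machine parameters (communication gap and barrier latency).
    """
    return sum(superstep_cost(w, h, g, l) for w, h in supersteps)


if __name__ == "__main__":
    # Hypothetical two-superstep program on a machine with g=4, l=20.
    print(program_cost([(100, 10), (50, 0)], g=4, l=20))
```

    Comparing `program_cost` for alternative decompositions of the same algorithm is one simple way to choose between them before writing any parallel code.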

  2. The (parallel) approximability of non-boolean satisfiability problems and restricted integer programming

    Serna, Maria; Trevisan, Luca; Xhafa, Fatos

    We present parallel approximation algorithms for maximization problems expressible by integer linear programs of a restricted syntactic form introduced by Barland et al. [BKT96]. One of our motivations was to show whether the approximation results in the framework of Barland et al. hold in the parallel setting. Our results are a confirmation of this, and thus we have a new common framework for both computational settings. Also, we prove almost tight non-approximability results, thus solving a main open question of Barland et al. We obtain the results through the constraint satisfaction problem over multi-valued domains, for which we show non-approximability results and develop parallel approximation algorithms. Our parallel approximation algorithms are based on linear programming and random rounding; they are better than previously known sequential algorithms. The non-approximability results are based on recent progress in the fields of Probabilistically Checkable Proofs and Multi-Prover One-Round Proof Systems [Raz95, Hås97, AS97, RS97].

  3. Scoops3D: software to analyze 3D slope stability throughout a digital landscape

    Reid, Mark E.; Christian, Sarah B.; Brien, Dianne L.; Henderson, Scott T.

    2015-01-01

    The computer program, Scoops3D, evaluates slope stability throughout a digital landscape represented by a digital elevation model (DEM). The program uses a three-dimensional (3D) method of columns approach to assess the stability of many (typically millions) potential landslides within a user-defined size range. For each potential landslide (or failure), Scoops3D assesses the stability of a rotational, spherical slip surface encompassing many DEM cells using a 3D version of either Bishop’s simplified method or the Ordinary (Fellenius) method of limit-equilibrium analysis. Scoops3D has several options for the user to systematically and efficiently search throughout an entire DEM, thereby incorporating the effects of complex surface topography. In a thorough search, each DEM cell is included in multiple potential failures, and Scoops3D records the lowest stability (factor of safety) for each DEM cell, as well as the size (volume or area) associated with each of these potential landslides. It also determines the least-stable potential failure for the entire DEM. The user has a variety of options for building a 3D domain, including layers or full 3D distributions of strength and pore-water pressures, simplistic earthquake loading, and unsaturated suction conditions. Results from Scoops3D can be readily incorporated into a geographic information system (GIS) or other visualization software. This manual includes information on the theoretical basis for the slope-stability analysis, requirements for constructing and searching a 3D domain, a detailed operational guide (including step-by-step instructions for using the graphical user interface [GUI] software, Scoops3D-i) and input/output file specifications, practical considerations for conducting an analysis, results of verification tests, and multiple examples illustrating the capabilities of Scoops3D. 
    Easy-to-use software installation packages are available for the Windows or Macintosh operating systems; these packages
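
    The bookkeeping described above, where each DEM cell keeps the lowest factor of safety among all potential failures that include it, can be sketched as follows. This is only an illustration of that recording step under simplified assumptions (cells addressed by a flat index, factors of safety already computed); the function name and data layout are hypothetical and do not reflect Scoops3D's actual implementation.

```python
def lowest_factor_of_safety(n_cells, trial_failures):
    """Record, for each DEM cell, the lowest factor of safety of any trial
    failure that includes the cell.

    trial_failures: iterable of (factor_of_safety, cell_indices) pairs.
    Cells never touched by a trial failure keep the value infinity.
    """
    lowest = [float("inf")] * n_cells
    for fos, cells in trial_failures:
        for c in cells:
            if fos < lowest[c]:
                lowest[c] = fos
    return lowest


if __name__ == "__main__":
    # Three hypothetical trial failures over a four-cell DEM.
    trials = [(1.5, [0, 1]), (1.2, [1, 2]), (2.0, [2])]
    per_cell = lowest_factor_of_safety(4, trials)
    print(per_cell)
    print(min(per_cell))  # least-stable value over the whole DEM
```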

  4. From functional programming to multicore parallelism: A case study based on Presburger Arithmetic

    Dung, Phan Anh; Hansen, Michael Reichhardt

    2011-01-01

    The overall goal of this work is studying parallelization of functional programs with the specific case study of decision procedures for Presburger Arithmetic (PA). PA is a first order theory of integers accepting addition as its only operation. Whereas it has wide applications in different areas......, we are interested in using PA in connection with the Duration Calculus Model Checker (DCMC) [5]. There are effective decision procedures for PA including Cooper’s algorithm and the Omega Test; however, their complexity is extremely high with doubly exponential lower bound and triply exponential upper...... in the SMT-solver Z3 [8] which has the capability of solving Presburger formulas. Functional programming is well-suited for the domain of decision procedures, and its immutability feature helps to reduce parallelization effort. While Haskell has progressed with a lot of parallelism-related research [6], we...

  5. Class Notes: Programming Parallel Algorithms CS 15-840B (Fall 1992)

    1993-02-01

    840: Programming Parallel Algorithms Lecture #15 Scribe: Bob Wheeler Thursday, 6 Nov 92 Overview * Connected components (continued). * Minimum spanning...Sriram Sethuraman Singular value decomposition Ken Tew EEG analysis Eric Thayer Speech recognition Xuemei Wang & Bob Wheeler Matrix operations Matt...Computing, 14(4):862-874, 1985. [33] L. W. Tucker, C. R. Feynman, and D. M. Fritzsche. Object recognition using the Connection Machine. Proceedings CVPR '88

  6. HipMatch: an object-oriented cross-platform program for accurate determination of cup orientation using 2D-3D registration of single standard X-ray radiograph and a CT volume.

    Zheng, Guoyan; Zhang, Xuan; Steppacher, Simon D; Murphy, Stephen B; Siebenrock, Klaus A; Tannast, Moritz

    2009-09-01

    The widely used procedure of evaluating cup orientation following total hip arthroplasty from a single standard anteroposterior (AP) radiograph is known to be inaccurate, largely due to the wide variability in individual pelvic orientation relative to the X-ray plate. 2D-3D image registration methods have been introduced for an accurate determination of the post-operative cup alignment with respect to an anatomical reference extracted from the CT data. Although encouraging results have been reported, their extensive usage in clinical routine is still limited. This may be explained by their requirement of a CAD model of the prosthesis, which is often difficult to obtain from the manufacturer owing to proprietary concerns, and by their requirement of either multiple radiographs or a radiograph-specific calibration, neither of which is available for most retrospective studies. To address these issues, we developed and validated an object-oriented cross-platform program called "HipMatch" where a hybrid 2D-3D registration scheme combining an iterative landmark-to-ray registration with a 2D-3D intensity-based registration was implemented to estimate a rigid transformation between a pre-operative CT volume and the post-operative X-ray radiograph for a precise estimation of cup alignment. No CAD model of the prosthesis is required. Quantitative and qualitative results evaluated on cadaveric and clinical datasets are given, which indicate the robustness and the accuracy of the program. HipMatch is written in the object-oriented programming language C++ using the cross-platform software Qt (TrollTech, Oslo, Norway), VTK, and Coin3D and is transportable to any platform.

  7. Dynamic programming in parallel boundary detection with application to ultrasound intima-media segmentation.

    Zhou, Yuan; Cheng, Xinyao; Xu, Xiangyang; Song, Enmin

    2013-12-01

    Segmentation of carotid artery intima-media in longitudinal ultrasound images for measuring its thickness to predict cardiovascular diseases can be simplified as detecting two nearly parallel boundaries within a certain distance range, when plaque with irregular shapes is not considered. In this paper, we improve the implementation of two dynamic programming (DP) based approaches to parallel boundary detection, dual dynamic programming (DDP) and piecewise linear dual dynamic programming (PL-DDP). Then, a novel DP based approach, dual line detection (DLD), which translates the original 2-D curve position to a 4-D parameter space representing two line segments in a local image segment, is proposed to solve the problem while maintaining efficiency and rotation invariance. To apply the DLD to ultrasound intima-media segmentation, it is imbedded in a framework that employs an edge map obtained from multiplication of the responses of two edge detectors with different scales and a coupled snake model that simultaneously deforms the two contours for maintaining parallelism. The experimental results on synthetic images and carotid arteries of clinical ultrasound images indicate improved performance of the proposed DLD compared to DDP and PL-DDP, with respect to accuracy and efficiency.
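
    To make the dynamic programming idea concrete, the sketch below traces a single boundary through a cost image, choosing one row per column so that the summed cost is minimal while the row changes by at most `max_jump` between adjacent columns. This is a generic single-boundary formulation given for illustration only; it is not the DDP, PL-DDP, or DLD method of the paper, which handle two coupled boundaries. The cost values are hypothetical.

```python
def dp_boundary(cost, max_jump=1):
    """Return the minimum-cost boundary as a list of row indices, one per column.

    cost[r][c] is the cost of the boundary passing through row r at column c
    (for example, a low value where the edge map is strong).
    """
    n_rows, n_cols = len(cost), len(cost[0])
    acc = [cost[r][0] for r in range(n_rows)]  # accumulated cost per row
    back = []                                  # back-pointers per column
    for c in range(1, n_cols):
        cur, ptr = [], []
        for r in range(n_rows):
            lo, hi = max(0, r - max_jump), min(n_rows, r + max_jump + 1)
            best = min(range(lo, hi), key=lambda k: acc[k])
            cur.append(acc[best] + cost[r][c])
            ptr.append(best)
        acc = cur
        back.append(ptr)
    # Backtrack from the cheapest end point in the last column.
    r = min(range(n_rows), key=lambda k: acc[k])
    path = [r]
    for ptr in reversed(back):
        r = ptr[r]
        path.append(r)
    path.reverse()
    return path


if __name__ == "__main__":
    cost = [[5, 5, 1],
            [1, 1, 5],
            [5, 5, 5]]
    print(dp_boundary(cost))
```

    A dual-boundary variant would run the same recurrence over pairs of rows, with an extra constraint keeping their separation inside the allowed distance range.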

  8. CLUSTEREASY: A Program for Simulating Scalar Field Evolution on Parallel Computers

    Felder, Gary N

    2007-01-01

    We describe a new, parallel programming version of the scalar field simulation program LATTICEEASY. The new C++ program, CLUSTEREASY, can simulate arbitrary scalar field models on distributed-memory clusters. The speed and memory requirements scale well with the number of processors. As with the serial version of LATTICEEASY, CLUSTEREASY can run simulations in one, two, or three dimensions, with or without expansion of the universe, with customizable parameters and output. The program and its full documentation are available on the LATTICEEASY website at http://www.science.smith.edu/departments/Physics/fstaff/gfelder/latticeeasy/. In this paper we provide a brief overview of what CLUSTEREASY does and the ways in which it does and doesn't differ from the serial version of LATTICEEASY.

  9. 3D Spectroscopy in Astronomy

    Mediavilla, Evencio; Arribas, Santiago; Roth, Martin; Cepa-Nogué, Jordi; Sánchez, Francisco

    2011-09-01

    Preface; Acknowledgements; 1. Introductory review and technical approaches Martin M. Roth; 2. Observational procedures and data reduction James E. H. Turner; 3. 3D Spectroscopy instrumentation M. A. Bershady; 4. Analysis of 3D data Pierre Ferruit; 5. Science motivation for IFS and galactic studies F. Eisenhauer; 6. Extragalactic studies and future IFS science Luis Colina; 7. Tutorials: how to handle 3D spectroscopy data Sebastian F. Sánchez, Begona García-Lorenzo and Arlette Pécontal-Rousset.

  10. Spherical 3D isotropic wavelets

    Lanusse, F.; Rassat, A.; Starck, J.-L.

    2012-04-01

    Context. Future cosmological surveys will provide 3D large scale structure maps with large sky coverage, for which a 3D spherical Fourier-Bessel (SFB) analysis in spherical coordinates is natural. Wavelets are particularly well-suited to the analysis and denoising of cosmological data, but a spherical 3D isotropic wavelet transform does not currently exist to analyse spherical 3D data. Aims: The aim of this paper is to present a new formalism for a spherical 3D isotropic wavelet, i.e. one based on the SFB decomposition of a 3D field and accompany the formalism with a public code to perform wavelet transforms. Methods: We describe a new 3D isotropic spherical wavelet decomposition based on the undecimated wavelet transform (UWT) described in Starck et al. (2006). We also present a new fast discrete spherical Fourier-Bessel transform (DSFBT) based on both a discrete Bessel transform and the HEALPIX angular pixelisation scheme. We test the 3D wavelet transform and as a toy-application, apply a denoising algorithm in wavelet space to the Virgo large box cosmological simulations and find we can successfully remove noise without much loss to the large scale structure. Results: We have described a new spherical 3D isotropic wavelet transform, ideally suited to analyse and denoise future 3D spherical cosmological surveys, which uses a novel DSFBT. We illustrate its potential use for denoising using a toy model. All the algorithms presented in this paper are available for download as a public code called MRS3D at http://jstarck.free.fr/mrs3d.html

  11. 3D IBFV : Hardware-Accelerated 3D Flow Visualization

    Telea, Alexandru; Wijk, Jarke J. van

    2003-01-01

    We present a hardware-accelerated method for visualizing 3D flow fields. The method is based on insertion, advection, and decay of dye. To this aim, we extend the texture-based IBFV technique for 2D flow visualization in two main directions. First, we decompose the 3D flow visualization problem in a

  12. Interactive 3D multimedia content

    Cellary, Wojciech

    2012-01-01

    The book describes recent research results in the areas of modelling, creation, management and presentation of interactive 3D multimedia content. The book describes the current state of the art in the field and identifies the most important research and design issues. Consecutive chapters address these issues. These are: database modelling of 3D content, security in 3D environments, describing interactivity of content, searching content, visualization of search results, modelling mixed reality content, and efficient creation of interactive 3D content. Each chapter is illustrated with example a

  13. A 3-D Contextual Classifier

    Larsen, Rasmus

    1997-01-01

    . This includes the specification of a Gaussian distribution for the pixel values as well as a prior distribution for the configuration of class variables within the cross that is made of a pixel and its four nearest neighbours. We will extend this algorithm to 3-D, i.e. we will specify a simultaneous Gaussian...... distribution for a pixel and its 6 nearest 3-D neighbours, and generalise the class variable configuration distribution within the 3-D cross. The algorithm is tested on a synthetic 3-D multivariate dataset....

  14. 3D Bayesian contextual classifiers

    Larsen, Rasmus

    2000-01-01

    We extend a series of multivariate Bayesian 2-D contextual classifiers to 3-D by specifying a simultaneous Gaussian distribution for the feature vectors as well as a prior distribution of the class variables of a pixel and its 6 nearest 3-D neighbours.

  15. 3-D printers for libraries

    Griffey, Jason

    2014-01-01

    As the maker movement continues to grow and 3-D printers become more affordable, an expanding group of hobbyists is keen to explore this new technology. In the time-honored tradition of introducing new technologies, many libraries are considering purchasing a 3-D printer. Jason Griffey, an early enthusiast of 3-D printing, has researched the marketplace and seen several systems first hand at the Consumer Electronics Show. In this report he introduces readers to the 3-D printing marketplace, covering such topics as: how fused deposition modeling (FDM) printing works; basic terminology such as build

  16. 3D for Graphic Designers

    Connell, Ellery

    2011-01-01

    Helping graphic designers expand their 2D skills into the 3D space. The trend in graphic design is towards 3D, with the demand for motion graphics, animation, photorealism, and interactivity rapidly increasing. And with the meteoric rise of iPads, smartphones, and other interactive devices, the design landscape is changing faster than ever. 2D digital artists who need a quick and efficient way to join this brave new world will want 3D for Graphic Designers. Readers get hands-on basic training in working in the 3D space, including product design, industrial design and visualization, modeling, ani

  17. The FORCE: A portable parallel programming language supporting computational structural mechanics

    Jordan, Harry F.; Benten, Muhammad S.; Brehm, Juergen; Ramanan, Aruna

    1989-01-01

    This project supports the conversion of codes in Computational Structural Mechanics (CSM) to a parallel form which will efficiently exploit the computational power available from multiprocessors. The work is a part of a comprehensive, FORTRAN-based system to form a basis for a parallel version of the NICE/SPAR combination which will form the CSM Testbed. The software is macro-based and rests on the force methodology developed by the principal investigator in connection with an early scientific multiprocessor. Machine independence is an important characteristic of the system so that retargeting it to the Flex/32, or any other multiprocessor on which NICE/SPAR might be implemented, is well supported. The principal investigator has experience in producing parallel software for both full and sparse systems of linear equations using the force macros. Other researchers have used the Force in finite element programs. It has been possible to rapidly develop software which performs at maximum efficiency on a multiprocessor. The inherent machine independence of the system also means that the parallelization will not be limited to a specific multiprocessor.

  18. High performance parallel computers for science: New developments at the Fermilab advanced computer program

    Nash, T.; Areti, H.; Atac, R.; Biel, J.; Cook, A.; Deppe, J.; Edel, M.; Fischler, M.; Gaines, I.; Hance, R.

    1988-08-01

    Fermilab's Advanced Computer Program (ACP) has been developing highly cost effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor, has been developed for theoretical physics. Each $4000 node is a FORTRAN or C programmable pipelined 20 MFlops (peak), 10 MByte single board computer. These are plugged into a 16 port crossbar switch crate which handles both inter and intra crate communication. The crates are connected in a hypercube. Site oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256 node, 5 GFlop, system is under construction. 10 refs., 7 figs.
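
    The headline figure can be checked with simple arithmetic: 256 nodes at 20 MFlops peak each give 5120 MFlops, which is about 5 GFlops. The sketch below does that calculation and also shows the usual definition of hypercube connectivity, in which two units are neighbours when their binary IDs differ in exactly one bit. The abstract does not give the hypercube dimension, so the dimension used in the example is a hypothetical value.

```python
def aggregate_peak_gflops(nodes, mflops_per_node):
    """Aggregate peak performance in GFlops for identical nodes."""
    return nodes * mflops_per_node / 1000.0


def hypercube_neighbours(unit_id, dim):
    """IDs adjacent to unit_id in a dim-dimensional hypercube:
    flip each of the dim address bits in turn."""
    return [unit_id ^ (1 << k) for k in range(dim)]


if __name__ == "__main__":
    print(aggregate_peak_gflops(256, 20))   # peak of the 256-node system
    # Hypothetical 4-dimensional hypercube (16 units), neighbours of unit 5.
    print(hypercube_neighbours(5, 4))
```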

  19. Documentation of a computer program to simulate lake-aquifer interaction using the MODFLOW ground water flow model and the MOC3D solute-transport model

    Merritt, Michael L.; Konikow, Leonard F.

    2000-01-01

    Heads and flow patterns in surficial aquifers can be strongly influenced by the presence of stationary surface-water bodies (lakes) that are in direct contact, vertically and laterally, with the aquifer. Conversely, lake stages can be significantly affected by the volume of water that seeps through the lakebed that separates the lake from the aquifer. For these reasons, a set of computer subroutines called the Lake Package (LAK3) was developed to represent lake/aquifer interaction in numerical simulations using the U.S. Geological Survey three-dimensional, finite-difference, modular ground-water flow model MODFLOW and the U.S. Geological Survey three-dimensional method-of-characteristics solute-transport model MOC3D. In the Lake Package described in this report, a lake is represented as a volume of space within the model grid which consists of inactive cells extending downward from the upper surface of the grid. Active model grid cells bordering this space, representing the adjacent aquifer, exchange water with the lake at a rate determined by the relative heads and by conductances that are based on grid cell dimensions, hydraulic conductivities of the aquifer material, and user-specified leakance distributions that represent the resistance to flow through the material of the lakebed. Parts of the lake may become "dry" as upper layers of the model are dewatered, with a concomitant reduction in lake surface area, and may subsequently rewet when aquifer heads rise. An empirical approximation has been encoded to simulate the rewetting of a lake that becomes completely dry. The variations of lake stages are determined by independent water budgets computed for each lake in the model grid. This lake budget process makes the package a simulator of the response of lake stage to hydraulic stresses applied to the aquifer. Implementation of a lake water budget requires input of parameters including those representing the rate of lake atmospheric recharge and evaporation
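
    A much-simplified sketch of the two ideas above, head-dependent exchange and a lake water budget, is given below. It assumes that seepage through each bordering cell is the conductance times the difference between lake stage and aquifer head, that lake area is constant, and that the stage is advanced with one explicit time step. These are illustrative assumptions, not the LAK3 formulation, and all names and numbers are hypothetical.

```python
def seepage(conductance, lake_stage, aquifer_head):
    """Flow from lake to aquifer through one cell (positive = lake loses water)."""
    return conductance * (lake_stage - aquifer_head)


def update_stage(stage, area, dt, recharge_rate, evap_rate, cells):
    """Advance the lake stage by one explicit time step of a simple budget.

    recharge_rate, evap_rate: atmospheric fluxes per unit lake area.
    cells: list of (conductance, aquifer_head) for the bordering cells.
    """
    q_out = sum(seepage(c, stage, h) for c, h in cells)
    volume_change = dt * ((recharge_rate - evap_rate) * area - q_out)
    return stage + volume_change / area


if __name__ == "__main__":
    # One cell below the lake stage (lake leaks), one above (aquifer feeds lake).
    cells = [(2.0, 9.5), (1.0, 10.5)]
    print(update_stage(10.0, 100.0, 1.0, 0.01, 0.005, cells))
```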

  20. LFP: A PC-program for ligand-field analysis of 3d(n) ions in Oh and lower symmetries

    Kurzak

    2000-05-01

    A modular and efficient version of a PC program for calculating ligand-field parameters, i.e. those of the crystal-field model (CFM) and the angular overlap model (AOM), is presented. The LFP program is designed to calculate the ligand-field parameters of low-symmetry transition metal complexes. It is based on the general method for the analysis of central ion states distortion using group theory. The program is open-ended and will be extended in step with spectroscopic studies in our laboratory. It is written in FORTRAN.

  1. Performance Evaluation of Parallel Message Passing and Thread Programming Model on Multicore Architectures

    Hasta, D T

    2010-01-01

    The current trend toward multicore architectures on shared memory systems underscores the need for parallelism. While there are several programming models for expressing parallelism, the thread programming model, as in OpenMP and POSIX threads, has become a standard for these systems. MPI (Message Passing Interface), which remains the dominant model in high-performance computing today, faces this challenge. The earlier version, MPI-1, has no shared memory concept, and the current version, MPI-2, has only limited support for shared memory systems. In this research, MPI-2 is compared with OpenMP to see how well MPI performs on multicore / SMP (Symmetric Multiprocessor) machines. The comparison between OpenMP (thread programming model) and MPI (message passing programming model) is conducted on multicore shared memory machine architectures to see which performs better in terms of speed and throughput. Application used to assess the scalability of the evaluated parall...

  2. R3D-2-MSA: the RNA 3D structure-to-multiple sequence alignment server.

    Cannone, Jamie J; Sweeney, Blake A; Petrov, Anton I; Gutell, Robin R; Zirbel, Craig L; Leontis, Neocles

    2015-07-01

    The RNA 3D Structure-to-Multiple Sequence Alignment Server (R3D-2-MSA) is a new web service that seamlessly links RNA three-dimensional (3D) structures to high-quality RNA multiple sequence alignments (MSAs) from diverse biological sources. In this first release, R3D-2-MSA provides manual and programmatic access to curated, representative ribosomal RNA sequence alignments from bacterial, archaeal, eukaryal and organellar ribosomes, using nucleotide numbers from representative atomic-resolution 3D structures. A web-based front end is available for manual entry and an Application Program Interface for programmatic access. Users can specify up to five ranges of nucleotides and 50 nucleotide positions per range. The R3D-2-MSA server maps these ranges to the appropriate columns of the corresponding MSA and returns the contents of the columns, either for display in a web browser or in JSON format for subsequent programmatic use. The browser output page provides a 3D interactive display of the query, a full list of sequence variants with taxonomic information and a statistical summary of distinct sequence variants found. The output can be filtered and sorted in the browser. Previous user queries can be viewed at any time by resubmitting the output URL, which encodes the search and re-generates the results. The service is freely available with no login requirement at http://rna.bgsu.edu/r3d-2-msa.

  3. Supernova Remnant in 3-D

    2009-01-01

    of the wavelength shift is related to the speed of motion, one can determine how fast the debris are moving in either direction. Because Cas A is the result of an explosion, the stellar debris is expanding radially outwards from the explosion center. Using simple geometry, the scientists were able to construct a 3-D model using all of this information. A program called 3-D Slicer, modified for astronomical use by the Astronomical Medicine Project at Harvard University in Cambridge, Mass., was used to display and manipulate the 3-D model. Commercial software was then used to create the 3-D fly-through. The blue filaments defining the blast wave were not mapped using the Doppler effect because they emit a different kind of light, synchrotron radiation, that does not emit light at discrete wavelengths, but rather in a broad continuum. The blue filaments are only a representation of the actual filaments observed at the blast wave. This visualization shows that there are two main components to this supernova remnant: a spherical component in the outer parts of the remnant and a flattened (disk-like) component in the inner region. The spherical component consists of the outer layer of the star that exploded, probably made of helium and carbon. These layers drove a spherical blast wave into the diffuse gas surrounding the star. The flattened component that astronomers were unable to map into 3-D prior to these Spitzer observations consists of the inner layers of the star. It is made from various heavier elements, not all shown in the visualization, such as oxygen, neon, silicon, sulphur, argon and iron. High-velocity plumes, or jets, of this material are shooting out from the explosion in the plane of the disk-like component mentioned above. Plumes of silicon appear in the northeast and southwest, while those of iron are seen in the southeast and north. These jets were already known and Doppler velocity measurements have been made for these structures, but their orientation and

  4. The neural basis of parallel saccade programming: an fMRI study.

    Hu, Yanbo; Walker, Robin

    2011-11-01

    The neural basis of parallel saccade programming was examined in an event-related fMRI study using a variation of the double-step saccade paradigm. Two double-step conditions were used: one enabled the second saccade to be partially programmed in parallel with the first saccade while in a second condition both saccades had to be prepared serially. The intersaccadic interval, observed in the parallel programming (PP) condition, was significantly reduced compared with latency in the serial programming (SP) condition and also to the latency of single saccades in control conditions. The fMRI analysis revealed greater activity (BOLD response) in the frontal and parietal eye fields for the PP condition compared with the SP double-step condition and when compared with the single-saccade control conditions. By contrast, activity in the supplementary eye fields was greater for the double-step condition than the single-step condition but did not distinguish between the PP and SP requirements. The role of the frontal eye fields in PP may be related to the advanced temporal preparation and increased salience of the second saccade goal that may mediate activity in other downstream structures, such as the superior colliculus. The parietal lobes may be involved in the preparation for spatial remapping, which is required in double-step conditions. The supplementary eye fields appear to have a more general role in planning saccade sequences that may be related to error monitoring and the control over the execution of the correct sequence of responses.

  5. Spherical 3D Isotropic Wavelets

    Lanusse, F; Starck, J -L

    2011-01-01

    Future cosmological surveys will provide 3D large scale structure maps with large sky coverage, for which a 3D Spherical Fourier-Bessel (SFB) analysis is natural. Wavelets are particularly well-suited to the analysis and denoising of cosmological data, but a spherical 3D isotropic wavelet transform does not currently exist to analyse spherical 3D data. The aim of this paper is to present a new formalism for a spherical 3D isotropic wavelet, i.e. one based on the Fourier-Bessel decomposition of a 3D field, and to accompany the formalism with a public code to perform wavelet transforms. We describe a new 3D isotropic spherical wavelet decomposition based on the undecimated wavelet transform (UWT) described in Starck et al. 2006. We also present a new fast Discrete Spherical Fourier-Bessel Transform (DSFBT) based on both a discrete Bessel Transform and the HEALPIX angular pixelisation scheme. We test the 3D wavelet transform and as a toy-application, apply a denoising algorithm in wavelet space to the Virgo large...

  6. Improvement of 3D Scanner

    2003-01-01

    The remaining disadvantages of 3D scanning systems and their causes are discussed. A new host-and-slave structure with a high-speed image acquisition and processing system is proposed to speed up image processing and improve the performance of 3D scanning systems.

  7. 3D Printing for Bricks

    ECT Team, Purdue

    2015-01-01

    Building Bytes, by Brian Peters, is a project that uses desktop 3D printers to print bricks for architecture. Instead of an expensive custom-made printer, it uses a standard 3D printer that is widely available, which makes fabrication more accessible and easier.

  8. 3D vision system assessment

    Pezzaniti, J. Larry; Edmondson, Richard; Vaden, Justin; Hyatt, Bryan; Chenault, David B.; Kingston, David; Geulen, Vanilynmae; Newell, Scott; Pettijohn, Brad

    2009-02-01

    In this paper, we report on the development of a 3D vision system consisting of a flat panel stereoscopic display and auto-converging stereo camera and an assessment of the system's use for robotic driving, manipulation, and surveillance operations. The 3D vision system was integrated onto a Talon Robot and Operator Control Unit (OCU) such that direct comparisons of the performance of a number of test subjects using 2D and 3D vision systems were possible. A number of representative scenarios were developed to determine which tasks benefited most from the added depth perception and to understand when the 3D vision system hindered understanding of the scene. Two tests were conducted at Fort Leonard Wood, MO with noncommissioned officers ranked Staff Sergeant and Sergeant First Class. The scenarios; the test planning, approach and protocols; the data analysis; and the resulting performance assessment of the 3D vision system are reported.

  9. 3D printing in dentistry.

    Dawood, A; Marti Marti, B; Sauret-Jackson, V; Darwood, A

    2015-12-01

    3D printing has been hailed as a disruptive technology which will change manufacturing. Used in aerospace, defence, art and design, 3D printing is becoming a subject of great interest in surgery. The technology has a particular resonance with dentistry, and with advances in 3D imaging and modelling technologies such as cone beam computed tomography and intraoral scanning, and with the relatively long history of the use of CAD CAM technologies in dentistry, it will become of increasing importance. Uses of 3D printing include the production of drill guides for dental implants, the production of physical models for prosthodontics, orthodontics and surgery, the manufacture of dental, craniomaxillofacial and orthopaedic implants, and the fabrication of copings and frameworks for implant and dental restorations. This paper reviews the types of 3D printing technologies available and their various applications in dentistry and in maxillofacial surgery.

  10. Using 3D in Visualization

    Wood, Jo; Kirschenbauer, Sabine; Döllner, Jürgen

    2005-01-01

    The notion of three-dimensionality is applied to five stages of the visualization pipeline. While 3D visualization is most often associated with the visual mapping and representation of data, this chapter also identifies its role in the management and assembly of data, and in the media used to display 3D imagery. The extra cartographic degree of freedom offered by using 3D is explored and offered as a motivation for employing 3D in visualization. The use of VR and the construction of virtual environments exploit navigational and behavioral realism, but become most useful when combined with abstracted representations embedded in a 3D space. The interactions between development of geovisualization, the technology used to implement it and the theory surrounding cartographic representation are explored. The dominance of computing technologies, driven particularly by the gaming industry...

  11. Development of visual 3D virtual environment for control software

    Hirose, Michitaka; Myoi, Takeshi; Amari, Haruo; Inamura, Kohei; Stark, Lawrence

    1991-01-01

    Virtual environments for software visualization may enable complex programs to be created and maintained. A typical application might be the control of regional electric power systems. As these encompass broader computer networks than ever, construction of such systems becomes very difficult. Conventional text-oriented environments are useful for programming individual processors. However, they are clearly insufficient for programming a large and complicated system that includes large numbers of computers connected to each other; such programming is called 'programming in the large.' As a solution to this problem, the authors are developing a graphic programming environment in which one can visualize complicated software in a virtual 3D world. One of the major features of the environment is the 3D representation of concurrent processes. The 3D representation is used to supply both network-wide interprocess programming capability (capability for 'programming in the large') and real-time programming capability. The authors' idea is to fuse the block diagram (which is useful for checking relationships among a large number of processes or processors) and the time chart (which is useful for checking precise timing for synchronization) into a single 3D space. The 3D representation gives a capability for direct and intuitive planning or understanding of complicated relationships among many concurrent processes. To realize the 3D representation, a technology that enables easy handling of virtual 3D objects is a definite necessity. Using a stereo display system and a gesture input device (VPL DataGlove), a prototype of the virtual workstation has been implemented. The workstation can supply the 'sensation' of the virtual 3D space to a programmer. Software for the 3D programming environment is implemented on the workstation. According to preliminary assessments, a 50 percent reduction of programming effort is achieved by using the virtual 3D environment. The authors expect that the 3D

  12. R3D Align: global pairwise alignment of RNA 3D structures using local superpositions

    Rahrig, Ryan R.; Leontis, Neocles B.; Zirbel, Craig L.

    2010-01-01

    Motivation: Comparing 3D structures of homologous RNA molecules yields information about sequence and structural variability. To compare large RNA 3D structures, accurate automatic comparison tools are needed. In this article, we introduce a new algorithm and web server to align large homologous RNA structures nucleotide by nucleotide using local superpositions that accommodate the flexibility of RNA molecules. Local alignments are merged to form a global alignment by employing a maximum clique algorithm on a specially defined graph that we call the ‘local alignment’ graph. Results: The algorithm is implemented in a program suite and web server called ‘R3D Align’. The R3D Align alignment of homologous 3D structures of 5S, 16S and 23S rRNA was compared to a high-quality hand alignment. A full comparison of the 16S alignment with the other state-of-the-art methods is also provided. The R3D Align program suite includes new diagnostic tools for the structural evaluation of RNA alignments. The R3D Align alignments were compared to those produced by other programs and were found to be the most accurate, in comparison with a high quality hand-crafted alignment and in conjunction with a series of other diagnostics presented. The number of aligned base pairs as well as measures of geometric similarity are used to evaluate the accuracy of the alignments. Availability: R3D Align is freely available through a web server http://rna.bgsu.edu/R3DAlign. The MATLAB source code of the program suite is also freely available for download at that location. Supplementary information: Supplementary data are available at Bioinformatics online. Contact: r-rahrig@onu.edu PMID:20929913

  13. Improved CUDA programs for GPU computing of Swendsen-Wang multi-cluster spin flip algorithm: 2D and 3D Ising, Potts, and XY models

    Komura, Yukihiro; Okabe, Yutaka

    2016-03-01

    We present new versions of sample CUDA programs for the GPU computing of the Swendsen-Wang multi-cluster spin flip algorithm. In this update, we add the method of GPU-based cluster-labeling algorithm without the use of conventional iteration (Komura, 2015) to those programs. For high-precision calculations, we also add a random-number generator in the cuRAND library. Moreover, we fix several bugs and remove the extra usage of shared memory in the kernel functions.
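
    As an illustration of the algorithm named in this record, the following is a minimal serial sketch of one Swendsen-Wang update for the 2D Ising model. It is not the authors' CUDA code: the function names, the union-find cluster labeling, the coupling J = 1 and the periodic boundaries are all assumptions made for the example, and the GPU cluster-labeling method cited in the abstract is not reproduced here.

```python
import numpy as np

def find(parent, i):
    # Follow parent links to the cluster root, halving the path on the way.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def swendsen_wang_step(spins, beta, rng):
    """One Swendsen-Wang update of a square 2D Ising lattice.

    Assumptions for this sketch: coupling J = 1, periodic boundaries,
    spins stored as +1/-1 integers in an L x L array.
    """
    L = spins.shape[0]
    parent = list(range(L * L))
    p_bond = 1.0 - np.exp(-2.0 * beta)  # bond probability between equal spins
    for x in range(L):
        for y in range(L):
            i = x * L + y
            # Visit the right and down neighbours so each bond is tried once.
            for nx, ny in (((x + 1) % L, y), (x, (y + 1) % L)):
                if spins[x, y] == spins[nx, ny] and rng.random() < p_bond:
                    ri, rj = find(parent, i), find(parent, nx * L + ny)
                    if ri != rj:
                        parent[rj] = ri  # merge the two clusters
    # Give every cluster a fresh random spin (equivalent to flipping with p = 1/2).
    new_spin = {}
    out = spins.copy()
    for x in range(L):
        for y in range(L):
            root = find(parent, x * L + y)
            if root not in new_spin:
                new_spin[root] = int(rng.choice((-1, 1)))
            out[x, y] = new_spin[root]
    return out
```

    A call such as `swendsen_wang_step(spins, 0.44, np.random.default_rng(0))` returns the updated lattice; the serial loops here are exactly the part that a GPU implementation would replace with parallel bond generation and cluster labeling.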

  14. Facing competition: Neural mechanisms underlying parallel programming of antisaccades and prosaccades.

    Talanow, Tobias; Kasparbauer, Anna-Maria; Steffens, Maria; Meyhöfer, Inga; Weber, Bernd; Smyrnis, Nikolaos; Ettinger, Ulrich

    2016-08-01

    The antisaccade task is a prominent tool to investigate the response inhibition component of cognitive control. Recent theoretical accounts explain performance in terms of parallel programming of exogenous and endogenous saccades, linked to the horse race metaphor. Previous studies have tested the hypothesis of competing saccade signals at the behavioral level by selectively slowing the programming of endogenous or exogenous processes e.g. by manipulating the probability of antisaccades in an experimental block. To gain a better understanding of inhibitory control processes in parallel saccade programming, we analyzed task-related eye movements and blood oxygenation level dependent (BOLD) responses obtained using functional magnetic resonance imaging (fMRI) at 3T from 16 healthy participants in a mixed antisaccade and prosaccade task. The frequency of antisaccade trials was manipulated across blocks of high (75%) and low (25%) antisaccade frequency. In blocks with high antisaccade frequency, antisaccade latencies were shorter and error rates lower whilst prosaccade latencies were longer and error rates were higher. At the level of BOLD, activations in the task-related saccade network (left inferior parietal lobe, right inferior parietal sulcus, left precentral gyrus reaching into left middle frontal gyrus and inferior frontal junction) and deactivations in components of the default mode network (bilateral temporal cortex, ventromedial prefrontal cortex) compensated increased cognitive control demands. These findings illustrate context dependent mechanisms underlying the coordination of competing decision signals in volitional gaze control.

  15. ADT-3D Tumor Detection Assistant in 3D

    Jaime Lazcano Bello

    2008-12-01

    The present document describes ADT-3D (Three-Dimensional Tumor Detector Assistant), a prototype application developed to assist doctors in diagnosing, detecting and locating brain tumors using CT scans. The document contains an introduction to tumor detection; the main goals of ADT-3D; development details; a description of the product; the motivation for its development; a study of the results; and areas of applicability.

  16. The Mathematical Method of Simplifying Level of Detail Model in 3D Game Programming

    张云苑

    2013-01-01

    3D video games that reproduce the real world through realistic rendering of characters and scenes have become mainstream products in electronic gaming. Simplifying the level of detail (LOD) of models through programmed algorithms is a technique for applying real-time 3D graphics to the rendering of game characters and scenes. This article discusses mathematical methods for simplifying models at different levels of detail; by reducing the complexity of the characters or scene graphics in a game, they meet real-time 3D rendering requirements and achieve a realistic 3D game presentation.

  17. Eighth SIAM conference on parallel processing for scientific computing: Final program and abstracts

    NONE

    1997-12-31

    This SIAM conference is the premier forum for developments in parallel numerical algorithms, a field that has seen very lively and fruitful developments over the past decade, and whose health is still robust. Themes for this conference were: combinatorial optimization; data-parallel languages; large-scale parallel applications; message-passing; molecular modeling; parallel I/O; parallel libraries; parallel software tools; parallel compilers; particle simulations; problem-solving environments; and sparse matrix computations.

  18. Unassisted 3D camera calibration

    Atanassov, Kalin; Ramachandra, Vikas; Nash, James; Goma, Sergio R.

    2012-03-01

    With the rapid growth of 3D technology, 3D image capture has become a critical part of the 3D feature set on mobile phones. 3D image quality is affected by the scene geometry as well as on-the-device processing. An automatic 3D system usually assumes known camera poses accomplished by factory calibration using a special chart. In real life settings, pose parameters estimated by factory calibration can be negatively impacted by movements of the lens barrel due to shaking, focusing, or camera drop. If any of these factors displaces the optical axes of either or both cameras, vertical disparity might exceed the maximum tolerable margin and the 3D user may experience eye strain or headaches. To make 3D capture more practical, one needs to consider unassisted (on arbitrary scenes) calibration. In this paper, we propose an algorithm that relies on detection and matching of keypoints between left and right images. Frames containing erroneous matches, along with frames with insufficiently rich keypoint constellations, are detected and discarded. Roll, pitch, yaw, and scale differences between left and right frames are then estimated. The algorithm performance is evaluated in terms of the remaining vertical disparity as compared to the maximum tolerable vertical disparity.

  19. Bioprinting of 3D hydrogels.

    Stanton, M M; Samitier, J; Sánchez, S

    2015-08-07

    Three-dimensional (3D) bioprinting has recently emerged as an extension of 3D material printing, by using biocompatible or cellular components to build structures in an additive, layer-by-layer methodology for encapsulation and culture of cells. These 3D systems allow for cell culture in a suspension for formation of highly organized tissue or controlled spatial orientation of cell environments. The in vitro 3D cellular environments simulate the complexity of an in vivo environment and natural extracellular matrices (ECM). This paper will focus on bioprinting utilizing hydrogels as 3D scaffolds. Hydrogels are advantageous for cell culture as they are highly permeable to cell culture media, nutrients, and waste products generated during metabolic cell processes. They have the ability to be fabricated in customized shapes with various material properties with dimensions at the micron scale. 3D hydrogels are a reliable method for biocompatible 3D printing and have applications in tissue engineering, drug screening, and organ on a chip models.

  20. Program Suite for Conceptual Designing of Parallel Mechanism-Based Robots and Machine Tools

    Slobodan Tabaković

    2013-06-01

    This paper describes the categorization of criteria for the conceptual design of parallel mechanism-based robots or machine tools, resulting from workspace analysis, as well as the procedure for defining them. Furthermore, it presents the design methodology that was implemented in the program for creating a robot or machine tool space model and optimizing the resulting solution. For verification of the criteria and the program suite, three common (conceptually different) mechanisms with a similar mechanical structure and kinematic characteristics were used.

  1. GPU-based rapid reconstruction of cellular 3D refractive index maps from tomographic phase microscopy (Conference Presentation)

    Dardikman, Gili; Shaked, Natan T.

    2016-03-01

    We present highly parallel and efficient algorithms for real-time reconstruction of the quantitative three-dimensional (3-D) refractive-index maps of biological cells without labeling, as obtained from the interferometric projections acquired by tomographic phase microscopy (TPM). The new algorithms are implemented on the graphic processing unit (GPU) of the computer using CUDA programming environment. The reconstruction process includes two main parts. First, we used parallel complex wave-front reconstruction of the TPM-based interferometric projections acquired at various angles. The complex wave front reconstructions are done on the GPU in parallel, while minimizing the calculation time of the Fourier transforms and phase unwrapping needed. Next, we implemented on the GPU in parallel the 3-D refractive index map retrieval using the TPM filtered-back projection algorithm. The incorporation of algorithms that are inherently parallel with a programming environment such as Nvidia's CUDA makes it possible to obtain real-time processing rate, and enables high-throughput platform for label-free, 3-D cell visualization and diagnosis.

  2. Product development project: 3D printer (Tuotekehitysprojekti: 3D-tulostin)

    Pihlajamäki, Janne

    2011-01-01

    This thesis examines 3D printing technology. It walks through a product development project carried out on a 3D printer. In addition, the product development process and possible methods for protecting the resulting outcomes are presented at a general level. The aim of the work was to bring the home-printer-level 3D device technology already available on the market closer to a professional-level solution. This was pursued by focusing on improving the printing accuracy and speed achievable with the device...

  3. Handbook of 3D integration

    Garrou, Philip; Ramm, Peter

    2014-01-01

    Edited by key figures in 3D integration and written by top authors from high-tech companies and renowned research institutions, this book covers the intricate details of 3D process technology. As such, the main focus is on silicon via formation, bonding and debonding, thinning, via reveal and backside processing, both from a technological and a materials science perspective. The last part of the book is concerned with assessing and enhancing the reliability of the 3D integrated devices, which is a prerequisite for the large-scale implementation of this emerging technology. Invaluable reading fo

  4. Color 3D Reverse Engineering

    2002-01-01

    This paper presents a principle and a method of color 3D laser scanning measurement. Based on the fundamental monochrome 3D measurement study, color information capture, color texture mapping, coordinate computation and other techniques are performed to achieve color 3D measurement. The system is designed and composed of a line laser light emitter, one color CCD camera, a motor-driven rotary filter, a circuit card and a computer. Two steps in capturing object's images in the measurement process: Firs...

  5. Exploration of 3D Printing

    Lin, Zeyu

    2014-01-01

    3D printing technology is introduced and defined in this thesis. Some methods of 3D printing are illustrated and their principles are explained with pictures. Most of the essential parts are presented with pictures and their functions within the whole system are explained. Problems with the Up! Plus 3D printer are solved and a DIY product is made with this machine. The processes of making the product are recorded and the items which need to be noticed during the process are the highlight in this th...

  6. PSH3D fast Poisson solver for petascale DNS

    Adams, Darren; Dodd, Michael; Ferrante, Antonino

    2016-11-01

    Direct numerical simulation (DNS) of high Reynolds number, Re ≥ O(10^5), turbulent flows requires computational meshes of ≥ O(10^12) grid points, and, thus, the use of petascale supercomputers. DNS often requires the solution of a Helmholtz (or Poisson) equation for pressure, which constitutes the bottleneck of the solver. We have developed a parallel solver of the Helmholtz equation in 3D, PSH3D. The numerical method underlying PSH3D combines a parallel 2D Fast Fourier transform in two spatial directions, and a parallel linear solver in the third direction. For computational meshes up to 8192^3 grid points, our numerical results show that PSH3D scales up to at least 262k cores of Cray XT5 (Blue Waters). PSH3D has a peak performance 6× faster than 3D FFT-based methods when used with the 'partial-global' optimization, and for an 8192^3 mesh solves the Poisson equation in 1 sec using 128k cores. Also, we have verified that the use of PSH3D with the 'partial-global' optimization in our DNS solver does not reduce the accuracy of the numerical solution of the incompressible Navier-Stokes equations.
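
    The combination described above (a 2D FFT in two directions plus a linear solve in the third) can be sketched serially as follows. This is not the PSH3D code: the function name, the uniform spacing `h`, the second-order finite differences, the periodic x and y directions and the zero-value condition just outside the z range are all assumptions chosen to keep the example small. After the 2D FFT each (kx, ky) mode decouples into an independent tridiagonal system along z, solved here with the Thomas algorithm.

```python
import numpy as np

def poisson_fft2_tridiag(f, h):
    """Solve the discrete Poisson equation lap(p) = f on an (nx, ny, nz) grid.

    Assumptions for this sketch: uniform spacing h, second-order differences,
    periodic in x and y, and p = 0 at the ghost points beyond both z ends.
    """
    nx, ny, nz = f.shape
    f_hat = np.fft.fft2(f, axes=(0, 1))  # transform the two periodic directions
    # Eigenvalues of the periodic second-difference operator in x and y.
    lam_x = (2.0 * np.cos(2.0 * np.pi * np.arange(nx) / nx) - 2.0) / h**2
    lam_y = (2.0 * np.cos(2.0 * np.pi * np.arange(ny) / ny) - 2.0) / h**2
    diag = lam_x[:, None] + lam_y[None, :] - 2.0 / h**2  # main diagonal per mode
    off = 1.0 / h**2                                     # sub- and super-diagonal
    # Thomas algorithm along z, vectorised over all (kx, ky) modes.
    c = np.zeros((nx, ny, nz))
    d = np.zeros((nx, ny, nz), dtype=complex)
    c[..., 0] = off / diag
    d[..., 0] = f_hat[..., 0] / diag
    for k in range(1, nz):  # forward elimination
        denom = diag - off * c[..., k - 1]
        c[..., k] = off / denom
        d[..., k] = (f_hat[..., k] - off * d[..., k - 1]) / denom
    p_hat = np.empty_like(d)
    p_hat[..., -1] = d[..., -1]
    for k in range(nz - 2, -1, -1):  # back substitution
        p_hat[..., k] = d[..., k] - c[..., k] * p_hat[..., k + 1]
    return np.fft.ifft2(p_hat, axes=(0, 1)).real
```

    In a distributed setting the per-mode tridiagonal systems are independent of one another, which is presumably what makes the third-direction solve amenable to parallelisation; how PSH3D actually distributes the work is not stated in the abstract.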

  7. Full Parallel Implementation of an All-Electron Four-Component Dirac-Kohn-Sham Program.

    Rampino, Sergio; Belpassi, Leonardo; Tarantelli, Francesco; Storchi, Loriano

    2014-09-09

    A full distributed-memory implementation of the Dirac-Kohn-Sham (DKS) module of the program BERTHA (Belpassi et al., Phys. Chem. Chem. Phys. 2011, 13, 12368-12394) is presented, where the self-consistent field (SCF) procedure is replicated on all the parallel processes, each process working on subsets of the global matrices. The key feature of the implementation is an efficient procedure for switching between two matrix distribution schemes, one (integral-driven) optimal for the parallel computation of the matrix elements and another (block-cyclic) optimal for the parallel linear algebra operations. This approach, making both CPU-time and memory scalable with the number of processors used, virtually overcomes at once both time and memory barriers associated with DKS calculations. Performance, portability, and numerical stability of the code are illustrated on the basis of test calculations on three gold clusters of increasing size, an organometallic compound, and a perovskite model. The calculations are performed on a Beowulf and a BlueGene/Q system.
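
    To make the block-cyclic scheme mentioned above concrete, here is a small sketch of the index mapping it implies. It is not taken from BERTHA: the function names are hypothetical, and the convention (zero-based indices, first block on process 0, as commonly used by ScaLAPACK-style libraries) is an assumption.

```python
def block_cyclic_owner(g, nb, nprocs):
    """Map a zero-based global index to (owning process, local index).

    Assumed convention: blocks of size nb are dealt out round-robin,
    starting with block 0 on process 0.
    """
    block = g // nb
    return block % nprocs, (block // nprocs) * nb + g % nb

def matrix_element_owner(i, j, nb, prow, pcol):
    """Owner coordinates of matrix element (i, j) on a prow x pcol process grid."""
    return block_cyclic_owner(i, nb, prow)[0], block_cyclic_owner(j, nb, pcol)[0]
```

    With block size 2 on 3 processes, global indices 0 to 7 land on processes 0, 0, 1, 1, 2, 2, 0, 0, so consecutive blocks rotate over the processes and the load stays balanced as the active part of the matrix shrinks during linear algebra operations.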

  8. A Model for Managing 3D Printing Services in Academic Libraries

    Scalfani, Vincent F.; Sahib, Josh

    2013-01-01

    The appearance of 3D printers in university libraries opens many opportunities for advancing outreach, teaching, and research programs. The University of Alabama (UA) Libraries recently adopted 3D printing technology and maintains an open access 3D Printing Studio. The Studio consists of a 3D printer, multiple 3D design workstations, and other…

  9. Conducting polymer 3D microelectrodes

    Sasso, Luigi; Vazquez, Patricia; Vedarethinam, Indumathi

    2010-01-01

    Conducting polymer 3D microelectrodes have been fabricated for possible future neurological applications. A combination of micro-fabrication techniques and chemical polymerization methods has been used to create pillar electrodes in polyaniline and polypyrrole. The thin polymer films obtained...

  10. 3D Face Appearance Model

    Lading, Brian; Larsen, Rasmus; Astrom, K

    2006-01-01

    We build a 3D face shape model, including inter- and intra-shape variations, derive the analytical Jacobian of its resulting 2D rendered image, and show example of its fitting performance with light, pose, id, expression and texture variations...

  11. 3D Face Appearance Model

    Lading, Brian; Larsen, Rasmus; Åström, Kalle

    2006-01-01

    We build a 3d face shape model, including inter- and intra-shape variations, derive the analytical jacobian of its resulting 2d rendered image, and show example of its fitting performance with light, pose, id, expression and texture variations...

  12. Main: TATCCAYMOTIFOSRAMY3D [PLACE

    TATCCAYMOTIFOSRAMY3D S000256 01-August-2006 (last modified) kehi TATCCAY motif found in rice (O.s.) RAmy3D alpha-amylase gene promoter; Y=T/C; a GATA motif as its antisense sequence; TATCCAY motif and G motif (see S000130) are responsible for sugar repression (Toyofuku et al. 1998); GATA; amylase; sugar; repression; rice (Oryza sativa) TATCCAY ...
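
    The statement that TATCCAY (with Y = T/C) carries a GATA motif on its antisense strand can be checked with a few lines of code. The sketch below is illustrative only: the helper names and the test sequence are made up for the example, and only the IUPAC codes needed here (A, C, G, T, Y, R) are included.

```python
import re

# IUPAC codes used in this example, written as regular-expression fragments.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]", "R": "[AG]"}
# Complements; the ambiguity code Y (C or T) pairs with R (G or A).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "Y": "R", "R": "Y"}

def reverse_complement(motif):
    """Return the antisense-strand reading of a motif."""
    return "".join(COMPLEMENT[base] for base in reversed(motif))

def find_motif(seq, motif):
    """Return start positions (zero-based) of all, possibly overlapping, matches."""
    pattern = re.compile("(?=(" + "".join(IUPAC[base] for base in motif) + "))")
    return [m.start() for m in pattern.finditer(seq)]

antisense = reverse_complement("TATCCAY")
print(antisense)            # RTGGATA
print("GATA" in antisense)  # True
```

    Searching a promoter for sites on both strands then amounts to calling `find_motif` once with the motif and once with its reverse complement.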

  13. Forward ramp in 3D

    1997-01-01

    Mars Pathfinder's forward rover ramp can be seen successfully unfurled in this image, taken in stereo by the Imager for Mars Pathfinder (IMP) on Sol 3. 3D glasses are necessary to identify surface detail. This ramp was not used for the deployment of the microrover Sojourner, which occurred at the end of Sol 2. When this image was taken, Sojourner was still latched to one of the lander's petals, waiting for the command sequence that would execute its descent off of the lander's petal. The image helped Pathfinder scientists determine whether to deploy the rover using the forward or backward ramps and the nature of the first rover traverse. The metallic object at the lower left of the image is the lander's low-gain antenna. The square at the end of the ramp is one of the spacecraft's magnetic targets. Dust that accumulates on the magnetic targets will later be examined by Sojourner's Alpha Proton X-Ray Spectrometer instrument for chemical analysis. At right, a lander petal is visible. The IMP is a stereo imaging system with color capability provided by 24 selectable filters -- twelve filters per 'eye.' It stands 1.8 meters above the Martian surface, and has a resolution of two millimeters at a range of two meters. Mars Pathfinder is the second in NASA's Discovery program of low-cost spacecraft with highly focused science goals. The Jet Propulsion Laboratory, Pasadena, CA, developed and manages the Mars Pathfinder mission for NASA's Office of Space Science, Washington, D.C. JPL is an operating division of the California Institute of Technology (Caltech). The Imager for Mars Pathfinder (IMP) was developed by the University of Arizona Lunar and Planetary Laboratory under contract to JPL. Peter Smith is the Principal Investigator.

  14. 3D fold growth in transpression

    Frehner, Marcel

    2016-12-01

    Geological folds in transpression are inherently 3D structures; hence their growth and rotation behavior is studied using 3D numerical finite-element simulations. Upright single-layer buckle folds in Newtonian materials are considered, which grow from an initial point-like perturbation due to a combination of in-plane shortening and shearing (i.e., transpression). The resulting fold growth exhibits three components: (1) fold amplification (vertical), (2) fold elongation (parallel to fold axis), and (3) sequential fold growth (perpendicular to axial plane) of new anti- and synforms adjacent to the initial fold. Generally, the fold growth rates are smaller for shearing-dominated than for shortening-dominated transpression. In spite of the growth rate, the folding behavior is very similar for the different convergence angles. The two lateral directions always exhibit similar growth rates implying that the bulk fold structure occupies an increasing roughly circular area. Fold axes are always parallel to the major horizontal principal strain axis (λ→max, i.e., long axis of the horizontal finite strain ellipse), which is initially also parallel to the major horizontal instantaneous stretching axis (ISA→max). After initiation, the fold axes rotate together with λ→max. Sequential folds appearing later do not initiate parallel to ISA→max, but parallel to λ→max, i.e. parallel to the already existing folds, and also rotate with λ→max. Therefore, fold axes do not correspond to passive material lines and hinge migration takes place as a consequence. The fold axis orientation parallel to λ→max is independent of convergence angle and viscosity ratio. Therefore, a triangular relationship between convergence angle, amount of shortening, and fold axis orientation exists. If two of these values are known, the third can be determined. This relationship is applied to the Zagros fold-and-thrust-belt to estimate the degree of strain partitioning between the Simply

  15. JAC3D -- A three-dimensional finite element computer program for the nonlinear quasi-static response of solids with the conjugate gradient method; Yucca Mountain Site Characterization Project

    Biffle, J.H.

    1993-02-01

    JAC3D is a three-dimensional finite element program designed to solve quasi-static nonlinear mechanics problems. A set of continuum equations describes the nonlinear mechanics involving large rotation and strain. A nonlinear conjugate gradient method is used to solve the equation. The method is implemented in a three-dimensional setting with various methods for accelerating convergence. Sliding interface logic is also implemented. An eight-node Lagrangian uniform strain element is used with hourglass stiffness to control the zero-energy modes. This report documents the elastic and isothermal elastic-plastic material model. Other material models, documented elsewhere, are also available. The program is vectorized for efficient performance on Cray computers. Sample problems described are the bending of a thin beam, the rotation of a unit cube, and the pressurization and thermal loading of a hollow sphere.
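
    The abstract above names a nonlinear conjugate gradient method but does not say which variant JAC3D uses. As a rough illustration only, the sketch below minimizes a small made-up energy function with the Polak-Ribiere (PR+) update and a backtracking line search; the function names, the test energy, and the tolerances are all assumptions and are not taken from JAC3D.

```python
import math


def nonlinear_cg(f, grad, x0, tol=1e-8, max_iter=500):
    """Minimize f from x0 with a Polak-Ribiere (PR+) nonlinear CG iteration.

    Illustrative sketch only; this is not the JAC3D algorithm.
    """
    x = list(x0)
    g = grad(x)
    d = [-gi for gi in g]  # first search direction: steepest descent
    for _ in range(max_iter):
        gg = sum(gi * gi for gi in g)
        if math.sqrt(gg) < tol:
            break
        slope = sum(gi * di for gi, di in zip(g, d))
        if slope >= 0.0:
            # Not a descent direction: restart with steepest descent.
            d = [-gi for gi in g]
            slope = -gg
        # Backtracking line search with the Armijo sufficient-decrease test.
        step, fx = 1.0, f(x)
        while step > 1e-16:
            trial = [xi + step * di for xi, di in zip(x, d)]
            if f(trial) <= fx + 1e-4 * step * slope:
                break
            step *= 0.5
        x = [xi + step * di for xi, di in zip(x, d)]
        g_new = grad(x)
        # PR+ update: beta is clipped at zero, which acts as an automatic restart.
        beta = max(0.0, sum(gn * (gn - go) for gn, go in zip(g_new, g)) / gg)
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return x


def energy(u):
    # Hypothetical smooth nonlinear "energy" with its minimum at (1, -0.5).
    return (u[0] - 1.0) ** 4 + (u[0] - 1.0) ** 2 + 2.0 * (u[1] + 0.5) ** 2


def energy_grad(u):
    return [4.0 * (u[0] - 1.0) ** 3 + 2.0 * (u[0] - 1.0), 4.0 * (u[1] + 0.5)]


if __name__ == "__main__":
    print(nonlinear_cg(energy, energy_grad, [3.0, 2.0]))
```

    In a finite element setting the role of `grad` would be played by the residual force vector, which is why such methods need no assembled stiffness matrix; that connection is a general remark, not a description of JAC3D internals.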

  16. FARGO3D: A NEW GPU-ORIENTED MHD CODE

    Benitez-Llambay, Pablo [Instituto de Astronomía Teórica y Experimental, Observatorio Astronónomico, Universidad Nacional de Córdoba. Laprida 854, X5000BGR, Córdoba (Argentina); Masset, Frédéric S., E-mail: pbllambay@oac.unc.edu.ar, E-mail: masset@icf.unam.mx [Instituto de Ciencias Físicas, Universidad Nacional Autónoma de México (UNAM), Apdo. Postal 48-3,62251-Cuernavaca, Morelos (Mexico)

    2016-03-15

    We present the FARGO3D code, recently publicly released. It is a magnetohydrodynamics code developed with special emphasis on the physics of protoplanetary disks and planet–disk interactions, and parallelized with MPI. The hydrodynamics algorithms are based on finite-difference upwind, dimensionally split methods. The magnetohydrodynamics algorithms consist of the constrained transport method to preserve the divergence-free property of the magnetic field to machine accuracy, coupled to a method of characteristics for the evaluation of electromotive forces and Lorentz forces. Orbital advection is implemented, and an N-body solver is included to simulate planets or stars interacting with the gas. We present our implementation in detail and present a number of widely known tests for comparison purposes. One strength of FARGO3D is that it can run on either graphical processing units (GPUs) or central processing units (CPUs), achieving large speed-up with respect to CPU cores. We describe our implementation choices, which allow a user with no prior knowledge of GPU programming to develop new routines for CPUs, and have them translated automatically for GPUs.
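
    To make the phrase "finite-difference upwind" concrete, here is a minimal first-order upwind update for 1D linear advection on a periodic grid. It is a textbook sketch under simplifying assumptions (constant velocity, one dimension, first order) and is not the scheme implemented in FARGO3D; the function name and arguments are hypothetical.

```python
def upwind_step(q, velocity, dx, dt):
    """Advance q by one time step of dq/dt + velocity * dq/dx = 0.

    First-order upwind differences on a periodic grid (illustrative only).
    """
    c = velocity * dt / dx  # Courant number
    if abs(c) > 1.0:
        raise ValueError("time step violates the CFL condition |c| <= 1")
    n = len(q)
    if velocity >= 0:
        # Information travels to the right: difference against the left cell.
        return [q[i] - c * (q[i] - q[i - 1]) for i in range(n)]
    # Information travels to the left: difference against the right cell.
    return [q[i] - c * (q[(i + 1) % n] - q[i]) for i in range(n)]


if __name__ == "__main__":
    profile = [0.0, 1.0, 0.0, 0.0]
    print(upwind_step(profile, velocity=1.0, dx=1.0, dt=0.5))
```

    In a dimensionally split method, an update of this kind would be applied along one coordinate direction at a time.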

  17. MPML3D: Scripting Agents for the 3D Internet.

    Prendinger, Helmut; Ullrich, Sebastian; Nakasone, Arturo; Ishizuka, Mitsuru

    2011-05-01

    The aim of this paper is two-fold. First, it describes a scripting language for specifying communicative behavior and interaction of computer-controlled agents ("bots") in the popular three-dimensional (3D) multiuser online world of "Second Life" and the emerging "OpenSimulator" project. While tools for designing avatars and in-world objects in Second Life exist, technology for nonprogrammer content creators of scenarios involving scripted agents is currently missing. Therefore, we have implemented new client software that controls bots based on the Multimodal Presentation Markup Language 3D (MPML3D), a highly expressive XML-based scripting language for controlling the verbal and nonverbal behavior of interacting animated agents. Second, the paper compares Second Life and OpenSimulator platforms and discusses the merits and limitations of each from the perspective of agent control. Here, we also conducted a small study that compares the network performance of both platforms.

  18. Emerging Applications of Bedside 3D Printing in Plastic Surgery

    Michael P Chae

    2015-06-01

    Modern imaging techniques are an essential component of preoperative planning in plastic and reconstructive surgery. However, conventional modalities, including three-dimensional (3D) reconstructions, are limited by their representation on 2D workstations. 3D printing has been embraced by early adopters to produce medical imaging-guided 3D printed biomodels that facilitate various aspects of clinical practice. The cost and size of 3D printers have rapidly decreased over the past decade in parallel with the expiration of key 3D printing patents. With increasing accessibility, investigators are now able to convert standard imaging data into Computer Aided Design (CAD) files using various 3D reconstruction software packages and ultimately fabricate 3D models using 3D printing techniques, such as stereolithography (SLA), multijet modeling (MJM), selective laser sintering (SLS), binder jet technique (BJT), and fused deposition modeling (FDM). Significant improvements in clinical imaging and user-friendly 3D software have permitted computer-aided 3D modeling of anatomical structures and implants without outsourcing in many cases. These developments offer immense potential for the application of 3D printing at the bedside for a variety of clinical applications. However, many clinicians have questioned whether the cost-to-benefit ratio justifies its ongoing use. In this review the existing uses of 3D printing in plastic surgery practice, spanning the spectrum from templates for facial transplantation surgery through to the formation of bespoke craniofacial implants to optimize post-operative aesthetics, are described. Furthermore, we discuss the potential of 3D printing to become an essential office-based tool in plastic surgery to assist in preoperative planning, patient and surgical trainee education, and the development of intraoperative guidance tools and patient-specific prosthetics in everyday surgical practice.

  19. Emerging Applications of Bedside 3D Printing in Plastic Surgery.

    Chae, Michael P; Rozen, Warren M; McMenamin, Paul G; Findlay, Michael W; Spychal, Robert T; Hunter-Smith, David J

    2015-01-01

    Modern imaging techniques are an essential component of preoperative planning in plastic and reconstructive surgery. However, conventional modalities, including three-dimensional (3D) reconstructions, are limited by their representation on 2D workstations. 3D printing, also known as rapid prototyping or additive manufacturing, was once the province of industry to fabricate models from a computer-aided design (CAD) in a layer-by-layer manner. The early adopters in clinical practice have embraced the medical imaging-guided 3D-printed biomodels for their ability to provide tactile feedback and a superior appreciation of the visuospatial relationship between anatomical structures. With increasing accessibility, investigators are able to convert standard imaging data into a CAD file using various 3D reconstruction software packages and ultimately fabricate 3D models using 3D printing techniques, such as stereolithography, multijet modeling, selective laser sintering, binder jet technique, and fused deposition modeling. However, many clinicians have questioned whether the cost-to-benefit ratio justifies its ongoing use. The cost and size of 3D printers have rapidly decreased over the past decade in parallel with the expiration of key 3D printing patents. Significant improvements in clinical imaging and user-friendly 3D software have permitted computer-aided 3D modeling of anatomical structures and implants without outsourcing in many cases. These developments offer immense potential for the application of 3D printing at the bedside for a variety of clinical applications. In this review, existing uses of 3D printing in plastic surgery practice spanning the spectrum from templates for facial transplantation surgery through to the formation of bespoke craniofacial implants to optimize post-operative esthetics are described. Furthermore, we discuss the potential of 3D printing to become an essential office-based tool in plastic surgery to assist in preoperative planning, developing

  20. From 3D view to 3D print

    Dima, M.; Farisato, G.; Bergomi, M.; Viotto, V.; Magrin, D.; Greggio, D.; Farinato, J.; Marafatto, L.; Ragazzoni, R.; Piazza, D.

    2014-08-01

    In the last few years, 3D printing has become increasingly popular and is now used in many fields, from manufacturing to industrial design, architecture, medical support and aerospace. 3D printing is an evolution of two-dimensional printing that produces a solid object from a 3D model created with 3D modelling software. The final product is obtained using an additive process, in which successive layers of material are laid down one over the other. A 3D printer makes it simple to produce very complex shapes that would be quite difficult to produce with dedicated conventional facilities. Because the 3D print is built up by superposing one layer on another, it does not need any particular workflow: it is sufficient to draw the model and send it to print. Many different kinds of 3D printers exist, distinguished by the technology and material used for layer deposition. A common printing material is ABS plastic, a light and rigid thermoplastic polymer whose mechanical properties make it widely used in several fields, such as pipe production and car interior manufacturing. I used this technology to create a 1:1 scale model of the telescope that is the hardware core of the small space mission CHEOPS (CHaracterising ExOPlanets Satellite) by ESA, which aims to characterize exoplanets via transit observations. The telescope has a Ritchey-Chrétien configuration with a 30 cm aperture, and the launch is foreseen in 2017. In this paper, I present the different phases in the realization of such a model, focusing on the pros and cons of this kind of technology. For example, because of the finite printable volume (10×10×12 inches in the x, y and z directions respectively), it was necessary to split the largest parts of the instrument into smaller components to be then reassembled and post-processed. A further issue is the resolution of the printed material, which is expressed in terms of layers

  1. 3D modeling and 3D animations of a bio power plant

    Hiltula, Tytti

    2014-01-01

    In this thesis, a 3D model and animations were produced from the drawings of a bio power plant. The purpose of the work was to complete image and video materials intended for marketing use by Recwell Oy. The work covered the basics and starting points of 3D modeling as well as the creation of animations. The work was carried out entirely in AutoCAD, and the program's user manuals were also studied carefully during the work. Major deficiencies in the dimensioning of the drawings were noticed already at an early stage, ...

  2. When fast atom diffraction turns 3D

    Zugarramurdi, Asier; Borisov, Andrei G., E-mail: andrei.borissov@u-psud.fr

    2013-12-15

    Fast atom diffraction at surfaces (FAD) in grazing incidence geometry is characterized by slow motion in the direction perpendicular to the surface and fast motion parallel to the surface plane along a low-index direction. It is established experimentally that, for typical surfaces, FAD reveals 2D diffraction patterns associated with exchange of the reciprocal lattice vector perpendicular to the direction of fast motion. The reciprocal lattice vector exchange along the direction of fast motion is negligible. The usual approximation made in describing the experimental data is then to assume that the effective potential leading to the diffraction results from averaging the 3D surface potential along the atomic strings forming the axial channel. In this work we use full quantum wave packet propagation calculations to study theoretically the possibility of observing 3D diffraction in FAD experiments. We show that for surfaces with a large unit cell, as can be the case for reconstructed or vicinal surfaces, 3D diffraction can be observed. The reciprocal lattice vector exchange along the direction of fast motion leads to several Laue circles in the diffraction pattern.

  3. PLOT3D/AMES, UNIX SUPERCOMPUTER AND SGI IRIS VERSION (WITHOUT TURB3D)

    Buning, P.

    1994-01-01

    PLOT3D is an interactive graphics program designed to help scientists visualize computational fluid dynamics (CFD) grids and solutions. Today, supercomputers and CFD algorithms can provide scientists with simulations of such highly complex phenomena that obtaining an understanding of the simulations has become a major problem. Tools which help the scientist visualize the simulations can be of tremendous aid. PLOT3D/AMES offers more functions and features, and has been adapted for more types of computers than any other CFD graphics program. Version 3.6b+ is supported for five computers and graphic libraries. Using PLOT3D, CFD physicists can view their computational models from any angle, observing the physics of problems and the quality of solutions. As an aid in designing aircraft, for example, PLOT3D's interactive computer graphics can show vortices, temperature, reverse flow, pressure, and dozens of other characteristics of air flow during flight. As critical areas become obvious, they can easily be studied more closely using a finer grid. PLOT3D is part of a computational fluid dynamics software cycle. First, a program such as 3DGRAPE (ARC-12620) helps the scientist generate computational grids to model an object and its surrounding space. Once the grids have been designed and parameters such as the angle of attack, Mach number, and Reynolds number have been specified, a "flow-solver" program such as INS3D (ARC-11794 or COS-10019) solves the system of equations governing fluid flow, usually on a supercomputer. Grids sometimes have as many as two million points, and the "flow-solver" produces a solution file which contains density, x- y- and z-momentum, and stagnation energy for each grid point. With such a solution file and a grid file containing up to 50 grids as input, PLOT3D can calculate and graphically display any one of 74 functions, including shock waves, surface pressure, velocity vectors, and particle traces. PLOT3D's 74 functions are organized into
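
    The solution variables listed above (density, three momentum components, stagnation energy per grid point) are enough to derive other flow quantities. As a hedged illustration of what one such derived function could look like, the sketch below computes static pressure and Mach number with the standard ideal-gas relations; the function names and the default ratio of specific heats (1.4) are assumptions, and this is not PLOT3D source code.

```python
import math


def pressure(rho, mx, my, mz, e, gamma=1.4):
    """Static pressure from conserved variables, assuming an ideal gas.

    rho: density; mx, my, mz: momentum components; e: stagnation (total)
    energy per unit volume. gamma = 1.4 is an assumed value for air.
    """
    kinetic = 0.5 * (mx * mx + my * my + mz * mz) / rho
    return (gamma - 1.0) * (e - kinetic)


def mach_number(rho, mx, my, mz, e, gamma=1.4):
    """Local Mach number: flow speed divided by the speed of sound."""
    speed = math.sqrt(mx * mx + my * my + mz * mz) / rho
    sound_speed = math.sqrt(gamma * pressure(rho, mx, my, mz, e, gamma) / rho)
    return speed / sound_speed


if __name__ == "__main__":
    # One hypothetical grid point in nondimensional units.
    point = dict(rho=1.0, mx=0.5, my=0.0, mz=0.0, e=2.0)
    print(pressure(**point), mach_number(**point))
```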

  4. PLOT3D/AMES, UNIX SUPERCOMPUTER AND SGI IRIS VERSION (WITH TURB3D)

    Buning, P.

    1994-01-01

    PLOT3D is an interactive graphics program designed to help scientists visualize computational fluid dynamics (CFD) grids and solutions. Today, supercomputers and CFD algorithms can provide scientists with simulations of such highly complex phenomena that obtaining an understanding of the simulations has become a major problem. Tools which help the scientist visualize the simulations can be of tremendous aid. PLOT3D/AMES offers more functions and features, and has been adapted for more types of computers than any other CFD graphics program. Version 3.6b+ is supported for five computers and graphic libraries. Using PLOT3D, CFD physicists can view their computational models from any angle, observing the physics of problems and the quality of solutions. As an aid in designing aircraft, for example, PLOT3D's interactive computer graphics can show vortices, temperature, reverse flow, pressure, and dozens of other characteristics of air flow during flight. As critical areas become obvious, they can easily be studied more closely using a finer grid. PLOT3D is part of a computational fluid dynamics software cycle. First, a program such as 3DGRAPE (ARC-12620) helps the scientist generate computational grids to model an object and its surrounding space. Once the grids have been designed and parameters such as the angle of attack, Mach number, and Reynolds number have been specified, a "flow-solver" program such as INS3D (ARC-11794 or COS-10019) solves the system of equations governing fluid flow, usually on a supercomputer. Grids sometimes have as many as two million points, and the "flow-solver" produces a solution file which contains density, x- y- and z-momentum, and stagnation energy for each grid point. With such a solution file and a grid file containing up to 50 grids as input, PLOT3D can calculate and graphically display any one of 74 functions, including shock waves, surface pressure, velocity vectors, and particle traces. PLOT3D's 74 functions are organized into

  5. What is "the patient perspective" in patient engagement programs? Implicit logics and parallels to feminist theories.

    Rowland, Paula; McMillan, Sarah; McGillicuddy, Patti; Richards, Joy

    2017-01-01

    Public and patient involvement (PPI) in health care may refer to many different processes, ranging from participating in decision-making about one's own care to participating in health services research, health policy development, or organizational reforms. Across these many forms of public and patient involvement, the conceptual and theoretical underpinnings remain poorly articulated. Instead, most public and patient involvement programs rely on policy initiatives as their conceptual frameworks. This lack of conceptual clarity participates in dilemmas of program design, implementation, and evaluation. This study contributes to the development of theoretical understandings of public and patient involvement. In particular, we focus on the deployment of patient engagement programs within health service organizations. To develop a deeper understanding of the conceptual underpinnings of these programs, we examined the concept of "the patient perspective" as used by patient engagement practitioners and participants. Specifically, we focused on the way this phrase was used in the singular: "the" patient perspective or "the" patient voice. From qualitative analysis of interviews with 20 patient advisers and 6 staff members within a large urban health network in Canada, we argue that "the patient perspective" is referred to as a particular kind of situated knowledge, specifically an embodied knowledge of vulnerability. We draw parallels between this logic of patient perspective and the logic of early feminist theory, including the concepts of standpoint theory and strong objectivity. We suggest that champions of patient engagement may learn much from the way feminist theorists have constructed their arguments and addressed critique.

  6. 3D VISUALIZATION BY RAYTRACING IMAGE SYNTHESIS ON GPU

    Al-Oraiqat Anas M.

    2016-06-01

    This paper presents an implementation of an approach to spatial 3D stereo visualization of 3D images using a parallel graphics processing unit (GPU). Experiments in synthesizing images of a 3D scene by ray tracing on a GPU with Compute Unified Device Architecture (CUDA) showed that approximately 60% of the time is spent solving the computational problem, while a substantial part of the time (40%) is spent transferring data between the central processing unit and the GPU for calculations and organizing the visualization process. A study of how increasing the size of the GPU network affects the speed of calculations showed the importance of correctly specifying the structure of the parallel computing network and the general parallelization mechanism.

  7. YouDash3D: exploring stereoscopic 3D gaming for 3D movie theaters

    Schild, Jonas; Seele, Sven; Masuch, Maic

    2012-03-01

    Along with the success of the digitally revived stereoscopic cinema, events beyond 3D movies, such as interactive 3D games, become attractive for movie theater operators. In this paper, we present a case that explores possible challenges and solutions for interactive 3D games to be played by a movie theater audience. We analyze the setting and showcase current issues related to lighting and interaction. Our second focus is to provide gameplay mechanics that make special use of stereoscopy, especially depth-based game design. Based on these results, we present YouDash3D, a game prototype that explores public stereoscopic gameplay in a reduced kiosk setup. It features a live 3D HD video stream from a professional stereo camera rig, rendered in a real-time game scene. We use the effect to place the stereoscopic effigies of players into the digital game. The game showcases how stereoscopic vision can provide for a novel depth-based game mechanic. Projected trigger zones and distributed clusters of the audience video allow for easy adaptation to larger audiences and 3D movie theater gaming.

  8. Uncertainty Analysis of RELAP5-3D

    Alexandra E Gertman; Dr. George L Mesina

    2012-07-01

    As world-wide energy consumption continues to increase, so does the demand for the use of alternative energy sources, such as Nuclear Energy. Nuclear Power Plants currently supply over 370 gigawatts of electricity, and more than 60 new nuclear reactors have been commissioned by 15 different countries. The primary concern for Nuclear Power Plant operation and licensing has been safety. The safety of the operation of Nuclear Power Plants is no simple matter; it involves the training of operators, design of the reactor, as well as equipment and design upgrades throughout the lifetime of the reactor, etc. To safely design, operate, and understand nuclear power plants, industry and government alike have relied upon the use of best-estimate simulation codes, which allow for an accurate model of any given plant to be created with well-defined margins of safety. The most widely used of these best-estimate simulation codes in the Nuclear Power industry is RELAP5-3D. Our project focused on improving the modeling capabilities of RELAP5-3D by developing uncertainty estimates for its calculations. This work involved analyzing high, medium, and low ranked phenomena from an INL PIRT on a small break Loss-Of-Coolant Accident as well as an analysis of a large break Loss-Of-Coolant Accident. Statistical analyses were performed using correlation coefficients. To perform the studies, computer programs were written that modify a template RELAP5 input deck to produce one deck for each combination of key input parameters. Python scripting enabled the running of the generated input files with RELAP5-3D on INL’s massively parallel cluster system. Data from the studies were collected and analyzed with SAS. A summary of the results of our studies is presented.
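
    The deck-generation step described above (one input deck per combination of key input parameters, produced from a template) can be sketched in a few lines of Python. Everything specific here is hypothetical: the template text is not real RELAP5 input syntax, and the parameter names and values are invented for illustration; the authors' actual scripts are not shown in the abstract.

```python
import itertools
from string import Template

# Hypothetical template; the placeholders stand in for key input parameters.
DECK_TEMPLATE = Template(
    "* illustrative deck, not real RELAP5 syntax\n"
    "break_area = $break_area\n"
    "initial_pressure = $initial_pressure\n"
)


def generate_decks(template, parameter_values):
    """Return {label: deck_text} with one deck per parameter combination."""
    names = sorted(parameter_values)
    decks = {}
    for combo in itertools.product(*(parameter_values[n] for n in names)):
        settings = dict(zip(names, combo))
        label = "_".join(f"{name}-{value}" for name, value in settings.items())
        decks[label] = template.substitute(settings)
    return decks


if __name__ == "__main__":
    study = {
        "break_area": [0.01, 0.02],               # assumed values
        "initial_pressure": [15.0, 15.5, 16.0],   # assumed values
    }
    for label, text in generate_decks(DECK_TEMPLATE, study).items():
        print(label)
        print(text)
```

    A full factorial design like this grows multiplicatively with the number of parameters and levels, which is consistent with the abstract's mention of running the generated decks on a massively parallel cluster.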

  9. 3D future internet media

    Dagiuklas, Tasos

    2014-01-01

    This book describes recent innovations in 3D media and technologies, with coverage of 3D media capturing, processing, encoding, and adaptation, networking aspects for 3D Media, and quality of user experience (QoE). The main contributions are based on the results of the FP7 European Project ROMEO, which focuses on new methods for the compression and delivery of 3D multi-view video and spatial audio, as well as the optimization of networking and compression jointly across the Future Internet (www.ict-romeo.eu). The delivery of 3D media to individual users remains a highly challenging problem due to the large amount of data involved, diverse network characteristics and user terminal requirements, as well as the user’s context such as their preferences and location. As the number of visual views increases, current systems will struggle to meet the demanding requirements in terms of delivery of constant video quality to both fixed and mobile users. ROMEO will design and develop hybrid-networking solutions that co...

  10. Material-driven 3D digital design

    Hansen, Flemming Tvede

    2010-01-01

    The purpose of the research project is, first, to support the ceramicist in working experimentally with digital design and, second, to contribute to an interdisciplinary discourse on the use of digital design. The research project focuses on 3D design and thereby on 3D digital...... design and Rapid Prototyping (RP). RP is a collective term for a range of techniques that make it possible to transfer the digital form to a 3D physical form. The research project concentrates on two overarching research questions. The first concerns how knowledge and experience within the ceramic...... field can be utilized in relation to 3D digital design. The second concerns what such an approach can contribute, and how it can be utilized in a dynamic interplay with the ceramic material in the design of 3D ceramic artefacts. Material-driven design is characterized by a...

  11. Novel 3D media technologies

    Dagiuklas, Tasos

    2015-01-01

    This book describes recent innovations in 3D media and technologies, with coverage of 3D media capturing, processing, encoding, and adaptation, networking aspects for 3D Media, and quality of user experience (QoE). The contributions are based on the results of the FP7 European Project ROMEO, which focuses on new methods for the compression and delivery of 3D multi-view video and spatial audio, as well as the optimization of networking and compression jointly across the future Internet. The delivery of 3D media to individual users remains a highly challenging problem due to the large amount of data involved, diverse network characteristics and user terminal requirements, as well as the user’s context such as their preferences and location. As the number of visual views increases, current systems will struggle to meet the demanding requirements in terms of delivery of consistent video quality to fixed and mobile users. ROMEO will present hybrid networking solutions that combine the DVB-T2 and DVB-NGH broadcas...

  12. Speaking Volumes About 3-D

    2002-01-01

    In 1999, Genex submitted a proposal to Stennis Space Center for a volumetric 3-D display technique that would provide multiple users with a 360-degree perspective to simultaneously view and analyze 3-D data. The futuristic capabilities of the VolumeViewer(R) have offered tremendous benefits to commercial users in the fields of medicine and surgery, air traffic control, pilot training and education, computer-aided design/computer-aided manufacturing, and military/battlefield management. The technology has also helped NASA to better analyze and assess the various data collected by its satellite and spacecraft sensors. Genex capitalized on its success with Stennis by introducing two separate products to the commercial market that incorporate key elements of the 3-D display technology designed under an SBIR contract. The company's Rainbow 3D(R) imaging camera is a novel, three-dimensional surface profile measurement system that can obtain a full-frame 3-D image in less than 1 second. The third product is the 360-degree OmniEye(R) video system. Ideal for intrusion detection, surveillance, and situation management, this unique camera system offers a continuous, panoramic view of a scene in real time.

  13. Numerical Simulation of 3-D Wave Crests

    YU Dingyong; ZHANG Hanyuan

    2003-01-01

    A clear definition of a 3-D wave crest and a description of the procedure for detecting the boundary of a wave crest are presented in the paper. Using random wave theory and a directional wave spectrum, a MATLAB-based program is designed to simulate random wave crests for various directional spectral conditions in deep water. Statistics of wave crests with different directional spreading parameters and different directional functions are obtained and discussed.

  14. A Prototypical 3D Graphical Visualizer for Object-Oriented Systems

    1996-01-01

    This paper describes a framework for visualizing object-oriented systems within a 3D interactive environment. The 3D visualizer represents the structure of a program as a Cylinder Net that simultaneously specifies two relationships between objects within 3D virtual space. Additionally, it represents additional relationships on demand when objects are moved into local focus. The 3D visualizer is implemented using a 3D graphics toolkit, TOAST, that implements 3D widgets and 3D graphics to ease the programming task for 3D visualization.

  15. PLOT3D/AMES, SGI IRIS VERSION (WITH TURB3D)

    Buning, P.

    1994-01-01

    PLOT3D is an interactive graphics program designed to help scientists visualize computational fluid dynamics (CFD) grids and solutions. Today, supercomputers and CFD algorithms can provide scientists with simulations of such highly complex phenomena that obtaining an understanding of the simulations has become a major problem. Tools which help the scientist visualize the simulations can be of tremendous aid. PLOT3D/AMES offers more functions and features, and has been adapted for more types of computers than any other CFD graphics program. Version 3.6b+ is supported for five computers and graphic libraries. Using PLOT3D, CFD physicists can view their computational models from any angle, observing the physics of problems and the quality of solutions. As an aid in designing aircraft, for example, PLOT3D's interactive computer graphics can show vortices, temperature, reverse flow, pressure, and dozens of other characteristics of air flow during flight. As critical areas become obvious, they can easily be studied more closely using a finer grid. PLOT3D is part of a computational fluid dynamics software cycle. First, a program such as 3DGRAPE (ARC-12620) helps the scientist generate computational grids to model an object and its surrounding space. Once the grids have been designed and parameters such as the angle of attack, Mach number, and Reynolds number have been specified, a "flow-solver" program such as INS3D (ARC-11794 or COS-10019) solves the system of equations governing fluid flow, usually on a supercomputer. Grids sometimes have as many as two million points, and the "flow-solver" produces a solution file which contains density, x- y- and z-momentum, and stagnation energy for each grid point. With such a solution file and a grid file containing up to 50 grids as input, PLOT3D can calculate and graphically display any one of 74 functions, including shock waves, surface pressure, velocity vectors, and particle traces. PLOT3D's 74 functions are organized into

  16. PLOT3D/AMES, SGI IRIS VERSION (WITHOUT TURB3D)

    Buning, P.

    1994-01-01

    PLOT3D is an interactive graphics program designed to help scientists visualize computational fluid dynamics (CFD) grids and solutions. Today, supercomputers and CFD algorithms can provide scientists with simulations of such highly complex phenomena that obtaining an understanding of the simulations has become a major problem. Tools which help the scientist visualize the simulations can be of tremendous aid. PLOT3D/AMES offers more functions and features, and has been adapted for more types of computers than any other CFD graphics program. Version 3.6b+ is supported for five computers and graphic libraries. Using PLOT3D, CFD physicists can view their computational models from any angle, observing the physics of problems and the quality of solutions. As an aid in designing aircraft, for example, PLOT3D's interactive computer graphics can show vortices, temperature, reverse flow, pressure, and dozens of other characteristics of air flow during flight. As critical areas become obvious, they can easily be studied more closely using a finer grid. PLOT3D is part of a computational fluid dynamics software cycle. First, a program such as 3DGRAPE (ARC-12620) helps the scientist generate computational grids to model an object and its surrounding space. Once the grids have been designed and parameters such as the angle of attack, Mach number, and Reynolds number have been specified, a "flow-solver" program such as INS3D (ARC-11794 or COS-10019) solves the system of equations governing fluid flow, usually on a supercomputer. Grids sometimes have as many as two million points, and the "flow-solver" produces a solution file which contains density, x- y- and z-momentum, and stagnation energy for each grid point. With such a solution file and a grid file containing up to 50 grids as input, PLOT3D can calculate and graphically display any one of 74 functions, including shock waves, surface pressure, velocity vectors, and particle traces. PLOT3D's 74 functions are organized into

  17. DOE SBIR Phase-1 Report on Hybrid CPU-GPU Parallel Development of the Eulerian-Lagrangian Barracuda Multiphase Program

    Dr. Dale M. Snider

    2011-02-28

    This report gives the results of the Phase-1 work on demonstrating greater than 10x speedup of the Barracuda computer program using parallel methods and GPUs (graphics processing units, also called general-purpose graphics processing units). Phase-1 demonstrated a 12x speedup on a typical Barracuda function using the GPU. The test case used about 5 million particles and 250,000 Eulerian grid cells. The relative speedup, compared to a single CPU, increases with the number of particles, giving greater than 12x speedup. Phase-1 work provided a path for data structure modifications that give good parallel performance while keeping a friendly environment for new physics development and code maintenance. The data structure changes will be implemented in Phase-2. Phase-1 laid the groundwork for the complete parallelization of Barracuda in Phase-2, with the caveat that the parallel programming practices implemented in Phase-1 give immediate speedup in the current serial Barracuda code. The Phase-1 tasks were completed successfully, laying the framework for Phase-2. The detailed results of Phase-1 are within this document. In general, the speedup of one function would be expected to be higher than the speedup of the entire code because of I/O functions and communication between the algorithms. However, because one of the most difficult Barracuda algorithms was parallelized in Phase-1 and because advanced parallelization methods and proposed parallelization optimization techniques identified in Phase-1 will be used in Phase-2, an overall Barracuda code speedup (relative to a single CPU) is expected to be greater than 10x. This means that a job which takes 30 days to complete will be done in 3 days. Tasks completed in Phase-1 are: Task 1: Profile the entire Barracuda code and select which subroutines are to be parallelized (See Section Choosing a Function to Accelerate) Task 2: Select a GPU consultant company and
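
    The abstract's remark that one function's speedup normally exceeds the whole-code speedup is the usual Amdahl's-law argument. The sketch below is only an illustration of that arithmetic and is not taken from the report: the function names and the accelerated-fraction values are hypothetical, and the 12x and 10x figures are reused from the abstract purely as inputs.

```python
def overall_speedup(accelerated_fraction, kernel_speedup):
    """Amdahl's law: whole-code speedup when only part of the runtime is accelerated.

    accelerated_fraction: share of the original runtime spent in the accelerated
                          part (hypothetical input, between 0 and 1).
    kernel_speedup:       speedup of that part alone (e.g. 12 for a 12x kernel).
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if kernel_speedup <= 0:
        raise ValueError("kernel_speedup must be positive")
    serial_part = 1.0 - accelerated_fraction
    return 1.0 / (serial_part + accelerated_fraction / kernel_speedup)


def required_fraction(target_speedup, kernel_speedup):
    """Share of runtime that must be accelerated to reach target_speedup."""
    if not 1.0 <= target_speedup <= kernel_speedup:
        raise ValueError("target must lie between 1 and the kernel speedup")
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / kernel_speedup)


if __name__ == "__main__":
    # If 95% of the runtime were accelerated 12x, the whole code would gain less than 12x.
    print(overall_speedup(0.95, 12.0))
    # Share of runtime that would need a 12x speedup to reach 10x overall.
    print(required_fraction(10.0, 12.0))
    # A 10x overall speedup turns a 30-day job into a 3-day job.
    print(30.0 / 10.0)
```

    Under these assumptions the unaccelerated remainder (I/O, communication) bounds the overall gain, which is why the whole-code figure depends on how much of the runtime is parallelized and not only on the speedup of a single function.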

  18. Modification of 3D milling machine to 3D printer

    Halamíček, Lukáš

    2015-01-01

    This thesis deals with the conversion of an engraving milling machine into a 3D printer. The first part covers the available 3D printing technologies and how they could be used in the conversion. Suitable components for the conversion are then described and selected. The next part implements control of the heated bed, the nozzle heating, and the filament feed using TwinCat software from Beckhoff on an industrial PC. The outcome of the work should be a working 3D printer.

  19. Aspects of defects in 3d-3d correspondence

    Gang, Dongmin; Kim, Nakwoo; Romo, Mauricio; Yamazaki, Masahito

    2016-10-01

    In this paper we study supersymmetric co-dimension 2 and 4 defects in the compactification of the 6d (2, 0) theory of type $A_{N-1}$ on a 3-manifold $M$. The so-called 3d-3d correspondence is a relation between complexified Chern-Simons theory (with gauge group $SL(N,\mathbb{C})$) on $M$ and a 3d $\mathcal{N}=2$ theory $T_N[M]$. We study this correspondence in the presence of supersymmetric defects, which are knots/links inside the 3-manifold. Our study employs a number of different methods: state-integral models for complex Chern-Simons theory, cluster algebra techniques, domain wall theory $T[SU(N)]$, 5d $\mathcal{N}=2$ SYM, and also supergravity analysis through holography. These methods are complementary and we find agreement between them. In some cases the results lead to highly non-trivial predictions on the partition function. Our discussion includes a general expression for the cluster partition function, which can be used to compute in the presence of maximal and certain class of non-maximal punctures when $N > 2$. We also highlight the non-Abelian description of the 3d $\mathcal{N}=2$ $T_N[M]$ theory with defect included, when such a description is available. This paper is a companion to our shorter paper [1], which summarizes our main results.

  20. Streamlining of the RELAP5-3D Code

    Mesina, George L; Hykes, Joshua; Guillen, Donna Post

    2007-11-01

    RELAP5-3D is widely used by the nuclear community to simulate general thermal hydraulic systems and has proven to be so versatile that the spectrum of transient two-phase problems that can be analyzed has increased substantially over time. To accommodate the many new types of problems that are analyzed by RELAP5-3D, both the physics and numerical methods of the code have been continuously improved. In the area of computational methods and mathematical techniques, many upgrades and improvements have been made to decrease code run time and increase solution accuracy. These include vectorization, parallelization, use of improved equation solvers for thermal hydraulics and neutron kinetics, and incorporation of improved library utilities. In the area of applied nuclear engineering, expanded capabilities include boron and level tracking models, radiation/conduction enclosure model, feedwater heater and compressor components, fluids and corresponding correlations for modeling Generation IV reactor designs, and coupling to computational fluid dynamics solvers. Ongoing and proposed future developments include improvements to the two-phase pump model, conversion to FORTRAN 90, and coupling to more computer programs. This paper summarizes the general improvements made to RELAP5-3D, with an emphasis on streamlining the code infrastructure for improved maintenance and development. With all these past, present and planned developments, it is necessary to modify the code infrastructure to incorporate modifications in a consistent and maintainable manner. Modifying a complex code such as RELAP5-3D to incorporate new models, upgrade numerics, and optimize existing code becomes more difficult as the code grows larger. The difficulty of this, as well as the chance of introducing errors, is significantly reduced when the code is structured. To streamline the code into a structured program, a commercial restructuring tool, FOR_STRUCT, was applied to the RELAP5-3D source files. The

  1. 3-D Vector Flow Imaging

    Holbek, Simon

    studies and in vivo. Phantom measurements are compared with their corresponding reference value, whereas the in vivo measurement is validated against the current gold standard for non-invasive blood velocity estimates, based on magnetic resonance imaging (MRI). The study concludes that a high precision......, if this significant reduction in the element count can still provide precise and robust 3-D vector flow estimates in a plane. The study concludes that the RC array is capable of estimating precise 3-D vector flow both in a plane and in a volume, despite the low channel count. However, some inherent new challenges......For the last decade, the field of ultrasonic vector flow imaging has received increasing attention, as the technique offers a variety of new applications for screening and diagnostics of cardiovascular pathologies. The main purpose of this PhD project was therefore to advance the field of 3-D...

  2. 3D vector flow imaging

    Pihl, Michael Johannes

    The main purpose of this PhD project is to develop an ultrasonic method for 3D vector flow imaging. The motivation is to advance the field of velocity estimation in ultrasound, which plays an important role in the clinic. The velocity of blood has components in all three spatial dimensions, yet...... conventional methods can estimate only the axial component. Several approaches for 3D vector velocity estimation have been suggested, but none of these methods have so far produced convincing in vivo results nor have they been adopted by commercial manufacturers. The basis for this project is the Transverse...... on the TO fields are suggested. They can be used to optimize the TO method. In the third part, a TO method for 3D vector velocity estimation is proposed. It employs a 2D phased array transducer and decouples the velocity estimation into three velocity components, which are estimated simultaneously based on 5...

  3. Markerless 3D Face Tracking

    Walder, Christian; Breidt, Martin; Bulthoff, Heinrich

    2009-01-01

    We present a novel algorithm for the markerless tracking of deforming surfaces such as faces. We acquire a sequence of 3D scans along with color images at 40Hz. The data is then represented by implicit surface and color functions, using a novel partition-of-unity type method of efficiently...... combining local regressors using nearest neighbor searches. Both these functions act on the 4D space of 3D plus time, and use temporal information to handle the noise in individual scans. After interactive registration of a template mesh to the first frame, it is then automatically deformed to track...... the scanned surface, using the variation of both shape and color as features in a dynamic energy minimization problem. Our prototype system yields high-quality animated 3D models in correspondence, at a rate of approximately twenty seconds per timestep. Tracking results for faces and other objects...

  4. 3D Printed Bionic Nanodevices.

    Kong, Yong Lin; Gupta, Maneesh K; Johnson, Blake N; McAlpine, Michael C

    2016-06-01

    The ability to three-dimensionally interweave biological and functional materials could enable the creation of bionic devices possessing unique and compelling geometries, properties, and functionalities. Indeed, interfacing high performance active devices with biology could impact a variety of fields, including regenerative bioelectronic medicines, smart prosthetics, medical robotics, and human-machine interfaces. Biology, from the molecular scale of DNA and proteins, to the macroscopic scale of tissues and organs, is three-dimensional, often soft and stretchable, and temperature sensitive. This renders most biological platforms incompatible with the fabrication and materials processing methods that have been developed and optimized for functional electronics, which are typically planar, rigid and brittle. A number of strategies have been developed to overcome these dichotomies. One particularly novel approach is the use of extrusion-based multi-material 3D printing, which is an additive manufacturing technology that offers a freeform fabrication strategy. This approach addresses the dichotomies presented above by (1) using 3D printing and imaging for customized, hierarchical, and interwoven device architectures; (2) employing nanotechnology as an enabling route for introducing high performance materials, with the potential for exhibiting properties not found in the bulk; and (3) 3D printing a range of soft and nanoscale materials to enable the integration of a diverse palette of high quality functional nanomaterials with biology. Further, 3D printing is a multi-scale platform, allowing for the incorporation of functional nanoscale inks, the printing of microscale features, and ultimately the creation of macroscale devices. This blending of 3D printing, novel nanomaterial properties, and 'living' platforms may enable next-generation bionic systems. In this review, we highlight this synergistic integration of the unique properties of nanomaterials with the

  5. Microfluidic 3D Helix Mixers

    Georgette B. Salieb-Beugelaar

    2016-10-01

    Polymeric microfluidic systems are well suited for miniaturized devices with complex functionality, and rapid prototyping methods for 3D microfluidic structures are increasingly used. Mixing at the microscale and performing chemical reactions at the microscale are important applications of such systems and we therefore explored feasibility, mixing characteristics and the ability to control a chemical reaction in helical 3D channels produced by the emerging thread template method. Mixing at the microscale is challenging because channel size reduction for improving solute diffusion comes at the price of a reduced Reynolds number that induces a strictly laminar flow regime and abolishes turbulence that would be desired for improved mixing. Microfluidic 3D helix mixers were rapidly prototyped in polydimethylsiloxane (PDMS) using low-surface energy polymeric threads, twisted to form 2-channel and 3-channel helices. Structure and flow characteristics were assessed experimentally by microscopy, hydraulic measurements and chromogenic reaction, and were modeled by computational fluid dynamics. We found that helical 3D microfluidic systems produced by thread templating allow rapid prototyping and can be used for mixing and for controlled chemical reaction with two or three reaction partners at the microscale. Compared to the conventional T-shaped microfluidic system used as a control device, enhanced mixing and faster chemical reaction were found to occur due to the combination of diffusive mixing in small channels and flow folding due to the 3D helix shape. Thus, microfluidic 3D helix mixers can be rapidly prototyped using the thread template method and are an attractive and competitive method for fluid mixing and chemical reactions at the microscale.

  6. The New Realm of 3-D Vision

    2002-01-01

    Dimension Technologies Inc. developed a line of 2-D/3-D Liquid Crystal Display (LCD) screens, including a 15-inch model priced at consumer levels. DTI's family of flat panel LCD displays, called the Virtual Window(TM), provides real-time 3-D images without the use of glasses, head trackers, helmets, or other viewing aids. Most of the company's initial 3-D display research was funded through NASA's Small Business Innovation Research (SBIR) program. The images on DTI's displays appear to leap off the screen and hang in space. The display accepts input from computers or stereo video sources, and can be switched from 3-D to full-resolution 2-D viewing with the push of a button. The Virtual Window displays have applications in data visualization, medicine, architecture, business, real estate, entertainment, and other research, design, military, and consumer applications. Displays are currently used for computer games, protein analysis, and surgical imaging. The technology greatly benefits the medical field, as surgical simulators are helping to increase the skills of surgical residents. Virtual Window(TM) is a trademark of Dimension Technologies Inc.

  7. 3D reconstruction based on spatial vanishing information

    Yuan Shu; Zheng Tan

    2005-01-01

    An approach for the three-dimensional (3D) reconstruction of architectural scenes from two uncalibrated images is described in this paper. From two views of one architectural structure, three pairs of corresponding vanishing points of three major mutually orthogonal directions can be extracted. The simple but powerful constraints of parallel and orthogonal lines in architectural scenes can be used to calibrate the cameras and to recover the 3D information of the structure. This approach is applied to real images of architectural scenes, and a 3D model of a building in virtual reality modelling language (VRML) format is presented, which illustrates the successful performance of the method.
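
    The vanishing-point extraction mentioned above can be illustrated with standard homogeneous-coordinate geometry. The sketch below is not the authors' method: it only shows the textbook construction in which the image line through two points is their cross product, and the vanishing point of two image lines (projections of parallel 3D lines) is the cross product of those lines. All function names and the sample coordinates are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def line_through(p, q):
    """Homogeneous image line through two pixel points (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(segment_a, segment_b, eps=1e-12):
    """Intersection of the image lines carrying two segments.

    Each segment is a pair of (x, y) points. Returns (x, y), or None when the
    image lines are parallel (the vanishing point lies at infinity).
    """
    la = line_through(*segment_a)
    lb = line_through(*segment_b)
    x, y, w = cross(la, lb)
    if abs(w) < eps:
        return None
    return (x / w, y / w)


if __name__ == "__main__":
    # Two image segments that converge at pixel (1, 1).
    edge_a = ((0.0, 0.0), (2.0, 2.0))
    edge_b = ((0.0, 2.0), (2.0, 0.0))
    print(vanishing_point(edge_a, edge_b))
```

    In practice many noisy line segments per direction would be combined, for example by least squares, rather than intersecting a single pair; that step is outside this sketch.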

  8. The Fortran-P Translator: Towards Automatic Translation of Fortran 77 Programs for Massively Parallel Processors

    Matthew O'keefe

    1995-01-01

    Massively parallel processors (MPPs) hold the promise of extremely high performance that, if realized, could be used to study problems of unprecedented size and complexity. One of the primary stumbling blocks to this promise has been the lack of tools to translate application codes to MPP form. In this article we show how application codes written in a subset of Fortran 77, called Fortran-P, can be translated to achieve good performance on several massively parallel machines. This subset can express codes that are self-similar, where the algorithm applied to the global data domain is also applied to each subdomain. We have found many codes that match the Fortran-P programming style and have converted them using our tools. We believe a self-similar coding style will accomplish what a vectorizable style has accomplished for vector machines by allowing the construction of robust, user-friendly, automatic translation systems that increase programmer productivity and generate fast, efficient code for MPPs.

  9. Mobile and replicated alignment of arrays in data-parallel programs

    Chatterjee, Siddhartha; Gilbert, John R.; Schreiber, Robert

    1993-01-01

    When a data-parallel language like FORTRAN 90 is compiled for a distributed-memory machine, aggregate data objects (such as arrays) are distributed across the processor memories. The mapping determines the amount of residual communication needed to bring operands of parallel operations into alignment with each other. A common approach is to break the mapping into two stages: first, an alignment that maps all the objects to an abstract template, and then a distribution that maps the template to the processors. We solve two facets of the problem of finding alignments that reduce residual communication: we determine alignments that vary in loops, and objects that should have replicated alignments. We show that loop-dependent mobile alignment is sometimes necessary for optimum performance, and we provide algorithms with which a compiler can determine good mobile alignments for objects within do loops. We also identify situations in which replicated alignment is either required by the program itself (via spread operations) or can be used to improve performance. We propose an algorithm based on network flow that determines which objects to replicate so as to minimize the total amount of broadcast communication in replication. This work on mobile and replicated alignment extends our earlier work on determining static alignment.

  10. A Review on Large Scale Graph Processing Using Big Data Based Parallel Programming Models

    Anuraj Mohan

    2017-02-01

    Processing big graphs has become an increasingly essential activity in various fields like engineering, business intelligence and computer science. Social networks and search engines usually generate large graphs that demand sophisticated techniques for social network analysis and web structure mining. Latest trends in graph processing tend towards using Big Data platforms for parallel graph analytics. MapReduce has emerged as a Big Data based programming model for the processing of massively large datasets. Apache Giraph, an open source implementation of Google Pregel, which is based on the Bulk Synchronous Parallel (BSP) model, is used for graph analytics in social networks like Facebook. This proposed work investigates the algorithmic effects of the MapReduce and BSP models on graph problems. The triangle counting problem in graphs is considered as a benchmark, and evaluations are made on the basis of time of computation on the same cluster, scalability in relation to graph and cluster size, resource utilization and the structure of the graph.
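
    For orientation, the triangle counting benchmark named above can be stated as a short sequential routine. This is a minimal single-machine sketch, not a MapReduce or BSP implementation and not the code evaluated in the paper; the function name and the edge-list input format are assumptions made for illustration.

```python
from collections import defaultdict


def count_triangles(edges):
    """Count triangles in an undirected graph given as (u, v) edge pairs.

    Vertex labels must be mutually comparable (e.g. all ints). Self-loops and
    duplicate edges are ignored.
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # a self-loop cannot be part of a triangle
        adjacency[u].add(v)
        adjacency[v].add(u)

    triangles = 0
    for u in list(adjacency):
        for v in adjacency[u]:
            if u < v:
                # Count common neighbours w with u < v < w so that each
                # triangle is counted exactly once.
                triangles += sum(1 for w in adjacency[u] & adjacency[v] if w > v)
    return triangles


if __name__ == "__main__":
    complete_graph_k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(count_triangles(complete_graph_k4))
```

    Distributed formulations typically split the same neighbour-intersection work across mappers or vertex programs, which is where the communication and load-balance differences between models show up.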

  11. Research on Task Parallel Programming Model (任务并行编程模型研究与进展)

    王蕾; 崔慧敏; 陈莉; 冯晓兵

    2013-01-01

    The task parallel programming model is a widely used parallel programming model on multi-core platforms. With the intention of simplifying parallel programming and improving the utilization of multiple cores, this paper provides an introduction to the essential programming interfaces and the supporting mechanisms used in task parallel programming models and discusses issues and the latest achievements from three perspectives: parallelism expression, data management and task scheduling. In the end, some future trends in this area are discussed.

  12. 3D Equilibrium Reconstructions in DIII-D

    Lao, L. L.; Ferraro, N. W.; Strait, E. J.; Turnbull, A. D.; King, J. D.; Hirshman, H. P.; Lazarus, E. A.; Sontag, A. C.; Hanson, J.; Trevisan, G.

    2013-10-01

    Accurate and efficient 3D equilibrium reconstruction is needed in tokamaks for study of 3D magnetic field effects on experimentally reconstructed equilibrium and for analysis of MHD stability experiments with externally imposed magnetic perturbations. A large number of new magnetic probes have been recently installed in DIII-D to improve 3D equilibrium measurements and to facilitate 3D reconstructions. The V3FIT code has been in use in DIII-D to support 3D reconstruction and the new magnetic diagnostic design. V3FIT is based on the 3D equilibrium code VMEC that assumes nested magnetic surfaces. V3FIT uses a pseudo-Newton least-square algorithm to search for the solution vector. In parallel, the EFIT equilibrium reconstruction code is being extended to allow for 3D effects using a perturbation approach based on an expansion of the MHD equations. EFIT uses the cylindrical coordinate system and can include the magnetic island and stochastic effects. Algorithms are being developed to allow EFIT to reconstruct 3D perturbed equilibria directly making use of plasma response to 3D perturbations from the GATO, MARS-F, or M3D-C1 MHD codes. DIII-D 3D reconstruction examples using EFIT and V3FIT and the new 3D magnetic data will be presented. Work supported in part by US DOE under DE-FC02-04ER54698, DE-FG02-95ER54309 and DE-AC05-06OR23100.

  13. An approach to multicore parallelism using functional programming: A case study based on Presburger Arithmetic

    Dung, Phan Anh; Hansen, Michael Reichhardt

    2015-01-01

    platform executing on an 8-core machine. A speedup of approximately 4 was obtained for Cooper’s algorithm and a speedup of approximately 6 was obtained for the exact-shadow part of the Omega Test. The considered procedures are complex, memory-intense algorithms on huge formula trees and the case study...... reveals more generally applicable techniques and guidelines for deriving parallel algorithms from sequential ones in the context of data-intensive tree algorithms. The obtained insights should apply to any strict and impure functional programming language. Furthermore, the results obtained for the exact......-shadow elimination procedure have a wider applicability because they can directly be transferred to the Fourier–Motzkin elimination method....

  14. Structured Parallel Programming: How Informatics Can Help Overcome the Software Dilemma

    Helmar Burkhart

    1996-01-01

    The state-of-the-art programming of parallel computers is far from being successful. The main challenge today is, therefore, the development of techniques and tools that improve programmers' productivity. Programmability, portability, and reusability are key issues to be solved. In this article we shall report on our ongoing efforts in this direction. After a short discussion of the software dilemma found today, we shall present the Basel approach. We shall summarize our algorithm description methodology and discuss the basic concepts of the proposed skeleton language. An algorithmic example and comments on implementation aspects will explain our work in more detail. We shall summarize the current state of the implementation and conclude with a discussion of related work.

  15. Parallel conjugate gradient: effects of ordering strategies, programming paradigms, and architectural platforms

    Oliker, L.; Li, X.; Heber, G.; Biswas, R.

    2000-05-01

    The Conjugate Gradient (CG) algorithm is perhaps the best-known iterative technique to solve sparse linear systems that are symmetric and positive definite. A sparse matrix-vector multiply (SPMV) usually accounts for most of the floating-point operations within a CG iteration. In this paper, we investigate the effects of various ordering and partitioning strategies on the performance of parallel CG and SPMV using different programming paradigms and architectures. Results show that for this class of applications, ordering significantly improves overall performance, that cache reuse may be more important than reducing communication, and that it is possible to achieve message passing performance using shared memory constructs through careful data ordering and distribution. However, a multithreaded implementation of CG on the Tera MTA does not require special ordering or partitioning to obtain high efficiency and scalability.
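
    To make the role of the sparse matrix-vector multiply concrete, here is a minimal serial sketch of unpreconditioned CG with the matrix stored in compressed sparse row (CSR) form. It is not the parallel code studied in the paper; the function names, the CSR argument layout, and the tolerance are assumptions chosen for illustration.

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Sparse matrix-vector product y = A x with A in CSR form."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        total = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            total += values[k] * x[col_idx[k]]
        y[i] = total
    return y


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def conjugate_gradient(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A stored in CSR form."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x for the initial guess x = 0
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        ap = spmv_csr(values, col_idx, row_ptr, p)   # the one SPMV per iteration
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


if __name__ == "__main__":
    # A = [[4, 1], [1, 3]], b = [1, 2]; exact solution is [1/11, 7/11].
    values, col_idx, row_ptr = [4.0, 1.0, 1.0, 3.0], [0, 1, 0, 1], [0, 2, 4]
    print(conjugate_gradient(values, col_idx, row_ptr, [1.0, 2.0]))
```

    Each iteration performs one SPMV plus a few vector updates and dot products, so the memory access pattern of `spmv_csr` (which depends on row and column ordering) dominates the cost.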

  16. Managing Communication Latency-Hiding at Runtime for Parallel Programming Languages and Libraries

    Kristensen, Mads Ruben Burgdorff

    2012-01-01

    This work introduces a runtime model for managing communication with support for latency-hiding. The model enables non-computer science researchers to exploit communication latency-hiding techniques seamlessly. For compiled languages, it is often possible to create efficient schedules for communication, but this is not the case for interpreted languages. By maintaining data dependencies between scheduled operations, it is possible to aggressively initiate communication and lazily evaluate tasks to allow maximal time for the communication to finish before entering a wait state. We implement a heuristic of this model in DistNumPy, an auto-parallelizing version of numerical Python that allows sequential NumPy programs to run on distributed memory architectures. Furthermore, we present performance comparisons for eight benchmarks with and without automatic latency-hiding. The results show that our model reduces the time spent waiting for communication by as much as a factor of 27, from a maximum of 54% to only 2% of t...

  17. Aplikasi Tiga Dimensi (3D): GoogleSketchUp

    2010-01-01

    SketchUp is a computer application for creating 3-dimensional (3-D) models of physical objects such as buildings, household appliances, spatial layout designs and so on. Architectural design is one of the uses of SketchUp. Before Google Building Maker existed, SketchUp was the only application used to create your 3-dimensional buildings. In fact, many 3-D modeling applications of this kind had previously ...

  18. Making Inexpensive 3-D Models

    Manos, Harry

    2016-01-01

    Visual aids are important to student learning, and they help make the teacher's job easier. Keeping with the "TPT" theme of "The Art, Craft, and Science of Physics Teaching," the purpose of this article is to show how teachers, lacking equipment and funds, can construct a durable 3-D model reference frame and a model gravity…

  19. 3D terahertz beam profiling

    Pedersen, Pernille Klarskov; Strikwerda, Andrew; Wang, Tianwu

    2013-01-01

    We present a characterization of THz beams generated in both a two-color air plasma and in a LiNbO3 crystal. Using a commercial THz camera, we record intensity images as a function of distance through the beam waist, from which we extract 2D beam profiles and visualize our measurements into 3D beam...

  20. 3D Printing: Exploring Capabilities

    Samuels, Kyle; Flowers, Jim

    2015-01-01

    As 3D printers become more affordable, schools are using them in increasing numbers. They fit well with the emphasis on product design in technology and engineering education, allowing students to create high-fidelity physical models to see and test different iterations in their product designs. They may also help students to "think in three…

  1. When Art Meets 3D

    2011-01-01

    The presentation of the vanguard work, My Dream3D, the innovative production by the China Disabled People’s Performing Art Troupe (CDPPAT), directed by Joy Joosang Park, provided the film’s domestic premiere at Beijing’s Olympic Park on April 7. The show provided an intriguing insight not

  2. Priprava 3D modelov za 3D tisk

    2015-01-01

    According to some experts, additive manufacturing (or 3D printing) will change the manufacturing industry, since every individual will be able to print an object of their own choosing. This diploma thesis presents some additive manufacturing technologies. It then presents the production of a 1:100 scale model of a house, from modeling through to printing. Special emphasis is placed on reworking the model so that it is suitable for printing, for which an approach is developed for faster ...

  3. Post processing of 3D models for 3D printing

    2015-01-01

    According to some experts, additive manufacturing, or 3D printing, will change the manufacturing industry, because any individual could print their own model according to their wishes. This graduation thesis presents some of the additive manufacturing technologies. Furthermore, the production of a 1:100 scale model of a house is presented, from modeling to printing. Special attention is given to postprocessing of the building model elements us...

  4. 3D TRUMP - A GBI launch window tool

    Karels, Steven N.; Hancock, John; Matchett, Gary

    3D TRUMP is a novel GPS and communications-link software analysis tool developed for the SDIO's Ground-Based Interceptor (GBI) program. 3D TRUMP uses a computationally efficient analysis tool which provides key GPS-based performance measures for an entire GBI mission's reentry vehicle and interceptor trajectories. Algorithms and sample outputs are presented.

  5. Study on Parallel Computing

    Guo-Liang Chen; Guang-Zhong Sun; Yun-Quan Zhang; Ze-Yao Mo

    2006-01-01

    In this paper, we present a general survey of parallel computing. The main contents include parallel computer systems, which are the hardware platform of parallel computing; parallel algorithms, which are its theoretical base; and parallel programming, which is its software support. After that, we also introduce some parallel applications and enabling technologies. We argue that parallel computing research should form an integrated methodology of "architecture - algorithm - programming - application". Only in this way can parallel computing research develop continuously and become more realistic.

  6. Simulation of current generation in a 3-D plasma model

    Tsung, F.S.; Dawson, J.M. [Univ. of California, Los Angeles, CA (United States)

    1996-12-31

    Two wires carrying current in the same direction will attract each other, and two wires carrying current in opposite directions will repel each other. Now, consider a test charge in a plasma. If the test charge carries current parallel to the plasma, then it will be pulled toward the plasma core, and if the test charge carries current anti-parallel to the plasma, then it will be pushed to the edge. The electromagnetic coupling between the plasma and a test charge (i.e., the $A_{\parallel} \circ v_{\parallel}$ term in the test charge's Hamiltonian) breaks the symmetry in the parallel direction, and gives rise to a diffusion coefficient which is dependent on the particle's parallel velocity. This is the basis for the "preferential loss" mechanism described in the work by Nunan et al. Our previous 2½D work, in both cylindrical and toroidal geometries, showed that if the plasma column is centrally fueled, then an initial current increases steadily. The results in straight, cylindrical plasmas showed that self-generated parallel current arises without trapped particles or neoclassical diffusion, as assumed by the bootstrap theory. This suggests that the fundamental mechanism is the conservation of the particles' canonical momenta in the direction of the ignorable coordinate. We have extended the simulation to 3D to verify the model put forth. A scalable 3D EM-PIC code, with a localized field-solver, has been implemented to run on a large class of parallel computers. On the 512-node SP2 at Cornell Theory Center, we have benchmarked the 2½D calculations using 32 grids in the previously ignored direction, and a 100-fold increase in the number of particles. Our preliminary results show good agreement between the 2½D and the 3D calculations. We will present our 3D results at the meeting.

  7. 3D Elevation Program: summary for Vermont

    Carswell, William J.

    2015-01-01

    Elevation data are essential to a broad range of applications, including forest resources management, wildlife and habitat management, national security, recreation, and many others. For the State of Vermont, elevation data are critical for hazard mitigation, geologic resource assessment, natural resources conservation, agriculture and precision farming, flood risk management, infrastructure and construction management, and other business uses. Today, high-density light detection and ranging (lidar) data are the primary sources for deriving elevation models and other datasets. Federal, State, Tribal, and local agencies work in partnership to (1) replace data that are older and of lower quality and (2) provide coverage where publicly accessible data do not exist. A joint goal of State and Federal partners is to acquire consistent, statewide coverage to support existing and emerging applications enabled by lidar data.

  8. 3D Elevation Program: summary for Nebraska

    Carswell, William J.

    2015-01-01

    Elevation data are essential to a broad range of applications, including forest resources management, wildlife and habitat management, national security, recreation, and many others. For the State of Nebraska, elevation data are critical for agriculture and precision farming, natural resources conservation, flood risk management, infrastructure and construction management, geologic resource assessment and hazard mitigation, and other business uses. Today, high-density light detection and ranging (lidar) data are the primary sources for deriving elevation models and other datasets. Federal, State, Tribal, and local agencies work in partnership to (1) replace data that are older and of lower quality and (2) provide coverage where publicly accessible data do not exist. A joint goal of State and Federal partners is to acquire consistent, statewide coverage to support existing and emerging applications enabled by lidar data.

  9. 3D graphics for game programming

    Han, JungHyun

    2011-01-01

    Modeling in Game Production; Vertex Processing; Rasterization; Fragment Processing and Output Merging; Illumination and Shaders; Parametric Curves and Surfaces; Shader Models; Image Texturing; Bump Mapping; Advanced Texturing; Character Animation; Physics-based Simulation; References

  10. [Development of a software for 3D virtual phantom design].

    Zou, Lian; Xie, Zhao; Wu, Qi

    2014-02-01

    In this paper, we present 3D virtual phantom design software, developed using object-oriented programming methodology and dedicated to medical physics research. The software, named Magical Phantom (MPhantom), is composed of a 3D visual builder module and a virtual CT scanner. Users can conveniently construct any complex 3D phantom and then export it as DICOM 3.0 CT images. MPhantom is user-friendly and powerful software for 3D phantom configuration, and has passed application testing in a real setting. MPhantom will accelerate Monte Carlo simulation for dose calculation in radiation therapy and research on X-ray imaging reconstruction algorithms.

  11. 3D Genome Tuner: Compare Multiple Circular Genomes in a 3D Context

    Qi Wang; Qun Liang; Xiuqing Zhang

    2009-01-01

    Circular genomes, being the largest proportion of sequenced genomes, play an important role in genome analysis. However, a traditional 2D circular map only provides an overview and annotations of a genome and does not offer feature-based comparison. To remedy these shortcomings, we developed 3D Genome Tuner, a hybrid of circular map and comparative map tools. Its capability of viewing comparisons between multiple circular maps in a 3D space offers great benefits to the study of comparative genomics. The program is freely available (under an LGPL licence) at http://sourceforge.net/projects/dgenometuner.

  12. 3-D force-balanced magnetospheric configurations

    S. Zaharia

    2004-01-01

    Our results provide 3-D distributions of magnetic field, plasma pressure, as well as parallel and transverse currents for both quiet-time and disturbed magnetospheric conditions.

    Key words. Magnetospheric physics (magnetospheric configuration and dynamics; magnetotail; plasma sheet

  13. Forensic 3D Scene Reconstruction

    LITTLE,CHARLES Q.; PETERS,RALPH R.; RIGDON,J. BRIAN; SMALL,DANIEL E.

    1999-10-12

    Traditionally law enforcement agencies have relied on basic measurement and imaging tools, such as tape measures and cameras, in recording a crime scene. A disadvantage of these methods is that they are slow and cumbersome. The development of a portable system that can rapidly record a crime scene with current camera imaging, 3D geometric surface maps, and contribute quantitative measurements such as accurate relative positioning of crime scene objects, would be an asset to law enforcement agents in collecting and recording significant forensic data. The purpose of this project is to develop a feasible prototype of a fast, accurate, 3D measurement and imaging system that would support law enforcement agents to quickly document and accurately record a crime scene.

  14. 3D Printed Robotic Hand

    Pizarro, Yaritzmar Rosario; Schuler, Jason M.; Lippitt, Thomas C.

    2013-01-01

    Dexterous robotic hands are changing the way robots and humans interact and use common tools. Unfortunately, the complexity of the joints and actuations drives up the manufacturing cost. Some cutting-edge, commercially available rapid prototyping machines can now print multiple materials and even combine these materials in the same job. A 3D model of a robotic hand was designed using Creo Parametric 2.0. Combining "hard" and "soft" materials, the model was printed on the Objet Connex350 3D printer with the purpose of resembling as closely as possible the appearance and mobility of a real human hand while needing no assembly. After printing the prototype, strings were installed as actuators to test mobility. Based on printing materials, the manufacturing cost of the hand was $167, significantly lower than that of other robotic hands without the actuators, since they have more complex assembly processes.

  15. 3D Printable Graphene Composite.

    Wei, Xiaojun; Li, Dong; Jiang, Wei; Gu, Zheming; Wang, Xiaojuan; Zhang, Zengxing; Sun, Zhengzong

    2015-07-08

    In human history, both the Iron Age and the Silicon Age thrived after a mature mass-processing technology was developed. Graphene is the most recent superior material that could potentially initiate another new material age. However, even when exploited to their full extent, conventional processing methods fail to provide a link to today's personalization tide. New technology should be ushered in. Three-dimensional (3D) printing fills the missing link between graphene materials and the digital mainstream. Their alliance could generate an additional stream to push the graphene revolution into a new phase. Here we demonstrate for the first time that a graphene composite, with a graphene loading of up to 5.6 wt%, can be 3D printed into computer-designed models. The composite's linear thermal coefficient is below 75 ppm·°C⁻¹ from room temperature to its glass transition temperature (Tg), which is crucial for building up only minute thermal stress during the printing process.

  16. Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas.

    Petrov, Anton I; Zirbel, Craig L; Leontis, Neocles B

    2013-10-01

    The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson-Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access.

  17. Medical 3D thermography system

    GRUBIŠIĆ, IVAN

    2011-01-01

    Infrared (IR) thermography determines the surface temperature of an object or human body using a thermal IR measurement camera. It is an imaging technology which is contactless and completely non-invasive. These properties make IR thermography a useful method of analysis that is used in various industrial applications to detect, monitor and predict irregularities in many fields from engineering to medical and biological observations. This paper presents a conceptual model of Medical 3D Thermo...

  18. Comparative Analysis of Photogrammetric Methods for 3D Models for Museums

    Hafstað Ármannsdottir, Unnur Erla; Antón Castro, Francesc/François; Mioc, Darka

    2014-01-01

    to 3D models using Sketchup and Designing Reality. Finally, panoramic photography is discussed as a 2D alternative to 3D. Sketchup is a free-ware 3D drawing program and Designing Reality is a commercial program, which uses Structure from motion. For each program/method, the same comparative analysis...

  19. 3D silicon strip detectors

    Parzefall, Ulrich [Physikalisches Institut, Universitaet Freiburg, Hermann-Herder-Str. 3, D-79104 Freiburg (Germany)], E-mail: ulrich.parzefall@physik.uni-freiburg.de; Bates, Richard [University of Glasgow, Department of Physics and Astronomy, Glasgow G12 8QQ (United Kingdom); Boscardin, Maurizio [FBK-irst, Center for Materials and Microsystems, via Sommarive 18, 38050 Povo di Trento (Italy); Dalla Betta, Gian-Franco [INFN and Universita' di Trento, via Sommarive 14, 38050 Povo di Trento (Italy); Eckert, Simon [Physikalisches Institut, Universitaet Freiburg, Hermann-Herder-Str. 3, D-79104 Freiburg (Germany); Eklund, Lars; Fleta, Celeste [University of Glasgow, Department of Physics and Astronomy, Glasgow G12 8QQ (United Kingdom); Jakobs, Karl; Kuehn, Susanne [Physikalisches Institut, Universitaet Freiburg, Hermann-Herder-Str. 3, D-79104 Freiburg (Germany); Lozano, Manuel [Instituto de Microelectronica de Barcelona, IMB-CNM, CSIC, Barcelona (Spain); Pahn, Gregor [Physikalisches Institut, Universitaet Freiburg, Hermann-Herder-Str. 3, D-79104 Freiburg (Germany); Parkes, Chris [University of Glasgow, Department of Physics and Astronomy, Glasgow G12 8QQ (United Kingdom); Pellegrini, Giulio [Instituto de Microelectronica de Barcelona, IMB-CNM, CSIC, Barcelona (Spain); Pennicard, David [University of Glasgow, Department of Physics and Astronomy, Glasgow G12 8QQ (United Kingdom); Piemonte, Claudio; Ronchin, Sabina [FBK-irst, Center for Materials and Microsystems, via Sommarive 18, 38050 Povo di Trento (Italy); Szumlak, Tomasz [University of Glasgow, Department of Physics and Astronomy, Glasgow G12 8QQ (United Kingdom); Zoboli, Andrea [INFN and Universita' di Trento, via Sommarive 14, 38050 Povo di Trento (Italy); Zorzi, Nicola [FBK-irst, Center for Materials and Microsystems, via Sommarive 18, 38050 Povo di Trento (Italy)

    2009-06-01

    While the Large Hadron Collider (LHC) at CERN started operation in autumn 2008, plans for a luminosity upgrade to the Super-LHC (sLHC) have already been under development for several years. This projected luminosity increase by an order of magnitude gives rise to a challenging radiation environment for tracking detectors at the LHC experiments. Significant improvements in radiation hardness are required with respect to the LHC. Using a strawman layout for the new tracker of the ATLAS experiment as an example, silicon strip detectors (SSDs) with short strips of 2-3 cm length are foreseen to cover the region from 28 to 60 cm distance to the beam. These SSDs will be exposed to radiation levels up to $10^{15}\,N_{eq}/\mathrm{cm}^2$, which makes radiation resistance a major concern for the upgraded ATLAS tracker. Several approaches to increasing the radiation hardness of silicon detectors exist. In this article, it is proposed to combine the radiation hard 3D-design originally conceived for pixel-style applications with the benefits of the established planar technology for strip detectors by using SSDs that have regularly spaced doped columns extending into the silicon bulk under the detector strips. The first 3D SSDs to become available for testing were made in the Single Type Column (STC) design, a technological simplification of the original 3D design. With such 3D SSDs, a small number of prototype sLHC detector modules with LHC-speed front-end electronics as used in the semiconductor tracking systems of present LHC experiments were built. Modules were tested before and after irradiation to fluences of $10^{15}\,N_{eq}/\mathrm{cm}^2$. The tests were performed with three systems: a highly focused IR-laser with 5 µm spot size to make position-resolved scans of the charge collection efficiency, a $^{90}$Sr $\beta$-source set-up to measure the signal levels for a minimum ionizing particle (MIP), and a beam test with 180 GeV pions at CERN. This article gives a brief overview of ...

  20. Fully 3D GPU PET reconstruction

    Herraiz, J.L., E-mail: joaquin@nuclear.fis.ucm.es [Grupo de Fisica Nuclear, Departmento Fisica Atomica, Molecular y Nuclear, Universidad Complutense de Madrid (Spain); Espana, S. [Department of Radiation Oncology, Massachusetts General Hospital and Harvard Medical School, Boston, MA (United States); Cal-Gonzalez, J. [Grupo de Fisica Nuclear, Departmento Fisica Atomica, Molecular y Nuclear, Universidad Complutense de Madrid (Spain); Vaquero, J.J. [Departmento de Bioingenieria e Ingenieria Espacial, Universidad Carlos III, Madrid (Spain); Desco, M. [Departmento de Bioingenieria e Ingenieria Espacial, Universidad Carlos III, Madrid (Spain); Unidad de Medicina y Cirugia Experimental, Hospital General Universitario Gregorio Maranon, Madrid (Spain); Udias, J.M. [Grupo de Fisica Nuclear, Departmento Fisica Atomica, Molecular y Nuclear, Universidad Complutense de Madrid (Spain)

    2011-08-21

    Fully 3D iterative tomographic image reconstruction is computationally very demanding. Graphics Processing Units (GPUs) have been proposed for many years as potential accelerators for complex scientific problems, but it is only since the recent advances in GPU programmability that the best available reconstruction codes have started to be implemented to run on GPUs. This work presents GPU-based fully 3D PET iterative reconstruction software. The new code can reconstruct sinogram data from several commercially available PET scanners. The most important and time-consuming parts of the code, the forward and backward projection operations, are based on an accurate model of the scanner obtained with the Monte Carlo code PeneloPET, and they have been massively parallelized on the GPU. For the PET scanners considered, the GPU-based code is more than 70 times faster than a similar code running on a single core of a fast CPU, obtaining the same images in both cases. The code has been designed to be easily adapted to reconstruct sinograms from any other PET scanner, including scanner prototypes.
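
    The abstract does not say which iterative algorithm the code uses. As a rough sketch only, the snippet below shows how forward and backward projection fit together in MLEM, a common iterative scheme assumed here for illustration. The tiny dense system matrix, the function names and the data are hypothetical; a real scanner model would be far larger and, as in the paper, run on a GPU.

```python
def forward_project(system, image):
    """Forward projection: expected counts in each detector bin."""
    return [sum(a * x for a, x in zip(row, image)) for row in system]


def back_project(system, values):
    """Backward projection: distribute per-bin values back onto the voxels."""
    n_voxels = len(system[0])
    return [sum(system[i][j] * values[i] for i in range(len(system)))
            for j in range(n_voxels)]


def mlem(system, measured, iterations=50):
    """Maximum-likelihood expectation-maximization with a dense system matrix."""
    image = [1.0] * len(system[0])  # uniform starting image
    sensitivity = back_project(system, [1.0] * len(system))
    for _ in range(iterations):
        expected = forward_project(system, image)
        # Ratio of measured to expected counts; empty bins contribute nothing.
        ratios = [m / e if e > 0 else 0.0 for m, e in zip(measured, expected)]
        correction = back_project(system, ratios)
        image = [x * c / s for x, c, s in zip(image, correction, sensitivity)]
    return image


# Hypothetical toy problem: 3 detector bins, 2 voxels, noise-free data.
system = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
true_image = [4.0, 2.0]
measured = forward_project(system, true_image)
result = mlem(system, measured)
print(result)
```

    Each iteration performs one forward and one backward projection, which is why those two operations dominate the run time and are the natural targets for parallelization.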

  1. Statistical 3D damage accumulation model for ion implant simulators

    Hernandez-Mangas, J.M. E-mail: jesman@ele.uva.es; Lazaro, J.; Enriquez, L.; Bailon, L.; Barbolla, J.; Jaraiz, M

    2003-04-01

    A statistical 3D damage accumulation model, based on the modified Kinchin-Pease formula, for ion implant simulation has been included in our physically based ion implantation code. It has only one fitting parameter for electronic stopping and uses 3D electron density distributions for different types of targets including compound semiconductors. Also, a statistical noise reduction mechanism based on the dose division is used. The model has been adapted to be run under parallel execution in order to speed up the calculation in 3D structures. Sequential ion implantation has been modelled including previous damage profiles. It can also simulate the implantation of molecular and cluster projectiles. Comparisons of simulated doping profiles with experimental SIMS profiles are presented. Also comparisons between simulated amorphization and experimental RBS profiles are shown. An analysis of sequential versus parallel processing is provided.

  2. Statistical 3D damage accumulation model for ion implant simulators

    Hernandez-Mangas, J M; Enriquez, L E; Bailon, L; Barbolla, J; Jaraiz, M

    2003-01-01

    A statistical 3D damage accumulation model, based on the modified Kinchin-Pease formula, for ion implant simulation has been included in our physically based ion implantation code. It has only one fitting parameter for electronic stopping and uses 3D electron density distributions for different types of targets including compound semiconductors. Also, a statistical noise reduction mechanism based on the dose division is used. The model has been adapted to be run under parallel execution in order to speed up the calculation in 3D structures. Sequential ion implantation has been modelled including previous damage profiles. It can also simulate the implantation of molecular and cluster projectiles. Comparisons of simulated doping profiles with experimental SIMS profiles are presented. Also comparisons between simulated amorphization and experimental RBS profiles are shown. An analysis of sequential versus parallel processing is provided.

  3. Dual side transparent OLED 3D display using Gabor super-lens

    Chestak, Sergey; Kim, Dae-Sik; Cho, Sung-Woo

    2015-03-01

    We devised a dual-side transparent 3D display using a transparent OLED panel and two lenticular arrays. The OLED panel is sandwiched between two parallel confocal lenticular arrays, forming a Gabor super-lens. The display provides dual-side stereoscopic 3D imaging and a floating image of an object placed behind it. The floating image can be superimposed on the displayed 3D image. The displayed autostereoscopic 3D images are composed of 4 views, each with a resolution of 64×90 pixels.

  4. Using Computer-Aided Design Software and 3D Printers to Improve Spatial Visualization

    Katsio-Loudis, Petros; Jones, Millie

    2015-01-01

    Many articles have been published on the use of 3D printing technology. From prefabricated homes and outdoor structures to human organs, 3D printing technology has found a niche in many fields, but especially education. With the introduction of AutoCAD technical drawing programs and now 3D printing, learners can use 3D printed models to develop…

  5. 3D Ultrasonic Wave Simulations for Structural Health Monitoring

    Campbell, Leckey Cara A/; Miler, Corey A.; Hinders, Mark K.

    2011-01-01

    Structural health monitoring (SHM) for the detection of damage in aerospace materials is an important area of research at NASA. Ultrasonic guided Lamb waves are a promising SHM damage detection technique since the waves can propagate long distances. For complicated flaw geometries experimental signals can be difficult to interpret. High performance computing can now handle full 3-dimensional (3D) simulations of elastic wave propagation in materials. We have developed and implemented parallel 3D elastodynamic finite integration technique (3D EFIT) code to investigate ultrasound scattering from flaws in materials. EFIT results have been compared to experimental data and the simulations provide unique insight into details of the wave behavior. This type of insight is useful for developing optimized experimental SHM techniques. 3D EFIT can also be expanded to model wave propagation and scattering in anisotropic composite materials.

  6. Design for scalability in 3D computer graphics architectures

    Holten-Lund, Hans Erik

    2002-01-01

    been developed. Hybris is a prototype rendering architecture which can be tailored to many specific 3D graphics applications and implemented in various ways. Parallel software implementations for both single and multi-processor Windows 2000 systems have been demonstrated. Working hardware/software...... codesign implementations of Hybris for standard-cell based ASIC (simulated) and FPGA technologies have been demonstrated, using manual co-synthesis for translation of a Virtual Prototyping architecture specification written in C into both optimized C source for software and into a synthesizable VHDL...... specification for hardware implementation. A flexible VRML 97 3D scene graph engine with a Java interface and C++ interface has been implemented to allow flexible integration of the rendering technology into Java and C++ applications. A 3D medical visualization workstation prototype (3D-Med) is examined...

  7. 2-D Versus 3-D Magnetotelluric Data Interpretation

    Ledo, Juanjo

    2005-09-01

    In recent years, the number of publications dealing with the mathematical and physical 3-D aspects of the magnetotelluric method has increased drastically. However, field experiments on a grid are often impractical and surveys are frequently restricted to single or widely separated profiles. So, in many cases we find ourselves with the following question: is the applicability of the 2-D hypothesis valid to extract geoelectric and geological information from real 3-D environments? The aim of this paper is to explore a few instructive but general situations to understand the basics of a 2-D interpretation of 3-D magnetotelluric data and to determine which data subset (TE-mode or TM-mode) is best for obtaining the electrical conductivity distribution of the subsurface using 2-D techniques. A review of the mathematical and physical fundamentals of the electromagnetic fields generated by a simple 3-D structure allows us to prioritise the choice of modes in a 2-D interpretation of responses influenced by 3-D structures. This analysis is corroborated by numerical results from synthetic models and by real data acquired by other authors. One important result of this analysis is that the mode most unaffected by 3-D effects depends on the position of the 3-D structure with respect to the regional 2-D strike direction. When the 3-D body is normal to the regional strike, the TE-mode is affected mainly by galvanic effects, while the TM-mode is affected by galvanic and inductive effects. In this case, a 2-D interpretation of the TM-mode is prone to error. When the 3-D body is parallel to the regional 2-D strike the TE-mode is affected by galvanic and inductive effects and the TM-mode is affected mainly by galvanic effects, making it more suitable for 2-D interpretation. In general, a wise 2-D interpretation of 3-D magnetotelluric data can be a guide to a reasonable geological interpretation.

  8. Beginning Android 3D game development

    Chin, Robert

    2014-01-01

    1. This Apress book aims to be the first or only English-language book on the market on Android 3D game programming (there is, however, a Chinese-language book). 1. Given sell like, there may be some potential for retail and trade sales. 2. Otherwise, most revenue should come from the high relevancy of Android in books-as-a-service database engines like Safari, where the Android meme is nearly at the top. 3. Android has the largest user market share worldwide and the second-best apps ecosystem, although game-specific development seems more popular on Android than iOS.

  9. Single Camera Calibration in 3D Vision

    Caius SULIMAN

    2009-12-01

    Camera calibration is a necessary step in 3D vision in order to extract metric information from 2D images. A camera is considered to be calibrated when the parameters of the camera are known (i.e. principal distance, lens distortion, focal length, etc.). In this paper we deal with a single camera calibration method, and with the help of this method we try to find the intrinsic and extrinsic camera parameters. The method was implemented successfully in the Matlab programming and simulation environment.
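
    As a minimal sketch of what the intrinsic and extrinsic parameters do once they are known, the snippet below projects a 3D point into pixel coordinates with an ideal pinhole model that ignores lens distortion. The function name, parameter names and numeric values are illustrative assumptions, not taken from the paper, which used Matlab.

```python
def project_point(point_world, rotation, translation, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole model.

    rotation (3x3 nested lists) and translation (3 values) are the extrinsic
    parameters; fx, fy (focal lengths in pixels) and cx, cy (principal point)
    are the intrinsic parameters. Lens distortion is ignored.
    """
    # Extrinsics: world coordinates -> camera coordinates.
    xc, yc, zc = (
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if zc <= 0:
        raise ValueError("point is behind the camera")
    # Intrinsics: perspective division, then scale and shift to pixels.
    u = fx * xc / zc + cx
    v = fy * yc / zc + cy
    return u, v


# Hypothetical camera at the world origin, looking along +Z.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
u, v = project_point([0.1, -0.05, 2.0], identity, [0.0, 0.0, 0.0],
                     fx=800.0, fy=800.0, cx=320.0, cy=240.0)
print(u, v)
```

    Calibration is the inverse task: given many known 3D points and their observed pixel positions, estimate the rotation, translation and intrinsic values that make this projection match the observations.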

  10. Practical algorithms for 3D computer graphics

    Ferguson, R Stuart

    2013-01-01

    "A valuable book to accompany any course that mixes the theory and practice of 3D graphics. The book's web site has many useful programs and code samples." -Karen Rafferty, Queen's University, Belfast. "The topics covered by this book are backed by the OpenFX modeling and animation software. This is a big plus in that it provides a practical perspective and encourages experimentation. … [This] will offer students a more interesting and hands-on learning experience, especially for those wishing to pursue a career in computer game development." -Naganand Madhavapeddy, GameDeveloper

  11. Levande 3D-TV Streaming

    Neupane, Bishal; Moazzeni, Pooya

    2012-01-01

    The world is not as flat as a pancake. It has height, width and depth, so we should be able to see it that way on TV too. So far we cannot watch three-dimensional programs directly on our TVs. Even in cinemas, 3D does not work "for real", because the magic still sits in the glasses. Glasses with lenses of different colors separate the impressions for the right and left eye, so that each eye sees a different image. That is what creates the illusion of three dimensions. The goal of this ...

  12. Comparison of 2D and 3D Neutron Transport Analyses on Yonggwang Unit 3 Reactor

    Maeng, Aoung Jae; Kim, Byoung Chul; Lim, Mi Joung; Kim, Kyung Sik; Jeon, Young Kyou [Korea Reactor Integrity Surveillance Technology, Daejeon (Korea, Republic of); Yoo, Choon Sung [Korea Atomic Energy Research Institutes, Daejeon (Korea, Republic of)

    2012-10-15

    10 CFR Part 50 Appendix H requires a periodic surveillance program in the reactor vessel (RV) beltline region of light water nuclear power plants to check vessel integrity under exposure to neutron irradiation and the thermal environment. Accurate analysis of the neutron fluence exposure, based on correct modeling and simulation, is the most important part of the evaluation. Traditional two-dimensional (2D) and 1D synthesis methodologies have been widely applied to evaluate the fast neutron (E > 1.0 MeV) fluence exposure of the RV. However, 2D and 1D methodologies have not provided accurate fast neutron fluence evaluation at elevations far above or below the active core region. The RAPTOR-M3G (RApid Parallel Transport Of Radiation - Multiple 3D Geometries) program for 3D geometry calculations was therefore developed jointly by Westinghouse Electric Company, USA and Korea Reactor Integrity Surveillance Technology (KRIST) for the analysis of In-Vessel Surveillance Tests and Ex-Vessel Neutron Dosimetry (EVND). In particular, EVND, which is installed at active core height between the biological shielding material and the concrete, also evaluates axial neutron fluence by placing three dosimeters, at the top, middle and bottom, at the angle representing maximum neutron fluence. The EVND programs have been applied to the Korean nuclear plants. The objective of this study is therefore to compare the 3D and 2D neutron transport calculations and analyses, using the Yonggwang unit 3 reactor as an example.

  13. Interactive 3D Mars Visualization

    Powell, Mark W.

    2012-01-01

    The Interactive 3D Mars Visualization system provides high-performance, immersive visualization of satellite and surface vehicle imagery of Mars. The software can be used in mission operations to provide the most accurate position information for the Mars rovers to date. When integrated into the mission data pipeline, this system allows mission planners to view the location of the rover on Mars to 0.01-meter accuracy with respect to satellite imagery, with dynamic updates to incorporate the latest position information. Given this information so early in the planning process, rover drivers are able to plan more accurate drive activities for the rover than ever before, increasing the execution of science activities significantly. Scientifically, this 3D mapping information puts all of the science analyses to date into geologic context on a daily basis instead of weeks or months, as was the norm prior to this contribution. This allows the science planners to judge the efficacy of their previously executed science observations much more efficiently, and achieve greater science return as a result. The Interactive 3D Mars surface view is a Mars terrain browsing software interface that encompasses the entire region of exploration for a Mars surface exploration mission. The view is interactive, allowing the user to pan in any direction by clicking and dragging, or to zoom in or out by scrolling the mouse or touchpad. This set currently includes tools for selecting a point of interest, and a ruler tool for displaying the distance between and positions of two points of interest. The mapping information can be harvested and shared through ubiquitous online mapping tools like Google Mars, NASA WorldWind, and Worldwide Telescope.

  14. How 3-D Movies Work

    吕铁雄

    2011-01-01

    Difficulty: ★★★★☆ Word count: 450 Suggested reading time: 8 minutes Most people see out of two eyes. This is a basic fact of humanity, but it’s what makes possible the illusion of depth that 3-D movies create. Human eyes are spaced about two inches apart, meaning that each eye gives the brain a slightly different perspective on the same object. The brain then uses this variance to quickly determine an object’s distance.

  15. Virtual 3-D Facial Reconstruction

    Martin Paul Evison

    2000-06-01

    Facial reconstructions in archaeology allow empathy with people who lived in the past and enjoy considerable popularity with the public. It is a common misconception that facial reconstruction will produce an exact likeness; a resemblance is the best that can be hoped for. Research at Sheffield University is aimed at the development of a computer system for facial reconstruction that will be accurate, rapid, repeatable, accessible and flexible. This research is described and prototypical 3-D facial reconstructions are presented. Interpolation models simulating obesity, ageing and ethnic affiliation are also described. Some strengths and weaknesses in the models, and their potential for application in archaeology are discussed.

  16. Real-time depth map manipulation for 3D visualization

    Ideses, Ianir; Fishbain, Barak; Yaroslavsky, Leonid

    2009-02-01

    One of the key aspects of 3D visualization is the computation of depth maps. Depth maps enable the synthesis of 3D video from 2D video and the use of multi-view displays. Depth maps can be acquired in several ways. One method is to measure the real 3D properties of the scene objects. Other methods rely on using two cameras and computing the correspondence for each pixel. Once a depth map is acquired for every frame, it can be used to construct its artificial stereo pair. There are many known methods for computing the optical flow between adjacent video frames. The drawback of these methods is that they require extensive computation power and are not very well suited to high-quality real-time 3D rendering. One efficient method for computing depth maps is the extraction of motion vector information from standard video encoders. In this paper we present methods to improve the 3D visualization quality acquired from compression CODECS by spatial/temporal and logical operations and manipulations. We show how an efficient real-time implementation of spatial-temporal local order statistics such as median and local adaptive filtering in the 3D-DCT domain can substantially improve the quality of depth maps and consequently 3D video while retaining real-time rendering. Real-time performance is achieved by utilizing multi-core technology using standard parallelization algorithms and libraries (OpenMP, IPP).
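
    As a rough illustration of the local order statistics mentioned above, the following sketch applies a plain spatial median and a per-pixel temporal median to small depth maps. It is not the authors' implementation: it works in the pixel domain rather than the 3D-DCT domain, it is single-threaded, and the function names `median_filter_depth` and `temporal_median` are hypothetical.

```python
from statistics import median


def median_filter_depth(depth, radius=1):
    """Spatial median over a (2*radius+1)^2 window on a 2-D depth map.

    `depth` is a list of equal-length rows. Borders are handled by clamping
    window coordinates to the image, which replicates the edge values.
    """
    rows, cols = len(depth), len(depth[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            window = [
                depth[min(max(y + dy, 0), rows - 1)][min(max(x + dx, 0), cols - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            out[y][x] = median(window)
    return out


def temporal_median(frames):
    """Per-pixel median across a short sequence of same-sized depth maps."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(cols)]
        for y in range(rows)
    ]
```

    A median suppresses isolated outliers, such as a single bad motion vector, without blurring depth edges as much as a mean filter would. A real-time version would parallelize the loop over rows, for example with OpenMP in C or C++.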

  17. Manipulation of stimulus onset delay in reading: evidence for parallel programming of saccades.

    Morrison, R E

    1984-10-01

    On-line eye movement recording of 12 subjects who read short stories on a cathode ray tube enabled a test of direct control and preprogramming models of eye movements in reading. Contingent upon eye position, a mask was displayed in place of the letters in central vision after each saccade, delaying the onset of the stimulus in each eye fixation. The duration of the delay was manipulated in fixed or randomized blocks. Although the length of the delay strongly affected the duration of the fixations, there was no difference due to the conditions of delay manipulation, indicating that fixation duration is under direct control. However, not all fixations were lengthened by the period of the delay. Some ended while the mask was still present, suggesting they had been preprogrammed. But these "anticipation" eye movements could not have been completely determined before the fixation was processed because their fixation durations and saccade lengths were affected by the spatial extent of the mask, which varied randomly. Neither preprogramming nor existing serial direct control models of eye guidance can adequately account for these data. Instead, a model with direct control and parallel programming of saccades is proposed to explain the data and eye movements in reading in general.

  18. A pattern recognition system for prostate mass spectra discrimination based on the CUDA parallel programming model

    Kostopoulos, Spiros; Glotsos, Dimitris; Sidiropoulos, Konstantinos; Asvestas, Pantelis; Cavouras, Dionisis; Kalatzis, Ioannis

    2014-03-01

    The aim of the present study was to implement a pattern recognition system for the discrimination of healthy from malignant prostate tumors from proteomic Mass Spectroscopy (MS) samples and to identify m/z intervals of potential biomarkers associated with prostate cancer. One hundred and six MS-spectra were studied in total. Sixty-three spectra corresponded to healthy cases (PSA 10). The MS-spectra are publicly available from the NCI Clinical Proteomics Database. The pre-processing comprised four steps: denoising, normalization, peak extraction and peak alignment. Due to the enormous number of features that arose from the MS-spectra as informative peaks, and in order to secure optimum system design, the classification task was performed by programming in parallel the multiprocessors of an NVIDIA GPU card, using the CUDA framework. The proposed system achieved 98.1% accuracy. The identified m/z intervals displayed significant statistical differences between the two classes and were found to possess adequate discriminatory power in characterizing prostate samples, when employed in the design of the classification system. Those intervals should be further investigated since they might lead to the identification of potential new biomarkers for prostate cancer.

  19. FROMS3D: New Software for 3-D Visualization of Fracture Network System in Fractured Rock Masses

    Noh, Y. H.; Um, J. G.; Choi, Y.

    2014-12-01

    New software (FROMS3D) is presented to visualize fracture network systems in 3-D. The software consists of several modules that handle the management of borehole and field fracture data, fracture network modelling, visualization of fracture geometry in 3-D, and calculation and visualization of intersections and equivalent pipes between fractures. Intel Parallel Studio XE 2013, Visual Studio .NET 2010 and the open source VTK library were utilized as development tools to efficiently implement the modules and the graphical user interface of the software. The results suggest that the developed software is effective in visualizing 3-D fracture network systems, and can provide useful information to tackle engineering geological problems related to the strength, deformability and hydraulic behavior of fractured rock masses.

  20. 3D medical thermography device

    Moghadam, Peyman

    2015-05-01

    In this paper, a novel handheld 3D medical thermography system is introduced. The proposed system consists of a thermal-infrared camera, a color camera and a depth camera rigidly attached in close proximity and mounted on an ergonomic handle. As a practitioner holding the device smoothly moves it around human body parts, the proposed system generates and builds up a precise 3D thermogram model by incorporating information from each new measurement in real time. Because the data are acquired in motion, they provide multiple points of view. When processed, these multiple points of view are adaptively combined by taking into account the reliability of each individual measurement, which can vary due to factors such as the angle of incidence, the distance between the device and the subject, environmental sensor data, or other factors influencing the confidence in the thermal-infrared data when captured. Finally, several case studies are presented to support the usability and performance of the proposed system.
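
    The adaptive combination of viewpoints described above can be pictured as a confidence-weighted average per surface point. The sketch below is only an assumption about how such a fusion could look. The abstract does not give the weighting scheme, so the `confidence` formula, its `max_distance_m` parameter and both function names are hypothetical.

```python
import math


def confidence(angle_deg, distance_m, max_distance_m=2.0):
    """Hypothetical reliability score in [0, 1] for one thermal reading.

    Readings taken head-on (small angle of incidence) and close to the
    subject score higher; this formula is illustrative, not from the paper.
    """
    angular = max(0.0, math.cos(math.radians(angle_deg)))
    radial = max(0.0, 1.0 - distance_m / max_distance_m)
    return angular * radial


def fuse_measurements(samples):
    """Confidence-weighted mean of (temperature, weight) pairs for one point."""
    total_weight = sum(weight for _, weight in samples)
    if total_weight <= 0:
        raise ValueError("at least one measurement needs a positive weight")
    return sum(temp * weight for temp, weight in samples) / total_weight
```

    For example, `fuse_measurements([(36.0, 3.0), (40.0, 1.0)])` pulls the fused temperature toward the more reliable 36.0 reading instead of treating both views equally.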