WorldWideScience

Sample records for cost graphics processing

  1. Energy- and cost-efficient lattice-QCD computations using graphics processing units

    Energy Technology Data Exchange (ETDEWEB)

    Bach, Matthias

    2014-07-01

    Quarks and gluons are the building blocks of all hadronic matter, like protons and neutrons. Their interaction is described by Quantum Chromodynamics (QCD), a theory under test by large-scale experiments like the Large Hadron Collider (LHC) at CERN and, in the future, the Facility for Antiproton and Ion Research (FAIR) at GSI. However, perturbative methods can only be applied to QCD at high energies. Studies from first principles are possible via a discretization onto a Euclidean space-time grid. This discretization of QCD is called Lattice QCD (LQCD) and is the only ab-initio option outside of the high-energy regime. LQCD is extremely compute and memory intensive. In particular, it is by definition always bandwidth limited. Thus - despite the complexity of LQCD applications - it led to the development of several specialized compute platforms and influenced the development of others. In recent years, however, General-Purpose computation on Graphics Processing Units (GPGPU) has emerged as a new means of parallel computing. Contrary to machines traditionally used for LQCD, graphics processing units (GPUs) are a mass-market product. This promises advantages in both the pace at which higher-performing hardware becomes available and its price. CL2QCD is an OpenCL-based implementation of LQCD using Wilson fermions that was developed within this thesis. It operates on GPUs from all major vendors as well as on central processing units (CPUs). On the AMD Radeon HD 7970 it provides the fastest double-precision D kernel for a single GPU, achieving 120 GFLOPS. D - the most compute-intensive kernel in LQCD simulations - is commonly used to compare LQCD platforms. This performance is enabled by an in-depth analysis of optimization techniques for bandwidth-limited codes on GPUs. Further, analysis of the communication between GPU and CPU, as well as between multiple GPUs, enables high-performance Krylov space solvers and linear scaling to multiple GPUs within a single system. LQCD
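
    The "bandwidth limited" claim can be made concrete with a minimal CUDA sketch (illustrative only; CL2QCD itself is written in OpenCL, and all names here are ours). A streaming kernel performs so little arithmetic per byte that its runtime is set almost entirely by memory traffic, the same regime the D kernel operates in; comparing the measured figure with the card's theoretical peak shows how close a bandwidth-limited code can get.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Streaming triad: two reads and one write per element, almost no math.
// Like an LQCD kernel, its throughput is set by memory bandwidth, not FLOPS.
__global__ void stream_triad(float* c, const float* a, const float* b,
                             float s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + s * b[i];
}

int main() {
    const int n = 1 << 26;
    const size_t bytes = n * sizeof(float);
    float *a, *b, *c;
    cudaMalloc(&a, bytes); cudaMalloc(&b, bytes); cudaMalloc(&c, bytes);
    cudaMemset(a, 0, bytes); cudaMemset(b, 0, bytes);

    cudaEvent_t t0, t1;
    cudaEventCreate(&t0); cudaEventCreate(&t1);
    cudaEventRecord(t0);
    stream_triad<<<(n + 255) / 256, 256>>>(c, a, b, 2.0f, n);
    cudaEventRecord(t1);
    cudaEventSynchronize(t1);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, t0, t1);
    // Three arrays are touched per element: two reads plus one write.
    printf("effective bandwidth: %.1f GB/s\n", 3.0 * bytes / ms / 1e6);
    return 0;
}
```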

  2. Feasibility Analysis of Low Cost Graphical Processing Units for Electromagnetic Field Simulations by Finite Difference Time Domain Method

    CERN Document Server

    Choudhari, A V; Gupta, M R

    2013-01-01

    Among the several techniques available for solving Computational Electromagnetics (CEM) problems, the Finite Difference Time Domain (FDTD) method is one of the best-suited approaches when a parallelized hardware platform is used. In this paper we investigate the feasibility of implementing the FDTD method using the NVIDIA GT 520, a low-cost Graphical Processing Unit (GPU), for solving the differential form of Maxwell's equations in the time domain. Initially, a generalized benchmarking problem of a bandwidth test and another of 'matrix left division' are discussed to establish the correlation between problem size and performance on the CPU and the GPU, respectively. This is followed by a discussion of the FDTD method, again implemented on both the CPU and the GT 520 GPU. For both of the above comparisons, the CPU used is the Intel E5300, a low-cost dual-core CPU.
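
    FDTD parallelizes well because every grid point is updated independently from its neighbors' values of the previous half-step, so one GPU thread per point is a natural mapping. A minimal one-dimensional Yee update in CUDA, sketched in normalized units (our illustration, not the authors' code):

```cuda
// One FDTD time step on a 1D Yee grid (normalized units, source-free).
// ez[i] lives at integer points, hy[i] at the half-integer points between.
__global__ void update_h(float* hy, const float* ez, int n, float ch) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n - 1) hy[i] += ch * (ez[i + 1] - ez[i]);
}

__global__ void update_e(float* ez, const float* hy, int n, float ce) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i > 0 && i < n - 1) ez[i] += ce * (hy[i] - hy[i - 1]);
}

// Host time loop (sketch): alternate the two half-updates.
// for (int t = 0; t < steps; ++t) {
//     update_h<<<blocks, threads>>>(hy, ez, n, 0.5f);
//     update_e<<<blocks, threads>>>(ez, hy, n, 0.5f);
// }
```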

  3. Graphical Language for Data Processing

    Science.gov (United States)

    Alphonso, Keith

    2011-01-01

    A graphical language for processing data allows processing elements to be connected with virtual wires that represent data flows between processing modules. The processing of complex data, such as lidar data, requires many different algorithms to be applied. The purpose of this innovation is to automate the processing of such complex data without the need for complex scripting and programming languages. The system consists of a set of user-interface components that allow the user to drag and drop various algorithmic and processing components onto a process graph. By working graphically, the user can completely visualize the process flow and create complex diagrams. This innovation supports the nesting of graphs, such that a graph can be included in another graph as a single processing step. In addition to the user-interface components, the system includes a set of .NET classes that represent the graph internally. The system also includes a graph execution component that reads this internal representation and executes the graph. Execution follows the interpreted model: each node is traversed and executed from the original internal representation. Finally, there are components that allow external code elements, such as algorithms, to be easily integrated into the system, making it readily extensible.

  4. Graphic Design in Libraries: A Conceptual Process

    Science.gov (United States)

    Ruiz, Miguel

    2014-01-01

    Providing successful library services requires efficient and effective communication with users; therefore, it is important that content creators who develop visual materials understand key components of design and, specifically, develop a holistic graphic design process. Graphic design, as a form of visual communication, is the process of…

  5. Data Sorting Using Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    M. J. Mišić

    2012-06-01

    Graphics processing units (GPUs) have been increasingly used for general-purpose computation in recent years. GPU-accelerated applications are found in both scientific and commercial domains. Sorting is one of the most important operations in many applications, so its efficient implementation is essential for overall application performance. This paper analyzes and evaluates implementations of representative sorting algorithms on graphics processing units. Three sorting algorithms (Quicksort, Merge sort, and Radix sort) were evaluated on the Compute Unified Device Architecture (CUDA) platform, which is used to execute applications on NVIDIA graphics processing units. The algorithms were tested and evaluated using an automated test environment with input datasets of different characteristics. Finally, the results of this analysis are briefly discussed.
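
    For comparison with hand-written kernels like those evaluated above, NVIDIA's Thrust library exposes GPU sorting (a radix sort for primitive key types) behind a single call; a minimal sketch:

```cuda
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/copy.h>
#include <cstdio>
#include <cstdlib>

int main() {
    // Fill a host array, copy it to the GPU, sort there, copy back.
    thrust::host_vector<int> h(1 << 20);
    for (size_t i = 0; i < h.size(); ++i) h[i] = rand();

    thrust::device_vector<int> d = h;   // host-to-device transfer
    thrust::sort(d.begin(), d.end());   // radix sort runs on the device

    thrust::copy(d.begin(), d.end(), h.begin());
    printf("min=%d max=%d\n", (int)h[0], (int)h[h.size() - 1]);
    return 0;
}
```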

  6. HMI conventions for process control graphics.

    Science.gov (United States)

    Pikaar, Ruud N

    2012-01-01

    Process operators supervise and control complex processes. To enable the operator to do an adequate job, instrumentation and process control engineers need to address several related topics, such as console design, information design, navigation, and alarm management. In process control upgrade projects, usually a 1:1 conversion of existing graphics is proposed. This paper suggests another approach, which efficiently leads to a reduced number of powerful new process graphics, supported by a permanent process overview display. In addition, a road map for structuring content (process information) and conventions for the presentation of objects, symbols, and so on have been developed. The impact of the human factors engineering approach on process control upgrade projects is illustrated by several cases.

  7. Numerical Integration with Graphical Processing Unit for QKD Simulation

    Science.gov (United States)

    2014-03-27

    existing and proposed Quantum Key Distribution (QKD) systems. This research investigates using graphical processing unit (GPU) technology to more... Time Pad; GPU, graphical processing unit; API, application programming interface; CUDA, Compute Unified Device Architecture; SIMD, single-instruction-stream... and can be passed by value or reference [2]. 2.3 Graphical Processing Units: Programming with a graphical processing unit (GPU) requires a different

  8. Graphics processing unit-assisted lossless decompression

    Science.gov (United States)

    Loughry, Thomas A.

    2016-04-12

    Systems and methods for decompressing compressed data that has been compressed by way of a lossless compression algorithm are described herein. In a general embodiment, a graphics processing unit (GPU) is programmed to receive compressed data packets and decompress such packets in parallel. The compressed data packets are compressed representations of an image, and the lossless compression algorithm is a Rice compression algorithm.
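
    For orientation, Rice coding stores each value as a unary-coded quotient followed by k literal remainder bits. The plain serial decoder below is our illustration of that format, not the patented method; the patent's contribution is arranging the compressed stream into independently decodable packets so that GPU threads can run such decoders in parallel.

```cuda
#include <cstdint>
#include <vector>

// Minimal MSB-first bit reader over a byte buffer (illustrative).
struct BitReader {
    const uint8_t* data;
    size_t bit = 0;
    int next() { int b = (data[bit >> 3] >> (7 - (bit & 7))) & 1; ++bit; return b; }
};

// Decode n Rice-coded values with parameter k (one common convention:
// a run of 0s terminated by a 1 encodes the quotient in unary).
std::vector<uint32_t> rice_decode(const uint8_t* buf, int n, int k) {
    BitReader r{buf};
    std::vector<uint32_t> out(n);
    for (int i = 0; i < n; ++i) {
        uint32_t q = 0;
        while (r.next() == 0) ++q;                      // unary quotient
        uint32_t rem = 0;
        for (int j = 0; j < k; ++j) rem = (rem << 1) | r.next();
        out[i] = (q << k) | rem;                        // value = q * 2^k + rem
    }
    return out;
}
```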

  9. Graphics processing unit-assisted lossless decompression

    Energy Technology Data Exchange (ETDEWEB)

    Loughry, Thomas A.

    2016-04-12

    Systems and methods for decompressing compressed data that has been compressed by way of a lossless compression algorithm are described herein. In a general embodiment, a graphics processing unit (GPU) is programmed to receive compressed data packets and decompress such packets in parallel. The compressed data packets are compressed representations of an image, and the lossless compression algorithm is a Rice compression algorithm.

  10. Diffusion tensor fiber tracking on graphics processing units.

    Science.gov (United States)

    Mittmann, Adiel; Comunello, Eros; von Wangenheim, Aldo

    2008-10-01

    Diffusion tensor magnetic resonance imaging has been successfully applied to the process of fiber tracking, which determines the location of fiber bundles within the human brain. This process, however, can be quite lengthy when run on a regular workstation. We present a means of executing this process by making use of the graphics processing units of computers' video cards, which provide a low-cost parallel execution environment that algorithms like fiber tracking can benefit from. With this method we have achieved performance gains varying from 14 to 40 times on common computers. Because of accuracy issues inherent to current graphics processing units, we define a variation index in order to assess how close the results obtained with our method are to those generated by programs running on the central processing units of computers. This index shows that results produced by our method are acceptable when compared to those of traditional programs.

  11. Relativistic hydrodynamics on graphics processing units

    CERN Document Server

    Sikorski, Jan; Porter-Sobieraj, Joanna; Słodkowski, Marcin; Krzyżanowski, Piotr; Książek, Natalia; Duda, Przemysław

    2016-01-01

    Hydrodynamics calculations have been successfully used in studies of the bulk properties of the Quark-Gluon Plasma, particularly of elliptic flow and shear viscosity. However, there are areas (for instance, event-by-event simulations for flow fluctuations and higher-order flow harmonics studies) where further advancement is hampered by the lack of an efficient and precise 3+1D program. This problem can be solved by using Graphics Processing Unit (GPU) computing, which offers an unprecedented increase of computing power compared to standard CPU simulations. In this work, we present an implementation of 3+1D ideal hydrodynamics simulations on the Graphics Processing Unit using the Nvidia CUDA framework. MUSTA-FORCE (MUlti STAge, First ORder CEntral, with a slope limiter and MUSCL reconstruction) and WENO (Weighted Essentially Non-Oscillating) schemes are employed in the simulations, delivering second (MUSTA-FORCE), fifth and seventh (WENO) order of accuracy. A third-order Runge-Kutta scheme was used for integration in the t...
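
    For reference, the third-order Runge-Kutta integrator most often paired with such shock-capturing schemes is the strong-stability-preserving variant of Shu and Osher; the abstract does not name the exact variant, so take this as the standard form. For du/dt = L(u):

    $$u^{(1)} = u^n + \Delta t\,L(u^n)$$
    $$u^{(2)} = \tfrac{3}{4}u^n + \tfrac{1}{4}\left(u^{(1)} + \Delta t\,L(u^{(1)})\right)$$
    $$u^{n+1} = \tfrac{1}{3}u^n + \tfrac{2}{3}\left(u^{(2)} + \Delta t\,L(u^{(2)})\right)$$

    Each stage is one full evaluation of the MUSTA-FORCE or WENO spatial operator L, which is exactly the data-parallel work the GPU accelerates.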

  12. Graphics Processing Unit Assisted Thermographic Compositing

    Science.gov (United States)

    Ragasa, Scott; McDougal, Matthew; Russell, Sam

    2013-01-01

    Objective: To develop a software application utilizing general-purpose graphics processing units (GPUs) for the analysis of large sets of thermographic data. Background: Over the past few years, an increasing effort among scientists and engineers to utilize the GPU in a more general-purpose fashion is allowing for supercomputer-level results at individual workstations. As data sets grow, the methods used to work with them must grow at an equal, and often greater, pace. Certain common computations can take advantage of the massively parallel and optimized hardware constructs of the GPU to allow for throughput that was previously reserved for compute clusters. These common computations have high degrees of data parallelism; that is, the same computation is applied to a large set of data, where the result does not depend on other data elements. Signal (image) processing is one area where GPUs are being used to greatly increase the performance of certain algorithms and analysis techniques.
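
    The data-parallel pattern defined above maps one GPU thread to one data element. A minimal CUDA sketch of a per-pixel operation on thermographic frames (calibration constants and names are placeholders of ours, not the application's code):

```cuda
#include <cstdint>

// Convert raw sensor counts to temperature, independently per pixel.
// No thread reads another thread's output: the computation is data-parallel.
__global__ void counts_to_temp(float* temp, const uint16_t* counts,
                               float gain, float offset, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) temp[i] = gain * (float)counts[i] + offset;
}

// Launch with one thread per pixel across the whole frame stack:
// counts_to_temp<<<(n + 255) / 256, 256>>>(temp, counts, g, o, n);
```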

  13. Magnetohydrodynamics simulations on graphics processing units

    CERN Document Server

    Wong, Hon-Cheng; Feng, Xueshang; Tang, Zesheng

    2009-01-01

    Magnetohydrodynamics (MHD) simulations based on the ideal MHD equations have become a powerful tool for modeling phenomena in a wide range of applications including laboratory, astrophysical, and space plasmas. In general, high-resolution methods for solving the ideal MHD equations are computationally expensive, and Beowulf clusters or even supercomputers are often used to run the codes that implement these methods. With the advent of the Compute Unified Device Architecture (CUDA), modern graphics processing units (GPUs) provide an alternative approach to parallel computing for scientific simulations. In this paper we present, to the authors' knowledge, the first implementation to accelerate computation of MHD simulations on GPUs. Numerical tests have been performed to validate the correctness of our GPU MHD code. Performance measurements show that our GPU-based implementation achieves speedups of 2 (1D problem with 2048 grids), 106 (2D problem with 1024² grids), and 43 (3D problem with 128³ grids), respec...

  14. Graphics Processing Units for HEP trigger systems

    Science.gov (United States)

    Ammendola, R.; Bauce, M.; Biagioni, A.; Chiozzi, S.; Cotta Ramusino, A.; Fantechi, R.; Fiorini, M.; Giagu, S.; Gianoli, A.; Lamanna, G.; Lonardo, A.; Messina, A.; Neri, I.; Paolucci, P. S.; Piandani, R.; Pontisso, L.; Rescigno, M.; Simula, F.; Sozzi, M.; Vicini, P.

    2016-07-01

    General-purpose computing on GPUs (Graphics Processing Units) is emerging as a new paradigm in several fields of science, although so far applications have been tailored to the specific strengths of such devices as accelerators in offline computation. With the steady reduction of GPU latencies and the increase in link and memory throughput, the use of such devices for real-time applications in high-energy physics data acquisition and trigger systems is becoming ripe. We discuss the use of online parallel computing on GPUs for a synchronous low-level trigger, focusing on the CERN NA62 experiment trigger system. The use of GPUs in higher-level trigger systems is also briefly considered.

  15. Kernel density estimation using graphical processing unit

    Science.gov (United States)

    Sunarko, Su'ud, Zaki

    2015-09-01

    Kernel density estimation for particles distributed over a 2-dimensional space is calculated using a single graphical processing unit (a GTX 660 Ti GPU) and the CUDA-C language. Parallel calculations are done for particles having a bivariate normal distribution by assigning the calculations for equally-spaced node points to each scalar processor in the GPU. The numbers of particles, blocks, and threads are varied to identify a favorable configuration. Comparisons are obtained by performing the same calculation using 1, 2, and 4 processors on a 3.0 GHz CPU using MPICH 2.0 routines. Speedups attained with the GPU are in the range of 88 to 349 times compared to the multiprocessor CPU. Blocks of 128 threads are found to be the optimum configuration for this case.
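
    The mapping described, one thread per node point accumulating a bivariate normal kernel over all particles, can be sketched in CUDA as follows (an isotropic Gaussian with bandwidth h is assumed; names are illustrative, not the paper's code):

```cuda
// One thread per grid node: sum the Gaussian kernel over all particles.
__global__ void kde2d(float* density, const float* nx, const float* ny,
                      const float* px, const float* py,
                      int nnodes, int nparticles, float h) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nnodes) return;
    const float norm = 1.0f / (6.2831853f * h * h * (float)nparticles); // 1/(2*pi*h^2*N)
    float sum = 0.0f;
    for (int j = 0; j < nparticles; ++j) {
        float dx = (nx[i] - px[j]) / h;
        float dy = (ny[i] - py[j]) / h;
        sum += expf(-0.5f * (dx * dx + dy * dy));
    }
    density[i] = norm * sum;
}
```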

  16. Graphics Processing Units for HEP trigger systems

    Energy Technology Data Exchange (ETDEWEB)

    Ammendola, R. [INFN Sezione di Roma “Tor Vergata”, Via della Ricerca Scientifica 1, 00133 Roma (Italy); Bauce, M. [INFN Sezione di Roma “La Sapienza”, P.le A. Moro 2, 00185 Roma (Italy); University of Rome “La Sapienza”, P.le A. Moro 2, 00185 Roma (Italy); Biagioni, A. [INFN Sezione di Roma “La Sapienza”, P.le A. Moro 2, 00185 Roma (Italy); Chiozzi, S.; Cotta Ramusino, A. [INFN Sezione di Ferrara, Via Saragat 1, 44122 Ferrara (Italy); University of Ferrara, Via Saragat 1, 44122 Ferrara (Italy); Fantechi, R. [INFN Sezione di Pisa, Largo B. Pontecorvo 3, 56127 Pisa (Italy); CERN, Geneva (Switzerland); Fiorini, M. [INFN Sezione di Ferrara, Via Saragat 1, 44122 Ferrara (Italy); University of Ferrara, Via Saragat 1, 44122 Ferrara (Italy); Giagu, S. [INFN Sezione di Roma “La Sapienza”, P.le A. Moro 2, 00185 Roma (Italy); University of Rome “La Sapienza”, P.le A. Moro 2, 00185 Roma (Italy); Gianoli, A. [INFN Sezione di Ferrara, Via Saragat 1, 44122 Ferrara (Italy); University of Ferrara, Via Saragat 1, 44122 Ferrara (Italy); Lamanna, G., E-mail: gianluca.lamanna@cern.ch [INFN Sezione di Pisa, Largo B. Pontecorvo 3, 56127 Pisa (Italy); INFN Laboratori Nazionali di Frascati, Via Enrico Fermi 40, 00044 Frascati (Roma) (Italy); Lonardo, A. [INFN Sezione di Roma “La Sapienza”, P.le A. Moro 2, 00185 Roma (Italy); Messina, A. [INFN Sezione di Roma “La Sapienza”, P.le A. Moro 2, 00185 Roma (Italy); University of Rome “La Sapienza”, P.le A. Moro 2, 00185 Roma (Italy); and others

    2016-07-11

    General-purpose computing on GPUs (Graphics Processing Units) is emerging as a new paradigm in several fields of science, although so far applications have been tailored to the specific strengths of such devices as accelerators in offline computation. With the steady reduction of GPU latencies and the increase in link and memory throughput, the use of such devices for real-time applications in high-energy physics data acquisition and trigger systems is becoming ripe. We discuss the use of online parallel computing on GPUs for a synchronous low-level trigger, focusing on the CERN NA62 experiment trigger system. The use of GPUs in higher-level trigger systems is also briefly considered.

  17. Process-based costing.

    Science.gov (United States)

    Lee, Robert H; Bott, Marjorie J; Forbes, Sarah; Redford, Linda; Swagerty, Daniel L; Taunton, Roma Lee

    2003-01-01

    Understanding how quality improvement affects costs is important. Unfortunately, low-cost, reliable ways of measuring direct costs are scarce. This article builds on the principles of process improvement to develop a costing strategy that meets both criteria. Process-based costing has 4 steps: developing a flowchart, estimating resource use, valuing resources, and calculating direct costs. To illustrate the technique, this article uses it to cost the care planning process in 3 long-term care facilities. We conclude that process-based costing is easy to implement; generates reliable, valid data; and allows nursing managers to assess the costs of new or modified processes.
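
    The last two steps combine into a simple sum. In a generic formalization (our notation, not the article's), the direct cost of a process that consumes n resource types is

    $$C = \sum_{i=1}^{n} q_i\,p_i$$

    where q_i is the quantity of resource i estimated from the flowchart (for the care planning example, minutes of nursing time) and p_i is its unit value (the corresponding wage rate).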

  18. Accelerating the Fourier split operator method via graphics processing units

    CERN Document Server

    Bauke, Heiko

    2010-01-01

    Current generations of graphics processing units have turned into highly parallel devices with general computing capabilities. Thus, graphics processing units may be utilized, for example, to solve time-dependent partial differential equations by the Fourier split operator method. In this contribution, we demonstrate that graphics processing units are capable of calculating fast Fourier transforms much more efficiently than traditional central processing units, and thus render efficient implementations of the Fourier split operator method possible. Performance gains of more than an order of magnitude, as compared to implementations for traditional central processing units, are reached in the solution of the time-dependent Schrödinger equation and the time-dependent Dirac equation.
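
    For context, the Fourier split operator method advances the wavefunction by splitting the Hamiltonian H = T + V and applying the kinetic part in Fourier space, where it is diagonal. In the standard second-order Strang form (units with hbar = 1):

    $$\psi(t+\Delta t) \approx e^{-\frac{i}{2}V\,\Delta t}\,\mathcal{F}^{-1}\,e^{-i\,T(k)\,\Delta t}\,\mathcal{F}\,e^{-\frac{i}{2}V\,\Delta t}\,\psi(t)$$

    Each time step therefore costs two FFTs plus pointwise multiplications, which is precisely the workload the GPU accelerates.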

  19. GRAPHICAL MODELS OF THE AIRCRAFT MAINTENANCE PROCESS

    Directory of Open Access Journals (Sweden)

    Stanislav Vladimirovich Daletskiy

    2017-01-01

    The aircraft maintenance process is realized as a rapid sequence of organizational and technical maintenance states, and its research and analysis are carried out by statistical methods. The maintenance process comprises aircraft technical states connected with the objective patterns of change in the technical qualities of the aircraft as a maintenance object, and organizational states which determine the subjective organization and planning of aircraft use. The objective maintenance process is realized in the Maintenance and Repair System, which does not include maintenance organization and planning and is a set of related elements: aircraft, Maintenance and Repair measures, executors, and documentation that sets the rules of their interaction for maintaining aircraft reliability and readiness for flight. The aircraft organizational and technical states are considered; their characteristics and heuristic estimates of the connections in the knots and arcs of the graphs, and of the aircraft organizational states during regular maintenance and at technical state failure, are given. It is shown that in real conditions of aircraft maintenance, planned control of the aircraft technical state, and maintenance control through it, is defined only by the Maintenance and Repair conditions for a given Maintenance and Repair type and form structure, and correspondingly by the principles for assigning Maintenance and Repair work types for execution according to the maintenance and reconstruction strategies of the aircraft and all its units. The realization of the planned Maintenance and Repair process constitutes the constant component of maintenance. The proposed graphical models allow quantitative correlations between graph knots to be revealed in order to improve maintenance processes by statistical research methods, which reduces manning, timetables, and expenses for providing safe civil aviation aircraft maintenance.

  1. Parallelizing the Cellular Potts Model on graphics processing units

    Science.gov (United States)

    Tapia, José Juan; D'Souza, Roshan M.

    2011-04-01

    The Cellular Potts Model (CPM) is a lattice-based modeling technique used for simulating cellular structures in computational biology. The computational complexity of the model means that current serial implementations restrict the size of simulation to a level well below biological relevance. Parallelization on computing clusters enables scaling the size of the simulation but only marginally addresses computational speed due to the limited memory bandwidth between nodes. In this paper we present new data-parallel algorithms and data structures for simulating the Cellular Potts Model on graphics processing units. Our implementations handle most terms in the Hamiltonian, including the cell-cell adhesion constraint, cell volume constraint, cell surface area constraint, and cell haptotaxis. We use fine-level checkerboards with lock mechanisms using atomic operations to enable consistent updates while maintaining a high level of parallelism. A new data-parallel memory allocation algorithm has been developed to handle cell division. Tests show that our implementation enables simulations of >10 cells with lattice sizes of up to 256³ on a single graphics card. Benchmarks show that our implementation runs ~80× faster than serial implementations, and ~5× faster than previous parallel implementations on computing clusters consisting of 25 nodes. The wide availability and economy of graphics cards mean that our techniques will enable simulation of realistically sized models at a fraction of the time and cost of previous implementations, and are expected to greatly broaden the scope of CPM applications.

  2. Accelerating Radio Astronomy Cross-Correlation with Graphics Processing Units

    CERN Document Server

    Clark, M A; Greenhill, L J

    2011-01-01

    We present a highly parallel implementation of the cross-correlation of time-series data using graphics processing units (GPUs), which is scalable to hundreds of independent inputs and suitable for the processing of signals from "Large-N" arrays of many radio antennas. The computational part of the algorithm, the X-engine, is implemented efficiently on Nvidia's Fermi architecture, sustaining up to 79% of the peak single-precision floating-point throughput. We compare performance obtained for hardware- and software-managed caches, observing significantly better performance for the latter. The high performance reported involves use of a multi-level data tiling strategy in memory and use of a pipelined algorithm with simultaneous computation and transfer of data from host to device memory. The speed of code development, flexibility, and low cost of the GPU implementations compared to ASIC and FPGA implementations have the potential to greatly shorten the cycle of correlator development and deployment, for case...

  3. Energy Efficient Iris Recognition With Graphics Processing Units

    National Research Council Canada - National Science Library

    Rakvic, Ryan; Broussard, Randy; Ngo, Hau

    2016-01-01

    ... In the past few years, however, this growth has slowed for central processing units (CPUs). Instead, there has been a shift to multicore computing, specifically with general-purpose graphics processing units (GPUs...

  4. Exploiting graphics processing units for computational biology and bioinformatics.

    Science.gov (United States)

    Payne, Joshua L; Sinnott-Armstrong, Nicholas A; Moore, Jason H

    2010-09-01

    Advances in the video gaming industry have led to the production of low-cost, high-performance graphics processing units (GPUs) that possess more memory bandwidth and computational capability than central processing units (CPUs), the standard workhorses of scientific computing. With the recent release of general-purpose GPUs and NVIDIA's GPU programming language, CUDA, graphics engines are being adopted widely in scientific computing applications, particularly in the fields of computational biology and bioinformatics. The goal of this article is to concisely present an introduction to GPU hardware and programming, aimed at the computational biologist or bioinformaticist. To this end, we discuss the primary differences between GPU and CPU architecture, introduce the basics of the CUDA programming language, and discuss important CUDA programming practices, such as the proper use of coalesced reads, data types, and memory hierarchies. We highlight each of these topics in the context of computing the all-pairs distance between instances in a dataset, a common procedure in numerous disciplines of scientific computing. We conclude with a runtime analysis of the GPU and CPU implementations of the all-pairs distance calculation. We show our final GPU implementation to outperform the CPU implementation by a factor of 1700.
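
    The article's running example has a natural CUDA formulation: one thread per (i, j) pair of instances. The sketch below is our minimal version, not the authors' tuned implementation (which additionally stages data through shared memory to keep reads coalesced):

```cuda
#include <cmath>

// Euclidean distance between every pair of instances.
// data is n x d, row-major; dist is n x n.
__global__ void all_pairs(float* dist, const float* data, int n, int d) {
    int i = blockIdx.y * blockDim.y + threadIdx.y;
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n || j >= n) return;
    float s = 0.0f;
    for (int k = 0; k < d; ++k) {
        float diff = data[i * d + k] - data[j * d + k];
        s += diff * diff;
    }
    dist[i * n + j] = sqrtf(s);
}

// Launch over a 2D grid, e.g. 16x16 threads per block:
// dim3 block(16, 16), grid((n + 15) / 16, (n + 15) / 16);
// all_pairs<<<grid, block>>>(dist, data, n, d);
```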

  5. A Relational Reasoning Approach to Text-Graphic Processing

    Science.gov (United States)

    Danielson, Robert W.; Sinatra, Gale M.

    2017-01-01

    We propose that research on text-graphic processing could be strengthened by the inclusion of relational reasoning perspectives. We briefly outline four aspects of relational reasoning: "analogies," "anomalies," "antinomies", and "antitheses". Next, we illustrate how text-graphic researchers have been…

  6. High-throughput sequence alignment using Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    Trapnell Cole

    2007-12-01

    Background: The recent availability of new, less expensive high-throughput DNA sequencing technologies has yielded a dramatic increase in the volume of sequence data that must be analyzed. These data are being generated for several purposes, including genotyping, genome resequencing, metagenomics, and de novo genome assembly projects. Sequence alignment programs such as MUMmer have proven essential for analysis of these data, but researchers will need ever faster, high-throughput alignment tools running on inexpensive hardware to keep up with new sequence technologies. Results: This paper describes MUMmerGPU, an open-source high-throughput parallel pairwise local sequence alignment program that runs on commodity Graphics Processing Units (GPUs) in common workstations. MUMmerGPU uses the Compute Unified Device Architecture (CUDA) from nVidia to align multiple query sequences against a single reference sequence stored as a suffix tree. By processing the queries in parallel on the highly parallel graphics card, MUMmerGPU achieves more than a 10-fold speedup over a serial CPU version of the sequence alignment kernel, and outperforms the exact alignment component of MUMmer on a high-end CPU by 3.5-fold in total application time when aligning reads from recent sequencing projects using Solexa/Illumina, 454, and Sanger sequencing technologies. Conclusion: MUMmerGPU is a low-cost, ultra-fast sequence alignment program designed to handle the increasing volume of data produced by new, high-throughput sequencing technologies. MUMmerGPU demonstrates that even memory-intensive applications can run significantly faster on the relatively low-cost GPU than on the CPU.

  7. Parallelization of heterogeneous reactor calculations on a graphics processing unit

    Energy Technology Data Exchange (ETDEWEB)

    Malofeev, V. M., E-mail: vm-malofeev@mail.ru; Pal’shin, V. A. [National Research Center Kurchatov Institute (Russian Federation)

    2016-12-15

    Parallelization is applied to the neutron calculations performed by the heterogeneous method on a graphics processing unit. The parallel algorithm of the modified TREC code is described. The efficiency of the parallel algorithm is evaluated.

  8. Real-time radar signal processing using GPGPU (general-purpose graphic processing unit)

    Science.gov (United States)

    Kong, Fanxing; Zhang, Yan Rockee; Cai, Jingxiao; Palmer, Robert D.

    2016-05-01

    This study introduces a practical approach to developing a real-time signal processing chain for a general phased array radar on NVIDIA GPUs (Graphical Processing Units) using CUDA (Compute Unified Device Architecture) libraries such as cuBLAS and cuFFT, which are adopted from open-source libraries and optimized for NVIDIA GPUs. The processed results are rigorously verified against those from the CPUs. Performance, benchmarked as computation time for various input data cube sizes, is compared across GPUs and CPUs. Through the analysis, it is demonstrated that GPGPU (General-Purpose GPU) real-time processing of array radar data is possible with relatively low-cost commercial GPUs.
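
    Of the libraries named, cuFFT carries the bulk of a pulse-compression chain. A minimal sketch of a batched complex-to-complex transform (sizes and the surrounding steps are illustrative assumptions, not the paper's configuration):

```cuda
#include <cufft.h>
#include <cuda_runtime.h>

int main() {
    const int nfft  = 4096;   // samples per pulse (illustrative)
    const int batch = 128;    // pulses transformed in one call

    cufftComplex* d;
    cudaMalloc(&d, sizeof(cufftComplex) * nfft * batch);
    // ... copy radar samples into d ...

    cufftHandle plan;
    cufftPlan1d(&plan, nfft, CUFFT_C2C, batch);  // one plan, many FFTs
    cufftExecC2C(plan, d, d, CUFFT_FORWARD);     // in-place forward transform
    // ... multiply by the matched-filter spectrum, then run CUFFT_INVERSE ...

    cufftDestroy(plan);
    cudaFree(d);
    return 0;
}
```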

  9. Accelerating glassy dynamics using graphics processing units

    CERN Document Server

    Colberg, Peter H

    2009-01-01

    Modern graphics hardware offers peak performances close to 1 Tflop/s, and NVIDIA's CUDA provides a flexible and convenient programming interface to exploit these immense computing resources. We demonstrate the ability of GPUs to perform high-precision molecular dynamics simulations for nearly a million particles running stably over many days. Particular emphasis is put on the numerical long-time stability in terms of energy and momentum conservation. Floating point precision is a crucial issue here, and sufficient precision is maintained by double-single emulation of the floating point arithmetic. As a demanding test case, we have reproduced the slow dynamics of a binary Lennard-Jones mixture close to the glass transition. The improved numerical accuracy permits us to follow the relaxation dynamics of a large system over 4 non-trivial decades in time. Further, our data provide evidence for a negative power-law decay of the velocity autocorrelation function with exponent 5/2 in the close vicinity of the transi...
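
    Double-single (float-float) emulation represents one value as an unevaluated sum of two single-precision numbers. A standard addition in this representation, following the well-known Knuth/Dekker two-sum construction (not necessarily the exact variant used in the paper), looks like:

```cuda
// A double-single value: val = hi + lo, with lo holding the bits that
// do not fit in hi. Adds two such numbers using only float arithmetic.
__device__ float2 ds_add(float2 a, float2 b) {
    float t1 = a.x + b.x;            // leading sum
    float e  = t1 - a.x;             // two-sum: recover the rounding error
    float t2 = ((b.x - e) + (a.x - (t1 - e))) + a.y + b.y;
    float hi = t1 + t2;              // renormalize: hi carries the leading bits
    float lo = t2 - (hi - t1);
    return make_float2(hi, lo);
}
```

    The construction relies on the exact order of the float operations, so the compiler must not reassociate them; nvcc preserves floating-point ordering by default.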

  10. Accelerating sparse linear algebra using graphics processing units

    Science.gov (United States)

    Spagnoli, Kyle E.; Humphrey, John R.; Price, Daniel K.; Kelmelis, Eric J.

    2011-06-01

    The modern graphics processing unit (GPU) found in many standard personal computers is a highly parallel math processor capable of over 1 TFLOPS of peak computational throughput at a cost similar to a high-end CPU, with an excellent FLOPS-to-watt ratio. High-level sparse linear algebra operations are computationally intense, often requiring large amounts of parallel operations, and would seem a natural fit for the processing power of the GPU. Our work is a GPU-accelerated implementation of sparse linear algebra routines. We present results from both direct and iterative sparse system solvers. The GPU execution model featured by NVIDIA GPUs based on CUDA demands very strong parallelism, requiring between hundreds and thousands of simultaneous operations to achieve high performance. Some constructs from linear algebra map extremely well to the GPU and others map poorly. CPUs, on the other hand, do well at smaller-order parallelism and perform acceptably during low-parallelism code segments. Our work addresses this via a hybrid processing model, in which the CPU and GPU work simultaneously to produce results. In many cases, this is accomplished by allowing each platform to do the work it performs most naturally. For example, the CPU is responsible for the graph theory portion of the direct solvers while the GPU simultaneously performs the low-level linear algebra routines.
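
    The workhorse of such iterative solvers is sparse matrix-vector multiplication. A minimal CUDA kernel over the common compressed sparse row (CSR) format, one thread per row (a simple illustrative variant, not the authors' library code):

```cuda
// y = A * x for A in CSR format: row_ptr has n+1 entries,
// col_idx and val have one entry per nonzero.
__global__ void spmv_csr(float* y, const int* row_ptr, const int* col_idx,
                         const float* val, const float* x, int n) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= n) return;
    float sum = 0.0f;
    for (int k = row_ptr[row]; k < row_ptr[row + 1]; ++k)
        sum += val[k] * x[col_idx[k]];
    y[row] = sum;
}
```

    One thread per row is the simplest mapping; rows with many nonzeros are better served by a warp per row, which illustrates why some linear algebra constructs map well to the GPU and others poorly.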

  11. MASSIVELY PARALLEL LATENT SEMANTIC ANALYSES USING A GRAPHICS PROCESSING UNIT

    Energy Technology Data Exchange (ETDEWEB)

    Cavanagh, J.; Cui, S.

    2009-01-01

    Latent Semantic Analysis (LSA) aims to reduce the dimensions of large term-document datasets using Singular Value Decomposition. However, with the ever-expanding size of datasets, current implementations are not fast enough to quickly and easily compute the results on a standard PC. A graphics processing unit (GPU) can solve some highly parallel problems much faster than a traditional sequential processor or central processing unit (CPU). Thus, a deployable system using a GPU to speed up large-scale LSA processes would be a much more effective choice (in terms of cost/performance ratio) than using a PC cluster. Due to the GPU's application-specific architecture, harnessing the GPU's computational prowess for LSA is a great challenge. We presented a parallel LSA implementation on the GPU, using NVIDIA® Compute Unified Device Architecture and Compute Unified Basic Linear Algebra Subprograms software. The performance of this implementation is compared to traditional LSA implementation on a CPU using an optimized Basic Linear Algebra Subprograms library. After implementation, we discovered that the GPU version of the algorithm was twice as fast for large matrices (1000x1000 and above) that had dimensions not divisible by 16. For large matrices that did have dimensions divisible by 16, the GPU algorithm ran five to six times faster than the CPU version. The large variation is due to architectural benefits of the GPU for matrices divisible by 16. It should be noted that the overall speeds for the CPU version did not vary from relative normal when the matrix dimensions were divisible by 16. Further research is needed in order to produce a fully implementable version of LSA. With that in mind, the research we presented shows that the GPU is a viable option for increasing the speed of LSA, in terms of cost/performance ratio.

  12. Massively Parallel Latent Semantic Analyses using a Graphics Processing Unit

    Energy Technology Data Exchange (ETDEWEB)

    Cavanagh, Joseph M [ORNL; Cui, Xiaohui [ORNL

    2009-01-01

    Latent Semantic Analysis (LSA) aims to reduce the dimensions of large term-document datasets using Singular Value Decomposition. However, with the ever-expanding size of data sets, current implementations are not fast enough to quickly and easily compute the results on a standard PC. The Graphics Processing Unit (GPU) can solve some highly parallel problems much faster than the traditional sequential processor (CPU). Thus, a deployable system using a GPU to speed up large-scale LSA processes would be a much more effective choice (in terms of cost/performance ratio) than using a computer cluster. Due to the GPU's application-specific architecture, harnessing the GPU's computational prowess for LSA is a great challenge. We present a parallel LSA implementation on the GPU, using NVIDIA Compute Unified Device Architecture and Compute Unified Basic Linear Algebra Subprograms. The performance of this implementation is compared to traditional LSA implementation on a CPU using an optimized Basic Linear Algebra Subprograms library. After implementation, we discovered that the GPU version of the algorithm was twice as fast for large matrices (1000x1000 and above) that had dimensions not divisible by 16. For large matrices that did have dimensions divisible by 16, the GPU algorithm ran five to six times faster than the CPU version. The large variation is due to architectural benefits the GPU has for matrices divisible by 16. It should be noted that the overall speeds for the CPU version did not vary from relative normal when the matrix dimensions were divisible by 16. Further research is needed in order to produce a fully implementable version of LSA. With that in mind, the research we presented shows that the GPU is a viable option for increasing the speed of LSA, in terms of cost/performance ratio.

  13. Handling geophysical flows: Numerical modelling using Graphical Processing Units

    Science.gov (United States)

    Garcia-Navarro, Pilar; Lacasta, Asier; Juez, Carmelo; Morales-Hernandez, Mario

    2016-04-01

    Computational tools may help engineers in the assessment of sediment transport during decision-making processes. The main requirements are that the numerical results be accurate and that the simulation models be fast. The present work is based on the 2D shallow water equations in combination with the 2D Exner equation [1]. The accuracy of the resulting numerical model was already discussed in previous work. Regarding the speed of the computation, the Exner equation slows down the already costly 2D shallow water model, as the number of variables to solve is increased and the numerical stability is more restrictive. On the other hand, the movement of poorly sorted material over steep areas constitutes a hazardous environmental problem, and computational tools help in the prediction of such landslides [2]. In order to overcome this problem, this work proposes the use of Graphical Processing Units (GPUs) to decrease the simulation time significantly [3, 4]. The numerical scheme implemented on the GPU is based on a finite volume scheme. The mathematical model and the numerical implementation are compared against experimental and field data. In addition, the computational times obtained with the graphical hardware technology are compared against single-core (sequential) and multi-core (parallel) CPU implementations. References: [Juez et al. (2014)] Juez, C., Murillo, J., & García-Navarro, P. (2014). A 2D weakly-coupled and efficient numerical model for transient shallow flow and movable bed. Advances in Water Resources, 71, 93-109. [Juez et al. (2013)] Juez, C., Murillo, J., & García-Navarro, P. (2013). 2D simulation of granular flow over irregular steep slopes using global and local coordinates. Journal of Computational Physics, 225, 166-204. [Lacasta et al. (2014)] Lacasta, A., Morales-Hernández, M., Murillo, J., & García-Navarro, P. (2014). An optimized GPU implementation of a 2D free surface simulation model on unstructured meshes. Advances in Engineering Software, 78, 1-15. [Lacasta

  14. Heterogeneous Multicore Parallel Programming for Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    Francois Bodin

    2009-01-01

    Hybrid parallel multicore architectures based on graphics processing units (GPUs) can provide tremendous computing power. Current NVIDIA and AMD Graphics Product Group hardware displays a peak performance of hundreds of gigaflops. However, exploiting GPUs from existing applications is a difficult task that requires non-portable rewriting of the code. In this paper, we present HMPP, a Heterogeneous Multicore Parallel Programming workbench with compilers, developed by CAPS entreprise, that allows the integration of heterogeneous hardware accelerators in an unintrusive manner while preserving the legacy code.

  15. Reflector antenna analysis using physical optics on Graphics Processing Units

    DEFF Research Database (Denmark)

    Borries, Oscar Peter; Sørensen, Hans Henrik Brandenborg; Dammann, Bernd

    2014-01-01

    The Physical Optics approximation is a widely used asymptotic method for calculating the scattering from electrically large bodies. It requires significant computational work and little memory, and is thus well suited for application on a Graphics Processing Unit. Here, we investigate the performance of an implementation and demonstrate that, while there are some implementational pitfalls, a careful implementation can result in impressive improvements.

  16. Utilizing Graphics Processing Units for Network Anomaly Detection

    Science.gov (United States)

    2012-09-13

    matching system using deterministic finite automata and extended finite automata, resulting in a speedup of 9x over the CPU implementation [SGO09]. Kovach ... pages 14–18, 2009. [Kov10] Nicholas S. Kovach. Accelerating malware detection via a graphics processing unit, 2010. http://www.dtic.mil/dtic/tr

  17. Visualisation for Stochastic Process Algebras: The Graphic Truth

    DEFF Research Database (Denmark)

    Smith, Michael James Andrew; Gilmore, Stephen

    2011-01-01

    There have historically been two approaches to performance modelling. On the one hand, textual language-based formalisms such as stochastic process algebras allow compositional modelling that is portable and easy to manage. In contrast, graphical formalisms such as stochastic Petri nets and stoch...

  18. An Interactive Graphics Program for Investigating Digital Signal Processing.

    Science.gov (United States)

    Miller, Billy K.; And Others

    1983-01-01

    Describes development of an interactive computer graphics program for use in teaching digital signal processing. The program allows students to interactively configure digital systems on a monitor display and observe their system's performance by means of digital plots on the system's outputs. A sample program run is included. (JN)

  19. Acceleration of option pricing technique on graphics processing units

    NARCIS (Netherlands)

    Zhang, B.; Oosterlee, C.W.

    2010-01-01

    The acceleration of an option pricing technique based on Fourier cosine expansions on the Graphics Processing Unit (GPU) is reported. European options, in particular with multiple strikes, and Bermudan options will be discussed. The influence of the number of terms in the Fourier cosine series expan

  20. Acceleration of option pricing technique on graphics processing units

    NARCIS (Netherlands)

    Zhang, B.; Oosterlee, C.W.

    2014-01-01

    The acceleration of an option pricing technique based on Fourier cosine expansions on the graphics processing unit (GPU) is reported. European options, in particular with multiple strikes, and Bermudan options will be discussed. The influence of the number of terms in the Fourier cosine series expan

  1. Graphic Arts: The Press and Finishing Processes. Third Edition.

    Science.gov (United States)

    Crummett, Dan

    This document contains teacher and student materials for a course in graphic arts concentrating on printing presses and the finishing process for publications. Seven units of instruction cover the following topics: (1) offset press systems; (2) offset inks and dampening chemistry; (3) offset press operating procedures; (4) preventive maintenance…

  2. The Use of Computer Graphics in the Design Process.

    Science.gov (United States)

    Palazzi, Maria

    This master's thesis examines applications of computer technology to the field of industrial design and ways in which technology can transform the traditional process. Following a statement of the problem, the history and applications of the fields of computer graphics and industrial design are reviewed. The traditional industrial design process…

  3. Flocking-based Document Clustering on the Graphics Processing Unit

    Energy Technology Data Exchange (ETDEWEB)

    Cui, Xiaohui [ORNL; Potok, Thomas E [ORNL; Patton, Robert M [ORNL; ST Charles, Jesse Lee [ORNL

    2008-01-01

    Analyzing and grouping documents by content is a complex problem. One explored method of solving this problem borrows from nature, imitating the flocking behavior of birds. Each bird represents a single document and flies toward other documents that are similar to it. One limitation of this method of document clustering is its complexity, O(n²). As the number of documents grows, it becomes increasingly difficult to receive results in a reasonable amount of time. However, flocking behavior, along with most naturally inspired algorithms such as ant colony optimization and particle swarm optimization, is highly parallel, and such algorithms have found increased performance on expensive cluster computers. In the last few years, the graphics processing unit (GPU) has received attention for its ability to solve highly parallel and semi-parallel problems much faster than the traditional sequential processor. Some applications see a huge increase in performance on this new platform. The cost of these high-performance devices is also marginal when compared with the price of cluster machines. In this paper, we have conducted research to exploit this architecture and apply its strengths to the document flocking problem. Our results highlight the potential benefit the GPU brings to all naturally inspired algorithms. Using the CUDA platform from NVIDIA, we developed a document flocking implementation to be run on the NVIDIA GeForce 8800. Additionally, we developed a similar but sequential implementation of the same algorithm to be run on a desktop CPU. We tested the performance of each on groups of news articles ranging in size from 200 to 3000 documents. The results of these tests were very significant: performance gains ranged from three to nearly five times improvement of the GPU over the CPU implementation. This dramatic improvement in runtime makes the GPU a potentially revolutionary platform for document clustering algorithms.

  4. Visualisation for Stochastic Process Algebras: The Graphic Truth

    DEFF Research Database (Denmark)

    Smith, Michael James Andrew; Gilmore, Stephen

    2011-01-01

    There have historically been two approaches to performance modelling. On the one hand, textual language-based formalisms such as stochastic process algebras allow compositional modelling that is portable and easy to manage. In contrast, graphical formalisms such as stochastic Petri nets and stochastic activity networks provide an automaton-based view of the model, which may be easier to visualise, at the expense of portability. In this paper, we argue that we can achieve the benefits of both approaches by generating a graphical view of a stochastic process algebra model, which is synchronised... a natural interface for labelling states in the model, which integrates with our interface for specifying and model checking properties in the Continuous Stochastic Logic (CSL). We describe recent improvements to the tool in terms of usability and exploiting the visualisation framework, and discuss some...

  5. Graphical representation of the process of solving problems in statics

    Science.gov (United States)

    Lopez, Carlos

    2011-03-01

    We present a method for constructing a graphical knowledge representation technique called Conceptual Chains. In particular, this tool is focused on the representation of processes and has been applied to solving problems in physics, mathematics, and engineering. The method is described in ten steps and is illustrated by developing it for a particular topic in statics. Various possible didactic applications of this technique are shown.

  6. Efficient magnetohydrodynamic simulations on graphics processing units with CUDA

    Science.gov (United States)

    Wong, Hon-Cheng; Wong, Un-Hong; Feng, Xueshang; Tang, Zesheng

    2011-10-01

    Magnetohydrodynamic (MHD) simulations based on the ideal MHD equations have become a powerful tool for modeling phenomena in a wide range of applications including laboratory, astrophysical, and space plasmas. In general, high-resolution methods for solving the ideal MHD equations are computationally expensive, and Beowulf clusters or even supercomputers are often used to run the codes that implement these methods. With the advent of the Compute Unified Device Architecture (CUDA), modern graphics processing units (GPUs) provide an alternative approach to parallel computing for scientific simulations. In this paper we present, to the best of the authors' knowledge, the first implementation of MHD simulations entirely on GPUs with CUDA, named GPU-MHD, to accelerate the simulation process. GPU-MHD supports both single- and double-precision computations. A series of numerical tests have been performed to validate the correctness of our code. An accuracy evaluation comparing single- and double-precision computation results is also given. Performance measurements of both single and double precision are conducted on both the NVIDIA GeForce GTX 295 (GT200 architecture) and GTX 480 (Fermi architecture) graphics cards. These measurements show that our GPU-based implementation achieves between one and two orders of magnitude of improvement, depending on the graphics card used, the problem size, and the precision, when compared to the original serial CPU MHD implementation. In addition, we extend GPU-MHD to support visualization of the simulation results, and thus the whole MHD simulation and visualization process can be performed entirely on GPUs.

  7. Iterative Methods for MPC on Graphical Processing Units

    DEFF Research Database (Denmark)

    2012-01-01

    The high floating-point performance and memory bandwidth of Graphical Processing Units (GPUs) make them ideal for a large number of computations which often arise in scientific computing, such as matrix operations. GPUs achieve this performance by utilizing massive parallelism, which requires... on their applicability for GPUs. We examine published techniques for iterative methods in interior point methods (IPMs) by applying them to simple test cases, such as a system of masses connected by springs. Iterative methods allow us to deal with the ill-conditioning occurring in the later iterations of the IPM, as well as to avoid the use of dense matrices, which may be too large for the limited memory capacity of current graphics cards...

  8. Adaptive-optics Optical Coherence Tomography Processing Using a Graphics Processing Unit*

    Science.gov (United States)

    Shafer, Brandon A.; Kriske, Jeffery E.; Kocaoglu, Omer P.; Turner, Timothy L.; Liu, Zhuolin; Lee, John Jaehwan; Miller, Donald T.

    2015-01-01

    Graphics processing units are increasingly being used for scientific computing for their powerful parallel processing abilities and moderate price compared to supercomputers and computing grids. In this paper we have used a general-purpose graphics processing unit to process adaptive-optics optical coherence tomography (AOOCT) images in real time. Increasing the processing speed of AOOCT is an essential step in moving this super-high-resolution technology closer to clinical viability. PMID:25570838

  9. Adaptive-optics optical coherence tomography processing using a graphics processing unit.

    Science.gov (United States)

    Shafer, Brandon A; Kriske, Jeffery E; Kocaoglu, Omer P; Turner, Timothy L; Liu, Zhuolin; Lee, John Jaehwan; Miller, Donald T

    2014-01-01

    Graphics processing units are increasingly being used for scientific computing for their powerful parallel processing abilities and moderate price compared to supercomputers and computing grids. In this paper we have used a general-purpose graphics processing unit to process adaptive-optics optical coherence tomography (AOOCT) images in real time. Increasing the processing speed of AOOCT is an essential step in moving this super-high-resolution technology closer to clinical viability.

  10. Monte Carlo MP2 on Many Graphical Processing Units.

    Science.gov (United States)

    Doran, Alexander E; Hirata, So

    2016-10-11

    In the Monte Carlo second-order many-body perturbation (MC-MP2) method, the long sum-of-product matrix expression of the MP2 energy, whose literal evaluation may be poorly scalable, is recast into a single high-dimensional integral of functions of electron pair coordinates, which is evaluated by the scalable method of Monte Carlo integration. The sampling efficiency is further accelerated by the redundant-walker algorithm, which allows a maximal reuse of electron pairs. Here, a multitude of graphical processing units (GPUs) offers a uniquely ideal platform to expose multilevel parallelism: fine-grain data-parallelism for the redundant-walker algorithm in which millions of threads compute and share orbital amplitudes on each GPU; coarse-grain instruction-parallelism for near-independent Monte Carlo integrations on many GPUs with few and infrequent interprocessor communications. While the efficiency boost by the redundant-walker algorithm on central processing units (CPUs) grows linearly with the number of electron pairs and tends to saturate when the latter exceeds the number of orbitals, on a GPU it grows quadratically before it increases linearly and then eventually saturates at a much larger number of pairs. This is because the orbital constructions are nearly perfectly parallelized on a GPU and thus completed in a near-constant time regardless of the number of pairs. In consequence, an MC-MP2/cc-pVDZ calculation of a benzene dimer is 2700 times faster on 256 GPUs (using 2048 electron pairs) than on two CPUs, each with 8 cores (which can use only up to 256 pairs effectively). We also numerically determine that the cost to achieve a given relative statistical uncertainty in an MC-MP2 energy increases as O(n³) or better with system size n, which may be compared with the O(n⁵) scaling of the conventional implementation of deterministic MP2. We thus establish the scalability of MC-MP2 with both system and computer sizes.

  11. Viscoelastic Finite Difference Modeling Using Graphics Processing Units

    Science.gov (United States)

    Fabien-Ouellet, G.; Gloaguen, E.; Giroux, B.

    2014-12-01

    Full waveform seismic modeling requires a huge amount of computing power that still challenges today's technology. This limits the applicability of powerful processing approaches in seismic exploration like full-waveform inversion. This paper explores the use of Graphics Processing Units (GPUs) to compute a time-based finite-difference solution to the viscoelastic wave equation. The aim is to investigate whether the adoption of GPU technology can significantly reduce the computing time of simulations. The code presented herein is based on the freely accessible 2D software of Bohlen (2002), provided under the GNU General Public License. This implementation is based on a second-order centred-difference scheme to approximate time derivatives and staggered-grid schemes with centred differences of order 2, 4, 6, 8, and 12 for spatial derivatives. The code is fully parallel and is written using the Message Passing Interface (MPI), and it thus supports simulations of vast seismic models on a cluster of CPUs. To port the code from Bohlen (2002) to GPUs, the OpenCL framework was chosen for its ability to work on both CPUs and GPUs and its adoption by most GPU manufacturers. In our implementation, OpenCL works in conjunction with MPI, which allows computations on a cluster of GPUs for large-scale model simulations. We tested our code for model sizes between 100² and 6000² elements. Comparison shows a decrease in computation time of more than two orders of magnitude between the GPU implementation run on an AMD Radeon HD 7950 and the CPU implementation run on a 2.26 GHz Intel Xeon Quad-Core. The speed-up varies depending on the order of the finite-difference approximation and generally increases for higher orders. Increasing speed-ups are also obtained for increasing model size, which can be explained by kernel overheads and delays introduced by memory transfers to and from the GPU through the PCI-E bus. Those tests indicate that the GPU memory size

  12. Accelerating VASP electronic structure calculations using graphic processing units

    KAUST Repository

    Hacene, Mohamed

    2012-08-20

    We present a way to improve the performance of the electronic structure Vienna Ab initio Simulation Package (VASP) program. We show that high-performance computers equipped with graphics processing units (GPUs) as accelerators may drastically reduce computation time when the time-consuming sections of the code are offloaded to the graphics chips. The procedure consists of (i) profiling the performance of the code to isolate the time-consuming parts, (ii) rewriting these so that the algorithms become better suited to the chosen graphics accelerator, and (iii) optimizing memory traffic between the host computer and the GPU accelerator. We chose to accelerate VASP with NVIDIA GPUs using CUDA. We compare the GPU and original versions of VASP by evaluating the Davidson and RMM-DIIS algorithms on chemical systems of up to 1100 atoms. In these tests, the total time is reduced by a factor between 3 and 8 when running on n (CPU core + GPU) compared to n CPU cores only, without any accuracy loss. © 2012 Wiley Periodicals, Inc.
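
    Step (iii), minimizing host-device memory traffic, typically amounts to keeping the large arrays resident on the GPU for the whole iterative loop. The sketch below shows this generic pattern with cuBLAS; it is not VASP's actual code, the matrix dimension n and iteration count are illustrative, and a ping-pong buffer is used because cuBLAS GEMM does not support in-place operation.

        #include <cuda_runtime.h>
        #include <cublas_v2.h>

        // Generic offload pattern: copy in once, iterate on the device, copy out once.
        void offloaded_iterations(const double* hA, double* hC, int n, int iters) {
            size_t bytes = sizeof(double) * (size_t)n * n;
            double *dA, *dC, *dT;
            cudaMalloc(&dA, bytes); cudaMalloc(&dC, bytes); cudaMalloc(&dT, bytes);
            cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);   // PCIe transfer #1
            cudaMemcpy(dC, hC, bytes, cudaMemcpyHostToDevice);   // PCIe transfer #2
            cublasHandle_t h;
            cublasCreate(&h);
            const double one = 1.0, zero = 0.0;
            for (int it = 0; it < iters; ++it) {
                // The time-consuming kernel stays on the GPU; no per-iteration copies.
                cublasDgemm(h, CUBLAS_OP_N, CUBLAS_OP_N, n, n, n,
                            &one, dA, n, dC, n, &zero, dT, n);
                double* tmp = dC; dC = dT; dT = tmp;             // ping-pong buffers
            }
            cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);   // PCIe transfer #3
            cublasDestroy(h);
            cudaFree(dA); cudaFree(dC); cudaFree(dT);
        }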

  13. Fast analytical scatter estimation using graphics processing units.

    Science.gov (United States)

    Ingleby, Harry; Lippuner, Jonas; Rickey, Daniel W; Li, Yue; Elbakri, Idris

    2015-01-01

    To develop a fast patient-specific analytical estimator of first-order Compton and Rayleigh scatter in cone-beam computed tomography, implemented using graphics processing units. The authors developed an analytical estimator for first-order Compton and Rayleigh scatter in a cone-beam computed tomography geometry. The estimator was coded using NVIDIA's CUDA environment for execution on an NVIDIA graphics processing unit. Performance of the analytical estimator was validated by comparison with high-count Monte Carlo simulations for two different numerical phantoms. Monoenergetic analytical simulations were compared with monoenergetic and polyenergetic Monte Carlo simulations. Analytical and Monte Carlo scatter estimates were compared both qualitatively, from visual inspection of images and profiles, and quantitatively, using a scaled root-mean-square difference metric. Reconstruction of simulated cone-beam projection data of an anthropomorphic breast phantom illustrated the potential of this method as a component of a scatter correction algorithm. The monoenergetic analytical and Monte Carlo scatter estimates showed very good agreement. The monoenergetic analytical estimates showed good agreement for Compton single scatter and reasonable agreement for Rayleigh single scatter when compared with polyenergetic Monte Carlo estimates. For a voxelized phantom with dimensions of 128 × 128 × 128 voxels and a detector with 256 × 256 pixels, the analytical estimator required 669 seconds for a single projection, using a single NVIDIA 9800 GX2 video card. Accounting for first-order scatter in cone-beam image reconstruction improves the contrast-to-noise ratio of the reconstructed images. The analytical scatter estimator, implemented using graphics processing units, provides rapid and accurate estimates of single scatter; with further acceleration and a method to account for multiple scatter, it may be useful for practical scatter correction schemes.
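
    Such an estimator parallelizes naturally over detector pixels. The sketch below is only a schematic of that pattern, not the authors' code: the attenuation line integrals and the Compton/Rayleigh differential cross-sections are replaced by simple placeholder functions, and all names are hypothetical.

        #include <cuda_runtime.h>

        // Placeholder for a line integral of the attenuation map between a and b.
        __device__ float attenuation(float3 a, float3 b) {
            float dx = b.x - a.x, dy = b.y - a.y, dz = b.z - a.z;
            return expf(-0.01f * sqrtf(dx * dx + dy * dy + dz * dz));
        }

        // One thread per detector pixel; each sums single-scatter contributions
        // from precomputed scatter sites (weight[] carries density and solid angle).
        __global__ void first_order_scatter(float* det, const float3* pix, int npix,
                                            const float3* sites, const float* weight,
                                            int nsites, float3 src) {
            int p = blockIdx.x * blockDim.x + threadIdx.x;
            if (p >= npix) return;
            float sum = 0.f;
            for (int s = 0; s < nsites; ++s) {
                // Inverse-square geometric factor stands in for the differential
                // cross-section (Klein-Nishina for Compton, form factors for Rayleigh).
                float dx = pix[p].x - sites[s].x;
                float dy = pix[p].y - sites[s].y;
                float dz = pix[p].z - sites[s].z;
                float r2 = dx * dx + dy * dy + dz * dz + 1e-6f;
                sum += weight[s] * attenuation(src, sites[s])
                                 * attenuation(sites[s], pix[p]) / r2;
            }
            det[p] = sum;
        }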

  14. Porting a Hall MHD Code to a Graphic Processing Unit

    Science.gov (United States)

    Dorelli, John C.

    2011-01-01

    We present our experience porting a Hall MHD code to a Graphics Processing Unit (GPU). The code is a second-order accurate MUSCL-Hancock scheme that makes use of an HLL Riemann solver to compute numerical fluxes and second-order finite differences to compute the Hall contribution to the electric field. The divergence of the magnetic field is controlled with Dedner's hyperbolic divergence cleaning method. Preliminary benchmark tests indicate a speedup (relative to a single Nehalem core) of 58x for a double-precision calculation. We discuss scaling issues which arise when distributing work across multiple GPUs in a CPU-GPU cluster.

  15. Line-by-line spectroscopic simulations on graphics processing units

    Science.gov (United States)

    Collange, Sylvain; Daumas, Marc; Defour, David

    2008-01-01

    We report here on software that performs line-by-line spectroscopic simulations on gases. Elaborate models (such as narrow-band and correlated-K) are accurate and efficient for bands where various components are not simultaneously and significantly active. Line-by-line is probably the most accurate model in the infrared for blends of gases that contain high proportions of H₂O and CO₂, as was the case for our prototype simulation. Our implementation on graphics processing units sustains a speedup close to 330 on computation-intensive tasks and 12 on memory-intensive tasks compared to implementations on one core of high-end processors. This speedup is due to data parallelism, efficient memory access for specific patterns, and some dedicated hardware operators only available in graphics processing units. It is obtained while leaving most of the processor resources available, and it would scale linearly with the number of graphics processing units in parallel machines. Line-by-line simulation coupled with simulation of fluid dynamics was long believed to be economically intractable, but our work shows that it could be done with some affordable additional resources compared to what is necessary to perform simulations of fluid dynamics alone.
    Program summary:
    Program title: GPU4RE
    Catalogue identifier: ADZY_v1_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADZY_v1_0.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
    No. of lines in distributed program, including test data, etc.: 62 776
    No. of bytes in distributed program, including test data, etc.: 1 513 247
    Distribution format: tar.gz
    Programming language: C++
    Computer: x86 PC
    Operating system: Linux, Microsoft Windows. Compilation requires either gcc/g++ under Linux or Visual C++ 2003/2005 and Cygwin under Windows. It has been tested using gcc 4.1.2 under Ubuntu Linux 7.04 and using Visual C…

  16. Optimized Laplacian image sharpening algorithm based on graphic processing unit

    Science.gov (United States)

    Ma, Tinghuai; Li, Lu; Ji, Sai; Wang, Xin; Tian, Yuan; Al-Dhelaan, Abdullah; Al-Rodhaan, Mznah

    2014-12-01

    In classical Laplacian image sharpening, all pixels are processed one by one, which leads to a large amount of computation. Traditional Laplacian sharpening performed on a CPU is considerably time-consuming, especially for large images. In this paper, we propose a parallel implementation of Laplacian sharpening based on the Compute Unified Device Architecture (CUDA), a computing platform for Graphics Processing Units (GPUs), and analyze the impact of image size on performance as well as the relationship between data transfer time and parallel computing time. Further, according to the different characteristics of the different memory types, an improved scheme of our method is developed, which exploits shared memory on the GPU instead of global memory and further increases efficiency. Experimental results show that the two novel algorithms outperform the traditional sequential method based on OpenCV in terms of computing speed.
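
    A minimal sketch of the shared-memory variant described above: each block stages a tile of the image (plus a one-pixel halo) into shared memory, so the four neighbour reads of the Laplacian hit fast on-chip memory instead of global memory. The 16×16 tile size and the unit sharpening strength are assumptions, not the paper's parameters.

        #include <cuda_runtime.h>

        #define TILE 16

        // Clamp-to-edge pixel read.
        __device__ int pix(const unsigned char* in, int w, int h, int x, int y) {
            x = min(max(x, 0), w - 1);
            y = min(max(y, 0), h - 1);
            return in[y * w + x];
        }

        // Sharpened = original - Laplacian (4-neighbour stencil), tiled in
        // shared memory with a one-pixel halo. Launch with TILE x TILE blocks.
        __global__ void laplacian_sharpen(const unsigned char* in, unsigned char* out,
                                          int w, int h) {
            __shared__ int tile[TILE + 2][TILE + 2];
            int x = blockIdx.x * TILE + threadIdx.x;
            int y = blockIdx.y * TILE + threadIdx.y;
            int tx = threadIdx.x + 1, ty = threadIdx.y + 1;
            tile[ty][tx] = pix(in, w, h, x, y);
            if (threadIdx.x == 0)        tile[ty][0]        = pix(in, w, h, x - 1, y);
            if (threadIdx.x == TILE - 1) tile[ty][TILE + 1] = pix(in, w, h, x + 1, y);
            if (threadIdx.y == 0)        tile[0][tx]        = pix(in, w, h, x, y - 1);
            if (threadIdx.y == TILE - 1) tile[TILE + 1][tx] = pix(in, w, h, x, y + 1);
            __syncthreads();
            if (x >= w || y >= h) return;
            int lap = tile[ty - 1][tx] + tile[ty + 1][tx]
                    + tile[ty][tx - 1] + tile[ty][tx + 1] - 4 * tile[ty][tx];
            int v = tile[ty][tx] - lap;
            out[y * w + x] = (unsigned char)min(max(v, 0), 255);
        }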

  17. Fast calculation of HELAS amplitudes using graphics processing unit (GPU)

    CERN Document Server

    Hagiwara, K; Okamura, N; Rainwater, D L; Stelzer, T

    2009-01-01

    We use the graphics processing unit (GPU) for fast calculations of helicity amplitudes of physics processes. As our first attempt, we compute $u\bar{u} \to n\gamma$ ($n=2$ to $8$) processes in $pp$ collisions at $\sqrt{s} = 14$ TeV by transferring the MadGraph-generated HELAS amplitudes (FORTRAN) into newly developed HEGET (HELAS Evaluation with GPU Enhanced Technology) codes written in CUDA, a C-based platform developed by NVIDIA for general-purpose computing on the GPU. Compared with the usual CPU programs, we obtain 40-150 times better performance on the GPU.

  18. Accelerated space object tracking via graphic processing unit

    Science.gov (United States)

    Jia, Bin; Liu, Kui; Pham, Khanh; Blasch, Erik; Chen, Genshe

    2016-05-01

    In this paper, a hybrid Monte Carlo Gauss mixture Kalman filter is proposed for the continuous orbit estimation problem. Specifically, the graphics processing unit (GPU)-aided Monte Carlo method is used to propagate the uncertainty of the estimation when observations are not available, and the Gauss mixture Kalman filter is used to update the estimation when observation sequences are available. A typical space object tracking problem using ground radar is used to test the performance of the proposed algorithm, which is compared with the popular cubature Kalman filter (CKF). The simulation results show that the ordinary CKF diverges within 5 observation periods. In contrast, the proposed hybrid Monte Carlo Gauss mixture Kalman filter achieves satisfactory performance in all observation periods. In addition, by using the GPU, the computational time is over 100 times less than with a conventional central processing unit (CPU).

  19. Representation stigma: Perceptions of tools and processes for design graphics

    Directory of Open Access Journals (Sweden)

    David Barbarash

    2016-12-01

    Full Text Available Practicing designers and design students across multiple fields were surveyed to measure preference and perception of traditional hand and digital tools to determine if common biases for an individual toolset are realized in practice. Significant results were found, primarily with age being a determinant in preference of graphic tools and processes; this finding demonstrates a hard line between generations of designers. Results show that while there are strong opinions in tools and processes, the realities of modern business practice and production gravitate towards digital methods despite a traditional tool preference in more experienced designers. While negative stigmas regarding computers remain, younger generations are more accepting of digital tools and images, which should eventually lead to a paradigm shift in design professions.

  20. Solar physics applications of computer graphics and image processing

    Science.gov (United States)

    Altschuler, M. D.

    1985-01-01

    Computer graphics devices coupled with computers and carefully developed software provide new opportunities to achieve insight into the geometry and time evolution of scalar, vector, and tensor fields and to extract more information quickly and cheaply from the same image data. Two or more different fields which overlay in space can be calculated from the data (and the physics), then displayed from any perspective, and compared visually. The maximum regions of one field can be compared with the gradients of another. Time changing fields can also be compared. Images can be added, subtracted, transformed, noise filtered, frequency filtered, contrast enhanced, color coded, enlarged, compressed, parameterized, and histogrammed, in whole or section by section. Today it is possible to process multiple digital images to reveal spatial and temporal correlations and cross correlations. Data from different observatories taken at different times can be processed, interpolated, and transformed to a common coordinate system.

  1. Significantly reducing registration time in IGRT using graphics processing units

    DEFF Research Database (Denmark)

    Noe, Karsten Østergaard; Denis de Senneville, Baudouin; Tanderup, Kari

    2008-01-01

    Purpose/Objective For online IGRT, rapid image processing is needed. Fast parallel computations using graphics processing units (GPUs) have recently been made more accessible through general purpose programming interfaces. We present a GPU implementation of the Horn and Schunck method...... respiration phases in a free breathing volunteer and 41 anatomical landmark points in each image series. The registration method used is a multi-resolution GPU implementation of the 3D Horn and Schunck algorithm. It is based on the CUDA framework from Nvidia. Results On an Intel Core 2 CPU at 2.4GHz each...... registration took 30 minutes. On an Nvidia Geforce 8800GTX GPU in the same machine this registration took 37 seconds, making the GPU version 48.7 times faster. The nine image series of different respiration phases were registered to the same reference image (full inhale). Accuracy was evaluated on landmark...

  2. Fast free-form deformation using graphics processing units.

    Science.gov (United States)

    Modat, Marc; Ridgway, Gerard R; Taylor, Zeike A; Lehmann, Manja; Barnes, Josephine; Hawkes, David J; Fox, Nick C; Ourselin, Sébastien

    2010-06-01

    A large number of algorithms have been developed to perform non-rigid registration, a tool commonly used in medical image analysis. The free-form deformation algorithm is a well-established technique, but it is extremely time consuming. In this paper we present a parallel-friendly formulation of the algorithm suitable for graphics processing unit execution. Using our approach we perform registration of T1-weighted MR images in less than 1 min and show the same level of accuracy as a classical serial implementation when performing segmentation propagation. This technology could be of significant utility in time-critical applications such as image-guided interventions, or in the processing of large data sets. Copyright 2009 Elsevier Ireland Ltd. All rights reserved.

  4. GENETIC ALGORITHM ON GENERAL PURPOSE GRAPHICS PROCESSING UNIT: PARALLELISM REVIEW

    Directory of Open Access Journals (Sweden)

    A.J. Umbarkar

    2013-01-01

    Full Text Available Genetic Algorithm (GA) is an effective and robust method for solving many optimization problems. However, it may take many runs (iterations) and much time to reach an optimal solution. The execution time needed to find the optimal solution also depends upon the niching technique applied to the evolving population. This paper surveys how various authors, researchers, and scientists have implemented GA on GPGPUs (General Purpose Graphics Processing Units), with and without parallelism. Many problems have been solved on GPGPUs using GA. GA is easy to parallelize because of its SIMD nature and can therefore be implemented well on a GPGPU. Thus, speedup can definitely be achieved if the bottlenecks in GAs are identified and implemented effectively on a GPGPU. The paper reviews various applications solved using GAs on GPGPUs, with future scope in the area of optimization.

  5. Simulating Lattice Spin Models on Graphics Processing Units

    CERN Document Server

    Levy, Tal; Rabani, Eran; 10.1021/ct100385b

    2012-01-01

    Lattice spin models are useful for studying critical phenomena and allow the extraction of equilibrium and dynamical properties. Simulations of such systems are usually based on Monte Carlo (MC) techniques, and the main difficulty is often the large computational effort needed when approaching critical points. In this work, it is shown how such simulations can be accelerated with the use of NVIDIA graphics processing units (GPUs) using the CUDA programming architecture. We have developed two different algorithms for lattice spin models, the first useful for equilibrium properties near a second-order phase transition point and the second for dynamical slowing down near a glass transition. The algorithms are based on parallel MC techniques, and speedups from 70- to 150-fold over conventional single-threaded computer codes are obtained using consumer-grade hardware.
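
    As an illustration of the parallel-MC idea (not necessarily either of the paper's two algorithms), the checkerboard decomposition of a 2D Ising model lets all sites of one colour be updated concurrently, since same-colour sites share no neighbours. The sketch assumes a pre-initialized curandState array with one entry per site, periodic boundaries, and coupling J = 1.

        #include <cuda_runtime.h>
        #include <curand_kernel.h>

        // One Metropolis sweep over one checkerboard sublattice of an L x L
        // Ising model; states[] must be initialized beforehand with curand_init.
        __global__ void metropolis_colour(int* spin, int L, int colour,
                                          float beta, curandState* states) {
            int idx = blockIdx.x * blockDim.x + threadIdx.x;
            if (idx >= L * L) return;
            int x = idx % L, y = idx / L;
            if (((x + y) & 1) != colour) return;   // update only one colour per call
            int left  = y * L + (x + L - 1) % L;
            int right = y * L + (x + 1) % L;
            int up    = ((y + L - 1) % L) * L + x;
            int down  = ((y + 1) % L) * L + x;
            int nb = spin[left] + spin[right] + spin[up] + spin[down];
            int dE = 2 * spin[idx] * nb;           // energy change of a flip, J = 1
            if (dE <= 0 || curand_uniform(&states[idx]) < expf(-beta * dE))
                spin[idx] = -spin[idx];
        }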

  6. Molecular Dynamics Simulation of Macromolecules Using Graphics Processing Unit

    CERN Document Server

    Xu, Ji; Ge, Wei; Yu, Xiang; Yang, Xiaozhen; Li, Jinghai

    2010-01-01

    Molecular dynamics (MD) simulation is a powerful computational tool to study the behavior of macromolecular systems. However, many simulations in this field are limited in spatial or temporal scale by the available computational resources. In recent years, the graphics processing unit (GPU) has provided unprecedented computational power for scientific applications. Many MD algorithms suit the multithreaded nature of the GPU. In this paper, MD algorithms for macromolecular systems that run entirely on the GPU are presented. Compared to MD simulation with the free software GROMACS on a single CPU core, our codes achieve about a 10-fold speed-up on a single GPU. For validation, we have performed MD simulations of polymer crystallization on the GPU, and the observed results agree perfectly with computations on the CPU. Therefore, our single-GPU codes already provide an inexpensive alternative for macromolecular simulations on traditional CPU clusters and they can also be used as a basis to develop parallel GPU programs to further spee...

  7. Integrating post-Newtonian equations on graphics processing units

    Energy Technology Data Exchange (ETDEWEB)

    Herrmann, Frank; Tiglio, Manuel [Department of Physics, Center for Fundamental Physics, and Center for Scientific Computation and Mathematical Modeling, University of Maryland, College Park, MD 20742 (United States); Silberholz, John [Center for Scientific Computation and Mathematical Modeling, University of Maryland, College Park, MD 20742 (United States); Bellone, Matias [Facultad de Matematica, Astronomia y Fisica, Universidad Nacional de Cordoba, Cordoba 5000 (Argentina); Guerberoff, Gustavo, E-mail: tiglio@umd.ed [Facultad de Ingenieria, Instituto de Matematica y Estadistica ' Prof. Ing. Rafael Laguardia' , Universidad de la Republica, Montevideo (Uruguay)

    2010-02-07

    We report on early results of a numerical and statistical study of binary black hole inspirals. The two black holes are evolved using post-Newtonian approximations starting with initially randomly distributed spin vectors. We characterize certain aspects of the distribution shortly before merger. In particular we note the uniform distribution of black hole spin vector dot products shortly before merger and a high correlation between the initial and final black hole spin vector dot products in the equal-mass, maximally spinning case. More than 300 million simulations were performed on graphics processing units, and we demonstrate a speed-up of a factor 50 over a more conventional CPU implementation. (fast track communication)
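
    The coarse-grain pattern behind such ensemble studies is one thread per independent simulation. In the hedged sketch below, a classical RK4 step integrates a toy damped oscillator that merely stands in for the post-Newtonian equations of motion; the two-component state layout and all parameter names are illustrative.

        #include <cuda_runtime.h>

        // Toy right-hand side: damped harmonic oscillator (placeholder physics).
        __device__ void rhs(const double y[2], double dydt[2]) {
            dydt[0] = y[1];
            dydt[1] = -y[0] - 0.1 * y[1];
        }

        // One thread per simulation: each integrates its own initial condition
        // with classical RK4, with no inter-thread communication at all.
        __global__ void integrate_ensemble(double* y0, double* y1, int nsim,
                                           double dt, int nsteps) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i >= nsim) return;
            double y[2] = { y0[i], y1[i] }, k1[2], k2[2], k3[2], k4[2], t[2];
            for (int s = 0; s < nsteps; ++s) {
                rhs(y, k1);
                for (int j = 0; j < 2; ++j) t[j] = y[j] + 0.5 * dt * k1[j];
                rhs(t, k2);
                for (int j = 0; j < 2; ++j) t[j] = y[j] + 0.5 * dt * k2[j];
                rhs(t, k3);
                for (int j = 0; j < 2; ++j) t[j] = y[j] + dt * k3[j];
                rhs(t, k4);
                for (int j = 0; j < 2; ++j)
                    y[j] += dt / 6.0 * (k1[j] + 2.0 * k2[j] + 2.0 * k3[j] + k4[j]);
            }
            y0[i] = y[0]; y1[i] = y[1];
        }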

  8. Air pollution modelling using a graphics processing unit with CUDA

    CERN Document Server

    Molnar, Ferenc; Meszaros, Robert; Lagzi, Istvan; 10.1016/j.cpc.2009.09.008

    2010-01-01

    The Graphics Processing Unit (GPU) is a powerful tool for parallel computing. In the past years the performance and capabilities of GPUs have increased, and the Compute Unified Device Architecture (CUDA) - a parallel computing architecture - has been developed by NVIDIA to utilize this performance in general-purpose computations. Here we show for the first time a possible application of GPUs for environmental studies, serving as a basis for decision-making strategies. A stochastic Lagrangian particle model has been developed on CUDA to estimate the transport and the transformation of the radionuclides from a single point source during an accidental release. Our results show that the parallel implementation achieves typical acceleration values in the order of 80-120 times compared to a single-threaded CPU implementation on a 2.33 GHz desktop computer. Only very small differences have been found between the results obtained from GPU and CPU simulations, which are comparable with the effect of stochastic tran...

  9. Graphics processing units accelerated semiclassical initial value representation molecular dynamics

    Energy Technology Data Exchange (ETDEWEB)

    Tamascelli, Dario; Dambrosio, Francesco Saverio [Dipartimento di Fisica, Università degli Studi di Milano, via Celoria 16, 20133 Milano (Italy); Conte, Riccardo [Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322 (United States); Ceotto, Michele, E-mail: michele.ceotto@unimi.it [Dipartimento di Chimica, Università degli Studi di Milano, via Golgi 19, 20133 Milano (Italy)

    2014-05-07

    This paper presents a Graphics Processing Units (GPUs) implementation of the Semiclassical Initial Value Representation (SC-IVR) propagator for vibrational molecular spectroscopy calculations. The time-averaging formulation of the SC-IVR for power spectrum calculations is employed. Details about the GPU implementation of the semiclassical code are provided. Four molecules with an increasing number of atoms are considered, and the GPU-calculated vibrational frequencies perfectly match the benchmark values. The computational time scaling of two GPUs (NVIDIA Tesla C2075 and Kepler K20) versus two CPUs (Intel Core i5 and Intel Xeon E5-2687W), and the critical issues related to the GPU implementation, are discussed. The resulting reduction in computational time and power consumption is significant, and semiclassical GPU calculations are shown to be environmentally friendly.

  10. Polymer Field-Theory Simulations on Graphics Processing Units

    CERN Document Server

    Delaney, Kris T

    2012-01-01

    We report the first CUDA graphics-processing-unit (GPU) implementation of the polymer field-theoretic simulation framework for determining fully fluctuating expectation values of equilibrium properties for periodic and select aperiodic polymer systems. Our implementation is suitable both for self-consistent field theory (mean-field) solutions of the field equations, and for fully fluctuating simulations using the complex Langevin approach. Running on NVIDIA Tesla T20 series GPUs, we find double-precision speedups of up to 30x compared to single-core serial calculations on a recent reference CPU, while single-precision calculations proceed up to 60x faster than those on the single CPU core. Due to intensive communications overhead, an MPI implementation running on 64 CPU cores remains two times slower than a single GPU.

  11. Graphics Processing Units and High-Dimensional Optimization.

    Science.gov (United States)

    Zhou, Hua; Lange, Kenneth; Suchard, Marc A

    2010-08-01

    This paper discusses the potential of graphics processing units (GPUs) in high-dimensional optimization problems. A single GPU card with hundreds of arithmetic cores can be inserted in a personal computer and dramatically accelerates many statistical algorithms. To exploit these devices fully, optimization algorithms should reduce to multiple parallel tasks, each accessing a limited amount of data. These criteria favor EM and MM algorithms that separate parameters and data. To a lesser extent, block relaxation and coordinate descent and ascent also qualify. We demonstrate the utility of GPUs in nonnegative matrix factorization, PET image reconstruction, and multidimensional scaling. Speedups of 100-fold can easily be attained. Over the next decade, GPUs will fundamentally alter the landscape of computational statistics. It is time for more statisticians to get on board.
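
    The separability that makes MM/EM algorithms GPU-friendly shows up directly in, for example, the multiplicative update of nonnegative matrix factorization: once the matrix products feeding the numerator and denominator are formed (typically with cuBLAS, not shown), every entry of W is rescaled independently. A minimal sketch of that elementwise step, with illustrative names:

        #include <cuda_runtime.h>

        // Elementwise NMF multiplicative update: W_ij <- W_ij * numer_ij / denom_ij,
        // applied over the n = rows * rank entries of W; each thread owns one entry.
        __global__ void nmf_scale(float* W, const float* numer,
                                  const float* denom, int n) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n)
                W[i] *= numer[i] / (denom[i] + 1e-9f);  // epsilon guards division by zero
        }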

  12. Graphics Processing Unit Enhanced Parallel Document Flocking Clustering

    Energy Technology Data Exchange (ETDEWEB)

    Cui, Xiaohui [ORNL; Potok, Thomas E [ORNL; ST Charles, Jesse Lee [ORNL

    2010-01-01

    Analyzing and clustering documents is a complex problem. One explored method of solving this problem borrows from nature, imitating the flocking behavior of birds. One limitation of this method of document clustering is its O(n²) complexity. As the number of documents grows, it becomes increasingly difficult to generate results in a reasonable amount of time. In the last few years, the graphics processing unit (GPU) has received attention for its ability to solve highly parallel and semi-parallel problems much faster than the traditional sequential processor. In this paper, we have conducted research to exploit this architecture and apply its strengths to the flocking-based document clustering problem. Using the CUDA platform from NVIDIA, we developed a document flocking implementation to be run on the NVIDIA GeForce GPU. Performance gains ranged from thirty-six-fold to nearly sixty-fold improvement of the GPU over the CPU implementation.
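
    The O(n²) pairwise pass parallelizes over the outer loop, one thread per document. The sketch below shows only that skeleton; the similarity-driven attraction/repulsion rule is a simple placeholder, not the flocking rules of the paper, and positions would be advanced in a separate pass.

        #include <cuda_runtime.h>

        // One thread per document: accumulate a steering force from all other
        // documents (O(n) per thread, O(n^2) total, fully parallel over n).
        __global__ void flock_step(const float2* pos, float2* vel,
                                   const float* sim, int n, float dt) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i >= n) return;
            float2 f = make_float2(0.f, 0.f);
            for (int j = 0; j < n; ++j) {
                if (j == i) continue;
                float dx = pos[j].x - pos[i].x, dy = pos[j].y - pos[i].y;
                float d = sqrtf(dx * dx + dy * dy) + 1e-6f;
                float s = sim[i * n + j];      // pairwise document similarity in [0, 1]
                float w = (s - 0.5f) / d;      // attract if similar, repel if not
                f.x += w * dx; f.y += w * dy;
            }
            vel[i].x = vel[i].x * 0.9f + f.x * dt;   // damped velocity update
            vel[i].y = vel[i].y * 0.9f + f.y * dt;
        }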

  13. Implementing wide baseline matching algorithms on a graphics processing unit.

    Energy Technology Data Exchange (ETDEWEB)

    Rothganger, Fredrick H.; Larson, Kurt W.; Gonzales, Antonio Ignacio; Myers, Daniel S.

    2007-10-01

    Wide baseline matching is the state of the art for object recognition and image registration problems in computer vision. Though effective, the computational expense of these algorithms limits their application to many real-world problems. The performance of wide baseline matching algorithms may be improved by using a graphical processing unit as a fast multithreaded co-processor. In this paper, we present an implementation of the difference-of-Gaussians feature extractor, based on the CUDA system of GPU programming developed by NVIDIA and implemented on their hardware. For a 2000×2000-pixel image, the GPU-based method executes nearly thirteen times faster than a comparable CPU-based method, with no significant loss of accuracy.

  14. Role of Graphics Tools in the Learning Design Process

    Science.gov (United States)

    Laisney, Patrice; Brandt-Pomares, Pascale

    2015-01-01

    This paper discusses the design activities of students in secondary school in France. Graphics tools are now part of the capacity of design professionals. It is therefore apt to reflect on their integration into the technological education. Has the use of intermediate graphical tools changed students' performance, and if so in what direction, in…

  15. Graphics processing units in bioinformatics, computational biology and systems biology.

    Science.gov (United States)

    Nobile, Marco S; Cazzaniga, Paolo; Tangherloni, Andrea; Besozzi, Daniela

    2016-07-08

    Several studies in Bioinformatics, Computational Biology and Systems Biology rely on the definition of physico-chemical or mathematical models of biological systems at different scales and levels of complexity, ranging from the interaction of atoms in single molecules up to genome-wide interaction networks. Traditional computational methods and software tools developed in these research fields share a common trait: they can be computationally demanding on Central Processing Units (CPUs), therefore limiting their applicability in many circumstances. To overcome this issue, general-purpose Graphics Processing Units (GPUs) are gaining increasing attention from the scientific community, as they can considerably reduce the running time required by standard CPU-based software, and allow more intensive investigations of biological systems. In this review, we present a collection of GPU tools recently developed to perform computational analyses in life science disciplines, emphasizing the advantages and the drawbacks in the use of these parallel architectures. The complete list of GPU-powered tools here reviewed is available at http://bit.ly/gputools. © The Author 2016. Published by Oxford University Press.

  16. Retrospective Study on Mathematical Modeling Based on Computer Graphic Processing

    Science.gov (United States)

    Zhang, Kai Li

    Graphics and image making is an important field of computer application, in which visualization software has been widely used for its convenience and speed. However, modeling designers have considered such software limited in function and flexibility because no mathematical modeling platform was built into it. Non-visualization graphics software that has since appeared gives graphics and image design a very good mathematical modeling platform. In this paper, a polished pyramid is established by a multivariate spline function algorithm, validating that the non-visualization software performs well in mathematical modeling.

  17. Graphical representation of projective analysis of the requirements used in the process design of assistive products

    Directory of Open Access Journals (Sweden)

    Nelson Luis Smythe Jr.

    2016-05-01

    Full Text Available Graphic representation is a little-explored technique in the development of assistive products, despite being helpful in the visualization of conceptual aspects, which are then made tangible. This improves the understanding of the process phases by the whole team, which includes practitioners from different areas. Thus, a characterization of the graphic symbols used in assistive technology development processes is presented. A literature review was carried out to select graphic symbols and pictorial images used in the health care area, as well as newly proposed design methods for assistive products that include some graphic representation of their processes. From this point the authors developed a preliminary hybrid characterization model based on the symbolization of graphical representations (verbal, schematic, and pictorial) and approaches to pictogram design (geons, silhouettes, and observation). The results of the characterization were discussed using the hybrid model and the identified gaps as references. The assessment showed low use of graphic symbols in representations of design processes, even when the authors were designers.

  18. Kinematic modelling of disc galaxies using graphics processing units

    Science.gov (United States)

    Bekiaris, G.; Glazebrook, K.; Fluke, C. J.; Abraham, R.

    2016-01-01

    With large-scale integral field spectroscopy (IFS) surveys of thousands of galaxies currently underway or planned, the astronomical community is in need of methods, techniques and tools that will allow the analysis of huge amounts of data. We focus on the kinematic modelling of disc galaxies and investigate the potential use of massively parallel architectures, such as the graphics processing unit (GPU), as an accelerator for the computationally expensive model-fitting procedure. We review the algorithms involved in model-fitting and evaluate their suitability for GPU implementation. We employ different optimization techniques, including the Levenberg-Marquardt and nested sampling algorithms, but also a naive brute-force approach based on nested grids. We find that the GPU can accelerate the model-fitting procedure up to a factor of ˜100 when compared to a single-threaded CPU, and up to a factor of ˜10 when compared to a multithreaded dual-CPU configuration. Our method's accuracy, precision and robustness are assessed by successfully recovering the kinematic properties of simulated data, and also by verifying the kinematic modelling results of galaxies from the GHASP and DYNAMO surveys as found in the literature. The resulting GBKFIT code is available for download from: http://supercomputing.swin.edu.au/gbkfit.

  19. Graphics processing unit-accelerated quantitative trait Loci detection.

    Science.gov (United States)

    Chapuis, Guillaume; Filangi, Olivier; Elsen, Jean-Michel; Lavenier, Dominique; Le Roy, Pascale

    2013-09-01

    Mapping quantitative trait loci (QTL) using genetic marker information is a time-consuming analysis that has interested the mapping community in recent decades. The increasing amount of genetic marker data allows one to consider ever more precise QTL analyses while increasing the demand for computation. Part of the difficulty of detecting QTLs resides in finding appropriate critical values or threshold values, above which a QTL effect is considered significant. Different approaches exist to determine these thresholds, using either empirical methods or algebraic approximations. In this article, we present a new implementation of existing software, QTLMap, which takes advantage of the data-parallel nature of the problem by offloading heavy computations to a graphics processing unit (GPU). Developments on the GPU were implemented using CUDA technology. This new implementation performs up to 75 times faster than the previous multicore implementation, while maintaining the same results and level of precision (Double Precision) and computing both QTL values and thresholds. This speedup allows one to perform more complex analyses, such as linkage disequilibrium linkage analyses (LDLA) and multiQTL analyses, in a reasonable time frame.

  20. Kinematic Modelling of Disc Galaxies using Graphics Processing Units

    CERN Document Server

    Bekiaris, Georgios; Fluke, Christopher J; Abraham, Roberto

    2015-01-01

    With large-scale Integral Field Spectroscopy (IFS) surveys of thousands of galaxies currently underway or planned, the astronomical community is in need of methods, techniques and tools that will allow the analysis of huge amounts of data. We focus on the kinematic modelling of disc galaxies and investigate the potential use of massively parallel architectures, such as the Graphics Processing Unit (GPU), as an accelerator for the computationally expensive model-fitting procedure. We review the algorithms involved in model-fitting and evaluate their suitability for GPU implementation. We employ different optimization techniques, including the Levenberg-Marquardt and Nested Sampling algorithms, but also a naive brute-force approach based on Nested Grids. We find that the GPU can accelerate the model-fitting procedure up to a factor of ~100 when compared to a single-threaded CPU, and up to a factor of ~10 when compared to a multi-threaded dual CPU configuration. Our method's accuracy, precision and robustness a...

  1. Efficient graphics processing unit-based voxel carving for surveillance

    Science.gov (United States)

    Ober-Gecks, Antje; Zwicker, Marius; Henrich, Dominik

    2016-07-01

    A graphics processing unit (GPU)-based implementation of a space carving method for the reconstruction of the photo hull is presented. In particular, the generalized voxel coloring with item buffer approach is transferred to the GPU. The fast computation on the GPU is realized by an incrementally calculated standard deviation within the likelihood ratio test, which is applied as color consistency criterion. A fast and efficient computation of complete voxel-pixel projections is provided using volume rendering methods. This generates a speedup of the iterative carving procedure while considering all given pixel color information. Different volume rendering methods, such as texture mapping and raycasting, are examined. The termination of the voxel carving procedure is controlled through an anytime concept. The photo hull algorithm is examined for its applicability to real-world surveillance scenarios as an online reconstruction method. For this reason, a GPU-based redesign of a visual hull algorithm is provided that utilizes geometric knowledge about known static occluders of the scene in order to create a conservative and complete visual hull that includes all given objects. This visual hull approximation serves as input for the photo hull algorithm.

  2. Use of general purpose graphics processing units with MODFLOW.

    Science.gov (United States)

    Hughes, Joseph D; White, Jeremy T

    2013-01-01

    To evaluate the use of general-purpose graphics processing units (GPGPUs) to improve the performance of MODFLOW, an unstructured preconditioned conjugate gradient (UPCG) solver has been developed. The UPCG solver uses a compressed sparse row storage scheme and includes Jacobi, zero fill-in incomplete, and modified-incomplete lower-upper (LU) factorization, and generalized least-squares polynomial preconditioners. The UPCG solver also includes options for sequential and parallel solution on the central processing unit (CPU) using OpenMP. For simulations utilizing the GPGPU, all basic linear algebra operations are performed on the GPGPU; memory copies between the CPU and GPGPU occur prior to the first iteration of the UPCG solver and after satisfying head and flow criteria or exceeding a maximum number of iterations. The efficiency of the UPCG solver for GPGPU and CPU solutions is benchmarked using simulations of a synthetic, heterogeneous unconfined aquifer with tens of thousands to millions of active grid cells. Testing indicates GPGPU speedups on the order of 2 to 8, relative to the standard MODFLOW preconditioned conjugate gradient (PCG) solver, can be achieved when (1) memory copies between the CPU and GPGPU are optimized, (2) the percentage of time performing memory copies between the CPU and GPGPU is small relative to the calculation time, (3) high-performance GPGPU cards are utilized, and (4) CPU-GPGPU combinations are used to execute sequential operations that are difficult to parallelize. Furthermore, UPCG solver testing indicates GPGPU speedups exceed parallel CPU speedups achieved using OpenMP on multicore CPUs for preconditioners that can be easily parallelized. Published 2013. This article is a U.S. Government work and is in the public domain in the USA.
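
    The workhorse of such a solver is the sparse matrix-vector product over the compressed sparse row (CSR) structure mentioned above. A minimal scalar-kernel sketch follows (one thread per matrix row; production codes often assign a warp per row or call a library routine such as cuSPARSE instead):

        #include <cuda_runtime.h>

        // y = A * x for a CSR matrix: rowptr has nrows + 1 entries, colidx/val
        // hold the column indices and values of the nonzeros of each row.
        __global__ void spmv_csr(int nrows, const int* rowptr, const int* colidx,
                                 const double* val, const double* x, double* y) {
            int row = blockIdx.x * blockDim.x + threadIdx.x;
            if (row >= nrows) return;
            double sum = 0.0;
            for (int j = rowptr[row]; j < rowptr[row + 1]; ++j)
                sum += val[j] * x[colidx[j]];
            y[row] = sum;
        }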

  3. Graphical representation of projective analysis of the requirements used in the process design of assistive products

    OpenAIRE

    Nelson Luis Smythe Jr.; Gheysa Caroline Prado; Kelli Cristine Assis da Silva Smythe

    2016-01-01

    Graphic representation is a little explored technique in the development of assistive products, despite being helpful in the visualization of conceptual aspects, which are then made tangible. This improves the understanding of the process phases by the whole team, which includes practitioners from different areas. Thus, the characterization of graphic symbols used in assistive technology development processes are presented. A literature review was carried out to select: graphic symbols and p...

  4. General Purpose Graphics Processing Unit Based High-Rate Rice Decompression and Reed-Solomon Decoding.

    Energy Technology Data Exchange (ETDEWEB)

    Loughry, Thomas A.

    2015-02-01

    As the volume of data acquired by space-based sensors increases, mission data compression/decompression and forward error correction code processing performance must likewise scale. This competency development effort was explored using the General Purpose Graphics Processing Unit (GPGPU) to accomplish high-rate Rice Decompression and high-rate Reed-Solomon (RS) decoding at the satellite mission ground station. Each algorithm was implemented and benchmarked on a single GPGPU. Distributed processing across one to four GPGPUs was also investigated. The results show that the GPGPU has considerable potential for performing satellite communication Data Signal Processing, with three times or better performance improvements and up to ten times reduction in cost over custom hardware, at least in the case of Rice Decompression and Reed-Solomon Decoding.

  6. Accelerating chemical database searching using graphics processing units.

    Science.gov (United States)

    Liu, Pu; Agrafiotis, Dimitris K; Rassokhin, Dmitrii N; Yang, Eric

    2011-08-22

    The utility of chemoinformatics systems depends on the accurate computer representation and efficient manipulation of chemical compounds. In such systems, a small molecule is often digitized as a large fingerprint vector, where each element indicates the presence/absence or the number of occurrences of a particular structural feature. Since in theory the number of unique features can be exceedingly large, these fingerprint vectors are usually folded into much shorter ones using hashing and modulo operations, allowing fast "in-memory" manipulation and comparison of molecules. There is increasing evidence that lossless fingerprints can substantially improve retrieval performance in chemical database searching (substructure or similarity), which have led to the development of several lossless fingerprint compression algorithms. However, any gains in storage and retrieval afforded by compression need to be weighed against the extra computational burden required for decompression before these fingerprints can be compared. Here we demonstrate that graphics processing units (GPU) can greatly alleviate this problem, enabling the practical application of lossless fingerprints on large databases. More specifically, we show that, with the help of a ~$500 ordinary video card, the entire PubChem database of ~32 million compounds can be searched in ~0.2-2 s on average, which is 2 orders of magnitude faster than a conventional CPU. If multiple query patterns are processed in batch, the speedup is even more dramatic (less than 0.02-0.2 s/query for 1000 queries). In the present study, we use the Elias gamma compression algorithm, which results in a compression ratio as high as 0.097.
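
    For intuition, the core similarity computation parallelizes one thread per database compound. The sketch below assumes fixed-length fingerprints packed into 32-bit words and does not show the on-the-fly decompression of lossless fingerprints that the paper is actually about; __popc is the GPU's hardware population count.

        #include <cuda_runtime.h>

        // Tanimoto similarity between one query fingerprint and every database
        // fingerprint; one thread per compound, nwords 32-bit words per fingerprint.
        __global__ void tanimoto(const unsigned* db, const unsigned* query,
                                 float* score, int ncompounds, int nwords) {
            int c = blockIdx.x * blockDim.x + threadIdx.x;
            if (c >= ncompounds) return;
            const unsigned* fp = db + (size_t)c * nwords;
            int both = 0, any = 0;
            for (int w = 0; w < nwords; ++w) {
                both += __popc(fp[w] & query[w]);   // bits set in both fingerprints
                any  += __popc(fp[w] | query[w]);   // bits set in either fingerprint
            }
            score[c] = any ? (float)both / any : 0.f;
        }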

  7. Experiments with a low-cost system for computer graphics material model acquisition

    Science.gov (United States)

    Rushmeier, Holly; Lockerman, Yitzhak; Cartwright, Luke; Pitera, David

    2015-03-01

    We consider the design of an inexpensive system for acquiring material models for computer graphics rendering applications in animation, games and conceptual design. To be useful in these applications a system must be able to model a rich range of appearances in a computationally tractable form. The range of appearance of interest in computer graphics includes materials that have spatially varying properties, directionality, small-scale geometric structure, and subsurface scattering. To be computationally tractable, material models for graphics must be compact, editable, and efficient to numerically evaluate for ray tracing importance sampling. To construct appropriate models for a range of interesting materials, we take the approach of separating out directly and indirectly scattered light using high spatial frequency patterns introduced by Nayar et al. in 2006. To acquire the data at low cost, we use a set of Raspberry Pi computers and cameras clamped to miniature projectors. We explore techniques to separate out surface and subsurface indirect lighting. This separation would allow the fitting of simple, and so tractable, analytical models to features of the appearance model. The goal of the system is to provide models for physically accurate renderings that are visually equivalent to viewing the original physical materials.

  8. Rapid learning-based video stereolization using graphic processing unit acceleration

    Science.gov (United States)

    Sun, Tian; Jung, Cheolkon; Wang, Lei; Kim, Joongkyu

    2016-09-01

    Video stereolization has received much attention in recent years due to the lack of stereoscopic three-dimensional (3-D) contents. Although video stereolization can enrich stereoscopic 3-D contents, it is hard to achieve automatic two-dimensional-to-3-D conversion at low computational cost. We proposed rapid learning-based video stereolization using graphics processing unit (GPU) acceleration. We first generated an initial depth map based on learning from examples. Then, we refined the depth map using saliency and cross-bilateral filtering to make object boundaries clear. Finally, we performed depth-image-based rendering to generate stereoscopic 3-D views. To accelerate the computation of video stereolization, we provided a parallelizable hybrid GPU-central processing unit (CPU) solution suitable for running on the GPU. Experimental results demonstrate that the proposed method is nearly 180 times faster than CPU-based processing and achieves good performance comparable to that of state-of-the-art methods.

  9. A Block-Asynchronous Relaxation Method for Graphics Processing Units

    Energy Technology Data Exchange (ETDEWEB)

    Antz, Hartwig [Karlsruhe Inst. of Technology (KIT) (Germany); Tomov, Stanimire [Univ. of Tennessee, Knoxville, TN (United States); Dongarra, Jack [Univ. of Tennessee, Knoxville, TN (United States); Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Univ. of Manchester (United Kingdom); Heuveline, Vincent [Karlsruhe Inst. of Technology (KIT) (Germany)

    2011-11-30

    In this paper, we analyze the potential of asynchronous relaxation methods on Graphics Processing Units (GPUs). For this purpose, we developed a set of asynchronous iteration algorithms in CUDA and compared them with a parallel implementation of synchronous relaxation methods on CPU-based systems. For a set of test matrices taken from the University of Florida Matrix Collection we monitor the convergence behavior, the average iteration time and the total time-to-solution. Analyzing the results, we observe that even for our most basic asynchronous relaxation scheme, despite its lower convergence rate compared to the Gauss-Seidel relaxation (which we expected), the asynchronous iteration running on GPUs is still able to provide solution approximations of certain accuracy in considerably shorter time than Gauss-Seidel running on CPUs. Hence, it overcompensates for the slower convergence by exploiting the scalability and the good fit of the asynchronous schemes for the highly parallel GPU architectures. Further, enhancing the most basic asynchronous approach with hybrid schemes - using multiple iterations within the "subdomain" handled by a GPU thread block and Jacobi-like asynchronous updates across the "boundaries", subject to tuning various parameters - we manage to not only recover the loss of global convergence but often accelerate convergence by up to two times (compared to the effective but difficult-to-parallelize Gauss-Seidel type of schemes), while keeping the execution time of a global iteration practically the same. This shows the high potential of the asynchronous methods not only as a stand-alone numerical solver for linear systems of equations fulfilling certain convergence conditions but, more importantly, as a smoother in multigrid methods. Due to the explosion of parallelism in today's architecture designs, the significance of and need for asynchronous methods, as the ones described in this work, is expected to grow.
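
    A hedged sketch of the block-asynchronous flavour for a small dense system: each thread owns one unknown, blocks synchronize internally between local sweeps, and values belonging to other blocks are read as-is from global memory, possibly stale. The dense row-major storage and all names are illustrative, not the paper's implementation.

        #include <cuda_runtime.h>

        // Block-asynchronous relaxation on A x = b (A dense, n x n, row-major):
        // Gauss-Seidel-like updates inside a block, Jacobi-like (possibly stale)
        // reads across block boundaries; no grid-wide synchronization occurs.
        __global__ void async_jacobi(int n, const double* A, const double* b,
                                     double* x, int local_iters) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            for (int it = 0; it < local_iters; ++it) {
                if (i < n) {
                    double sigma = 0.0;
                    for (int j = 0; j < n; ++j)
                        if (j != i) sigma += A[(size_t)i * n + j] * x[j]; // may be stale
                    x[i] = (b[i] - sigma) / A[(size_t)i * n + i];
                }
                __syncthreads();   // block-local synchronization only
            }
        }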

  10. Processes of Curriculum Development in the Department of Graphic ...

    African Journals Online (AJOL)

    Test

    highlight the importance of critical engagement by staff which addresses ... facilitate the epistemological access our students need in order to achieve academic success? ... sufficient - in order to 'see' the world as a specific knowledge practitioner, ..... In addition to these factors, staff in Graphic Design, driven by a need to ...

  11. Fast crustal deformation computing method for multiple computations accelerated by a graphics processing unit cluster

    Science.gov (United States)

    Yamaguchi, Takuma; Ichimura, Tsuyoshi; Yagi, Yuji; Agata, Ryoichiro; Hori, Takane; Hori, Muneo

    2017-08-01

    As high-resolution observational data become more common, the demand for numerical simulations of crustal deformation using 3-D high-fidelity modelling is increasing. To increase the efficiency of performing numerical simulations with high computation costs, we developed a fast solver using heterogeneous computing, with graphics processing units (GPUs) and central processing units, and then used the solver in crustal deformation computations. The solver was based on an iterative solver and was devised so that a large proportion of the computation was calculated more quickly using GPUs. To confirm the utility of the proposed solver, we demonstrated a numerical simulation of the coseismic slip distribution estimation, which requires 360,000 crustal deformation computations with 82,196,106 degrees of freedom.

  12. BarraCUDA - a fast short read sequence aligner using graphics processing units

    LENUS (Irish Health Repository)

    Klus, Petr

    2012-01-13

    Abstract Background With the maturation of next-generation DNA sequencing (NGS) technologies, the throughput of DNA sequencing reads has soared to over 600 gigabases from a single instrument run. General purpose computing on graphics processing units (GPGPU), extracts the computing power from hundreds of parallel stream processors within graphics processing cores and provides a cost-effective and energy efficient alternative to traditional high-performance computing (HPC) clusters. In this article, we describe the implementation of BarraCUDA, a GPGPU sequence alignment software that is based on BWA, to accelerate the alignment of sequencing reads generated by these instruments to a reference DNA sequence. Findings Using the NVIDIA Compute Unified Device Architecture (CUDA) software development environment, we ported the most computational-intensive alignment component of BWA to GPU to take advantage of the massive parallelism. As a result, BarraCUDA offers a magnitude of performance boost in alignment throughput when compared to a CPU core while delivering the same level of alignment fidelity. The software is also capable of supporting multiple CUDA devices in parallel to further accelerate the alignment throughput. Conclusions BarraCUDA is designed to take advantage of the parallelism of GPU to accelerate the alignment of millions of sequencing reads generated by NGS instruments. By doing this, we could, at least in part streamline the current bioinformatics pipeline such that the wider scientific community could benefit from the sequencing technology. BarraCUDA is currently available from http://seqbarracuda.sf.net

  13. Graphical processing unit implementation of an integrated shape-based active contour: Application to digital pathology

    Directory of Open Access Journals (Sweden)

    Sahirzeeshan Ali

    2011-01-01

    Full Text Available Commodity graphics hardware has become a cost-effective parallel platform to solve many general computational problems. In medical imaging, and more so in digital pathology, segmentation of multiple structures on high-resolution images is often a complex and computationally expensive task. Shape-based level set segmentation has recently emerged as a natural solution to segmenting overlapping and occluded objects. However, the flexibility of the level set method has traditionally resulted in long computation times and therefore might have limited clinical utility. The processing times even for moderately sized images could run into several hours of computation time. Hence there is a clear need to accelerate these segmentation schemes. In this paper, we present a parallel implementation of a computationally heavy segmentation scheme on a graphical processing unit (GPU). The segmentation scheme incorporates level sets with shape priors to segment multiple overlapping nuclei from very large digital pathology images. We report a speedup of 19× compared to multithreaded C and MATLAB-based implementations of the same scheme, albeit with a slight reduction in accuracy. Our GPU-based segmentation scheme was rigorously and quantitatively evaluated for the problem of nuclei segmentation and overlap resolution on digitized histopathology images corresponding to breast and prostate biopsy tissue specimens.

  14. BarraCUDA - a fast short read sequence aligner using graphics processing units

    Directory of Open Access Journals (Sweden)

    Klus Petr

    2012-01-01

    Full Text Available Abstract Background With the maturation of next-generation DNA sequencing (NGS) technologies, the throughput of DNA sequencing reads has soared to over 600 gigabases from a single instrument run. General purpose computing on graphics processing units (GPGPU) extracts the computing power from hundreds of parallel stream processors within graphics processing cores and provides a cost-effective and energy efficient alternative to traditional high-performance computing (HPC) clusters. In this article, we describe the implementation of BarraCUDA, a GPGPU sequence alignment software that is based on BWA, to accelerate the alignment of sequencing reads generated by these instruments to a reference DNA sequence. Findings Using the NVIDIA Compute Unified Device Architecture (CUDA) software development environment, we ported the most computational-intensive alignment component of BWA to GPU to take advantage of the massive parallelism. As a result, BarraCUDA offers a magnitude of performance boost in alignment throughput when compared to a CPU core while delivering the same level of alignment fidelity. The software is also capable of supporting multiple CUDA devices in parallel to further accelerate the alignment throughput. Conclusions BarraCUDA is designed to take advantage of the parallelism of GPU to accelerate the alignment of millions of sequencing reads generated by NGS instruments. By doing this, we could, at least in part, streamline the current bioinformatics pipeline such that the wider scientific community could benefit from the sequencing technology. BarraCUDA is currently available from http://seqbarracuda.sf.net

  15. Software Graphics Processing Unit (sGPU) for Deep Space Applications

    Science.gov (United States)

    McCabe, Mary; Salazar, George; Steele, Glen

    2015-01-01

    A graphics processing capability will be required for deep space missions and must include a range of applications, from safety-critical vehicle health status to telemedicine for crew health. However, preliminary radiation testing of commercial graphics processing cards suggests they cannot operate in the deep space radiation environment. Investigation into a Software Graphics Processing Unit (sGPU) comprised of commercial-equivalent radiation hardened/tolerant single board computers, field programmable gate arrays, and safety-critical display software shows promising results. Preliminary performance of approximately 30 frames per second (FPS) has been achieved. Use of multi-core processors may provide a significant increase in performance.

  16. Mendel-GPU: haplotyping and genotype imputation on graphics processing units.

    Science.gov (United States)

    Chen, Gary K; Wang, Kai; Stram, Alex H; Sobel, Eric M; Lange, Kenneth

    2012-11-15

    In modern sequencing studies, one can improve the confidence of genotype calls by phasing haplotypes using information from an external reference panel of fully typed unrelated individuals. However, the computational demands are so high that they prohibit researchers with limited computational resources from haplotyping large-scale sequence data. Our graphics processing unit based software delivers haplotyping and imputation accuracies comparable to competing programs at a fraction of the computational cost and peak memory demand. Mendel-GPU, our OpenCL software, runs on Linux platforms and is portable across AMD and nVidia GPUs. Users can download both code and documentation at http://code.google.com/p/mendel-gpu/. gary.k.chen@usc.edu. Supplementary data are available at Bioinformatics online.

  17. Pulse shape analysis for segmented germanium detectors implemented in graphics processing units

    Energy Technology Data Exchange (ETDEWEB)

    Calore, Enrico, E-mail: enrico.calore@lnl.infn.it [INFN Laboratori Nazionali di Legnaro, Viale Dell' Università 2, I-35020 Legnaro, Padova (Italy); Bazzacco, Dino, E-mail: dino.bazzacco@pd.infn.it [INFN Sezione di Padova, Via Marzolo 8, I-35131 Padova (Italy); Recchia, Francesco, E-mail: francesco.recchia@pd.infn.it [INFN Sezione di Padova, Via Marzolo 8, I-35131 Padova (Italy); Dipartimento di Fisica e Astronomia dell' Università di Padova, Via Marzolo 8, I-35131 Padova (Italy)

    2013-08-11

    Position-sensitive, highly segmented germanium detectors constitute the state of the art of the technology employed for γ-spectroscopy studies. The operation of large spectrometers composed of tens to hundreds of such detectors demands enormous amounts of computing power for the digital treatment of the signals. The use of Graphics Processing Units (GPUs) has been evaluated as a cost-effective solution to meet such requirements. Different implementations and the hardware constraints limiting the performance of the system are examined. -- Highlights: • We implemented the grid-search algorithm in OpenCL in order to run it on GPUs. • We compared its performance with respect to an optimized CPU implementation in C++. • We analyzed the results, highlighting the most probable factors limiting their speed. • We propose some solutions to overcome these speed limits.
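    The grid-search approach named in the highlights maps naturally onto one GPU thread per candidate interaction position, each comparing the measured trace against a precomputed basis signal. The sketch below illustrates the idea in CUDA rather than the authors' OpenCL, with a least-squares figure of merit; all identifiers (basis, trace, fom) are illustrative assumptions, not the cited code.

```cuda
// One thread per candidate interaction position: compare the measured
// detector trace against that position's simulated basis signal.
__global__ void gridSearch(const float* __restrict__ basis, // [nPoints][nSamples]
                           const float* __restrict__ trace, // [nSamples] measured
                           float* fom, int nPoints, int nSamples)
{
    int p = blockIdx.x * blockDim.x + threadIdx.x;
    if (p >= nPoints) return;
    float sum = 0.0f;
    for (int s = 0; s < nSamples; ++s) {
        float d = basis[p * nSamples + s] - trace[s];
        sum += d * d;                 // least-squares figure of merit
    }
    fom[p] = sum;                     // a subsequent reduction picks the minimum
}
```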

  18. Discontinuous Galerkin methods on graphics processing units for nonlinear hyperbolic conservation laws

    CERN Document Server

    Fuhry, Martin; Krivodonova, Lilia

    2016-01-01

    We present a novel implementation of the modal discontinuous Galerkin (DG) method for hyperbolic conservation laws in two dimensions on graphics processing units (GPUs) using NVIDIA's Compute Unified Device Architecture (CUDA). Both flexible and highly accurate, DG methods accommodate parallel architectures well, as their discontinuous nature produces element-local approximations. GPUs in turn suit high-performance scientific computing well: these powerful, massively parallel, cost-effective devices have recently gained support for double-precision floating-point numbers. Computed examples for the Euler equations over unstructured triangular meshes demonstrate the effectiveness of our implementation on an NVIDIA GTX 580 device. Profiling of our method reveals performance comparable to an existing nodal DG-GPU implementation for linear problems.

  19. Monte Carlo simulation of photon migration in 3D turbid media accelerated by graphics processing units.

    Science.gov (United States)

    Fang, Qianqian; Boas, David A

    2009-10-26

    We report a parallel Monte Carlo algorithm accelerated by graphics processing units (GPUs) for modeling time-resolved photon migration in arbitrary 3D turbid media. By taking advantage of massively parallel threads and low memory latency, this algorithm allows many photons to be simulated simultaneously on a GPU. To further improve the computational efficiency, we explored two parallel random number generators (RNGs), including a floating-point-only RNG based on a chaotic lattice. An efficient scheme for boundary reflection was implemented, along with functions for time-resolved imaging. For a homogeneous semi-infinite medium, good agreement was observed between the simulation output and the analytical solution from diffusion theory. The code was implemented in the CUDA programming language and benchmarked under various parameters, such as thread number, selection of RNG, and memory access pattern. With a low-cost graphics card, this algorithm has demonstrated an acceleration ratio above 300 when using 1792 parallel threads, relative to conventional CPU computation. The acceleration ratio drops to 75 when using atomic operations. These results render GPU-based Monte Carlo simulation a practical solution for data analysis in a wide range of diffuse optical imaging applications, such as human brain or small-animal imaging.
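    The one-thread-per-photon strategy is the heart of such simulators: each thread carries its own RNG state and random-walks a photon independently. The following CUDA sketch shows the skeleton under simplifying assumptions (isotropic scattering, infinite homogeneous medium, a plain xorshift RNG rather than the paper's chaotic-lattice generator); all names are illustrative.

```cuda
// Minimal per-thread RNG; the paper evaluates other parallel RNGs.
__device__ unsigned int xorshift32(unsigned int* s) {
    unsigned int x = *s;
    x ^= x << 13; x ^= x >> 17; x ^= x << 5;
    return *s = x;
}
__device__ float uniform01(unsigned int* s) {        // in [0, 1)
    return (xorshift32(s) & 0x00FFFFFFu) / 16777216.0f;
}

// One thread per photon: isotropic random walk in an infinite homogeneous
// medium, scoring total path length. A real time-resolved simulator also
// tracks direction, anisotropy, boundaries, and a fluence grid.
__global__ void migratePhotons(float* pathLen, unsigned int* seeds,
                               float mu_s, int nPhotons, int maxScatter)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nPhotons) return;
    unsigned int s = seeds[i];
    float total = 0.0f;
    for (int k = 0; k < maxScatter; ++k)
        total += -logf(1.0f - uniform01(&s)) / mu_s; // exponential step length
    pathLen[i] = total;
    seeds[i] = s;
}
```

    Scoring into a shared fluence grid is where atomic operations enter, consistent with the reported drop from 300× to 75× acceleration when atomics are used.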

  20. A software architecture for multi-cellular system simulations on graphics processing units.

    Science.gov (United States)

    Jeannin-Girardon, Anne; Ballet, Pascal; Rodin, Vincent

    2013-09-01

    The first aim of simulation in a virtual environment is to help biologists gain a better understanding of the simulated system. The cost of such simulation is significantly reduced compared to that of in vivo experimentation. However, the inherent complexity of biological systems makes them hard to simulate on non-parallel architectures: models may be made of sub-models and take several scales into account, and the number of simulated entities may be quite large. Today, graphics cards are used for general-purpose computing, which has been made easier thanks to frameworks like CUDA and OpenCL. Parallelizing models may nevertheless not be easy: parallel programming skills are often required, and several hardware architectures may be used to execute models. In this paper, we present the software architecture we built in order to implement various models able to simulate multi-cellular systems. This architecture is modular and implements data structures adapted to graphics processing unit architectures. It allows efficient simulation of biological mechanisms.

  1. High-speed nonlinear finite element analysis for surgical simulation using graphics processing units.

    Science.gov (United States)

    Taylor, Z A; Cheng, M; Ourselin, S

    2008-05-01

    The use of biomechanical modelling, especially in conjunction with finite element analysis, has become common in many areas of medical image analysis and surgical simulation. Clinical employment of such techniques is hindered by conflicting requirements for high fidelity in the modelling approach and fast solution speeds. We report the development of techniques for high-speed nonlinear finite element analysis for surgical simulation. We use a fully nonlinear total Lagrangian explicit finite element formulation which offers significant computational advantages for soft tissue simulation. However, the key contribution of the work is the presentation of a fast graphics processing unit (GPU) solution scheme for the finite element equations. To the best of our knowledge, this represents the first GPU implementation of a nonlinear finite element solver. We show that the present explicit finite element scheme is well suited to solution via highly parallel graphics hardware, and that even a midrange GPU allows significant solution speed gains (up to 16.8×) compared with equivalent CPU implementations. For the models tested, the scheme allows real-time solution of models with up to 16,000 tetrahedral elements. The use of GPUs for such purposes offers a cost-effective high-performance alternative to expensive multi-CPU machines, and may have important applications in medical image analysis and surgical simulation.

  2. Commercial Off-The-Shelf (COTS) Graphics Processing Board (GPB) Radiation Test Evaluation Report

    Science.gov (United States)

    Salazar, George A.; Steele, Glen F.

    2013-01-01

    Large round-trip communications latency for deep space missions will require more onboard computational capability so that the space vehicle can undertake many tasks that have traditionally been ground-based mission control responsibilities. As a result, visual display graphics will be required to provide simpler vehicle situational awareness through graphical representations, as well as to provide capabilities never before used in a space mission, such as augmented reality for in-flight maintenance or telepresence activities. These capabilities will require graphics processors and associated support electronics for high computational graphics processing. In an effort to understand the performance of commercial graphics card electronics operating in the expected radiation environment, a preliminary test was performed on five commercial off-the-shelf (COTS) graphics cards. This paper discusses the preliminary evaluation results of five COTS graphics processing cards tested in the International Space Station (ISS) low-Earth-orbit radiation environment. Three of the five graphics cards were tested to a total dose of 6000 rads (Si). The test articles, test configuration, preliminary results, and recommendations are discussed.

  3. Graphic Warning Labels and the Cost Savings from Reduced Smoking among Pregnant Women

    Directory of Open Access Journals (Sweden)

    John A. Tauras

    2017-02-01

    Full Text Available Introduction: The U.S. Food and Drug Administration (FDA) has estimated the economic impact of Graphic Warning Labels (GWLs). By omitting the impact on tobacco consumption by pregnant women, the FDA analysis underestimates the economic benefits that would accrue from the proposed regulations. There is a strong link between the occurrence of low-birth-weight babies and smoking while pregnant, and low-birth-weight babies in turn generate much higher hospital costs than normal-birth-weight babies. This study aims to fill the gap by quantifying the national hospital cost savings from the reductions in prenatal smoking that would arise if GWLs were implemented in the U.S. Data and Methods: This study uses several data sources. It uses 2013 Natality Data from the National Vital Statistics System of the National Center for Health Statistics (NCHS) to estimate the impact of prenatal smoking on the likelihood of having a low-birth-weight baby, controlling for socio-economic and demographic characteristics as well as medical and non-medical risk factors. Using these estimates, along with the estimates of Huang et al. (2014) regarding the effect of GWLs on smoking, we calculate the change in the number of low-birth-weight (LBW) babies resulting from decreased prenatal smoking due to GWLs. Using this estimated change and the estimates from Russell et al. (2007) and AHRQ (2013) on the excess hospital costs of LBW babies, we calculate the cost savings that arise from reduced prenatal smoking in response to GWLs. Results and Conclusions: Our results indicate that GWLs for this population could lead to hospital cost savings of $1.2 billion to $2.0 billion over a 30-year horizon.

  4. Grace: a Cross-platform Micromagnetic Simulator On Graphics Processing Units

    CERN Document Server

    Zhu, Ru

    2014-01-01

    A micromagnetic simulator running on a graphics processing unit (GPU) is presented. It achieves a significant performance boost compared to previous central processing unit (CPU) simulators, up to two orders of magnitude for large input problems. Unlike the GPU implementations of other research groups, this simulator is developed with C++ Accelerated Massive Parallelism (C++ AMP) and is hardware-platform compatible. It runs on GPUs from vendors including NVIDIA, AMD, and Intel, paving the way for fast micromagnetic simulation on both high-end workstations with dedicated graphics cards and low-end personal computers with integrated graphics. A copy of the simulator software is publicly available.

  5. Thermal/Heat Transfer Analysis Using a Graphic Processing Unit (GPU) Enabled Computing Environment Project

    Data.gov (United States)

    National Aeronautics and Space Administration — The objective of this project was to use GPU enabled computing to accelerate the analyses of heat transfer and thermal effects. Graphical processing unit (GPU)...

  6. Harnessing graphics processing units for improved neuroimaging statistics.

    Science.gov (United States)

    Eklund, Anders; Villani, Mattias; Laconte, Stephen M

    2013-09-01

    Simple models and algorithms based on restrictive assumptions are often used in the field of neuroimaging for studies involving functional magnetic resonance imaging, voxel based morphometry, and diffusion tensor imaging. Nonparametric statistical methods or flexible Bayesian models can be applied rather easily to yield more trustworthy results. The spatial normalization step required for multisubject studies can also be improved by taking advantage of more robust algorithms for image registration. A common drawback of algorithms based on weaker assumptions, however, is the increase in computational complexity. In this short overview, we will therefore present some examples of how inexpensive PC graphics hardware, normally used for demanding computer games, can be used to enable practical use of more realistic models and accurate algorithms, such that the outcome of neuroimaging studies really can be trusted.

  7. Low-cost compact ECG with graphic LCD and phonocardiogram system design.

    Science.gov (United States)

    Kara, Sadik; Kemaloğlu, Semra; Kirbaş, Samil

    2006-06-01

    To date, many different ECG devices have been made in developing countries. In this study, a low-cost, small-size, portable ECG device with an LCD screen, together with a phonocardiograph, was designed. With the designed system, heart sounds acquired synchronously with the ECG signal can be heard distinctly. The system consists of three units: Unit 1, the ECG circuit with its filter and amplifier stages; Unit 2, the heart-sound acquisition circuit; and Unit 3, the microcontroller, the graphic LCD, and the module that sends the ECG signal to a computer. Because of its small size and other benefits, our system can easily be used in different departments of hospitals, health institutions and clinics, village clinics, and also in homes. In this way, it is possible to view the ECG signal and hear the heart sounds synchronously and with good sensitivity. Moreover, because the sounds are played through a small speaker, both doctor and patient can hear them; thus, the patient hears his or her own heart sounds and is informed by the doctor about his or her condition.

  8. Efficient particle-in-cell simulation of auroral plasma phenomena using a CUDA enabled graphics processing unit

    Science.gov (United States)

    Sewell, Stephen

    This thesis introduces a software framework that effectively utilizes low-cost, commercially available graphics processing units (GPUs) to simulate complex scientific plasma phenomena that are modeled using the Particle-In-Cell (PIC) paradigm. The software framework conforms to the Compute Unified Device Architecture (CUDA), a standard for general-purpose graphics processing introduced by NVIDIA Corporation. The framework has been verified for correctness and applied to advance the state of understanding of the electromagnetic aspects of the development of the Aurora Borealis and Aurora Australis. For each phase of the PIC methodology, this research identified one or more methods to exploit the problem's natural parallelism and effectively map it for execution on the graphics processing unit and its host processor. The sources of overhead that can reduce the effectiveness of parallelization for each of these methods were also identified. One of the novel aspects of this research was the utilization of particle sorting during the grid interpolation phase. The final implementation resulted in simulations that executed about 38 times faster than simulations run on a single-core general-purpose processing system. The scalability of this framework to larger problem sizes and future generation systems was also investigated.
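    In the PIC deposition (scatter) phase that such frameworks must parallelize, many particles write into the same grid cells concurrently; atomics resolve the conflicts, and sorting particles by cell, as in the thesis, clusters those writes for better memory locality. The CUDA sketch below shows a 1D cloud-in-cell deposition under these assumptions; all identifiers are illustrative, not the thesis code.

```cuda
// Charge deposition with linear (cloud-in-cell) weighting on a 1D grid.
// atomicAdd resolves concurrent writes; pre-sorting particles by cell
// makes these writes cluster, improving locality.
__global__ void depositCharge(const float* __restrict__ x, // particle positions
                              float* rho, float dx, int nParticles, int nCells)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nParticles) return;
    float xi = x[i] / dx;
    int c = (int)floorf(xi);
    float w = xi - c;                      // fractional distance past left node
    if (c >= 0 && c + 1 < nCells) {
        atomicAdd(&rho[c],     1.0f - w);  // weight to left node
        atomicAdd(&rho[c + 1], w);         // weight to right node
    }
}
```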

  9. Student Thinking Processes While Constructing Graphic Representations of Textbook Content: What Insights Do Think-Alouds Provide?

    Science.gov (United States)

    Scott, D. Beth; Dreher, Mariam Jean

    2016-01-01

    This study examined the thinking processes students engage in while constructing graphic representations of textbook content. Twenty-eight students who either used graphic representations in a routine manner during social studies instruction or learned to construct graphic representations based on the rhetorical patterns used to organize textbook…

  10. Acceleration of integral imaging based incoherent Fourier hologram capture using graphic processing unit.

    Science.gov (United States)

    Jeong, Kyeong-Min; Kim, Hee-Seung; Hong, Sung-In; Lee, Sung-Keun; Jo, Na-Young; Kim, Yong-Soo; Lim, Hong-Gi; Park, Jae-Hyeung

    2012-10-01

    Speed enhancement of integral-imaging-based incoherent Fourier hologram capture using a graphics processing unit is reported. The integral-imaging-based method enables exact hologram capture of real-existing three-dimensional objects under regular incoherent illumination. In our implementation, we apply a parallel computation scheme using the graphics processing unit, accelerating the processing speed. Using the enhanced capture speed, we also implement a pseudo-real-time hologram capture and optical reconstruction system. The overall operation speed is measured to be 1 frame per second.

  11. Spatial resolution recovery utilizing multi-ray tracing and graphic processing unit in PET image reconstruction.

    Science.gov (United States)

    Liang, Yicheng; Peng, Hao

    2015-02-07

    Depth-of-interaction (DOI) poses a major challenge for a PET system to achieve uniform spatial resolution across the field-of-view, particularly for small animal and organ-dedicated PET systems. In this work, we implemented an analytical method to model system matrix for resolution recovery, which was then incorporated in PET image reconstruction on a graphical processing unit platform, due to its parallel processing capacity. The method utilizes the concepts of virtual DOI layers and multi-ray tracing to calculate the coincidence detection response function for a given line-of-response. The accuracy of the proposed method was validated for a small-bore PET insert to be used for simultaneous PET/MR breast imaging. In addition, the performance comparisons were studied among the following three cases: 1) no physical DOI and no resolution modeling; 2) two physical DOI layers and no resolution modeling; and 3) no physical DOI design but with a different number of virtual DOI layers. The image quality was quantitatively evaluated in terms of spatial resolution (full-width-half-maximum and position offset), contrast recovery coefficient and noise. The results indicate that the proposed method has the potential to be used as an alternative to other physical DOI designs and achieve comparable imaging performances, while reducing detector/system design cost and complexity.

  12. Embedded-Based Graphics Processing Unit Cluster Platform for Multiple Sequence Alignments

    Directory of Open Access Journals (Sweden)

    Jyh-Da Wei

    2017-08-01

    Full Text Available High-end graphics processing units (GPUs), such as the NVIDIA Tesla/Fermi/Kepler series cards with thousands of cores per chip, have been widely applied to high-performance computing over the past decade. These desktop GPU cards must be installed in personal computers or servers with desktop CPUs, and the cost and power consumption of constructing a GPU cluster platform are very high. In recent years, NVIDIA released an embedded board, called Jetson Tegra K1 (TK1), which contains 4 ARM Cortex-A15 CPUs and 192 Compute Unified Device Architecture cores (belonging to the Kepler GPU family). The Jetson Tegra K1 has several advantages, such as low cost, low power consumption, and high applicability, and it has been applied to several specific applications. In our previous work, a bioinformatics platform with a single TK1 (STK platform) was constructed, and that work also showed that Web and mobile services can be implemented on the STK platform with a good cost-performance ratio compared with desktop CPUs and GPUs. In this work, an embedded GPU cluster platform is constructed with multiple TK1s (MTK platform). Complex system installation and setup are necessary procedures at first. Then, two job-assignment modes are designed for the MTK platform to provide services for users. Finally, ClustalW v2.0.11 and ClustalWtk are ported to the MTK platform. The experimental results showed that the speedup ratios achieved 5.5 and 4.8 times for ClustalW v2.0.11 and ClustalWtk, respectively, when comparing 6 TK1s with a single TK1. The MTK platform is proven to be useful for multiple sequence alignments.

  13. Calculation of HELAS amplitudes for QCD processes using graphics processing unit (GPU)

    CERN Document Server

    Hagiwara, K; Okamura, N; Rainwater, D L; Stelzer, T

    2009-01-01

    We use a graphics processing unit (GPU) for fast calculations of helicity amplitudes of quark and gluon scattering processes in massless QCD. New HEGET (HELAS Evaluation with GPU Enhanced Technology) codes for gluon self-interactions are introduced, and a C++ program to convert the MadGraph-generated FORTRAN codes into HEGET codes in CUDA (a C platform for general-purpose computing on GPUs) is created. Because of the proliferation of the number of Feynman diagrams and the number of independent color amplitudes, the maximum number of final-state jets we can evaluate on a GPU is limited to 4 for pure gluon processes ($gg \to 4g$), or 5 for processes with one or more quark lines such as $q\bar{q} \to 5g$ and $qq \to qq+3g$. Compared with the usual CPU-based programs, we obtain 60-100 times better performance on the GPU, except for 5-jet production processes and the $gg \to 4g$ processes, for which the GPU gain over the CPU is about 20.

  15. Option pricing with COS method on Graphics Processing Units

    NARCIS (Netherlands)

    Zhang, B.; Oosterlee, C.W.

    2009-01-01

    In this paper, acceleration on the GPU for option pricing by the COS method is demonstrated. In particular, both European and Bermudan options will be discussed in detail. For Bermudan options, we consider both the Black-Scholes model and Levy processes of infinite activity. Moreover, the influence

  16. Evaluating Mobile Graphics Processing Units (GPUs) for Real-Time Resource Constrained Applications

    Energy Technology Data Exchange (ETDEWEB)

    Meredith, J; Conger, J; Liu, Y; Johnson, J

    2005-11-11

    Modern graphics processing units (GPUs) can provide tremendous performance boosts for some applications beyond what a single CPU can accomplish, and their performance is growing at a faster rate than that of CPUs. Mobile GPUs available for laptops have the small form factor and low power requirements suitable for use in embedded processing. We evaluated several desktop and mobile GPUs and CPUs on traditional and non-traditional graphics tasks, as well as on the most time-consuming pieces of a full hyperspectral imaging application. Accuracy remained high despite small differences in arithmetic operations like rounding. Performance improvements are summarized here relative to a desktop Pentium 4 CPU.

  17. Accelerating Malware Detection via a Graphics Processing Unit

    Science.gov (United States)

    2010-09-01

    [Only fragments of this record survive. The recoverable text concerns the Portable Executable (PE) format used by Microsoft operating systems [Szo05]; the PE format is an updated version of the Common Object File Format (COFF) [Mic06].]

  18. Fast data preprocessing with Graphics Processing Units for inverse problem solving in light-scattering measurements

    Science.gov (United States)

    Derkachov, G.; Jakubczyk, T.; Jakubczyk, D.; Archer, J.; Woźniak, M.

    2017-07-01

    Utilising the Compute Unified Device Architecture (CUDA) platform for Graphics Processing Units (GPUs) enables a significant reduction of computation time at a moderate cost, by means of parallel computing. In the paper [Jakubczyk et al., Opto-Electron. Rev., 2016] we reported using a GPU for Mie scattering inverse problem solving (up to 800-fold speed-up). Here we report the development of two subroutines utilising the GPU at the data preprocessing stages of the inversion procedure: (i) a subroutine, based on ray tracing, for finding the spherical aberration correction function; (ii) a subroutine performing the conversion of an image to a 1D distribution of light intensity versus azimuth angle (i.e. a scattering diagram), fed from a movie-reading CPU subroutine running in parallel. All subroutines are incorporated in the PikeReader application, which we make available in a GitHub repository. PikeReader returns a sequence of intensity distributions versus a common azimuth-angle vector, corresponding to the recorded movie. We obtained an overall ∼400-fold speed-up of calculations at the data preprocessing stages using CUDA codes running on the GPU in comparison to a single-thread MATLAB-only code running on the CPU.
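    The image-to-scattering-diagram conversion in subroutine (ii) is essentially an azimuthal histogram, which parallelizes as one thread per pixel with atomic accumulation into angle bins. The CUDA sketch below illustrates this idea; the bin layout and all identifiers are assumptions for illustration, not PikeReader's actual code.

```cuda
// Bin each pixel's intensity by its azimuth angle about the centre (cx, cy).
// The host divides sumI by count per bin to obtain the scattering diagram.
__global__ void azimuthalProfile(const float* __restrict__ img,
                                 float* sumI, unsigned int* count,
                                 int width, int height,
                                 float cx, float cy, int nBins)
{
    const float PI = 3.14159265f;
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;
    float phi = atan2f(y - cy, x - cx);          // angle in [-pi, pi]
    int bin = (int)((phi + PI) / (2.0f * PI) * nBins);
    bin = min(bin, nBins - 1);                   // guard the phi == pi edge
    atomicAdd(&sumI[bin], img[y * width + x]);
    atomicAdd(&count[bin], 1u);
}
```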

  19. Accelerating large-scale protein structure alignments with graphics processing units

    Directory of Open Access Journals (Sweden)

    Pang Bin

    2012-02-01

    Full Text Available Abstract Background Large-scale protein structure alignment, an indispensable tool in structural bioinformatics, poses a tremendous challenge to computational resources. To ensure structure alignment accuracy and efficiency, efforts have been made to parallelize traditional alignment algorithms in grid environments. However, these solutions are costly and of limited accessibility. Others trade alignment quality for speedup by using high-level characteristics of structure fragments for structure comparisons. Findings We present ppsAlign, a parallel protein structure Alignment framework designed and optimized to exploit the parallelism of Graphics Processing Units (GPUs). As a general-purpose GPU platform, ppsAlign can incorporate many concurrent methods, such as TM-align and Fr-TM-align, into its parallelized algorithm design. We evaluated ppsAlign on an NVIDIA Tesla C2050 GPU card and compared it with existing software solutions running on an AMD dual-core CPU. We observed a 36-fold speedup over TM-align, a 65-fold speedup over Fr-TM-align, and a 40-fold speedup over MAMMOTH. Conclusions ppsAlign is a high-performance protein structure alignment tool designed to tackle the computational complexity issues of protein structural data. The solution presented in this paper allows large-scale structure comparisons to be performed using the massively parallel computing power of GPUs.

  20. Accelerating frequency-domain diffuse optical tomographic image reconstruction using graphics processing units.

    Science.gov (United States)

    Prakash, Jaya; Chandrasekharan, Venkittarayan; Upendra, Vishwajith; Yalavarthy, Phaneendra K

    2010-01-01

    Diffuse optical tomographic image reconstruction uses advanced numerical models that are computationally too costly to implement in real time. Graphics processing units (GPUs) offer desktop massive parallelization that can accelerate these computations. An open-source GPU-accelerated linear algebra library package is used to compute the most intensive matrix-matrix calculations and matrix decompositions that arise in solving the system of linear equations. These open-source functions were integrated into existing frequency-domain diffuse optical image reconstruction algorithms to evaluate the acceleration capability of the GPUs (NVIDIA Tesla C1060) with increasing reconstruction problem sizes. These studies indicate that single-precision computation is sufficient for diffuse optical tomographic image reconstruction. The acceleration per iteration can be up to 40× using GPUs compared to traditional CPUs in the case of three-dimensional reconstruction, where the reconstruction problem is more underdetermined, making GPUs more attractive in clinical settings. The current limitation of these GPUs is the available onboard memory (4 GB), which prevents the reconstruction of a large set of optical parameters, more than 13,377.

  1. Multidimensional upwind hydrodynamics on unstructured meshes using graphics processing units - I. Two-dimensional uniform meshes

    Science.gov (United States)

    Paardekooper, S.-J.

    2017-08-01

    We present a new method for numerical hydrodynamics which uses a multidimensional generalization of the Roe solver and operates on an unstructured triangular mesh. The main advantage over traditional methods based on Riemann solvers, which commonly use one-dimensional flux estimates as building blocks for a multidimensional integration, is its inherently multidimensional nature, and as a consequence its ability to recognize multidimensional stationary states that are not hydrostatic. A second novelty is the focus on graphics processing units (GPUs). By tailoring the algorithms specifically to GPUs, we are able to get speedups of 100-250× compared to a desktop machine. We compare the multidimensional upwind scheme to a traditional, dimensionally split implementation of the Roe solver on several test problems, and we find that the new method significantly outperforms the Roe solver in almost all cases. This comes with increased computational costs per time-step, which makes the new method approximately a factor of 2 slower than a dimensionally split scheme acting on a structured grid.

  2. Towards a Unified Sentiment Lexicon Based on Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    Liliana Ibeth Barbosa-Santillán

    2014-01-01

    Full Text Available This paper presents an approach to create what we have called a Unified Sentiment Lexicon (USL). This approach aims at aligning, unifying, and expanding the set of sentiment lexicons which are available on the web in order to increase their robustness of coverage. One problem related to the task of the automatic unification of different scores of sentiment lexicons is that there are multiple lexical entries for which the classification of positive, negative, or neutral {P, N, Z} depends on the unit of measurement used in the annotation methodology of the source sentiment lexicon. Our USL approach computes the unified strength of polarity of each lexical entry based on the Pearson correlation coefficient, which measures how correlated lexical entries are, with a value between 1 and −1, where 1 indicates that the lexical entries are perfectly correlated, 0 indicates no correlation, and −1 means they are perfectly inversely correlated; the UnifiedMetrics procedure is implemented for both CPU and GPU. Another problem is the high processing time required for computing all the lexical entries in the unification task. Thus, the USL approach computes a subset of lexical entries in each of the 1344 GPU cores and uses parallel processing in order to unify 155,802 lexical entries. The results of the analysis conducted using the USL approach show that the USL has 95,430 lexical entries, of which 35,201 are considered positive, 22,029 negative, and 38,200 neutral. Finally, the runtime was 10 minutes for 95,430 lexical entries; this amounts to a 3-fold reduction in the computing time of the UnifiedMetrics procedure.

  3. Design and analysis of CMOS analog signal processing circuits by means of a graphical MOST model

    NARCIS (Netherlands)

    Wallinga, Hans; Bult, Klaas

    1989-01-01

    A graphical representation of a simple MOST (metal-oxide-semiconductor transistor) model for the analysis of analog MOS circuits operating in strong inversion is given. It visualizes the principles of signal-processing techniques depending on the characteristics of an MOS transistor. Several lineari

  4. Parallelized CCHE2D flow model with CUDA Fortran on Graphics Process Units

    Science.gov (United States)

    This paper presents the CCHE2D implicit flow model parallelized using CUDA Fortran programming technique on Graphics Processing Units (GPUs). A parallelized implicit Alternating Direction Implicit (ADI) solver using Parallel Cyclic Reduction (PCR) algorithm on GPU is developed and tested. This solve...

  5. On the use of graphics processing units (GPUs) for molecular dynamics simulation of spherical particles

    NARCIS (Netherlands)

    Hidalgo, R.C.; Kanzaki, T.; Alonso-Marroquin, F.; Luding, S.; Yu, A.; Dong, K.; Yang, R.; Luding, S.

    2013-01-01

    General-purpose computation on Graphics Processing Units (GPU) on personal computers has recently become an attractive alternative to parallel computing on clusters and supercomputers. We present the GPU-implementation of an accurate molecular dynamics algorithm for a system of spheres. The new hybr

  6. A sampler of useful computational tools for applied geometry, computer graphics, and image processing foundations for computer graphics, vision, and image processing

    CERN Document Server

    Cohen-Or, Daniel; Ju, Tao; Mitra, Niloy J; Shamir, Ariel; Sorkine-Hornung, Olga; Zhang, Hao (Richard)

    2015-01-01

    A Sampler of Useful Computational Tools for Applied Geometry, Computer Graphics, and Image Processing shows how to use a collection of mathematical techniques to solve important problems in applied mathematics and computer science areas. The book discusses fundamental tools in analytical geometry and linear algebra. It covers a wide range of topics, from matrix decomposition to curvature analysis and principal component analysis to dimensionality reduction.Written by a team of highly respected professors, the book can be used in a one-semester, intermediate-level course in computer science. It

  7. Mesh-particle interpolations on graphics processing units and multicore central processing units.

    Science.gov (United States)

    Rossinelli, Diego; Conti, Christian; Koumoutsakos, Petros

    2011-06-13

    Particle-mesh interpolations are fundamental operations for particle-in-cell codes, as implemented in vortex methods, plasma dynamics and electrostatics simulations. In these simulations, the mesh is used to solve the field equations and the gradients of the fields are used in order to advance the particles. The time integration of particle trajectories is performed through an extensive resampling of the flow field at the particle locations. The computational performance of this resampling turns out to be limited by the memory bandwidth of the underlying computer architecture. We investigate how mesh-particle interpolation can be efficiently performed on graphics processing units (GPUs) and multicore central processing units (CPUs), and we present two implementation techniques. The single-precision results for the multicore CPU implementation show an acceleration of 45-70×, depending on system size, and an acceleration of 85-155× for the GPU implementation over an efficient single-threaded C++ implementation. In double precision, we observe a performance improvement of 30-40× for the multicore CPU implementation and 20-45× for the GPU implementation. With respect to the 16-threaded standard C++ implementation, the present CPU technique leads to a performance increase of roughly 2.8-3.7× in single precision and 1.7-2.4× in double precision, whereas the GPU technique leads to an improvement of 9× in single precision and 2.2-2.8× in double precision.
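    The resampling described here is a gather: each particle reads the surrounding mesh nodes and blends them with interpolation weights, which is the memory-bandwidth-bound pattern the paper benchmarks. The following CUDA sketch shows a bilinear mesh-to-particle gather on a 2D mesh, one thread per particle; identifiers are illustrative, not the authors' API.

```cuda
// Gather a field from a 2D mesh to particle positions (bilinear weights).
// Positions are assumed to be expressed in grid units.
__global__ void gatherField(const float* __restrict__ field,  // [ny][nx]
                            const float2* __restrict__ pos,   // particle positions
                            float* out, int nx, int ny, int nParticles)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nParticles) return;
    float px = pos[i].x, py = pos[i].y;
    int ix = min(max((int)floorf(px), 0), nx - 2);   // clamp to valid cells
    int iy = min(max((int)floorf(py), 0), ny - 2);
    float fx = px - ix, fy = py - iy;                // fractional offsets
    const float* f = field + iy * nx + ix;
    out[i] = (1.0f - fy) * ((1.0f - fx) * f[0]  + fx * f[1])
           +         fy  * ((1.0f - fx) * f[nx] + fx * f[nx + 1]);
}
```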

  8. Interactive Graphics Simulator: Design, Development, and Effectiveness/Cost Evaluation. Final Report.

    Science.gov (United States)

    Pieper, William J.; And Others

    This study was initiated to design, develop, implement, and evaluate a videodisc-based simulator system, the Interactive Graphics Simulator (IGS) for 6883 Converter Flight Control Test Station training at Lowry Air Force Base, Colorado. The simulator provided a means for performing task analysis online, developing simulations from the task…

  9. Optimization Solutions for Improving the Performance of the Parallel Reduction Algorithm Using Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    Ion LUNGU

    2012-01-01

    Full Text Available In this paper, we research, analyze and develop optimization solutions for the parallel reduction function using graphics processing units (GPUs) that implement the Compute Unified Device Architecture (CUDA), a modern and novel approach for improving the software performance of data processing applications and algorithms. Many of these applications and algorithms make use of the reduction function in their computational steps. After having designed the function and its algorithmic steps in CUDA, we progressively developed and implemented optimization solutions for the reduction function. In order to confirm, test and evaluate the solutions' efficiency, we developed a custom-tailored benchmark suite. We analyzed the obtained experimental results regarding: the comparison of execution time and bandwidth when using graphics processing units covering the main CUDA architectures (Tesla GT200, Fermi GF100, Kepler GK104) and a central processing unit; the influence of the data type; and the influence of the binary operator.
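    For reference, the baseline that such optimization studies usually depart from is the shared-memory tree reduction below (a minimal sketch, not the authors' code); the paper's progressive optimizations, such as processing multiple elements per thread, would then be applied to it. Launch with a power-of-two block size and blockDim.x * sizeof(float) bytes of shared memory.

```cuda
// Block-level sum reduction in shared memory with sequential addressing.
// Each block writes one partial sum; a second pass (or host) combines them.
__global__ void reduceSum(const float* __restrict__ in, float* out, int n)
{
    extern __shared__ float sdata[];
    unsigned int tid = threadIdx.x;
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    sdata[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();
    for (unsigned int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) sdata[tid] += sdata[tid + s];
        __syncthreads();
    }
    if (tid == 0) out[blockIdx.x] = sdata[0];
}
```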

  10. Mathematics of shape description a morphological approach to image processing and computer graphics

    CERN Document Server

    Ghosh, Pijush K

    2009-01-01

    Image processing problems are often not well defined because real images are contaminated with noise and other uncertain factors. In Mathematics of Shape Description, the authors take a mathematical approach to address these problems using the morphological and set-theoretic approach to image processing and computer graphics, presenting a simple shape model using two basic shape operators called Minkowski addition and decomposition. This book is ideal for professional researchers and engineers in information processing, image measurement, shape description, shape representation and computer graphics. Post-graduate and advanced undergraduate students in pure and applied mathematics, computer sciences, robotics and engineering will also benefit from this book. Key features: explains the fundamental and advanced relationships between algebraic systems and shape description through the set-theoretic approach; promotes interaction of image processing geochronology and mathematics in the field of algebraic geometry...

  11. Fast blood flow visualization of high-resolution laser speckle imaging data using graphics processing unit.

    Science.gov (United States)

    Liu, Shusen; Li, Pengcheng; Luo, Qingming

    2008-09-15

    Laser speckle contrast analysis (LASCA) is a non-invasive, full-field optical technique that produces two-dimensional maps of blood flow in biological tissue by analyzing speckle images captured by a CCD camera. Due to the heavy computation required for speckle contrast analysis, video-frame-rate visualization of blood flow, which is essential for medical use, is hardly achievable for high-resolution image data using the CPU (central processing unit) of an ordinary PC (personal computer). In this paper, we introduce the GPU (graphics processing unit) into our data processing framework for laser speckle contrast imaging to achieve fast, high-resolution blood flow visualization on PCs by exploiting the high floating-point processing power of commodity graphics hardware. By using the GPU, a 12-60-fold performance enhancement is obtained in comparison to optimized CPU implementations.
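    The per-pixel quantity LASCA computes is the speckle contrast K = σ/⟨I⟩ over a small sliding window (commonly 5×5 or 7×7), which is embarrassingly parallel: one thread per output pixel. Below is a minimal CUDA sketch under those assumptions; identifiers are illustrative.

```cuda
// Speckle contrast K = stddev / mean over a (2*halfWin+1)^2 window.
__global__ void speckleContrast(const float* __restrict__ img, float* K,
                                int width, int height, int halfWin)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < halfWin || y < halfWin ||
        x >= width - halfWin || y >= height - halfWin) return;
    float sum = 0.0f, sum2 = 0.0f;
    int n = (2 * halfWin + 1) * (2 * halfWin + 1);
    for (int dy = -halfWin; dy <= halfWin; ++dy)
        for (int dx = -halfWin; dx <= halfWin; ++dx) {
            float v = img[(y + dy) * width + (x + dx)];
            sum += v; sum2 += v * v;
        }
    float mean = sum / n;
    float var  = sum2 / n - mean * mean;       // population variance
    K[y * width + x] = sqrtf(fmaxf(var, 0.0f)) / mean;
}
```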

  12. Demystifying the Cost Estimation Process

    Science.gov (United States)

    Obi, Samuel C.

    2010-01-01

    In manufacturing today, nothing is more important than giving a customer a clear and straight-forward accounting of what their money has purchased. Many potentially promising return business orders are lost because of unclear, ambiguous, or improper billing. One of the best ways of resolving cost bargaining conflicts is by providing a…

  14. Use of a graphics processing unit (GPU) to facilitate real-time 3D graphic presentation of the patient skin-dose distribution during fluoroscopic interventional procedures.

    Science.gov (United States)

    Rana, Vijay; Rudin, Stephen; Bednarek, Daniel R

    2012-02-23

    We have developed a dose-tracking system (DTS) that calculates the radiation dose to the patient's skin in real-time by acquiring exposure parameters and imaging-system-geometry from the digital bus on a Toshiba Infinix C-arm unit. The cumulative dose values are then displayed as a color map on an OpenGL-based 3D graphic of the patient for immediate feedback to the interventionalist. Determination of those elements on the surface of the patient 3D-graphic that intersect the beam and calculation of the dose for these elements in real time demands fast computation. Reducing the size of the elements results in more computation load on the computer processor and therefore a tradeoff occurs between the resolution of the patient graphic and the real-time performance of the DTS. The speed of the DTS for calculating dose to the skin is limited by the central processing unit (CPU) and can be improved by using the parallel processing power of a graphics processing unit (GPU). Here, we compare the performance speed of GPU-based DTS software to that of the current CPU-based software as a function of the resolution of the patient graphics. Results show a tremendous improvement in speed using the GPU. While an increase in the spatial resolution of the patient graphics resulted in slowing down the computational speed of the DTS on the CPU, the speed of the GPU-based DTS was hardly affected. This GPU-based DTS can be a powerful tool for providing accurate, real-time feedback about patient skin-dose to physicians while performing interventional procedures.

  16. Bandwidth Enhancement between Graphics Processing Units on the Peripheral Component Interconnect Bus

    Directory of Open Access Journals (Sweden)

    ANTON Alin

    2015-10-01

    Full Text Available General-purpose computing on graphics processing units is a new trend in high-performance computing. Present-day applications require office and personal supercomputers, which are mostly based on many-core hardware accelerators communicating with the host system through the Peripheral Component Interconnect (PCI) bus. Parallel data compression is a difficult topic, but compression has been used successfully to improve the communication between parallel message passing interface (MPI) processes on high-performance computing clusters. In this paper we show that special-purpose compression algorithms designed for scientific floating-point data can be used to enhance the bandwidth between two graphics processing unit (GPU) devices on the PCI Express (PCIe) 3.0 x16 bus in a home-built personal supercomputer (PSC).
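    The baseline such work improves on is the raw device-to-device transfer rate over the PCIe bus, which can be measured with the CUDA runtime API as sketched below (the buffer size and device numbering are arbitrary assumptions); the effective bandwidth achieved with compression would then be compared against this figure.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Time a peer copy between two GPUs and report the raw PCIe bandwidth.
int main() {
    const size_t bytes = 256u << 20;            // 256 MiB test buffer
    void *src = nullptr, *dst = nullptr;
    cudaSetDevice(0); cudaMalloc(&src, bytes);
    cudaSetDevice(1); cudaMalloc(&dst, bytes);
    cudaSetDevice(0);
    cudaEvent_t t0, t1;
    cudaEventCreate(&t0); cudaEventCreate(&t1);
    cudaEventRecord(t0);
    cudaMemcpyPeer(dst, 1, src, 0, bytes);      // device 0 -> device 1
    cudaEventRecord(t1);
    cudaEventSynchronize(t1);
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, t0, t1);
    printf("GPU0 -> GPU1: %.2f GB/s\n", (bytes / 1e9) / (ms / 1e3));
    return 0;
}
```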

  17. Parallel computing for simultaneous iterative tomographic imaging by graphics processing units

    Science.gov (United States)

    Bello-Maldonado, Pedro D.; López, Ricardo; Rogers, Colleen; Jin, Yuanwei; Lu, Enyue

    2016-05-01

    In this paper, we address the problem of accelerating inversion algorithms for nonlinear acoustic tomographic imaging by parallel computing on graphics processing units (GPUs). Nonlinear inversion algorithms for tomographic imaging often rely on iterative algorithms for solving an inverse problem and are thus computationally intensive. We study the simultaneous iterative reconstruction technique (SIRT) for the multiple-input-multiple-output (MIMO) tomography algorithm, which enables parallel computation over the grid points as well as parallel execution of multiple source excitations. Using graphics processing units (GPUs) and the Compute Unified Device Architecture (CUDA) programming model, an overall improvement of 26.33× was achieved when combining both approaches, compared with sequential algorithms. Furthermore, we propose an adaptive iterative relaxation factor and the use of non-uniform weights to improve the overall convergence of the algorithm. Using these techniques, fast computations can be performed in parallel without loss of image quality during the reconstruction process.
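    In its common textbook form (the paper's exact notation may differ), the SIRT update that lends itself to this per-grid-point parallelism is

```latex
x^{(k+1)} = x^{(k)} + \lambda_k \, C A^{T} R \left( b - A x^{(k)} \right),
\qquad
C = \operatorname{diag}\!\Big(\frac{1}{\sum_i a_{ij}}\Big),
\quad
R = \operatorname{diag}\!\Big(\frac{1}{\sum_j a_{ij}}\Big),
```

    where $A$ is the system matrix, $b$ the measured data, and $\lambda_k$ the relaxation factor (made adaptive in this paper). Each component of $x^{(k+1)}$ depends only on the shared residual $b - A x^{(k)}$, so every grid point can be updated by its own GPU thread.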

  18. From Graphics Processing Unit to General-Purpose Graphics Processing Unit

    Institute of Scientific and Technical Information of China (English)

    刘金硕; 刘天晓; 吴慧; 曾秋梅; 任梦菲; 顾宜淳

    2013-01-01

    This paper defines GPU (graphics processing unit), general-purpose computation on GPU (GPGPU), and GPU-based programming models and environments. It divides the development of the GPU into four stages, and describes the evolution of GPU architecture from the non-unified rendering architecture to the unified rendering architecture and on to the new-generation Fermi architecture. It then compares the GPGPU architecture with multi-core CPU and distributed cluster architectures from both the hardware and software perspectives. The analysis shows that medium-grained, thread-level, data-intensive parallel computation is best served by multi-core multi-threaded parallelism; coarse-grained, network-intensive parallel computation by cluster parallelism; and fine-grained, compute-intensive parallel computation by GPGPU parallelism. Finally, the paper presents the research hotspots and future directions of GPGPU, namely automatic parallelization for GPGPU, CUDA support for multiple languages, and CUDA performance optimization, and introduces some classic applications of GPGPU.

  19. FLOCKING-BASED DOCUMENT CLUSTERING ON THE GRAPHICS PROCESSING UNIT [Book Chapter

    Energy Technology Data Exchange (ETDEWEB)

    Charles, J S; Patton, R M; Potok, T E; Cui, X

    2008-01-01

    Analyzing and grouping documents by content is a complex problem. One explored method of solving this problem borrows from nature, imitating the flocking behavior of birds. Each bird represents a single document and flies toward other documents that are similar to it. One limitation of this method of document clustering is its complexity, O(n^2). As the number of documents grows, it becomes increasingly difficult to receive results in a reasonable amount of time. However, flocking behavior, along with most naturally inspired algorithms such as ant colony optimization and particle swarm optimization, is highly parallel and has experienced improved performance on expensive cluster computers. In the last few years, the graphics processing unit (GPU) has received attention for its ability to solve highly parallel and semi-parallel problems much faster than the traditional sequential processor. Some applications see a huge increase in performance on this new platform. The cost of these high-performance devices is also marginal when compared with the price of cluster machines. In this paper, we have conducted research to exploit this architecture and apply its strengths to the document flocking problem. Our results highlight the potential benefit the GPU brings to all naturally inspired algorithms. Using the CUDA platform from NVIDIA®, we developed a document flocking implementation to be run on the NVIDIA® GEFORCE 8800. Additionally, we developed a similar but sequential implementation of the same algorithm to be run on a desktop CPU. We tested the performance of each on groups of news articles ranging in size from 200 to 3,000 documents. The results of these tests were very significant. Performance gains ranged from three to nearly five times improvement of the GPU over the CPU implementation. This dramatic improvement in runtime makes the GPU a potentially revolutionary platform for document clustering algorithms.

  20. Cost Models for MMC Manufacturing Processes

    Science.gov (United States)

    Elzey, Dana M.; Wadley, Haydn N. G.

    1996-01-01

    Processes for the manufacture of advanced metal matrix composites are rapidly approaching maturity in the research laboratory, and there is growing interest in their transition to industrial production. However, research conducted to date has almost exclusively focused on overcoming the technical barriers to producing high-quality material, and little attention has been given to the economic feasibility of these laboratory approaches and to process cost issues. A quantitative cost modeling (QCM) approach was developed to address these issues. QCMs are cost analysis tools based on predictive process models relating process conditions to the attributes of the final product. An important attribute of the QCM approach is the ability to predict the sensitivity of material production costs to product quality and to quantitatively explore trade-offs between cost and quality. Applications of the cost models allow more efficient direction of future MMC process technology development and a more accurate assessment of MMC market potential. Cost models were developed for two state-of-the-art metal matrix composite (MMC) manufacturing processes: tape casting and plasma spray deposition. Quality and cost models are presented for both processes, and the resulting predicted quality-cost curves are presented and discussed.

  1. Graphics Processing Unit-Based Bioheat Simulation to Facilitate Rapid Decision Making Associated with Cryosurgery Training.

    Science.gov (United States)

    Keelan, Robert; Zhang, Hong; Shimada, Kenji; Rabin, Yoed

    2016-04-01

    This study focuses on the implementation of an efficient numerical technique for cryosurgery simulations on a graphics processing unit as an alternative means to accelerate runtime. This study is part of an ongoing effort to develop computerized training tools for cryosurgery, with prostate cryosurgery as a developmental model. The ability to perform rapid simulations of various test cases is critical to facilitate sound decision making associated with medical training. Consistent with clinical practice, the training tool aims at correlating the frozen region contour and the corresponding temperature field with the target region shape. The current study focuses on the feasibility of graphics processing unit-based computation using C++ Accelerated Massive Parallelism (C++ AMP) as one possible implementation. Benchmark results on a variety of computation platforms display between 3-fold acceleration (laptop) and 13-fold acceleration (gaming computer) of cryosurgery simulation, in comparison with the more common implementation on a multicore central processing unit. While the general concept of graphics processing unit-based simulation is not new, its application to phase-change problems, combined with the unique requirements of cryosurgery optimization, represents the core contribution of the current study.

  2. Process Equipment Cost Estimation, Final Report

    Energy Technology Data Exchange (ETDEWEB)

    Loh, H. P. [National Energy Technology Lab. (NETL), Morgantown, WV (United States); Lyons, Jennifer [EG&G Technical Services, Inc., Morgantown, WV (United States); White, Charles W. [EG&G Technical Services, Inc., Morgantown, WV (United States)

    2002-01-01

    This report presents generic cost curves for several equipment types generated using ICARUS Process Evaluator. The curves give Purchased Equipment Cost as a function of a capacity variable. This work was performed to assist NETL engineers and scientists in performing rapid, order of magnitude level cost estimates or as an aid in evaluating the reasonableness of cost estimates submitted with proposed systems studies or proposals for new processes. The specific equipment types contained in this report were selected to represent a relatively comprehensive set of conventional chemical process equipment types.

  3. Accelerated molecular dynamics force evaluation on graphics processing units for thermal conductivity calculations

    OpenAIRE

    Fan, Zheyong; Siro, Topi; Harju, Ari

    2012-01-01

    In this paper, we develop a highly efficient molecular dynamics code fully implemented on graphics processing units for thermal conductivity calculations using the Green-Kubo formula. We compare two different schemes for force evaluation, a previously used thread-scheme where a single thread is used for one particle and each thread calculates the total force for the corresponding particle, and a new block-scheme where a whole block is used for one particle and each thread in the block calcula...

  4. A Fast MHD Code for Gravitationally Stratified Media using Graphical Processing Units: SMAUG

    Indian Academy of Sciences (India)

    M. K. Griffiths; V. Fedun; R.Erdélyi

    2015-03-01

    Parallelization techniques have been exploited most successfully by the gaming/graphics industry with the adoption of graphical processing units (GPUs), possessing hundreds of processor cores. The opportunity has been recognized by the computational sciences and engineering communities, who have recently harnessed successfully the numerical performance of GPUs. For example, parallel magnetohydrodynamic (MHD) algorithms are important for numerical modelling of highly inhomogeneous solar, astrophysical and geophysical plasmas. Here, we describe the implementation of SMAUG, the Sheffield Magnetohydrodynamics Algorithm Using GPUs. SMAUG is a 1–3D MHD code capable of modelling magnetized and gravitationally stratified plasma. The objective of this paper is to present the numerical methods and techniques used for porting the code to this novel and highly parallel compute architecture. The methods employed are justified by the performance benchmarks and validation results demonstrating that the code successfully simulates the physics for a range of test scenarios including a full 3D realistic model of wave propagation in the solar atmosphere.

  5. Fast extended focused imaging in digital holography using a graphics processing unit.

    Science.gov (United States)

    Wang, Le; Zhao, Jianlin; Di, Jianglei; Jiang, Hongzhen

    2011-05-01

    We present a simple and effective method for reconstructing extended focused images in digital holography using a graphics processing unit (GPU). The Fresnel transform method is simplified by an algorithm named fast Fourier transform pruning with frequency shift. Then the pixel size consistency problem is solved by coordinate transformation and combining the subpixel resampling and the fast Fourier transform pruning with frequency shift. With the assistance of the GPU, we implemented an improved parallel version of this method, which obtained about a 300-500-fold speedup compared with central processing unit codes.

  6. Fast high-resolution computer-generated hologram computation using multiple graphics processing unit cluster system.

    Science.gov (United States)

    Takada, Naoki; Shimobaba, Tomoyoshi; Nakayama, Hirotaka; Shiraki, Atsushi; Okada, Naohisa; Oikawa, Minoru; Masuda, Nobuyuki; Ito, Tomoyoshi

    2012-10-20

    To overcome the computational complexity of a computer-generated hologram (CGH), we implement an optimized CGH computation in our multi-graphics processing unit cluster system. Our system can calculate a CGH of 6,400×3,072 pixels from a three-dimensional (3D) object composed of 2,048 points in 55 ms. Furthermore, in the case of a 3D object composed of 4096 points, our system is 553 times faster than a conventional central processing unit (using eight threads).
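
    For orientation, a hedged single-GPU sketch of the point-source CGH accumulation that such systems parallelize is given below; the pixel pitch, uniform point amplitudes, and absence of the record's multi-GPU cluster decomposition are all simplifying assumptions.

      #include <math_constants.h>   // CUDART_PI_F

      // Hedged sketch: each thread computes one hologram pixel as a sum of
      // Fresnel phase terms over the 3D object points (z assumed nonzero).
      __global__ void cgh_pixel(const float3* pts, int npts, float* holo,
                                int width, int height, float pitch, float lambda)
      {
          int x = blockIdx.x * blockDim.x + threadIdx.x;
          int y = blockIdx.y * blockDim.y + threadIdx.y;
          if (x >= width || y >= height) return;
          float xh = (x - width / 2) * pitch, yh = (y - height / 2) * pitch;
          float acc = 0.f;
          for (int j = 0; j < npts; ++j) {
              float dx = xh - pts[j].x, dy = yh - pts[j].y;
              acc += __cosf(CUDART_PI_F * (dx * dx + dy * dy) / (lambda * pts[j].z));
          }
          holo[y * width + x] = acc;
      }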

  7. Graphics processing unit-based quantitative second-harmonic generation imaging.

    Science.gov (United States)

    Kabir, Mohammad Mahfuzul; Jonayat, A S M; Patel, Sanjay; Toussaint, Kimani C

    2014-09-01

    We adapt a graphics processing unit (GPU) to dynamic quantitative second-harmonic generation (SHG) imaging. We demonstrate the temporal advantage of the GPU-based approach by computing the number of frames analyzed per second from SHG image videos showing varying fiber orientations. In comparison to our previously reported CPU-based approach, our GPU-based image analysis results in ∼10× improvement in computational time. This work can be adapted to other quantitative, nonlinear imaging techniques and provides a significant step toward obtaining quantitative information from fast in vivo biological processes.

  8. Development of MATLAB-Based Digital Signal Processing Teaching Module with Graphical User Interface Environment for Nigerian University

    National Research Council Canada - National Science Library

    Oyetunji Samson Ade; Daniel Ale

    2013-01-01

    .... This paper annexes the potential of Peripheral Interface Controllers (PICs) with MATLAB resources to develop a PIC-based system with graphic user interface environment suitable for data acquisition and signal processing...

  9. Systems Biology Graphical Notation: Process Description language Level 1 Version 1.3.

    Science.gov (United States)

    Moodie, Stuart; Le Novère, Nicolas; Demir, Emek; Mi, Huaiyu; Villéger, Alice

    2015-09-04

    The Systems Biology Graphical Notation (SBGN) is an international community effort for standardized graphical representations of biological pathways and networks. The goal of SBGN is to provide unambiguous pathway and network maps for readers with different scientific backgrounds as well as to support efficient and accurate exchange of biological knowledge between different research communities, industry, and other players in systems biology. Three SBGN languages, Process Description (PD), Entity Relationship (ER) and Activity Flow (AF), allow for the representation of different aspects of biological and biochemical systems at different levels of detail. The SBGN Process Description language represents biological entities and processes between these entities within a network. SBGN PD focuses on the mechanistic description and temporal dependencies of biological interactions and transformations. The nodes (elements) are split into entity nodes describing, e.g., metabolites, proteins, genes and complexes, and process nodes describing, e.g., reactions and associations. The edges (connections) provide descriptions of relationships (or influences) between the nodes, such as consumption, production, stimulation and inhibition. Among all three languages of SBGN, PD is the closest to metabolic and regulatory pathways in biological literature and textbooks, but its well-defined semantics offer a superior precision in expressing biological knowledge.

  10. Creating Interactive Graphical Overlays in the Advanced Weather Interactive Processing System Using Shapefiles and DGM Files

    Science.gov (United States)

    Barrett, Joe H., III; Lafosse, Richard; Hood, Doris; Hoeth, Brian

    2007-01-01

    Graphical overlays can be created in real-time in the Advanced Weather Interactive Processing System (AWIPS) using shapefiles or Denver AWIPS Risk Reduction and Requirements Evaluation (DARE) Graphics Metafile (DGM) files. This presentation describes how to create graphical overlays on-the-fly for AWIPS, by using two examples of AWIPS applications that were created by the Applied Meteorology Unit (AMU) located at Cape Canaveral Air Force Station (CCAFS), Florida. The first example is the Anvil Threat Corridor Forecast Tool, which produces a shapefile that depicts a graphical threat corridor of the forecast movement of thunderstorm anvil clouds, based on the observed or forecast upper-level winds. This tool is used by the Spaceflight Meteorology Group (SMG) at Johnson Space Center, Texas and 45th Weather Squadron (45 WS) at CCAFS to analyze the threat of natural or space vehicle-triggered lightning over a location. The second example is a launch and landing trajectory tool that produces a DGM file that plots the ground track of space vehicles during launch or landing. The trajectory tool can be used by SMG and the 45 WS forecasters to analyze weather radar imagery along a launch or landing trajectory. The presentation will list the advantages and disadvantages of both file types for creating interactive graphical overlays in future AWIPS applications. Shapefiles are a popular format used extensively in Geographical Information Systems. They are usually used in AWIPS to depict static map backgrounds. A shapefile stores the geometry and attribute information of spatial features in a dataset (ESRI 1998). Shapefiles can contain point, line, and polygon features. Each shapefile contains a main file, index file, and a dBASE table. The main file contains a record for each spatial feature, which describes the feature with a list of its vertices. The index file contains the offset of each record from the beginning of the main file. The dBASE table contains records for each

  11. Employing OpenCL to Accelerate Ab Initio Calculations on Graphics Processing Units.

    Science.gov (United States)

    Kussmann, Jörg; Ochsenfeld, Christian

    2017-06-13

    We present an extension of our graphics processing units (GPU)-accelerated quantum chemistry package to employ OpenCL compute kernels, which can be executed on a wide range of computing devices like CPUs, Intel Xeon Phi, and AMD GPUs. Here, we focus on the use of AMD GPUs and discuss differences as compared to CUDA-based calculations on NVIDIA GPUs. First illustrative timings are presented for hybrid density functional theory calculations using serial as well as parallel compute environments. The results show that AMD GPUs are as fast or faster than comparable NVIDIA GPUs and provide a viable alternative for quantum chemical applications.

  12. Monte Carlo Simulations of Random Frustrated Systems on Graphics Processing Units

    Science.gov (United States)

    Feng, Sheng; Fang, Ye; Hall, Sean; Papke, Ariane; Thomasson, Cade; Tam, Ka-Ming; Moreno, Juana; Jarrell, Mark

    2012-02-01

    We study the implementation of the classical Monte Carlo simulation for random frustrated models using the multithreaded computing environment provided by the Compute Unified Device Architecture (CUDA) on modern Graphics Processing Units (GPUs) with hundreds of cores and high memory bandwidth. The key to optimizing the performance of GPU computing is the proper handling of the data structure. Utilizing multi-spin coding, we obtain an efficient GPU implementation of the parallel tempering Monte Carlo simulation for the Edwards-Anderson spin glass model. In typical simulations, we find a speed-up of over two thousand times relative to the single-threaded CPU implementation.
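
    A much-simplified, hedged sketch of the kind of kernel involved is shown below: a checkerboard Metropolis sweep for a 2D Edwards-Anderson model with one spin per integer. The multi-spin coding and parallel tempering that give the paper its large speed-up are deliberately omitted, and all names are assumptions.

      #include <curand_kernel.h>

      // Hedged sketch: parity (checkerboard) sweep so neighbouring spins are
      // never updated concurrently. Jr/Jd hold the random bonds to the
      // right/down neighbour of each site; spins s are +/-1.
      __global__ void ea_sweep(int* s, const float* Jr, const float* Jd,
                               int L, float beta, int parity, curandState* rng)
      {
          int i = blockIdx.x * blockDim.x + threadIdx.x;
          int j = blockIdx.y * blockDim.y + threadIdx.y;
          if (i >= L || j >= L || ((i + j) & 1) != parity) return;
          int id = j * L + i;
          int ip = (i + 1) % L, im = (i + L - 1) % L;
          int jp = (j + 1) % L, jm = (j + L - 1) % L;
          // local field from the four bonds touching site (i, j)
          float h = Jr[id] * s[j * L + ip] + Jr[j * L + im] * s[j * L + im]
                  + Jd[id] * s[jp * L + i] + Jd[jm * L + i] * s[jm * L + i];
          float dE = 2.0f * s[id] * h;      // energy cost of flipping s[id]
          if (dE <= 0.f || curand_uniform(&rng[id]) < __expf(-beta * dE))
              s[id] = -s[id];
      }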

  13. Uncontracted Rys Quadrature Implementation of up to G Functions on Graphical Processing Units.

    Science.gov (United States)

    Asadchev, Andrey; Allada, Veerendra; Felder, Jacob; Bode, Brett M; Gordon, Mark S; Windus, Theresa L

    2010-03-09

    An implementation is presented of an uncontracted Rys quadrature algorithm for electron repulsion integrals, including up to g functions on graphical processing units (GPUs). The general GPU programming model, the challenges associated with implementing the Rys quadrature on these highly parallel emerging architectures, and a new approach to implementing the quadrature are outlined. The performance of the implementation is evaluated for single and double precision on two different types of GPU devices. The performance obtained is on par with the matrix-vector routine from the CUDA basic linear algebra subroutines (CUBLAS) library.

  14. Efficient neighbor list calculation for molecular simulation of colloidal systems using graphics processing units

    Science.gov (United States)

    Howard, Michael P.; Anderson, Joshua A.; Nikoubashman, Arash; Glotzer, Sharon C.; Panagiotopoulos, Athanassios Z.

    2016-06-01

    We present an algorithm based on linear bounding volume hierarchies (LBVHs) for computing neighbor (Verlet) lists using graphics processing units (GPUs) for colloidal systems characterized by large size disparities. We compare this to a GPU implementation of the current state-of-the-art CPU algorithm based on stenciled cell lists. We report benchmarks for both neighbor list algorithms in a Lennard-Jones binary mixture with synthetic interaction range disparity and a realistic colloid solution. LBVHs outperformed the stenciled cell lists for systems with moderate or large size disparity and dilute or semidilute fractions of large particles, conditions typical of colloidal systems.

  15. Molecular dynamics for long-range interacting systems on Graphic Processing Units

    CERN Document Server

    Filho, Tarcísio M Rocha

    2012-01-01

    We present implementations of a fourth-order symplectic integrator on graphic processing units for three $N$-body models with long-range interactions of general interest: the Hamiltonian Mean Field, Ring and two-dimensional self-gravitating models. We discuss the algorithms, speedups and errors using one and two GPUs. Speedups can be as high as 140 compared to a serial code, and the overall relative error in the total energy is of the same order of magnitude as for the CPU code. The number of particles used in the tests ranges from 10,000 to 50,000,000, depending on the model.
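
    As a hedged sketch of the integrator family involved (not the authors' code), one fourth-order Forest-Ruth step can be composed from drift and kick sub-steps; drift() and kick() are hypothetical host wrappers that launch the position-update and force/velocity-update kernels on the GPU.

      #include <math.h>

      void drift(double dt);   // hypothetical: launches x += dt * v kernel
      void kick(double dt);    // hypothetical: recomputes forces, v += dt * a

      // Forest-Ruth composition coefficient for a fourth-order step.
      const double THETA = 1.0 / (2.0 - cbrt(2.0));

      void forest_ruth_step(double dt) {
          drift(0.5 * THETA * dt);
          kick(THETA * dt);
          drift(0.5 * (1.0 - THETA) * dt);
          kick((1.0 - 2.0 * THETA) * dt);
          drift(0.5 * (1.0 - THETA) * dt);
          kick(THETA * dt);
          drift(0.5 * THETA * dt);
      }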

  16. Accelerated 3D Monte Carlo light dosimetry using a graphics processing unit (GPU) cluster

    Science.gov (United States)

    Lo, William Chun Yip; Lilge, Lothar

    2010-11-01

    This paper presents a basic computational framework for real-time, 3-D light dosimetry on graphics processing unit (GPU) clusters. The GPU-based approach offers a direct solution to overcome the long computation time preventing Monte Carlo simulations from being used in complex optimization problems such as treatment planning, particularly if simulated annealing is employed as the optimization algorithm. The current multi-GPU implementation is validated using commercial light modelling software (ASAP from Breault Research Organization). It also supports the latest Fermi GPU architecture and features an interactive 3-D visualization interface. The software is available for download at http://code.google.com/p/gpu3d.

  17. An evaluation of the FDA's analysis of the costs and benefits of the graphic warning label regulation

    Science.gov (United States)

    Chaloupka, Frank J; Warner, Kenneth E; Acemoğlu, Daron; Gruber, Jonathan; Laux, Fritz; Max, Wendy; Newhouse, Joseph; Schelling, Thomas; Sindelar, Jody

    2015-01-01

    The Family Smoking Prevention and Tobacco Control Act of 2009 gave the Food and Drug Administration (FDA) regulatory authority over cigarettes and smokeless tobacco products and authorised it to assert jurisdiction over other tobacco products. As with other Federal agencies, FDA is required to assess the costs and benefits of its significant regulatory actions. To date, FDA has issued economic impact analyses of one proposed and one final rule requiring graphic warning labels (GWLs) on cigarette packaging and, most recently, of a proposed rule that would assert FDA’s authority over tobacco products other than cigarettes and smokeless tobacco. Given the controversy over the FDA's approach to assessing net economic benefits in its proposed and final rules on GWLs and the importance of having economic impact analyses prepared in accordance with sound economic analysis, a group of prominent economists met in early 2014 to review that approach and, where indicated, to offer suggestions for an improved analysis. We concluded that the analysis of the impact of GWLs on smoking substantially underestimated the benefits and overestimated the costs, leading the FDA to substantially underestimate the net benefits of the GWLs. We hope that the FDA will find our evaluation useful in subsequent analyses, not only of GWLs but also of other regulations regarding tobacco products. Most of what we discuss applies to all instances of evaluating the costs and benefits of tobacco product regulation and, we believe, should be considered in FDA's future analyses of proposed rules. PMID:25550419

  18. Low-cost LANDSAT processing system

    Science.gov (United States)

    Faust, N. L.; Hooper, N. J.; Spann, G. W.

    1980-01-01

    LANDSAT analysis system is assembled from commercially available components at relatively low cost. Small-scale system is put together for price affordable for state agencies and universities. It processes LANDSAT data for subscene areas on repetitive basis. Amount of time required for processing decreases linearly with number of classifications desired. Computer programs written in FORTRAN IV are available for analyzing data.

  19. Graphics gems

    CERN Document Server

    Heckbert, Paul S

    1994-01-01

    Graphics Gems IV contains practical techniques for 2D and 3D modeling, animation, rendering, and image processing. The book presents articles on polygons and polyhedra; a mix of formulas, optimized algorithms, and tutorial information on the geometry of 2D, 3D, and n-D space; transformations; and parametric curves and surfaces. The text also includes articles on ray tracing; shading 3D models; and frame buffer techniques. Articles on image processing; algorithms for graphical layout; basic interpolation methods; and subroutine libraries for vector and matrix algebra are also demonstrated. Com

  20. Modified graphical autocatalytic set model of combustion process in circulating fluidized bed boiler

    Science.gov (United States)

    Yusof, Nurul Syazwani; Bakar, Sumarni Abu; Ismail, Razidah

    2014-07-01

    Circulating Fluidized Bed Boiler (CFB) is a device for generating steam by burning fossil fuels in a furnace operating under a special hydrodynamic condition. An autocatalytic set has previously provided a graphical model of the chemical reactions that occur during the combustion process in a CFB: eight important chemical substances, known as species, are represented as nodes, and the catalytic relationships between them are represented by the edges of the graph. In this paper, the model is extended and modified by considering other relevant chemical reactions that also exist during the process. The catalytic relationships among the species in the model are discussed. The result reveals that the modified model is able to give a fuller explanation of the relationships among the species at the initial time t.

  1. Accelerated multidimensional radiofrequency pulse design for parallel transmission using concurrent computation on multiple graphics processing units.

    Science.gov (United States)

    Deng, Weiran; Yang, Cungeng; Stenger, V Andrew

    2011-02-01

    Multidimensional radiofrequency (RF) pulses are of current interest because of their promise for improving high-field imaging and for optimizing parallel transmission methods. One major drawback is that the computation time of numerically designed multidimensional RF pulses increases rapidly with their resolution and number of transmitters. This is critical because the construction of multidimensional RF pulses often needs to be in real time. The use of graphics processing units for computations is a recent approach for accelerating image reconstruction applications. We propose the use of graphics processing units for the design of multidimensional RF pulses including the utilization of parallel transmitters. Using a desktop computer with four NVIDIA Tesla C1060 computing processors, we found acceleration factors on the order of 20 for standard eight-transmitter two-dimensional spiral RF pulses with a 64 × 64 excitation resolution and a 10-μsec dwell time. We also show that even greater acceleration factors can be achieved for more complex RF pulses. Copyright © 2010 Wiley-Liss, Inc.
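
    A hedged sketch of the concurrent multi-GPU pattern such designs rely on is shown below: the work is sliced across all visible devices and the host synchronizes at the end. design_kernel is hypothetical, standing in for the per-slice RF-pulse computation.

      #include <cuda_runtime.h>
      #include <algorithm>

      __global__ void design_kernel(int first, int count);  // hypothetical

      void run_on_all_gpus(int nWork) {
          int nDev = 0;
          cudaGetDeviceCount(&nDev);
          int chunk = (nWork + nDev - 1) / nDev;
          for (int d = 0; d < nDev; ++d) {          // launches are asynchronous
              cudaSetDevice(d);
              int first = d * chunk;
              int count = std::min(chunk, nWork - first);
              if (count > 0)
                  design_kernel<<<(count + 255) / 256, 256>>>(first, count);
          }
          for (int d = 0; d < nDev; ++d) {          // wait for all devices
              cudaSetDevice(d);
              cudaDeviceSynchronize();
          }
      }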

  2. 76 FR 70490 - Certain Electronic Devices With Graphics Data Processing Systems, Components Thereof, and...

    Science.gov (United States)

    2011-11-14

    ...: Apple Inc., a/k/a Apple Computer, Inc., 1 Infinite Loop, Cupertino, CA 95014. (c) The Office of Unfair... Graphics Co., Ltd. of British West Indies and S3 Graphics, Inc. of Fremont, California. An amended... Graphics, Inc., 940 Mission Court, Fremont, CA 94539. (b) The respondent is the following entity alleged...

  3. Environmental control costs for oil shale processes

    Energy Technology Data Exchange (ETDEWEB)

    None

    1979-10-01

    The studies reported herein are intended to provide more certainty regarding estimates of the costs of controlling environmental residuals from oil shale technologies being readied for commercial application. The need for this study was evident from earlier work conducted by the Office of Environment for the Department of Energy Oil Shale Commercialization Planning, Environmental Readiness Assessment in mid-1978. At that time there was little reliable information on the costs for controlling residuals and for safe handling of wastes from oil shale processes. The uncertainties in estimating costs of complying with yet-to-be-defined environmental standards and regulations for oil shale facilities are a critical element that will affect the decision on proceeding with shale oil production. Until the regulatory requirements are fully clarified and processes and controls are investigated and tested in units of larger size, it will not be possible to provide definitive answers to the cost question. Thus, the objective of this work was to establish ranges of possible control costs per barrel of shale oil produced, reflecting various regulatory, technical, and financing assumptions. Two separate reports make up the bulk of this document. One report, prepared by the Denver Research Institute, is a relatively rigorous engineering treatment of the subject, based on regulatory assumptions and technical judgements as to best available control technologies and practices. The other report examines the incremental cost effect of more conservative technical and financing alternatives. An overview section is included that synthesizes the products of the separate studies and addresses two variations to the assumptions.

  4. Using Graphics Processing Units to solve the classical N-body problem in physics and astrophysics

    CERN Document Server

    Spera, Mario

    2014-01-01

    Graphics Processing Units (GPUs) can speed up the numerical solution of various problems in astrophysics, including the dynamical evolution of stellar systems; the performance gain can be more than a factor of 100 compared to using a Central Processing Unit only. In this work I describe some strategies to speed up the classical N-body problem using GPUs. I show some features of the N-body code HiGPUs as template code. In this context, I also give some hints on the parallel implementation of a regularization method and I introduce the code HiGPUs-R. Although the main application of this work concerns astrophysics, some of the presented techniques are of general validity and can be applied to other branches of physics such as electrodynamics and QCD.
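
    The classic O(N^2) accumulation underlying such codes is sketched below in CUDA with shared-memory tiling (in the spirit of the well-known GPU Gems 3 example); the softening parameter eps2 is an assumption, and the regularization of HiGPUs-R is not shown.

      // Hedged sketch: launch with <<<blocks, threads, threads*sizeof(float4)>>>.
      // pos[i].w stores the particle mass; eps2 softens the r -> 0 singularity.
      __global__ void nbody_forces(const float4* pos, float4* acc, int n, float eps2)
      {
          extern __shared__ float4 tile[];
          int i = blockIdx.x * blockDim.x + threadIdx.x;
          float4 pi = (i < n) ? pos[i] : make_float4(0, 0, 0, 0);
          float ax = 0.f, ay = 0.f, az = 0.f;
          for (int base = 0; base < n; base += blockDim.x) {
              int j = base + threadIdx.x;
              tile[threadIdx.x] = (j < n) ? pos[j] : make_float4(0, 0, 0, 0);
              __syncthreads();
              for (int k = 0; k < blockDim.x && base + k < n; ++k) {
                  float dx = tile[k].x - pi.x, dy = tile[k].y - pi.y,
                        dz = tile[k].z - pi.z;
                  float r2 = dx*dx + dy*dy + dz*dz + eps2;
                  float inv = rsqrtf(r2);
                  float w = tile[k].w * inv * inv * inv;   // m_j / r^3
                  ax += w * dx; ay += w * dy; az += w * dz;
              }
              __syncthreads();
          }
          if (i < n) acc[i] = make_float4(ax, ay, az, 0.f);
      }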

  5. Fast direct reconstruction strategy of dynamic fluorescence molecular tomography using graphics processing units

    Science.gov (United States)

    Chen, Maomao; Zhang, Jiulou; Cai, Chuangjian; Gao, Yang; Luo, Jianwen

    2016-06-01

    Dynamic fluorescence molecular tomography (DFMT) is a valuable method to evaluate the metabolic process of contrast agents in different organs in vivo, and direct reconstruction methods can improve the temporal resolution of DFMT. However, challenges still remain due to the large time consumption of the direct reconstruction methods. An acceleration strategy using graphics processing units (GPUs) is presented. The procedure of conjugate gradient optimization in the direct reconstruction method is programmed using the compute unified device architecture and then accelerated on the GPU. Numerical simulations and in vivo experiments are performed to validate the feasibility of the strategy. The results demonstrate that, compared with the traditional method, the proposed strategy can reduce the time consumption by ∼90% without degradation of quality.
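
    To make the accelerated step concrete, a hedged cuBLAS sketch of one conjugate-gradient iteration follows; applyA is a hypothetical device-side operator computing A*p (e.g., a GEMV), the vectors are assumed to live in GPU memory, and this is not the authors' implementation.

      #include <cublas_v2.h>

      void applyA(const float* p, float* Ap, int n);  // hypothetical: Ap = A * p

      // One CG iteration; rr carries r.r between calls (initialized by caller).
      void cg_iteration(cublasHandle_t h, int n, float* x, float* r,
                        float* p, float* Ap, float* rr)
      {
          applyA(p, Ap, n);
          float pAp;
          cublasSdot(h, n, p, 1, Ap, 1, &pAp);
          float alpha = *rr / pAp, nalpha = -alpha;
          cublasSaxpy(h, n, &alpha, p, 1, x, 1);    // x += alpha * p
          cublasSaxpy(h, n, &nalpha, Ap, 1, r, 1);  // r -= alpha * Ap
          float rrNew;
          cublasSdot(h, n, r, 1, r, 1, &rrNew);
          float beta = rrNew / *rr;
          *rr = rrNew;
          cublasSscal(h, n, &beta, p, 1);           // p = r + beta * p
          const float one = 1.0f;
          cublasSaxpy(h, n, &one, r, 1, p, 1);
      }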

  6. Acceleration of Early-Photon Fluorescence Molecular Tomography with Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    Xin Wang

    2013-01-01

    Fluorescence molecular tomography (FMT) with early photons can improve the spatial resolution and fidelity of the reconstructed results. However, its computational scale is always large, which limits its applications. In this paper, we introduce an acceleration strategy for early-photon FMT with graphics processing units (GPUs). According to the procedure, the whole solution of FMT was divided into several modules and the time consumption of each module was studied. In this strategy, the two most time-consuming modules (the Gd and W modules) were accelerated on the GPU, while the other modules remained coded in Matlab. Several simulation studies with a heterogeneous digital mouse atlas were performed to confirm the performance of the acceleration strategy. The results confirmed the feasibility of the strategy and showed that the processing speed was improved significantly.

  7. Speedup for quantum optimal control from automatic differentiation based on graphics processing units

    Science.gov (United States)

    Leung, Nelson; Abdelhafez, Mohamed; Koch, Jens; Schuster, David

    2017-04-01

    We implement a quantum optimal control algorithm based on automatic differentiation and harness the acceleration afforded by graphics processing units (GPUs). Automatic differentiation allows us to specify advanced optimization criteria and incorporate them in the optimization process with ease. We show that the use of GPUs can speed up calculations by more than an order of magnitude. Our strategy facilitates efficient numerical simulations on affordable desktop computers and exploration of a host of optimization constraints and system parameters relevant to real-life experiments. We demonstrate optimization of quantum evolution based on fine-grained evaluation of performance at each intermediate time step, thus enabling more intricate control on the evolution path, suppression of departures from the truncated model subspace, as well as minimization of the physical time needed to perform high-fidelity state preparation and unitary gates.

  8. Processing-in-Memory Enabled Graphics Processors for 3D Rendering

    Energy Technology Data Exchange (ETDEWEB)

    Xie, Chenhao; Song, Shuaiwen; Wang, Jing; Zhang, Weigong; Fu, Xin

    2017-02-06

    The performance of 3D rendering on a Graphics Processing Unit, which converts 3D vector streams into 2D frames with 3D image effects, significantly impacts users' gaming experience on modern computer systems. Due to the high texture throughput in 3D rendering, main memory bandwidth becomes a critical obstacle for improving the overall rendering performance. 3D stacked memory systems such as the Hybrid Memory Cube (HMC) provide opportunities to significantly overcome the memory wall by directly connecting logic controllers to DRAM dies. Based on the observation that texel fetches significantly impact off-chip memory traffic, we propose two architectural designs to enable Processing-In-Memory based GPUs for efficient 3D rendering.

  9. Advanced Investigation and Comparative Study of Graphics Processing Unit-queries Countered

    Directory of Open Access Journals (Sweden)

    A. Baskar

    2014-10-01

    GPU, the Graphics Processing Unit, is the buzzword ruling the market these days. What it is and how it has gained such importance is what this research work sets out to answer. The study has been constructed with full attention paid to answering the following questions: What is a GPU? How is it different from a CPU? How does it compare to a CPU computationally? Can the GPU replace the CPU, or is that a daydream? How significant is the arrival of the APU (Accelerated Processing Unit) in the market? What tools are needed to make a GPU work? What are the improvement/focus areas for the GPU to stand in the market? All of the above questions are discussed and answered in this study with relevant explanations.

  10. CUDASW++: optimizing Smith-Waterman sequence database searches for CUDA-enabled graphics processing units.

    Science.gov (United States)

    Liu, Yongchao; Maskell, Douglas L; Schmidt, Bertil

    2009-05-06

    The Smith-Waterman algorithm is one of the most widely used tools for searching biological sequence databases due to its high sensitivity. Unfortunately, the Smith-Waterman algorithm is computationally demanding, which is further compounded by the exponential growth of sequence databases. The recent emergence of many-core architectures, and their associated programming interfaces, provides an opportunity to accelerate sequence database searches using commonly available and inexpensive hardware. Our CUDASW++ implementation (benchmarked on a single-GPU NVIDIA GeForce GTX 280 graphics card and a dual-GPU GeForce GTX 295 graphics card) provides a significant performance improvement compared to other publicly available implementations, such as SWPS3, CBESW, SW-CUDA, and NCBI-BLAST. CUDASW++ supports query sequences of length up to 59K and for query sequences ranging in length from 144 to 5,478 in Swiss-Prot release 56.6, the single-GPU version achieves an average performance of 9.509 GCUPS with a lowest performance of 9.039 GCUPS and a highest performance of 9.660 GCUPS, and the dual-GPU version achieves an average performance of 14.484 GCUPS with a lowest performance of 10.660 GCUPS and a highest performance of 16.087 GCUPS. CUDASW++ is publicly available open-source software. It provides a significant performance improvement for Smith-Waterman-based protein sequence database searches by fully exploiting the compute capability of commonly used CUDA-enabled low-cost GPUs.
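
    The parallelism such implementations exploit comes from the anti-diagonal independence of the dynamic-programming matrix; the hedged CUDA sketch below scores one diagonal per kernel launch with a linear gap penalty and placeholder match/mismatch scores, which is far simpler than the actual CUDASW++ memory layout.

      // Hedged sketch: H is a zero-initialized (m+1)x(n+1) score matrix; the
      // host launches this kernel once per anti-diagonal d = 2 .. m+n. Cells
      // (i, j = d - i) depend only on diagonals d-1 and d-2.
      __global__ void sw_diagonal(const char* q, const char* s, int m, int n,
                                  int d, int* H, int gap)
      {
          int i = blockIdx.x * blockDim.x + threadIdx.x + max(1, d - n);
          if (i > min(m, d - 1)) return;
          int j = d - i;
          int sub = (q[i - 1] == s[j - 1]) ? 2 : -1;   // placeholder scores
          int w = n + 1;                                // row stride of H
          int best = H[(i - 1) * w + (j - 1)] + sub;    // diagonal move
          best = max(best, H[(i - 1) * w + j] - gap);   // gap in s
          best = max(best, H[i * w + (j - 1)] - gap);   // gap in q
          H[i * w + j] = max(best, 0);                  // local-alignment floor
      }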

  11. CUDASW++: optimizing Smith-Waterman sequence database searches for CUDA-enabled graphics processing units

    Directory of Open Access Journals (Sweden)

    Maskell Douglas L

    2009-05-01

    Background: The Smith-Waterman algorithm is one of the most widely used tools for searching biological sequence databases due to its high sensitivity. Unfortunately, the Smith-Waterman algorithm is computationally demanding, which is further compounded by the exponential growth of sequence databases. The recent emergence of many-core architectures, and their associated programming interfaces, provides an opportunity to accelerate sequence database searches using commonly available and inexpensive hardware. Findings: Our CUDASW++ implementation (benchmarked on a single-GPU NVIDIA GeForce GTX 280 graphics card and a dual-GPU GeForce GTX 295 graphics card) provides a significant performance improvement compared to other publicly available implementations, such as SWPS3, CBESW, SW-CUDA, and NCBI-BLAST. CUDASW++ supports query sequences of length up to 59K and for query sequences ranging in length from 144 to 5,478 in Swiss-Prot release 56.6, the single-GPU version achieves an average performance of 9.509 GCUPS with a lowest performance of 9.039 GCUPS and a highest performance of 9.660 GCUPS, and the dual-GPU version achieves an average performance of 14.484 GCUPS with a lowest performance of 10.660 GCUPS and a highest performance of 16.087 GCUPS. Conclusion: CUDASW++ is publicly available open-source software. It provides a significant performance improvement for Smith-Waterman-based protein sequence database searches by fully exploiting the compute capability of commonly used CUDA-enabled low-cost GPUs.

  12. GPUmotif: an ultra-fast and energy-efficient motif analysis program using graphics processing units.

    Directory of Open Access Journals (Sweden)

    Pooya Zandevakili

    Computational detection of TF binding patterns has become an indispensable tool in functional genomics research. With the rapid advance of new sequencing technologies, large amounts of protein-DNA interaction data have been produced. Analyzing this data can provide substantial insight into the mechanisms of transcriptional regulation. However, the massive amount of sequence data presents daunting challenges. In our previous work, we have developed a novel algorithm called Hybrid Motif Sampler (HMS) that enables more scalable and accurate motif analysis. Despite much improvement, HMS is still time-consuming due to the requirement to calculate matching probabilities position-by-position. Using the NVIDIA CUDA toolkit, we developed a graphics processing unit (GPU)-accelerated motif analysis program named GPUmotif. We proposed a "fragmentation" technique to hide data transfer time between memories. Performance comparison studies showed that commonly-used model-based motif scan and de novo motif finding procedures such as HMS can be dramatically accelerated when running GPUmotif on NVIDIA graphics cards. As a result, energy consumption can also be greatly reduced when running motif analysis using GPUmotif. The GPUmotif program is freely available at http://sourceforge.net/projects/gpumotif/

  13. Massively Parallel Signal Processing using the Graphics Processing Unit for Real-Time Brain-Computer Interface Feature Extraction.

    Science.gov (United States)

    Wilson, J Adam; Williams, Justin C

    2009-01-01

    The clock speeds of modern computer processors have nearly plateaued in the past 5 years. Consequently, neural prosthetic systems that rely on processing large quantities of data in a short period of time face a bottleneck, in that it may not be possible to process all of the data recorded from an electrode array with high channel counts and bandwidth, such as electrocorticographic grids or other implantable systems. Therefore, in this study a method of using the processing capabilities of a graphics card [graphics processing unit (GPU)] was developed for real-time neural signal processing of a brain-computer interface (BCI). The NVIDIA CUDA system was used to offload processing to the GPU, which is capable of running many operations in parallel, potentially greatly increasing the speed of existing algorithms. The BCI system records many channels of data, which are processed and translated into a control signal, such as the movement of a computer cursor. This signal processing chain involves computing a matrix-matrix multiplication (i.e., a spatial filter), followed by calculating the power spectral density on every channel using an auto-regressive method, and finally classifying appropriate features for control. In this study, the first two computationally intensive steps were implemented on the GPU, and the speed was compared to both the current implementation and a central processing unit-based implementation that uses multi-threading. Significant performance gains were obtained with GPU processing: the current implementation processed 1000 channels of 250 ms of data in 933 ms, while the new GPU method took only 27 ms, an improvement of nearly 35 times.
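
    The first of those two steps, the spatial filter, is just a dense matrix product, so on the GPU it maps to a single GEMM; a hedged cuBLAS sketch (column-major device arrays assumed, all names hypothetical) is:

      #include <cublas_v2.h>

      // Y (filters x samples) = W (filters x channels) * X (channels x samples).
      // The auto-regressive spectral step and classification are not shown.
      void spatial_filter(cublasHandle_t h, const float* dW, const float* dX,
                          float* dY, int filters, int channels, int samples)
      {
          const float one = 1.0f, zero = 0.0f;
          cublasSgemm(h, CUBLAS_OP_N, CUBLAS_OP_N,
                      filters, samples, channels,
                      &one, dW, filters,
                      dX, channels,
                      &zero, dY, filters);
      }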

  14. [Applying graphics processing unit in real-time signal processing and visualization of ophthalmic Fourier-domain OCT system].

    Science.gov (United States)

    Liu, Qiaoyan; Li, Yuejie; Xu, Qiujing; Zhao, Jincheng; Wang, Liwei; Gao, Yonghe

    2013-01-01

    This investigation introduces GPU (Graphics Processing Unit)-based CUDA (Compute Unified Device Architecture) technology into the signal processing of an ophthalmic FD-OCT (Fourier-Domain Optical Coherence Tomography) imaging system to realize parallel data processing, using CUDA to optimize the relevant operations and algorithms in order to resolve the technical bottlenecks that currently limit real-time ophthalmic OCT imaging. Laboratory results showed that, with the GPU acting as a general parallel computing processor, data processing in the GPU+CPU mode is dozens of times faster than traditional CPU-based serial computing and imaging for the same data, which meets the clinical requirements for two-dimensional real-time imaging.

  15. Real-time display on Fourier domain optical coherence tomography system using a graphics processing unit.

    Science.gov (United States)

    Watanabe, Yuuki; Itagaki, Toshiki

    2009-01-01

    Fourier domain optical coherence tomography (FD-OCT) requires resampling of spectrally resolved depth information from wavelength to wave number, and the subsequent application of the inverse Fourier transform. The display rates of OCT images are much slower than the image acquisition rates due to processing speed limitations on most computers. We demonstrate a real-time display of processed OCT images using a linear-in-wave-number (linear-k) spectrometer and a graphics processing unit (GPU). We use the linear-k spectrometer with the combination of a diffractive grating with 1200 lines/mm and an F2 equilateral prism in the 840-nm spectral region to avoid calculating the resampling process. The calculations of the fast Fourier transform (FFT) are accelerated by the GPU with many stream processors, which realizes highly parallel processing. A display rate of 27.9 frames/sec for processed images (2048 FFT size x 1000 lateral A-scans) is achieved in our OCT system using a line scan CCD camera operated at 27.9 kHz.
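
    Because the linear-k spectrometer removes the resampling step, the GPU work reduces essentially to batched FFTs; a hedged cuFFT sketch matching the sizes quoted above (2048-point transforms, 1000 A-scans; names hypothetical) is:

      #include <cufft.h>

      // Transform 1000 spectral lines of 2048 samples each, in place.
      void transform_ascans(cufftComplex* dSpectra)
      {
          cufftHandle plan;
          cufftPlan1d(&plan, 2048, CUFFT_C2C, 1000);   // batch of 1000 lines
          cufftExecC2C(plan, dSpectra, dSpectra, CUFFT_INVERSE);
          cufftDestroy(plan);
      }

    In a streaming system the plan would of course be created once and reused for every frame rather than rebuilt per call.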

  16. Processing Cost Analysis for Biomass Feedstocks

    Energy Technology Data Exchange (ETDEWEB)

    Badger, P.C.

    2002-11-20

    The receiving, handling, storing, and processing of woody biomass feedstocks is an overlooked component of biopower systems. The purpose of this study was twofold: (1) to identify and characterize all the receiving, handling, storing, and processing steps required to make woody biomass feedstocks suitable for use in direct combustion and gasification applications, including small modular biopower (SMB) systems, and (2) to estimate the capital and operating costs at each step. Since biopower applications can be varied, a number of conversion systems and feedstocks required evaluation. In addition to limiting this study to woody biomass feedstocks, the boundaries of this study were from the power plant gate to the feedstock entry point into the conversion device. Although some power plants are sited at a source of wood waste fuel, it was assumed for this study that all wood waste would be brought to the power plant site. This study was also confined to the following three feedstocks (1) forest residues, (2) industrial mill residues, and (3) urban wood residues. Additionally, the study was confined to grate, suspension, and fluidized bed direct combustion systems; gasification systems; and SMB conversion systems. Since scale can play an important role in types of equipment, operational requirements, and capital and operational costs, this study examined these factors for the following direct combustion and gasification system size ranges: 50, 20, 5, and 1 MWe. The scope of the study also included: Specific operational issues associated with specific feedstocks (e.g., bark and problems with bridging); Opportunities for reducing handling, storage, and processing costs; How environmental restrictions can affect handling and processing costs (e.g., noise, commingling of treated wood or non-wood materials, emissions, and runoff); and Feedstock quality issues and/or requirements (e.g., moisture, particle size, presence of non-wood materials). The study found that over the

  17. Real-Time Computation of Parameter Fitting and Image Reconstruction Using Graphical Processing Units

    CERN Document Server

    Locans, Uldis; Suter, Andreas; Fischer, Jannis; Lustermann, Werner; Dissertori, Gunther; Wang, Qiulin

    2016-01-01

    In recent years graphical processing units (GPUs) have become a powerful tool in scientific computing. Their potential to speed up highly parallel applications brings the power of high performance computing to a wider range of users. However, programming these devices and integrating their use in existing applications is still a challenging task. In this paper we examined the potential of GPUs for two different applications. The first application, created at Paul Scherrer Institut (PSI), is used for parameter fitting during data analysis of muSR (muon spin rotation, relaxation and resonance) experiments. The second application, developed at ETH, is used for PET (Positron Emission Tomography) image reconstruction and analysis. Applications currently in use were examined to identify parts of the algorithms in need of optimization. Efficient GPU kernels were created in order to allow applications to use a GPU, to speed up the previously identified parts. Benchmarking tests were performed in order to measure the ...

  18. Quantum Chemistry for Solvated Molecules on Graphical Processing Units (GPUs) using Polarizable Continuum Models

    CERN Document Server

    Liu, Fang; Kulik, Heather J; Martínez, Todd J

    2015-01-01

    The conductor-like polarization model (C-PCM) with switching/Gaussian smooth discretization is a widely used implicit solvation model in chemical simulations. However, its application in quantum mechanical calculations of large-scale biomolecular systems can be limited by computational expense of both the gas phase electronic structure and the solvation interaction. We have previously used graphical processing units (GPUs) to accelerate the first of these steps. Here, we extend the use of GPUs to accelerate electronic structure calculations including C-PCM solvation. Implementation on the GPU leads to significant acceleration of the generation of the required integrals for C-PCM. We further propose two strategies to improve the solution of the required linear equations: a dynamic convergence threshold and a randomized block-Jacobi preconditioner. These strategies are not specific to GPUs and are expected to be beneficial for both CPU and GPU implementations. We benchmark the performance of the new implementat...

  19. AN APPROACH TO EFFICIENT FEM SIMULATIONS ON GRAPHICS PROCESSING UNITS USING CUDA

    Directory of Open Access Journals (Sweden)

    Björn Nutti

    2014-04-01

    The paper presents a highly efficient way of simulating the dynamic behavior of deformable objects by means of the finite element method (FEM), with computations performed on Graphics Processing Units (GPUs). The presented implementation reduces bottlenecks related to memory accesses by grouping the necessary data per node pair, in contrast to the classical per-element arrangement; this strategy avoids memory access patterns that are not suitable for the GPU memory architecture. Furthermore, the presented implementation takes advantage of the underlying sparse-block-matrix structure, and it has been demonstrated how to avoid potential bottlenecks in the algorithm. To achieve plausible deformational behavior for large local rotations, the objects are modeled by means of a simplified co-rotational FEM formulation.

  20. ASAMgpu V1.0 - a moist fully compressible atmospheric model using graphics processing units (GPUs)

    Science.gov (United States)

    Horn, S.

    2012-03-01

    In this work the three-dimensional compressible moist atmospheric model ASAMgpu is presented. The calculations are done using graphics processing units (GPUs). To ensure platform independence, OpenGL and GLSL are used, so that the model runs on any hardware supporting fragment shaders. The MPICH2 library enables interprocess communication, allowing the usage of more than one GPU through domain decomposition. Time integration is done with an explicit three-step Runge-Kutta scheme with a time-splitting algorithm for the acoustic waves. Results for four test cases are shown in this paper: a rising dry heat bubble, a cold-bubble-induced density flow, a rising moist heat bubble in a saturated environment, and a DYCOMS-II case.

  1. Parallel multigrid solver of radiative transfer equation for photon transport via graphics processing unit

    Science.gov (United States)

    Gao, Hao; Phan, Lan; Lin, Yuting

    2012-09-01

    A graphics processing unit-based parallel multigrid solver for a radiative transfer equation with vacuum boundary condition or reflection boundary condition is presented for heterogeneous media with complex geometry based on two-dimensional triangular meshes or three-dimensional tetrahedral meshes. The computational complexity of this parallel solver is linearly proportional to the degrees of freedom in both angular and spatial variables, while the full multigrid method is utilized to minimize the number of iterations. The overall gain in speed is roughly 30- to 300-fold with respect to our prior multigrid solver, depending on the underlying regime and the parallelization. The numerical validations are presented with the MATLAB codes at https://sites.google.com/site/rtefastsolver/.

  2. Accelerating Image Reconstruction in Three-Dimensional Optoacoustic Tomography on Graphics Processing Units

    CERN Document Server

    Wang, Kun; Kao, Yu-Jiun; Chou, Cheng-Ying; Oraevsky, Alexander A; Anastasio, Mark A; doi:10.1118/1.4774361

    2013-01-01

    Purpose: Optoacoustic tomography (OAT) is inherently a three-dimensional (3D) inverse problem. However, most studies of OAT image reconstruction still employ two-dimensional (2D) imaging models. One important reason is because 3D image reconstruction is computationally burdensome. The aim of this work is to accelerate existing image reconstruction algorithms for 3D OAT by use of parallel programming techniques. Methods: Parallelization strategies are proposed to accelerate a filtered backprojection (FBP) algorithm and two different pairs of projection/backprojection operations that correspond to two different numerical imaging models. The algorithms are designed to fully exploit the parallel computing power of graphic processing units (GPUs). In order to evaluate the parallelization strategies for the projection/backprojection pairs, an iterative image reconstruction algorithm is implemented. Computer-simulation and experimental studies are conducted to investigate the computational efficiency and numerical a...
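
    For orientation, the core of an FBP-style reconstruction parallelizes as a delay-and-sum over voxels; the hedged CUDA sketch below assigns one thread per voxel and omits the filtering and solid-angle weighting of the published algorithm (all names are assumptions).

      // sig: ndet x nt detector traces; det/vox: positions; c: speed of sound.
      __global__ void backproject(const float* sig, int ndet, int nt, float dt,
                                  const float3* det, const float3* vox,
                                  float* img, int nvox, float c)
      {
          int v = blockIdx.x * blockDim.x + threadIdx.x;
          if (v >= nvox) return;
          float acc = 0.f;
          for (int d = 0; d < ndet; ++d) {
              float dx = vox[v].x - det[d].x, dy = vox[v].y - det[d].y,
                    dz = vox[v].z - det[d].z;
              // sample index at the acoustic time of flight
              int t = (int)(sqrtf(dx*dx + dy*dy + dz*dz) / (c * dt));
              if (t < nt) acc += sig[d * nt + t];
          }
          img[v] = acc;
      }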

  3. Acceleration of the OpenFOAM-based MHD solver using graphics processing units

    Energy Technology Data Exchange (ETDEWEB)

    He, Qingyun; Chen, Hongli, E-mail: hlchen1@ustc.edu.cn; Feng, Jingchao

    2015-12-15

    Highlights: • A 3D PISO-MHD solver was implemented on Kepler-class graphics processing units (GPUs) using CUDA technology. • A consistent and conservative scheme is used in the code, which was validated by three basic benchmarks in rectangular and round ducts. • Parallel CPU and GPU acceleration were compared against a single-core CPU for MHD and non-MHD problems. • Different preconditioners for the MHD solver were compared, and the results showed that the AMG method is better for these calculations. - Abstract: The pressure-implicit with splitting of operators (PISO) magnetohydrodynamics (MHD) solver for the coupled Navier–Stokes and Maxwell equations was implemented on Kepler-class graphics processing units (GPUs) using the CUDA technology. The solver is developed on the open source code OpenFOAM based on a consistent and conservative scheme which is suitable for simulating MHD flow under strong magnetic field in a fusion liquid metal blanket with structured or unstructured meshes. We verified the validity of the implementation on several standard cases, including benchmark I (the Shercliff and Hunt cases), benchmark II (fully developed circular pipe MHD flow) and benchmark III (the KIT experimental case). Computational performance of the GPU implementation was examined by comparing its double precision run times with those of essentially the same algorithms and meshes on the CPU. The results showed that a GPU (GTX 770) can outperform a server-class 4-core, 8-thread CPU (Intel Core i7-4770k) by a factor of at least 2.

  4. Applying a visual language for image processing as a graphical teaching tool in medical imaging

    Science.gov (United States)

    Birchman, James J.; Tanimoto, Steven L.; Rowberg, Alan H.; Choi, Hyung-Sik; Kim, Yongmin

    1992-05-01

    Typical user interaction in image processing is with command line entries, pull-down menus, or text menu selections from a list, and as such is not generally graphical in nature. Although applying these interactive methods to construct more sophisticated algorithms from a series of simple image processing steps may be clear to engineers and programmers, it may not be clear to clinicians. A solution to this problem is to implement a visual programming language using visual representations to express image processing algorithms. Visual representations promote a more natural and rapid understanding of image processing algorithms by providing more visual insight into what the algorithms do than the interactive methods mentioned above can provide. Individuals accustomed to dealing with images will be more likely to understand an algorithm that is represented visually. This is especially true of referring physicians, such as surgeons in an intensive care unit. With the increasing acceptance of picture archiving and communications system (PACS) workstations and the trend toward increasing clinical use of image processing, referring physicians will need to learn more sophisticated concepts than simply image access and display. If the procedures that they perform commonly, such as window width and window level adjustment and image enhancement using unsharp masking, are depicted visually in an interactive environment, it will be easier for them to learn and apply these concepts. The software described in this paper is a visual programming language for image processing which has been implemented on the NeXT computer using NeXTstep user interface development tools and other tools in an object-oriented environment. The concept is based upon the description of a visual language titled 'Visualization of Vision Algorithms' (VIVA). Iconic representations of simple image processing steps are placed into a workbench screen and connected together into a dataflow path by the user. As

  5. Real-time resampling in Fourier domain optical coherence tomography using a graphics processing unit.

    Science.gov (United States)

    Van der Jeught, Sam; Bradu, Adrian; Podoleanu, Adrian Gh

    2010-01-01

    Fourier domain optical coherence tomography (FD-OCT) requires either a linear-in-wavenumber spectrometer or a computationally heavy software algorithm to recalibrate the acquired optical signal from wavelength to wavenumber. The first method is sensitive to the position of the prism in the spectrometer, while the second method drastically slows down the system speed when it is implemented on a serially oriented central processing unit. We implement the full resampling process on a commercial graphics processing unit (GPU), distributing the necessary calculations to many stream processors that operate in parallel. A comparison between several recalibration methods is made in terms of performance and image quality. The GPU is also used to accelerate the fast Fourier transform (FFT) and to remove the background noise, thereby achieving full GPU-based signal processing without the need for extra resampling hardware. A display rate of 25 frames/sec is achieved for processed images (1,024 x 1,024 pixels) using a line-scan charge-coupled device (CCD) camera operating at 25.6 kHz.

  6. Clinical process analysis and activity-based costing at a heart center.

    Science.gov (United States)

    Ridderstolpe, Lisa; Johansson, Andreas; Skau, Tommy; Rutberg, Hans; Ahlfeldt, Hans

    2002-08-01

    Cost studies, productivity, efficiency, and quality of care measures, the links between resources and patient outcomes, are fundamental issues for hospital management today. This paper describes the implementation of a model for process analysis and activity-based costing (ABC)/management at a Heart Center in Sweden as a tool for administrative cost information, strategic decision-making, quality improvement, and cost reduction. A commercial software package (QPR) containing two interrelated parts, "ProcessGuide and CostControl," was used. All processes at the Heart Center were mapped and graphically outlined. Processes and activities such as health care procedures, research, and education were identified together with their causal relationship to costs and products/services. The construction of the ABC model in CostControl was time-consuming. However, after the ABC/management system was created, it opened the way for new possibilities including process and activity analysis, simulation, and price calculations. Cost analysis showed large variations in the cost obtained for individual patients undergoing coronary artery bypass grafting (CABG) surgery. We conclude that a process-based costing system is applicable and has the potential to be useful in hospital management.

  7. [Influence of the recording interval and a graphic organizer on the writing process/product and on other psychological variables].

    Science.gov (United States)

    García Sánchez, Jesús N; Rodríguez Pérez, Celestino

    2007-05-01

    An experimental study of the influence of the recording interval and a graphic organizer on the processes of writing composition and on the final product is presented. We studied 326 participants, aged 10 to 16 years, by means of a nested design. Two groups were compared: one group was aided in the writing process with a graphic organizer and the other was not. Each group was subdivided into two further groups: one with a mean recording interval of 45 seconds and the other with a recording interval of approximately 90 seconds in a writing log. The results showed that the group aided by the graphic organizer obtained better results in both the processes and the written product, and that the groups assessed with an average interval of 45 seconds obtained worse results. Implications for educational practice are discussed, and limitations and future perspectives are commented on.

  8. Real-time blood flow visualization using the graphics processing unit.

    Science.gov (United States)

    Yang, Owen; Cuccia, David; Choi, Bernard

    2011-01-01

    Laser speckle imaging (LSI) is a technique in which coherent light incident on a surface produces a reflected speckle pattern that is related to the underlying movement of optical scatterers, such as red blood cells, indicating blood flow. Image-processing algorithms can be applied to produce speckle flow index (SFI) maps of relative blood flow. We present a novel algorithm that employs the NVIDIA Compute Unified Device Architecture (CUDA) platform to perform laser speckle image processing on the graphics processing unit. Software written in C was integrated with CUDA and integrated into a LabVIEW Virtual Instrument (VI) that is interfaced with a monochrome CCD camera able to acquire high-resolution raw speckle images at nearly 10 fps. With the CUDA code integrated into the LabVIEW VI, the processing and display of SFI images were performed also at ∼10 fps. We present three video examples depicting real-time flow imaging during a reactive hyperemia maneuver, with fluid flow through an in vitro phantom, and a demonstration of real-time LSI during laser surgery of a port wine stain birthmark.
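
    A hedged sketch of the per-pixel computation behind such SFI maps follows: local speckle contrast K = sigma/mean over a small window, with SFI taken here as 1/K^2 (one common convention; the cited work's exact formula may differ, and no shared-memory optimization is shown).

      // One thread per pixel; window half-width R (e.g., R = 3 for 7x7).
      __global__ void speckle_sfi(const float* I, float* sfi, int w, int h, int R)
      {
          int x = blockIdx.x * blockDim.x + threadIdx.x;
          int y = blockIdx.y * blockDim.y + threadIdx.y;
          if (x < R || y < R || x >= w - R || y >= h - R) return;
          float sum = 0.f, sum2 = 0.f;
          int n = (2 * R + 1) * (2 * R + 1);
          for (int j = -R; j <= R; ++j)
              for (int i = -R; i <= R; ++i) {
                  float v = I[(y + j) * w + (x + i)];
                  sum += v; sum2 += v * v;
              }
          float mean = sum / n;
          float var  = sum2 / n - mean * mean;    // window variance
          sfi[y * w + x] = mean * mean / fmaxf(var, 1e-12f);   // 1/K^2
      }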

  9. Real-time speckle variance swept-source optical coherence tomography using a graphics processing unit

    Science.gov (United States)

    Lee, Kenneth K. C.; Mariampillai, Adrian; Yu, Joe X. Z.; Cadotte, David W.; Wilson, Brian C.; Standish, Beau A.; Yang, Victor X. D.

    2012-01-01

    Abstract: Advances in swept source laser technology continue to increase the imaging speed of swept-source optical coherence tomography (SS-OCT) systems. These fast imaging speeds are ideal for microvascular detection schemes, such as speckle variance (SV), where interframe motion can cause severe imaging artifacts and loss of vascular contrast. However, full utilization of the laser scan speed has been hindered by the computationally intensive signal processing required by SS-OCT and SV calculations. Using a commercial graphics processing unit that has been optimized for parallel data processing, we report a complete high-speed SS-OCT platform capable of real-time data acquisition, processing, display, and saving at 108,000 lines per second. Subpixel image registration of structural images was performed in real-time prior to SV calculations in order to reduce decorrelation from stationary structures induced by the bulk tissue motion. The viability of the system was successfully demonstrated in a high bulk tissue motion scenario of human fingernail root imaging where SV images (512 × 512 pixels, n = 4) were displayed at 54 frames per second. PMID:22808428

  10. Process industries - graphic arts, paint, plastics, and textiles: all cousins under the skin

    Science.gov (United States)

    Simon, Frederick T.

    2002-06-01

    The origin and selection of colors in the process industries differ depending upon how the creative process is applied and what the capabilities of the manufacturing process are. The fashion industry (clothing), with its supplier industry of textiles, is the leader in color innovation. Color may be introduced into textile products at several stages in the manufacturing process, from fiber through yarn and finally into fabric. The paint industry is divided into two major applications: automotive and trade sales. Automotive colors are selected by stylists who are in the employ of the automobile manufacturers. Trade sales paint, on the other hand, can be decided by paint manufacturers or by individuals who patronize custom mixing facilities. Plastics colors are for the most part decided by the industrial designers who include color as part of the design. Graphic arts (printing) is a burgeoning industry that uses color in image reproduction and package design. Except for text, printed material in color today has become the norm rather than the exception.

  11. Low-cost Graphical User Interface Regression Test Framework

    Institute of Scientific and Technical Information of China (English)

    华涛; 李红红; 李来祥

    2011-01-01

    Graphical User Interfaces (GUIs) are created with rapid prototyping and have characteristics that distinguish them from traditional software, so test techniques for traditional software cannot be applied to GUIs directly. This paper analyses the interactions between GUI events, investigates why some event interactions can lead to defects, and proposes a low-cost Event Interaction Graph (EIG) based automated GUI regression test framework and a corresponding regression test process, which is used to provide the best combination of defect detection rate and cost.

  12. Optical diagnostics of a single evaporating droplet using fast parallel computing on graphics processing units

    Science.gov (United States)

    Jakubczyk, D.; Migacz, S.; Derkachov, G.; Woźniak, M.; Archer, J.; Kolwas, K.

    2016-09-01

    We report on the first application of graphics processing unit (GPU) accelerated computing technology to improve the performance of numerical methods used for the optical characterization of evaporating microdroplets. Single microdroplets of various liquids with different volatility and molecular weight (glycerine, glycols, water, etc.), as well as mixtures of liquids and diverse suspensions, evaporate inside the electrodynamic trap under a chosen temperature and composition of the atmosphere. The series of scattering patterns recorded from the evaporating microdroplets are processed by fitting complete Mie theory predictions with a gradientless lookup-table method. We show that computations on GPUs can be effectively applied to inverse scattering problems. In particular, our technique accelerated calculations of Mie scattering theory more than 800 times compared with a single-core processor running in a Matlab environment, and almost 100 times compared with the corresponding code in C. Additionally, we overcame the problem of time-consuming data post-processing when some of the parameters of an investigated liquid (particularly the refractive index) are uncertain. Our program allows us to track the parameters characterizing the evaporating droplet nearly simultaneously with the progress of evaporation.

  13. Fast computation of MadGraph amplitudes on graphics processing unit (GPU)

    CERN Document Server

    Hagiwara, K; Li, Q; Okamura, N; Stelzer, T

    2013-01-01

    Continuing our previous studies on QED and QCD processes, we use the graphics processing unit (GPU) for fast calculations of helicity amplitudes for general Standard Model (SM) processes. Additional HEGET codes to handle all SM interactions are introduced, as well as the program MG2CUDA that converts arbitrary MadGraph generated HELAS amplitudes (FORTRAN) into HEGET codes in CUDA. We test all the codes by comparing amplitudes and cross sections for multi-jet processes at the LHC associated with production of single and double weak bosons, a top-quark pair, Higgs boson plus a weak boson or a top-quark pair, and multiple Higgs bosons via weak-boson fusion, where all the heavy particles are allowed to decay into light quarks and leptons with full spin correlations. All the helicity amplitudes computed by HEGET are found to agree with those computed by HELAS within the expected numerical accuracy, and the cross sections obtained by gBASES, a GPU version of the Monte Carlo integration program, agree with those obt...

  14. Parallel particle swarm optimization on a graphics processing unit with application to trajectory optimization

    Science.gov (United States)

    Wu, Q.; Xiong, F.; Wang, F.; Xiong, Y.

    2016-10-01

    In order to reduce the computational time, a fully parallel implementation of the particle swarm optimization (PSO) algorithm on a graphics processing unit (GPU) is presented. Instead of being executed on the central processing unit (CPU) sequentially, PSO is executed in parallel via the GPU on the compute unified device architecture (CUDA) platform. The processes of fitness evaluation, updating of velocity and position of all particles are all parallelized and introduced in detail. Comparative studies on the optimization of four benchmark functions and a trajectory optimization problem are conducted by running PSO on the GPU (GPU-PSO) and CPU (CPU-PSO). The impact of design dimension, number of particles and size of the thread-block in the GPU and their interactions on the computational time is investigated. The results show that the computational time of the developed GPU-PSO is much shorter than that of CPU-PSO, with comparable accuracy, which demonstrates the remarkable speed-up capability of GPU-PSO.
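
    For reference, the canonical PSO update that gets parallelized is identical for every particle, which is what makes a one-thread-per-particle GPU mapping natural. A minimal NumPy sketch of one synchronous update over all particles at once, with illustrative coefficient values rather than those used in the paper:

        import numpy as np

        def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=None):
            # x, v, pbest: (num_particles, dim) arrays; gbest: (dim,).
            # Vectorizing over the particle axis mirrors the parallel
            # velocity/position update described in the paper.
            rng = rng or np.random.default_rng()
            r1, r2 = rng.random(x.shape), rng.random(x.shape)
            v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
            return x + v, v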

  15. Performance and scalability of Fourier domain optical coherence tomography acceleration using graphics processing units.

    Science.gov (United States)

    Li, Jian; Bloch, Pavel; Xu, Jing; Sarunic, Marinko V; Shannon, Lesley

    2011-05-01

    Fourier domain optical coherence tomography (FD-OCT) provides faster line rates, better resolution, and higher sensitivity for noninvasive, in vivo biomedical imaging compared to traditional time domain OCT (TD-OCT). However, because the signal processing for FD-OCT is computationally intensive, real-time FD-OCT applications demand powerful computing platforms to deliver acceptable performance. Graphics processing units (GPUs) have been used as coprocessors to accelerate FD-OCT by leveraging their relatively simple programming model to exploit thread-level parallelism. Unfortunately, GPUs do not "share" memory with their host processors, requiring additional data transfers between the GPU and CPU. In this paper, we implement a complete FD-OCT accelerator on a consumer grade GPU/CPU platform. Our data acquisition system uses spectrometer-based detection and a dual-arm interferometer topology with numerical dispersion compensation for retinal imaging. We demonstrate that the maximum line rate is dictated by the memory transfer time and not the processing time due to the GPU platform's memory model. Finally, we discuss how the performance trends of GPU-based accelerators compare to the expected future requirements of FD-OCT data rates.
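
    The computational core of FD-OCT is an FFT applied to every acquired spectrum. A minimal per-A-scan sketch in NumPy, assuming spectra already resampled to be linear in wavenumber and omitting the dispersion compensation and GPU memory-transfer machinery the paper analyzes:

        import numpy as np

        def process_ascans(spectra):
            # spectra: (num_ascans, n) raw spectrometer lines, linear in k.
            n = spectra.shape[1]
            s = (spectra - spectra.mean(axis=0)) * np.hanning(n)  # fixed-pattern removal + apodization
            depth = np.fft.fft(s, axis=1)[:, : n // 2]            # keep positive depths
            return 20 * np.log10(np.abs(depth) + 1e-12)           # log-magnitude image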

  16. Four-dimensional structural and Doppler optical coherence tomography imaging on graphics processing units.

    Science.gov (United States)

    Sylwestrzak, Marcin; Szlag, Daniel; Szkulmowski, Maciej; Gorczynska, Iwona; Bukowska, Danuta; Wojtkowski, Maciej; Targowski, Piotr

    2012-10-01

    The authors present the application of graphics processing unit (GPU) programming for real-time three-dimensional (3-D) Fourier domain optical coherence tomography (FdOCT) imaging with implementation of flow visualization algorithms. One of the limitations of FdOCT is data processing time, which is generally longer than data acquisition time. Utilizing additional algorithms, such as Doppler analysis, further increases computation time. General-purpose computing on GPU (GPGPU) has been used successfully for structural OCT imaging, but real-time 3-D imaging of flows has so far not been presented. We have developed software for structural and Doppler OCT processing capable of visualization of two-dimensional (2-D) data (2000 A-scans, 2048 pixels per spectrum) with an image refresh rate higher than 120 Hz. The 3-D imaging of 100×100 A-scan datasets is performed at a rate of about 9 volumes per second. We describe the software architecture, organization of threads, and optimization. Screen shots recorded during real-time imaging of a flow phantom and the human eye are presented.

  17. Warranty cost analysis using alternating quasi-renewal processes with a warranty option

    Science.gov (United States)

    Yedida, Sarada; Unnissa Munavar, Mubashir; Ranjani, R.

    2012-03-01

    This article analyses a repairable deteriorating system with quasi-renewal operating and repair times during warranty. To establish the importance of non-negligible repair times modelled with quasi-renewal processes in warranty cost analysis, a fixed warranty model is developed and the results are compared with an existing model using expected warranty cost. Sensitivity analysis and graphical illustrations are provided to highlight the effect of various cost parameters on the expected warranty cost by means of three different distributions. In order to examine the cost implications to the customer and manufacturer, an extended warranty model in which a manufacturer offers a warranty option to the customer with two different policies, has been proposed. Based on the long-run average cost per unit time, profit analysis has been carried out for the manufacturer as it is an essential aspect of warranty management. This article emphasises the incorporation of non-negligible 'improved' repair times during warranty.
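
    The abstract does not restate the process definition, but in the usual (Wang-Pham) formulation a quasi-renewal process scales each successive interval by a constant factor; the sketch below records that convention for orientation, and is an assumption about the article's parametrization rather than a quotation from it:

        % Quasi-renewal process: successive intervals are scaled copies of
        % the first, equal in distribution:
        X_n \overset{d}{=} \alpha^{\,n-1} X_1, \qquad n = 1, 2, \dots
        % For a deteriorating system under warranty one typically takes
        % 0 < \alpha < 1 for operating times and \beta > 1 for the
        % non-negligible repair times, so uptimes shrink and repairs lengthen.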

  18. Real-time photoacoustic and ultrasound dual-modality imaging system facilitated with graphics processing unit and code parallel optimization.

    Science.gov (United States)

    Yuan, Jie; Xu, Guan; Yu, Yao; Zhou, Yu; Carson, Paul L; Wang, Xueding; Liu, Xiaojun

    2013-08-01

    Photoacoustic tomography (PAT) offers structural and functional imaging of living biological tissue with highly sensitive optical absorption contrast and excellent spatial resolution comparable to medical ultrasound (US) imaging. We report the development of a fully integrated PAT and US dual-modality imaging system, which performs signal scanning, image reconstruction, and display for both photoacoustic (PA) and US imaging all in a truly real-time manner. The back-projection (BP) algorithm for PA image reconstruction is optimized to reduce the computational cost and facilitate parallel computation on a state-of-the-art graphics processing unit (GPU) card. For the first time, PAT and US imaging of the same object can be conducted simultaneously and continuously, at a real-time frame rate, presently limited by the laser repetition rate of 10 Hz. Noninvasive PAT and US imaging of human peripheral joints in vivo were achieved, demonstrating the satisfactory image quality realized with this system. Another experiment, simultaneous PAT and US imaging of contrast agent flowing through an artificial vessel, was conducted to verify the performance of this system for imaging fast biological events. The GPU-based image reconstruction software code for this dual-modality system is open source and available for download from http://sourceforge.net/projects/patrealtime.
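
    The paper's optimized BP kernel is not reproduced in the abstract; the following naive 2-D delay-and-sum sketch shows why back-projection parallelizes so well (every pixel and every sensor contribution is independent). The unweighted summation and all names are illustrative:

        import numpy as np

        def backproject(signals, sensor_xy, pixel_xy, c, fs):
            # signals: (num_sensors, num_samples) PA time series
            # sensor_xy: (num_sensors, 2), pixel_xy: (num_pixels, 2) [m]
            # c: speed of sound [m/s]; fs: sampling rate [Hz]
            image = np.zeros(len(pixel_xy))
            for s, pos in enumerate(sensor_xy):
                dist = np.linalg.norm(pixel_xy - pos, axis=1)
                idx = np.clip((dist / c * fs).astype(int), 0, signals.shape[1] - 1)
                image += signals[s, idx]   # sum samples at matching time of flight
            return image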

  19. Lunar-Forming Giant Impact Model Utilizing Modern Graphics Processing Units

    Indian Academy of Sciences (India)

    J. C. Eiland; T. C. Salzillo; B. H. Hokr; J. L. Highland; W. D. Mayfield; B. M. Wyatt

    2014-12-01

    Recent giant impact models focus on producing a circumplanetary disk of the proper composition around the Earth and defer to earlier works for the accretion of this disk into the Moon. The discontinuity between creating the circumplanetary disk and accretion of the Moon is unnatural and lacks simplicity. In addition, current giant impact theories are being questioned due to their inability to find conditions that will produce a system with both the proper angular momentum and a resultant Moon that is isotopically similar to the Earth. Here we return to first principles and produce a continuous model that can be used to rapidly search the vast impact parameter space to identify plausible initial conditions. This is accomplished by focusing on the three major components of planetary collisions: constant gravitational attraction, short range repulsion and energy transfer. The structure of this model makes it easily parallelizable and well-suited to harness the power of modern Graphics Processing Units (GPUs). The model makes clear the physically relevant processes, and allows a physical picture to naturally develop. We conclude by demonstrating how the model readily produces stable Earth–Moon systems from a single, continuous simulation. The resultant systems possess many desired characteristics such as an iron-deficient, heterogeneously-mixed Moon and accurate axial tilt of the Earth.

  20. Use of graphical statistical process control tools to monitor and improve outcomes in cardiac surgery.

    Science.gov (United States)

    Smith, Ian R; Garlick, Bruce; Gardner, Michael A; Brighouse, Russell D; Foster, Kelley A; Rivers, John T

    2013-02-01

    Graphical Statistical Process Control (SPC) tools have been shown to promptly identify significant variations in clinical outcomes in a range of health care settings. We explored the application of these techniques to qualitatively inform the routine cardiac surgical morbidity and mortality (M&M) review process at a single site. Baseline clinical and procedural data relating to 4774 consecutive cardiac surgical procedures, performed between the 1st January 2003 and the 30th April 2011, were retrospectively evaluated. A range of appropriate performance measures and benchmarks were developed and evaluated using a combination of CUmulative SUM (CUSUM) charts, Exponentially Weighted Moving Average (EWMA) charts and Funnel Plots. Charts have been discussed at the unit's routine M&M meetings. Risk adjustment (RA) based on EuroSCORE has been incorporated into the charts to improve performance. Discrete and aggregated measures, including Blood Product/Reoperation, major acute post-procedural complications, and Length of Stay/Readmission, were monitored. These tools facilitate near "real-time" performance monitoring, allowing early detection of, and intervention in, altered performance. Careful interpretation of charts for group and individual operators has proven helpful in detecting and differentiating systemic vs. individual variation.
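
    The abstract does not give the chart construction; one common risk-adjusted form in this literature accumulates observed-minus-expected adverse events, so the curve drifts upward when outcomes are worse than the (e.g. EuroSCORE-based) prediction. A minimal sketch under that assumption, not the unit's exact method:

        import numpy as np

        def oe_cusum(outcomes, expected_risk):
            # outcomes: 0/1 per case (1 = adverse event);
            # expected_risk: predicted event probability per case.
            return np.cumsum(np.asarray(outcomes, float) - np.asarray(expected_risk))

        # Example: a death on a low-risk case pushes the curve sharply up.
        print(oe_cusum([0, 0, 1, 0, 0], [0.02, 0.05, 0.03, 0.10, 0.04]))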

  1. A Performance Comparison of Different Graphics Processing Units Running Direct N-Body Simulations

    CERN Document Server

    Capuzzo-Dolcetta, Roberto

    2013-01-01

    Hybrid computational architectures based on the joint power of Central Processing Units and Graphics Processing Units (GPUs) are becoming popular and powerful hardware tools for a wide range of simulations in biology, chemistry, engineering, physics, etc. In this paper we present a comparison of the performance of various GPUs available on the market when applied to the numerical integration of the classic, gravitational, N-body problem. To do this, we developed an OpenCL version of the parallel code (HiGPUs) to use for these tests, because this version is the only one able to work on GPUs of different makes. The main general result is that we confirm the reliability, speed and cheapness of GPUs when applied to the examined kind of problems (i.e. when the forces to evaluate are dependent on the mutual distances, as it happens in gravitational physics and molecular dynamics). More specifically, we find that even the cheap GPUs built to be employed just for gaming applications are very performant in terms of computing speed...
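
    The workload behind these benchmarks is the all-pairs force evaluation, whose O(N^2) independent interactions are exactly what GPUs handle well. A minimal softened direct-summation sketch in NumPy (G = 1; HiGPUs itself is an OpenCL/CUDA code, so this is only the serial reference computation):

        import numpy as np

        def direct_accel(pos, mass, eps=1e-3):
            # pos: (N, 3) positions; mass: (N,); eps softens close encounters.
            d = pos[np.newaxis, :, :] - pos[:, np.newaxis, :]  # pairwise separations
            r2 = (d ** 2).sum(axis=-1) + eps ** 2
            inv_r3 = r2 ** -1.5
            np.fill_diagonal(inv_r3, 0.0)                      # no self-force
            return (d * (mass[np.newaxis, :, None] * inv_r3[:, :, None])).sum(axis=1)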

  2. OCTGRAV: Sparse Octree Gravitational N-body Code on Graphics Processing Units

    Science.gov (United States)

    Gaburov, Evghenii; Bédorf, Jeroen; Portegies Zwart, Simon

    2010-10-01

    Octgrav is a new very fast tree-code which runs on massively parallel Graphical Processing Units (GPU) with NVIDIA CUDA architecture. The algorithms are based on parallel-scan and sort methods. The tree-construction and calculation of multipole moments is carried out on the host CPU, while the force calculation which consists of tree walks and evaluation of interaction list is carried out on the GPU. In this way, a sustained performance of about 100 GFLOP/s and data transfer rates of about 50 GB/s is achieved. It takes about a second to compute forces on a million particles with an opening angle of θ ≈ 0.5. To test the performance and feasibility, we implemented the algorithms in CUDA in the form of a gravitational tree-code which completely runs on the GPU. The tree construction and traverse algorithms are portable to many-core devices which have support for CUDA or OpenCL programming languages. The gravitational tree-code outperforms tuned CPU code during the tree-construction and shows a performance improvement of more than a factor 20 overall, resulting in a processing rate of more than 2.8 million particles per second. The code has a convenient user interface and is freely available for use.

  3. Practical Implementation of Prestack Kirchhoff Time Migration on a General Purpose Graphics Processing Unit

    Directory of Open Access Journals (Sweden)

    Liu Guofeng

    2016-08-01

    In this study, we present a practical implementation of prestack Kirchhoff time migration (PSTM) on a general purpose graphics processing unit. First, we consider the three main optimizations of the PSTM GPU code, i.e., designing a configuration based on a reasonable execution, using the texture memory for velocity interpolation, and the application of an intrinsic function in device code. This approach can achieve a speedup of nearly 45 times on a NVIDIA GTX 680 GPU compared with CPU code when a larger imaging space is used, where the PSTM output is a common reflection point that is gathered as I[nx][ny][nh][nt] in matrix format. However, this method requires more memory space, so the limited imaging space cannot fully exploit the GPU resources. To overcome this problem, we designed a PSTM scheme with multi-GPUs for imaging different seismic data on different GPUs using an offset value. This process can achieve the peak speedup of the GPU PSTM code and greatly increases the efficiency of the calculations, without changing the imaging result.

  4. Practical Implementation of Prestack Kirchhoff Time Migration on a General Purpose Graphics Processing Unit

    Science.gov (United States)

    Liu, Guofeng; Li, Chun

    2016-08-01

    In this study, we present a practical implementation of prestack Kirchhoff time migration (PSTM) on a general purpose graphics processing unit. First, we consider the three main optimizations of the PSTM GPU code, i.e., designing a configuration based on a reasonable execution, using the texture memory for velocity interpolation, and the application of an intrinsic function in device code. This approach can achieve a speedup of nearly 45 times on a NVIDIA GTX 680 GPU compared with CPU code when a larger imaging space is used, where the PSTM output is a common reflection point that is gathered as I[nx][ny][nh][nt] in matrix format. However, this method requires more memory space, so the limited imaging space cannot fully exploit the GPU resources. To overcome this problem, we designed a PSTM scheme with multi-GPUs for imaging different seismic data on different GPUs using an offset value. This process can achieve the peak speedup of the GPU PSTM code and greatly increases the efficiency of the calculations, without changing the imaging result.

  5. Graphics Processing Unit (GPU) Acceleration of the Goddard Earth Observing System Atmospheric Model

    Science.gov (United States)

    Putnam, William

    2011-01-01

    The Goddard Earth Observing System 5 (GEOS-5) is the atmospheric model used by the Global Modeling and Assimilation Office (GMAO) for a variety of applications, from long-term climate prediction at relatively coarse resolution, to data assimilation and numerical weather prediction, to very high-resolution cloud-resolving simulations. GEOS-5 is being ported to a graphics processing unit (GPU) cluster at the NASA Center for Climate Simulation (NCCS). By utilizing GPU co-processor technology, we expect to increase the throughput of GEOS-5 by at least an order of magnitude, and accelerate the process of scientific exploration across all scales of global modeling, including: the large-scale, high-end application of non-hydrostatic, global, cloud-resolving modeling at 10- to 1-kilometer (km) global resolutions; intermediate-resolution seasonal climate and weather prediction at 50- to 25-km on small clusters of GPUs; and long-range, coarse-resolution climate modeling, enabled on a small box of GPUs for the individual researcher. After being ported to the GPU cluster, the primary physics components and the dynamical core of GEOS-5 have demonstrated a potential speedup of 15-40 times over conventional processor cores. Performance improvements of this magnitude reduce the required scalability of 1-km, global, cloud-resolving models from an unfathomable 6 million cores to an attainable 200,000 GPU-enabled cores.

  6. Accelerated rescaling of single Monte Carlo simulation runs with the Graphics Processing Unit (GPU).

    Science.gov (United States)

    Yang, Owen; Choi, Bernard

    2013-01-01

    To interpret fiber-based and camera-based measurements of remitted light from biological tissues, researchers typically use analytical models, such as the diffusion approximation to light transport theory, or stochastic models, such as Monte Carlo modeling. To achieve rapid (ideally real-time) measurement of tissue optical properties, especially in clinical situations, there is a critical need to accelerate Monte Carlo simulation runs. In this manuscript, we report on our approach using the Graphics Processing Unit (GPU) to accelerate rescaling of single Monte Carlo runs to rapidly calculate diffuse reflectance values for different sets of tissue optical properties. We selected MATLAB to enable non-specialists in C and CUDA-based programming to use the generated open-source code. We developed a software package with four abstraction layers. To calculate a set of diffuse reflectance values from a simulated tissue with homogeneous optical properties, our rescaling GPU-based approach achieves a reduction in computation time of several orders of magnitude as compared to other GPU-based approaches. Specifically, our GPU-based approach generated a diffuse reflectance value in 0.08 ms. The transfer time from CPU to GPU memory currently is a limiting factor with GPU-based calculations. However, for calculation of multiple diffuse reflectance values, our GPU-based approach can still lead to processing that is ~3400 times faster than other GPU-based approaches.
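
    One standard single-run rescaling trick, sketched below, stores the exit path length of every photon from a baseline simulation and reweights by Beer-Lambert attenuation to obtain the reflectance for any absorption coefficient without re-running the transport; the paper's GPU scheme also rescales for scattering, which is omitted here, and all names are illustrative:

        import numpy as np

        def rescaled_reflectance(path_lengths, exit_weights, mua):
            # path_lengths, exit_weights: per-photon records from one
            # baseline Monte Carlo run; mua: new absorption coefficient
            # (units of 1/path length). An embarrassingly parallel sum.
            return np.sum(exit_weights * np.exp(-mua * path_lengths))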

  7. Fast ray-tracing of human eye optics on Graphics Processing Units.

    Science.gov (United States)

    Wei, Qi; Patkar, Saket; Pai, Dinesh K

    2014-05-01

    We present a new technique for simulating retinal image formation by tracing a large number of rays from objects in three dimensions as they pass through the optic apparatus of the eye to the retina. Simulating human optics is useful for understanding basic questions of vision science and for studying vision defects and their corrections. Because of the complexity of computing such simulations accurately, most previous efforts used simplified analytical models of the normal eye. This makes them less effective in modeling vision disorders associated with abnormal shapes of the ocular structures, which are difficult to represent precisely with analytical surfaces. We have developed a computer simulator that can simulate ocular structures of arbitrary shapes, for instance represented by polygon meshes. Topographic and geometric measurements of the cornea, lens, and retina from keratometer or medical imaging data can be integrated for individualized examination. We utilize parallel processing using modern Graphics Processing Units (GPUs) to efficiently compute retinal images by tracing millions of rays. A stable retinal image can be generated within minutes. We simulated depth of field, accommodation, and chromatic aberrations, as well as astigmatism and its correction. We also show application of the technique in patient-specific vision correction by incorporating geometric models of the orbit reconstructed from clinical medical images.

  8. Fast Monte Carlo simulations of ultrasound-modulated light using a graphics processing unit.

    Science.gov (United States)

    Leung, Terence S; Powell, Samuel

    2010-01-01

    Ultrasound-modulated optical tomography (UOT) is based on "tagging" light in turbid media with focused ultrasound. In comparison to diffuse optical imaging, UOT can potentially offer a better spatial resolution. The existing Monte Carlo (MC) model for simulating ultrasound-modulated light is central processing unit (CPU) based and has been employed in several UOT related studies. We reimplemented the MC model with a graphics processing unit (GPU; Nvidia GeForce 9800) that can execute the algorithm up to 125 times faster than its CPU (Intel Core Quad) counterpart for a particular set of optical and acoustic parameters. We also show that the incorporation of ultrasound propagation in photon migration modeling increases the computational time considerably, by a factor of at least 6 in one case, even with a GPU. With slight adjustment to the code, MC simulations were also performed to demonstrate the effect of ultrasonic modulation on the speckle pattern generated by the light model (available as animation). This was computed in 4 s with our GPU implementation as compared to 290 s using the CPU.

  9. VACTIV: A graphical dialog based program for an automatic processing of line and band spectra

    Science.gov (United States)

    Zlokazov, V. B.

    2013-05-01

    The program VACTIV (Visual ACTIV) has been developed for the automatic analysis of spectrum-like distributions, in particular gamma-ray spectra or alpha-spectra, and is a standard graphical dialog based Windows XX application, driven by a menu, mouse and keyboard. On the one hand, it was a conversion of an existing Fortran program ACTIV [1] to the DELPHI language; on the other hand, it is a transformation of the sequential syntax of Fortran programming to a new object-oriented style, based on the organization of event interactions. New features implemented in the algorithms of both versions consist in the following: as a peak model, both an analytical function and a graphical curve can be used; the peak search algorithm is able to recognize not only Gauss peaks but also peaks with an irregular form, both narrow peaks (2-4 channels) and broad ones (50-100 channels); and the regularization technique in the fitting guarantees a stable solution in the most complicated cases of strongly overlapping or weak peaks. The graphical dialog interface of VACTIV is much more convenient than the batch mode of ACTIV. [1] V.B. Zlokazov, Computer Physics Communications, 28 (1982) 27-37. NEW VERSION PROGRAM SUMMARY. Program Title: VACTIV. Catalogue identifier: ABAC_v2_0. Licensing provisions: no. Programming language: DELPHI 5-7 Pascal. Computer: IBM PC series. Operating system: Windows XX. RAM: 1 MB. Keywords: Nuclear physics, spectrum decomposition, least squares analysis, graphical dialog, object-oriented programming. Classification: 17.6. Catalogue identifier of previous version: ABAC_v1_0. Journal reference of previous version: Comput. Phys. Commun. 28 (1982) 27. Does the new version supersede the previous version?: Yes. Nature of problem: Program VACTIV is intended for precise analysis of arbitrary spectrum-like distributions, e.g. gamma-ray and X-ray spectra, and allows the user to carry out the full cycle of automatic processing of such spectra, i.e. calibration, automatic peak search

  10. Analysis of the Application of Graphics Software in Project Cost

    Institute of Scientific and Technical Information of China (English)

    贾晓萍

    2012-01-01

    Project cost control is a top priority in project management. With the rapid development of modern computer technology, graphics software is increasingly applied to project cost control and has achieved a measurable cost-control effect. Starting from the characteristics of graphics software, this paper briefly discusses the main points of its use in project cost control, analyses the problems in its application to project cost work at the present stage, and proposes improvement strategies.

  11. Density functional theory calculation on many-cores hybrid central processing unit-graphic processing unit architectures.

    Science.gov (United States)

    Genovese, Luigi; Ospici, Matthieu; Deutsch, Thierry; Méhaut, Jean-François; Neelov, Alexey; Goedecker, Stefan

    2009-07-21

    We present the implementation of a full electronic structure calculation code on a hybrid parallel architecture with graphics processing units (GPUs). This implementation is performed on a free software code based on Daubechies wavelets. The code shows very good performance, systematic convergence properties, and an excellent efficiency on parallel computers. Our GPU-based acceleration fully preserves all these properties. In particular, the code is able to run on many cores which may or may not have a GPU associated, and thus on parallel and massively parallel hybrid machines. With double precision calculations, we may achieve considerable speedup, between a factor of 20 for some operations and a factor of 6 for the whole density functional theory code.

  12. Large-scale analytical Fourier transform of photomask layouts using graphics processing units

    Science.gov (United States)

    Sakamoto, Julia A.

    2015-10-01

    Compensation of lens-heating effects during the exposure scan in an optical lithographic system requires knowledge of the heating profile in the pupil of the projection lens. A necessary component in the accurate estimation of this profile is the total integrated distribution of light, relying on the squared modulus of the Fourier transform (FT) of the photomask layout for individual process layers. Requiring a layout representation in pixelated image format, the most common approach is to compute the FT numerically via the fast Fourier transform (FFT). However, the file size for a standard 26-mm × 33-mm mask with 5-nm pixels is an overwhelming 137 TB in single precision; the data importing process alone, prior to FFT computation, can render this method highly impractical. A more feasible solution is to handle layout data in a highly compact format with vertex locations of mask features (polygons), which correspond to elements in an integrated circuit, as well as pattern symmetries and repetitions (e.g., GDSII format). Provided the polygons can decompose into shapes for which analytical FT expressions are possible, the analytical approach dramatically reduces computation time and alleviates the burden of importing extensive mask data. Algorithms have been developed for importing and interpreting hierarchical layout data and computing the analytical FT on a graphics processing unit (GPU) for rapid parallel processing, not assuming incoherent imaging. Testing was performed on the active layer of a 392-μm × 297-μm virtual chip test structure with 43 substructures distributed over six hierarchical levels. The factor of improvement in the analytical versus numerical approach for importing layout data, performing CPU-GPU memory transfers, and executing the FT on a single NVIDIA Tesla K20X GPU was 1.6×10⁴, 4.9×10³, and 3.8×10³, respectively. Various ideas for algorithm enhancements will be discussed.
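
    As a generic illustration of why the analytical route works (this example is not taken from the paper): once polygons are decomposed into primitives such as axis-aligned rectangles, each primitive has a closed-form FT, and shifts contribute only phase factors:

        % FT of an a x b rectangle centred at (x_0, y_0), with
        % sinc(t) = sin(pi t)/(pi t):
        F(u,v) = \iint_{\mathrm{rect}} e^{-i 2\pi (ux + vy)}\,dx\,dy
               = a\,b\,\operatorname{sinc}(au)\,\operatorname{sinc}(bv)\,
                 e^{-i 2\pi (u x_0 + v y_0)}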

  13. Accelerating resolution-of-the-identity second-order Møller-Plesset quantum chemistry calculations with graphical processing units.

    Science.gov (United States)

    Vogt, Leslie; Olivares-Amaya, Roberto; Kermes, Sean; Shao, Yihan; Amador-Bedolla, Carlos; Aspuru-Guzik, Alan

    2008-03-13

    The modification of a general purpose code for quantum mechanical calculations of molecular properties (Q-Chem) to use a graphical processing unit (GPU) is reported. A 4.3x speedup of the resolution-of-the-identity second-order Møller-Plesset perturbation theory (RI-MP2) execution time is observed in single point energy calculations of linear alkanes. The code modification is accomplished using the compute unified basic linear algebra subprograms (CUBLAS) library for an NVIDIA Quadro FX 5600 graphics card. Furthermore, speedups of other matrix algebra based electronic structure calculations are anticipated as a result of using a similar approach.

  14. Accounting for Students' Schemes in the Development of a Graphical Process for Solving Polynomial Inequalities in Instrumented Activity

    Science.gov (United States)

    Rivera, Ferdinand D.

    2007-01-01

    This paper provides an instrumental account of precalculus students' graphical process for solving polynomial inequalities. It is carried out in terms of the students' instrumental schemes as mediated by handheld graphing calculators and in cooperation with their classmates in a classroom setting. The ethnographic narrative relays an instrumental…

  15. Interactive Computing and Graphics in Undergraduate Digital Signal Processing. Microcomputing Working Paper Series F 84-9.

    Science.gov (United States)

    Onaral, Banu; And Others

    This report describes the development of a Drexel University electrical and computer engineering course on digital filter design that used interactive computing and graphics, and was one of three courses in a senior-level sequence on digital signal processing (DSP). Interactive and digital analysis/design routines and the interconnection of these…

  16. High performance direct gravitational N-body simulations on graphics processing units II: An implementation in CUDA

    NARCIS (Netherlands)

    Belleman, R.G.; Bédorf, J.; Portegies Zwart, S.F.

    2008-01-01

    We present the results of gravitational direct N-body simulations using the graphics processing unit (GPU) on a commercial NVIDIA GeForce 8800GTX designed for gaming computers. The force evaluation of the N-body problem is implemented in "Compute Unified Device Architecture" (CUDA) using the GPU to

  17. High performance direct gravitational N-body simulations on graphics processing units II: An implementation in CUDA

    NARCIS (Netherlands)

    Belleman, R.G.; Bédorf, J.; Portegies Zwart, S.F.

    2008-01-01

    We present the results of gravitational direct N-body simulations using the graphics processing unit (GPU) on a commercial NVIDIA GeForce 8800GTX designed for gaming computers. The force evaluation of the N-body problem is implemented in "Compute Unified Device Architecture" (CUDA) using the GPU to

  18. Text And Graphics Electronic Processing For The Printing-Publishing Industry

    Science.gov (United States)

    Harris, Barry V.

    1980-08-01

    In 1880, Stephen Horgan introduced a new dimension to the world of printing when, under his direction, the first published halftone, named "Shantytown", appeared on the front page of the New York Daily Graphic.

  19. ACHIEVING HIGH INTEGRITY OF PROCESS-CONTROL SOFTWARE BY GRAPHICAL DESIGN AND FORMAL VERIFICATION

    NARCIS (Netherlands)

    HALANG, WA; Kramer, B.J.

    1992-01-01

    The International Electrotechnical Commission is currently standardising four compatible languages for designing and implementing programmable logic controllers (PLCs). The language family includes a diagrammatic notation that supports the idea of software ICs to encourage graphical design technique

  20. ACHIEVING HIGH INTEGRITY OF PROCESS-CONTROL SOFTWARE BY GRAPHICAL DESIGN AND FORMAL VERIFICATION

    NARCIS (Netherlands)

    HALANG, WA; Kramer, B.J.

    The International Electrotechnical Commission is currently standardising four compatible languages for designing and implementing programmable logic controllers (PLCs). The language family includes a diagrammatic notation that supports the idea of software ICs to encourage graphical design

  1. Fast, multi-channel real-time processing of signals with microsecond latency using graphics processing units.

    Science.gov (United States)

    Rath, N; Kato, S; Levesque, J P; Mauel, M E; Navratil, G A; Peng, Q

    2014-04-01

    Fast, digital signal processing (DSP) has many applications. Typical hardware options for performing DSP are field-programmable gate arrays (FPGAs), application-specific integrated DSP chips, or general purpose personal computer systems. This paper presents a novel DSP platform that has been developed for feedback control on the HBT-EP tokamak device. The system runs all signal processing exclusively on a Graphics Processing Unit (GPU) to achieve real-time performance with latencies below 8 μs. Signals are transferred into and out of the GPU using PCI Express peer-to-peer direct-memory-access transfers without involvement of the central processing unit or host memory. Tests were performed on the feedback control system of the HBT-EP tokamak using forty 16-bit floating point inputs and outputs each and a sampling rate of up to 250 kHz. Signals were digitized by a D-TACQ ACQ196 module, processing done on an NVIDIA GTX 580 GPU programmed in CUDA, and analog output was generated by D-TACQ AO32CPCI modules.

  2. Fast, multi-channel real-time processing of signals with microsecond latency using graphics processing units

    Science.gov (United States)

    Rath, N.; Kato, S.; Levesque, J. P.; Mauel, M. E.; Navratil, G. A.; Peng, Q.

    2014-04-01

    Fast, digital signal processing (DSP) has many applications. Typical hardware options for performing DSP are field-programmable gate arrays (FPGAs), application-specific integrated DSP chips, or general purpose personal computer systems. This paper presents a novel DSP platform that has been developed for feedback control on the HBT-EP tokamak device. The system runs all signal processing exclusively on a Graphics Processing Unit (GPU) to achieve real-time performance with latencies below 8 μs. Signals are transferred into and out of the GPU using PCI Express peer-to-peer direct-memory-access transfers without involvement of the central processing unit or host memory. Tests were performed on the feedback control system of the HBT-EP tokamak using forty 16-bit floating point inputs and outputs each and a sampling rate of up to 250 kHz. Signals were digitized by a D-TACQ ACQ196 module, processing done on an NVIDIA GTX 580 GPU programmed in CUDA, and analog output was generated by D-TACQ AO32CPCI modules.

  3. A graphical simulation model of the entire DNA process associated with the analysis of short tandem repeat loci

    OpenAIRE

    Gill, Peter; Curran, James; Elliot, Keith

    2005-01-01

    The use of expert systems to interpret short tandem repeat DNA profiles in forensic, medical and ancient DNA applications is becoming increasingly prevalent as high-throughput analytical systems generate large amounts of data that are time-consuming to process. With special reference to low copy number (LCN) applications, we use a graphical model to simulate stochastic variation associated with the entire DNA process starting with extraction of sample, followed by the processing associated wi...

  4. TMSEEG: A MATLAB-Based Graphical User Interface for Processing Electrophysiological Signals during Transcranial Magnetic Stimulation

    Science.gov (United States)

    Atluri, Sravya; Frehlich, Matthew; Mei, Ye; Garcia Dominguez, Luis; Rogasch, Nigel C.; Wong, Willy; Daskalakis, Zafiris J.; Farzan, Faranak

    2016-01-01

    Concurrent recording of electroencephalography (EEG) during transcranial magnetic stimulation (TMS) is an emerging and powerful tool for studying brain health and function. Despite a growing interest in adaptation of TMS-EEG across neuroscience disciplines, its widespread utility is limited by signal processing challenges. These challenges arise due to the nature of TMS and the sensitivity of EEG to artifacts that often mask TMS-evoked potentials (TEPs). With an increase in the complexity of data processing methods and a growing interest in multi-site data integration, analysis of TMS-EEG data requires the development of a standardized method to recover TEPs from various sources of artifacts. This article introduces TMSEEG, an open-source MATLAB application comprised of multiple algorithms organized to facilitate a step-by-step procedure for TMS-EEG signal processing. Using a modular design and interactive graphical user interface (GUI), this toolbox aims to streamline TMS-EEG signal processing for both novice and experienced users. Specifically, TMSEEG provides: (i) targeted removal of TMS-induced and general EEG artifacts; (ii) a step-by-step modular workflow with flexibility to modify existing algorithms and add customized algorithms; (iii) a comprehensive display and quantification of artifacts; (iv) quality control check points with visual feedback of TEPs throughout the data processing workflow; and (v) capability to label and store a database of artifacts. In addition to these features, the software architecture of TMSEEG ensures minimal user effort in initial setup and configuration of parameters for each processing step. This is partly accomplished through a close integration with EEGLAB, a widely used open-source toolbox for EEG signal processing. In this article, we introduce TMSEEG, validate its features and demonstrate its application in extracting TEPs across several single- and multi-pulse TMS protocols. As the first open-source GUI-based pipeline

  5. TMSEEG: A MATLAB-Based Graphical User Interface for Processing Electrophysiological Signals during Transcranial Magnetic Stimulation.

    Science.gov (United States)

    Atluri, Sravya; Frehlich, Matthew; Mei, Ye; Garcia Dominguez, Luis; Rogasch, Nigel C; Wong, Willy; Daskalakis, Zafiris J; Farzan, Faranak

    2016-01-01

    Concurrent recording of electroencephalography (EEG) during transcranial magnetic stimulation (TMS) is an emerging and powerful tool for studying brain health and function. Despite a growing interest in adaptation of TMS-EEG across neuroscience disciplines, its widespread utility is limited by signal processing challenges. These challenges arise due to the nature of TMS and the sensitivity of EEG to artifacts that often mask TMS-evoked potentials (TEPs). With an increase in the complexity of data processing methods and a growing interest in multi-site data integration, analysis of TMS-EEG data requires the development of a standardized method to recover TEPs from various sources of artifacts. This article introduces TMSEEG, an open-source MATLAB application comprised of multiple algorithms organized to facilitate a step-by-step procedure for TMS-EEG signal processing. Using a modular design and interactive graphical user interface (GUI), this toolbox aims to streamline TMS-EEG signal processing for both novice and experienced users. Specifically, TMSEEG provides: (i) targeted removal of TMS-induced and general EEG artifacts; (ii) a step-by-step modular workflow with flexibility to modify existing algorithms and add customized algorithms; (iii) a comprehensive display and quantification of artifacts; (iv) quality control check points with visual feedback of TEPs throughout the data processing workflow; and (v) capability to label and store a database of artifacts. In addition to these features, the software architecture of TMSEEG ensures minimal user effort in initial setup and configuration of parameters for each processing step. This is partly accomplished through a close integration with EEGLAB, a widely used open-source toolbox for EEG signal processing. In this article, we introduce TMSEEG, validate its features and demonstrate its application in extracting TEPs across several single- and multi-pulse TMS protocols. As the first open-source GUI-based pipeline

  6. Accelerated Molecular Dynamics Simulations with the AMOEBA Polarizable Force Field on Graphics Processing Units.

    Science.gov (United States)

    Lindert, Steffen; Bucher, Denis; Eastman, Peter; Pande, Vijay; McCammon, J Andrew

    2013-11-12

    The accelerated molecular dynamics (aMD) method has recently been shown to enhance the sampling of biomolecules in molecular dynamics (MD) simulations, often by several orders of magnitude. Here, we describe an implementation of the aMD method for the OpenMM application layer that takes full advantage of graphics processing unit (GPU) computing. The aMD method is shown to work in combination with the AMOEBA polarizable force field (AMOEBA-aMD), allowing the simulation of long time-scale events with a polarizable force field. Benchmarks are provided to show that the AMOEBA-aMD method is efficiently implemented and produces accurate results in its standard parametrization. For the BPTI protein, we demonstrate that the protein structure described with AMOEBA remains stable even on the extended time scales accessed at high levels of acceleration. For the DNA repair metalloenzyme endonuclease IV, we show that the use of the AMOEBA force field is a significant improvement over fixed charged models for describing the enzyme active site. The new AMOEBA-aMD method is publicly available (http://wiki.simtk.org/openmm/VirtualRepository) and promises to be interesting for studying complex systems that can benefit from both the use of a polarizable force field and enhanced sampling.
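
    For orientation, the standard aMD boost potential, in the usual parametrization of Hamelberg, Mongan and McCammon (which the abstract does not restate), raises the potential wherever it lies below a threshold energy E:

        % Boost applied only where V(r) < E; alpha controls the flatness
        % of the modified landscape:
        \Delta V(\mathbf{r}) = \frac{\bigl(E - V(\mathbf{r})\bigr)^{2}}
                                    {\alpha + E - V(\mathbf{r})},
        \qquad V^{*}(\mathbf{r}) = V(\mathbf{r}) + \Delta V(\mathbf{r})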

  7. Graphics processing unit (GPU)-based computation of heat conduction in thermally anisotropic solids

    Science.gov (United States)

    Nahas, C. A.; Balasubramaniam, Krishnan; Rajagopal, Prabhu

    2013-01-01

    Numerical modeling of anisotropic media is a computationally intensive task since it brings additional complexity to the field problem in such a way that the physical properties are different in different directions. Largely used in the aerospace industry because of their lightweight nature, composite materials are a very good example of thermally anisotropic media. With advancements in video gaming technology, parallel processors are much cheaper today and accessibility to higher-end graphical processing devices has increased dramatically over the past couple of years. Since these massively parallel GPUs are very good at handling floating point arithmetic, they provide a new platform for engineers and scientists to accelerate their numerical models using commodity hardware. In this paper we implement a parallel finite difference model of thermal diffusion through anisotropic media using NVIDIA CUDA (Compute Unified Device Architecture). We use the NVIDIA GeForce GTX 560 Ti as our primary computing device, which consists of 384 CUDA cores clocked at 1645 MHz, with a standard desktop PC as the host platform. We compare the results with a standard CPU implementation for accuracy and speed, and draw implications for simulation using the GPU paradigm.
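
    The structure being parallelized is a stencil update in which every interior node depends only on the previous time level, so each node maps to one GPU thread. A minimal explicit finite-difference sketch for 2-D anisotropic diffusion (the simple orthotropic model and all names are illustrative, not the paper's exact formulation):

        import numpy as np

        def step(T, kx, ky, dx, dy, dt):
            # T: (ny, nx) temperature; kx != ky models thermal anisotropy.
            # Explicit stability requires dt <= 0.5 / (kx/dx**2 + ky/dy**2).
            Tn = T.copy()
            Tn[1:-1, 1:-1] = T[1:-1, 1:-1] + dt * (
                kx * (T[1:-1, 2:] - 2 * T[1:-1, 1:-1] + T[1:-1, :-2]) / dx ** 2
                + ky * (T[2:, 1:-1] - 2 * T[1:-1, 1:-1] + T[:-2, 1:-1]) / dy ** 2
            )
            return Tn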

  8. Developing a multiscale, multi-resolution agent-based brain tumor model by graphics processing units

    Directory of Open Access Journals (Sweden)

    Zhang Le

    2011-12-01

    Multiscale agent-based modeling (MABM) has been widely used to simulate Glioblastoma Multiforme (GBM) and its progression. At the intracellular level, the MABM approach employs a system of ordinary differential equations to describe quantitatively specific intracellular molecular pathways that determine phenotypic switches among cells (e.g. from migration to proliferation and vice versa). At the intercellular level, MABM describes cell-cell interactions by a discrete module. At the tissue level, partial differential equations are employed to model the diffusion of chemoattractants, which are the input factors of the intracellular molecular pathway. Moreover, multiscale analysis makes it possible to explore the molecules that play important roles in determining the cellular phenotypic switches that in turn drive the whole GBM expansion. However, owing to limited computational resources, MABM is currently a theoretical biological model that uses relatively coarse grids to simulate a few cancer cells in a small slice of brain cancer tissue. In order to improve this theoretical model to simulate and predict actual GBM cancer progression in real time, a graphics processing unit (GPU)-based parallel computing algorithm was developed and combined with the multi-resolution design to speed up the MABM. The simulated results demonstrated that the GPU-based, multi-resolution and multiscale approach can accelerate the previous MABM around 30-fold with relatively fine grids in a large extracellular matrix. Therefore, the new model has great potential for simulating and predicting real-time GBM progression, if real experimental data are incorporated.

  9. An Optimized Multicolor Point-Implicit Solver for Unstructured Grid Applications on Graphics Processing Units

    Science.gov (United States)

    Zubair, Mohammad; Nielsen, Eric; Luitjens, Justin; Hammond, Dana

    2016-01-01

    In the field of computational fluid dynamics, the Navier-Stokes equations are often solved using an unstructured-grid approach to accommodate geometric complexity. Implicit solution methodologies for such spatial discretizations generally require frequent solution of large tightly-coupled systems of block-sparse linear equations. The multicolor point-implicit solver used in the current work typically requires a significant fraction of the overall application run time. In this work, an efficient implementation of the solver for graphics processing units is proposed. Several factors present unique challenges to achieving an efficient implementation in this environment. These include the variable amount of parallelism available in different kernel calls, indirect memory access patterns, low arithmetic intensity, and the requirement to support variable block sizes. In this work, the solver is reformulated to use standard sparse and dense Basic Linear Algebra Subprograms (BLAS) functions. However, numerical experiments show that the performance of the BLAS functions available in existing CUDA libraries is suboptimal for matrices representative of those encountered in actual simulations. Instead, optimized versions of these functions are developed. Depending on block size, the new implementations show performance gains of up to 7x over the existing CUDA library functions.
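
    The idea behind the multicolor ordering is that unknowns are grouped into colors such that no two coupled unknowns share a color, so all rows of one color can update concurrently. A scalar (non-block) sketch of one such point-implicit sweep, illustrative rather than the paper's block-sparse implementation:

        import numpy as np

        def multicolor_sweep(A, b, x, colors):
            # A: (n, n) matrix; colors: list of index arrays partitioning
            # the unknowns so that A[i, j] == 0 whenever i != j share a color.
            D = A.diagonal()
            for c in colors:
                # every row in color c updates independently (GPU-parallel),
                # using the latest values from previously updated colors
                x[c] += (b[c] - A[c] @ x) / D[c]
            return x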

  10. Parallel design of JPEG-LS encoder on graphics processing units

    Science.gov (United States)

    Duan, Hao; Fang, Yong; Huang, Bormin

    2012-01-01

    With recent technical advances in graphics processing units (GPUs), GPUs have outperformed CPUs in terms of compute capability and memory bandwidth. Many successful GPU applications to high performance computing have been reported. JPEG-LS is an ISO/IEC standard for lossless image compression which utilizes adaptive context modeling and run-length coding to improve the compression ratio. However, adaptive context modeling causes data dependency among adjacent pixels, and the run-length coding has to be performed in a sequential way. Hence, using JPEG-LS to compress large-volume hyperspectral image data is quite time-consuming. We implement an efficient parallel JPEG-LS encoder for lossless hyperspectral compression on a NVIDIA GPU using the compute unified device architecture (CUDA) programming technology. We use the block parallel strategy, as well as such CUDA techniques as coalesced global memory access, parallel prefix sum, and asynchronous data transfer. We also show the relation between GPU speedup and AVIRIS block size, as well as the relation between compression ratio and AVIRIS block size. When AVIRIS images are divided into blocks, each with 64×64 pixels, we gain the best GPU performance with 26.3x speedup over the original CPU code.

  11. Graphic processing unit accelerated real-time partially coherent beam generator

    Science.gov (United States)

    Ni, Xiaolong; Liu, Zhi; Chen, Chunyi; Jiang, Huilin; Fang, Hanhan; Song, Lujun; Zhang, Su

    2016-07-01

    A method of using liquid crystals (LCs) to generate a partially coherent beam in real time is described. An expression for generating a partially coherent beam is given and calculated using a graphics processing unit (GPU), i.e., the GeForce GTX 680. A liquid-crystal-on-silicon (LCOS) device with 256 × 256 pixels is used as the partially coherent beam generator (PCBG). An optimizing method with partition convolution is used to improve the generating speed of our LC PCBG. The total time needed to generate a random phase map with a coherence width range from 0.015 mm to 1.5 mm is less than 2.4 ms for calculation and readout with the GPU; adding the time needed for the CPU to read and send data to the LCOS, together with the response time of the LC PCBG, the real-time partially coherent beam (PCB) generation frequency of our LC PCBG is up to 312 Hz. To our knowledge, it is the first real-time partially coherent beam generator. A series of experiments based on double pinhole interference are performed. The results show that, to generate a laser beam with a coherence width of 0.9 mm and 1.5 mm with a mean error of approximately 1%, the required RMS values were 0.021306 and 0.020883 and the required PV values were 0.073576 and 0.072998, respectively.

  12. Graphics processing unit (GPU)-accelerated particle filter framework for positron emission tomography image reconstruction.

    Science.gov (United States)

    Yu, Fengchao; Liu, Huafeng; Hu, Zhenghui; Shi, Pengcheng

    2012-04-01

    As a consequence of the random nature of photon emissions and detections, the data collected by a positron emission tomography (PET) imaging system can be shown to be Poisson distributed. Meanwhile, there have been considerable efforts within the tracer kinetic modeling communities aimed at establishing the relationship between the PET data and physiological parameters that affect the uptake and metabolism of the tracer. Both statistical and physiological models are important to PET reconstruction. The majority of previous efforts are based on simplified, nonphysical mathematical expression, such as Poisson modeling of the measured data, which is, on the whole, completed without consideration of the underlying physiology. In this paper, we proposed a graphics processing unit (GPU)-accelerated reconstruction strategy that can take both statistical model and physiological model into consideration with the aid of state-space evolution equations. The proposed strategy formulates the organ activity distribution through tracer kinetics models and the photon-counting measurements through observation equations, thus making it possible to unify these two constraints into a general framework. In order to accelerate reconstruction, GPU-based parallel computing is introduced. Experiments of Zubal-thorax-phantom data, Monte Carlo simulated phantom data, and real phantom data show the power of the method. Furthermore, thanks to the computing power of the GPU, the reconstruction time is practical for clinical application.

  13. Exploring Graphics Processing Unit (GPU) Resource Sharing Efficiency for High Performance Computing

    Directory of Open Access Journals (Sweden)

    Teng Li

    2013-11-01

    The increasing incorporation of Graphics Processing Units (GPUs) as accelerators has been one of the forefront High Performance Computing (HPC) trends and provides unprecedented performance; however, the prevalent adoption of the Single-Program Multiple-Data (SPMD) programming model brings with it challenges of resource underutilization. In other words, under SPMD, every CPU needs GPU capability available to it. However, since CPUs generally outnumber GPUs, the asymmetric resource distribution gives rise to overall computing resource underutilization. In this paper, we propose to efficiently share the GPU under SPMD and formally define a series of GPU sharing scenarios. We provide performance-modeling analysis for each sharing scenario with accurate experimental validation. With the modeling basis, we further conduct experimental studies to explore potential GPU sharing efficiency improvements from multiple perspectives. Both further theoretical and experimental GPU sharing performance analysis and results are presented. Our results not only demonstrate the significant performance gain for SPMD programs with the proposed efficient GPU sharing, but also the further improved sharing efficiency with the optimization techniques based on our accurate modeling.

  14. High-Throughput Characterization of Porous Materials Using Graphics Processing Units

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Jihan; Martin, Richard L.; Rübel, Oliver; Haranczyk, Maciej; Smit, Berend

    2012-05-08

    We have developed a high-throughput graphics processing unit (GPU) code that can characterize a large database of crystalline porous materials. In our algorithm, the GPU is utilized to accelerate energy grid calculations where the grid values represent interactions (i.e., Lennard-Jones + Coulomb potentials) between gas molecules (i.e., CH₄ and CO₂) and the material's framework atoms. Using a parallel flood fill CPU algorithm, inaccessible regions inside the framework structures are identified and blocked based on their energy profiles. Finally, we compute the Henry coefficients and heats of adsorption through statistical Widom insertion Monte Carlo moves in the domain restricted to the accessible space. The code offers significant speedup over a single core CPU code and allows us to characterize a set of porous materials at least an order of magnitude larger than ones considered in earlier studies. For structures selected from such a prescreening algorithm, full adsorption isotherms can be calculated by conducting multiple grand canonical Monte Carlo simulations concurrently within the GPU.
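
    The grid stage is the GPU-friendly part: each grid point's probe-framework interaction energy is independent of every other point. A serial sketch of a Lennard-Jones-only energy grid (Coulomb terms, periodic images, and the flood-fill blocking are omitted; parameter names are illustrative):

        import numpy as np

        def lj_grid(frame_xyz, eps, sigma, box, n=32):
            # frame_xyz: (M, 3) framework atoms; eps, sigma: (M,) pair
            # parameters for the probe; box: (3,) orthorhombic cell lengths.
            axes = [np.linspace(0.0, L, n, endpoint=False) for L in box]
            pts = np.stack(np.meshgrid(*axes, indexing="ij"), -1).reshape(-1, 3)
            grid = np.empty(len(pts))
            for i, p in enumerate(pts):           # one GPU thread per point
                r = np.linalg.norm(frame_xyz - p, axis=1)
                sr6 = (sigma / np.maximum(r, 1e-9)) ** 6
                grid[i] = np.sum(4.0 * eps * (sr6 ** 2 - sr6))
            return grid.reshape(n, n, n)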

  15. Efficient molecular dynamics simulations with many-body potentials on graphics processing units

    CERN Document Server

    Fan, Zheyong; Vierimaa, Ville; Harju, Ari

    2016-01-01

    Graphics processing units have been extensively used to accelerate classical molecular dynamics simulations. However, there has been much less progress on accelerating force evaluations for many-body potentials than for pairwise ones. In the conventional force evaluation algorithm for many-body potentials, the force, virial stress, and heat current for a given atom are accumulated within different loops, which can result in write conflicts between different threads in a CUDA kernel. In this work, we provide a new force evaluation algorithm, which is based on an explicit pairwise force expression for many-body potentials derived recently [Phys. Rev. B 92 (2015) 094301]. In our algorithm, the force, virial stress, and heat current for a given atom can be accumulated within a single thread, and the computation is free of write conflicts. We discuss the formulations and algorithms and evaluate their performance. A new open-source code, GPUMD, is developed based on the proposed formulations. For the Tersoff many-body potentia...

  16. Accelerating image reconstruction in three-dimensional optoacoustic tomography on graphics processing units.

    Science.gov (United States)

    Wang, Kun; Huang, Chao; Kao, Yu-Jiun; Chou, Cheng-Ying; Oraevsky, Alexander A; Anastasio, Mark A

    2013-02-01

    Optoacoustic tomography (OAT) is inherently a three-dimensional (3D) inverse problem. However, most studies of OAT image reconstruction still employ two-dimensional imaging models. One important reason is that 3D image reconstruction is computationally burdensome. The aim of this work is to accelerate existing image reconstruction algorithms for 3D OAT by use of parallel programming techniques. Parallelization strategies are proposed to accelerate a filtered backprojection (FBP) algorithm and two different pairs of projection/backprojection operations that correspond to two different numerical imaging models. The algorithms are designed to fully exploit the parallel computing power of graphics processing units (GPUs). In order to evaluate the parallelization strategies for the projection/backprojection pairs, an iterative image reconstruction algorithm is implemented. Computer simulation and experimental studies are conducted to investigate the computational efficiency and numerical accuracy of the developed algorithms. The GPU implementations improve the computational efficiency by factors of 1000, 125, and 250 for the FBP algorithm and the two pairs of projection/backprojection operators, respectively. Accurate images are reconstructed by use of the FBP and iterative image reconstruction algorithms from both computer-simulated and experimental data. Parallelization strategies for 3D OAT image reconstruction are proposed for the first time. These GPU-based implementations significantly reduce the computational time for 3D image reconstruction, complementing our earlier work on 3D OAT iterative image reconstruction.
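
    As a hedged sketch of the simplest such reconstruction kernel, a plain delay-and-sum backprojection (the published FBP algorithm additionally applies filtering and solid-angle weighting omitted here): every voxel independently sums transducer samples at the acoustic time of flight, so one voxel maps naturally to one GPU thread:

      import numpy as np

      def delay_and_sum(voxels, sensors, signals, fs, c=1500.0):
          """Naive optoacoustic backprojection (illustrative, unfiltered).

          voxels  : (Nv, 3) reconstruction points [m]
          sensors : (Ns, 3) transducer positions [m]
          signals : (Ns, Nt) recorded pressure traces sampled at fs [Hz]
          c       : speed of sound [m/s]
          """
          img = np.zeros(len(voxels))
          for i, r in enumerate(voxels):           # one GPU thread per voxel
              d = np.linalg.norm(sensors - r, axis=1)
              t_idx = np.round(d / c * fs).astype(int)
              t_idx = np.clip(t_idx, 0, signals.shape[1] - 1)
              img[i] = signals[np.arange(len(sensors)), t_idx].sum()
          return img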

  17. permGPU: Using graphics processing units in RNA microarray association studies

    Directory of Open Access Journals (Sweden)

    George Stephen L

    2010-06-01

    Full Text Available Abstract Background Many analyses of microarray association studies involve permutation, bootstrap resampling and cross-validation, which are ideally formulated as embarrassingly parallel computing problems. Given that these analyses are computationally intensive, scalable approaches that can take advantage of multi-core processor systems need to be developed. Results We have developed a CUDA-based implementation, permGPU, that employs graphics processing units in microarray association studies. We illustrate the performance and applicability of permGPU within the context of permutation resampling for a number of test statistics. An extensive simulation study demonstrates a dramatic increase in performance when using permGPU on an NVIDIA GTX 280 card compared to an optimized C/C++ solution running on a conventional Linux server. Conclusions permGPU is available as an open-source stand-alone application and as an extension package for the R statistical environment. It provides a dramatic increase in performance for permutation resampling analysis in the context of microarray association studies. The current version offers six test statistics for carrying out permutation resampling analyses for binary, quantitative and censored time-to-event traits.
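
    The per-permutation kernel is tiny and the permutations are mutually independent, which is exactly what makes the problem embarrassingly parallel; a hedged NumPy sketch of a two-group permutation test over all genes at once (permGPU's actual test statistics and GPU kernels differ):

      import numpy as np

      def permutation_pvalues(expr, labels, n_perm=10000, rng=None):
          """Two-sample mean-difference permutation test, one p-value per gene.

          expr   : (n_genes, n_samples) expression matrix
          labels : (n_samples,) boolean group indicator
          """
          rng = rng or np.random.default_rng(0)
          obs = expr[:, labels].mean(1) - expr[:, ~labels].mean(1)
          exceed = np.zeros(expr.shape[0])
          for _ in range(n_perm):               # each permutation is independent
              perm = rng.permutation(labels)    # -> embarrassingly parallel
              stat = expr[:, perm].mean(1) - expr[:, ~perm].mean(1)
              exceed += np.abs(stat) >= np.abs(obs)
          return (exceed + 1) / (n_perm + 1)    # add-one correction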

  18. Efficient molecular dynamics simulations with many-body potentials on graphics processing units

    Science.gov (United States)

    Fan, Zheyong; Chen, Wei; Vierimaa, Ville; Harju, Ari

    2017-09-01

    Graphics processing units have been extensively used to accelerate classical molecular dynamics simulations. However, there has been much less progress on accelerating force evaluations for many-body potentials than for pairwise ones. In the conventional force evaluation algorithm for many-body potentials, the force, virial stress, and heat current for a given atom are accumulated within different loops, which can result in write conflicts between different threads in a CUDA kernel. In this work, we provide a new force evaluation algorithm, which is based on an explicit pairwise force expression for many-body potentials derived recently (Fan et al., 2015). In our algorithm, the force, virial stress, and heat current for a given atom can be accumulated within a single thread, and the computation is free of write conflicts. We discuss the formulations and algorithms and evaluate their performance. A new open-source code, GPUMD, is developed based on the proposed formulations. For the Tersoff many-body potential, the double precision performance of GPUMD using a Tesla K40 card is equivalent to that of the LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator) molecular dynamics code running with about 100 CPU cores (Intel Xeon CPU X5670 @ 2.93 GHz).
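
    The essence of the algorithm, letting the one thread that owns atom i accumulate everything belonging to i, can be conveyed with a hedged sketch in which a generic pair_force callback stands in for the explicit pairwise form of the many-body force derived in the paper; the price of avoiding atomic operations is that each pair is visited from both of its atoms:

      import numpy as np

      def compute_per_atom(pos, neighbors, pair_force):
          """Write-conflict-free accumulation: thread i only writes entry i.

          pair_force(ri, rj) returns F_ij, the force on atom i due to atom j
          (a placeholder for the explicit pairwise many-body expression).
          Each pair is evaluated twice, once from each side, so no thread
          ever writes to another atom's slot.
          """
          n = len(pos)
          force = np.zeros((n, 3))
          virial = np.zeros(n)
          for i in range(n):                 # on the GPU: one thread per atom i
              for j in neighbors[i]:
                  f_ij = pair_force(pos[i], pos[j])
                  force[i] += f_ij           # only atom i's slots are written
                  rij = pos[j] - pos[i]      # per-atom virial, one common
                  virial[i] -= 0.5 * np.dot(rij, f_ij)  # convention
          return force, virial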

  19. The application of projected conjugate gradient solvers on graphical processing units

    Energy Technology Data Exchange (ETDEWEB)

    Lin, Youzuo [Los Alamos National Laboratory; Renaut, Rosemary [ARIZONA STATE UNIV.

    2011-01-26

    Graphical processing units introduce the capability for large scale computation at the desktop. Presented numerical results verify that the efficiency and accuracy of basic linear algebra subroutines at all levels are comparable when implemented in CUDA and in Jacket. Experimental results demonstrate, however, that the level-3 basic linear algebra subroutines offer the greatest potential for improving the efficiency of basic numerical algorithms. We consider the solution of sets of linear equations with multiple right-hand sides using Krylov subspace-based solvers. For the multiple right-hand side case, it is more efficient to make use of a block implementation of the conjugate gradient algorithm than to solve each system independently. Jacket is used for the implementation. Furthermore, including projection from one system to another improves efficiency. A relevant example, for which simulated results are provided, is the reconstruction of a three-dimensional medical image volume acquired from a positron emission tomography scanner. Efficiency of the reconstruction is improved by using projection across nearby slices.
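
    A hedged sketch of why multiple right-hand sides pay off on GPUs (a batched conjugate gradient in which each column keeps its own step lengths; the block CG used in the paper additionally shares Krylov information between columns, and the projection between systems is omitted):

      import numpy as np

      def batched_cg(A, B, n_iters=200, tol=1e-10):
          """Conjugate gradients on all right-hand sides at once (SPD A).

          B : (n, k) matrix of k right-hand sides. Each iteration costs one
          matrix-matrix product A @ P -- level-3 BLAS, which is what makes
          the multiple-RHS case efficient on GPUs.
          """
          X = np.zeros_like(B)
          R = B - A @ X
          P = R.copy()
          rs = np.einsum('ij,ij->j', R, R)           # per-column ||r||^2
          for _ in range(n_iters):
              AP = A @ P                             # one GEMM serves all systems
              denom = np.maximum(np.einsum('ij,ij->j', P, AP), 1e-300)
              alpha = rs / denom                     # per-column step lengths
              X = X + P * alpha
              R = R - AP * alpha
              rs_new = np.einsum('ij,ij->j', R, R)
              if np.sqrt(rs_new.max()) < tol:
                  break
              P = R + P * (rs_new / np.maximum(rs, 1e-300))
              rs = rs_new
          return X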

  20. Space Object Collision Probability via Monte Carlo on the Graphics Processing Unit

    Science.gov (United States)

    Vittaldev, Vivek; Russell, Ryan P.

    2017-09-01

    Fast and accurate collision probability computations are essential for protecting space assets. Monte Carlo (MC) simulation is the most accurate but computationally intensive method. A Graphics Processing Unit (GPU) is used to parallelize the computation and reduce the overall runtime. Using MC techniques to compute the collision probability is common in the literature as the benchmark. An optimized implementation on the GPU, however, is a challenging problem and is the main focus of the current work. The MC simulation takes samples from the uncertainty distributions of the Resident Space Objects (RSOs) at any time during a time window of interest and outputs the separations at closest approach. Therefore, any uncertainty propagation method may be used, and the collision probability is automatically computed as a function of RSO collision radii. Integration using a fixed time step and a quartic interpolation after every Runge-Kutta step ensures that no close approaches are missed. Two orders of magnitude speedups over a serial CPU implementation are shown, and speedups improve moderately with higher fidelity dynamics. The tool makes the MC approach tractable on a single workstation, and can be used as a final product, or for verifying surrogate and analytical collision probability methods.
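
    In its simplest single-epoch form (a hedged sketch; the actual tool propagates sampled trajectories over a whole time window with Runge-Kutta integration and interpolation), the MC estimator draws relative positions from the combined uncertainty and counts how often the miss distance falls below the combined hard-body radius:

      import numpy as np

      def mc_collision_probability(mu_rel, cov_rel, radius, n_samples=1_000_000):
          """Monte Carlo collision probability at a single epoch.

          mu_rel  : (3,) mean relative position of the two objects [m]
          cov_rel : (3, 3) combined position covariance [m^2]
          radius  : combined hard-body radius [m]
          """
          rng = np.random.default_rng(0)
          samples = rng.multivariate_normal(mu_rel, cov_rel, size=n_samples)
          miss = np.linalg.norm(samples, axis=1)  # each sample is independent,
          return np.mean(miss < radius)           # ideal for one thread per sample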

  1. Seismic interpretation using Support Vector Machines implemented on Graphics Processing Units

    Energy Technology Data Exchange (ETDEWEB)

    Kuzma, H A; Rector, J W; Bremer, D

    2006-06-22

    Support Vector Machines (SVMs) estimate lithologic properties of rock formations from seismic data by interpolating between known models using synthetically generated model/data pairs. SVMs are related to kriging and radial basis function neural networks. In our study, we train an SVM to approximate an inverse to the Zoeppritz equations. Training models are sampled from distributions constructed from well-log statistics. Training data is computed via a physically realistic forward modeling algorithm. In our experiments, each training data vector is a set of seismic traces similar to a 2-d image. The SVM returns a model given by a weighted comparison of the new data to each training data vector. The method of comparison is given by a kernel function which implicitly transforms data into a high-dimensional feature space and performs a dot-product. The feature space of a Gaussian kernel is made up of sines and cosines and so is appropriate for band-limited seismic problems. Training an SVM involves estimating a set of weights from the training model/data pairs. It is designed to be an easy problem; at worst it is a quadratic programming problem on the order of the size of the training set. By implementing the slowest part of our SVM algorithm on a graphics processing unit (GPU), we improve the speed of the algorithm by two orders of magnitude. Our SVM/GPU combination achieves results that are similar to those of conventional iterative inversion in fractions of the time.
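
    A hedged sketch of the prediction step described here, the Gaussian-kernel weighted comparison of new data against the stored training vectors (the quadratic-programming training step that produces the weights is omitted); this dense kernel evaluation is the kind of operation that was offloaded to the GPU:

      import numpy as np

      def svm_predict(X_train, weights, bias, X_new, gamma=0.1):
          """Kernel-machine prediction: weighted Gaussian-kernel comparison.

          X_train : (n_train, d) training data vectors
          weights : (n_train,) weights estimated during training (QP step)
          X_new   : (n_new, d) new data to invert
          """
          # Pairwise squared distances -> Gaussian (RBF) kernel matrix,
          # i.e. the implicit feature-space dot product.
          d2 = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
          K = np.exp(-gamma * d2)
          return K @ weights + bias    # model estimate for each new datum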

  2. Multidisciplinary Simulation Acceleration using Multiple Shared-Memory Graphical Processing Units

    Science.gov (United States)

    Kemal, Jonathan Yashar

    For purposes of optimizing and analyzing turbomachinery and other designs, the unsteady Favre-averaged flow-field differential equations for an ideal compressible gas can be solved in conjunction with the heat conduction equation. We solve all equations using the finite-volume multiple-grid numerical technique, with the dual time-step scheme used for unsteady simulations. Our numerical solver code targets CUDA-capable Graphical Processing Units (GPUs) produced by NVIDIA. Making use of MPI, our solver can run across networked compute nodes, where each MPI process can use either a GPU or a Central Processing Unit (CPU) core for primary solver calculations. We use NVIDIA Tesla C2050/C2070 GPUs based on the Fermi architecture, and compare our resulting performance against Intel Xeon X5690 CPUs. Solver routines converted to CUDA typically run about 10 times faster on a GPU for sufficiently dense computational grids. We used a conjugate cylinder computational grid and ran a turbulent steady flow simulation using 4 increasingly dense computational grids. Our densest computational grid is divided into 13 blocks each containing 1033x1033 grid points, for a total of 13.87 million grid points or 1.07 million grid points per domain block. To obtain overall speedups, we compare the execution time of the solver's iteration loop, including all resource-intensive GPU-related memory copies. Comparing the performance of 8 GPUs to that of 8 CPUs, we obtain an overall speedup of about 6.0 when using our densest computational grid. This amounts to an 8-GPU simulation running about 39.5 times faster than a single-CPU simulation.

  3. Massively parallel signal processing using the graphics processing unit for real-time brain-computer interface feature extraction

    Directory of Open Access Journals (Sweden)

    J. Adam Wilson

    2009-07-01

    Full Text Available The clock speeds of modern computer processors have nearly plateaued in the past five years. Consequently, neural prosthetic systems that rely on processing large quantities of data in a short period of time face a bottleneck, in that it may not be possible to process all of the data recorded from an electrode array with high channel counts and bandwidth, such as electrocorticographic grids or other implantable systems. Therefore, in this study a method of using the processing capabilities of a graphics card (GPU) was developed for real-time neural signal processing of a brain-computer interface (BCI). The NVIDIA CUDA system was used to offload processing to the GPU, which is capable of running many operations in parallel, potentially greatly increasing the speed of existing algorithms. The BCI system records many channels of data, which are processed and translated into a control signal, such as the movement of a computer cursor. This signal processing chain involves computing a matrix-matrix multiplication (i.e., a spatial filter), followed by calculating the power spectral density on every channel using an auto-regressive method, and finally classifying appropriate features for control. In this study, the first two computationally intensive steps were implemented on the GPU, and the speed was compared to both the current implementation and a CPU-based implementation that uses multi-threading. Significant performance gains were obtained with GPU processing: the current implementation processed 1000 channels in 933 ms, while the new GPU method took only 27 ms, an improvement of nearly 35 times.
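
    A hedged sketch of the first, most parallelizable part of this chain, spatial filtering as a matrix-matrix multiply followed by a band-power feature (a simple periodogram stands in here for the auto-regressive spectral estimator used in the study):

      import numpy as np

      def extract_features(X, W, fs, band=(8.0, 12.0)):
          """Spatial filtering + band-power features (illustrative sketch).

          X : (n_channels, n_samples) raw signals sampled at fs [Hz]
          W : (n_filters, n_channels) spatial filter matrix
          """
          S = W @ X                                  # spatial filtering (GEMM)
          freqs = np.fft.rfftfreq(S.shape[1], 1 / fs)
          psd = np.abs(np.fft.rfft(S, axis=1)) ** 2 / S.shape[1]
          sel = (freqs >= band[0]) & (freqs <= band[1])
          return psd[:, sel].mean(axis=1)            # mean band power per filter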

  4. Animated-simulation modeling facilitates clinical-process costing.

    Science.gov (United States)

    Zelman, W N; Glick, N D; Blackmore, C C

    2001-09-01

    Traditionally, the finance department has assumed responsibility for assessing process costs in healthcare organizations. To enhance process-improvement efforts, however, many healthcare providers need to include clinical staff in process cost analysis. Although clinical staff often use electronic spreadsheets to model the cost of specific processes, PC-based animated-simulation tools offer two major advantages over spreadsheets: they allow clinicians to interact more easily with the costing model so that it more closely represents the process being modeled, and they represent cost output as a cost range rather than as a single cost estimate, thereby providing more useful information for decision making.

  5. Full Stokes finite-element modeling of ice sheets using a graphics processing unit

    Science.gov (United States)

    Seddik, H.; Greve, R.

    2016-12-01

    Thermo-mechanical simulation of ice sheets is an important approach to understand and predict their evolution in a changing climate. For that purpose, higher-order (e.g., ISSM, BISICLES) and full Stokes (e.g., Elmer/Ice, http://elmerice.elmerfem.org) models are increasingly used to model the flow of entire ice sheets more accurately. In parallel to this development, the rapidly improving performance and capabilities of Graphics Processing Units (GPUs) make it possible to efficiently offload more calculations of complex and computationally demanding problems onto those devices. Thus, in order to continue the trend of using full Stokes models at greater resolutions, the use of GPUs should be considered for the implementation of ice sheet models. We developed the GPU-accelerated ice-sheet model Sainō. Sainō is an Elmer (http://www.csc.fi/english/pages/elmer) derivative implemented in Objective-C which solves the full Stokes equations with the finite element method. It uses the standard OpenCL language (http://www.khronos.org/opencl/) to offload the assembly of the finite element matrix onto the GPU. A mesh-coloring scheme is used so that elements with the same color (sharing no nodes) are assembled in parallel on the GPU without the need for synchronization primitives. The current implementation shows that, for the ISMIP-HOM experiment A, during the matrix assembly in double precision with 8000, 87,500 and 252,000 brick elements, Sainō is respectively 2x, 10x and 14x faster than Elmer/Ice (when both models are run on a single processing unit). In single precision, Sainō is even 3x, 20x and 25x faster than Elmer/Ice. A detailed description of the comparative results between Sainō and Elmer/Ice will be presented, along with further perspectives on optimization and the limitations of the current implementation.
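
    The mesh-coloring idea can be conveyed with a hedged greedy sketch (Sainō's actual coloring scheme may differ): two elements conflict when they share a node, so all elements of one color can scatter their local contributions into the global matrix concurrently without synchronization:

      def color_elements(elements):
          """Greedy coloring so same-color elements share no nodes.

          elements : list of node-index tuples, one per finite element.
          Returns a color per element; each color class can be
          assembled in parallel without atomic updates.
          """
          node_colors = {}                  # node -> set of adjacent colors
          colors = []
          for elem in elements:
              used = set()
              for node in elem:
                  used |= node_colors.get(node, set())
              c = 0
              while c in used:              # smallest color unused by any
                  c += 1                    # element touching these nodes
              colors.append(c)
              for node in elem:
                  node_colors.setdefault(node, set()).add(c)
          return colors

      # Two triangles sharing an edge must get different colors:
      print(color_elements([(0, 1, 2), (1, 2, 3), (4, 5, 6)]))  # -> [0, 1, 0]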

  6. In-Situ Statistical Analysis of Autotune Simulation Data using Graphical Processing Units

    Energy Technology Data Exchange (ETDEWEB)

    Ranjan, Niloo [ORNL; Sanyal, Jibonananda [ORNL; New, Joshua Ryan [ORNL

    2013-08-01

    Developing accurate building energy simulation models to assist energy efficiency at speed and scale is one of the research goals of the Whole-Building and Community Integration group, which is a part of the Building Technologies Research and Integration Center (BTRIC) at Oak Ridge National Laboratory (ORNL). The aim of the Autotune project is to speed up the automated calibration of building energy models to match measured utility or sensor data. The workflow of this project takes input parameters and runs EnergyPlus simulations on Oak Ridge Leadership Computing Facility's (OLCF) computing resources such as Titan, the world's second fastest supercomputer. Multiple simulations run in parallel on nodes having 16 processors each and a Graphics Processing Unit (GPU). Each node produces a 5.7 GB output file comprising 256 files from 64 simulations. Four types of output data, covering monthly, daily, hourly, and 15-minute time steps for each annual simulation, are produced. A total of more than 270 TB of data has been produced. In this project, the simulation data is statistically analyzed in situ using GPUs while annual simulations are being computed on the traditional processors. Titan, with its recent addition of 18,688 Compute Unified Device Architecture (CUDA) capable NVIDIA GPUs, has greatly extended its capability for massively parallel data processing. CUDA is used along with C/MPI to calculate statistical metrics such as sum, mean, variance, and standard deviation, leveraging GPU acceleration. The workflow developed in this project produces statistical summaries of the data, which reduces by multiple orders of magnitude the time and amount of data that needs to be stored. These statistical capabilities are anticipated to be useful for sensitivity analysis of EnergyPlus simulations.
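
    A hedged sketch of the statistics kernel involved: partial (count, mean, M2) summaries can be computed independently, e.g., one per GPU thread block or per simulation file, and then merged exactly with the standard parallel-variance combination, which is what makes in-situ reduction of terabyte-scale output cheap:

      def combine(stats_a, stats_b):
          """Merge two partial (count, mean, M2) summaries exactly.

          M2 is the sum of squared deviations from the mean, so
          variance = M2 / (count - 1). Partial summaries can be
          produced independently (e.g., one per GPU thread block).
          """
          n_a, mean_a, m2_a = stats_a
          n_b, mean_b, m2_b = stats_b
          n = n_a + n_b
          delta = mean_b - mean_a
          mean = mean_a + delta * n_b / n
          m2 = m2_a + m2_b + delta ** 2 * n_a * n_b / n
          return (n, mean, m2)

      # Combining summaries of [1, 2] and [3, 4, 5] recovers those of [1..5]:
      print(combine((2, 1.5, 0.5), (3, 4.0, 2.0)))  # -> (5, 3.0, 10.0)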

  7. Computing the Density Matrix in Electronic Structure Theory on Graphics Processing Units.

    Science.gov (United States)

    Cawkwell, M J; Sanville, E J; Mniszewski, S M; Niklasson, Anders M N

    2012-11-13

    The self-consistent solution of a Schrödinger-like equation for the density matrix is a critical and computationally demanding step in quantum-based models of interatomic bonding. This step was tackled historically via the diagonalization of the Hamiltonian. We have investigated the performance and accuracy of the second-order spectral projection (SP2) algorithm for the computation of the density matrix via a recursive expansion of the Fermi operator in a series of generalized matrix-matrix multiplications. We demonstrate that owing to its simplicity, the SP2 algorithm [Niklasson, A. M. N. Phys. Rev. B 2002, 66, 155115] is exceptionally well suited to implementation on graphics processing units (GPUs). The performance in double and single precision arithmetic of hybrid GPU/central processing unit (CPU) and full GPU implementations of the SP2 algorithm exceeds that of a CPU-only implementation of the SP2 algorithm and of traditional matrix diagonalization when the dimensions of the matrices exceed about 2000 × 2000. Padding schemes for arrays allocated in the GPU memory that optimize the performance of the CUBLAS implementations of the level 3 BLAS DGEMM and SGEMM subroutines for generalized matrix-matrix multiplications are described in detail. The analysis of the relative performance of the hybrid CPU/GPU and full GPU implementations indicates that the transfer of arrays between the GPU and CPU constitutes only a small fraction of the total computation time. The errors measured in the self-consistent density matrices computed using the SP2 algorithm are generally smaller than those measured in matrices computed via diagonalization. Furthermore, the errors in the density matrices computed using the SP2 algorithm do not exhibit any dependence on system size, whereas the errors increase linearly with the number of orbitals when diagonalization is employed.
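
    The SP2 recursion itself is short enough to sketch (a hedged serial version; spectral bounds here come from cheap Gershgorin estimates, and every iteration is one generalized matrix-matrix multiply, which is why the algorithm maps so well onto GEMM-oriented GPUs):

      import numpy as np

      def sp2_density_matrix(H, n_occ, n_iters=60):
          """Second-order spectral projection (SP2) density matrix sketch.

          H     : symmetric Hamiltonian matrix
          n_occ : number of occupied orbitals (target trace of the result)
          """
          # Spectral bounds via Gershgorin circles (cheap, safe estimate).
          r = np.sum(np.abs(H), axis=1) - np.abs(np.diag(H))
          e_min = np.min(np.diag(H) - r)
          e_max = np.max(np.diag(H) + r)
          X = (e_max * np.eye(len(H)) - H) / (e_max - e_min)  # eigenvalues in [0, 1]
          for _ in range(n_iters):
              X2 = X @ X                     # the GEMM that dominates the cost
              if np.trace(X) > n_occ:
                  X = X2                     # push small eigenvalues toward 0
              else:
                  X = 2 * X - X2             # push large eigenvalues toward 1
          return X                           # -> idempotent, trace n_occ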

  8. Lean Cost Management Analysis on Food Processing Enterprise

    Directory of Open Access Journals (Sweden)

    Jing Ma

    2014-06-01

    Full Text Available The aim of this study is to introduce Lean Cost Management (LCM), which seeks to create value for customers and to perform whole-cost management over an enterprise's entire life cycle under a structure of target costing, cost sustaining and cost improvement guided by reverse thinking, into food processing enterprises, and to construct an LCM system from the perspectives of external value chain analysis as well as internal cost management. A dynamic pricing game model was used to drive cost improvement along the food enterprise value chain so as to minimize whole cost. The target cost was allocated to each part in the design phase, supported by cost programming, cost reduction and cost improvement. A case study shows that this cost-suppression method can reduce the costs of food processing enterprises and improve their long-term competitiveness.

  9. Development of MATLAB-Based Digital Signal Processing Teaching Module with Graphical User Interface Environment for Nigerian University

    OpenAIRE

    2013-01-01

    The development of a teaching-aid module for digital signal processing (DSP) in Nigerian universities was undertaken to address the problem of the non-availability of instructional modules. This paper combines the potential of Peripheral Interface Controllers (PICs) with MATLAB resources to develop a PIC-based system with a graphical user interface environment suitable for data acquisition and signal processing. The module accepts data from three different sources: real time acquisition, pre-r...

  10. The PC graphics handbook

    CERN Document Server

    Sanchez, Julio

    2003-01-01

    Part I - Graphics Fundamentals PC GRAPHICS OVERVIEW History and Evolution Short History of PC Video PS/2 Video Systems SuperVGA Graphics Coprocessors and Accelerators Graphics Applications State-of-the-Art in PC Graphics 3D Application Programming Interfaces POLYGONAL MODELING Vector and Raster Data Coordinate Systems Modeling with Polygons IMAGE TRANSFORMATIONS Matrix-based Representations Matrix Arithmetic 3D Transformations PROGRAMMING MATRIX TRANSFORMATIONS Numeric Data in Matrix Form Array Processing PROJECTIONS AND RENDERING Perspective The Rendering Pipeline LIGHTING AND SHADING Lightin

  11. Graphics gems II

    CERN Document Server

    Arvo, James

    1991-01-01

    Graphics Gems II is a collection of articles shared by a diverse group of people that reflect ideas and approaches in graphics programming which can benefit other computer graphics programmers.This volume presents techniques for doing well-known graphics operations faster or easier. The book contains chapters devoted to topics on two-dimensional and three-dimensional geometry and algorithms, image processing, frame buffer techniques, and ray tracing techniques. The radiosity approach, matrix techniques, and numerical and programming techniques are likewise discussed.Graphics artists and comput

  12. Analysis of Unit Process Cost for an Engineering-Scale Pyroprocess Facility Using a Process Costing Method in Korea

    National Research Council Canada - National Science Library

    Sungki Kim; Wonil Ko; Sungsig Bang

    2015-01-01

    ...) metal ingots in a high-temperature molten salt phase. This paper provides the unit process cost of a pyroprocess facility that can process up to 10 tons of pyroprocessing product per year by utilizing the process costing method...

  13. Process Approach to Cost of Quality

    Directory of Open Access Journals (Sweden)

    Katarzyna Szczepańska

    2009-12-01

    Full Text Available The contemporary understanding of the cost of quality covers both the sphere of management and that of operational activity. The boundaries within which the category of quality costs is considered have moved beyond the purely technical and technological area. An inclusive treatment of quality costs allows them to be classified with reference to all actions carried out in the modern enterprise. Standard models of quality costs and activity-based costing have opened new prospects for the economics of quality in business administration.

  14. Large scale neural circuit mapping data analysis accelerated with the graphical processing unit (GPU)

    Science.gov (United States)

    Shi, Yulin; Veidenbaum, Alexander V.; Nicolau, Alex; Xu, Xiangmin

    2014-01-01

    Background: Modern neuroscience research demands computing power. Neural circuit mapping studies such as those using laser scanning photostimulation (LSPS) produce large amounts of data and require intensive computation for post-hoc processing and analysis. New Method: Here we report on the design and implementation of a cost-effective desktop computer system for accelerated experimental data processing with recent GPU computing technology. A new version of Matlab software with GPU-enabled functions is used to develop programs that run on Nvidia GPUs to harness their parallel computing power. Results: We evaluated both the central processing unit (CPU) and GPU-enabled computational performance of our system in benchmark testing and practical applications. The experimental results show that the GPU-CPU co-processing of simulated data and actual LSPS experimental data clearly outperformed the multi-core CPU, with up to a 22x speedup depending on the computational task. Further, we present a comparison of numerical accuracy between GPU and CPU computation to verify the precision of GPU computation. In addition, we show how GPUs can be effectively adapted to improve the performance of commercial image processing software such as Adobe Photoshop. Comparison with Existing Method(s): To the best of our knowledge, this is the first demonstration of GPU application in neural circuit mapping and electrophysiology-based data processing. Conclusions: Together, GPU-enabled computation enhances our ability to process large-scale data sets derived from neural circuit mapping studies, allowing for increased processing speeds while retaining data precision. PMID:25277633

  15. Accelerating Wright-Fisher Forward Simulations on the Graphics Processing Unit.

    Science.gov (United States)

    Lawrie, David S

    2017-09-07

    Forward Wright-Fisher simulations are powerful in their ability to model complex demography and selection scenarios, but suffer from slow execution on the Central Processing Unit (CPU), thus limiting their usefulness. However, the single-locus Wright-Fisher forward algorithm is exceedingly parallelizable, with many steps that are so-called "embarrassingly parallel," consisting of a vast number of individual computations that are all independent of each other and thus capable of being performed concurrently. The rise of modern Graphics Processing Units (GPUs) and of programming languages designed to leverage the inherent parallel nature of these processors has allowed researchers to dramatically speed up many programs that have such high arithmetic intensity and intrinsic concurrency. The presented GPU Optimized Wright-Fisher simulation, or "GO Fish" for short, can be used to simulate arbitrary selection and demographic scenarios while running over 250-fold faster than its serial counterpart on the CPU. Even modest GPU hardware can achieve an impressive speedup of over two orders of magnitude. With simulations so accelerated, one can not only do quick parametric bootstrapping of previously estimated parameters, but also use simulated results to calculate the likelihoods and summary statistics of demographic and selection models against real polymorphism data, all without restricting the demographic and selection scenarios that can be modeled or requiring approximations to the single-locus forward algorithm for efficiency. Further, as many of the parallel programming techniques used in this simulation can be applied to other computationally intensive algorithms important in population genetics, GO Fish serves as an exciting template for future research into accelerating computation in evolution. GO Fish is part of the Parallel PopGen Package available at: http://dl42.github.io/ParallelPopGen/. Copyright © 2017 Lawrie.
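
    A hedged single-locus sketch of the forward algorithm being accelerated (illustrative genic selection and mutation terms; in GO Fish every independent site can live on its own GPU thread):

      import numpy as np

      def wright_fisher(n_sites, pop_size, n_gens, s=0.0, mu=1e-8, rng=None):
          """Forward Wright-Fisher simulation of allele frequencies.

          Each site evolves independently -> embarrassingly parallel.
          s : selection coefficient; mu : per-generation mutation rate.
          """
          rng = rng or np.random.default_rng(0)
          freq = np.zeros(n_sites)                 # start with no derived allele
          for _ in range(n_gens):
              freq = freq + mu * (1.0 - freq)      # new mutations
              w = freq * (1 + s) / (freq * (1 + s) + (1 - freq))  # selection
              freq = rng.binomial(2 * pop_size, w, size=n_sites) / (2 * pop_size)
          return freq                              # drift via binomial sampling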

  16. Parallel flow accumulation algorithms for graphical processing units with application to RUSLE model

    Science.gov (United States)

    Sten, Johan; Lilja, Harri; Hyväluoma, Jari; Westerholm, Jan; Aspnäs, Mats

    2016-04-01

    Digital elevation models (DEMs) are widely used in the modeling of surface hydrology, which typically includes the determination of flow directions and flow accumulation. The use of high-resolution DEMs increases the accuracy of flow accumulation computation, but as a drawback, the computational time may become excessively long if large areas are analyzed. In this paper we investigate the use of graphical processing units (GPUs) for efficient flow accumulation calculations. We present two new parallel flow accumulation algorithms based on dependency transfer and topological sorting and compare them to previously published flow-transfer and indegree-based algorithms. We benchmark the GPU implementations against industry standards, ArcGIS and SAGA. With the flow-transfer D8 flow routing model and binary input data, a speedup of 19 is achieved compared to ArcGIS and of 15 compared to SAGA. We show that on GPUs the topological sort-based flow accumulation algorithm leads on average to a speedup by a factor of 7 over the flow-transfer algorithm. Thus a total speedup of the order of 100 is achieved. We test the algorithms by applying them to the Revised Universal Soil Loss Equation (RUSLE) erosion model. For this purpose we present parallel versions of the slope, LS factor and RUSLE algorithms and show that the RUSLE erosion results for an area of 12 km x 24 km containing 72 million cells can be calculated in less than a second. Since flow accumulation is needed in many hydrological models, the developed algorithms may find use in many applications other than RUSLE modeling. The algorithm based on topological sorting is particularly promising for dynamic hydrological models where flow accumulations are repeatedly computed over an unchanged DEM.
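
    A hedged serial sketch of the indegree/topological-sorting idea (the GPU version processes all cells of the same topological level in parallel): treat the D8 flow directions as a directed acyclic graph and push accumulated area downstream in topological order:

      from collections import deque

      def flow_accumulation(downstream):
          """Flow accumulation on a D8 drainage graph via topological sorting.

          downstream[i] is the index of the cell that cell i drains to,
          or None for outlets. Returns the number of cells draining
          through each cell (itself included).
          """
          n = len(downstream)
          acc = [1] * n                     # each cell contributes itself
          indeg = [0] * n
          for d in downstream:
              if d is not None:
                  indeg[d] += 1
          ready = deque(i for i in range(n) if indeg[i] == 0)
          while ready:                      # cells at the same 'level' are
              i = ready.popleft()           # independent -> GPU parallelism
              d = downstream[i]
              if d is None:
                  continue
              acc[d] += acc[i]
              indeg[d] -= 1
              if indeg[d] == 0:
                  ready.append(d)
          return acc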

  17. A New Method Based on Graphics Processing Units for Fast Near-Infrared Optical Tomography.

    Science.gov (United States)

    Jiang, Jingjing; Ahnen, Linda; Kalyanov, Alexander; Lindner, Scott; Wolf, Martin; Majos, Salvador Sanchez

    2017-01-01

    The accuracy of images obtained by Diffuse Optical Tomography (DOT) could be substantially increased by the newly developed time-resolved (TR) cameras. These devices produce unprecedented data volumes, which present a challenge to conventional image reconstruction techniques. In addition, many clinical applications require taking photons in air regions like the trachea into account, where the diffusion model fails. Image reconstruction techniques based on photon tracking are mandatory in those cases but have not been implemented so far due to computing demands. We aimed at designing an inversion algorithm which could be implemented on commercial graphics processing units (GPUs) by making use of information obtained with other imaging modalities. The method requires a segmented volume and an approximately uniform value for the reduced scattering coefficient in the volume under study. The complex photon path is reduced to a small number of partial path lengths within each segment, resulting in drastically reduced memory usage and computation time. Our approach takes advantage of wavelength-normalized data, which renders it robust against instrumental biases and skin irregularities; this is critical for realistic clinical applications. The accuracy of this method has been assessed with both simulated and experimental inhomogeneous phantoms, showing good agreement with target values. The simulation study analyzed a phantom containing a tumor next to an air region. For the experimental test, a segmented cuboid phantom was illuminated by a supercontinuum laser and data were gathered by a state-of-the-art TR camera. Reconstructions were obtained on a GPU-installed computer in less than 2 h. To our knowledge, it is the first time Monte Carlo methods have been successfully used for DOT based on TR cameras. This opens the door to applications such as accurate measurements of oxygenation in neck tumors, where the presence of air regions is a problem for conventional approaches.
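
    The partial-path-length trick can be conveyed with a hedged sketch: once each detected photon's path length within every tissue segment is stored, re-scoring the detector signal for new absorption coefficients is a single Beer-Lambert weighting per photon, with no re-tracking:

      import numpy as np

      def detected_weights(path_lengths, mu_a):
          """Re-score stored photon paths for new absorption coefficients.

          path_lengths : (n_photons, n_segments) partial path length of each
                         photon in each tissue segment [mm]
          mu_a         : (n_segments,) absorption coefficients [1/mm]
          """
          return np.exp(-path_lengths @ mu_a)   # Beer-Lambert per photon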

  18. Computer graphics application in the engineering design integration system

    Science.gov (United States)

    Glatt, C. R.; Abel, R. W.; Hirsch, G. N.; Alford, G. E.; Colquitt, W. N.; Stewart, W. A.

    1975-01-01

    The computer graphics aspect of the Engineering Design Integration (EDIN) system and its application to design problems were discussed. Three basic types of computer graphics may be used with the EDIN system for the evaluation of preliminary designs of aerospace vehicles: offline graphics systems using vellum-inking or photographic processes, online graphics systems characterized by directly coupled, low-cost storage tube terminals with limited interactive capabilities, and a minicomputer-based refresh terminal offering highly interactive capabilities. The offline systems are characterized by high quality (resolution better than 0.254 mm) and slow turnaround (one to four days). The online systems are characterized by low cost, instant visualization of the computer results, slow line speed (300 BAUD), poor hard copy, and the early limitations on vector graphic input capabilities. The recent acquisition of the Adage 330 Graphic Display system has greatly enhanced the potential for interactive computer-aided design.

  19. Real-time Graphics Processing Unit Based Fourier Domain Optical Coherence Tomography and Surgical Applications

    Science.gov (United States)

    Zhang, Kang

    2011-12-01

    In this dissertation, real-time Fourier domain optical coherence tomography (FD-OCT) capable of multi-dimensional micrometer-resolution imaging, targeted specifically at microsurgical intervention applications, was developed and studied. As a part of this work several ultra-high speed real-time FD-OCT imaging and sensing systems were proposed and developed. A real-time 4D (3D+time) OCT system platform using the graphics processing unit (GPU) to accelerate OCT signal processing, image reconstruction, visualization, and volume rendering was developed. Several GPU-based algorithms such as non-uniform fast Fourier transform (NUFFT), numerical dispersion compensation, and multi-GPU implementation were developed to improve the impulse response, SNR roll-off and stability of the system. Full-range complex-conjugate-free FD-OCT was also implemented on the GPU architecture to achieve a doubled imaging range and improved SNR. These technologies overcome the image reconstruction and visualization bottlenecks that widely exist in current ultra-high speed FD-OCT systems and open the way to interventional OCT imaging for applications in guided microsurgery. A hand-held common-path optical coherence tomography (CP-OCT) distance-sensor based microsurgical tool was developed and validated. Through real-time signal processing, edge detection and feedback control, the tool was shown to be capable of tracking a target surface and compensating for motion. A micro-incision test on a phantom was performed with the CP-OCT-sensor integrated hand-held tool, which showed an incision error of less than +/-5 microns, compared to an error of >100 microns for free-hand incision. The CP-OCT distance sensor has also been utilized to enhance the accuracy and safety of optical nerve stimulation. Finally, several experiments were conducted to validate the system for surgical applications. One of them involved 4D OCT guided micro-manipulation using a phantom. Multiple volume renderings of one 3D data set were

  20. Cost analysis of advanced turbine blade manufacturing processes

    Science.gov (United States)

    Barth, C. F.; Blake, D. E.; Stelson, T. S.

    1977-01-01

    A rigorous analysis was conducted to estimate relative manufacturing costs for high technology gas turbine blades prepared by three candidate materials process systems. The manufacturing costs for the same turbine blade configuration of directionally solidified eutectic alloy, an oxide dispersion strengthened superalloy, and a fiber reinforced superalloy were compared on a relative basis to the costs of the same blade currently in production utilizing the directional solidification process. An analytical process cost model was developed to quantitatively perform the cost comparisons. The impact of individual process yield factors on costs was also assessed as well as effects of process parameters, raw materials, labor rates and consumable items.

  1. Creating Interactive Graphical Overlays in the Advanced Weather Interactive Processing System (AWIPS) Using Shapefiles and DGM Files

    Science.gov (United States)

    Barrett, Joe H., III; Lafosse, Richard; Hood, Doris; Hoeth, Brian

    2007-01-01

    Graphical overlays can be created in real-time in the Advanced Weather Interactive Processing System (AWIPS) using shapefiles or DARE Graphics Metafile (DGM) files. This presentation describes how to create graphical overlays on-the-fly for AWIPS, by using two examples of AWIPS applications that were created by the Applied Meteorology Unit (AMU). The first example is the Anvil Threat Corridor Forecast Tool, which produces a shapefile that depicts a graphical threat corridor of the forecast movement of thunderstorm anvil clouds, based on the observed or forecast upper-level winds. This tool is used by the Spaceflight Meteorology Group (SMG) and 45th Weather Squadron (45 WS) to analyze the threat of natural or space vehicle-triggered lightning over a location. The second example is a launch and landing trajectory tool that produces a DGM file that plots the ground track of space vehicles during launch or landing. The trajectory tool can be used by SMG and the 45 WS forecasters to analyze weather radar imagery along a launch or landing trajectory. Advantages of both file types will be listed.

  2. Large eddy simulations of turbulent flows on graphics processing units: Application to film-cooling flows

    Science.gov (United States)

    Shinn, Aaron F.

    Computational Fluid Dynamics (CFD) simulations can be very computationally expensive, especially for Large Eddy Simulations (LES) and Direct Numerical Simulations (DNS) of turbulent flows. In LES the large, energy-containing eddies are resolved by the computational mesh, but the smaller (sub-grid) scales are modeled. In DNS, all scales of turbulence are resolved, including the smallest dissipative (Kolmogorov) scales. Clusters of CPUs have been the standard approach for such simulations, but an emerging approach is the use of Graphics Processing Units (GPUs), which deliver impressive computing performance compared to CPUs. Recently there has been great interest in the scientific computing community to use GPUs for general-purpose computation (such as the numerical solution of PDEs) rather than graphics rendering. To explore the use of GPUs for CFD simulations, an incompressible Navier-Stokes solver was developed for a GPU. This solver is capable of simulating unsteady laminar flows or performing an LES or DNS of turbulent flows. The Navier-Stokes equations are solved via a fractional-step method and are spatially discretized using the finite volume method on a Cartesian mesh. An immersed boundary method based on a ghost cell treatment was developed to handle flow past complex geometries. The implementation of these numerical methods had to suit the architecture of the GPU, which is designed for massive multithreading. The details of this implementation are described, along with strategies for performance optimization. Validation of the GPU-based solver was performed for fundamental benchmark problems, and a performance assessment indicated that the solver was over an order of magnitude faster compared to a CPU. The GPU-based Navier-Stokes solver was used to study film-cooling flows via Large Eddy Simulation. In modern gas turbine engines, the film-cooling method is used to protect turbine blades from hot combustion gases. Therefore, understanding the physics of

  3. Low-cost digital image processing at the University of Oklahoma

    Science.gov (United States)

    Harrington, J. A., Jr.

    1981-01-01

    Computer-assisted instruction in remote sensing at the University of Oklahoma involves two separate approaches and is dependent upon initial preprocessing of a LANDSAT computer compatible tape using software developed for an IBM 370/158 computer. In-house generated preprocessing algorithms permit students or researchers to select a subset of a LANDSAT scene for subsequent analysis using either general purpose statistical packages or color graphic image processing software developed for Apple II microcomputers. Procedures for preprocessing the data and for image analysis using either of the two approaches to low-cost LANDSAT data processing are described.

  4. Compressed sensing reconstruction for whole-heart imaging with 3D radial trajectories: a graphics processing unit implementation.

    Science.gov (United States)

    Nam, Seunghoon; Akçakaya, Mehmet; Basha, Tamer; Stehning, Christian; Manning, Warren J; Tarokh, Vahid; Nezafat, Reza

    2013-01-01

    A disadvantage of three-dimensional (3D) isotropic acquisition in whole-heart coronary MRI is the prolonged data acquisition time. Isotropic 3D radial trajectories allow undersampling of k-space data in all three spatial dimensions, enabling accelerated acquisition of the volumetric data. Compressed sensing (CS) reconstruction can provide further acceleration in the acquisition by removing the incoherent artifacts due to undersampling and improving the image quality. However, the heavy computational overhead of the CS reconstruction has been a limiting factor for its application. In this article, a parallelized implementation of an iterative CS reconstruction method for 3D radial acquisitions using a commercial graphics processing unit is presented. The execution time of the GPU-implemented CS reconstruction was compared with that of the C++ implementation, and the efficacy of the undersampled 3D radial acquisition with CS reconstruction was investigated in both phantom and whole-heart coronary data sets. Subsequently, the efficacy of CS in suppressing streaking artifacts in 3D whole-heart coronary MRI with 3D radial imaging and its convergence properties were studied. The CS reconstruction provides improved image quality (in terms of vessel sharpness and suppression of noise-like artifacts) compared with the conventional 3D gridding algorithm, and the GPU implementation greatly reduces the execution time of the CS reconstruction, yielding a 34-54 times speed-up compared with the C++ implementation. Copyright © 2012 Wiley Periodicals, Inc.

  5. Low cost 3D scanning process using digital image processing

    Science.gov (United States)

    Aguilar, David; Romero, Carlos; Martínez, Fernando

    2017-02-01

    This paper shows the design and construction of a low-cost 3D scanner, able to digitize solid objects through contactless data acquisition using active object reflection. 3D scanners are used in different applications such as science, engineering and entertainment; they are classified into contact and contactless scanners, the latter being the most widely used but also expensive. This low-cost prototype performs a vertical scan of the object using a fixed camera and a moving horizontal laser line, which is deformed according to the 3-dimensional surface of the solid. The deformation detected by the camera is analyzed using digital image processing, which allows the 3D coordinates to be determined by triangulation. The resulting information is processed by a Matlab script, which gives the user a point cloud corresponding to each horizontal scan performed. The results show acceptable quality and significant detail in the digitized objects, making this prototype (built on a LEGO Mindstorms NXT kit) a versatile and cheap tool that can be used in many applications, mainly by engineering students.
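
    A hedged sketch of the triangulation step under one convenient geometry (all symbols below are illustrative assumptions, not the paper's calibration): with the camera at the origin looking along +Z and the laser plane x = b - z·tan(theta), a pixel column u on a camera with focal length f (in pixels) corresponds to the ray x = (u/f)·z, and intersecting ray and plane fixes the depth z = b·f / (u + f·tan(theta)):

      import numpy as np

      def laser_depth(u_px, f_px, baseline, theta):
          """Depth from laser-line triangulation (illustrative geometry).

          u_px     : horizontal pixel coordinate of the laser line
          f_px     : camera focal length in pixels
          baseline : camera-laser offset along x [m]
          theta    : laser plane tilt toward the optical axis [rad]
          """
          return baseline * f_px / (u_px + f_px * np.tan(theta))

      # 1000 px focal length, 10 cm baseline, 30 degree laser angle:
      print(laser_depth(50.0, 1000.0, 0.10, np.radians(30)))  # ~0.159 m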

  6. R graphics

    CERN Document Server

    Murrell, Paul

    2005-01-01

    R is revolutionizing the world of statistical computing. Powerful, flexible, and best of all free, R is now the program of choice for tens of thousands of statisticians. Destined to become an instant classic, R Graphics presents the first complete, authoritative exposition on the R graphical system. Paul Murrell, widely known as the leading expert on R graphics, has developed an in-depth resource that takes nothing for granted and helps both neophyte and seasoned users master the intricacies of R graphics. After an introductory overview of R graphics facilities, the presentation first focuses

  7. Process-based Cost Estimation for Ramjet/Scramjet Engines

    Science.gov (United States)

    Singh, Brijendra; Torres, Felix; Nesman, Miles; Reynolds, John

    2003-01-01

    Process-based cost estimation plays a key role in effecting the cultural change that integrates distributed science, technology and engineering teams to rapidly create innovative and affordable products. Working together, NASA Glenn Research Center and Boeing Canoga Park have developed a methodology of process-based cost estimation bridging the methodologies of high-level parametric models and detailed bottoms-up estimation. The NASA GRC/Boeing CP process-based cost model provides a probabilistic structure of layered cost drivers. High-level inputs characterize mission requirements, system performance, and relevant economic factors. Design alternatives are extracted from a standard, product-specific work breakdown structure to pre-load lower-level cost driver inputs and generate the cost-risk analysis. As the product design progresses and matures, the lower-level, more detailed cost drivers can be re-accessed and the projected variation of input values narrowed, thereby generating a progressively more accurate estimate of cost-risk. Incorporated into the process-based cost model are techniques for decision analysis, specifically the analytic hierarchy process (AHP) and functional utility analysis. Design alternatives may then be evaluated not just on cost-risk, but also on user-defined performance and schedule criteria. This implementation of full trade-study support contributes significantly to the realization of the integrated development environment. The process-based cost estimation model generates development and manufacturing cost estimates. The development team plans to expand the manufacturing process base from approximately 80 manufacturing processes to over 250 processes. Operation and support cost modeling is also envisioned. Process-based estimation considers the materials, resources, and processes in establishing cost-risk; rather than depending on weight as an input, it actually estimates weight along with cost and schedule.

  8. Ab initio nonadiabatic dynamics of multichromophore complexes: a scalable graphical-processing-unit-accelerated exciton framework.

    Science.gov (United States)

    Sisto, Aaron; Glowacki, David R; Martinez, Todd J

    2014-09-16

    ("fragmenting") a molecular system and then stitching it back together. In this Account, we address both of these problems, the first by using graphical processing units (GPUs) and electronic structure algorithms tuned for these architectures and the second by using an exciton model as a framework in which to stitch together the solutions of the smaller problems. The multitiered parallel framework outlined here is aimed at nonadiabatic dynamics simulations on large supramolecular multichromophoric complexes in full atomistic detail. In this framework, the lowest tier of parallelism involves GPU-accelerated electronic structure theory calculations, for which we summarize recent progress in parallelizing the computation and use of electron repulsion integrals (ERIs), which are the major computational bottleneck in both density functional theory (DFT) and time-dependent density functional theory (TDDFT). The topmost tier of parallelism relies on a distributed memory framework, in which we build an exciton model that couples chromophoric units. Combining these multiple levels of parallelism allows access to ground and excited state dynamics for large multichromophoric assemblies. The parallel excitonic framework is in good agreement with much more computationally demanding TDDFT calculations of the full assembly.

  9. Using wesBench to Study the Rendering Performance of Graphics Processing Units

    Energy Technology Data Exchange (ETDEWEB)

    Bethel, Edward W

    2010-01-08

    Graphics rendering consists of two broad classes of operations. The first, which we refer to here as vertex operations, consists of transformation, lighting, primitive assembly, and so forth. The second, which we refer to as pixel or fragment operations, consists of rasterization, texturing, scissoring, blending, and fill. Overall GPU rendering performance is a function of the throughput of both these interdependent stages: if one stage is slower than the other, the faster stage will be forced to run more slowly and overall rendering performance will be adversely affected. This relationship is commutative: if the later stage has a greater workload than the earlier stage, the earlier stage will be forced to slow down. For example, a large triangle that covers many screen pixels will incur a very small amount of work in the vertex stage while at the same time incurring a relatively large amount of work in the fragment stage. Rendering performance of a scene consisting of many large-area triangles will be limited by the throughput of the fragment stage, which will have relatively more work than the vertex stage. There are two main objectives for this document. First, we introduce a new graphics benchmark, wesBench, which is useful for measuring the performance of both stages of the rendering pipeline under varying conditions. Second, we present its methodology for measuring performance and show the results of several performance measurement studies aimed at producing a better understanding of GPU rendering performance characteristics and limits under varying configurations. First, in Section 2, we explore the 'crossover' point between geometry and rasterization. Second, in Section 3, we explore additional performance characteristics, some of which are ill- or un-documented. Lastly, several appendices provide additional material concerning problems with the gfxbench benchmark, and details about the new wesBench graphics benchmark.

  10. Process-Improvement Cost Model for the Emergency Department.

    Science.gov (United States)

    Dyas, Sheila R; Greenfield, Eric; Messimer, Sherri; Thotakura, Swati; Gholston, Sampson; Doughty, Tracy; Hays, Mary; Ivey, Richard; Spalding, Joseph; Phillips, Robin

    2015-01-01

    The objective of this report is to present a simplified, activity-based costing approach for hospital emergency departments (EDs) to use with Lean Six Sigma cost-benefit analyses. The cost model complexity is reduced by removing diagnostic and condition-specific costs, thereby revealing the underlying process activities' cost inefficiencies. Examples are provided for evaluating the cost savings from reducing discharge delays and the cost impact of keeping patients in the ED (boarding) after the decision to admit has been made. The process-improvement cost model provides a needed tool in selecting, prioritizing, and validating Lean process-improvement projects in the ED and other areas of patient care that involve multiple dissimilar diagnoses.

  11. Costs of predator-induced phenotypic plasticity: a graphical model for predicting the contribution of nonconsumptive and consumptive effects of predators on prey.

    Science.gov (United States)

    Peacor, Scott D; Peckarsky, Barbara L; Trussell, Geoffrey C; Vonesh, James R

    2013-01-01

    Defensive modifications in prey traits that reduce predation risk can also have negative effects on prey fitness. Such nonconsumptive effects (NCEs) of predators are common, often quite strong, and can even dominate the net effect of predators. We develop an intuitive graphical model to identify and explore the conditions promoting strong NCEs. The model illustrates two conditions necessary and sufficient for large NCEs: (1) trait change has a large cost, and (2) the benefit of reduced predation outweighs the costs, such as reduced growth rate. A corollary condition is that potential predation in the absence of trait change must be large. In fact, the sum total of the consumptive effects (CEs) and NCEs may be any value bounded by the magnitude of the predation rate in the absence of the trait change. The model further illustrates how, depending on the effect of increased trait change on resulting costs and benefits, any combination of strong and weak NCEs and CEs is possible. The model can also be used to examine how changes in environmental factors (e.g., refuge safety) or variation among predator-prey systems (e.g., different benefits of a prey trait change) affect NCEs. Results indicate that simple rules of thumb may not apply; factors that increase the cost of trait change or that increase the degree to which an animal changes a trait, can actually cause smaller (rather than larger) NCEs. We provide examples of how this graphical model can provide important insights for empirical studies from two natural systems. Implementation of this approach will improve our understanding of how and when NCEs are expected to dominate the total effect of predators. Further, application of the models will likely promote a better linkage between experimental and theoretical studies of NCEs, and foster synthesis across systems.

  12. NATURAL graphics

    Science.gov (United States)

    Jones, R. H.

    1984-01-01

    The hardware and software developments in computer graphics are discussed. Major topics include: system capabilities, hardware design, system compatibility, and software interface with the data base management system.

  13. Analysis of Unit Process Cost for an Engineering-Scale Pyroprocess Facility Using a Process Costing Method in Korea

    Directory of Open Access Journals (Sweden)

    Sungki Kim

    2015-08-01

    Full Text Available Pyroprocessing, which is a dry recycling method, converts spent nuclear fuel into U (uranium)/TRU (transuranium) metal ingots in a high-temperature molten salt phase. This paper provides the unit process costs of a pyroprocess facility that can process up to 10 tons of pyroprocessing product per year, obtained by utilizing the process costing method. Toward this end, the pyroprocess was classified into four kinds of unit processes: pretreatment, electrochemical reduction, electrorefining and electrowinning. The unit process cost was calculated by classifying the cost consumed at each process into raw material and conversion costs. The unit process costs of pretreatment, electrochemical reduction, electrorefining and electrowinning were calculated as 195 US$/kgU-TRU, 310 US$/kgU-TRU, 215 US$/kgU-TRU and 231 US$/kgU-TRU, respectively. Finally, the total pyroprocess cost was calculated as 951 US$/kgU-TRU. In addition, the cost drivers for the raw material cost were identified as the cost of Li3PO4, needed for the LiCl-KCl purification process, and that of platinum, used as the anode electrode in the electrochemical reduction process.

  14. Visualization of complex processes in lipid systems using computer simulations and molecular graphics.

    Science.gov (United States)

    Telenius, Jelena; Vattulainen, Ilpo; Monticelli, Luca

    2009-01-01

    Computer simulation has become an increasingly popular tool in the study of lipid membranes, complementing experimental techniques by providing information on structure and dynamics at high spatial and temporal resolution. Molecular visualization is the most powerful way to represent the results of molecular simulations, and can be used to illustrate complex transformations of lipid aggregates more easily and more effectively than written text. In this chapter, we review some basic aspects of simulation methodologies commonly employed in the study of lipid membranes and we describe a few examples of complex phenomena that have been recently investigated using molecular simulations. We then explain how molecular visualization provides added value to computational work in the field of biological membranes, and we conclude by listing a few molecular graphics packages widely used in scientific publications.

  15. Graphic Storytelling

    Science.gov (United States)

    Thompson, John

    2009-01-01

    Graphic storytelling is a medium that allows students to make and share stories, while developing their art communication skills. American comics today are more varied in genre, approach, and audience than ever before. When considering the impact of Japanese manga on the youth, graphic storytelling emerges as a powerful player in pop culture. In…

  17. Analysis of the Data From a Technical Processing Cost Study.

    Science.gov (United States)

    Rocke, Hans Joachim

    This study was conducted to analyze and summarize raw data obtained from a 1972 study: "Report on a Cost Study of Specific Technical Processing Activities of the California State University and College Libraries" with the hypothesis that the cost of technical processes increases as the production volume both rises above and falls below…

  18. A COST ORIENTED SYSTEM FOR HOLE MAKING PROCESSES

    Directory of Open Access Journals (Sweden)

    Uğur PAMUKOĞLU

    2004-01-01

    A knowledge-based system for the manufacturing of various hole making processes has been developed. In the system, selection of machining methods, determination of cutting-tool-based sequences for each process, determination of process time, and cost analysis are carried out. In the procedure, all available processes are taken into account with regard to their costs, and the most cost-effective one is chosen. The system facilitates determination of the process time and the costs of the features to be manufactured, and it is especially useful for quick cost estimation. In addition, the system assists people who are inexperienced in manufacturing operations, so that they can be involved in the related manufacturing stages.

  19. Enabling Seamless Access to Digital Graphical Contents for Visually Impaired Individuals via Semantic-Aware Processing

    Directory of Open Access Journals (Sweden)

    Baoxin Li

    2007-11-01

    Vision is one of the main sources through which people obtain information from the world, but unfortunately, visually-impaired people are partially or completely deprived of this type of information. With the help of computer technologies, people with visual impairment can independently access digital textual information by using text-to-speech and text-to-Braille software. However, in general, there still exists a major barrier for people who are blind to access the graphical information independently in real-time without the help of sighted people. In this paper, we propose a novel multi-level and multi-modal approach aiming at addressing this challenging and practical problem, with the key idea being semantic-aware visual-to-tactile conversion through semantic image categorization and segmentation, and semantic-driven image simplification. An end-to-end prototype system was built based on the approach. We present the details of the approach and the system, report sample experimental results with realistic data, and compare our approach with current typical practice.

  20. Enabling Seamless Access to Digital Graphical Contents for Visually Impaired Individuals via Semantic-Aware Processing

    Directory of Open Access Journals (Sweden)

    Wang Zheshen

    2007-01-01

    Vision is one of the main sources through which people obtain information from the world, but unfortunately, visually-impaired people are partially or completely deprived of this type of information. With the help of computer technologies, people with visual impairment can independently access digital textual information by using text-to-speech and text-to-Braille software. However, in general, there still exists a major barrier for people who are blind to access the graphical information independently in real-time without the help of sighted people. In this paper, we propose a novel multi-level and multi-modal approach aiming at addressing this challenging and practical problem, with the key idea being semantic-aware visual-to-tactile conversion through semantic image categorization and segmentation, and semantic-driven image simplification. An end-to-end prototype system was built based on the approach. We present the details of the approach and the system, report sample experimental results with realistic data, and compare our approach with current typical practice.

  1. Real-time reconstruction of sensitivity encoded radial magnetic resonance imaging using a graphics processing unit.

    Science.gov (United States)

    Sørensen, Thomas Sangild; Atkinson, David; Schaeffter, Tobias; Hansen, Michael Schacht

    2009-12-01

    A barrier to the adoption of non-Cartesian parallel magnetic resonance imaging for real-time applications has been the time required for the image reconstructions: these times have exceeded the underlying acquisition time, thus preventing real-time display of the acquired images. We present a reconstruction algorithm for commodity graphics hardware (GPUs) to enable real-time reconstruction of sensitivity-encoded radial imaging (radial SENSE). We demonstrate that a radial profile order based on the golden ratio facilitates reconstruction from an arbitrary number of profiles. This allows the temporal resolution to be adjusted on the fly. A user-adaptable regularization term is also included and, particularly for highly undersampled data, used to interactively improve the reconstruction quality. Each reconstruction is fully self-contained from the profile stream, i.e., the required coil sensitivity profiles, sampling density compensation weights, regularization terms, and noise estimates are computed in real-time from the acquisition data itself. The reconstruction implementation is verified using a steady state free precession (SSFP) pulse sequence and quantitatively evaluated. Three applications are demonstrated: 1) real-time imaging with real-time SENSE reconstruction, 2) real-time imaging with k-t SENSE reconstruction, and 3) offline reconstruction with interactive adjustment of reconstruction settings.
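
    The golden-ratio profile order mentioned above has a compact description: successive radial profiles are rotated by the golden angle, 180°/φ ≈ 111.25°, so that any window of recent profiles covers k-space nearly uniformly. A minimal sketch of that schedule, assuming the standard golden-angle convention (the gridding and SENSE reconstruction itself is far more involved and not shown):

    ```python
    import math

    GOLDEN_RATIO = (1 + math.sqrt(5)) / 2      # ~1.618
    GOLDEN_ANGLE = 180.0 / GOLDEN_RATIO        # ~111.246 degrees

    def profile_angle(n):
        """Azimuthal angle (degrees) of the n-th acquired radial profile."""
        return (n * GOLDEN_ANGLE) % 180.0

    # Any window of consecutive profiles samples k-space nearly uniformly,
    # which is what allows the temporal resolution (window length) to be
    # chosen retrospectively, "on the fly".
    window = [profile_angle(n) for n in range(100, 132)]  # the 32 newest profiles
    ```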

  2. Prospects for reducing the processing cost of lithium ion batteries

    Science.gov (United States)

    Wood, David L.; Li, Jianlin; Daniel, Claus

    2015-02-01

    A detailed processing cost breakdown is given for lithium-ion battery (LIB) electrodes, which focuses on: 1) elimination of toxic, costly N-methylpyrrolidone (NMP) dispersion chemistry; 2) doubling the thicknesses of the anode and cathode to raise energy density; and 3) reduction of the anode electrolyte wetting and SEI-layer formation time. These processing cost reduction technologies are generically adaptable to any anode or cathode cell chemistry and are being implemented at ORNL. This paper shows step by step how these cost savings can be realized in existing or new LIB manufacturing plants using a baseline case of thin (power) electrodes produced with NMP processing and a standard 10-14-day wetting and formation process. In particular, it is shown that aqueous electrode processing can cut the electrode processing cost and energy consumption by an order of magnitude. Doubling the thickness of the electrodes allows for using half of the inactive current collectors and separators, contributing even further to the processing cost savings. Finally, wetting and SEI-layer formation cost savings are discussed in the context of a protocol with significantly reduced time. These three benefits collectively offer the possibility of reducing LIB pack cost from US$502.8 per usable kWh to US$370.3 per usable kWh, a savings of US$132.5/kWh (or 26.4%).
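
    The pack-level figures quoted at the end can be checked with one line of arithmetic; a trivial sketch:

    ```python
    baseline = 502.8   # US$ per usable kWh, NMP-based thin-electrode baseline
    improved = 370.3   # US$ per usable kWh, with the three cost reductions
    savings = baseline - improved                             # 132.5 US$/kWh
    print(f"{savings:.1f} $/kWh ({savings / baseline:.1%})")  # 26.4%
    ```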

  3. Graphics Processing Unit-Accelerated Code for Computing Second-Order Wiener Kernels and Spike-Triggered Covariance.

    Science.gov (United States)

    Mano, Omer; Clark, Damon A

    2017-01-01

    Sensory neuroscience seeks to understand and predict how sensory neurons respond to stimuli. Nonlinear components of neural responses are frequently characterized by the second-order Wiener kernel and the closely-related spike-triggered covariance (STC). Recent advances in data acquisition have made it increasingly common and computationally intensive to compute second-order Wiener kernels/STC matrices. In order to speed up this sort of analysis, we developed a graphics processing unit (GPU)-accelerated module that computes the second-order Wiener kernel of a system's response to a stimulus. The generated kernel can be easily transformed for use in standard STC analyses. Our code speeds up such analyses by factors of over 100 relative to current methods that utilize central processing units (CPUs). It works on any modern GPU and may be integrated into many data analysis workflows. This module accelerates data analysis so that more time can be spent exploring parameter space and interpreting data.
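
    For readers unfamiliar with the quantity being accelerated, here is a plain NumPy reference for spike-triggered covariance; this is the textbook definition, not the authors' GPU module, and the windowing convention is an assumption:

    ```python
    import numpy as np

    def spike_triggered_covariance(stimulus, spikes, window):
        """Covariance of the stimulus snippets preceding each spike."""
        snippets = []
        for t in range(window, len(stimulus)):
            if spikes[t] > 0:
                snippets.extend([stimulus[t - window:t]] * int(spikes[t]))
        snippets = np.asarray(snippets, dtype=float)
        sta = snippets.mean(axis=0)            # spike-triggered average
        centered = snippets - sta
        return centered.T @ centered / (len(snippets) - 1)

    rng = np.random.default_rng(0)
    stim = rng.standard_normal(10_000)
    spk = (rng.random(10_000) < 0.05).astype(int)
    stc = spike_triggered_covariance(stim, spk, window=20)  # 20x20 matrix
    ```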

  4. GPU MrBayes V3.1: MrBayes on Graphics Processing Units for Protein Sequence Data.

    Science.gov (United States)

    Pang, Shuai; Stones, Rebecca J; Ren, Ming-Ming; Liu, Xiao-Guang; Wang, Gang; Xia, Hong-ju; Wu, Hao-Yang; Liu, Yang; Xie, Qiang

    2015-09-01

    We present a modified GPU (graphics processing unit) version of MrBayes, called ta(MC)³ (GPU MrBayes V3.1), for Bayesian phylogenetic inference on protein data sets. Our main contributions are 1) utilizing 64-bit variables, thereby enabling ta(MC)³ to process larger data sets than MrBayes; and 2) using Kahan summation to improve accuracy, convergence rates, and consequently runtime. Versus the current fastest software, we achieve a speedup of up to around 2.5 (and up to around 90 versus serial MrBayes), and more on multi-GPU hardware. GPU MrBayes V3.1 is available from http://sourceforge.net/projects/mrbayes-gpu/.
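
    Kahan (compensated) summation, the second contribution named above, is compact enough to show in full. A generic sketch, independent of the MrBayes code base:

    ```python
    def kahan_sum(values):
        """Sum floats while tracking the low-order bits lost to rounding."""
        total = 0.0
        compensation = 0.0          # running estimate of the rounding error
        for v in values:
            y = v - compensation    # re-inject previously lost bits
            t = total + y
            compensation = (t - total) - y   # what just got lost
            total = t
        return total

    # Long accumulations of small terms (e.g. site likelihoods) benefit most:
    vals = [0.1] * 1000
    print(sum(vals), kahan_sum(vals))  # Kahan stays much closer to 100.0
    ```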

  5. Parallelized multi-graphics processing unit framework for high-speed Gabor-domain optical coherence microscopy.

    Science.gov (United States)

    Tankam, Patrice; Santhanam, Anand P; Lee, Kye-Sung; Won, Jungeun; Canavesi, Cristina; Rolland, Jannick P

    2014-07-01

    Gabor-domain optical coherence microscopy (GD-OCM) is a volumetric high-resolution technique capable of acquiring three-dimensional (3-D) skin images with histological resolution. Real-time image processing is needed to enable GD-OCM imaging in a clinical setting. We present a parallelized and scalable multi-graphics processing unit (GPU) computing framework for real-time GD-OCM image processing. A parallelized control mechanism was developed to individually assign computation tasks to each of the GPUs. For each GPU, the optimal number of amplitude-scans (A-scans) to be processed in parallel was selected to maximize GPU memory usage and core throughput. We investigated five computing architectures for computational speed-up in processing 1000×1000 A-scans. The proposed parallelized multi-GPU computing framework enables processing at a computational speed faster than the GD-OCM image acquisition, thereby facilitating high-speed GD-OCM imaging in a clinical setting. Using two parallelized GPUs, the image processing of a 1 × 1 × 0.6 mm³ skin sample was performed in about 13 s, and the performance was benchmarked at 6.5 s with four GPUs. This work thus demonstrates that 3-D GD-OCM data may be displayed in real-time to the examiner using parallelized GPU processing.
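
    A toy sketch of the task-assignment idea: A-scans are divided among GPUs and grouped into batches capped by a per-device limit. Here `batch_limit` stands in for the per-GPU optimum the authors select; none of this is their actual scheduler:

    ```python
    def partition_ascans(n_ascans, n_gpus, batch_limit):
        """Round-robin A-scans across GPUs, then cut each GPU's share into
        batches no larger than batch_limit (a stand-in for the per-device
        optimum chosen to maximize memory usage and core throughput)."""
        shares = [list(range(g, n_ascans, n_gpus)) for g in range(n_gpus)]
        return [[ids[i:i + batch_limit] for i in range(0, len(ids), batch_limit)]
                for ids in shares]

    batches = partition_ascans(n_ascans=1_000_000, n_gpus=4, batch_limit=50_000)
    # batches[g][b] lists the A-scan indices of batch b assigned to GPU g.
    ```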

  6. Parallelized multi–graphics processing unit framework for high-speed Gabor-domain optical coherence microscopy

    Science.gov (United States)

    Tankam, Patrice; Santhanam, Anand P.; Lee, Kye-Sung; Won, Jungeun; Canavesi, Cristina; Rolland, Jannick P.

    2014-01-01

    Gabor-domain optical coherence microscopy (GD-OCM) is a volumetric high-resolution technique capable of acquiring three-dimensional (3-D) skin images with histological resolution. Real-time image processing is needed to enable GD-OCM imaging in a clinical setting. We present a parallelized and scalable multi-graphics processing unit (GPU) computing framework for real-time GD-OCM image processing. A parallelized control mechanism was developed to individually assign computation tasks to each of the GPUs. For each GPU, the optimal number of amplitude-scans (A-scans) to be processed in parallel was selected to maximize GPU memory usage and core throughput. We investigated five computing architectures for computational speed-up in processing 1000×1000 A-scans. The proposed parallelized multi-GPU computing framework enables processing at a computational speed faster than the GD-OCM image acquisition, thereby facilitating high-speed GD-OCM imaging in a clinical setting. Using two parallelized GPUs, the image processing of a 1 × 1 × 0.6 mm³ skin sample was performed in about 13 s, and the performance was benchmarked at 6.5 s with four GPUs. This work thus demonstrates that 3-D GD-OCM data may be displayed in real-time to the examiner using parallelized GPU processing. PMID:24695868

  7. Processing Polarity Items: Contrastive Licensing Costs

    Science.gov (United States)

    Saddy, Douglas; Drenhaus, Heiner; Frisch, Stefan

    2004-01-01

    We describe an experiment that investigated the failure to license polarity items in German using event-related brain potentials (ERPs). The results reveal distinct processing reflexes associated with failure to license positive polarity items in comparison to failure to license negative polarity items. Failure to license both negative and…

  8. Graphic Review

    DEFF Research Database (Denmark)

    Breiting, Søren

    2002-01-01

    Introduction to the 'graphic review' as a method for carrying understanding forward from one teaching session to the next in teacher education and primary school.

  9. Graphics gems

    CERN Document Server

    Glassner, Andrew S

    1993-01-01

    ""The GRAPHICS GEMS Series"" was started in 1990 by Andrew Glassner. The vision and purpose of the Series was - and still is - to provide tips, techniques, and algorithms for graphics programmers. All of the gems are written by programmers who work in the field and are motivated by a common desire to share interesting ideas and tools with their colleagues. Each volume provides a new set of innovative solutions to a variety of programming problems.

  10. Reactive control processes contributing to residual switch cost and mixing cost across the adult lifespan.

    Science.gov (United States)

    Whitson, Lisa R; Karayanidis, Frini; Fulham, Ross; Provost, Alexander; Michie, Patricia T; Heathcote, Andrew; Hsieh, Shulan

    2014-01-01

    In task-switching paradigms, performance is better when repeating the same task than when alternating between tasks (switch cost) and when repeating a task alone rather than intermixed with another task (mixing cost). These costs remain even after extensive practice and when task cues enable advanced preparation (residual costs). Moreover, residual reaction time mixing cost has been consistently shown to increase with age. Residual switch and mixing costs modulate the amplitude of the stimulus-locked P3b. This mixing effect is disproportionately larger in older adults who also prepare more for and respond more cautiously on these "mixed" repeat trials (Karayanidis et al., 2011). In this paper, we analyze stimulus-locked and response-locked P3 and lateralized readiness potentials to identify whether residual switch and mixing cost arise from the need to control interference at the level of stimulus processing or response processing. Residual mixing cost was associated with control of stimulus-level interference, whereas residual switch cost was also associated with a delay in response selection. In older adults, the disproportionate increase in mixing cost was associated with greater interference at the level of decision-response mapping and response programming for repeat trials in mixed-task blocks. These findings suggest that older adults strategically recruit greater proactive and reactive control to overcome increased susceptibility to post-stimulus interference. This interpretation is consistent with recruitment of compensatory strategies to compensate for reduced repetition benefit rather than an overall decline on cognitive flexibility.

  11. An improved effective cost review process for value engineering.

    Science.gov (United States)

    Joo, D S; Park, J I

    2014-01-01

    Second-look value engineering (VE) is an approach that aims to lower the costs of products for which target costs are not being met during the production stage. Participants in second-look VE typically come up with a variety of ideas for cost cutting, but the outcomes often depend on their levels of experience, and not many good alternatives are available during the production stage. Nonetheless, good ideas have been consistently generated by VE experts. This paper investigates past second-look VE cases and the thinking processes of VE experts and proposes a cost review process as a systematic means of investigating cost-cutting ideas. This cost review process includes the use of an idea checklist and a specification review process. In addition to presenting the process, this paper reports on its feasibility, based on its introduction into a VE training course as part of a pilot study. The results indicate that the cost review process is effective in generating ideas for later analysis.

  12. Process Cost Modeling for Multi-Disciplinary Design Optimization

    Science.gov (United States)

    Bao, Han P.; Freeman, William (Technical Monitor)

    2002-01-01

    For early design concepts, the conventional approach to cost is normally some kind of parametric weight-based cost model. There is now ample evidence that this approach can be misleading and inaccurate. By the nature of its development, a parametric cost model requires historical data and is valid only if the new design is analogous to those for which the model was derived. Advanced aerospace vehicles have no historical production data and are nowhere near the vehicles of the past. Using an existing weight-based cost model would only lead to errors and distortions of the true production cost. This report outlines the development of a process-based cost model in which the physical elements of the vehicle are costed according to a first-order dynamics model. This theoretical cost model, first advocated by early work at MIT, has been expanded to cover the basic structures of an advanced aerospace vehicle. Elemental costs based on the geometry of the design can be summed up to provide an overall estimation of the total production cost for a design configuration. This capability to directly link any design configuration to realistic cost estimation is a key requirement for high payoff MDO problems. Another important consideration in this report is the handling of part or product complexity. Here the concept of cost modulus is introduced to take into account variability due to different materials, sizes, shapes, precision of fabrication, and equipment requirements. The most important implication of the development of the proposed process-based cost model is that different design configurations can now be quickly related to their cost estimates in a seamless calculation process easily implemented on any spreadsheet tool. In successive sections, the report addresses the issues of cost modeling as follows. First, an introduction is presented to provide the background for the research work. Next, a quick review of cost estimation techniques is made with the intention to
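
    A hedged sketch of how such an elemental roll-up with cost moduli might look in code; the element names, baseline costs, and modulus values are illustrative placeholders rather than the report's coefficients, and the first-order-dynamics time model is not reproduced:

    ```python
    def element_cost(baseline_cost, moduli):
        """Scale a baseline elemental cost by complexity moduli
        (material, size, shape, precision, equipment, ...)."""
        cost = baseline_cost
        for factor in moduli.values():
            cost *= factor
        return cost

    def vehicle_cost(elements):
        """Total production cost is the sum over the vehicle's elements."""
        return sum(element_cost(c, m) for c, m in elements)

    # Illustrative elements only; real inputs come from the design geometry.
    elements = [
        (120_000, {"material": 1.4, "precision": 1.2}),  # e.g., a wing panel
        (45_000,  {"material": 1.0, "precision": 1.5}),  # e.g., a frame section
    ]
    print(vehicle_cost(elements))
    ```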

  13. COST ESTIMATION MODELS FOR DRINKING WATER TREATMENT UNIT PROCESSES

    Science.gov (United States)

    Cost models for unit processes typically utilized in a conventional water treatment plant and in package treatment plant technology are compiled in this paper. The cost curves are represented as a function of specified design parameters and are categorized into four major catego...

  14. Arbitrary Angular Momentum Electron Repulsion Integrals with Graphical Processing Units: Application to the Resolution of Identity Hartree-Fock Method.

    Science.gov (United States)

    Kalinowski, Jaroslaw; Wennmohs, Frank; Neese, Frank

    2017-07-11

    A resolution of identity based implementation of the Hartree-Fock method on graphical processing units (GPUs) is presented that is capable of handling basis functions with arbitrary angular momentum. For practical reasons, only functions up to (ff|f) angular momentum are presently calculated on the GPU, thus leaving the calculation of higher angular momenta integrals on the CPU of the hybrid CPU-GPU environment. Speedups of up to a factor of 30 are demonstrated relative to state-of-the-art serial and parallel CPU implementations. Benchmark calculations with over 3500 contracted basis functions (def2-SVP or def2-TZVP basis sets) are reported. The presented implementation supports all devices with OpenCL support and is capable of utilizing multiple GPU cards over either MPI or OpenCL itself.

  15. Real-space density functional theory on graphical processing units: computational approach and comparison to Gaussian basis set methods

    CERN Document Server

    Andrade, Xavier

    2013-01-01

    We discuss the application of graphical processing units (GPUs) to accelerate real-space density functional theory (DFT) calculations. To make our implementation efficient, we have developed a scheme to expose the data parallelism available in the DFT approach; this is applied to the different procedures required for a real-space DFT calculation. We present results for current-generation GPUs from AMD and Nvidia, which show that our scheme, implemented in the free code OCTOPUS, can reach a sustained performance of up to 90 GFlops for a single GPU, representing an important speed-up when compared to the CPU version of the code. Moreover, for some systems our implementation can outperform a GPU Gaussian basis set code, showing that the real-space approach is a competitive alternative for DFT simulations on GPUs.

  16. Real-Space Density Functional Theory on Graphical Processing Units: Computational Approach and Comparison to Gaussian Basis Set Methods.

    Science.gov (United States)

    Andrade, Xavier; Aspuru-Guzik, Alán

    2013-10-01

    We discuss the application of graphical processing units (GPUs) to accelerate real-space density functional theory (DFT) calculations. To make our implementation efficient, we have developed a scheme to expose the data parallelism available in the DFT approach; this is applied to the different procedures required for a real-space DFT calculation. We present results for current-generation GPUs from AMD and Nvidia, which show that our scheme, implemented in the free code Octopus, can reach a sustained performance of up to 90 GFlops for a single GPU, representing a significant speed-up when compared to the CPU version of the code. Moreover, for some systems, our implementation can outperform a GPU Gaussian basis set code, showing that the real-space approach is a competitive alternative for DFT simulations on GPUs.

  17. ASAMgpu V1.0 – a moist fully compressible atmospheric model using graphics processing units (GPUs)

    Directory of Open Access Journals (Sweden)

    S. Horn

    2012-03-01

    In this work the three-dimensional compressible moist atmospheric model ASAMgpu is presented. The calculations are done using graphics processing units (GPUs). To ensure platform independence, OpenGL and GLSL are used, so that the model runs on any hardware supporting fragment shaders. The MPICH2 library enables interprocess communication, allowing the usage of more than one GPU through domain decomposition. Time integration is done with an explicit three-step Runge-Kutta scheme with a time-splitting algorithm for the acoustic waves. The results for four test cases are shown in this paper: a rising dry heat bubble, a cold-bubble-induced density flow, a rising moist heat bubble in a saturated environment, and a DYCOMS-II case.
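
    The explicit three-step Runge-Kutta scheme mentioned above is commonly written in the Wicker-Skamarock form used by atmospheric models; a minimal sketch under that assumption (the paper's acoustic time-splitting and GLSL implementation are not reproduced):

    ```python
    def rk3_step(u, f, dt):
        """One step of an explicit three-step Runge-Kutta scheme
        (Wicker-Skamarock form, common in atmospheric models)."""
        u1 = u + (dt / 3.0) * f(u)
        u2 = u + (dt / 2.0) * f(u1)
        return u + dt * f(u2)

    # Toy usage: du/dt = -u (exponential decay).
    u = 1.0
    for _ in range(10):
        u = rk3_step(u, lambda x: -x, dt=0.1)
    ```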

  18. String matching algorithm research based on graphics processing units

    Institute of Scientific and Technical Information of China (English)

    张庆丹; 戴正华; 冯圣中; 孙凝晖

    2006-01-01

    The BF (brute force) algorithm is the most basic string matching algorithm, but it is a serial algorithm and is not well suited to the architecture of the graphics processing unit (GPU). By adapting the data-access patterns and the computation strategy to the GPU's special architecture, the BF algorithm was implemented on the GPU in a way that makes full use of its parallel processing capability. Experimental results show that the GPU-based parallel algorithm achieves a good speedup; the bottlenecks of implementing general-purpose computation efficiently on current GPU architectures are also identified.
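
    For reference, the serial brute-force (BF) matcher being ported is tiny; a sketch, with a note on why it maps naturally onto one-thread-per-position GPU execution:

    ```python
    def bf_match(text, pattern):
        """Serial brute-force matching: O(n*m) character comparisons."""
        n, m = len(text), len(pattern)
        return [i for i in range(n - m + 1) if text[i:i + m] == pattern]

    # On a GPU, each candidate start position i becomes its own thread; the
    # comparison in the loop body is unchanged, which is why BF parallelizes
    # naturally once data-access patterns are adapted to the architecture.
    print(bf_match("abracadabra", "abra"))  # [0, 7]
    ```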

  19. ASAMgpu V1.0 – a moist fully compressible atmospheric model using graphics processing units (GPUs)

    Directory of Open Access Journals (Sweden)

    S. Horn

    2011-10-01

    In this work the three-dimensional compressible moist atmospheric model ASAMgpu is presented. The calculations are done using graphics processing units (GPUs). To ensure platform independence, OpenGL and GLSL are used, so that the model runs on any hardware supporting fragment shaders. The MPICH2 library enables interprocess communication, allowing the usage of more than one GPU through domain decomposition. Time integration is done with an explicit three-step Runge-Kutta scheme with a time-splitting algorithm for the acoustic waves. The results for four test cases are shown in this paper: a rising dry heat bubble, a cold-bubble-induced density flow, a rising moist heat bubble in a saturated environment, and a DYCOMS-II case.

  20. A graphical simulation model of the entire DNA process associated with the analysis of short tandem repeat loci.

    Science.gov (United States)

    Gill, Peter; Curran, James; Elliot, Keith

    2005-01-01

    The use of expert systems to interpret short tandem repeat DNA profiles in forensic, medical and ancient DNA applications is becoming increasingly prevalent as high-throughput analytical systems generate large amounts of data that are time-consuming to process. With special reference to low copy number (LCN) applications, we use a graphical model to simulate the stochastic variation associated with the entire DNA process, starting with extraction of the sample, followed by the processing associated with the preparation of a PCR reaction mixture and PCR itself. Each part of the process is modelled with input efficiency parameters. Then, the key output parameters that define the characteristics of a DNA profile are derived, namely heterozygote balance (Hb) and the probability of allelic drop-out p(D). The model can be used to estimate the unknown efficiency parameters, such as π(extraction). 'What-if' scenarios can be used to improve and optimize the entire process; e.g., by increasing the aliquot forwarded to PCR, the expected improvement to a given DNA profile can be reliably predicted. We demonstrate that Hb and drop-out are mainly a function of the stochastic effects of pre-PCR molecular selection. Whole genome amplification is unlikely to give any benefit over conventional PCR for LCN.
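
    A minimal illustration of the binomial-thinning idea behind such a graphical model (a sketch with invented efficiency values; the actual model also simulates the PCR mixture preparation and PCR itself):

    ```python
    import random

    def surviving_copies(n_start, p_extraction, p_aliquot):
        """Binomial thinning of one allele's template molecules before PCR."""
        survived = sum(random.random() < p_extraction for _ in range(n_start))
        return sum(random.random() < p_aliquot for _ in range(survived))

    def dropout_probability(trials=10_000, n_start=20,
                            p_extraction=0.6, p_aliquot=0.2):
        """p(D): fraction of trials in which no template reaches the PCR."""
        return sum(surviving_copies(n_start, p_extraction, p_aliquot) == 0
                   for _ in range(trials)) / trials

    print(dropout_probability())  # rises sharply as template copies decrease
    ```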

  1. Guidelines and cost analysis for catalyst production in biocatalytic processes

    DEFF Research Database (Denmark)

    Tufvesson, Pär; Lima Ramos, Joana; Nordblad, Mathias

    2011-01-01

    be a powerful tool to guide research and development activities in order to achieve commercial potential. This study discusses the cost contribution of the biocatalyst in processes that use isolated enzymes, immobilized enzymes, or whole cells to catalyze reactions leading to the production of chemicals … as well as the production scale are crucial for decreasing the total cost contribution of the biocatalyst. Moreover, it is clear that, based on initial process performance, production costs can potentially be reduced by several orders of magnitude. Guideline minimum productivities for a feasible … process are suggested for different types of processes and products, based on typical values of biocatalyst and product costs. Such guidelines are dependent on the format of the biocatalyst (whole-cell, soluble enzyme, immobilized enzyme), as well as product market size and value. For example, commodity…

  2. Process Setting Models for the Minimization of Costs Defectives ...

    African Journals Online (AJOL)

    The economy of production controls all manufacturing activities. In the …

  3. Graphical symbol recognition

    OpenAIRE

    K.C., Santosh; Wendling, Laurent

    2015-01-01

    The chapter focuses on one of the key issues in document image processing, i.e., graphical symbol recognition. Graphical symbol recognition is a sub-field of a larger research domain: pattern recognition. The chapter covers several approaches (i.e., statistical, structural and syntactic) and specially designed symbol recognition techniques inspired by real-world industrial problems. It, in general, contains research problems, state-of-the-art methods that convey basic s…

  4. EVALUATION OF CORROSION COST OF CRUDE OIL PROCESSING INDUSTRY

    Directory of Open Access Journals (Sweden)

    ADESANYA A.O.

    2012-08-01

    The crude oil production industry, as the hub of the Nigerian economy, is not immune to the global financial meltdown being experienced the world over, which has resulted in a continual fall in oil prices. This has necessitated the need to reduce production costs. One of the major costs of production is the cost of corrosion, hence its evaluation. This research work outlined the basic principles of corrosion prevention, monitoring and inspection, and attempted to describe ways in which these measures may be adopted in the context of oil production. A wide range of facilities are used in crude oil production, making it difficult to evaluate precisely the extent of corrosion and its cost implication. In this study, the cost of corrosion per barrel was determined, and the annualized value of the corrosion cost was determined using the principles of engineering economy; the results were analyzed using descriptive statistics. The results showed that among the corrosion prevention methods identified, chemical treatment gave the highest cost contribution (81% of the total cost of prevention), while coating added 19%; cleaning pigging and cathodic protection contributed no cost. The contributions of the corrosion maintenance methods are 60% for repairs and 40% for replacement. Among the corrosion monitoring and inspection methods identified, NDT gave the highest cost contribution (41% of the total cost), followed by coating survey (34%); cathodic protection survey and crude analysis gave the lowest cost contributions of 19% and 6%, respectively. The corrosion control cost per barrel was found to be 77 US cents. The significance of this cost was not great, owing to the high price of crude oil on the international market, but the effect of corrosion on crude oil processing takes its toll on production (i.e., deferment).
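
    The annualized value mentioned above is conventionally computed with the capital-recovery factor from engineering economy; a sketch with illustrative inputs (the study's actual cash flows and discount rate are not reproduced):

    ```python
    def annualized_cost(present_cost, rate, years):
        """Capital-recovery factor: A = P * i(1+i)^n / ((1+i)^n - 1)."""
        growth = (1 + rate) ** years
        return present_cost * rate * growth / (growth - 1)

    # Illustrative numbers only.
    print(round(annualized_cost(present_cost=1_000_000, rate=0.12, years=10)))
    ```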

  5. Graphic presentation of information of acoustic monitoring of stream grinding process

    Directory of Open Access Journals (Sweden)

    N.S. Pryadko

    2012-04-01

    The theoretical and experimental mechanisms of fine grinding of loose materials are analyzed. The relation of the density function of acoustic signal amplitudes in the grinding process to the degree of loading of the jets with material is established.

  6. Comparison between cylindrical and prismatic lithium-ion cell costs using a process based cost model

    Science.gov (United States)

    Ciez, Rebecca E.; Whitacre, J. F.

    2017-02-01

    The relative size and age of the US electric vehicle market mean that a few vehicles are able to drive market-wide trends in the battery chemistries and cell formats on the road today. Three lithium-ion chemistries account for nearly all of the storage capacity, and half of the cells are cylindrical. However, no specific model exists to examine the costs of manufacturing these cylindrical cells. Here we present a process-based cost model tailored to the cylindrical lithium-ion cells currently used in the EV market. We examine the costs for varied cell dimensions, electrode thicknesses, chemistries, and production volumes. Although cost savings are possible from increasing cell dimensions and electrode thicknesses, economies of scale have already been reached, and future cost reductions from increased production volumes are minimal. Prismatic cells, which are able to further capitalize on the cost reduction from larger formats, can offer reductions beyond those possible for cylindrical cells.
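
    A toy version of the scaling intuition behind such a process-based cost model: materials cost grows with cell volume while per-cell handling and hardware costs are fixed, so larger formats amortize overhead over more energy. All parameter names and values below are illustrative, not the paper's:

    ```python
    import math

    def cost_per_kwh(radius_cm, height_cm, wh_per_cm3, usd_per_cm3, overhead_usd):
        """Materials scale with cell volume; handling/hardware cost is
        fixed per cell, so bigger cells spread it over more energy."""
        volume = math.pi * radius_cm ** 2 * height_cm          # cm^3
        energy_kwh = volume * wh_per_cm3 / 1000.0
        return (volume * usd_per_cm3 + overhead_usd) / energy_kwh

    small = cost_per_kwh(0.9, 6.5, 0.7, 0.08, 0.25)    # roughly an 18650
    large = cost_per_kwh(1.05, 7.0, 0.7, 0.08, 0.25)   # roughly a 21700
    print(small > large)  # True: the larger format is cheaper per kWh
    ```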

  7. The use of probabilistic graphical models (PGMs) to develop a cost-effective vaccination strategy against Campylobacter in poultry

    DEFF Research Database (Denmark)

    Garcia Clavero, Ana Belén; Madsen, A.; Vigre, Håkan

    Human campylobacteriosis represents an important economic and public health problem. Campylobacter originating from the feces of infected chickens will contaminate chicken meat, posing a risk to the consumer. Vaccination against Campylobacter in broilers is one probable measure to reduce consumers' exposure to Campylobacter. In this presentation we focus on the development of a computerized decision support system to aid management decisions on Campylobacter vaccination of commercial broilers. Broilers should be vaccinated against Campylobacter in the first 2 weeks of age; therefore, the decision … epidemiological and economic factors (cost-reward functions) have been included in the models. The final outcome of the models is presented as probabilities of the expected level of Campylobacter, and in financial terms, influenced by the decision on vaccination. For example, if the best decision seems to be to vaccinate …

  8. Cost analysis of composite fan blade manufacturing processes

    Science.gov (United States)

    Stelson, T. S.; Barth, C. F.

    1980-01-01

    The relative manufacturing costs were estimated for large high technology fan blades prepared by advanced composite fabrication methods using seven candidate materials/process systems. These systems were identified as laminated resin matrix composite, filament wound resin matrix composite, superhybrid solid laminate, superhybrid spar/shell, metal matrix composite, metal matrix composite with a spar and shell, and hollow titanium. The costs were calculated utilizing analytical process models and all cost data are presented as normalized relative values where 100 was the cost of a conventionally forged solid titanium fan blade whose geometry corresponded to a size typical of 42 blades per disc. Four costs were calculated for each of the seven candidate systems to relate the variation of cost on blade size. Geometries typical of blade designs at 24, 30, 36 and 42 blades per disc were used. The impact of individual process yield factors on costs was also assessed as well as effects of process parameters, raw materials, labor rates and consumable items.

  9. Patient level costing in Ireland: process, challenges and opportunities.

    Science.gov (United States)

    Murphy, A; McElroy, B

    2015-03-01

    In 2013, the Department of Health released their policy paper on hospital financing entitled Money Follows the Patient. A fundamental building block for the proposed financing model is patient level costing. This paper outlines the patient level costing process, identifies the opportunities and considers the challenges associated with the process in the Irish hospital setting. Methods involved a review of the existing literature which was complemented with an interview with health service staff. There are considerable challenges associated with implementing patient level costing including deficits in information and communication technologies and financial expertise as well as timeliness of coding. In addition, greater clinical input into the costing process is needed compared to traditional costing processes. However, there are long-term benefits associated with patient level costing; these include empowerment of clinical staff, improved transparency and price setting and greater fairness, especially in the treatment of outliers. These can help to achieve the Government's Health Strategy. The benefits of patient level costing need to be promoted and a commitment to investment in overcoming the challenges is required.

  10. Perception in statistical graphics

    Science.gov (United States)

    VanderPlas, Susan Ruth

    There has been quite a bit of research on statistical graphics and visualization, generally focused on new types of graphics, new software to create graphics, interactivity, and usability studies. Our ability to interpret and use statistical graphics hinges on the interface between the graph itself and the brain that perceives and interprets it, and there is substantially less research on the interplay between graph, eye, brain, and mind than is sufficient to understand the nature of these relationships. The goal of the work presented here is to further explore the interplay between a static graph, the translation of that graph from paper to mental representation (the journey from eye to brain), and the mental processes that operate on that graph once it is transferred into memory (mind). Understanding the perception of statistical graphics should allow researchers to create more effective graphs which produce fewer distortions and viewer errors while reducing the cognitive load necessary to understand the information presented in the graph. Taken together, these experiments should lay a foundation for exploring the perception of statistical graphics. There has been considerable research into the accuracy of numerical judgments viewers make from graphs, and these studies are useful, but it is more effective to understand how errors in these judgments occur so that the root cause of the error can be addressed directly. Understanding how visual reasoning relates to the ability to make judgments from graphs allows us to tailor graphics to particular target audiences. In addition, understanding the hierarchy of salient features in statistical graphics allows us to clearly communicate the important message from data or statistical models by constructing graphics which are designed specifically for the perceptual system.

  11. Graphics Gems III IBM version

    CERN Document Server

    Kirk, David

    1994-01-01

    This sequel to Graphics Gems (Academic Press, 1990) and Graphics Gems II (Academic Press, 1991) is a practical collection of computer graphics programming tools and techniques. Graphics Gems III contains a larger percentage of gems related to modeling and rendering, particularly lighting and shading. This new edition also covers image processing, numerical and programming techniques, modeling and transformations, 2D and 3D geometry and algorithms, ray tracing and radiosity, rendering, and more clever new tools and tricks for graphics programming. Volume III also includes a

  12. Computer graphics in engineering education

    CERN Document Server

    Rogers, David F

    2013-01-01

    Computer Graphics in Engineering Education discusses the use of Computer Aided Design (CAD) and Computer Aided Manufacturing (CAM) as an instructional material in engineering education. Each of the nine chapters of this book covers topics and cites examples that are relevant to the relationship of CAD-CAM with engineering education. The first chapter discusses the use of computer graphics in the U.S. Naval Academy, while Chapter 2 covers key issues in instructional computer graphics. This book then discusses low-cost computer graphics in engineering education. Chapter 4 discusses the uniform b

  13. Effects of Graphic Organizers on Student Achievement in the Writing Process

    Science.gov (United States)

    Brown, Marjorie

    2011-01-01

    Writing at the high school level requires higher cognitive and literacy skills. Educators must decide the strategies best suited for the varying skills of each process. Compounding this issue is the need to instruct students with learning disabilities. Writing for students with learning disabilities is a struggle at minimum; teachers have to find…

  14. Graphic Novels, Web Comics, and Creator Blogs: Examining Product and Process

    Science.gov (United States)

    Carter, James Bucky

    2011-01-01

    Young adult literature (YAL) of the late 20th and early 21st century is exploring hybrid forms with growing regularity by embracing textual conventions from sequential art, video games, film, and more. As well, Web-based technologies have given those who consume YAL more immediate access to authors, their metacognitive creative processes, and…

  15. [Cost management: the implementation of the activity-based costing method in sterile processing department].

    Science.gov (United States)

    Jericó, Marli de Carvalho; Castilho, Valéria

    2010-09-01

    This exploratory case study was performed aiming at implementing the Activity-Based Costing (ABC) method in the sterile processing department (SPD) of a major teaching hospital. Data collection was performed throughout 2006. Documentary research techniques and non-participant closed observation were used. The ABC implementation allowed for learning the activity-based cost of both the chemical and physical disinfection cycle/load, US$ 9.95 and US$ 12.63 respectively, as well as the cost of sterilization by steam under pressure (autoclave), US$ 31.37, and of low-temperature steam and gaseous formaldehyde sterilization (LTSF), US$ 255.28. The information provided by the ABC method has optimized the overall understanding of the cost-driver process and provided the foundation for assessing performance and improvement in the SPD processes.
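
    A minimal sketch of the ABC allocation step itself; the activities and rates below are placeholders, except for the autoclave figure quoted in the abstract:

    ```python
    # Activity rates (cost per unit of cost driver).
    ACTIVITY_RATES = {
        "washing": 1.10,           # per tray (illustrative)
        "packing": 0.80,           # per tray (illustrative)
        "autoclave cycle": 31.37,  # per load (from the abstract)
    }

    def abc_cost(driver_volumes):
        """ABC: cost object cost = sum of driver volume x activity rate."""
        return sum(v * ACTIVITY_RATES[a] for a, v in driver_volumes.items())

    surgical_kit = {"washing": 4, "packing": 4, "autoclave cycle": 1}
    print(round(abc_cost(surgical_kit), 2))  # 38.97
    ```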

  16. Graphic Ecologies

    Directory of Open Access Journals (Sweden)

    Brook Weld Muller

    2014-12-01

    This essay describes strategic approaches to graphic representation associated with critical environmental engagement and that build from the idea of works of architecture as stitches in the ecological fabric of the city. It focuses on the building up of partial or fragmented graphics in order to describe inclusive, open-ended possibilities for making architecture that marry rich experience and responsive performance. An aphoristic approach to crafting drawings involves complex layering, conscious absence and the embracing of tension. A self-critical attitude toward the generation of imagery characterized by the notion of ‘loose precision’ may lead to more transformative and environmentally responsive architectures.

  17. Dynamic Load Balancing using Graphics Processors

    Directory of Open Access Journals (Sweden)

    R Mohan

    2014-04-01

    To get maximum performance on many-core graphics processors, it is important to have an even balance of the workload so that all processing units contribute equally to the task at hand. This can be hard to achieve when the cost of a task is not known beforehand and when new sub-tasks are created dynamically during execution. Two dynamic load balancing methods, static task assignment and work stealing using deques, are compared to see which is better suited to the highly parallel world of graphics processors. They have been evaluated on the task of simulating a computer move against a human move in the famous four-in-a-row game. The experiments showed that synchronization can be very expensive, and that new methods which use graphics processor features wisely may be required.
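
    A miniature of the work-stealing variant described above, reduced to plain Python (the GPU implementation and its synchronization issues are the paper's actual subject and are not shown):

    ```python
    from collections import deque
    import random

    class Worker:
        """Each worker pops from the bottom of its own deque and steals
        from the top of a victim's; a single failed steal ends the worker
        here, whereas a real scheduler would retry or detect termination."""
        def __init__(self):
            self.tasks = deque()

        def run(self, workers):
            while True:
                if self.tasks:
                    task = self.tasks.pop()            # own work, LIFO
                else:
                    victim = random.choice(workers)
                    if victim is self or not victim.tasks:
                        return
                    task = victim.tasks.popleft()      # steal, FIFO
                self.tasks.extend(task())              # tasks may spawn subtasks

    def leaf():
        return []

    def node():
        return [leaf, leaf]

    workers = [Worker(), Worker()]
    workers[0].tasks.append(node)
    for w in workers:
        w.run(workers)
    ```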

  18. VPLS Based Quality and Cost Control for Tennessee Eastman Process

    Institute of Scientific and Technical Information of China (English)

    宋凯; 王海清; 李平

    2005-01-01

    Product quality and operation cost control receive increasing emphasis in modern chemical system engineering. To improve the fault detection power of the partial least squares (PLS) method for quality control, a new Q_RPV statistic is proposed in terms of the VP (variable importance in projection) indices of the monitored process variables; it is significantly different from, and an advance over, the conventional Q statistic. Q_RPV is calculated only from the residuals of the remarkable process variables (RPVs). Therefore, it is the dominant relation between quality and the RPVs, not all process variables (as in the case of conventional PLS), that is monitored by this new VP-PLS (VPLS) method. The combination of the Q_RPV and T² statistics is applied to quality and cost control of the Tennessee Eastman (TE) process, and weak faults can be detected as quickly as possible. Consequently, the product quality of the TE process is guaranteed and operation costs are reduced.
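
    In formula form, a plausible reading of the statistic described above (the notation is assumed here, not taken from the paper): the conventional statistic sums squared prediction residuals over all m monitored variables, while the new one restricts the sum to the remarkable process variables:

    ```latex
    Q = \sum_{j=1}^{m} e_j^{2}
    \qquad \longrightarrow \qquad
    Q_{\mathrm{RPV}} = \sum_{j \in \mathrm{RPV}} e_j^{2}
    ```

    where e_j is the PLS prediction residual of process variable j.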

  19. APPLICABILITY OF ACTIVITY BASED COSTING IN NEW PRODUCT DEVELOPMENT PROCESSES

    Directory of Open Access Journals (Sweden)

    Ewa Wanda MARUSZEWSKA

    2015-01-01

    The purpose of the article is to emphasize that activity based costing is a proper tool for engineers to enhance their decision-making process while developing a new product. The theoretical analysis shows that a variety of factors shall be encompassed in the new product decision-making process, and therefore engineers and management should pay great attention to proper cost allocation. The paper suggests the usage of the Activity Based Costing methodology for the new product development decision-making process. The author states that applying ABC in the process of rational decision-making referring to new product development enables managers and engineers to prioritize possible solutions and reallocate resources used in the production process in order to meet wider organizational goals. It would also contribute to the cooperation of managers and engineers for the sake of organizational goals.

  20. DecGPU: distributed error correction on massively parallel graphics processing units using CUDA and MPI.

    Science.gov (United States)

    Liu, Yongchao; Schmidt, Bertil; Maskell, Douglas L

    2011-03-29

    Next-generation sequencing technologies have led to the high-throughput production of sequence data (reads) at low cost. However, these reads are significantly shorter and more error-prone than conventional Sanger shotgun reads. This poses a challenge for the de novo assembly in terms of assembly quality and scalability for large-scale short read datasets. We present DecGPU, the first parallel and distributed error correction algorithm for high-throughput short reads (HTSRs) using a hybrid combination of CUDA and MPI parallel programming models. DecGPU provides CPU-based and GPU-based versions, where the CPU-based version employs coarse-grained and fine-grained parallelism using the MPI and OpenMP parallel programming models, and the GPU-based version takes advantage of the CUDA and MPI parallel programming models and employs a hybrid CPU+GPU computing model to maximize the performance by overlapping the CPU and GPU computation. The distributed feature of our algorithm makes it feasible and flexible for the error correction of large-scale HTSR datasets. Using simulated and real datasets, our algorithm demonstrates superior performance, in terms of error correction quality and execution speed, to the existing error correction algorithms. Furthermore, when combined with Velvet and ABySS, the resulting DecGPU-Velvet and DecGPU-ABySS assemblers demonstrate the potential of our algorithm to improve de novo assembly quality for de-Bruijn-graph-based assemblers. DecGPU is publicly available open-source software, written in CUDA C++ and MPI. The experimental results suggest that DecGPU is an effective and feasible error correction algorithm to tackle the flood of short reads produced by next-generation sequencing technologies.
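
    A toy spectrum-based corrector in the spirit of, but much simpler than, DecGPU: count k-mers across all reads, call a read solid when every one of its k-mers is frequent, and otherwise try single-base substitutions. Data and thresholds are illustrative:

    ```python
    from collections import Counter
    from itertools import product

    def correct_read(read, kmer_counts, k=5, solid=2):
        """Keep the first single-base substitution that makes the read
        solid, i.e. all of its k-mers occur at least `solid` times."""
        def is_solid(r):
            return all(kmer_counts[r[i:i + k]] >= solid
                       for i in range(len(r) - k + 1))
        if is_solid(read):
            return read
        for i, base in product(range(len(read)), "ACGT"):
            candidate = read[:i] + base + read[i + 1:]
            if is_solid(candidate):
                return candidate
        return read  # not correctable with one substitution

    reads = ["ACGTACGTAC"] * 5 + ["ACGTACGTAG"] * 5 + ["ACGTTCGTAC"]
    counts = Counter(r[i:i + 5] for r in reads for i in range(len(r) - 4))
    print(correct_read("ACGTTCGTAC", counts))  # -> ACGTACGTAC
    ```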

  1. DecGPU: distributed error correction on massively parallel graphics processing units using CUDA and MPI

    Directory of Open Access Journals (Sweden)

    Schmidt Bertil

    2011-03-01

    Background: Next-generation sequencing technologies have led to the high-throughput production of sequence data (reads) at low cost. However, these reads are significantly shorter and more error-prone than conventional Sanger shotgun reads. This poses a challenge for the de novo assembly in terms of assembly quality and scalability for large-scale short read datasets. Results: We present DecGPU, the first parallel and distributed error correction algorithm for high-throughput short reads (HTSRs) using a hybrid combination of CUDA and MPI parallel programming models. DecGPU provides CPU-based and GPU-based versions, where the CPU-based version employs coarse-grained and fine-grained parallelism using the MPI and OpenMP parallel programming models, and the GPU-based version takes advantage of the CUDA and MPI parallel programming models and employs a hybrid CPU+GPU computing model to maximize the performance by overlapping the CPU and GPU computation. The distributed feature of our algorithm makes it feasible and flexible for the error correction of large-scale HTSR datasets. Using simulated and real datasets, our algorithm demonstrates superior performance, in terms of error correction quality and execution speed, to the existing error correction algorithms. Furthermore, when combined with Velvet and ABySS, the resulting DecGPU-Velvet and DecGPU-ABySS assemblers demonstrate the potential of our algorithm to improve de novo assembly quality for de-Bruijn-graph-based assemblers. Conclusions: DecGPU is publicly available open-source software, written in CUDA C++ and MPI. The experimental results suggest that DecGPU is an effective and feasible error correction algorithm to tackle the flood of short reads produced by next-generation sequencing technologies.

  2. Genetic Algorithm Supported by Graphical Processing Unit Improves the Exploration of Effective Connectivity in Functional Brain Imaging

    Directory of Open Access Journals (Sweden)

    Lawrence Wing Chi Chan

    2015-05-01

    Brain regions of human subjects exhibit certain levels of associated activation upon specific environmental stimuli. Functional magnetic resonance imaging (fMRI) detects regional signals, based on which we can infer the direct or indirect neuronal connectivity between the regions. Structural equation modeling (SEM) is an appropriate mathematical approach for analyzing the effective connectivity using fMRI data. A maximum likelihood (ML) discrepancy function is minimized against some constrained coefficients of a path model. The minimization is an iterative process, and the computing time becomes very long as the number of iterations increases geometrically with the number of path coefficients. On a regular quad-core central processing unit (CPU) platform, a duration of up to three months is required as the model grows from 0 to 30 path coefficients. This study demonstrates the application of a graphics processing unit (GPU) with a parallel genetic algorithm (GA) that replaces the Powell minimization in the standard program code of the analysis software package. It was found in the same example that the GA under the GPU reduced the duration to 20 hours and provided a more accurate solution when compared with the standard program code under the CPU.
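
    A bare-bones, serial, real-coded genetic algorithm of the general kind described above (the paper's contribution is evaluating such a population on a GPU; the SEM maximum-likelihood discrepancy is replaced by a toy quadratic here):

    ```python
    import random

    def genetic_minimize(f, n_params, pop_size=60, generations=200,
                         sigma=0.1, bounds=(-1.0, 1.0)):
        """Elitist real-coded GA: keep the better half, breed the rest by
        averaging two parents and adding Gaussian mutation."""
        lo, hi = bounds
        pop = [[random.uniform(lo, hi) for _ in range(n_params)]
               for _ in range(pop_size)]
        for _ in range(generations):
            pop.sort(key=f)
            parents = pop[:pop_size // 2]
            children = []
            while len(children) < pop_size - len(parents):
                a, b = random.sample(parents, 2)
                children.append([(x + y) / 2 + random.gauss(0, sigma)
                                 for x, y in zip(a, b)])
            pop = parents + children
        return min(pop, key=f)

    # Toy discrepancy: recover coefficients near 0.3 for each of 5 paths.
    best = genetic_minimize(lambda p: sum((x - 0.3) ** 2 for x in p), n_params=5)
    ```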

  3. Producing optical (contact) lenses by a novel low cost process

    Science.gov (United States)

    Skipper, Richard S.; Spencer, Ian D.

    2005-09-01

    The rapid and impressive growth of China has been achieved on the back of highly labour-intensive industries, often in manufacturing, and at the cost of companies and jobs in Europe and America. Approaches that worked well in the 1990s to reduce production costs in the developed countries are no longer effective when confronted with the low labour costs of China and India. We have looked at contact lenses as a product that has become highly available to consumers here, but as an industry that has reduced costs by moving to low-labour-cost countries. The question to be answered was, "Do we have the skill to still make the product in the UK, and can we make it cheap enough to export to China?" If we do not, then contact lens manufacture will move to China sooner or later. The challenge of entering the markets of the BRIC (Brazil, Russia, India and China) countries is extremely exciting, as here is the new money and high growth, and here is a product that sells to those with disposable incomes. To succeed we knew we had to be radical in our approach; the radical step was very simple: to devise a process in which each step added value for the customer rather than cost to the product. The presentation examines the processes used by the major producers and how, by applying good manufacturing practice and sound scientific principles to them, the opportunity to design a new low-cost patented process was identified.

  4. Enzymatic corn wet milling: engineering process and cost model

    Directory of Open Access Journals (Sweden)

    McAloon Andrew J

    2009-01-01

    Background: Enzymatic corn wet milling (E-milling) is a process derived from conventional wet milling for the recovery and purification of starch and co-products, using proteases to eliminate the need for sulfites and decrease the steeping time. In 2006, the total starch production in the USA by conventional wet milling equaled 23 billion kilograms, including modified starches and starches used for sweeteners and ethanol production [1]. Process engineering and cost models for an E-milling process have been developed for a processing plant with a capacity of 2.54 million kg of corn per day (100,000 bu/day). These models are based on the previously published models for a traditional wet milling plant with the same capacity. The E-milling process includes grain cleaning, pretreatment, enzymatic treatment, germ separation and recovery, fiber separation and recovery, gluten separation and recovery and starch separation. Information for the development of the conventional models was obtained from a variety of technical sources including commercial wet milling companies, industry experts and equipment suppliers. Additional information for the present models was obtained from our own experience with the development of the E-milling process and trials in the laboratory and at the pilot plant scale. The models were developed using process and cost simulation software (SuperPro Designer®) and include processing information such as composition and flow rates of the various process streams, descriptions of the various unit operations and detailed breakdowns of the operating and capital cost of the facility. Results: Based on the information from the model, we can estimate the cost of production per kilogram of starch using the input prices for corn, enzyme and other wet milling co-products. The work presented here describes the E-milling process and compares the process, the operation and costs with the conventional process. Conclusion: The E-milling process

  5. Developing a Graphical User Interface to Support a Real-Time Digital Signal Processing System

    Science.gov (United States)

    1993-12-01

    specific purposes (known as colored memory); and explicit support for real-time processes by use of pre-emptive scheduling. 3.3.2 The pMACS Text Editor. The... pMACS text editor is similar to the emacs text editor. It is screen and buffer oriented and uses a large subset of the emacs commands [30]. 3.3.3...Creating new or editing existing code can be performed directly on the PSK system using the pMACS editor. For developers thoroughly familiar with

  6. Development of a Chemically Reacting Flow Solver on the Graphic Processing Units

    Science.gov (United States)

    2011-05-10

    been implemented on the GPU by Schive et al. (2010). The outcome of their work is the GAMER code for astrophysical simulation. Thibault and...model all the elementary reactions and their reverse processes. 4.2 Chemistry Model An elementary reaction takes the form of a stoichiometric balance of coefficients nu_sr over the N_s species for each of the K reactions...are read from separate data files which contain all the species information used for the computation along with the elementary reactions. 4.3

  7. WPA/WPA2 Password Security Testing using Graphics Processing Units

    OpenAIRE

    Sorin Andrei Visan

    2013-01-01

    This thesis focuses on the testing of WPA/WPA2 password strength. Recently, due to progress in calculation power and technology, new factors must be taken into account when choosing a WPA/WPA2 secure password. A study regarding the security of currently deployed passwords is reported here. Harnessing the computational power of a single and old-generation GPU (NVIDIA 610M, released in December 2011), we have accelerated the process of recovering a password up to 3 times faster than in the case...

  8. Graphic notation

    DEFF Research Database (Denmark)

    Bergstrøm-Nielsen, Carl

    2010-01-01

    Graphic notation is taught to music therapy students at Aalborg University in both simple and elaborate forms. This is a method of depicting music visually, and notations may serve as memory aids, as aids for analysis and reflection, and for communication purposes such as supervision or within...

  9. An Improved, Efficient and Cost Effective Software Inspection Meeting Process

    Directory of Open Access Journals (Sweden)

    Dilawar Ali

    2013-02-01

    Full Text Available Normally, the inspection process is seen as just finding defects in software during the software development lifecycle. Software inspection is considered a most cost-effective technique, but if the defects found are not properly corrected or handled, they can cost more than double later in the project. This paper focuses on the last phase of the inspection meeting process, showing the importance of the Follow-Up Stage in software inspection meetings. It also suggests a set of activities that should be performed during the Rework and Follow-Up Stages so as to make inspection meeting results productive and efficient. In this paper we focus on over-the-shoulder reviews so as to ensure software quality with less impact on the total software cost.

  10. A straightforward graphical user interface for basic and advanced signal processing of thermographic infrared sequences

    Science.gov (United States)

    Klein, Matthieu T.; Ibarra-Castanedo, Clemente; Maldague, Xavier P.; Bendada, Abdelhakim

    2008-03-01

    IR-View is a free and open-source Matlab program that was released in 1998 at the Computer Vision and Systems Laboratory (CVSL) at Université Laval, Canada, as an answer to many common and recurrent needs in infrared thermography. IR-View has proven to be a useful tool at CVSL for the past 10 years. The software by itself and/or its concept and functions may be of interest for other laboratories and companies doing research in the IR NDT field. This article describes the functions and processing techniques integrated into IR-View, freely downloadable under the GNU license at http://mivim.gel.ulaval.ca. A demonstration of IR-View functionalities will also be given during the DSS08 SPIE Defense and Security Symposium.

  11. Analysis and simulation of industrial distillation processes using a graphical system design model

    Science.gov (United States)

    Boca, Maria Loredana; Dobra, Remus; Dragos, Pasculescu; Ahmad, Mohammad Ayaz

    2016-12-01

    The separation column of the experimental installation can be configured in two ways: first, as a cascade of two columns of different diameters placed one in the extension of the other, and second, as a single column with a set diameter [1], [2]. The column separates carbon isotopes by cryogenic distillation of pure carbon monoxide, which is fed at a constant flow rate as a gas through the feeding system [1], [2]. Based on numerical control systems used in virtual instrumentation, simulations of the distillation process were performed in order to obtain the isotope 13C at high concentrations. It is proposed that the installation be controlled using a data acquisition tool and professional software that processes information from the isotopic column with a dedicated logic algorithm. The classical isotopic column will be controlled automatically, and information about the main parameters will be monitored and properly displayed by one program. Taking into consideration the very low operating temperature, an efficient thermal-isolation vacuum jacket is necessary. Since the "elementary separation ratio" [2] is very close to unity, in order to raise the 13C isotope concentration up to a desired level a permanent counter-current of the liquid and gaseous phases of the carbon monoxide is created by the main elements of the equipment: the boiler in the bottom of the column and the condenser at the top.

  12. Reactive control processes contributing to residual switch cost and mixing cost in young and old adults

    Directory of Open Access Journals (Sweden)

    Lisa Rebecca Whitson

    2014-04-01

    Full Text Available In task-switching paradigms, performance is better when repeating the same task than when alternating between tasks (switch cost) and when repeating a task alone rather than intermixed with another task (mixing cost). These costs remain even after extensive practice and when task cues enable advance preparation (residual costs). Moreover, residual RT mixing cost has been consistently shown to increase with age. Residual switch and mixing costs modulate the amplitude of the stimulus-locked P3b. This mixing effect is disproportionately larger in older adults, who also prepare more for and respond more cautiously on these 'mixed' repeat trials (Karayanidis et al., 2011). In this study, we examine stimulus-locked and response-locked P3 and lateralized readiness potentials to identify whether residual switch and mixing costs arise from the need to control interference at the level of stimulus processing or response processing. Residual mixing cost was associated with control of stimulus-level interference, whereas residual switch cost was also associated with a delay in response selection. In older adults, the disproportionate increase in mixing cost was associated with greater interference at the level of decision-response mapping and response programming for repeat trials in mixed-task blocks. We argue that, together with evidence of greater proactive control and more cautious responding on these trials, these findings suggest that older adults strategically recruit greater proactive and reactive control to overcome increased susceptibility to post-stimulus interference. This interpretation is consistent with the recruitment of strategies to compensate for a reduced repetition benefit rather than an overall decline in cognitive flexibility.

  13. Operating cost budgeting methods: quantitative methods to improve the process

    Directory of Open Access Journals (Sweden)

    José Olegário Rodrigues da Silva

    Full Text Available Abstract Operating cost forecasts are used in economic feasibility studies of projects and in the budgeting process. Studies have pointed out that some companies are not satisfied with the budgeting process, and chief executive officers want updates more frequently. In these cases, the main problem lies in the costs versus benefits. Companies seek simple and cheap forecasting methods without, at the same time, conceding in terms of the quality of the resulting information. This study aims to compare operating cost forecasting models to identify the ones that are relatively easy to implement and produce smaller deviations. For this purpose, we applied ARIMA (autoregressive integrated moving average) and distributed dynamic lag models to data from a Brazilian petroleum company. The results suggest that the models have potential application, and that multivariate models fitted better and proved a better way to forecast costs than univariate models.
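
    As a sketch of the univariate side of such a comparison, the snippet below fits an ARIMA model to a monthly operating-cost series and forecasts six months ahead. The series, the (1, 1, 1) order, and all figures are illustrative assumptions; the paper's data and its distributed dynamic lag competitor are not reproduced.

      import numpy as np
      from statsmodels.tsa.arima.model import ARIMA

      rng = np.random.default_rng(0)
      # Synthetic monthly operating costs: linear trend plus noise (hypothetical)
      costs = 100 + 0.8 * np.arange(60) + rng.normal(0, 3, 60)

      model = ARIMA(costs, order=(1, 1, 1))   # illustrative order choice
      fitted = model.fit()
      print(fitted.forecast(steps=6))          # next six months of costs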

  14. Vocational Teaching Cube System of Engineering Graphics

    Institute of Scientific and Technical Information of China (English)

    YangDaofu; LiuShenli

    2003-01-01

    Based on long-term research on vocational teaching cube theory in graphics education and an analysis of the intellectual structure involved in reading engineering drawings, a graphics intellectual three-dimensional model made up of 100 cubes is established and tested in higher vocational graphics education. This system serves as good guidance for graphics teaching.

  15. Graphics Processing Unit-Accelerated Code for Computing Second-Order Wiener Kernels and Spike-Triggered Covariance

    Science.gov (United States)

    Mano, Omer

    2017-01-01

    Sensory neuroscience seeks to understand and predict how sensory neurons respond to stimuli. Nonlinear components of neural responses are frequently characterized by the second-order Wiener kernel and the closely-related spike-triggered covariance (STC). Recent advances in data acquisition have made it increasingly common and computationally intensive to compute second-order Wiener kernels/STC matrices. In order to speed up this sort of analysis, we developed a graphics processing unit (GPU)-accelerated module that computes the second-order Wiener kernel of a system’s response to a stimulus. The generated kernel can be easily transformed for use in standard STC analyses. Our code speeds up such analyses by factors of over 100 relative to current methods that utilize central processing units (CPUs). It works on any modern GPU and may be integrated into many data analysis workflows. This module accelerates data analysis so that more time can be spent exploring parameter space and interpreting data. PMID:28068420
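
    A minimal CPU reference for the kind of computation being accelerated, assuming a scalar stimulus and Poisson spike counts, might look as follows; the paper's GPU module replaces the dense linear algebra below with GPU kernels, and all names and toy data here are illustrative.

      import numpy as np

      def spike_triggered_covariance(stimulus, spikes, window):
          """STC of the spike-triggered ensemble of stimulus history vectors."""
          vecs, weights = [], []
          for t in range(window, len(stimulus)):
              if spikes[t] > 0:
                  vecs.append(stimulus[t - window:t])
                  weights.append(spikes[t])
          X, w = np.array(vecs), np.array(weights, dtype=float)
          sta = np.average(X, axis=0, weights=w)     # spike-triggered average
          Xc = (X - sta) * np.sqrt(w)[:, None]       # weight each history vector
          return Xc.T @ Xc / (w.sum() - 1)           # covariance matrix

      rng = np.random.default_rng(1)
      stim = rng.normal(size=10_000)
      spk = rng.poisson(0.05, size=10_000)
      print(spike_triggered_covariance(stim, spk, window=50).shape)  # (50, 50)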

  16. COSTS AND PROFITABILITY IN FOOD PROCESSING: PASTRY TYPE UNITS

    Directory of Open Access Journals (Sweden)

    DUMITRANA MIHAELA

    2013-08-01

    Full Text Available For each company, profitability, product quality and customer satisfaction are the most important targets. To attain these targets, managers need to know all about the costs that are used in decision making. What kinds of costs? How are these costs calculated for a specific sector such as food processing? These are only a few of the questions answered in our paper. We consider that a case study for this sector may be relevant for all people interested in increasing the profitability of this specific activity sector.

  17. Massive Parallelism of Monte-Carlo Simulation on Low-End Hardware using Graphic Processing Units

    Energy Technology Data Exchange (ETDEWEB)

    Mburu, Joe Mwangi; Hah, Chang Joo Hah [KEPCO International Nuclear Graduate School, Ulsan (Korea, Republic of)

    2014-05-15

    Within the past decade, research has been done on utilizing GPU massive parallelization in core simulation with impressive results, but unfortunately not much commercial application has been made in the nuclear field, especially in reactor core simulation. The purpose of this paper is to give an introductory concept on the topic and illustrate the potential of exploiting the massively parallel nature of GPU computing on a simple Monte Carlo simulation with very minimal hardware specifications. To do a comparative analysis, a simple two-dimensional Monte Carlo simulation is implemented for both the CPU and the GPU in order to evaluate the performance gain of the computing devices. The heterogeneous platform utilized in this analysis is a slow notebook with only a 1 GHz processor. The end results are quite surprising: the speedups obtained are almost a factor of 10. In this work, we have utilized heterogeneous computing in a GPU-based approach to applying potentially high arithmetic-intensity calculations. By applying a complex Monte Carlo simulation on the GPU platform, we have sped up the computational process by almost a factor of 10 based on one million neutrons. This shows how easy, cheap and efficient it is to use GPUs to accelerate scientific computing, and the results should encourage further exploration of this avenue, especially in nuclear reactor physics simulation, where deterministic and stochastic calculations are quite amenable to parallelization.
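
    The structure that makes such a simulation GPU-friendly is that each particle history is independent. A toy vectorized sketch in that spirit is given below; the transport physics (isotropic steps, a fixed absorption probability in a disc) is purely illustrative and is not the model used in the paper.

      import numpy as np

      def absorption_fraction(n=100_000, radius=1.0, mfp=0.3,
                              absorb_prob=0.5, seed=0):
          """Fraction of particles absorbed inside a disc of given radius."""
          rng = np.random.default_rng(seed)
          pos = np.zeros((n, 2))
          alive = np.ones(n, dtype=bool)
          absorbed = 0
          while alive.any():
              k = alive.sum()
              step = rng.exponential(mfp, k)
              angle = rng.uniform(0.0, 2.0 * np.pi, k)
              pos[alive] += np.c_[step * np.cos(angle), step * np.sin(angle)]
              alive &= np.linalg.norm(pos, axis=1) <= radius  # escapees die
              idx = np.flatnonzero(alive)
              hit = rng.random(idx.size) < absorb_prob        # collision outcome
              absorbed += hit.sum()
              alive[idx[hit]] = False                         # absorbed die
          return absorbed / n

      print(f"absorption fraction = {absorption_fraction():.3f}")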

  18. WPA/WPA2 Password Security Testing using Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    Sorin Andrei Visan

    2013-12-01

    Full Text Available This thesis focuses on the testing of WPA/WPA2 password strength. Recently, due to progress in calculation power and technology, new factors must be taken into account when choosing a WPA/WPA2 secure password. A study regarding the security of currently deployed passwords is reported here. Harnessing the computational power of a single and old-generation GPU (NVIDIA 610M, released in December 2011), we have accelerated the process of recovering a password up to 3 times compared with using only our CPU's power. We have come to the conclusion that using a modern mid-range GPU, the password recovery time could be reduced by a factor of 10, or far more when using a more elaborate solution such as a GPU cluster service or work distributed between multiple GPUs. This fact should raise an alarm signal to the community as to the way users pick their passwords, as passwords are becoming more and more insecure as greater calculation power becomes available.
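
    The computational bottleneck being attacked is standard: WPA/WPA2-PSK stretches each candidate passphrase into a 256-bit pairwise master key with 4096 iterations of PBKDF2-HMAC-SHA1, salted with the SSID. The sketch below shows that derivation; the SSID and candidate list are made up, and the 4-way-handshake verification step is omitted.

      import hashlib

      def wpa2_pmk(passphrase: str, ssid: str) -> bytes:
          """Pairwise master key: PBKDF2-HMAC-SHA1, 4096 rounds, 32 bytes."""
          return hashlib.pbkdf2_hmac("sha1", passphrase.encode(),
                                     ssid.encode(), 4096, dklen=32)

      # A dictionary attack repeats this per candidate, which is why the
      # workload parallelizes so well across GPU threads.
      for candidate in ["password", "letmein", "correct horse"]:
          print(candidate, "->", wpa2_pmk(candidate, "MyHomeAP").hex()[:16])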

  19. Optimization of Parallel Legendre Transform using Graphics Processing Unit (GPU) for a Geodynamo Code

    Science.gov (United States)

    Lokavarapu, H. V.; Matsui, H.

    2015-12-01

    Convection and the magnetic field of the Earth's outer core are expected to have vast length scales. To resolve these flows, high performance computing is required for geodynamo simulations using the spherical harmonic transform (SHT); a significant portion of the execution time is spent on the Legendre transform. Calypso is a geodynamo code designed to model the magnetohydrodynamics of a Boussinesq fluid in a rotating spherical shell, such as the outer core of the Earth. The code has been shown to scale well on computer clusters capable of computing on the order of 10⁵ cores using Message Passing Interface (MPI) and Open Multi-Processing (OpenMP) parallelization for CPUs. To further optimize, we investigate three different algorithms for the SHT using GPUs. One is to preemptively compute the Legendre polynomials on the CPU before executing the SHT on the GPU within the time integration loop. In the second approach, both the Legendre polynomials and the SHT are computed on the GPU. In the third approach, we initially partition the radial grid for the forward transform and the harmonic order for the backward transform between the CPU and GPU; thereafter, the partitioned work is computed simultaneously in the time integration loop. We examine the trade-offs between space and time, memory bandwidth and GPU computation on Maverick, a Texas Advanced Computing Center (TACC) supercomputer. We have observed improved performance using a GPU-enabled Legendre transform. Furthermore, we will compare and contrast the different algorithms in the context of GPUs.
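
    For reference, the inner product that dominates this cost is sketched serially below for a single harmonic order m, using Gauss-Legendre nodes. Normalization conventions and the actual Calypso data layout are omitted, so treat this as an illustrative assumption rather than the code's implementation.

      import numpy as np
      from scipy.special import lpmv

      def forward_legendre(g, m, l_max, nodes, weights):
          """Project data g(x_j) onto P_l^m for l = m..l_max (unnormalized)."""
          return np.array([np.sum(weights * lpmv(m, l, nodes) * g)
                           for l in range(m, l_max + 1)])

      nodes, weights = np.polynomial.legendre.leggauss(64)
      g = np.cos(3 * np.arccos(nodes))               # toy latitudinal data
      f = forward_legendre(g, m=2, l_max=31, nodes=nodes, weights=weights)
      print(f.shape)                                 # (30,)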

  20. Accelerating Matrix-Vector Multiplication on Hierarchical Matrices Using Graphical Processing Units

    KAUST Repository

    Boukaram, W.

    2015-03-25

    Large dense matrices arise from the discretization of many physical phenomena in computational sciences. In statistics very large dense covariance matrices are used for describing random fields and processes. One can, for instance, describe distribution of dust particles in the atmosphere, concentration of mineral resources in the earth's crust or uncertain permeability coefficient in reservoir modeling. When the problem size grows, storing and computing with the full dense matrix becomes prohibitively expensive both in terms of computational complexity and physical memory requirements. Fortunately, these matrices can often be approximated by a class of data sparse matrices called hierarchical matrices (H-matrices) where various sub-blocks of the matrix are approximated by low rank matrices. These matrices can be stored in memory that grows linearly with the problem size. In addition, arithmetic operations on these H-matrices, such as matrix-vector multiplication, can be completed in almost linear time. Originally the H-matrix technique was developed for the approximation of stiffness matrices coming from partial differential and integral equations. Parallelizing these arithmetic operations on the GPU has been the focus of this work and we will present work done on the matrix vector operation on the GPU using the KSPARSE library.
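
    The source of the near-linear cost is easy to see in miniature: an admissible block approximated as U V^T with rank k is applied in O(k(m+n)) operations instead of O(mn). A minimal sketch with illustrative sizes and rank:

      import numpy as np

      rng = np.random.default_rng(0)
      m, n, k = 2000, 2000, 8                  # block size and rank (assumed)
      U, V = rng.normal(size=(m, k)), rng.normal(size=(n, k))
      x = rng.normal(size=n)

      y_lowrank = U @ (V.T @ x)                # O(k(m+n)) work
      y_dense = (U @ V.T) @ x                  # O(mn) work, same result
      print(np.allclose(y_lowrank, y_dense))   # True up to rounding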

  1. Process cost calculation - an alternative to cost management?; Prozesskostenrechnung - eine Alternative fuer das Kostenmanagement?

    Energy Technology Data Exchange (ETDEWEB)

    Anon.

    1998-01-12

    Increasing competition and changing markets in the energy sector make new demands on cost management, and at the same time on the quality of the information to be provided by costing systems. Process cost calculation, used today in various branches of industry, can contribute - by orienting itself to processes and value-creation chains - to an assignment of costs that does more justice to their causes, so that costing information with improved meaningfulness is available for entrepreneurial decisions. The authors introduce process cost calculation and illustrate how it works using a practice-related application example. Potential fields of use in electricity supply enterprises are subsequently shown. (orig.)

  2. Massively Parallel Signal Processing using the Graphics Processing Unit for Real-Time Brain?Computer Interface Feature Extraction

    OpenAIRE

    J. Adam Wilson; Williams, Justin C.

    2009-01-01

    The clock speeds of modern computer processors have nearly plateaued in the past 5 years. Consequently, neural prosthetic systems that rely on processing large quantities of data in a short period of time face a bottleneck, in that it may not be possible to process all of the data recorded from an electrode array with high channel counts and bandwidth, such as electrocorticographic grids or other implantable systems. Therefore, in this study a method of using the processing capabilities of a ...

  3. Digital image processing for the acquisition of graphic similarity of the distributional patterns between cutaneous lesions of linear scleroderma and Blaschko's lines.

    Science.gov (United States)

    Jue, Mihn Sook; Kim, Moon Hwan; Ko, Joo Yeon; Lee, Chang Woo

    2011-08-01

    The aim of this study was to objectively evaluate whether linear scleroderma (LS) follows Blaschko's lines (BL) in Korean patients using digital image processing. Thirty-two patients with LS were examined. Based on the patients' clinical photographs, their skin lesions were plotted on head and body charts. With the aid of graphics software, a digital image was produced that included an overlay of all the individual lesions and was used to compare the graphics with the published BL. To investigate the image similarity between the graphic patterns of the LS and BL, each case was analyzed by means of Hough transforms and Czekanowski's method. The comparative investigation of the graphic similarity of distributional patterns between the LS and BL showed that Czekanowski's similarity index was 0.947 on average. In conclusion, our objective results suggest that the graphic patterns of the distribution of the LS skin lesions showed a high degree of similarity and were in fact almost identical to those of BL, which may be the lines of embryonic development of the skin. This finding may suggest that developmental factors during the embryological age could constitute a cause of LS.
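
    Czekanowski's index itself is simple: twice the overlapping mass of two patterns divided by their total mass. A sketch on toy rasterized lesion maps is given below; the study's chart plotting and Hough-transform preprocessing are not reproduced.

      import numpy as np

      def czekanowski(a, b):
          """Czekanowski similarity: 2 * sum(min) / sum(a + b), in [0, 1]."""
          a = np.asarray(a, dtype=float).ravel()
          b = np.asarray(b, dtype=float).ravel()
          return 2.0 * np.minimum(a, b).sum() / (a + b).sum()

      lesion_map = np.array([[0, 1, 1], [0, 1, 0], [1, 0, 0]])    # toy LS raster
      blaschko_map = np.array([[0, 1, 1], [1, 1, 0], [1, 0, 0]])  # toy BL raster
      print(f"similarity = {czekanowski(lesion_map, blaschko_map):.3f}")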

  4. The processing cost of scrambling and topicalization in Japanese

    Directory of Open Access Journals (Sweden)

    Satoshi eImamura

    2016-04-01

    Full Text Available This article presents two reading comprehension experiments, using the sentence correctness decision task, that explore the causes of the processing cost of Japanese sentences with SnomOaccV, StopOaccV, OaccSnomV, and OtopSnomV word orders. The first experiment was conducted in order to see whether syntax or frequency plays a significant role in the processing of these sentences. The results of the first experiment show that both the structure-building process and frequency directly affect processing load. We observed that there was no difference in processing cost between SnomOaccV and StopOaccV, both of which are easier to process than OaccSnomV, which is in turn easier to process than OtopSnomV: SnomOaccV = StopOaccV < OaccSnomV < OtopSnomV. This result is a mixture of the two positions. Specifically, the structure-building cost of StopOaccV was neutralized by its high frequency. The aim of the second experiment was to investigate the interaction between syntactic structure, frequency, and information structure. The results showed that the processing cost of OaccSnomV was facilitated by given-new ordering, but SnomOaccV, StopOaccV, and OtopSnomV were not. Thus, we can conclude that information structure also influences processing cost. In addition, the distribution of informational effects can be accounted for by Kuno's (1987: 212) Markedness Principle for Discourse Rule Violations: SnomOaccV and StopOaccV are unmarked/canonical options, and as such are not penalized even when they violate given-new ordering; OaccSnomV is penalized when it does not maintain given-new ordering because it is a marked/non-canonical option; and OtopSnomV is penalized even when it obeys given-new ordering, possibly because more specific contexts are needed. Another reason for the increased processing cost of OtopSnomV is a garden path effect; upon encountering the Otop of OtopSnomV, the parser preferentially (mis)interpreted it as Stop due to a subject

  5. 30 CFR 251.13 - Reimbursement for the costs of reproducing data and information and certain processing costs.

    Science.gov (United States)

    2010-07-01

    ... and information and certain processing costs. 251.13 Section 251.13 Mineral Resources MINERALS... third party for the reasonable costs of processing geophysical information (which does not include cost... OUTER CONTINENTAL SHELF § 251.13 Reimbursement for the costs of reproducing data and information...

  6. BUSINESS PROCESS MODELLING FOR PROJECTS COSTS MANAGEMENT IN AN ORGANIZATION

    Directory of Open Access Journals (Sweden)

    PĂTRAŞCU AURELIA

    2014-05-01

    Full Text Available Using information technologies in organizations represents evident progress for a company, saving money and time and generating value for the organization. In this paper the author proposes to model the business processes of an organization that manages project costs, because modelling is an important part of any software development process. Using software for project cost management is essential because it allows the management of all operations according to established parameters, the management of project groups, and the management of projects and subprojects at different complexity levels.

  7. Monte Carlo-based fluorescence molecular tomography reconstruction method accelerated by a cluster of graphic processing units.

    Science.gov (United States)

    Quan, Guotao; Gong, Hui; Deng, Yong; Fu, Jianwei; Luo, Qingming

    2011-02-01

    High-speed fluorescence molecular tomography (FMT) reconstruction for 3-D heterogeneous media is still one of the most challenging problems in diffusive optical fluorescence imaging. In this paper, we propose a fast FMT reconstruction method that is based on Monte Carlo (MC) simulation and accelerated by a cluster of graphics processing units (GPUs). Based on the Message Passing Interface standard, we modified the MC code for fast FMT reconstruction, and different Green's functions representing the flux distribution in media are calculated simultaneously by different GPUs in the cluster. A load-balancing method was also developed to increase the computational efficiency. By applying the Fréchet derivative, a Jacobian matrix is formed to reconstruct the distribution of the fluorochromes using the calculated Green's functions. Phantom experiments have shown that only 10 min are required to get reconstruction results with a cluster of 6 GPUs, rather than 6 h with a cluster of multiple dual opteron CPU nodes. Because of the advantages of high accuracy and suitability for 3-D heterogeneity media with refractive-index-unmatched boundaries from the MC simulation, the GPU cluster-accelerated method provides a reliable approach to high-speed reconstruction for FMT imaging.

  8. Performance of heterogeneous computing with graphics processing unit and many integrated core for hartree potential calculations on a numerical grid.

    Science.gov (United States)

    Choi, Sunghwan; Kwon, Oh-Kyoung; Kim, Jaewook; Kim, Woo Youn

    2016-09-15

    We investigated the performance of heterogeneous computing with graphics processing units (GPUs) and many integrated core (MIC) with 20 CPU cores (20×CPU). As a practical example toward large scale electronic structure calculations using grid-based methods, we evaluated the Hartree potentials of silver nanoparticles with various sizes (3.1, 3.7, 4.9, 6.1, and 6.9 nm) via a direct integral method supported by the sinc basis set. The so-called work stealing scheduler was used for efficient heterogeneous computing via the balanced dynamic distribution of workloads between all processors on a given architecture without any prior information on their individual performances. 20×CPU + 1GPU was up to ∼1.5 and ∼3.1 times faster than 1GPU and 20×CPU, respectively. 20×CPU + 2GPU was ∼4.3 times faster than 20×CPU. The performance enhancement by CPU + MIC was considerably lower than expected because of the large initialization overhead of MIC, although its theoretical performance is similar to that of CPU + GPU. © 2016 Wiley Periodicals, Inc.

  9. GPUDePiCt: A Parallel Implementation of a Clustering Algorithm for Computing Degenerate Primers on Graphics Processing Units.

    Science.gov (United States)

    Cickovski, Trevor; Flor, Tiffany; Irving-Sachs, Galen; Novikov, Philip; Parda, James; Narasimhan, Giri

    2015-01-01

    In order to make multiple copies of a target sequence in the laboratory, the technique of Polymerase Chain Reaction (PCR) requires the design of "primers", which are short fragments of nucleotides complementary to the flanking regions of the target sequence. If the same primer is to amplify multiple closely related target sequences, then it is necessary to make the primers "degenerate", which would allow it to hybridize to target sequences with a limited amount of variability that may have been caused by mutations. However, the PCR technique can only allow a limited amount of degeneracy, and therefore the design of degenerate primers requires the identification of reasonably well-conserved regions in the input sequences. We take an existing algorithm for designing degenerate primers that is based on clustering and parallelize it in a web-accessible software package GPUDePiCt, using a shared memory model and the computing power of Graphics Processing Units (GPUs). We test our implementation on large sets of aligned sequences from the human genome and show a multi-fold speedup for clustering using our hybrid GPU/CPU implementation over a pure CPU approach for these sequences, which consist of more than 7,500 nucleotides. We also demonstrate that this speedup is consistent over larger numbers and longer lengths of aligned sequences.
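
    The degeneracy bookkeeping at the heart of any such designer is straightforward: each IUPAC code matches a set of bases, and a primer's degeneracy is the product of the set sizes across positions. The cap below is an illustrative threshold, not a value from the paper.

      from math import prod

      # Number of bases each IUPAC nucleotide code matches
      IUPAC = {"A": 1, "C": 1, "G": 1, "T": 1,
               "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
               "B": 3, "D": 3, "H": 3, "V": 3, "N": 4}

      def degeneracy(primer: str) -> int:
          """Count of distinct plain primers a degenerate primer expands to."""
          return prod(IUPAC[base] for base in primer.upper())

      primer = "ATGRYNCCT"
      print(primer, "degeneracy =", degeneracy(primer))  # 2 * 2 * 4 = 16
      assert degeneracy(primer) <= 512                   # hypothetical PCR cap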

  10. Resurfacing Graphics

    Directory of Open Access Journals (Sweden)

    Prof. Patty K. Wongpakdee

    2013-06-01

    Full Text Available "Resurfacing Graphics" deals with the subject of unconventional design, with the purpose of engaging the viewer to experience graphics beyond paper's passive surface. Unconventional designs serve to reinvigorate people whose senses are dulled by the typical printed graphics that bombard them each day. Today's cutting-edge designers, illustrators and artists utilize graphics in a unique manner that allows for tactile interaction. Such works serve as valuable teaching models and encourage students to do the following: (1) investigate the trans-disciplines of art and technology; (2) appreciate that this approach can have a positive effect on the environment; (3) examine and research other approaches to design communications; and (4) utilize new mediums to stretch the boundaries of artistic endeavor. This paper examines how visual communicators are "Resurfacing Graphics" by using atypical surfaces and materials such as textile, wood, ceramics and even water. Such non-traditional transmissions of visual language serve to demonstrate students' overreliance on paper as an outdated medium. With this exposure, students can become forward-thinking, eco-friendly, creative leaders by expanding their creative breadth and continuing the perpetual exploration for new ways to make their mark.

  12. The hand surgery fellowship application process: expectations, logistics, and costs.

    Science.gov (United States)

    Meals, Clifton; Osterman, Meredith

    2015-04-01

    To investigate expectations, logistics, and costs relevant to the hand surgery fellowship application process. We sought to discover (1) what both applicants and program directors are seeking, (2) what both parties have to offer, (3) how both parties collect information about each other, and (4) the costs incurred in arranging each match. We conducted on-line surveys of hand surgery fellowship applicants for appointment in 2015 and of current fellowship program directors. Sixty-two applicants and 41 program directors completed the survey. Results revealed applicants' demographic characteristics, qualifications, method of ranking hand fellowship programs, costs incurred (both monetary and opportunity) during the application process, ultimate match status, and suggestions for change. Results also revealed program directors' program demographics, rationale for offering interviews and favorably ranking applicants, application-related logistical details, costs incurred (both monetary and opportunity) during the application process, and suggestions for change. Applicants for hand surgery fellowship training are primarily interested in a potential program's academic reputation, emphasis on orthopedic surgery, and location. The typical, successfully matched applicant was a 30-year-old male orthopedic resident with 3 publications to his credit. Applicants rely on peers and Web sites for information about fellowships. Fellowship directors are primarily seeking applicants recommended by other experienced surgeons and with positive personality traits. The typical fellowship director offers a single year of orthopedic-based fellowship training to 2 fellows per year and relies on a common application and in-person interviews to collect information about applicants. Applicants appear to be more concerned than directors about the current state of the match process. Applicants and directors alike incur heavy costs, in both dollars and opportunity, to arrange each match. A nuanced

  13. Mapping graphic design practice & pedagogy

    OpenAIRE

    Corazzo, James; Raven, Darren

    2016-01-01

    Workshop Description Mapping graphic design pedagogy will explore the complex, expanding and fragmenting fields of graphic design through the process of visual mapping. This experimental, collaborative workshop will enable participants to conceive and develop useful frameworks for navigating the expanding arena of graphic design that has grown from its roots in professional practice and now come to include areas of ethical, political, socio-economic, cultural and critical design. For a...

  14. Graphic filter library implemented in CUDA language

    OpenAIRE

    Peroutková, Hedvika

    2009-01-01

    This thesis deals with the problem of reducing the computation time of raster image processing by parallel computing on a graphics processing unit. Raster image processing here refers to the application of graphic filters, which can be applied in sequence with different settings. The thesis evaluates the suitability of using parallelization on a graphics card for raster image adjustments based on multicriterial choice. Filters are implemented for the graphics processing unit in the CUDA language. Opacity ...

  15. Research of physical-chemical processes in optically transparent materials during coloring points formation by volumetric-graphical laser processing

    Science.gov (United States)

    Davidov, Nicolay N.; Sushkova, L. T.; Rufitskii, M. V.; Kudaev, Serge V.; Galkin, Arkadii F.; Orlov, Vitalii N.; Prokoshev, Valerii G.

    1996-03-01

    A distinctive feature of glass is the wide range of correlation between internal absorption and transmission of electromagnetic radiation over a broad wavelength range, from gamma rays up to infrared radiation. This provides an opportunity to search for new process implementations for the machining, control and exploitation of glassware for home appliances, radioelectronics and illumination.

  16. RTM: Cost-effective processing of composite structures

    Science.gov (United States)

    Hasko, Greg; Dexter, H. Benson

    1991-01-01

    Resin transfer molding (RTM) is a promising method for the cost-effective fabrication of high-strength, low-weight composite structures from textile preforms. In this process, dry fibers are placed in a mold, resin is introduced either by vacuum infusion or pressure, and the part is cured. RTM has been used in many industries, including automotive, recreation, and aerospace. Each of these industries has different requirements for material strength, weight, reliability, environmental resistance, cost, and production rate. These requirements drive the selection of fibers and resins, fiber volume fractions, fiber orientations, mold design, and processing equipment. Research into applying RTM to primary aircraft structures, which require high strength and stiffness at low density, is described. The material requirements of the various industries are discussed, along with methods of orienting and distributing fibers, mold configurations, and processing parameters. Processing and material parameters such as resin viscosity, preform compaction and permeability, and tool design concepts are discussed, as are experimental methods to measure preform compaction and permeability.
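
    Not from the article, but a standard way these parameters combine: under Darcy's law, the fill time of a one-dimensional mold at constant injection pressure is t = phi * mu * L^2 / (2 * K * dP). The sketch below evaluates it with hypothetical values.

      phi = 0.5     # preform porosity, 1 - fiber volume fraction (assumed)
      mu = 0.2      # resin viscosity, Pa*s (assumed)
      K = 1e-10     # preform permeability, m^2 (assumed)
      dP = 2e5      # injection pressure drop, Pa (assumed)
      L = 0.5       # flow length, m (assumed)

      t_fill = phi * mu * L**2 / (2 * K * dP)   # Darcy's-law fill time
      print(f"estimated fill time: {t_fill:.0f} s ({t_fill / 60:.1f} min)")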

  17. Monte Carlo standardless approach for laser induced breakdown spectroscopy based on massive parallel graphic processing unit computing

    Science.gov (United States)

    Demidov, A.; Eschlböck-Fuchs, S.; Kazakov, A. Ya.; Gornushkin, I. B.; Kolmhofer, P. J.; Pedarnig, J. D.; Huber, N.; Heitz, J.; Schmid, T.; Rössler, R.; Panne, U.

    2016-11-01

    An improved Monte Carlo (MC) method for standardless analysis in laser-induced breakdown spectroscopy (LIBS) is presented. Concentrations in MC LIBS are found by fitting model-generated synthetic spectra to experimental spectra. The current version of MC LIBS is based on graphics processing unit (GPU) computation and reduces the analysis time to several seconds per spectrum/sample. The previous version of MC LIBS, which was based on central processing unit (CPU) computation, required unacceptably long analysis times of tens of minutes per spectrum/sample. The reduction in computational time is achieved through massively parallel computing on the GPU, which embeds thousands of co-processors. It is shown that the number of iterations on the GPU exceeds that on the CPU by a factor > 1000 for the 5-dimensional parameter space, and yet requires a > 10-fold shorter computational time. The improved GPU MC LIBS outperforms the CPU MC LIBS in terms of accuracy, precision, and analysis time. The performance is tested on LIBS spectra obtained from pelletized powders of metal oxides consisting of CaO, Fe2O3, MgO, and TiO2 that simulate by-products of the steel industry, steel slags. It is demonstrated that GPU-based MC LIBS is capable of rapid multi-element analysis with relative errors between 1 and tens of percent, which is sufficient for industrial applications (e.g. steel slag analysis). The results of the improved GPU-based MC LIBS compare favorably with those of the CPU-based MC LIBS as well as with the results of standard calibration-free (CF) LIBS based on the Boltzmann plot method.

  18. 19 CFR 10.814 - Direct costs of processing operations.

    Science.gov (United States)

    2010-04-01

    ... 19 Customs Duties 1 2010-04-01 2010-04-01 false Direct costs of processing operations. 10.814 Section 10.814 Customs Duties U.S. CUSTOMS AND BORDER PROTECTION, DEPARTMENT OF HOMELAND SECURITY... administrative salaries, casualty and liability insurance, advertising, and salesmen's salaries, commissions,...

  19. 19 CFR 10.774 - Direct costs of processing operations.

    Science.gov (United States)

    2010-04-01

    ... 19 Customs Duties 1 2010-04-01 2010-04-01 false Direct costs of processing operations. 10.774 Section 10.774 Customs Duties U.S. CUSTOMS AND BORDER PROTECTION, DEPARTMENT OF HOMELAND SECURITY... administrative salaries, casualty and liability insurance, advertising, and salesmen's salaries, commissions,...

  20. Algorithm of graphics processing units based on cosmological calculations

    Institute of Scientific and Technical Information of China (English)

    郭祖华; 贾积身; 马世霞

    2014-01-01

    The next generation of survey telescopes will yield measurements of billions of galaxies, which would make processing the data on CPUs inefficient and costly. To address this problem, this paper proposes using graphics processing units (GPUs) for cosmological computing problems. First, two cosmological calculations are studied: the two-point angular correlation function and the aperture mass statistic. Then, both algorithms are implemented on the GPU by constructing code using CUDA. Finally, the calculation speeds are compared with comparable code run on the CPU. Experimental results indicate that calculation speeds on GPUs are significantly improved compared with CPUs.
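
    The O(N^2) pair counting that dominates the two-point angular correlation function is what benefits from the GPU. A brute-force CPU sketch using the simple estimator w(theta) = DD/RR - 1 is shown below on toy sky points; the paper's estimator and data are not reproduced.

      import numpy as np

      def radec_to_unit(ra, dec):
          """Unit vectors on the sphere from right ascension and declination."""
          return np.c_[np.cos(dec) * np.cos(ra),
                       np.cos(dec) * np.sin(ra),
                       np.sin(dec)]

      def pair_counts(xyz, bins):
          """Histogram of angular separations over all unique pairs."""
          cosang = np.clip(xyz @ xyz.T, -1.0, 1.0)
          theta = np.arccos(cosang[np.triu_indices(len(xyz), k=1)])
          return np.histogram(theta, bins=bins)[0]

      rng = np.random.default_rng(0)
      n, bins = 500, np.linspace(0.01, 0.5, 11)
      data = radec_to_unit(rng.uniform(0, 2 * np.pi, n), rng.uniform(-0.5, 0.5, n))
      rand = radec_to_unit(rng.uniform(0, 2 * np.pi, n), rng.uniform(-0.5, 0.5, n))
      dd, rr = pair_counts(data, bins), pair_counts(rand, bins)
      print(dd / np.maximum(rr, 1) - 1)          # w(theta) per bin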

  1. Cost effective processes by using negative-tone development application

    Science.gov (United States)

    Yamamoto, Kei; Kato, Keita; Ou, Keiyu; Shirakawa, Michihiro; Kamimura, Sou

    2015-03-01

    High-volume manufacturing with extreme ultraviolet (EUV) lithography has been delayed by light-source issues. Therefore, ArF-immersion lithography is still the most promising technology for down-scaling of device pitch. As the limit of ArF-immersion single patterning is considered to be nearly 40 nm half pitch (hp), ArF-immersion lithography needs to be extended by combining processes to achieve sub-20 nm hp patterning. Recently, there have been many reports about the extension of ArF-immersion lithography, e.g., self-aligned multiple patterning (SAMP) and litho-etch-litho-etch (LELE) processes. These methods are realized by a combination of lithography, deposition, and etching. On another front, 1-D layouts are being adopted for leading devices, which require additional cut or block litho and etch processes to form 2-D-like layouts. Thus, as down-scaling technologies progress, the number of processes increases and the cost of ownership (CoO) cannot be neglected. In particular, the number of lithography and etching steps has been expanded by the combination of processes, and it has come to occupy a large portion of total manufacturing cost. We have reported that negative-tone development (NTD) systems using organic solvent developers have enough resolution to achieve fine narrow-trench or contact-hole patterning, since negative-tone imaging enables the use of a bright mask for these patterns with significantly higher optical image contrast compared to positive-tone imaging, and it has contributed to high-throughput multiple patterning. On the other hand, the NTD system is found to be useful not only for leading device nodes, but also for cost-effective processes. In this report, we propose a cost-effective process using an NTD application. From the viewpoint of cost reduction at the exposure tool, we have developed a KrF NTD resist which is customized for organic solvent developers. Our KrF NTD resist has resolution comparable with ArF positive tone development

  2. Designing and Implementing an OVERFLOW Reader for ParaView and Comparing Performance Between Central Processing Units and Graphical Processing Units

    Science.gov (United States)

    Chawner, David M.; Gomez, Ray J.

    2010-01-01

    In the Applied Aerosciences and CFD branch at Johnson Space Center, computational simulations are run that face many challenges, two of which are the ability to customize software for specialized needs and the need to run simulations as fast as possible. There are many different tools used for running these simulations, and each one has its own pros and cons. Once these simulations are run, there needs to be software capable of visualizing the results in an appealing manner. Some of this software is open source, meaning that anyone can edit the source code, make modifications, and distribute them to all other users in a future release. This is very useful, especially in this branch where many different tools are being used. File readers can be written to load any file format into a program, to ease the bridging from one tool to another. Programming such a reader requires knowledge of the file format being read as well as the equations necessary to obtain the derived values after loading. When running these CFD simulations, extremely large files are loaded and values calculated; these simulations usually take a few hours to complete, even on the fastest machines. Graphics processing units (GPUs) were originally used to render graphics on computers; however, in recent years GPUs have been used for more generic applications because of their speed. Applications run on GPUs have been known to run up to forty times faster than they would on normal central processing units (CPUs). If these CFD programs are extended to run on GPUs, the time they require to complete would be much less. This would allow more simulations to be run in the same amount of time and possibly permit more complex computations.

  3. Compute-unified device architecture implementation of a block-matching algorithm for multiple graphical processing unit cards

    Science.gov (United States)

    Massanes, Francesc; Cadennes, Marie; Brankov, Jovan G.

    2011-07-01

    We describe and evaluate a fast implementation of a classical block-matching motion estimation algorithm for multiple graphics processing units (GPUs) using the compute unified device architecture (CUDA) computing engine. The implemented block-matching algorithm uses the summed absolute difference (SAD) error criterion and full grid search (FS) for finding the optimal block displacement. In this evaluation, we compared the execution time of GPU and CPU implementations for images of various sizes, using integer and noninteger search grids. The results show that use of a GPU card can shorten computation time by a factor of 200 for an integer search grid and 1000 for a noninteger search grid. The additional speedup for a noninteger search grid comes from the fact that the GPU has built-in hardware for image interpolation. Further, when using multiple GPU cards, the presented evaluation shows the importance of the data-splitting method across multiple cards, but an almost linear speedup with the number of cards is achievable. In addition, we compared the execution time of the proposed FS GPU implementation with two existing, highly optimized non-full-grid-search CPU-based motion estimation methods, namely the implementation of the pyramidal Lucas-Kanade optical flow algorithm in OpenCV and the simplified unsymmetrical multi-hexagon search in the H.264/AVC standard. In these comparisons, the FS GPU implementation still showed modest improvement even though its computational complexity is substantially higher than that of the non-FS CPU implementations. We also demonstrated that for an image sequence of 720 × 480 pixels, a resolution commonly used in video surveillance, the proposed GPU implementation is sufficiently fast for real-time motion estimation at 30 frames per second using two NVIDIA C1060 Tesla GPU cards.
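
    A CPU reference of the algorithm being ported, full-grid search with the SAD criterion on an integer grid, is sketched below; the block size, search range, and test data are illustrative.

      import numpy as np

      def best_displacement(ref, cur, y, x, block=16, search=8):
          """Displacement (dy, dx) minimizing SAD for the block at (y, x)."""
          target = cur[y:y + block, x:x + block].astype(np.int32)
          best_sad, best = None, (0, 0)
          for dy in range(-search, search + 1):
              for dx in range(-search, search + 1):
                  yy, xx = y + dy, x + dx
                  if 0 <= yy <= ref.shape[0] - block and 0 <= xx <= ref.shape[1] - block:
                      sad = np.abs(ref[yy:yy + block, xx:xx + block].astype(np.int32)
                                   - target).sum()
                      if best_sad is None or sad < best_sad:
                          best_sad, best = sad, (dy, dx)
          return best, best_sad

      rng = np.random.default_rng(0)
      ref = rng.integers(0, 256, (64, 64), dtype=np.uint8)
      cur = np.roll(ref, shift=(2, -3), axis=(0, 1))     # shift frame by (2, -3)
      print(best_displacement(ref, cur, 24, 24))         # ((-2, 3), 0)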

  4. Evaluation of Selected Resource Allocation and Scheduling Methods in Heterogeneous Many-Core Processors and Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    Ciznicki Milosz

    2014-12-01

    Full Text Available Heterogeneous many-core computing resources are increasingly popular among users due to their improved performance over homogeneous systems. Many developers have realized that heterogeneous systems, e.g. a combination of a shared-memory multi-core CPU machine with massively parallel graphics processing units (GPUs), can provide significant performance opportunities to a wide range of applications. However, the best overall performance can only be achieved if application tasks are efficiently assigned in time to the different types of processor units, taking into account their specific resource requirements. Additionally, one should note that available heterogeneous resources have been designed as general-purpose units, albeit with many built-in features that accelerate specific application operations. In other words, the same algorithm or application functionality can be implemented as a different task for a CPU or a GPU. Nevertheless, from the perspective of various evaluation criteria, e.g. total execution time or energy consumption, we may observe completely different results. Therefore, as tasks can be scheduled and managed in many alternative ways on both many-core CPUs and GPUs, and consequently have a huge impact on overall computing resource performance, there is a need for new and improved resource management techniques. In this paper we discuss results achieved during experimental performance studies of selected task scheduling methods in heterogeneous computing systems. Additionally, we present a new architecture for a resource allocation and task scheduling library which provides a generic application programming interface at the operating system level for improving scheduling policies, taking into account the diversity of tasks and the characteristics of heterogeneous computing resources.

  5. Real-time processing for full-range Fourier-domain optical-coherence tomography with zero-filling interpolation using multiple graphic processing units.

    Science.gov (United States)

    Watanabe, Yuuki; Maeno, Seiya; Aoshima, Kenji; Hasegawa, Haruyuki; Koseki, Hitoshi

    2010-09-01

    The real-time display of full-range, 2048 axial pixel × 1024 lateral pixel, Fourier-domain optical-coherence tomography (FD-OCT) images is demonstrated. The required speed was achieved by using dual graphics processing units (GPUs) with many stream processors to realize highly parallel processing. We used a zero-filling technique, including a forward Fourier transform, a zero padding to increase the axial data-array size to 8192, an inverse Fourier transform back to the spectral domain, a linear interpolation from wavelength to wavenumber, a lateral Hilbert transform to obtain the complex spectrum, a Fourier transform to obtain the axial profiles, and a log scaling. The data-transfer time of the frame grabber was 15.73 ms, and the processing time, which includes the data transfer between the GPU memory and the host computer, was 14.75 ms, for a total time shorter than the 36.70 ms frame-interval time using a line-scan CCD camera operated at 27.9 kHz. That is, our OCT system achieved a processed-image display rate of 27.23 frames/s.
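
    A per-frame NumPy sketch of that processing chain is given below. The array sizes follow the abstract, but the wavelength axis, spectral shaping, and zero-padding details are simplifying assumptions rather than the authors' GPU code.

      import numpy as np
      from scipy.signal import hilbert

      def process_frame(raw, lam):
          """raw: (n_lines, 2048) spectra sampled evenly in wavelength lam."""
          n_lines, n_pix = raw.shape
          # 1) zero-filling: FFT, pad spectrum to 8192, inverse FFT
          spec = np.fft.fft(raw, axis=1)
          padded = np.zeros((n_lines, 8192), dtype=complex)
          padded[:, :n_pix // 2] = spec[:, :n_pix // 2]
          padded[:, -n_pix // 2:] = spec[:, -n_pix // 2:]
          dense = np.fft.ifft(padded, axis=1).real
          # 2) linear resampling from wavelength to evenly spaced wavenumber
          lam_dense = np.linspace(lam[0], lam[-1], 8192)
          k_dense = 2 * np.pi / lam_dense              # decreasing in lam
          k_even = np.linspace(k_dense.min(), k_dense.max(), 8192)
          resampled = np.array([np.interp(k_even, k_dense[::-1], row[::-1])
                                for row in dense])
          # 3) Hilbert transform along the lateral direction (complex spectrum)
          complex_spec = hilbert(resampled, axis=0)
          # 4) Fourier transform to depth, then log scale for display
          return 20 * np.log10(np.abs(np.fft.fft(complex_spec, axis=1)) + 1e-12)

      frame = process_frame(np.random.randn(1024, 2048),
                            np.linspace(800e-9, 880e-9, 2048))
      print(frame.shape)                               # (1024, 8192)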

  6. Process-Costing, Job-Order-Costing, Operation Costing (also called Batch Costing and Functional Costing - when the systems way of thinking underlies management accounting and its decisions)

    DEFF Research Database (Denmark)

    Nielsen, Steen

    2005-01-01

    The three concepts of process costing, job-order costing and operation costing, together with functional-based costing, are in fact historical concepts that go far back in the management accounting literature, in fact back to the Scientific Management movement of the 1920s and 1930s. One can therefore not say that these...

  7. Measuring Cognitive Load in Test Items: Static Graphics versus Animated Graphics

    Science.gov (United States)

    Dindar, M.; Kabakçi Yurdakul, I.; Inan Dönmez, F.

    2015-01-01

    The majority of multimedia learning studies focus on the use of graphics in the learning process, but very few of them examine the role of graphics in testing students' knowledge. This study investigates the use of static graphics versus animated graphics in a computer-based English achievement test from a cognitive load theory perspective. Three…

  8. Real time processing of Fourier domain optical coherence tomography with fixed-pattern noise removal by partial median subtraction using a graphics processing unit.

    Science.gov (United States)

    Watanabe, Yuuki

    2012-05-01

    The author presents graphics processing unit (GPU) programming for real-time Fourier-domain optical coherence tomography (FD-OCT) with fixed-pattern noise removal by subtracting means and medians. In general, the fixed-pattern noise can be removed by subtracting the spectrum averaged over the many spectra of an actual measurement. However, a mean spectrum results in artifacts, residual lateral lines, caused by a small number of highly reflective points on the sample surface. These artifacts can be eliminated from OCT images by using medians instead of means. However, median calculations, which are based on a sorting algorithm, can require a large amount of computation time. With the developed GPU programming, highly reflective surface regions are identified by calculating the standard deviation of the Fourier-transformed data in the lateral direction. The medians are then subtracted at those regions and the means at the other regions, such as the background. When the median calculation covered fewer than 256 positions out of a total of 512 depths in an OCT image with 1024 A-lines, the GPU processing rate was faster than that of the line-scan camera (46.9 kHz). Therefore, processed OCT images can be displayed in real time using partial medians.
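
    In outline, the partial median subtraction can be sketched as below: estimate the fixed pattern with a lateral mean at most depths, but with the more expensive lateral median at the depths flagged by a large lateral standard deviation. Sizes and the flagging rule are illustrative.

      import numpy as np

      def remove_fixed_pattern(frame, max_median_depths=256):
          """frame: (n_alines, n_depths) array of Fourier-transformed data."""
          std = frame.std(axis=0)                         # lateral std per depth
          flagged = np.argsort(std)[-max_median_depths:]  # likely surface rows
          background = frame.mean(axis=0)                 # cheap mean everywhere
          background[flagged] = np.median(frame[:, flagged], axis=0)
          return frame - background                       # per-depth subtraction

      frame = np.abs(np.random.randn(1024, 512)) + 5.0    # toy OCT frame
      print(remove_fixed_pattern(frame).shape)            # (1024, 512)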

  9. Sustainable cost reduction by lean management in metallurgical processes

    OpenAIRE

    A. V. Todorut; L. Paliu-Popa; V. S. Tselentis; D. Cirnu

    2016-01-01

    This paper focuses on the need for sustainable cost reduction in the metallurgical industry by applying Lean Management (LM) tools and concepts in metallurgical production processes leading to increased competitiveness of corporations in a global market. The paper highlights that Lean Management is a novel way of thinking, adapting to change, reducing waste and continuous improvement, leading to sustainable development of companies in the metallurgical industry. The authors outline the main L...

  10. Applications of MATLAB Graphical User Interfaces in Teaching Digital Image Processing

    Institute of Scientific and Technical Information of China (English)

    邱广萍

    2014-01-01

    MATLAB GUIDE (Graphical User Interfaces) is a tool designed to present the basic functions of MATLAB in demonstration programs. Through several design examples commonly used in digital image processing, this paper develops a computer-aided teaching system using MATLAB GUIDE and shows the advantages of MATLAB GUIDE in the teaching of the digital image processing course.

  11. Energy conservation and cost benefits in the dairy processing industry

    Energy Technology Data Exchange (ETDEWEB)

    None

    1982-01-01

    Guidance is given on measuring energy consumption in the plant and pinpointing areas where energy-conservation activities can return the most favorable economics. General energy-conservation techniques applicable to most or all segments of the dairy processing industry, including the fluid milk segment, are emphasized. These general techniques include waste heat recovery, improvements in electric motor efficiency, added insulation, refrigeration improvements, upgrading of evaporators, and increases in boiler efficiency. Specific examples are given in which these techniques are applied to dairy processing plants. The potential for energy savings by cogeneration of process steam and electricity in the dairy industry is also discussed. Process changes primarily applicable to specific milk products which have resulted in significant energy cost savings at some facilities or which promise significant contributions in the future are examined. A summary checklist of plant housekeeping measures for energy conservation and guidelines for economic evaluation of conservation alternatives are provided. (MHR)

  12. Low-cost EUV collector development: design, process, and fabrication

    Science.gov (United States)

    Venables, Ranju D.; Goldstein, Michael; Engelhaupt, Darell; Lee, Sang H.; Panning, Eric M.

    2007-03-01

    Cost of ownership (COO) is an area of concern that may limit the adoption and usage of Extreme Ultraviolet Lithography (EUVL). One of the key optical components contributing to the COO budget is the collector. The collectors being fabricated today are based on existing x-ray optic designs and fabrication processes, and the main contributors to collector COO are fabrication cost and lifetime. We present experimental data and optical modeling to demonstrate a roadmap for optimized efficiency and a possible approach for significant reduction in collector COO. Current state-of-the-art collectors are based on a Wolter type-1 design adapted from x-ray telescopes; that design uses a long format suitable for imaging distant light sources such as stars. As applied to industrial equipment and very bright nearby sources, however, a Wolter collector tends to be expensive and requires significant debris shielding and integrated cooling solutions due to the source proximity and the length of the collector shells. Three collector concepts are discussed in this work. The elliptical collector, which has been used as a test bed to demonstrate an alternative cost-effective fabrication method, has been optimized for collection efficiency; the fabrication method can be applied to other optical designs as well, and the number of shells and their design may be modified to increase the collection efficiency and to accommodate different EUV sources. The fabrication process used in this work starts with a glass mandrel, which is elliptical on the inside. A seed layer is coated on the inside of the glass mandrel, followed by electroplating of nickel. The inside/exposed surface of the electroformed nickel is then polished to meet the figure and finish requirements for the particular shell and finally coated with Ru or a multilayer film, depending on the angle of incidence of the EUV light. Finally, the collector shell is released from the inside surface of the mandrel. There are

  13. DGLa:A Distributed Graphics Language

    Institute of Scientific and Technical Information of China (English)

    潘志庚; 胡冰峰; et al.

    1994-01-01

    A distributed graphics programming language called DGLa is presented, which facilitates the development of distributed graphics applications. Facilities for distributed programming and graphics support are included. The language supports both synchronous and asynchronous communication and provides the programmer with multiple control mechanisms for interprocess communication. The graphics support of DGLa is powerful: both a sequential graphics library and a parallel graphics library are provided. Design considerations and implementation experience are discussed in detail in this paper, and application examples are given.

  14. Design Graphics

    Science.gov (United States)

    1990-01-01

    A mathematician, David R. Hedgley, Jr., developed a computer program that considers whether a line in a graphic model of a three-dimensional object should or should not be visible. Known as the Hidden Line Computer Code, the program automatically removes superfluous lines and displays an object from a specific viewpoint, just as the human eye would see it. An example of how one company uses the program is the experience of Birdair, which specializes in production of fabric skylights and stadium covers. The fabric, called SHEERFILL, is a Teflon-coated fiberglass material developed in cooperation with the DuPont Company. SHEERFILL glazed structures are either tension structures or air-supported tension structures; both are formed by patterned fabric sheets supported by a steel or aluminum frame or cable network. Birdair uses the Hidden Line Computer Code to illustrate a prospective structure to an architect or owner. The program generates a three-dimensional perspective with the hidden lines removed. This program is still used by Birdair and continues to be commercially available to the public.

  15. Damage costs due to bedload transport processes in Switzerland

    Science.gov (United States)

    Badoux, A.; Andres, N.; Turowski, J. M.

    2014-02-01

    In Alpine regions, floods are often associated with erosion, transport, and deposition of coarse sediment along streams. These processes are related to bedload transport and pose a hazard in addition to the elevated water discharge; however, it is unclear to what extent they contribute to the total damage caused by natural hazards. Using the Swiss flood and landslide damage database - which collects financial damage data for naturally triggered floods, debris flows, and landslides - we estimated the contribution of fluvial bedload transport processes to total damage costs in Switzerland. For each database entry, an upper and a lower limit of the financial losses caused by or related to bedload transport processes was estimated, and the quality of the estimate was judged. Compared to total damage, the fraction of bedload transport damage over the 40 yr study period lies between 0.32 and 0.37, although this value is highly variable for individual years (from 0.02 to 0.72). Bedload transport processes have induced cumulative financial losses of between CHF 4.3 and 5.1 billion. Spatial analysis revealed a considerably heterogeneous distribution, with the largest damage in mountainous regions. The seasonal distribution shows that more than 75 % of the bedload damage costs occur in summer (June-August) and ~23 % in autumn (September-November); with roughly 56 %, by far most of the damage has been registered in August. Bedload transport processes are presently still inadequately understood, and the predictive quality of common bedload equations is often poor. Our analysis demonstrates the importance of bedload transport as a natural hazard and source of financial risk, and thus the need for future structured research on transport processes in steep streams.
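
    The bookkeeping behind the reported bounds is simple aggregation of per-entry estimates. A toy Python sketch with invented numbers, illustrating how per-entry lower/upper estimates yield fraction bounds:

        # (total damage, bedload damage low, bedload damage high) per event, in CHF
        entries = [(1.2e6, 0.2e6, 0.4e6), (5.0e6, 2.5e6, 3.0e6), (0.8e6, 0.0, 0.1e6)]

        total = sum(t for t, lo, hi in entries)
        lo_frac = sum(lo for t, lo, hi in entries) / total
        hi_frac = sum(hi for t, lo, hi in entries) / total
        print(f"bedload share of total damage: {lo_frac:.2f} to {hi_frac:.2f}")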

  16. Process Model of Quality Cost Monitoring for Small and Medium Wood-Processing Enterprises

    Directory of Open Access Journals (Sweden)

    Denis Jelačić

    2016-01-01

    Quality is not only a technical category, and a quality management system is not focused on product quality alone; quality and costs are closely interlinked. The paper deals with quality cost monitoring in small and medium wood-processing enterprises (SMEs) in Slovakia and presents the results of a questionnaire survey. The empirical study aims to determine the level of understanding and the level of implementation of quality cost monitoring in wood-processing SMEs in Slovakia. The research is based on the PAF (prevention-appraisal-failure) model. Based on the research results, a suitable model for quality cost monitoring is also proposed in the paper, with guidelines for using Activity-Based Costing methods. The empirical study focuses on SMEs, which make up 99.8 % of all companies in the branch, and where quality cost monitoring often operates as a latent management subsystem. SME managers use indicators for monitoring process performance and production quality, but they usually do not develop a separate framework for measuring and evaluating quality costs.

  17. HYBRID SULFUR PROCESS REFERENCE DESIGN AND COST ANALYSIS

    Energy Technology Data Exchange (ETDEWEB)

    Gorensek, M.; Summers, W.; Boltrunis, C.; Lahoda, E.; Allen, D.; Greyvenstein, R.

    2009-05-12

    This report documents a detailed study to determine the expected efficiency and product costs for producing hydrogen via water-splitting using energy from an advanced nuclear reactor. It was determined that the overall efficiency from nuclear heat to hydrogen is high, and the cost of hydrogen is competitive under a high energy cost scenario. It would require over 40% more nuclear energy to generate an equivalent amount of hydrogen using conventional water-cooled nuclear reactors combined with water electrolysis compared to the proposed plant design described herein. There is a great deal of interest worldwide in reducing dependence on fossil fuels, while also minimizing the impact of the energy sector on global climate change. One potential opportunity to contribute to this effort is to replace the use of fossil fuels for hydrogen production by the use of water-splitting powered by nuclear energy. Hydrogen production is required for fertilizer (e.g. ammonia) production, oil refining, synfuels production, and other important industrial applications. It is typically produced by reacting natural gas, naphtha or coal with steam, which consumes significant amounts of energy and produces carbon dioxide as a byproduct. In the future, hydrogen could also be used as a transportation fuel, replacing petroleum. New processes are being developed that would permit hydrogen to be produced from water using only heat or a combination of heat and electricity produced by advanced, high temperature nuclear reactors. The U.S. Department of Energy (DOE) is developing these processes under a program known as the Nuclear Hydrogen Initiative (NHI). The Republic of South Africa (RSA) also is interested in developing advanced high temperature nuclear reactors and related chemical processes that could produce hydrogen fuel via water-splitting. This report focuses on the analysis of a nuclear hydrogen production system that combines the Pebble Bed Modular Reactor (PBMR), under development by

  18. THE TECHNIQUES FOR CREATING IMAGES OF PRINT PATTERNS IN THE PROCESS OF STUDYING THE DEVELOPMENT OF ARTISTIC TASTES OF STUDENTS-PAINTERS BY MEANS OF TEXTILE GRAPHICS

    Directory of Open Access Journals (Sweden)

    M. K. KULIKOVA

    2015-01-01

    The character of textile images is closely linked with the specifics of the production technology, and this largely determines some of the techniques and types of graphic organization. Textile graphics is continuously enriched with new techniques, some borrowed from easel graphic art, decorative art, and design. This article describes some techniques used to create images of print patterns in the process of studying the development of the artistic taste of art students by means of textile graphics. A painterly effect may be obtained by spatial (additive) mixing of colors. This technique is reflected in the paintings of the pointillists, whose picturesque effect is explained by the use of different color strokes (mostly of pure hues) that create an impression of vibrating color. However, whereas the painters' palette was unlimited, student painters usually have a very limited number of colors. To create drawings of a pointillistic nature, it is necessary to select carefully 3-4 saturated colors (but not additive ones, to avoid an achromatic effect). Patterns may be developed by dots, strokes, and touches. The color impression may be influenced not only by the neighboring tones but also by the size and shape of the spots and the distance between them. The main feature of drawings executed in the pointillist technique is a general leading color shade, i.e., the prevalence of certain colors or combinations of colors. The background color is no less important, because it actively affects the color combinations. As can be seen, images of print patterns and textile compositions may be created by a variety of methods: graphical, pictorial, or complex interconnections of the two. With all the originality of the artistic idea, clarity and conciseness of graphic technique and careful reasoning in the selection of methods of expression will always enhance the development of artistic

  19. Graphic engine resource management

    Science.gov (United States)

    Bautin, Mikhail; Dwarakinath, Ashok; Chiueh, Tzi-cker

    2008-01-01

    Modern consumer-grade 3D graphics cards boast computation/memory resources that can easily rival or even exceed those of standard desktop PCs. Although these cards are mainly designed for 3D gaming applications, their enormous computational power has attracted developers to port an increasing number of scientific computation programs to them, including matrix computation, collision detection, cryptography, database sorting, etc. As more and more applications run on 3D graphics cards, there is a need to allocate the computation/memory resources on these cards among the sharing applications fairly and efficiently. In this paper, we describe the design, implementation, and evaluation of a Graphics Processing Unit (GPU) scheduler based on Deficit Round Robin scheduling that successfully allocates to every process an equal share of the GPU time regardless of demand. This scheduler, called GERM, estimates the execution time of each GPU command group based on dynamically collected statistics and controls each process's GPU command production rate through its CPU scheduling priority. Measurements on the first GERM prototype show that this approach can keep the maximal GPU time consumption difference among concurrent GPU processes consistently below 5% for a variety of application mixes.
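
    Deficit Round Robin itself is compact. A generic Python sketch of the policy (not GERM's implementation; command-group costs and the quantum are invented):

        from collections import deque

        def drr(queues, quantum, rounds):
            """queues: process -> deque of estimated GPU command-group times (ms).
            Each round a process's deficit grows by the quantum; it may dispatch
            groups while they fit, so heavy processes cannot exceed a fair share."""
            deficit = {p: 0.0 for p in queues}
            dispatched = []
            for _ in range(rounds):
                for p, q in queues.items():
                    if not q:
                        continue
                    deficit[p] += quantum
                    while q and q[0] <= deficit[p]:
                        cost = q.popleft()
                        deficit[p] -= cost
                        dispatched.append((p, cost))
            return dispatched

        print(drr({"a": deque([3.0, 3.0]), "b": deque([8.0])}, quantum=4.0, rounds=3))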

  20. Mathematical Creative Activity and the Graphic Calculator

    Science.gov (United States)

    Duda, Janina

    2011-01-01

    Teaching mathematics using graphic calculators has been an issue of didactic discussions for years. Finding ways in which graphic calculators can enrich the development process of creative activity in mathematically gifted students between the ages of 16-17 is the focus of this article. Research was conducted using graphic calculators with…

  1. Grid OCL : A Graphical Object Connecting Language

    Science.gov (United States)

    Taylor, I. J.; Schutz, B. F.

    In this paper, we present an overview of the Grid OCL graphical object connecting language. Grid OCL is an extension of Grid, introduced last year, that allows users to interactively build complex data processing systems by selecting a set of desired tools and connecting them together graphically. Algorithms written in this way can now also be run outside the graphical environment.
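
    Outside the GUI, such a tool graph is just a DAG of callables. A minimal, hypothetical Python sketch of executing one (names and graph structure invented; Grid OCL's actual runtime is not described in the abstract):

        def run_graph(nodes, edges, sources):
            """nodes: name -> callable; edges: (src, dst) pairs, topologically
            ordered; sources: name -> initial value fed into the graph."""
            values = dict(sources)
            inputs = {}
            for src, dst in edges:
                inputs.setdefault(dst, []).append(src)
            for _, dst in edges:
                if dst not in values and all(s in values for s in inputs[dst]):
                    values[dst] = nodes[dst](*(values[s] for s in inputs[dst]))
            return values

        out = run_graph({"double": lambda x: 2 * x, "inc": lambda x: x + 1},
                        [("in", "double"), ("double", "inc")], {"in": 3})
        print(out["inc"])  # 7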

  2. Space Spurred Computer Graphics

    Science.gov (United States)

    1983-01-01

    Dicomed Corporation was asked by NASA in the early 1970s to develop processing capabilities for recording images sent from Mars by Viking spacecraft. The company produced a film recorder which increased the intensity levels and the capability for color recording. This development led to a strong technology base resulting in sophisticated computer graphics equipment. Dicomed systems are used to record CAD (computer aided design) and CAM (computer aided manufacturing) equipment, to update maps and produce computer generated animation.

  3. Thermodynamic costs of information processing in sensory adaptation.

    Directory of Open Access Journals (Sweden)

    Pablo Sartori

    2014-12-01

    Biological sensory systems react to changes in their surroundings. They are characterized by fast response and slow adaptation to varying environmental cues. Insofar as sensory adaptive systems map environmental changes to changes of their internal degrees of freedom, they can be regarded as computational devices manipulating information. Landauer established that information is ultimately physical, and its manipulation subject to the entropic and energetic bounds of thermodynamics. Thus the fundamental costs of biological sensory adaptation can be elucidated by tracking how the information the system has about its environment is altered. These bounds are particularly relevant for small organisms, which, unlike everyday computers, operate at very low energies. In this paper, we establish a general framework for the thermodynamics of information processing in sensing. With it, we quantify how during sensory adaptation information about the past is erased, while information about the present is gathered. This process produces entropy larger than the amount of old information erased and has an energetic cost bounded by the amount of new information written to memory. We apply these principles to the E. coli chemotaxis pathway during binary ligand concentration changes. In this regime, we quantify the amount of information stored by each methyl group and show that receptors consume energy in the range of the information-theoretic minimum. Our work provides a basis for further inquiries into more complex phenomena, such as gradient sensing and frequency response.
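
    In standard notation the two bounds paraphrased above take a Landauer-like form; this is a sketch of the generic statements, not the paper's exact expressions. With I_erased the old environmental information erased from the internal state and I_written the new information recorded (in nats):

        \Delta S_{\mathrm{tot}} \;\ge\; k_B\, I_{\mathrm{erased}},
        \qquad
        W \;\ge\; k_B T\, I_{\mathrm{written}}

    The first inequality is the entropy-production statement; the second is the energetic cost of writing to memory.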

  4. An optimal policy for a single-vendor and a single-buyer integrated system with setup cost reduction and process-quality improvement

    Science.gov (United States)

    Shu, Hui; Zhou, Xideng

    2014-05-01

    The single-vendor single-buyer integrated production inventory system has long been an object of study, but little is known about the effect of investing in setup cost reduction and process-quality improvement for an integrated inventory system in which the products are sold with a free minimal-repair warranty. The purpose of this article is to minimise the integrated cost by simultaneously optimising the number of shipments, the shipment quantity, the setup cost, and the process quality. An efficient algorithmic procedure is proposed for determining the optimal decision variables. A numerical example is presented to illustrate the results of the proposed models graphically, and sensitivity analysis of the model with respect to key parameters of the system is carried out. The paper shows that the proposed integrated model can result in significant savings in the integrated cost.
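
    The structure of such models can be illustrated with a toy joint-cost minimisation. The cost function below is a generic EOQ-style stand-in with logarithmic investment terms, not the article's actual model; every symbol and value is invented:

        import math

        D, h, F, w = 1000.0, 2.0, 50.0, 30.0   # demand, holding, shipping, repair cost
        A0, th0 = 400.0, 0.04                   # initial setup cost and defect rate

        def total_cost(n, Q, A, th):
            # setup + shipping + holding + warranty repairs + investments
            return (D / (n * Q)) * (A + n * F) + h * Q / 2 + w * D * th \
                   + 100 * math.log(A0 / A) + 80 * math.log(th0 / th)

        best = min(((n, Q, A, th) for n in range(1, 8)
                    for Q in range(20, 401, 20)
                    for A in (50.0, 100.0, 200.0, 400.0)
                    for th in (0.005, 0.01, 0.02, 0.04)),
                   key=lambda v: total_cost(*v))
        print(best, round(total_cost(*best), 2))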

  5. 48 CFR 30.604 - Processing changes to disclosed or established cost accounting practices.

    Science.gov (United States)

    2010-10-01

    ... disclosed or established cost accounting practices. 30.604 Section 30.604 Federal Acquisition Regulations System FEDERAL ACQUISITION REGULATION GENERAL CONTRACTING REQUIREMENTS COST ACCOUNTING STANDARDS ADMINISTRATION CAS Administration 30.604 Processing changes to disclosed or established cost accounting...

  6. Comparative cost estimates of five coal utilization processes

    Energy Technology Data Exchange (ETDEWEB)

    1979-01-01

    Detailed capital and operating cost estimates were prepared for the generation of electric power in a new, net 500 MW(e), coal-burning facility by five alternative processes: a conventional boiler with no control of SO2 emissions, an atmospheric fluidized bed steam generator (AFB), a conventional boiler equipped with a limestone FGD system, a conventional boiler equipped with a magnesia FGD system, and coal beneficiation followed by a conventional boiler equipped with limestone FGD for part of the flue gas stream. For a coal containing 3.5% sulfur, meeting SO2 emission limits of 1.2 pounds per million Btu fired was most economical with the limestone FGD system. This result was unchanged for a coal containing 5% sulfur; however, for 2% sulfur, limestone FGD and AFB were competitive methods of controlling SO2 emissions. Brief consideration of 90% reduction of SO2 emissions led to the choice of limestone FGD as the most economical method. Byproduct credit for the sulfuric acid produced in regenerating the magnesia could make that system competitive with the limestone FGD system, depending upon local markets. The cost of sludge fixation and disposal would make limestone FGD noneconomic in many situations, if these steps are necessary.

  7. Finite difference calculation of acoustic streaming including the boundary layer phenomena in an ultrasonic air pump on graphics processing unit array

    Science.gov (United States)

    Wada, Yuji; Koyama, Daisuke; Nakamura, Kentaro

    2012-09-01

    A direct finite-difference fluid simulation of acoustic streaming on a fine-meshed three-dimensional model, computed on a graphics processing unit (GPU)-oriented calculation array, is discussed. Airflows due to the acoustic traveling wave are induced when an intense sound field is generated in the gap between a bending transducer and a reflector. Calculation results showed good agreement with measurements of the pressure distribution. In addition, several flow vortices were observed near the boundaries of the reflector and the transducer; such vortices have often been discussed for acoustic tubes near boundaries but had not previously been observed in calculations for an ultrasonic air pump of this type.
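
    The core of any such solver is a staggered-grid finite-difference update. A 1D linear-acoustics sketch in Python (not the paper's 3D streaming model; all parameters invented):

        import numpy as np

        nx, nt = 200, 500
        rho, c, dx = 1.2, 343.0, 1e-3          # air density, sound speed, grid step
        dt = 0.5 * dx / c                       # CFL-stable time step

        p = np.zeros(nx)                        # pressure nodes
        u = np.zeros(nx + 1)                    # velocity nodes, staggered
        for n in range(nt):
            p[nx // 2] += np.sin(2 * np.pi * 40e3 * n * dt)       # 40 kHz source
            u[1:-1] -= dt / (rho * dx) * (p[1:] - p[:-1])
            p -= dt * rho * c ** 2 / dx * (u[1:] - u[:-1])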

  8. Fast point-based method of a computer-generated hologram for a triangle-patch model by using a graphics processing unit.

    Science.gov (United States)

    Sugawara, Takuya; Ogihara, Yuki; Sakamoto, Yuji

    2016-01-20

    The point-based method and the fast-Fourier-transform-based method are commonly used calculation methods for computer-generated holograms. This paper proposes a novel fast calculation method for a patch model that uses the point-based method. The method provides a calculation time that is proportional to the number of patches rather than to the number of point light sources, which makes it suitable for quickly calculating a wide area covered by patches. Experiments using a graphics processing unit indicated that the proposed method is about 8 or more times faster than the ordinary point-based method.
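
    At its core, the point-based method sums spherical waves from point sources over the hologram plane. A NumPy sketch of that kernel (geometry, wavelength, and sampling invented; the paper's patch-based acceleration is not reproduced):

        import numpy as np

        lam = 633e-9
        k = 2 * np.pi / lam
        xs = np.linspace(-1e-3, 1e-3, 512)
        X, Y = np.meshgrid(xs, xs)

        points = [(0.0, 0.0, 0.1, 1.0), (2e-4, -1e-4, 0.12, 0.8)]   # (x, y, z, amp)

        field = np.zeros_like(X, dtype=complex)
        for x0, y0, z0, a in points:
            r = np.sqrt((X - x0) ** 2 + (Y - y0) ** 2 + z0 ** 2)
            field += a / r * np.exp(1j * k * r)                     # spherical wave

        ref = np.exp(1j * k * np.sin(np.radians(1.0)) * X)          # tilted reference
        hologram = np.abs(field + ref) ** 2                         # recorded intensity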

  9. Sustainable cost reduction by lean management in metallurgical processes

    Directory of Open Access Journals (Sweden)

    A. V. Todorut

    2016-10-01

    This paper focuses on the need for sustainable cost reduction in the metallurgical industry by applying Lean Management (LM) tools and concepts in metallurgical production processes, leading to increased competitiveness of corporations in a global market. The paper highlights that Lean Management is a novel way of thinking, adapting to change, reducing waste and continuous improvement, leading to sustainable development of companies in the metallurgical industry. The authors outline the main Lean Management instruments based on recent scientific research and include a comparative analysis of other tools, such as Sort, Straighten, Shine, Standardize, Sustain (5S), Visual Management (VM), Kaizen, Total Productive Maintenance (TPM), and Single-Minute Exchange of Dies (SMED), leading to a critical appraisal of their application in the metallurgical industry.

  10. Unsupervised Neural Network Quantifies the Cost of Visual Information Processing.

    Science.gov (United States)

    Orbán, Levente L; Chartier, Sylvain

    2015-01-01

    Untrained, "flower-naïve" bumblebees display behavioural preferences when presented with visual properties such as colour, symmetry, spatial frequency and others. Two unsupervised neural networks were implemented to understand the extent to which these models capture elements of bumblebees' unlearned visual preferences towards flower-like visual properties. The computational models, which are variants of Independent Component Analysis and Feature-Extracting Bidirectional Associative Memory, use images of test-patterns that are identical to ones used in behavioural studies. Each model works by decomposing images of floral patterns into meaningful underlying factors. We reconstruct the original floral image using the components and compare the quality of the reconstructed image to the original image. Independent Component Analysis matches behavioural results substantially better across several visual properties. These results are interpreted to support a hypothesis that the temporal and energetic costs of information processing by pollinators served as a selective pressure on floral displays: flowers adapted to pollinators' cognitive constraints.
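
    The decompose-and-reconstruct comparison can be sketched with scikit-learn's FastICA; the random stand-in data below replaces the behavioural test patterns, which are not reproduced here:

        import numpy as np
        from sklearn.decomposition import FastICA

        rng = np.random.default_rng(0)
        X = rng.random((100, 64 * 64))            # 100 flattened 64x64 patterns

        ica = FastICA(n_components=16, random_state=0, max_iter=500)
        S = ica.fit_transform(X)                  # component activations per image
        X_hat = ica.inverse_transform(S)          # reconstruction from components

        mse = ((X - X_hat) ** 2).mean(axis=1)     # reconstruction quality per image
        print(mse.mean())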

  11. Quantum Chemistry on Graphical Processing Units. 3. Analytical Energy Gradients, Geometry Optimization, and First Principles Molecular Dynamics.

    Science.gov (United States)

    Ufimtsev, Ivan S; Martinez, Todd J

    2009-10-13

    We demonstrate that a video gaming machine containing two consumer graphics cards can outpace a state-of-the-art quad-core processor workstation by a factor of more than 180× in Hartree-Fock energy + gradient calculations. Such performance makes it possible to run large-scale Hartree-Fock and Density Functional Theory calculations, which typically require hundreds of traditional processor cores, on a single workstation. Benchmark Born-Oppenheimer molecular dynamics simulations are performed on two molecular systems using the 3-21G basis set - a hydronium ion solvated by 30 waters (94 atoms, 405 basis functions) and an aspartic acid molecule solvated by 147 waters (457 atoms, 2014 basis functions). Our GPU implementation can perform 27 ps/day and 0.7 ps/day of ab initio molecular dynamics simulation on a single desktop computer for these systems.

  12. Computer Graphics in ChE Education.

    Science.gov (United States)

    Reklaitis, G. V.; And Others

    1983-01-01

    Examines current uses and future possibilities of computer graphics in chemical engineering, discussing equipment needs, maintenance/manpower costs, and plan to implement computer graphics into existing programs. The plan involves matching fund equipment grants, grants for development of computer assisted instructional (CAI) software, chemical…

  14. Benchmarking energy use and costs in salt-and-dry fish processing and lobster processing

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2005-07-01

    The Canadian fish processing sector was the focus of this benchmarking analysis, which was conducted jointly by the Canadian Industry Program for Energy Conservation and the Fisheries Council of Canada, who retained Corporate Renaissance Group (CRG) to establish benchmarks for salt-and-dry processing operations in Nova Scotia and lobster processing operations in Prince Edward Island. The analysis was limited to the ongoing operations of the processing plants, and started with the landing of the fish/lobster and ended with freezer/cooler storage of the final products. Fuel used by the fishing fleet and in delivery trucks was not included in this study. The initial phase of each study involved interviews with management personnel at a number of plants in order to lay out process flow diagrams which were used to identify the series of stages of production for which energy consumption could be separately analyzed. Detailed information on annual plant production and total plant energy consumption and costs for the year by fuel type were collected, as well as inventories of energy-consuming machinery and equipment. At the completion of the data collection process, CRG prepared a summary of energy use, production data, assumptions and a preliminary analysis of each plant's energy use profile. Energy consumption and costs per short ton were calculated for each stage of production. Information derived from the calculations includes revised estimates of energy consumption by stage of production; energy costs per ton of fish; total energy consumption and costs associated with production of a standard product; and a detailed inter-plant comparison of energy consumption and costs per ton among the participating plants. Details of greenhouse gas (GHG) emissions and potential energy savings were also presented. 7 tabs., 3 figs.

  15. Advanced Drying Process for Lower Manufacturing Cost of Electrodes

    Energy Technology Data Exchange (ETDEWEB)

    Ahmad, Iftikhar [Lambda Technologies, Inc., Morrisville, NC (United States); Zhang, Pu [Lambda Technologies, Inc., Morrisville, NC (United States)

    2016-11-30

    For this Vehicle Technologies Incubator/Energy Storage R&D topic, Lambda Technologies teamed with Navitas Systems and proposed a new advanced drying process promising a 5X reduction in electrode drying time and a significant reduction in the cost of large-format lithium batteries used in PEVs. The operating principle of the proposed process is to use a penetrating radiant energy source, Variable Frequency Microwaves (VFM), which is selectively absorbed by the polar water or solvent molecules throughout the entire volume of the electrode. The solvent molecules are thus driven out of the electrode thickness, making the process more efficient and much faster than convective drying. To evaluate the Advanced Drying Process (ADP), a hybrid prototype system utilizing VFM and hot air flow was designed and fabricated. While VFM drives the solvent out of the electrode thickness, the hot air flow exhausts the solvent vapors from the chamber. The drying results from this prototype were very encouraging: for water-based anodes there is a 5X drying advantage (time and length of oven) for ADP over a standard drying system, and for NMP-based cathodes the reduction in drying time gives a 3X benefit. Power-consumption measurements were performed on the ADP prototype and compared with a standard convection drying oven; the data demonstrated over 40% savings in power consumption with ADP compared to convection drying systems. The energy savings are one of the operational cost benefits possible with ADP. To further speed up drying, the ADP prototype was explored as a booster module placed before the convection oven; for the electrode material evaluated, it was possible to increase the drying speed by a factor of 4, which could not be accomplished with the standard dryer without surface defects and cracks. The instantaneous penetration of microwaves into the entire slurry thickness showed a major advantage in rapid drying of

  16. 'Cost in transliteration': the neurocognitive processing of Romanized writing.

    Science.gov (United States)

    Rao, Chaitra; Mathur, Avantika; Singh, Nandini C

    2013-03-01

    Romanized transliteration is widely used in internet communication and global commerce, yet we know little about its behavioural and neural processing. Here, we show that Romanized text imposes a significant neurocognitive load. Readers faced greater difficulty in identifying concrete words written in Romanized transliteration (Romanagari) compared to L1 and L2. Functional neuroimaging revealed that the neural cost of processing transliterations arose from significantly greater recruitment of language (left precentral gyrus, left inferior parietal lobule) and attention networks (left mid-cingulum). Additionally, transliterated text uniquely activated attention and control areas compared to both L1 (cerebellar vermis) and L2 (pre-supplementary motor area/pre-SMA). We attribute the neural effort of reading Romanized transliteration to (i) effortful phonological retrieval from unfamiliar orthographic forms and (ii) conflicting attentional demands imposed by mapping orthographic forms of one language to phonological-semantic representations in another. Finally, significant brain-behaviour correlation suggests that the left mid-cingulum modulates cognitive-linguistic conflict.

  17. Nisin Production Utilizing Skimmed Milk Aiming to Reduce Process Cost

    Science.gov (United States)

    Jozala, Angela Faustino; de Andrade, Maura Sayuri; de Arauz, Luciana Juncioni; Pessoa, Adalberto; Penna, Thereza Christina Vessoni

    Nisin is a natural additive for the preservation of food, pharmaceutical, and dental products and can be used as a therapeutic agent. It inhibits the outgrowth of spores and the growth of a variety of Gram-positive and Gram-negative bacteria. This study was performed to optimize large-scale nisin production in skimmed milk and its subproducts, aiming at a low-cost process and stimulating its utilization. Lactococcus lactis American Type Culture Collection (ATCC) 11454 was grown in a rotary shaker (30°C/36 h/100 rpm) in diluted skimmed milk, and nisin activity, growth parameters, and media components were studied. Nisin activity in growth media was expressed in arbitrary units (AU/mL) and converted to standard nisin concentration (Nisaplin®; 25 mg of pure nisin is 1.0×10^6 AU/mL). Nisin activity in skimmed milk with 2.27 g total solids was up to threefold higher than in transfers in skimmed milk with 4.54 g total solids, and up to 85-fold higher than in transfers in skimmed milk with 1.14 g total solids. L. lactis was assayed in a New Brunswick fermentor with 1.5 L of diluted skimmed milk (2.27 g total solids) and an airflow of 1.5 mL/min (30°C/36 h/200 rpm), without pH control. In this condition nisin activity was observed after 4 h (45.07 AU/mL) and at the end of the 36 h process (3312.07 AU/mL). This work shows the utilization of a low-cost growth medium (diluted skimmed milk) for nisin production with wide applications. Furthermore, milk subproducts (milk whey) can be exploited for nisin production: in Brazil, 50% of milk whey is discharged untreated into rivers and, because of its high organic matter concentration, is considered an important pollutant. In this particular case an optimized production of an antimicrobial would be aligned with industrial waste recycling.

  18. Electrochromic Windows: Process and Fabrication Improvements for Lower Total Costs

    Energy Technology Data Exchange (ETDEWEB)

    Mark Burdis; Neil Sbar

    2007-03-31

    The overall goal with respect to the U.S. Department of Energy (DOE) is to achieve significant national energy savings through maximized penetration of EC windows into existing markets so that the largest cumulative energy reduction can be realized. The speed with which EC windows can be introduced and replace current IGUs (and current glazings) is clearly a strong function of cost. Therefore, the aim of this project was to investigate possible improvements to the SageGlass® EC glazing products to facilitate both process and fabrication improvements resulting in lower overall costs. The project was split into four major areas dealing with improvements to the electrochromic layer, the capping layer, defect elimination, and general product improvements. Significant advancements have been made in each of the four areas. These can be summarized as follows: (1) Plasma-assisted deposition for the electrochromic layer was pursued, and several improvements were made to the technology for producing a plasma beam. Functional EC devices were produced using the new technology, but there are still questions to be answered regarding the intrinsic properties of the electrochromic films produced by this method. (2) The capping-layer work was successfully implemented into the existing SageGlass® product, thereby providing a higher level of transparency and somewhat lower reflectivity than the 'standard' product. (3) Defect elimination is an ongoing effort, but this project spurred some major defect reduction programs, which led to significant improvements in yield, with all the implicit benefits afforded. In particular, major advances were made in the development of a new bus bar application process aimed at reducing the number of 'shorts' developed in the finished product, as well as making dramatic improvements in the methods used for tempering the glass, which had previously been seen to produce a defect which appeared as a

  19. Spins Dynamics in a Dissipative Environment: Hierarchal Equations of Motion Approach Using a Graphics Processing Unit (GPU).

    Science.gov (United States)

    Tsuchimoto, Masashi; Tanimura, Yoshitaka

    2015-08-11

    A system with many energy states coupled to a harmonic oscillator bath is considered. To study quantum non-Markovian system-bath dynamics numerically rigorously and nonperturbatively, we developed a computer code for the reduced hierarchy equations of motion (HEOM) for a graphics processing unit (GPU) that can treat systems as large as 4096 energy states. The code employs a Padé spectrum decomposition (PSD) for the construction of the HEOM and exponential integrators. Dynamics of a quantum spin glass system are studied by calculating the free induction decay signal for the cases of 3 × 2 to 3 × 4 triangular lattices with antiferromagnetic interactions. We found that spins relax faster at lower temperature due to transitions through a quantum coherent state, as represented by the off-diagonal elements of the reduced density matrix, whereas in the classical case spins are known to relax more slowly owing to the suppression of thermal activation. The decay of the spins is qualitatively similar regardless of lattice size. The pathway of spin relaxation is analyzed under a sudden temperature drop condition. The Compute Unified Device Architecture (CUDA) based source code used in the present calculations is provided as Supporting Information.

  20. Application of Cost Allocation Concepts of Game Theory Approach for Cost Sharing Process

    Directory of Open Access Journals (Sweden)

    Mojtaba Valinejad Shoubi

    2013-04-01

    Dissatisfaction among involved parties regarding the way costs are allocated is common in joint ventures, since each party attempts to capture more of the gains created by forming the coalition. Various cost-allocation methods, such as proportional methods and methods from cooperative game theory, have been used for cost sharing in joint projects. In this study the Nucleolus, the Shapley value, and SCRB (separable costs-remaining benefits), as cost-sharing concepts in the game theory approach, are used to investigate their effectiveness in fairly allocating the joint costs of constructing a joint water supply system. The results derived from these methods are then compared with those of the traditional methods proportional to population and to demand. The results indicate that the proportional methods may not lead to a fair cost allocation, while the Nucleolus, SCRB, and Shapley value methods can establish adequate incentives for cooperation.
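
    The Shapley value referred to above has a direct combinatorial definition: average each party's marginal cost over all joining orders. A minimal Python sketch for a three-party cost game (coalition costs invented):

        from itertools import permutations

        cost = {(): 0, ("A",): 60, ("B",): 50, ("C",): 40,
                ("A", "B"): 90, ("A", "C"): 80, ("B", "C"): 70,
                ("A", "B", "C"): 110}

        def c(coalition):
            return cost[tuple(sorted(coalition))]

        players = ["A", "B", "C"]
        shapley = {p: 0.0 for p in players}
        for order in permutations(players):
            seen = []
            for p in order:
                shapley[p] += c(seen + [p]) - c(seen)   # marginal cost of joining
                seen.append(p)
        shapley = {p: v / 6 for p, v in shapley.items()}  # 3! orderings
        print(shapley)            # shares sum to the grand-coalition cost, 110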

  1. Process for Low Cost Domestic Production of LIB Cathode Materials

    Energy Technology Data Exchange (ETDEWEB)

    Thurston, Anthony

    2012-10-31

    The objective of the research was to determine the best low-cost method for the large-scale production of Nickel-Cobalt-Manganese (NCM) layered cathode materials. The research and development focused on scaling up the technology licensed from Argonne National Laboratory in BASF's battery material pilot plant in Beachwood, Ohio. Since BASF did not have experience with large-scale production of NCM cathode materials, a significant amount of development was needed to support BASF's existing research program. During the three-year period BASF was able to develop and validate production processes for the NCM 111, 523, and 424 materials, as well as begin development of the high-energy NCM. BASF also used this period to provide free cathode material samples to numerous manufacturers, OEMs, and research companies in order to validate the materials. The success of the project can be demonstrated by the construction of the production plant in Elyria, Ohio and the successful operation of that facility. The benefit of the project to the public will become apparent as soon as material from the production plant is used in electric vehicles.

  2. Unsupervised Neural Network Quantifies the Cost of Visual Information Processing.

    Directory of Open Access Journals (Sweden)

    Levente L Orbán

    Untrained, "flower-naïve" bumblebees display behavioural preferences when presented with visual properties such as colour, symmetry, spatial frequency and others. Two unsupervised neural networks were implemented to understand the extent to which these models capture elements of bumblebees' unlearned visual preferences towards flower-like visual properties. The computational models, which are variants of Independent Component Analysis and Feature-Extracting Bidirectional Associative Memory, use images of test-patterns that are identical to ones used in behavioural studies. Each model works by decomposing images of floral patterns into meaningful underlying factors. We reconstruct the original floral image using the components and compare the quality of the reconstructed image to the original image. Independent Component Analysis matches behavioural results substantially better across several visual properties. These results are interpreted to support a hypothesis that the temporal and energetic costs of information processing by pollinators served as a selective pressure on floral displays: flowers adapted to pollinators' cognitive constraints.

  3. Taking a Longer View: Processes of Curriculum Development in the Department of Graphic Design at the University of Johannesburg

    Directory of Open Access Journals (Sweden)

    Jennifer Clarence-Fincham

    2013-11-01

    In the face of the complex array of competing pressures currently faced by higher education globally, nationally, and institutionally (Maistry 2010; Clegg 2005), academic staff who are required to reconceptualise their curricula are often tempted to focus on the immediate demands of the classroom and to pay scant attention to the broader knowledge and curriculum-related issues which inform pedagogical practice. In this paper we argue that opportunities should be created for staff to step back from pedagogical concerns and to consider knowledge domains and the curriculum in all its dimensions from a distance and in a more nuanced, theoretically informed way (Clarence-Fincham and Naidoo 2014; Luckett 2012; Quinn 2012). The paper aims to show how a model for curriculum development which mirrors the three tiers of Bernstein's pedagogical device was used in a Department of Graphic Design as a means of facilitating a deeper, more explicit understanding of the nature of the discipline and the values underpinning it, the kind of curriculum emerging from it, and the student identities associated with it (Bernstein 1999, 2000; Clarence-Fincham and Naidoo 2014; Maton 2007). It begins by identifying some of the central challenges currently facing the South African higher education sector, then sketches the institutional context and highlights the key concepts underpinning the university's 'learning-to-be' philosophy. Within this framework, using staff responses during early curriculum development workshops as well as ideas expressed during a later group discussion, it identifies a range of staff positions on several aspects of the curriculum, revealing areas of agreement as well as contestation and providing a solid platform for further interrogation and development.

  4. Graphics Processing Unit (GPU) implementation of image processing algorithms to improve system performance of the Control, Acquisition, Processing, and Image Display System (CAPIDS) of the Micro-Angiographic Fluoroscope (MAF).

    Science.gov (United States)

    Vasan, S N Swetadri; Ionita, Ciprian N; Titus, A H; Cartwright, A N; Bednarek, D R; Rudin, S

    2012-02-23

    We present the image processing upgrades implemented on a Graphics Processing Unit (GPU) in the Control, Acquisition, Processing, and Image Display System (CAPIDS) for the custom Micro-Angiographic Fluoroscope (MAF) detector. Most of the image processing currently implemented in the CAPIDS system is pixel independent; that is, the operation on each pixel is the same and the operation on one does not depend upon the result from the operation on the other, allowing the entire image to be processed in parallel. GPU hardware was developed for this kind of massive parallel processing implementation. Thus for an algorithm which has a high amount of parallelism, a GPU implementation is much faster than a CPU implementation. The image processing algorithm upgrades implemented on the CAPIDS system include flat field correction, temporal filtering, image subtraction, roadmap mask generation and display window and leveling. A comparison between the previous and the upgraded version of CAPIDS has been presented, to demonstrate how the improvement is achieved. By performing the image processing on a GPU, significant improvements (with respect to timing or frame rate) have been achieved, including stable operation of the system at 30 fps during a fluoroscopy run, a DSA run, a roadmap procedure and automatic image windowing and leveling during each frame.
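
    Pixel-independent operations of the kind listed map directly onto data-parallel array code, which is why the GPU port pays off. A NumPy sketch of two of the steps (parameter values and names are illustrative, not CAPIDS code):

        import numpy as np

        def flat_field(frame, dark, gain):
            # identical arithmetic at every pixel -> trivially parallel on a GPU
            return (frame - dark) * gain

        def temporal_filter(state, frame, alpha=0.25):
            # first-order recursive lag filter, also purely element-wise
            return alpha * frame + (1 - alpha) * state

        rng = np.random.default_rng(0)
        dark = np.full((1024, 1024), 10.0)
        gain = np.full((1024, 1024), 1.05)
        state = np.zeros((1024, 1024))
        for _ in range(30):                        # one second at 30 fps
            raw = rng.random((1024, 1024)) * 100.0
            state = temporal_filter(state, flat_field(raw, dark, gain))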

  6. The practice of quality-associated costing: application to transfusion manufacturing processes.

    Science.gov (United States)

    Trenchard, P M; Dixon, R

    1997-01-01

    This article applies the new method of quality-associated costing (QAC) to the mixture of processes that create red cell and plasma products from whole blood donations. The article compares QAC with two commonly encountered but arbitrary models and illustrates the invalidity of clinical cost-benefit analysis based on these models. The first, an "isolated" cost model, seeks to allocate each whole process cost to only one product class. The other is a "shared" cost model, which seeks to allocate an approximately equal share of all process costs to all associated products.

  7. 78 FR 20393 - Cost Recovery for Permit Processing, Administration, and Enforcement

    Science.gov (United States)

    2013-04-04

    ... CFR Parts 701, 736, 737 et al. Cost Recovery for Permit Processing, Administration, and Enforcement... Parts 701, 736, 737, 738, and 750 RIN 1029-AC65 Cost Recovery for Permit Processing, Administration, and... inspection), and the differing costs for the administration of the Federal and Indian Land Programs among...

  8. 78 FR 18429 - Cost Recovery for Permit Processing, Administration, and Enforcement

    Science.gov (United States)

    2013-03-26

    ... 30 CFR Parts 701, 736, 737 et al. Cost Recovery for Permit Processing, Administration, and... 701, 736, 737, 738, and 750 RIN 1029-AC65 Cost Recovery for Permit Processing, Administration, and... fees to recover the actual costs for permit review and administration and permit enforcement...

  9. Graphic Turbulence Guidance

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Forecast turbulence hazards identified by the Graphical Turbulence Guidance algorithm. The Graphical Turbulence Guidance product depicts mid-level and upper-level...

  10. Repellency Awareness Graphic

    Science.gov (United States)

    Companies can apply to use the voluntary new graphic on product labels of skin-applied insect repellents. This graphic is intended to help consumers easily identify the protection time for mosquitoes and ticks and select appropriately.

  11. Graphical Turbulence Guidance - Composite

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Forecast turbulence hazards identified by the Graphical Turbulence Guidance algorithm. The Graphical Turbulence Guidance product depicts mid-level and upper-level...

  12. Process Simulation of enzymatic biodiesel production -at what cost can biodiesel be made with enzymes?

    DEFF Research Database (Denmark)

    Fjerbæk Søtoft, Lene; Christensen, Knud Villy; Rong, Benguang

    The costs as well as environmental impacts of the alternative process must be evaluated against the conventional process. With process simulation tools, an evaluation will be carried out of what it will cost to produce biodiesel with enzymes. Different scenarios will be taken into account, with variations in raw material prices, process designs, and enzyme cost and performance.

  14. Parallel Implementation of 2D FFT on a Graphics Processing Unit

    Institute of Scientific and Technical Information of China (English)

    陈瑞; 童莹

    2009-01-01

    The FFT is a highly parallel divide-and-conquer algorithm and is therefore well suited to implementation on the CUDA (Compute Unified Device Architecture) framework of a GPU (Graphics Processing Unit). This paper describes the principles and methods of using GPUs for general-purpose computation and reports experiments with FFT-based 2D convolution on a GeForce 8800 GT platform. The experimental results show that as image size increases, the computational load and run time grow substantially on both the CPU and the GPU, while the speedup of the GPU over the CPU also grows, averaging about 20×.
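
    FFT-based 2D convolution of the kind benchmarked here is only a few lines on the CPU; a GPU version swaps in a GPU FFT library. A NumPy sketch (sizes illustrative; note this computes circular convolution):

        import numpy as np

        def fft_convolve2d(image, kernel):
            H = np.fft.rfft2(image)
            K = np.fft.rfft2(kernel, s=image.shape)   # zero-pad kernel to image size
            return np.fft.irfft2(H * K, s=image.shape)

        img = np.random.default_rng(0).random((1024, 1024))
        ker = np.ones((5, 5)) / 25.0                  # box blur
        out = fft_convolve2d(img, ker)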

  15. Software & Hardware Architecture of General-Purpose Graphics Processing Unit

    Institute of Scientific and Technical Information of China (English)

    谢建春

    2013-01-01

    Modern GPUs are not only powerful graphics processing engines but also highly parallel programmable devices with high computational performance and memory bandwidth, capable of forming a complete heterogeneous processing system together with a CPU. Using GPUs for computation beyond graphics processing is generally called General-Purpose computing on Graphics Processing Units (GPGPU). This paper studies in detail the concept and classification of GPGPU, its hardware architecture and working mechanisms, and its software environments and processing models, with the aim of providing a reference for further application of GPGPU in airborne embedded computing.

  16. Fast Shepard interpolation on graphics processing units: potential energy surfaces and dynamics for H + CH4 → H2 + CH3.

    Science.gov (United States)

    Welsch, Ralph; Manthe, Uwe

    2013-04-28

    A strategy for the fast evaluation of Shepard-interpolated potential energy surfaces (PESs) utilizing graphics processing units (GPUs) is presented. Speedups of several orders of magnitude are gained for the title reaction on the ZFWCZ PES [Y. Zhou, B. Fu, C. Wang, M. A. Collins, and D. H. Zhang, J. Chem. Phys. 134, 064323 (2011)]. Thermal rate constants are calculated employing the quantum transition state concept and the multi-layer multi-configurational time-dependent Hartree approach. Results for the ZFWCZ PES are compared to rate constants obtained for other ab initio PESs, and problems are discussed. A revised PES is presented. Thermal rate constants obtained for the revised PES indicate that an accurate description of the anharmonicity around the transition state is crucial.
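
    Shepard interpolation is inverse-distance weighting over stored data points, which is why it maps so well onto GPUs. A minimal NumPy version of the basic global form (the ZFWCZ surface uses a more elaborate local variant):

        import numpy as np

        def shepard(query, pts, vals, p=2.0):
            """query: (m, d) evaluation points; pts: (n, d) data sites; vals: (n,)."""
            d2 = ((query[:, None, :] - pts[None, :, :]) ** 2).sum(-1)   # (m, n)
            w = 1.0 / np.maximum(d2, 1e-30) ** (p / 2)
            out = (w * vals).sum(axis=1) / w.sum(axis=1)
            exact = d2.min(axis=1) < 1e-30           # query coincides with a site
            return np.where(exact, vals[d2.argmin(axis=1)], out)

        pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
        vals = np.array([1.0, 2.0, 3.0])
        print(shepard(np.array([[0.5, 0.5]]), pts, vals))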

  17. Relativistic Hydrodynamics on Graphic Cards

    CERN Document Server

    Gerhard, Jochen; Bleicher, Marcus

    2012-01-01

    We show how to accelerate relativistic hydrodynamics simulations using graphics cards (graphics processing units, GPUs). These improvements are of high relevance, e.g., to the field of high-energy nucleus-nucleus collisions at RHIC and LHC, where (ideal and dissipative) relativistic hydrodynamics is used to calculate the evolution of hot and dense QCD matter. The results reported here are based on the Sharp And Smooth Transport Algorithm (SHASTA), which is employed in many hydrodynamical models and hybrid simulation packages, e.g. the Ultrarelativistic Quantum Molecular Dynamics model (UrQMD). We have redesigned SHASTA using the OpenCL computing framework to work on accelerators like graphics processing units (GPUs) as well as on multi-core processors. With the redesign of the algorithm, the hydrodynamic calculations have been accelerated by a factor of 160, allowing for event-by-event calculations and better statistics in hybrid calculations.

  18. A Survey on Graphical Programming Systems

    OpenAIRE

    Gurudatt Kulkarni; Sathyaraj. R

    2014-01-01

    Recently there has been an increasing interest in the use of graphics to help the programming and understanding of computer systems. Graphical programming and program simulation are exciting areas of active computer science research that show promise for improving the programming process. An array of different design methodologies has arisen from research efforts, and many graphical programming systems have been developed to address both general programming tasks and speci...

  19. Development of Monte Carlo software for photon transport in voxel structures using graphics processing units

    Energy Technology Data Exchange (ETDEWEB)

    Bellezzo, Murillo

    2014-09-01

    As the most accurate method for estimating absorbed dose in radiotherapy, the Monte Carlo method (MCM) has been widely used in radiotherapy treatment planning; nevertheless, its efficiency can be improved for routine clinical applications. In this thesis, the CUBMC code is presented, a GPU-based MC photon transport algorithm for dose calculation under the Compute Unified Device Architecture (CUDA) platform. The simulation of physical events is based on the algorithm used in PENELOPE, and the cross-section table used is the one generated by the MATERIAL routine, also present in the PENELOPE code. Photons are transported in voxel-based geometries with different compositions. Two distinct approaches are used for transport simulation: the first forces the photon to stop at every voxel frontier, while the second is the Woodcock method, in which the photon ignores the existence of borders and travels in a homogeneous fictitious medium. The CUBMC code aims to be an alternative Monte Carlo simulation code that, by exploiting the parallel processing capability of graphics processing units (GPUs), provides high-performance simulations on low-cost compact machines, and thus can be applied to clinical cases and incorporated into treatment planning systems for radiotherapy. (author)
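
    The Woodcock method mentioned above samples free paths against the maximum attenuation coefficient in the geometry and accepts a "real" interaction with probability mu(x)/mu_max, so photons never stop at voxel borders. A 1D toy sketch in Python (coefficients and geometry invented):

        import numpy as np

        rng = np.random.default_rng(1)
        mu = np.array([0.1, 0.5, 0.2, 0.8])     # attenuation per voxel (1/cm)
        voxel_cm, mu_max = 1.0, mu.max()

        def woodcock_free_path(x0):
            """Distance from x0 to the next real interaction, ignoring borders."""
            x = x0
            while True:
                x += -np.log(rng.random()) / mu_max          # tentative step
                if x >= len(mu) * voxel_cm:
                    return np.inf                            # photon escaped
                if rng.random() < mu[int(x // voxel_cm)] / mu_max:
                    return x - x0                            # real collision

        print([round(woodcock_free_path(0.0), 3) for _ in range(5)])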

  20. Graphical Models with R

    DEFF Research Database (Denmark)

    Højsgaard, Søren; Edwards, David; Lauritzen, Steffen

    Graphical models in their modern form have been around since the late 1970s and appear today in many areas of the sciences. Along with the ongoing developments of graphical models, a number of different graphical modeling software programs have been written over the years. In recent years many of these software developments have taken place within the R community, either in the form of new packages or by providing an R interface to existing software. This book attempts to give the reader a gentle introduction to graphical modeling using R and the main features of some of these packages. In addition, the book provides examples of how more advanced aspects of graphical modeling can be represented and handled within R. Topics covered in the seven chapters include graphical models for contingency tables, Gaussian and mixed graphical models, Bayesian networks and modeling high dimensional data

  1. The Process Approach to Defining and Classifying the Marketing Costs of Trade Enterprise

    Directory of Open Access Journals (Sweden)

    Stoliarchuk Hanna V.

    2017-06-01

    The article proposes a definition of marketing costs based on a process approach, taking the innovation component into account. A detailed analysis of existing approaches to defining the term 'marketing costs' has been carried out, showing that there is no single scientific-methodical approach to the definition among scholars. The feasibility of using a process approach to define marketing costs is substantiated. For the identified business processes of marketing activity, the components of their costs are specified, taking the costs of innovation into account. A prospect for further research is the development of a system of indicators for evaluating the cost efficiency of the business processes of marketing activity.

  2. 10 CFR 950.23 - Claims process for payment of covered costs.

    Science.gov (United States)

    2010-01-01

    ... 10 Energy 4 2010-01-01 2010-01-01 false Claims process for payment of covered costs. 950.23 Section 950.23 Energy DEPARTMENT OF ENERGY STANDBY SUPPORT FOR CERTAIN NUCLEAR PLANT DELAYS Claims Administration Process § 950.23 Claims process for payment of covered costs. (a) General. No more than 120...

  3. Novel interconnection processes for low cost PEN/PET substrates

    NARCIS (Netherlands)

    Brand, J. van den; Kusters, R.; Fledderus, H.; Rubingh, J.E.J.M.; Podprocky, T.; Dietzel, A.H.

    2009-01-01

    Recently a new class of flexible electronics is starting to emerge which is most effectively termed 'printed electronics'. This term often refers to all-printed, cost effective, smart electronic products that will find a wide range of applications in large quantities in our society. The substrate ma

  4. Novel interconnection processes for low cost PEN/PET substrates

    NARCIS (Netherlands)

    Brand, J. van den; Kusters, R.; Fledderus, H.; Rubingh, J.E.J.M.; Podprocky, T.; Dietzel, A.H.

    2009-01-01

    Recently a new class of flexible electronics is starting to emerge which is most effectively termed 'printed electronics'. This term often refers to all-printed, cost effective, smart electronic products that will find a wide range of applications in large quantities in our society. The substrate

  5. THE AUTOMATED TESTING SYSTEM OF PROGRAMS WITH THE GRAPHIC USER INTERFACE WITHIN THE CONTEXT OF EDUCATIONAL PROCESS

    OpenAIRE

    2009-01-01

    The paper describes the problems of automating the educational process in the course "Programming in a high-level language. Algorithmic languages". The complexities of testing programs with a user interface are noted. Existing analogues are considered, and methods for automating the testing of students' work are offered.

  6. Guidebook to R graphics using Microsoft Windows

    CERN Document Server

    Takezawa, Kunio

    2012-01-01

    Introduces the graphical capabilities of R to readers new to the software. Due to its flexibility and availability, R has become the computing software of choice for statistical computing and generating graphics across various fields of research. Guidebook to R Graphics Using Microsoft® Windows offers a unique presentation of R, guiding new users through its many benefits, including the creation of high-quality graphics. Beginning with getting the program up and running, this book takes readers step by step through the process of creating histograms, boxplots, strip charts, time series gra

  7. Activity-based costing as an information basis for an efficient strategic management process

    Directory of Open Access Journals (Sweden)

    Kaličanin Đorđe

    2013-01-01

    Full Text Available Activity-based costing (ABC) provides an information basis for monitoring and controlling one of two possible sources of competitive advantage: low-cost production and low-cost distribution. On the basis of cost information about particular processes and activities, management may determine their contribution to the success of a company, and may decide to transfer certain processes and activities to another company. The accuracy of cost information is conditioned by finding an adequate relation between overhead costs and cost objects, identifying and tracing cost drivers and output measures of activities, and monitoring the cost behaviour of different levels of a product. Basic characteristics of the ABC approach, such as more accurate cost accounting of objects, focusing on process and activity output (rather than only on resource consumption), and focusing on understanding and interpretation of cost structure (rather than on cost measurement), enable managers to estimate and control future costs more reliably. Thus the ABC methodology provides a foundation for cost tracing, analysis, and management, which entails making high-quality and accurate operative and strategic decisions as a basis for the long-term orientation of a company. ABC is also complementary to the widely accepted technique of strategic planning and strategy implementation known as the Balanced Scorecard (BSC).
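
    To make the driver-based allocation concrete, here is a small, self-contained Python illustration; the cost pools, driver volumes, and product names are invented for the example, not taken from the article.

    ```python
    # Hypothetical activity-based allocation of overhead to two products.
    activities = {
        # activity: (total overhead cost, total driver volume)
        "machine setups": (40_000.0, 200),   # cost driver: number of setups
        "quality checks": (15_000.0, 500),   # cost driver: inspections
    }
    usage = {
        "product A": {"machine setups": 120, "quality checks": 150},
        "product B": {"machine setups": 80,  "quality checks": 350},
    }

    for product, drivers in usage.items():
        # rate per driver unit = pool / volume; cost = rate x product's usage
        cost = sum(pool / volume * drivers[act]
                   for act, (pool, volume) in activities.items())
        print(f"{product}: overhead assigned = {cost:,.2f}")
    ```

    With these numbers the two products absorb 28,500 and 26,500 respectively, which sums back to the 55,000 in the pools, the basic sanity check of any ABC allocation.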

  8. STORYBOARD DALAM PEMBUATAN MOTION GRAPHIC

    Directory of Open Access Journals (Sweden)

    Satrya Mahardhika

    2013-09-01

    Full Text Available Motion graphics is one category in the animation that makes animation with lots of design elements in each component. Motion graphics needs a long process, including preproduction, production, and postproduction. Preproduction has an important role so that the next stage may provide guidance or instructions for the production process or the animation process. Preproduction includes research, making the story, script, screenplay, character, environment design and storyboards. The storyboard will be determined through camera angles, blocking, sets, and many supporting roles involved in a scene. The storyboard is also useful as a production reference in recording or taping each scene in sequence or as an efficient priority. The example used is an ad creation using motion graphic animation, in which the storyboard has an important role as a blueprint for every scene, giving instructions for the transition movement, layout, blocking, and camera movement, all of which should be done periodically in animation production. Planning before making the animation or motion graphic will make the job more organized, presentable, and more efficient in the process.

  9. Using compute unified device architecture-enabled graphic processing unit to accelerate fast Fourier transform-based regression Kriging interpolation on a MODIS land surface temperature image

    Science.gov (United States)

    Hu, Hongda; Shu, Hong; Hu, Zhiyong; Xu, Jianhui

    2016-04-01

    Kriging interpolation provides the best linear unbiased estimation for unobserved locations, but its heavy computation limits the manageable problem size in practice. To address this issue, an efficient interpolation procedure incorporating the fast Fourier transform (FFT) was developed. Extending this efficient approach, we propose an FFT-based parallel algorithm to accelerate regression Kriging interpolation on an NVIDIA® compute unified device architecture (CUDA)-enabled graphic processing unit (GPU). A high-performance cuFFT library in the CUDA toolkit was introduced to execute computation-intensive FFTs on the GPU, and three time-consuming processes were redesigned as kernel functions and executed on the CUDA cores. A MODIS land surface temperature 8-day image tile at a resolution of 1 km was resampled to create experimental datasets at eight different output resolutions. These datasets were used as the interpolation grids with different sizes in a comparative experiment. Experimental results show that speedup of the FFT-based regression Kriging interpolation accelerated by GPU can exceed 1000 when processing datasets with large grid sizes, as compared to the traditional Kriging interpolation running on the CPU. These results demonstrate that the combination of FFT methods and GPU-based parallel computing techniques greatly improves the computational performance without loss of precision.
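
    The core trick the paper exploits can be shown in a few lines of NumPy: on a regular grid with a stationary covariance model, the expensive covariance-times-vector products inside kriging become element-wise multiplications in Fourier space. This is only a sketch; the Gaussian covariance, range parameter, and random residuals are assumptions, and the cited work executes the same FFTs on the GPU via cuFFT.

    ```python
    import numpy as np

    n = 512
    lag = np.minimum(np.arange(n), n - np.arange(n))        # circulant 1-D lags
    dist = np.hypot(*np.meshgrid(lag, lag, indexing="ij"))  # toroidal distances
    cov = np.exp(-(dist / 40.0) ** 2)                       # assumed Gaussian covariance
    residuals = np.random.default_rng(1).normal(size=(n, n))

    # Covariance matrix times residual field for all grid nodes at once:
    # O(n^2 log n) via the FFT instead of O(n^4) with an explicit matrix.
    Cr = np.real(np.fft.ifft2(np.fft.fft2(cov) * np.fft.fft2(residuals)))
    ```

    Swapping np.fft for a GPU FFT binding changes nothing in the formulation, which is why three-orders-of-magnitude speedups become attainable on large grids.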

  10. Graphic Design in Educational Television.

    Science.gov (United States)

    Clarke, Beverley

    To help educational television (ETV) practitioners achieve maximum clarity, economy and purposiveness, the range of techniques of television graphics is explained. Closed-circuit and broadcast ETV are compared. The design process is discussed in terms of aspect ratio, line structure, cut off, screen size, tone scales, studio apparatus, and…

  11. Storyboard dalam Pembuatan Motion Graphic

    Directory of Open Access Journals (Sweden)

    Satrya Mahardhika

    2013-10-01

    Full Text Available Motion graphics is one category in the animation that makes animation with lots of design elements in each component. Motion graphics needs a long process, including preproduction, production, and postproduction. Preproduction has an important role so that the next stage may provide guidance or instructions for the production process or the animation process. Preproduction includes research, making the story, script, screenplay, character, environment design and storyboards. The storyboard will be determined through camera angles, blocking, sets, and many supporting roles involved in a scene. The storyboard is also useful as a production reference in recording or taping each scene in sequence or as an efficient priority. The example used is an ad creation using motion graphic animation, in which the storyboard has an important role as a blueprint for every scene, giving instructions for the transition movement, layout, blocking, and camera movement, all of which should be done periodically in animation production. Planning before making the animation or motion graphic will make the job more organized, presentable, and more efficient in the process.

  12. The Systems Biology Graphical Notation.

    Science.gov (United States)

    Le Novère, Nicolas; Hucka, Michael; Mi, Huaiyu; Moodie, Stuart; Schreiber, Falk; Sorokin, Anatoly; Demir, Emek; Wegner, Katja; Aladjem, Mirit I; Wimalaratne, Sarala M; Bergman, Frank T; Gauges, Ralph; Ghazal, Peter; Kawaji, Hideya; Li, Lu; Matsuoka, Yukiko; Villéger, Alice; Boyd, Sarah E; Calzone, Laurence; Courtot, Melanie; Dogrusoz, Ugur; Freeman, Tom C; Funahashi, Akira; Ghosh, Samik; Jouraku, Akiya; Kim, Sohyoung; Kolpakov, Fedor; Luna, Augustin; Sahle, Sven; Schmidt, Esther; Watterson, Steven; Wu, Guanming; Goryanin, Igor; Kell, Douglas B; Sander, Chris; Sauro, Herbert; Snoep, Jacky L; Kohn, Kurt; Kitano, Hiroaki

    2009-08-01

    Circuit diagrams and Unified Modeling Language diagrams are just two examples of standard visual languages that help accelerate work by promoting regularity, removing ambiguity and enabling software tool support for communication of complex information. Ironically, despite having one of the highest ratios of graphical to textual information, biology still lacks standard graphical notations. The recent deluge of biological knowledge makes addressing this deficit a pressing concern. Toward this goal, we present the Systems Biology Graphical Notation (SBGN), a visual language developed by a community of biochemists, modelers and computer scientists. SBGN consists of three complementary languages: process diagram, entity relationship diagram and activity flow diagram. Together they enable scientists to represent networks of biochemical interactions in a standard, unambiguous way. We believe that SBGN will foster efficient and accurate representation, visualization, storage, exchange and reuse of information on all kinds of biological knowledge, from gene regulation, to metabolism, to cellular signaling.

  13. Business Process Re-engineering: A Panacea for Reducing Operational Cost in Service Organizations

    Directory of Open Access Journals (Sweden)

    Joseph Joseph Sungau

    2015-03-01

    Full Text Available Organizations in today's business environment struggle with how to reduce Operational Cost in order to set prices that many customers can afford while obtaining a reasonable profit. In order to reduce Operational Cost, service organizations have been working hard to identify techniques that facilitate business process improvement. The global literature indicates that service organizations adopt the Business Process Re-engineering (BPR) technique as a panacea for reducing Operational Cost. Despite the documented potential of Business Process Re-engineering, there are mixed empirical results, findings and conclusions regarding its effect on Operational Cost. Therefore, this paper aims at assessing and explaining the effects of BPR on Operational Cost. The study used a cross-sectional survey design to investigate the effect of BPR on Operational Cost. An intensive literature review enabled the construction of a structural measurement model, the formulation of testable hypotheses and the operationalization of constructs. In order to test the model and hypotheses, data were collected from ninety-five (95) service organizations in Tanzania. Results of the study reveal that BPR and delivery speed have no direct effects on Operational Cost; they affect Operational Cost indirectly through the mediation of service quality. Therefore, BPR first influences both service quality and delivery speed, which in turn affect the Operational Cost of service organizations. It is recommended that service organizations use Business Process Re-engineering as a panacea for reducing Operational Cost.

  14. USE OF THE INFORMATION SYSTEM COSTS UNDER MANAGEMENT PROCESS

    Directory of Open Access Journals (Sweden)

    ECOBICI NICOLAE

    2014-12-01

    Full Text Available Decision-making takes place at all levels of the organization, taking into account both short-term outlook and long-term perspective. Plans are implemented by decisions whose purpose is materialized by formulating rational conclusions obtained as a result of financial and quantitative analysis. Thus, managerial accounting practice is deeply involved in decision making, a basic requirement of the existence of a solid managerial accounting information system cost, able to provide fundamental data.

  15. USE OF THE INFORMATION SYSTEM COSTS UNDER MANAGEMENT PROCESS

    Directory of Open Access Journals (Sweden)

    ECOBICI NICOLAE

    2014-12-01

    Full Text Available Decision-making takes place at all levels of the organization, taking into account both short-term outlook and long-term perspective. Plans are implemented by decisions whose purpose is materialized by formulating rational conclusions obtained as a result of financial and quantitative analysis. Thus, managerial accounting practice is deeply involved in decision making, a basic requirement of the existence of a solid managerial accounting information system cost, able to provide fundamental data.

  16. Thermodynamic and Thermo-graphic Research of the Interaction Process of the Lisakovsky Gravitymagnetic Concentrate with Hydrocarbons

    Directory of Open Access Journals (Sweden)

    A. A. Muhtar

    2015-01-01

    Full Text Available The relevance of this work lies in the treatment of complex brown iron ores. Large volumes of off-balance ores are an additional source of production raw materials; however, there is still the problem of treating them by effective complex methods. This work shows the possibility of using liquid hydrocarbons as reducers during the thermochemical preparation of brown iron concentrates of the Lisakovsky field for metallurgical conversion, and studies the features and main regularities of the roasting process of the Lisakovsky gravity-magnetic concentrate in the presence of liquid hydrocarbons. The initial concentrate was treated with a solution of a liquid hydrocarbon reducer (oil : phenyl hydride : water), which was subjected to heat treatment with subsequent magnetic dressing. X-ray phase analysis of the reduction products has shown that the main phases of the magnetic fraction of a roasted product are magnetite and, in small amounts, hematite and quartz; generally, only the relative intensity of the peaks changes. A thermodynamic analysis of the interaction between the hydrocarbons that are part of oil and iron oxides was carried out. This analysis allowed us to propose a reduction mechanism for the brown iron ores by liquid reducers. The data obtained by the thermodynamic analysis are confirmed by experimental results. It is proved that with an increasing hydrogen-to-carbon ratio the probability of reactions between iron(III) oxide and liquid hydrocarbons increases. Differential thermal analysis allowed us to study the heat treatment process of the Lisakovsky gravity-magnetic concentrate pre-treated with oil solutions, and to show the possibility of interaction between liquid hydrocarbons and the ferriferous products of the Lisakovsky gravity-magnetic concentrate. It is found that with increasing temperature in the treated samples of LGMK the hydrogoethite dehydration product interacts with

  17. Performance Tradeoff Considerations in a Graphics Processing Unit (GPU) Implementation of a Low Detectable Aircraft Sensor System

    Science.gov (United States)

    2013-01-01

    CUDA ... Optimal employment of GPU memory ... the GPU using the stream construct within CUDA. Using this technique, a small amount of input tile data is sent to the GPU initially. Then, while the CUDA kernels process ...

  18. Keeping the Cost of Process Change Low through Refactoring

    NARCIS (Netherlands)

    Weber, B.; Reichert, M.U.

    2007-01-01

    With the increasing adoption of process-aware information systems (PAIS) large process model repositories have emerged. Over time respective models have to be re-aligned to the real world business processes through customization or adaptation. This bears the risk that model redundancies are introduced.

  19. Keeping the Cost of Process Change Low through Refactoring

    NARCIS (Netherlands)

    Weber, B.; Reichert, M.U.

    2007-01-01

    With the increasing adoption of process-aware information systems (PAIS) large process model repositories have emerged. Over time respective models have to be re-aligned to the real world business processes through customization or adaptation. This bears the risk that model redundancies are

  20. Getting Graphic at the School Library.

    Science.gov (United States)

    Kan, Kat

    2003-01-01

    Provides information for school libraries interested in acquiring graphic novels. Discusses theft prevention; processing and cataloging; maintaining the collection; what to choose, with two Web sites for more information on graphic novels for libraries; collection development decisions; and Japanese comics called Manga. Includes an annotated list…

  1. Using Typography Grid in Graphic Design

    Institute of Scientific and Technical Information of China (English)

    张悦霞

    2007-01-01

    Typography grid is one of the important tools in the process of graphic design. Through studying the history and principles of the typography grid, we can learn and master the methods of design and improve our understanding of the aesthetic rules of graphic design.

  2. Synthesising Graphical Theories

    CERN Document Server

    Kissinger, Aleks

    2012-01-01

    In recent years, diagrammatic languages have been shown to be a powerful and expressive tool for reasoning about physical, logical, and semantic processes represented as morphisms in a monoidal category. In particular, categorical quantum mechanics, or "Quantum Picturalism", aims to turn concrete features of quantum theory into abstract structural properties, expressed in the form of diagrammatic identities. One way we search for these properties is to start with a concrete model (e.g. a set of linear maps or finite relations) and start composing generators into diagrams and looking for graphical identities. Naively, we could automate this procedure by enumerating all diagrams up to a given size and check for equalities, but this is intractable in practice because it produces far too many equations. Luckily, many of these identities are not primitive, but rather derivable from simpler ones. In 2010, Johansson, Dixon, and Bundy developed a technique called conjecture synthesis for automatically generating conj...

  3. Intelligent Computer Graphics 2012

    CERN Document Server

    Miaoulis, Georgios

    2013-01-01

    In Computer Graphics, the use of intelligent techniques started more recently than in other research areas. However, during these last two decades, the use of intelligent Computer Graphics techniques is growing up year after year and more and more interesting techniques are presented in this area.   The purpose of this volume is to present current work of the Intelligent Computer Graphics community, a community growing up year after year. This volume is a kind of continuation of the previously published Springer volumes “Artificial Intelligence Techniques for Computer Graphics” (2008), “Intelligent Computer Graphics 2009” (2009), “Intelligent Computer Graphics 2010” (2010) and “Intelligent Computer Graphics 2011” (2011).   Usually, this kind of volume contains, every year, selected extended papers from the corresponding 3IA Conference of the year. However, the current volume is made from directly reviewed and selected papers, submitted for publication in the volume “Intelligent Computer Gr...

  4. Parallel Implementation of Graphics Rendering and Image Processing Algorithm Based on PAAG

    Institute of Scientific and Technical Information of China (English)

    孙建; 李涛; 李雪丹

    2015-01-01

    In order to address the difficulty of improving chip performance by raising clock frequency, now that current CMOS technology has run into the "power wall" and the "cooling wall", this paper presents a new polymorphic isomorphic array processor called PAAG (Polymorphic Array Architecture for Graphics and image processing). The array processor integrates multiple processing elements on a single chip and can realize parallel computation by decomposing high-performance, complex algorithms and mapping them onto the platform. By combining data-level and operation-level parallel computation methods, the graphics algorithms of the fixed rendering pipeline and the kernel-function image algorithms of OpenVX 1.0, the computer-vision standard proposed by the international standards organization Khronos, are analyzed in depth, and parallel designs of these algorithms on PAAG are given. By running the parallel implementations in the simulation environment corresponding to the PAAG hardware platform, the run clock counts of the algorithms on multiple processing elements were obtained, from which the speedups on multiple processing elements were calculated. Experimental results show that the proposed parallelization method achieves linear speedup for graphics and image algorithms on PAAG, with higher efficiency than serial execution.

  5. Cost model relationships between textile manufacturing processes and design details for transport fuselage elements

    Science.gov (United States)

    Metschan, Stephen L.; Wilden, Kurtis S.; Sharpless, Garrett C.; Andelman, Rich M.

    1993-01-01

    Textile manufacturing processes offer potential cost and weight advantages over traditional composite materials and processes for transport fuselage elements. In the current study, design cost modeling relationships between textile processes and element design details were developed. Such relationships are expected to help future aircraft designers to make timely decisions on the effect of design details and overall configurations on textile fabrication costs. The fundamental advantage of a design cost model is to insure that the element design is cost effective for the intended process. Trade studies on the effects of processing parameters also help to optimize the manufacturing steps for a particular structural element. Two methods of analyzing design detail/process cost relationships developed for the design cost model were pursued in the current study. The first makes use of existing databases and alternative cost modeling methods (e.g. detailed estimating). The second compares design cost model predictions with data collected during the fabrication of seven foot circumferential frames for ATCAS crown test panels. The process used in this case involves 2D dry braiding and resin transfer molding of curved 'J' cross section frame members having design details characteristic of the baseline ATCAS crown design.

  6. Deterministic Graphical Games Revisited

    DEFF Research Database (Denmark)

    Andersson, Daniel; Hansen, Kristoffer Arnsfelt; Miltersen, Peter Bro

    2008-01-01

    We revisit the deterministic graphical games of Washburn. A deterministic graphical game can be described as a simple stochastic game (a notion due to Anne Condon), except that we allow arbitrary real payoffs but disallow moves of chance. We study the complexity of solving deterministic graphical...... games and obtain an almost-linear time comparison-based algorithm for computing an equilibrium of such a game. The existence of a linear time comparison-based algorithm remains an open problem....

  7. Introduction to Graphical Modelling

    CERN Document Server

    Scutari, Marco

    2010-01-01

    The aim of this chapter is twofold. In the first part we will provide a brief overview of the mathematical and statistical foundations of graphical models, along with their fundamental properties, estimation and basic inference procedures. In particular we will develop Markov networks (also known as Markov random fields) and Bayesian networks, which comprise most past and current literature on graphical models. In the second part we will review some applications of graphical models in systems biology.

  8. Improvement of MS (multiple sclerosis) CAD (computer aided diagnosis) performance using C/C++ and computing engine in the graphical processing unit (GPU)

    Science.gov (United States)

    Suh, Joohyung; Ma, Kevin; Le, Anh

    2011-03-01

    Multiple Sclerosis (MS) is a disease caused by damaged myelin around the axons of the brain and spinal cord. Currently, MR imaging is used for diagnosis, but it is highly variable and time-consuming since lesion detection and estimation of lesion volume are performed manually. For this reason, we developed a CAD (Computer Aided Diagnosis) system to assist segmentation of MS and facilitate the physician's diagnosis. The MS CAD system utilizes the K-NN (k-nearest neighbor) algorithm to detect and segment the lesion volume on a voxel basis. The prototype MS CAD system was developed under the MATLAB environment and currently consumes a huge amount of time to process data. In this paper we present the development of a second version of the MS CAD system, converted into C/C++ in order to take advantage of the GPU (Graphical Processing Unit), which provides parallel computation. With the realization in C/C++ and utilization of the GPU, we expect to cut running time drastically. The paper investigates the conversion from MATLAB to C/C++ and the utilization of a high-end GPU for parallel computing of data to improve the algorithm performance of the MS CAD system.
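
    The voxel-wise K-NN classification at the heart of such a CAD system can be prototyped in a few lines; the feature vector (intensity plus coordinates), the value k = 5, and the random training data below are illustrative assumptions only, not the system's actual design.

    ```python
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(0)
    train_feats = rng.normal(size=(1000, 4))   # [intensity, x, y, z] per voxel
    train_labels = rng.integers(0, 2, 1000)    # 1 = lesion, 0 = background

    # Fit on expert-labelled voxels, then label every candidate voxel.
    knn = KNeighborsClassifier(n_neighbors=5).fit(train_feats, train_labels)
    candidate_voxels = rng.normal(size=(200, 4))
    lesion_mask = knn.predict(candidate_voxels)
    lesion_volume = lesion_mask.sum()          # voxel count -> volume estimate
    ```

    Since every voxel is classified independently, the distance computations parallelize naturally, which is what the GPU port exploits.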

  9. A Real-Time High Performance Computation Architecture for Multiple Moving Target Tracking Based on Wide-Area Motion Imagery via Cloud and Graphic Processing Units

    Directory of Open Access Journals (Sweden)

    Kui Liu

    2017-02-01

    Full Text Available This paper presents the first attempt at combining Cloud with Graphic Processing Units (GPUs in a complementary manner within the framework of a real-time high performance computation architecture for the application of detecting and tracking multiple moving targets based on Wide Area Motion Imagery (WAMI. More specifically, the GPU and Cloud Moving Target Tracking (GC-MTT system applied a front-end web based server to perform the interaction with Hadoop and highly parallelized computation functions based on the Compute Unified Device Architecture (CUDA©. The introduced multiple moving target detection and tracking method can be extended to other applications such as pedestrian tracking, group tracking, and Patterns of Life (PoL analysis. The cloud and GPUs based computing provides an efficient real-time target recognition and tracking approach as compared to methods when the work flow is applied using only central processing units (CPUs. The simultaneous tracking and recognition results demonstrate that a GC-MTT based approach provides drastically improved tracking with low frame rates over realistic conditions.

  10. A Real-Time High Performance Computation Architecture for Multiple Moving Target Tracking Based on Wide-Area Motion Imagery via Cloud and Graphic Processing Units.

    Science.gov (United States)

    Liu, Kui; Wei, Sixiao; Chen, Zhijiang; Jia, Bin; Chen, Genshe; Ling, Haibin; Sheaff, Carolyn; Blasch, Erik

    2017-02-12

    This paper presents the first attempt at combining Cloud with Graphic Processing Units (GPUs) in a complementary manner within the framework of a real-time high performance computation architecture for the application of detecting and tracking multiple moving targets based on Wide Area Motion Imagery (WAMI). More specifically, the GPU and Cloud Moving Target Tracking (GC-MTT) system applied a front-end web based server to perform the interaction with Hadoop and highly parallelized computation functions based on the Compute Unified Device Architecture (CUDA©). The introduced multiple moving target detection and tracking method can be extended to other applications such as pedestrian tracking, group tracking, and Patterns of Life (PoL) analysis. The cloud and GPUs based computing provides an efficient real-time target recognition and tracking approach as compared to methods when the work flow is applied using only central processing units (CPUs). The simultaneous tracking and recognition results demonstrate that a GC-MTT based approach provides drastically improved tracking with low frame rates over realistic conditions.

  11. Graphics processing unit aided highly stable real-time spectral-domain optical coherence tomography at 1375 nm based on dual-coupled-line subtraction

    Science.gov (United States)

    Kim, Ji-hyun; Han, Jae-Ho; Jeong, Jichai

    2013-04-01

    We have proposed and demonstrated a highly stable spectral-domain optical coherence tomography (SD-OCT) system based on dual-coupled-line subtraction. The proposed system achieved an ultrahigh axial resolution of 5 μm by combining four kinds of spectrally shifted superluminescent diodes at 1375 nm. Using the dual-coupled-line subtraction method, we made the system insensitive to fluctuations of the optical intensity that can arise in various clinical and experimental conditions. The imaging stability was verified by perturbing the intensity through bending of an optical fiber, with our system being the only one among the compared conventional systems to suppress the resulting noise. Also, the proposed method required less computational complexity than conventional mean- and median-line subtraction. The real-time SD-OCT scheme was implemented by graphics processing unit aided signal processing. This is the first reported reduction method for A-line-wise fixed-pattern noise in a single-shot image without estimating the DC component.
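
    For contrast, here is the conventional mean-line subtraction baseline that such systems improve upon, sketched in NumPy on hypothetical raw spectra; the proposed dual-coupled-line method instead derives its fixed-pattern estimate from two coupler outputs per shot, which is what makes it robust to intensity fluctuations.

    ```python
    import numpy as np

    # Hypothetical (n_ascans, n_pixels) array of raw interference spectra.
    spectra = np.random.default_rng(0).normal(size=(512, 2048))

    mean_line = spectra.mean(axis=0, keepdims=True)   # DC / fixed-pattern estimate
    corrected = spectra - mean_line                   # mean-line subtraction
    ascans = np.abs(np.fft.fft(corrected, axis=1))    # depth profiles (A-scans)
    ```

    The weakness of this baseline is visible in the first line: the fixed-pattern estimate is averaged across A-scans, so any shot-to-shot intensity fluctuation leaks into the corrected image.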

  12. A Real-Time High Performance Computation Architecture for Multiple Moving Target Tracking Based on Wide-Area Motion Imagery via Cloud and Graphic Processing Units

    Science.gov (United States)

    Liu, Kui; Wei, Sixiao; Chen, Zhijiang; Jia, Bin; Chen, Genshe; Ling, Haibin; Sheaff, Carolyn; Blasch, Erik

    2017-01-01

    This paper presents the first attempt at combining Cloud with Graphic Processing Units (GPUs) in a complementary manner within the framework of a real-time high performance computation architecture for the application of detecting and tracking multiple moving targets based on Wide Area Motion Imagery (WAMI). More specifically, the GPU and Cloud Moving Target Tracking (GC-MTT) system applied a front-end web based server to perform the interaction with Hadoop and highly parallelized computation functions based on the Compute Unified Device Architecture (CUDA©). The introduced multiple moving target detection and tracking method can be extended to other applications such as pedestrian tracking, group tracking, and Patterns of Life (PoL) analysis. The cloud and GPUs based computing provides an efficient real-time target recognition and tracking approach as compared to methods when the work flow is applied using only central processing units (CPUs). The simultaneous tracking and recognition results demonstrate that a GC-MTT based approach provides drastically improved tracking with low frame rates over realistic conditions. PMID:28208684

  13. The computer graphics metafile

    CERN Document Server

    Henderson, LR; Shepherd, B; Arnold, D B

    1990-01-01

    The Computer Graphics Metafile deals with the Computer Graphics Metafile (CGM) standard and covers topics ranging from the structure and contents of a metafile to CGM functionality, metafile elements, and real-world applications of CGM. Binary Encoding, Character Encoding, application profiles, and implementations are also discussed. This book is comprised of 18 chapters divided into five sections and begins with an overview of the CGM standard and how it can meet some of the requirements for storage of graphical data within a graphics system or application environment. The reader is then intr

  14. Graphical Models with R

    CERN Document Server

    Højsgaard, Søren; Lauritzen, Steffen

    2012-01-01

    Graphical models in their modern form have been around since the late 1970s and appear today in many areas of the sciences. Along with the ongoing developments of graphical models, a number of different graphical modeling software programs have been written over the years. In recent years many of these software developments have taken place within the R community, either in the form of new packages or by providing an R interface to existing software. This book attempts to give the reader a gentle introduction to graphical modeling using R and the main features of some of these packages. In add

  15. The computer graphics interface

    CERN Document Server

    Steinbrugge Chauveau, Karla; Niles Reed, Theodore; Shepherd, B

    2014-01-01

    The Computer Graphics Interface provides a concise discussion of computer graphics interface (CGI) standards. The title is comprised of seven chapters that cover the concepts of the CGI standard. Figures and examples are also included. The first chapter provides a general overview of CGI; this chapter covers graphics standards, functional specifications, and syntactic interfaces. Next, the book discusses the basic concepts of CGI, such as inquiry, profiles, and registration. The third chapter covers the CGI concepts and functions, while the fourth chapter deals with the concept of graphic obje

  16. Graphics processing unit implementation and optimisation of a flexible maximum a-posteriori decoder for synchronisation correction

    Directory of Open Access Journals (Sweden)

    Johann A. Briffa

    2014-06-01

    Full Text Available In this paper, the author presents an optimised parallel implementation of a flexible maximum a-posteriori decoder for synchronisation error correcting codes, supporting a very wide range of code sizes and channel conditions. On mid-range GPUs the author demonstrates decoding speedups of more than two orders of magnitude over a central processing unit implementation of the same optimised algorithm, and more than an order of magnitude over the author's earlier GPU implementation. The prominent challenge is to maintain high parallelisation efficiency over a wide range of code sizes and channel conditions, and different execution hardware. The author ensures this with a dynamic strategy for choosing parallel execution parameters at run-time, and also presents a variant that trades off some decoding speed for a significantly reduced memory requirement, with no loss to the decoder's error correction performance. The increased throughput of the implementation and its ability to work with less memory allow larger codes and poorer channel conditions to be analysed, and make practical use of such codes more feasible.

  17. Vehicle Lightweighting: Mass Reduction Spectrum Analysis and Process Cost Modeling

    Energy Technology Data Exchange (ETDEWEB)

    Mascarin, Anthony [IBIS Associates, Inc., Waltham, MA (United States); Hannibal, Ted [IBIS Associates, Inc., Waltham, MA (United States); Raghunathan, Anand [Energetics Inc., Columbia, MD (United States); Ivanic, Ziga [Energetics Inc., Columbia, MD (United States); Clark, Michael [Idaho National Lab. (INL), Idaho Falls, ID (United States)

    2016-03-01

    The U.S. Department of Energy’s Vehicle Technologies Office, Materials area commissioned a study to model and assess manufacturing economics of alternative design and production strategies for a series of lightweight vehicle concepts. The first two phases of this effort examined combinations of strategies aimed at achieving targets of 40% and 45% mass reduction relative to a standard North American midsize passenger sedan at an effective cost of $3.42 per pound (lb) saved. These results have been reported in the Idaho National Laboratory report INL/EXT-14-33863 entitled Vehicle Lightweighting: 40% and 45% Weight Savings Analysis: Technical Cost Modeling for Vehicle Lightweighting published in March 2015. The data for these strategies were drawn from many sources, including Lotus Engineering Limited and FEV, Inc. lightweighting studies, the U.S. Department of Energy-funded Vehma International of America, Inc./Ford Motor Company Multi-Material Lightweight Prototype Vehicle Demonstration Project, the Aluminum Association Transportation Group, many United States Council for Automotive Research’s/United States Automotive Materials Partnership LLC lightweight materials programs, and IBIS Associates, Inc.’s decades of experience in automotive lightweighting and materials substitution analyses.
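
    The study's headline metric, effective cost per pound saved, is a simple ratio; the masses and costs below are invented solely to show the arithmetic that reproduces a $3.42/lb figure.

    ```python
    # Effective cost of lightweighting, $ per pound saved (hypothetical inputs).
    baseline_mass_lb, lightweight_mass_lb = 3300.0, 1980.0   # 40% mass reduction
    baseline_cost_usd, lightweight_cost_usd = 20_000.0, 24_514.0

    cost_per_lb_saved = (lightweight_cost_usd - baseline_cost_usd) / \
                        (baseline_mass_lb - lightweight_mass_lb)
    print(f"${cost_per_lb_saved:.2f} per lb saved")   # -> $3.42 with these inputs
    ```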

  18. Publishing on the WWW. Part 6 - Simple graphic manipulation

    OpenAIRE

    Grech, V

    2001-01-01

    Modern graphic manipulation software is quick and simple to use, and allows medical quality graphics to be produced for online publication. This article demonstrates, step by step, how submitted images are processed by the journal in preparation for publication.

  19. The Use Condom campaign and its implications for graphic ...

    African Journals Online (AJOL)

    The Use Condom campaign and its implications for graphic communication in ... to assess the roles/activities of the media team in the media production process ... to producing effective graphic messages that facilitate the rapid adoption and ...

  20. Process-Costing, Job-Order-Costing, Operation Costing (also called Batch Costing and Functional Costing: when the systems view underlies management accounting and its decisions)

    DEFF Research Database (Denmark)

    Nielsen, Steen

    2005-01-01

    The three concepts of process costing, job-order costing, and operation costing, together with functional-based costing, are in fact historical concepts that go far back in the management accounting literature, back to the Scientific Management movement of the 1920s and 1930s. One therefore cannot say that these...... the Activity-Based Cost Management systems. It is therefore important both to know the history of the field and to consider whether, under certain assumptions, these concepts still have their justification. The same concepts also have their counterparts in the Danish concepts of the departmental or functional account and the order account, e.g. as analyzed by Palle Hansen and Vagn Madsen. The term operational costing is also used, but in reality it covers how and which elements enter into the company's entire accounting information system; that is, this is more a question of how the systems are...

  1. 76 FR 24871 - Reimbursement for Costs of Remedial Action at Active Uranium and Thorium Processing Sites

    Science.gov (United States)

    2011-05-03

    ... Reimbursement for Costs of Remedial Action at Active Uranium and Thorium Processing Sites AGENCY: Department of... from eligible active uranium and thorium processing site licensees for reimbursement under Title X of...). Title X requires DOE to reimburse eligible uranium and thorium licensees for certain costs...

  2. 75 FR 71677 - Reimbursement for Costs of Remedial Action at Active Uranium and Thorium Processing Sites

    Science.gov (United States)

    2010-11-24

    ... Reimbursement for Costs of Remedial Action at Active Uranium and Thorium Processing Sites AGENCY: Department of... uranium and thorium processing site licensees for reimbursement under Title X of the Energy Policy Act of... requires DOE to reimburse eligible uranium and thorium licensees for certain costs of...

  3. 76 FR 30696 - Reimbursement for Costs of Remedial Action at Active Uranium and Thorium Processing Sites

    Science.gov (United States)

    2011-05-26

    ... Reimbursement for Costs of Remedial Action at Active Uranium and Thorium Processing Sites AGENCY: Department of... eligible active uranium and thorium processing site licensees for reimbursement under Title X of the Energy... requires DOE to reimburse eligible uranium and thorium licensees for certain costs of...

  4. Approximating Graphic TSP by Matchings

    CERN Document Server

    Mömke, Tobias

    2011-01-01

    We present a framework for approximating the metric TSP based on a novel use of matchings. Traditionally, matchings have been used to add edges in order to make a given graph Eulerian, whereas our approach also allows for the removal of certain edges leading to a decreased cost. For the TSP on graphic metrics (graph-TSP), the approach yields a 1.461-approximation algorithm with respect to the Held-Karp lower bound. For graph-TSP restricted to a class of graphs that contains degree three bounded and claw-free graphs, we show that the integrality gap of the Held-Karp relaxation matches the conjectured ratio 4/3. The framework allows for generalizations in a natural way and also leads to a 1.586-approximation algorithm for the traveling salesman path problem on graphic metrics where the start and end vertices are prespecified.
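
    The traditional matching construction this abstract builds on is Christofides' algorithm: add a minimum-weight perfect matching on the odd-degree vertices of a minimum spanning tree to obtain an Eulerian multigraph. A compact networkx rendition is below, a sketch assuming a complete graph with 'weight' edge attributes; it shows only the classical construction, not the paper's edge-removing refinement.

    ```python
    import networkx as nx

    def christofides_tour(G):
        mst = nx.minimum_spanning_tree(G)
        odd = [v for v, d in mst.degree() if d % 2 == 1]   # always evenly many
        # Min-weight perfect matching on odd vertices (negate weights so that
        # networkx's max-weight matching finds the minimum-weight one).
        H = nx.Graph()
        H.add_weighted_edges_from(
            (u, v, -G[u][v]["weight"]) for i, u in enumerate(odd) for v in odd[i + 1:])
        matching = nx.max_weight_matching(H, maxcardinality=True)
        multi = nx.MultiGraph(mst)
        multi.add_edges_from(matching)                     # all degrees now even
        # Shortcut the Eulerian circuit into a Hamiltonian tour.
        tour, seen = [], set()
        for u, _ in nx.eulerian_circuit(multi):
            if u not in seen:
                tour.append(u)
                seen.add(u)
        return tour + tour[:1]
    ```

    In the metric case the shortcutting step never increases the cost, which is what yields the classical 3/2 guarantee that the paper's matching framework improves on for graphic metrics.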

  5. SISTEM PERHITUNGAN HARGA POKOK PRODUKSI PADA PERUSAHAAN FARMASI PT. BALATIF DENGAN METODE PROCESS COSTING

    Directory of Open Access Journals (Sweden)

    Andreas Handojo

    2009-01-01

    Full Text Available Nowadays, PT. Balatif has a production-cost calculation system that covers only the cost of materials based on the standard Bill of Materials (BOM), while the recording of factory overhead costs, direct labor costs, and the reports relating to the calculation of the Cost of Goods Manufactured of a product still cannot be handled by the system. As a result, the costs actually incurred in the production process are difficult to trace, and the production cost is not based on the real process. To address this problem, this research designs a system to calculate production costs that can handle these issues. The application uses Microsoft Visual Studio .NET 2005 as the programming tool and Oracle 10g as the database. With the application, raw material usage can be recorded based on the BOM or additional issues, together with machine usage, operator usage, and other data associated with the production process. In addition, the application can be used to allocate the costs that occur during the production process and to generate reports related to the calculation of the Cost of Goods Manufactured of a product automatically.
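
    The roll-up such a system automates is, at its core, a three-component sum; the figures below are hypothetical and only illustrate the Cost of Goods Manufactured calculation for one production order.

    ```python
    # Hypothetical cost-of-goods-manufactured roll-up for one production order.
    direct_materials = 5_250.00      # from the BOM plus additional issues
    direct_labor     = 1_800.00      # operator hours x hourly rate
    machine_hours    = 36.0
    overhead_rate    = 42.50         # factory overhead per machine hour

    factory_overhead = machine_hours * overhead_rate
    cogm = direct_materials + direct_labor + factory_overhead
    unit_cost = cogm / 1_000         # order quantity of 1,000 units
    print(f"COGM = {cogm:,.2f}, unit cost = {unit_cost:.4f}")
    ```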

  6. New Catalyst Reduces Wasted Carbon in Biofuel Process, Lowers Cost

    Energy Technology Data Exchange (ETDEWEB)

    2016-02-01

    Researchers at NREL recently developed a catalyst formulation that incorporates more hydrogen into the DME-to-high-octane gasoline process, resulting in a higher yield to gasoline-range products. Further, the researchers developed a secondary process that efficiently couples a portion of the gasoline-range product to yield jet/diesel fuels. The modified catalyst doubles the conversion rate of DME, which can be produced from biomass, to the high-octane gasoline product and significantly decreases the formation of wasted byproducts. For the distillate-range product, 80% of the mixture is in line with ASTM standards for use as a jet fuel blendstock. The increased productivity of high-octane gasoline and the development of a value-added distillate blendstock process further improve the economic viability toward commercially implementing this renewable fuels process.

  7. Low-Cost Structural Thermoelectric Materials: Processing and Consolidation

    Science.gov (United States)

    2015-01-01

    nanocrystalline materials [11,12], and this research utilizes the existing powder processing infrastructure at ARL to explore nanostructured TE materials ... The process of utilizing mechanical alloying to produce bulk nanocrystalline materials is shown in Fig. 3. There are a number of different types of ... consolidate nanocrystalline metal powders. In fact, the bottom image in Fig. 9 is the Ti–Ni–Sn material consolidated at 1,000 °C. The hollowed area is the

  8. Scientific statistics and graphics on the Macintosh

    Energy Technology Data Exchange (ETDEWEB)

    Grotch, S.L.

    1994-09-01

    In many organizations scientists have ready access to more than one computer, often both a workstation (e.g., SUN, HP, SGI) as well as a Macintosh or other PC. The scientist commonly uses the workstation for "number-crunching" and data analysis whereas the Macintosh is relegated to either word processing or serves as a "dumb terminal" to a larger mainframe computer. In an informal poll of the author's colleagues, very few of them used their Macintoshes for either statistical analysis or for graphical data display. The author believes that this state of affairs is particularly unfortunate because over the last few years both the computational capability, and even more so, the software availability for the Macintosh have become quite formidable. In some instances, very powerful tools are now available on the Macintosh that may not exist (or be far too costly) on the so-called "high end" workstations. Many scientists are simply unaware of the wealth of extremely useful, "off-the-shelf" software that already exists on the Macintosh for scientific graphical and statistical analysis. This paper is a very personal view illustrating several such software packages that have proved valuable in the author's own work in the analysis and display of climatic datasets. It is meant to be neither an all-inclusive enumeration nor an endorsement of these products as the "best" of their class. Rather, it has been found through extensive use that these few packages were generally capable of satisfying his particular needs for both statistical analysis and graphical data display. In the limited space available, the focus will be on some of the more novel features found to be of value.

  9. Development of SAP-DoA techniques for GPR data processing within COST Action TU1208

    Science.gov (United States)

    Meschino, Simone; Pajewski, Lara; Marciniak, Marian

    2016-04-01

    This work focuses on the use of Sub-Array Processing (SAP) and Direction of Arrival (DoA) approaches for the processing of Ground-Penetrating Radar data, with the purpose of locating metal scatterers embedded in concrete or buried in the ground. Research activities have been carried out during two Short-Term Scientific Missions (STSMs) funded by the COST (European COoperation in Science and Technology) Action TU1208 "Civil Engineering Applications of Ground Penetrating Radar" in May 2015 and January 2016. In applications involving smart antennas and in the presence of several transmitters operating simultaneously, it is important for a receiving array to be able to estimate the Direction of Arrival (DoA) of the incoming signals, in order to decipher how many emitters are present and predict their positions. A number of methods have been devised for DoA estimation: the MUltiple SIgnal Classification (MUSIC) and the Estimation of Signal Parameters via Rotational Invariance Technique (ESPRIT) are amongst the most popular ones [1]. In the scenario considered here, the electromagnetic sources are the currents induced on metal elements embedded in concrete or buried in the ground. GPR radargrams are processed to estimate the DoAs of the electric field back-scattered by the sought targets. In order to work in near-field conditions, a sub-array processing (SAP) approach is adopted: the radargram is partitioned into sub-radargrams composed of a few A-scans each, and the dominant DoA is predicted for each sub-radargram. The estimated angles are triangulated, obtaining a set of crossings with intersections condensed around object locations. This pattern is filtered, in order to remove a noisy background of unwanted crossings, and is processed by applying the statistical procedure described in [2]. We tested our approach on synthetic GPR radargrams obtained by using the freeware simulator gprMax, which implements the Finite-Difference Time-Domain method [3]. In particular, we worked with
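
    As a reference point for the DoA step, here is a textbook MUSIC estimator for a uniform linear array in NumPy; the array geometry, noise level, and single-source scenario are assumptions for the demo rather than the TU1208 processing chain itself.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    M, d, n_snap, true_deg = 8, 0.5, 200, 25.0   # sensors, spacing (wavelengths)

    def steering(theta_deg):
        k = 2 * np.pi * d * np.sin(np.deg2rad(theta_deg))
        return np.exp(1j * k * np.arange(M))

    # Simulated snapshots: one far-field source plus complex white noise.
    s = rng.normal(size=n_snap) + 1j * rng.normal(size=n_snap)
    X = np.outer(steering(true_deg), s) + 0.1 * (
        rng.normal(size=(M, n_snap)) + 1j * rng.normal(size=(M, n_snap)))

    R = X @ X.conj().T / n_snap                  # sample covariance
    w, V = np.linalg.eigh(R)                     # eigenvalues in ascending order
    En = V[:, :-1]                               # noise subspace (one source)
    angles = np.linspace(-90, 90, 721)
    p = [1 / np.linalg.norm(En.conj().T @ steering(a)) ** 2 for a in angles]
    print("estimated DoA:", angles[int(np.argmax(p))], "deg")
    ```

    The pseudo-spectrum peaks where the steering vector is orthogonal to the noise subspace, here at roughly 25 degrees.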

  10. Defining Dynamic Graphics by a Graphical Language

    Institute of Scientific and Technical Information of China (English)

    毛其昌; 戴汝为

    1991-01-01

    A graphical language which can be used for defining dynamic pictures and applying control actions to them is defined with an expanded attributed grammar. Based on this, a system is built for developing the presentation of application data in user interfaces. This system provides user interface designers with a friendly and highly efficient programming environment.

  11. Near Real-Time Estimation of Super-Resolved Depth and All-In-Focus Images from a Plenoptic Camera Using Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    J. P. Lüke

    2010-01-01

    Full Text Available Depth range cameras are a promising solution for the 3DTV production chain. The generation of color images with their accompanying depth value simplifies the transmission bandwidth problem in 3DTV and yields a direct input for autostereoscopic displays. Recent developments in plenoptic video-cameras make it possible to introduce 3D cameras that operate similarly to traditional cameras. The use of plenoptic cameras for 3DTV has some benefits with respect to 3D capture systems based on dual stereo cameras since there is no need for geometric and color calibration or frame synchronization. This paper presents a method for simultaneously recovering depth and all-in-focus images from a plenoptic camera in near real time using graphics processing units (GPUs). Previous methods for 3D reconstruction using plenoptic images suffered from the drawback of low spatial resolution. A method that overcomes this deficiency is developed on parallel hardware to obtain near real-time 3D reconstruction with a final spatial resolution of 800×600 pixels. This resolution is suitable as an input to some autostereoscopic displays currently on the market and shows that real-time 3DTV based on plenoptic video-cameras is technologically feasible.

  12. Anisotropic interfacial tension, contact angles, and line tensions: A graphics-processing-unit-based Monte Carlo study of the Ising model

    Science.gov (United States)

    Block, Benjamin J.; Kim, Suam; Virnau, Peter; Binder, Kurt

    2014-12-01

    As a generic example for crystals where the crystal-fluid interface tension depends on the orientation of the interface relative to the crystal lattice axes, the nearest-neighbor Ising model on the simple cubic lattice is studied over a wide temperature range, both above and below the roughening transition temperature. Using a thin-film geometry Lx×Ly×Lz with periodic boundary conditions along the z axis and two free Lx×Ly surfaces at which opposing surface fields ±H1 act, under conditions of partial wetting a single planar interface inclined at a contact angle θ is stabilized; the interface tension, the contact angle, and the line tension (which depends on the contact angle and on temperature) are studied. All these quantities are extracted from suitable thermodynamic integration procedures. In order to keep finite-size effects as well as statistical errors small enough, rather large lattice sizes (of the order of 46 million sites) were found to be necessary, and the availability of very efficient code implementations on graphics processing units was crucial for the feasibility of this study.
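
    The GPU-friendliness of such Ising simulations comes from checkerboard decomposition: sites of one sublattice have all their neighbours on the other, so a whole sublattice can be updated simultaneously. Below is a stripped-down 2D Metropolis sketch of that idea; the study itself treats 3D lattices with surface fields, so lattice size, coupling, and dimensionality here are illustrative.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    L, beta = 64, 0.44
    spins = rng.choice([-1, 1], size=(L, L))
    mask = (np.add.outer(np.arange(L), np.arange(L)) % 2).astype(bool)

    def half_sweep(spins, sub):
        # Sum of the four nearest neighbours with periodic boundaries.
        nb = (np.roll(spins, 1, 0) + np.roll(spins, -1, 0)
              + np.roll(spins, 1, 1) + np.roll(spins, -1, 1))
        dE = 2 * spins * nb                      # energy change if flipped
        # Metropolis acceptance, applied to one sublattice only.
        accept = (rng.random((L, L)) < np.exp(-beta * dE)) & sub
        spins[accept] *= -1

    for _ in range(100):
        half_sweep(spins, mask)                  # "black" sites in parallel
        half_sweep(spins, ~mask)                 # "white" sites in parallel
    ```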

  13. Accelerating electrostatic interaction calculations with graphical processing units based on new developments of Ewald method using non-uniform fast Fourier transform.

    Science.gov (United States)

    Yang, Sheng-Chun; Wang, Yong-Lei; Jiao, Gui-Sheng; Qian, Hu-Jun; Lu, Zhong-Yuan

    2016-01-30

    We present new algorithms to improve the performance of the ENUF method (F. Hedman, A. Laaksonen, Chem. Phys. Lett. 425, 2006, 142), which is essentially Ewald summation using the Non-Uniform FFT (NFFT) technique. A NearDistance algorithm is developed to extensively reduce the neighbor-list size in the real-space computation. In the reciprocal-space computation, a new NFFT algorithm is developed for the evaluation of electrostatic interaction energies and forces. Both real-space and reciprocal-space computations are further accelerated by using graphics processing units (GPUs) with CUDA technology. In particular, the use of CUNFFT (NFFT based on CUDA) greatly reduces the reciprocal-space computation. In order to reach the best performance of this method, we propose a procedure for the selection of optimal parameters with controlled accuracies. With the choice of suitable parameters, we show that our method is a good alternative to the standard Ewald method, with the same computational precision but dramatically higher computational efficiency.
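
    For orientation, the standard Ewald decomposition that ENUF evaluates is reproduced below in its minimal form (Gaussian units, standard notation); the reciprocal-space structure-factor sum is the part that the NFFT, and on GPUs the CUNFFT, accelerates.

    ```latex
    % Ewald split of the Coulomb energy: charges q_i, splitting parameter \alpha,
    % box volume V, reciprocal vectors \mathbf{k}, structure factor S(\mathbf{k}).
    E = \underbrace{\frac{1}{2}\sum_{i \neq j} q_i q_j
          \frac{\operatorname{erfc}(\alpha r_{ij})}{r_{ij}}}_{\text{real space}}
      + \underbrace{\frac{2\pi}{V} \sum_{\mathbf{k} \neq 0}
          \frac{e^{-k^2/4\alpha^2}}{k^2}\,\bigl|S(\mathbf{k})\bigr|^2}_{\text{reciprocal space}}
      - \underbrace{\frac{\alpha}{\sqrt{\pi}} \sum_i q_i^2}_{\text{self term}},
    \qquad S(\mathbf{k}) = \sum_j q_j\, e^{i\mathbf{k}\cdot\mathbf{r}_j}.
    ```

    Increasing α shifts work from the real-space sum (shorter neighbor lists) to the reciprocal-space sum (more k vectors), which is exactly the trade-off the proposed parameter-selection procedure balances.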

  14. Super-Sonograms and graphical seismic source locations: Facing the challenge of real-time data processing in an OSI SAMS installation

    Science.gov (United States)

    Joswig, Manfred

    2010-05-01

    The installation and operation of an OSI seismic aftershock monitoring system (SAMS) is bound by strict time constraints: 30+ small arrays must be set up within days, and data screening must cope with the daily seismogram input. This is a significant challenge since any potential, single ML -2.0 aftershock from a potential nuclear test must be detected and discriminated against a variety of higher-amplitude noise bursts. No automated approach can handle this task to date; thus some 200 traces of 24/7 data must be screened manually with a time resolution sufficient to recover signals of just a few sec duration, and with tiny amplitudes just above the threshold of ambient noise. Previous tests confirmed that this task cannot be performed by time-domain signal screening via established seismological processing software, e.g. PITSA, SEISAN, or GEOTOOLS. Instead, we introduced 'SonoView', a seismic diagnosis tool based on a compilation of array traces into super-sonograms. Several hours of cumulative array data can be displayed at once on a single computer screen - without sacrificing the necessary detectability of few-sec signals. Then 'TraceView' will guide the analyst to select the relevant traces with the best SNR, and 'HypoLine' offers interactive, graphical location tools for fast epicenter estimates and source signature identification. A previous release of this software suite was successfully applied at IFE08 in Kazakhstan, and supported the seismic sub-team of OSI in its timely report compilation.

  15. Application of graphics processing unit in general purpose computation

    Institute of Scientific and Technical Information of China (English)

    张健; 陈瑞

    2009-01-01

    Based on the CUDA (Compute Unified Device Architecture) of the GPU (graphics processing unit), the technical fundamentals and methods for general-purpose computation on the GPU are introduced. A matrix multiplication algorithm was run on a GeForce 8800 GT. As the matrix order increases, processing slows on both the CPU and the GPU; however, after the data quantity increased 100-fold, the computation time increased only 3.95-fold on the GPU, versus 216.66-fold on the CPU.
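
    A present-day analogue of that comparison can be written with NumPy and CuPy; this sketch assumes CuPy and a CUDA-capable GPU are available, and the matrix size is arbitrary.

    ```python
    import time
    import numpy as np
    import cupy as cp

    n = 4096
    a_cpu = np.random.rand(n, n).astype(np.float32)
    a_gpu = cp.asarray(a_cpu)

    # Warm-up to exclude one-time CUDA/cuBLAS initialization from the timing.
    cp.matmul(a_gpu, a_gpu)
    cp.cuda.Stream.null.synchronize()

    t0 = time.perf_counter()
    np.matmul(a_cpu, a_cpu)
    cpu_s = time.perf_counter() - t0

    t0 = time.perf_counter()
    cp.matmul(a_gpu, a_gpu)
    cp.cuda.Stream.null.synchronize()   # wait for the kernel to finish
    gpu_s = time.perf_counter() - t0

    print(f"speedup: {cpu_s / gpu_s:.1f}x")
    ```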

  16. Real-time electroholography using a multiple-graphics processing unit cluster system with a single spatial light modulator and the InfiniBand network

    Science.gov (United States)

    Niwase, Hiroaki; Takada, Naoki; Araki, Hiromitsu; Maeda, Yuki; Fujiwara, Masato; Nakayama, Hirotaka; Kakue, Takashi; Shimobaba, Tomoyoshi; Ito, Tomoyoshi

    2016-09-01

    Parallel calculations of large-pixel-count computer-generated holograms (CGHs) are suitable for multiple-graphics processing unit (multi-GPU) cluster systems. However, it is not easy for a multi-GPU cluster system to accomplish fast CGH calculations when CGH transfers between PCs are required. In these cases, the CGH transfer between the PCs becomes a bottleneck. Usually, this problem occurs only in multi-GPU cluster systems with a single spatial light modulator. To overcome this problem, we propose a simple method using the InfiniBand network. The computational speed of the proposed method using 13 GPUs (NVIDIA GeForce GTX TITAN X) was more than 3000 times faster than that of a CPU (Intel Core i7 4770) when the number of three-dimensional (3-D) object points exceeded 20,480. In practice, we achieved ~40 tera floating point operations per second (TFLOPS) when the number of 3-D object points exceeded 40,960. Our proposed method was able to reconstruct a real-time movie of a 3-D object comprising 95,949 points.
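
    The per-point workload that such clusters divide up is the textbook point-source hologram kernel sketched below; the wavelength, pixel pitch, and random object points are illustrative (not necessarily the cited system's exact formulation), and real implementations evaluate this as one CUDA thread per hologram pixel.

    ```python
    import numpy as np

    wavelength, pitch, N = 532e-9, 8e-6, 256
    rng = np.random.default_rng(0)
    pts = rng.uniform([-1e-3, -1e-3, 0.05], [1e-3, 1e-3, 0.15], size=(100, 3))

    x = (np.arange(N) - N / 2) * pitch
    X, Y = np.meshgrid(x, x)
    k = 2 * np.pi / wavelength
    field = np.zeros((N, N), dtype=np.complex128)
    for px, py, pz in pts:              # object points are independent work units
        r = np.sqrt((X - px) ** 2 + (Y - py) ** 2 + pz ** 2)
        field += np.exp(1j * k * r)     # spherical wave from one 3-D point
    hologram = np.angle(field)          # phase-only CGH for an SLM
    ```

    The cost grows as (object points) x (hologram pixels), which is why tens of thousands of points already demand a multi-GPU cluster for real-time frame rates.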

  17. Geometric Correction of Remote Sensing Images Based on the Graphics Processing Unit

    Institute of Scientific and Technical Information of China (English)

    陈超; 陈彬; 孟剑萍

    2012-01-01

    A method for fusing remote sensing images with two-dimensional (2D) topographic maps of the same area at different scales is introduced. The method includes techniques such as geometric correction and resampling. An approach implemented on the graphics processing unit (GPU) in a Linux environment performs the geometric correction of the remote sensing imagery and the fusion of textured remote sensing images with 2D topographic maps, thereby extending the presentation of traditional topographic maps and improving their display on computers.
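
    As a rough CPU-side illustration of the geometric correction and resampling steps (the paper's GPU implementation is not reproduced here), the following sketch applies an assumed inverse affine mapping with bilinear resampling via SciPy:

```python
# Warp an image by an inverse affine transform with bilinear resampling.
import numpy as np
from scipy.ndimage import map_coordinates

def warp_affine(image, inv_affine, out_shape):
    """Resample `image` so output pixel (r, c) takes its value from the
    source location inv_affine @ (r, c, 1), with bilinear weights."""
    rows, cols = np.meshgrid(np.arange(out_shape[0]),
                             np.arange(out_shape[1]), indexing="ij")
    coords = np.stack([rows, cols, np.ones_like(rows)]).reshape(3, -1)
    src = inv_affine @ coords                  # source (row, col) per pixel
    # order=1 -> bilinear resampling, as named in the abstract
    return map_coordinates(image, src.reshape(2, *out_shape),
                           order=1, cval=0.0)

img = np.random.rand(512, 512).astype(np.float32)   # stand-in for imagery
inv_A = np.array([[0.9, 0.1, 5.0],                  # assumed inverse mapping,
                  [-0.1, 0.9, -3.0]])               # e.g. fitted from GCPs
corrected = warp_affine(img, inv_A, (512, 512))
```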

  18. Graphical Modeling Language Tool

    NARCIS (Netherlands)

    Rumnit, M.

    2003-01-01

    A group in the EE-Math-CS faculty of the University of Twente is developing a graphical modeling language for specifying concurrency in software design. This graphical modeling language has a mathematical background based on the theory of CSP. This language contains the power to create trustworth

  19. Scientific Graphical Displays on the Macintosh

    Energy Technology Data Exchange (ETDEWEB)

    Grotch, S. [Lawrence Livermore National Lab., CA (United States)]

    1994-11-15

    In many organizations scientists have ready access to more than one computer, often both a workstation (e.g., SUN, HP, SGI) and a Macintosh or other PC. The scientist commonly uses the workstation for 'number-crunching' and data analysis, whereas the Macintosh is relegated to word processing or serves as a 'dumb terminal' to a larger mainframe computer. In an informal poll of my colleagues, very few used their Macintoshes for either statistical analysis or graphical data display. I believe this state of affairs is particularly unfortunate because over the last few years both the computational capability and, even more so, the software availability of the Macintosh have become quite formidable. In some instances, very powerful tools are now available on the Macintosh that may not exist (or may be far too costly) on the so-called 'high end' workstations. Many scientists are simply unaware of the wealth of extremely useful, 'off-the-shelf' software that already exists on the Macintosh for scientific graphical and statistical analysis.

  20. Enzymatic corn wet milling: engineering process and cost model

    Science.gov (United States)

    Enzymatic Corn Wet Milling (E-Milling) is a proposed alternative process to conventional wet milling for the recovery and purification of starch and coproducts using proteases to eliminate the need for sulfites and to decrease the steeping time. In 2005, the total starch production in USA by conven...

  1. "Cost in Transliteration": The Neurocognitive Processing of Romanized Writing

    Science.gov (United States)

    Rao, Chaitra; Mathur, Avantika; Singh, Nandini C.

    2013-01-01

    Romanized transliteration is widely used in internet communication and global commerce, yet we know little about its behavioural and neural processing. Here, we show that Romanized text imposes a significant neurocognitive load. Readers faced greater difficulty in identifying concrete words written in Romanized transliteration (Romanagari)…

  2. Cost reductions of fuel cells for transport applications: fuel processing options

    Science.gov (United States)

    Teagan, W. P.; Bentley, J.; Barnett, B.

    The highly favorable efficiency/environmental characteristics of fuel cell technologies have now been verified by virtue of recent and ongoing field experience. The key issue regarding the timing and extent of fuel cell commercialization is the ability to reduce costs to acceptable levels in both stationary and transport applications. It is increasingly recognized that the fuel processing subsystem can have a major impact on overall system costs, particularly as ongoing R&D efforts result in reduction of the basic cost structure of stacks, which currently dominate system costs. The fuel processing subsystem for polymer electrolyte membrane fuel cell (PEMFC) technology, which is the focus of transport applications, includes the reformer, shift reactors, and means for CO reduction. In addition to low cost, transport applications require a fuel processor that is compact and can start rapidly. This paper describes the impact of factors such as fuel choice, operating temperature, material selection, catalyst requirements, and controls on the cost of fuel processing systems. There are fuel processor technology paths which manufacturing cost analyses indicate are consistent with fuel processor subsystem costs of under $150/kW in stationary applications and $30/kW in transport applications. As such, the costs of mature fuel processing subsystem technologies should be consistent with their use in commercially viable fuel cell systems in both application categories.

  3. Thermo-ecological cost (TEC) evaluation of metallurgical processes

    Directory of Open Access Journals (Sweden)

    W. Stanek

    2015-01-01

    Metallurgy represents a complex production system transforming non-renewable fuel and mineral resources. The effectiveness of resource management in metallurgical chains depends on the applied ore grade and on the irreversibility of the system's components. The thermo-ecological cost (TEC) can be applied to measure the influence of metallurgy on the depletion of natural resources. The paper discusses the possibility of applying TEC in metallurgy and presents an illustrative example concerning the blast-furnace process.
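
    The abstract does not include the TEC equations; as a hedged sketch, TEC values are conventionally obtained from a linear balance in which each product's cost equals its direct non-renewable exergy consumption plus the TEC embodied in inputs from interlinked branches. All numbers below are illustrative, not the paper's data:

```python
# Solve the linear TEC balance rho = b + A^T rho, i.e. (I - A^T) rho = b.
import numpy as np

# a[i, j]: units of product i consumed per unit of product j (assumed)
A = np.array([[0.00, 0.20],     # e.g. coke used by the blast furnace
              [0.05, 0.00]])    # e.g. blast-furnace gas returned to coking
b = np.array([30.0, 12.0])      # direct non-renewable exergy, MJ/unit

rho = np.linalg.solve(np.eye(2) - A.T, b)
print(rho)   # TEC of each product, MJ of non-renewable exergy per unit
```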

  4. Low cost materials of construction for biological processes: Proceedings

    Energy Technology Data Exchange (ETDEWEB)

    1993-05-13

    The workshop was held in May 1993 in conjunction with the 15th Symposium on Biotechnology for Fuels and Chemicals. The purpose of this workshop was to present information on the biomass-to-ethanol process in the context of materials selection and, through presentation and discussion, to identify promising avenues for future research. Six technical presentations were grouped into two sessions: process assessment and technology assessment. In the process assessment session, the group felt that the pretreatment area would require the most extensive materials research due to the complex chemical, physical, and thermal environment. Discussion centered around the possibility of metals being leached into the process stream and their effect on the fermentation mechanics. Linings were a strong option for pretreatment, assuming the economics were favorable. Fermentation was also considered an important area for research, due to the unique complex of compounds and dual phases present. Erosion in feedstock handling equipment was identified as a minor concern. In the technology assessment session, methodologies in corrosion analysis were presented in addition to an overview of current coatings/linings technology. Widely practiced testing strategies, including ASTM methods, as well as novel procedures for micro-analysis of corrosion were discussed. Various coatings and linings, including polymers and ceramics, were introduced. The prevailing recommendations for testing included keeping the testing simple until the problem warranted a more detailed approach and developing standardized testing procedures to ensure the data were reproducible and applicable. The need to evaluate currently available materials such as coatings/linings, carbon/stainless steels, or fiberglass-reinforced plastic was emphasized. It was agreed that economic evaluation of each material candidate must be an integral part of any research plan.

  5. DECENTRALIZED THERMOPHILIC BIOHYDROGEN: A MORE EFFICIENT AND COST EFFECTIVE PROCESS

    OpenAIRE

    Sani, Rajesh.K.; Rajesh V. Shende; Sudhir Kumar; Aditya Bhalla

    2011-01-01

    Nonfood lignocellulosic biomass is an ideal substrate for biohydrogen production. Avoiding pretreatment steps (acid, alkali, or enzymatic) has the potential to make the process economical. Utilization of regional untreated lignocellulosic biomass by cellulolytic and fermentative thermophiles in a consolidated mode using a single reactor is one way to achieve economical and sustainable biohydrogen production. Employing these potential microorganisms along with decentralized biohyd...

  6. Costs of Quality: Exploratory Analysis of Hidden Elements and Prioritization using Analytic Hierarchy Process

    Directory of Open Access Journals (Sweden)

    Sailaja A

    2015-02-01

    Cost of quality analysis has emerged as an effective tool for industrial managers for pinpointing deficiencies in the system as well as for identifying improvement areas by highlighting cost reduction opportunities. However, this analysis will be fully effective only if it is extended to identify the cost incurred in ensuring quality in all areas of the supply chain, including hidden costs and the costs of missed opportunities. Most of the hidden elements of quality costs are difficult to track and are not accounted for by traditional accounting tools. An exploratory analysis is made in this research to identify the hidden elements of quality costs in the manufacturing industry. The identified cost elements are then classified into groups for better analysis and finally prioritized to identify the vital few among them. The Analytic Hierarchy Process (AHP), one of the most popular multi-criteria decision-making (MCDM) techniques, and Pareto analysis were used in this study to prioritize the hidden quality cost elements based on their degree of impact on the overall cost of quality. Through this analysis, the key cost elements to be addressed in order to reduce the overall cost of quality are identified.
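
    A minimal sketch of the AHP prioritization step, using the standard principal-eigenvector weights and consistency check; the pairwise comparison matrix below is an invented example, not the study's data:

```python
# Derive AHP priority weights from a pairwise comparison matrix.
import numpy as np

P = np.array([[1.0, 3.0, 5.0],       # element 1 vs 2 vs 3 (Saaty 1-9 scale)
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])

eigvals, eigvecs = np.linalg.eig(P)
k = np.argmax(eigvals.real)
w = np.abs(eigvecs[:, k].real)
w /= w.sum()                         # normalized priority weights

n = P.shape[0]
CI = (eigvals[k].real - n) / (n - 1) # consistency index
CR = CI / 0.58                       # 0.58 = random index for n = 3
print(w, CR)                         # CR < 0.1 is conventionally acceptable
```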

  7. A Linux Workstation for High Performance Graphics

    Science.gov (United States)

    Geist, Robert; Westall, James

    2000-01-01

    The primary goal of this effort was to provide a low-cost method of obtaining high-performance 3-D graphics using an industry-standard library (OpenGL) on PC-class computers. Previously, users interested in doing substantial visualization or graphical manipulation were constrained to using specialized, custom hardware most often found in computers from Silicon Graphics (SGI). We provided an alternative to expensive SGI hardware by taking advantage of third-party 3-D graphics accelerators that have now become available at very affordable prices. To make use of this hardware, our goal was to provide a free, redistributable, and fully compatible OpenGL work-alike library so that existing bodies of code could simply be recompiled for PC-class machines running a free version of Unix. This should allow substantial cost savings while greatly expanding the population of people with access to a serious graphics development and viewing environment. It should also offer a means for NASA to provide a spectrum of graphics performance to its scientists, supplying high-end specialized SGI hardware for high-performance visualization while fulfilling the requirements of medium- and lower-performance applications with generic, off-the-shelf components, and still maintaining compatibility between the two.

  8. Graphics Processing Unit-Accelerated Nonrigid Registration of MR Images to CT Images During CT-Guided Percutaneous Liver Tumor Ablations.

    Science.gov (United States)

    Tokuda, Junichi; Plishker, William; Torabi, Meysam; Olubiyi, Olutayo I; Zaki, George; Tatli, Servet; Silverman, Stuart G; Shekhar, Raj; Hata, Nobuhiko

    2015-06-01

    Accuracy and speed are essential for the intraprocedural nonrigid magnetic resonance (MR) to computed tomography (CT) image registration in the assessment of tumor margins during CT-guided liver tumor ablations. Although both accuracy and speed can be improved by limiting the registration to a region of interest (ROI), manual contouring of the ROI prolongs the registration process substantially. To achieve accurate and fast registration without the use of an ROI, we combined a nonrigid registration technique based on volume subdivision with hardware acceleration using a graphics processing unit (GPU). We compared the registration accuracy and processing time of the GPU-accelerated volume subdivision-based nonrigid registration technique to the conventional nonrigid B-spline registration technique. Fourteen image data sets of preprocedural MR and intraprocedural CT images for percutaneous CT-guided liver tumor ablations were obtained. Each set of images was registered using the GPU-accelerated volume subdivision technique and the B-spline technique. Manual contouring of the ROI was used only for the B-spline technique. Registration accuracies (Dice similarity coefficient [DSC] and 95% Hausdorff distance [HD]) and total processing time, including contouring of ROIs and computation, were compared using a paired Student t test. Accuracies of the GPU-accelerated registrations and B-spline registrations, respectively, were 88.3 ± 3.7% versus 89.3 ± 4.9% (P = .41) for DSC and 13.1 ± 5.2 versus 11.4 ± 6.3 mm (P = .15) for HD. Total processing time of the GPU-accelerated registration and B-spline registration techniques was 88 ± 14 versus 557 ± 116 seconds (P < .001), while the two approaches required comparable computation time despite the difference in the complexity of the algorithms (P = .71). The GPU-accelerated volume subdivision technique was as accurate as the B-spline technique and required significantly less processing time. The GPU-accelerated volume subdivision technique may enable the implementation of nonrigid
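
    A small sketch of the two reported accuracy metrics, computed brute-force for binary masks on a common grid; the mask shapes and the implicit unit voxel spacing are assumptions:

```python
# Dice similarity coefficient and 95% Hausdorff distance for two masks.
import numpy as np

def dice(a, b):
    a, b = a.astype(bool), b.astype(bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def hd95(a, b):
    pa = np.argwhere(a)              # voxel coordinates of each mask
    pb = np.argwhere(b)
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    # 95th percentile of the directed surface distances, symmetrized
    return max(np.percentile(d.min(axis=1), 95),
               np.percentile(d.min(axis=0), 95))

a = np.zeros((64, 64), bool); a[20:40, 20:40] = True
b = np.zeros((64, 64), bool); b[22:42, 21:41] = True
print(f"DSC = {dice(a, b):.3f}, HD95 = {hd95(a, b):.1f} voxels")
```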

  9. [Process-oriented cost calculation in interventional radiology. A case study].

    Science.gov (United States)

    Mahnken, A H; Bruners, P; Günther, R W; Rasche, C

    2012-01-01

    Currently used costing methods such as cost centre accounting do not sufficiently reflect the process-based resource utilization in medicine. The goal of this study was to establish a process-oriented cost assessment of percutaneous radiofrequency (RF) ablation of liver and lung metastases. In each of 15 patients a detailed task analysis of the primary process of hepatic and pulmonary RF ablation was performed. Based on these data a dedicated cost calculation model was developed for each primary process. The costs of each process were computed and compared with the revenue for in-patients according to the German diagnosis-related groups (DRG) system 2010. The RF ablation of liver metastases in patients without relevant comorbidities and a low patient complexity level results in a loss of EUR 588.44, whereas the treatment of patients with a higher complexity level yields an acceptable profit. The treatment of pulmonary metastases is profitable even in cases of additional expenses due to complications. Process-oriented costing provides relevant information that is needed for understanding the economic impact of treatment decisions. It is well suited as a starting point for economically driven process optimization and reengineering. Under the terms of the German DRG 2010 system percutaneous RF ablation of lung metastases is economically reasonable, while RF ablation of liver metastases in cases of low patient complexity levels does not cover the costs.

  10. Study on efficiency of time computation in x-ray imaging simulation base on Monte Carlo algorithm using graphics processing unit

    Science.gov (United States)

    Setiani, Tia Dwi; Suprijadi; Haryanto, Freddy

    2016-03-01

    Monte Carlo (MC) is one of the most powerful techniques for simulation in x-ray imaging. The MC method can simulate radiation transport within matter with high accuracy and provides a natural way to simulate radiation transport in complex systems. One of the widely used MC codes for radiographic image simulation is MC-GPU, a code developed by Andreu Badal. This study aimed to investigate the computation time of x-ray imaging simulation on a GPU (graphics processing unit) compared to a standard CPU (central processing unit). Furthermore, the effect of physical parameters on the quality of radiographic images and the comparison of image quality resulting from simulation on the GPU and the CPU are evaluated in this paper. The simulations were run serially on a CPU and on two GPUs with 384 cores and 2304 cores. In the GPU simulations, each core calculates one photon, so a large number of photons are calculated simultaneously. Results show that simulations on the GPU were significantly faster than on the CPU. The simulations on the 2304-core GPU were about 64-114 times faster than on the CPU, while the simulations on the 384-core GPU were about 20-31 times faster than on a single CPU core. Another result shows that optimum image quality was obtained starting from 10^8 histories and at energies from 60 keV to 90 keV. Analyzed statistically, the quality of the GPU and CPU images is essentially the same.
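
    A toy sketch of the photon-per-core idea, assuming a single-interaction water slab rather than MC-GPU's full physics; vectorizing over photons stands in for the GPU's parallelism, and the attenuation coefficient is an assumed round number:

```python
# Monte Carlo estimate of photon transmission through an attenuating slab.
import numpy as np

rng = np.random.default_rng(0)
n_photons = 10**6                 # histories; the study scales this to 1e8
mu = 0.2                          # assumed linear attenuation, 1/cm
thickness = 5.0                   # slab thickness, cm

# Sample each photon's free path; the photon is transmitted if its first
# interaction would occur beyond the slab.
free_path = rng.exponential(1.0 / mu, size=n_photons)
transmitted = (free_path > thickness).mean()

print(f"MC transmission {transmitted:.4f} vs exact {np.exp(-mu*thickness):.4f}")
```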

  11. A Survey on Graphical Programming Systems

    Directory of Open Access Journals (Sweden)

    Gurudatt Kulkarni

    2014-04-01

    Recently there has been increasing interest in the use of graphics to help with the programming and understanding of computer systems. Graphical programming and program simulation are exciting areas of active computer science research that show promise for improving the programming process. An array of different design methodologies has arisen from research efforts, and many graphical programming systems have been developed to address both general programming tasks and specific application areas such as physical simulation and user interface design. This paper presents a survey of the field of graphical programming languages, starting with a historical overview of some of the pioneering efforts in the field. In addition, the paper presents different classifications of graphical programming languages.

  12. Realtime multi-plot graphics system

    Science.gov (United States)

    Shipkowski, Michael S.

    1990-01-01

    The increased complexity of test operations and customer requirements at Langley Research Center's National Transonic Facility (NTF) surpassed the capabilities of the initial realtime graphics system. The analysis of existing hardware and software and the enhancements made to develop a new realtime graphics system are described. The result of this effort is a cost-effective system, based on hardware already in place, that supports high-speed, high-resolution generation and display of multiple realtime plots. The enhanced graphics system (EGS) meets the current and foreseeable future realtime graphics requirements of the NTF. While this system was developed to support wind tunnel operations, the overall design and capability of the system are applicable to other realtime data acquisition systems that have realtime plot requirements.

  13. DECENTRALIZED THERMOPHILIC BIOHYDROGEN: A MORE EFFICIENT AND COST EFFECTIVE PROCESS

    Directory of Open Access Journals (Sweden)

    Rajesh K. Sani

    2011-11-01

    Nonfood lignocellulosic biomass is an ideal substrate for biohydrogen production. Avoiding pretreatment steps (acid, alkali, or enzymatic) has the potential to make the process economical. Utilization of regional untreated lignocellulosic biomass by cellulolytic and fermentative thermophiles in a consolidated mode using a single reactor is one way to achieve economical and sustainable biohydrogen production. Employing these potential microorganisms along with decentralized biohydrogen energy production will lead us towards regional and national independence, having a positive influence on the bioenergy sector.

  14. Application of the TDABC model in the logistics process using different capacity cost rates

    OpenAIRE

    Paulo Afonso; Alex Santana

    2016-01-01

    Purpose: Understanding logistics processes in terms of costs and profitability is a complex task, and there is a need for more research and applied work on these issues. In this research project, the concepts underlying Time-Driven Activity-Based Costing (TDABC) have been used in the context of logistics costs. Design/methodology/approach: A distribution centre of wood and carpentry-related materials has been studied. A multidisciplinary team was composed to support the project in...
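
    A minimal TDABC sketch under assumed figures: one capacity cost rate per resource group (cost of supplied capacity divided by practical capacity) and a time equation per activity, the two building blocks the paper applies to a distribution centre:

```python
# Capacity cost rates and a time equation, the core of TDABC.
resources = {
    # cost per period, practical capacity in minutes per period (assumed)
    "warehouse_staff": (12000.0, 9600.0),
    "forklifts":       (3000.0,  4800.0),
}
rates = {k: cost / minutes for k, (cost, minutes) in resources.items()}

def picking_cost(n_lines, heavy=False):
    """Time equation: fixed setup plus per-line handling time (assumed)."""
    minutes = 2.0 + 0.8 * n_lines + (1.5 * n_lines if heavy else 0.0)
    return minutes * rates["warehouse_staff"]

print(rates)
print(f"cost of a 10-line heavy order: {picking_cost(10, heavy=True):.2f}")
```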

  15. Application of the TDABC Model in the Logistics Process Using Different Capacity Cost Rates

    OpenAIRE

    Afonso, Paulo; Santana, Alex

    2016-01-01

    Purpose: Understanding logistics processes in terms of costs and profitability is a complex task, and there is a need for more research and applied work on these issues. In this research project, the concepts underlying Time-Driven Activity-Based Costing (TDABC) have been used in the context of logistics costs. Design/methodology/approach: A distribution centre of wood and carpentry-related materials has been studied. A multidisciplinary team was composed to support the...

  16. Improving product cost and schedule management in a garment product development process

    OpenAIRE

    Becker, Lotta

    2016-01-01

    The purpose of this research is to improve the beginning stages of a garment product development process at Marimekko Oyj, a Finnish design company, by finding better ways to manage the product cost and schedule. The main objective is to bring the final cost of garments closer to the original target cost in order to keep margins at required levels. Action research and case study were used as research methods. During the action research cycles, two collections were started and observed. Th...

  17. Structure Analysis of the Graphic Simulator for the PRIDE Equipment

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Chang Hoi; Kim, Seong Hyun; Park, Byung Suk; Lee, Jong Kwang; Lee, Hyo Jik; Kim, Ki Ho [KAERI, Daejeon (Korea, Republic of)

    2010-12-15

    Simulation technology based on computer graphics can minimize trial and error and dramatically reduce development cost and time at the design stage of pyroprocessing facility construction and equipment development. For this purpose, a 3D graphic simulation program named HotCell has been developed. HotCell has been continuously updated with functional additions and bug fixes, and has now reached its third version. The digital mock-up of PRIDE is furnished with an MSM (master-slave manipulator), a BDSM (bridge-transported dual-arm servo manipulator), and a crane for the remote handling of the processing equipment. The HotCell program can interface with a 3D mouse, a haptic device, and a joystick for realistic operation of these devices. The posture of the MSM can be recorded with simple keyboard operations in order to reproduce the behavior of the MSM.
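
    A hypothetical sketch of the record-and-replay idea described above: store time-stamped joint postures on a key press and feed them back to the simulator later. The class name and six-joint layout are invented, not HotCell's API:

```python
# Record manipulator postures and replay them in order.
import time
from dataclasses import dataclass, field

@dataclass
class PostureLog:
    frames: list = field(default_factory=list)

    def record(self, joints):
        """Snapshot the current joint angles (radians) with a timestamp."""
        self.frames.append((time.time(), tuple(joints)))

    def replay(self):
        for stamp, joints in self.frames:
            yield joints          # feed each stored posture to the simulator

log = PostureLog()
log.record([0.0, 0.5, -0.2, 0.0, 1.1, 0.0])
for pose in log.replay():
    print(pose)
```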

  18. Factors determining cost and quality of the electrical insulation in the VPI-process

    Energy Technology Data Exchange (ETDEWEB)

    Bruetsch, R.; Allison, J.; Thaler, T. [Von Roll Isola, Breitenbach (Switzerland)

    1996-12-31

    The construction of the electrical insulation and the carrying out of the VPI-process are critical steps in the production of rotating high-voltage machines. On the other hand, the manufacture of the insulation and the VPI-process are cost factors. It is therefore important to know the factors influencing the cost and quality of the insulation in the VPI-process, in order to determine the optimal production parameters and to achieve high reliability of the resulting machine. This article gives an overview of the relevant factors and some considerations regarding costs.

  19. Quality Costs (IRR) Impact on Lot Size Considering Work in Process Inventory

    Directory of Open Access Journals (Sweden)

    Misbah Ullah

    2014-06-01

    The economic order quantity and production quantity models assume that production processes are error-free. However, variations exist in processes, resulting in imperfections, particularly in machining-intensive environments. Process variations produce nonconformities that increase quality costs in the form of rework, rejects, and the quality-control techniques implemented to ensure delivery of quality products. This paper develops an inventory model that incorporates inspection, rework, and rejection (IRR) quality costs into the optimum lot-size calculation, with a focus on work-in-process inventory. A mathematical model for the optimum lot size is derived analytically by minimizing the average cost function. The newly developed model (GTOQIRR) assumes an imperfect production environment. Numerical examples illustrate the significant effect of quality costs in the proposed model compared with previously developed models. The proposed model is recommended for quality-focused, machining-intensive manufacturing environments that carry work-in-process inventories.
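
    The GTOQIRR cost function is not given in the abstract; the hedged sketch below minimizes a generic average annual cost with setup, holding, and lot-size-dependent quality terms, merely to show the shape of the trade-off against the classic EOQ. All coefficients are invented:

```python
# Lot-size trade-off when defects (and hence rework/reject costs) grow
# with lot size, compared against the classic EOQ.
import numpy as np

D, S, h = 50000.0, 400.0, 2.5          # demand/yr, setup cost, holding $/unit/yr
p0, k = 0.01, 1.5e-6                   # defect rate grows with lot size (assumed)
c_rework, c_reject = 6.0, 15.0         # unit rework / reject costs (assumed)
share_rework = 0.7                     # assumed split of nonconforming units

Q = np.arange(100, 20001, 50, dtype=float)
setup = S * D / Q                      # setup cost per year
holding = h * Q / 2                    # cycle + WIP holding, simplified
defect = p0 + k * Q                    # the process drifts over long runs
quality = D * defect * (share_rework * c_rework
                        + (1 - share_rework) * c_reject)

total = setup + holding + quality
Q_star = Q[np.argmin(total)]
eoq = np.sqrt(2 * D * S / h)
print(f"lot size with quality costs ~ {Q_star:.0f} vs classic EOQ {eoq:.0f}")
```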

  20. Collaborative Learning Processes in Teacher Training: Benefits and Costs

    Directory of Open Access Journals (Sweden)

    Ellen Aschermann

    2015-09-01

    The current pedagogical discussion emphasizes self-determined and cooperative forms of learning. The theoretical background stems from constructivist theories of learning, which interpret social exchange and reflection on one's learning pathway as crucial to construction processes. Consequently, self-regulation becomes a central condition for scholarly learning. This includes setting goals, planning and conducting the learning process, as well as evaluating the results. This paper focuses on the processes secondary school teachers use to implement newly acquired knowledge about self-regulated learning in their lessons. By means of qualitative research methodology, we wanted to explore how experienced teachers develop their own abilities for self-determined learning while teaching their pupils to do so. Method: In a transdisciplinary research project between a secondary school and a university, a group of mathematics teachers participated in a two-day training course on self-regulated learning, during which they discussed and developed ways to enhance the self-regulation of learners. They were then tasked with implementing their newly acquired knowledge during their maths lessons (270 pupils, 4-5 lessons per week with eighth graders) over a twelve-week teaching period. Two separate groups of teachers were asked to put their newly acquired skills into practice. The first group (N = 6) used a collaborative setting in which tasks, teaching activities, and performance reviews had been clarified and discussed within the group. The second group (N = 4) fulfilled these tasks individually without seeking assistance. The teaching strategies were assessed by means of multiple semi-structured interviews and observations of classroom activities. Semi-structured interviews were conducted with the teachers prior to the training and after 12 and 32 weeks, respectively, to explore their understanding of self-regulated learning