Fast parallel event reconstruction
CERN. Geneva
2010-01-01
On-line processing of large data volumes produced in modern HEP experiments requires using the maximum capabilities of modern and future many-core CPU and GPU architectures. One such powerful feature is a SIMD instruction set, which allows packing several data items into one register and operating on all of them at once, thus achieving more operations per clock cycle. Motivated by the idea of using the SIMD unit of modern processors, the KF-based track fit has been adapted for parallelism, including memory optimization, numerical analysis, vectorization with inline operator overloading, and optimization using SDKs. The speed of the algorithm has been increased by a factor of 120000, reaching 0.1 ms/track, running in parallel on 16 SPEs of a Cell Blade computer. Running on a Nehalem CPU with 8 cores, it shows a processing speed of 52 ns/track using the Intel Threading Building Blocks. The same KF algorithm running on an Nvidia GTX 280 in the CUDA framework provi...
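The idea of applying one operation to several packed data items can be sketched as follows. This is a minimal illustration, not the track fit from the abstract: it assumes a scalar state per track and a direct measurement of that state, and NumPy array operations stand in for SIMD lanes. The function name `kalman_update_batch` is hypothetical.

```python
import numpy as np

def kalman_update_batch(x, P, z, R):
    """One scalar Kalman measurement update applied to many tracks at once.

    x, P: state estimates and their variances, one entry per track.
    z, R: measurements and measurement variances, one entry per track.
    Assumes the measurement observes the state directly (H = 1).
    """
    K = P / (P + R)            # Kalman gain, computed for all tracks together
    x_new = x + K * (z - x)    # corrected state estimates
    P_new = (1.0 - K) * P      # reduced variances
    return x_new, P_new

# Four tracks processed by the same arithmetic, as in a 4-wide SIMD register.
x = np.zeros(4, dtype=np.float32)
P = np.ones(4, dtype=np.float32)
z = np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32)
R = np.ones(4, dtype=np.float32)
x_new, P_new = kalman_update_batch(x, P, z, R)
```

Single precision is used here because more 32-bit values fit into one register than 64-bit values; whether that precision suffices is a question of numerical analysis for the actual fit.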
Non-Cartesian parallel imaging reconstruction.
Wright, Katherine L; Hamilton, Jesse I; Griswold, Mark A; Gulani, Vikas; Seiberlich, Nicole
2014-11-01
Non-Cartesian parallel imaging has played an important role in reducing data acquisition time in MRI. The use of non-Cartesian trajectories can enable more efficient coverage of k-space, which can be leveraged to reduce scan times. These trajectories can be undersampled to achieve even faster scan times, but the resulting images may contain aliasing artifacts. Just as Cartesian parallel imaging can be used to reconstruct images from undersampled Cartesian data, non-Cartesian parallel imaging methods can mitigate aliasing artifacts by using additional spatial encoding information in the form of the nonhomogeneous sensitivities of multi-coil phased arrays. This review will begin with an overview of non-Cartesian k-space trajectories and their sampling properties, followed by an in-depth discussion of several selected non-Cartesian parallel imaging algorithms. Three representative non-Cartesian parallel imaging methods will be described, including Conjugate Gradient SENSE (CG SENSE), non-Cartesian generalized autocalibrating partially parallel acquisition (GRAPPA), and Iterative Self-Consistent Parallel Imaging Reconstruction (SPIRiT). After a discussion of these three techniques, several potential promising clinical applications of non-Cartesian parallel imaging will be covered. © 2014 Wiley Periodicals, Inc.
Fast parallel algorithm for CT image reconstruction.
Flores, Liubov A; Vidal, Vicent; Mayo, Patricia; Rodenas, Francisco; Verdú, Gumersindo
2012-01-01
In X-ray computed tomography (CT), X-rays are used to obtain the projection data needed to generate an image of the inside of an object. The image can be generated with different techniques. Iterative methods are more suitable for the reconstruction of images with high contrast and precision in noisy conditions and from a small number of projections. Their use may be important in portable scanners for their functionality in emergency situations. However, in practice, these methods are not widely used due to the high computational cost of their implementation. In this work we analyze iterative parallel image reconstruction with the Portable, Extensible Toolkit for Scientific Computation (PETSc).
Sparse BLIP: BLind Iterative Parallel imaging reconstruction using compressed sensing.
She, Huajun; Chen, Rong-Rong; Liang, Dong; DiBella, Edward V R; Ying, Leslie
2014-02-01
To develop a sensitivity-based parallel imaging reconstruction method that iteratively reconstructs both the coil sensitivities and the MR image simultaneously based on their prior information. The parallel magnetic resonance imaging reconstruction problem can be formulated as a multichannel sampling problem where solutions are sought analytically. However, the channel functions given by the coil sensitivities in parallel imaging are not known exactly, and the estimation error usually leads to artifacts. In this study, we propose a new reconstruction algorithm, termed Sparse BLind Iterative Parallel, for blind iterative parallel imaging reconstruction using compressed sensing. The proposed algorithm reconstructs both the sensitivity functions and the image simultaneously from undersampled data. It enforces the sparseness constraint in the image as done in compressed sensing, but differs from compressed sensing in that the sensing matrix is unknown and an additional constraint is enforced on the sensitivities as well. Both phantom and in vivo imaging experiments were carried out with retrospective undersampling to evaluate the performance of the proposed method. Experiments show improvement in Sparse BLind Iterative Parallel reconstruction when compared with Sparse SENSE, JSENSE, IRGN-TV, and L1-SPIRiT reconstructions with the same number of measurements. The proposed Sparse BLind Iterative Parallel algorithm reduces the reconstruction errors when compared to the state-of-the-art parallel imaging methods. Copyright © 2013 Wiley Periodicals, Inc.
Parallel Algorithm for Reconstruction of TAC Images
International Nuclear Information System (INIS)
Vidal Gimeno, V.
2012-01-01
Algebraic reconstruction methods are based on solving a system of linear equations. A previous study used the PETSc library and showed that this scientific computing tool facilitates and enables the optimal use of a computer system in the image reconstruction process.
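To illustrate what solving a linear system means for algebraic reconstruction, the sketch below uses the classical Kaczmarz row-projection iteration on a tiny dense system. This is a swapped-in textbook technique chosen for brevity; it is not the PETSc-based solver of the abstract, and the matrix `A`, the vector `x_true`, and the function name `kaczmarz` are made up for the example.

```python
import numpy as np

def kaczmarz(A, b, sweeps=50):
    """Solve A x = b by cyclic projection onto one equation (row) at a time."""
    x = np.zeros(A.shape[1])
    for _ in range(sweeps):
        for a_i, b_i in zip(A, b):
            # Project the current estimate onto the hyperplane a_i . x = b_i
            x += (b_i - a_i @ x) / (a_i @ a_i) * a_i
    return x

# Toy system: each row plays the role of one ray summing some image pixels.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
x_true = np.array([1.0, 2.0, 3.0])   # hypothetical pixel values
b = A @ x_true                       # noise-free projection data
x_rec = kaczmarz(A, b, sweeps=200)
```

In a real scanner geometry the system matrix is very large and sparse, which is why a library with distributed sparse matrices and solvers becomes attractive.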
Parallel CT image reconstruction based on GPUs
International Nuclear Information System (INIS)
Flores, Liubov A.; Vidal, Vicent; Mayo, Patricia; Rodenas, Francisco; Verdú, Gumersindo
2014-01-01
In X-ray computed tomography (CT), iterative methods are more suitable for the reconstruction of images with high contrast and precision in noisy conditions from a small number of projections. However, in practice, these methods are not widely used due to the high computational cost of their implementation. Current technology makes it possible to reduce this drawback effectively. The goal of this work is to develop a fast GPU-based algorithm to reconstruct high quality images from undersampled and noisy projection data. - Highlights: • We developed a GPU-based iterative algorithm to reconstruct images. • Iterative algorithms are capable of reconstructing images from an undersampled set of projections. • The computational cost of the implementation of the developed algorithm is low. • The efficiency of the algorithm increases for large scale problems.
A SPECT reconstruction method for extending parallel to non-parallel geometries
International Nuclear Information System (INIS)
Wen Junhai; Liang Zhengrong
2010-01-01
Due to its simplicity, parallel-beam geometry is usually assumed for the development of image reconstruction algorithms. The established reconstruction methodologies are then extended to fan-beam, cone-beam and other non-parallel geometries for practical application. This situation occurs for quantitative SPECT (single photon emission computed tomography) imaging in inverting the attenuated Radon transform. Novikov reported an explicit parallel-beam formula for the inversion of the attenuated Radon transform in 2000. Thereafter, a formula for fan-beam geometry was reported by Bukhgeim and Kazantsev (2002 Preprint N. 99 Sobolev Institute of Mathematics). At the same time, we presented a formula for varying focal-length fan-beam geometry. Sometimes, the reconstruction formula is so implicit that we cannot obtain the explicit reconstruction formula in the non-parallel geometries. In this work, we propose a unified reconstruction framework for extending parallel-beam geometry to any non-parallel geometry using ray-driven techniques. Studies by computer simulations demonstrated the accuracy of the presented unified reconstruction framework for extending parallel-beam to non-parallel geometries in inverting the attenuated Radon transform.
Roerdink, J.B.T.M.; Westenberg, M.A
1998-01-01
We consider the parallelization of two standard 2D reconstruction algorithms, filtered backprojection and direct Fourier reconstruction, using the data-parallel programming style. The algorithms are implemented on a Connection Machine CM-5 with 16 processors and a peak performance of 2 Gflop/s.
Computational acceleration for MR image reconstruction in partially parallel imaging.
Ye, Xiaojing; Chen, Yunmei; Huang, Feng
2011-05-01
In this paper, we present a fast numerical algorithm for solving total variation and l(1) (TVL1) based image reconstruction with application in partially parallel magnetic resonance imaging. Our algorithm uses variable splitting method to reduce computational cost. Moreover, the Barzilai-Borwein step size selection method is adopted in our algorithm for much faster convergence. Experimental results on clinical partially parallel imaging data demonstrate that the proposed algorithm requires much fewer iterations and/or less computational cost than recently developed operator splitting and Bregman operator splitting methods, which can deal with a general sensing matrix in reconstruction framework, to get similar or even better quality of reconstructed images.
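The Barzilai-Borwein step size rule mentioned above can be sketched on a plain least-squares problem. This is only an illustration of the step size selection, under the assumption of a smooth objective; it omits the total variation and l1 terms and the variable splitting of the paper. The names `bb_gradient_descent`, `A`, and `b` are hypothetical.

```python
import numpy as np

def bb_gradient_descent(grad, x0, iters=100, step0=1e-3, tol=1e-10):
    """Gradient descent whose step size follows the Barzilai-Borwein rule."""
    x = x0.copy()
    g = grad(x)
    step = step0
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        x_next = x - step * g
        g_next = grad(x_next)
        s, y = x_next - x, g_next - g     # change in iterate and in gradient
        sy = s @ y
        if sy > 0:
            step = (s @ s) / sy           # BB step: a scalar approximation of the inverse Hessian
        x, g = x_next, g_next
    return x

# Least-squares objective 0.5 * ||A x - b||^2 with gradient A^T (A x - b).
A = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = A @ np.array([1.0, -2.0])
x_min = bb_gradient_descent(lambda x: A.T @ (A @ x - b), np.zeros(2))
```

The rule needs only the two most recent iterates and gradients, so it adds almost no cost per iteration compared with a fixed step.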
High spatial resolution CT image reconstruction using parallel computing
International Nuclear Information System (INIS)
Yin Yin; Liu Li; Sun Gongxing
2003-01-01
Using a PC cluster system with 16 dual-CPU nodes, we accelerate the FBP and OR-OSEM reconstruction of high spatial resolution images (2048 x 2048). Based on the number of projections, we rewrite the reconstruction algorithms in parallel form and dispatch the tasks to each CPU. With parallel computing, the speedup factor is roughly equal to the number of CPUs, reaching about 25 when 25 CPUs are used. This technique is very suitable for real-time high spatial resolution CT image reconstruction. (authors)
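Dividing the work by projections can be sketched as follows: each worker backprojects its own subset of views, and the partial images are summed. This is a simplified illustration that assumes parallel-beam geometry, omits the filtering step of FBP, and uses threads on one machine rather than a cluster. The function names are hypothetical.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def backproject(sinogram, angles, size):
    """Unfiltered parallel-beam backprojection of the given views onto a size x size grid."""
    coords = np.arange(size) - (size - 1) / 2.0
    X, Y = np.meshgrid(coords, coords)
    det = np.arange(sinogram.shape[1]) - (sinogram.shape[1] - 1) / 2.0
    image = np.zeros((size, size))
    for view, theta in zip(sinogram, angles):
        s = X * np.cos(theta) + Y * np.sin(theta)   # detector coordinate hit by each pixel
        image += np.interp(s.ravel(), det, view).reshape(s.shape)
    return image

def backproject_parallel(sinogram, angles, size, workers=4):
    """Split the views into chunks, backproject each chunk in a worker, sum the results."""
    chunks = np.array_split(np.arange(len(angles)), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda idx: backproject(sinogram[idx], angles[idx], size), chunks)
        return sum(parts)

rng = np.random.default_rng(0)
angles = np.linspace(0.0, np.pi, 12, endpoint=False)
sinogram = rng.random((12, 32))                     # synthetic data, for illustration only
serial = backproject(sinogram, angles, 16)
parallel = backproject_parallel(sinogram, angles, 16, workers=4)
```

Because every view contributes independently to the image, the chunks need no communication until the final sum, which is consistent with a speedup close to the number of CPUs.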
Improving parallel imaging by jointly reconstructing multi-contrast data.
Bilgic, Berkin; Kim, Tae Hyung; Liao, Congyu; Manhard, Mary Kate; Wald, Lawrence L; Haldar, Justin P; Setsompop, Kawin
2018-08-01
To develop parallel imaging techniques that simultaneously exploit coil sensitivity encoding, image phase prior information, similarities across multiple images, and complementary k-space sampling for highly accelerated data acquisition. We introduce joint virtual coil (JVC)-generalized autocalibrating partially parallel acquisitions (GRAPPA) to jointly reconstruct data acquired with different contrast preparations, and show its application in 2D, 3D, and simultaneous multi-slice (SMS) acquisitions. We extend the joint parallel imaging concept to exploit limited support and smooth phase constraints through Joint (J-) LORAKS formulation. J-LORAKS allows joint parallel imaging from limited autocalibration signal region, as well as permitting partial Fourier sampling and calibrationless reconstruction. We demonstrate highly accelerated 2D balanced steady-state free precession with phase cycling, SMS multi-echo spin echo, 3D multi-echo magnetization-prepared rapid gradient echo, and multi-echo gradient recalled echo acquisitions in vivo. Compared to conventional GRAPPA, proposed joint acquisition/reconstruction techniques provide more than 2-fold reduction in reconstruction error. JVC-GRAPPA takes advantage of additional spatial encoding from phase information and image similarity, and employs different sampling patterns across acquisitions. J-LORAKS achieves a more parsimonious low-rank representation of local k-space by considering multiple images as additional coils. Both approaches provide dramatic improvement in artifact and noise mitigation over conventional single-contrast parallel imaging reconstruction. Magn Reson Med 80:619-632, 2018. © 2018 International Society for Magnetic Resonance in Medicine.
QR-decomposition based SENSE reconstruction using parallel architecture.
Ullah, Irfan; Nisar, Habab; Raza, Haseeb; Qasim, Malik; Inam, Omair; Omer, Hammad
2018-04-01
Magnetic Resonance Imaging (MRI) is a powerful medical imaging technique that provides essential clinical information about the human body. One major limitation of MRI is its long scan time. Implementation of advanced MRI algorithms on a parallel architecture (to exploit inherent parallelism) has great potential to reduce the scan time. Sensitivity Encoding (SENSE) is a Parallel Magnetic Resonance Imaging (pMRI) algorithm that utilizes receiver coil sensitivities to reconstruct MR images from the acquired under-sampled k-space data. At the heart of SENSE lies the inversion of a rectangular encoding matrix. This work presents a novel implementation of a GPU-based SENSE algorithm, which employs QR decomposition for the inversion of the rectangular encoding matrix. For a fair comparison, the performance of the proposed GPU-based SENSE reconstruction is evaluated against single- and multi-core CPU implementations using OpenMP. Several experiments against various acceleration factors (AFs) are performed using multichannel (8, 12 and 30) phantom and in-vivo human head and cardiac datasets. Experimental results show that the GPU significantly reduces the computation time of SENSE reconstruction as compared to a multi-core CPU (approximately 12x speedup) and a single-core CPU (approximately 53x speedup) without any degradation in the quality of the reconstructed images. Copyright © 2018 Elsevier Ltd. All rights reserved.
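Inverting a rectangular encoding matrix with a QR decomposition amounts to solving a least-squares problem. The sketch below shows that step on the CPU with NumPy, assuming a hypothetical encoding matrix with 8 coils and 2 aliased pixels and random complex entries. It is not the GPU implementation of the paper, and `solve_via_qr` is an illustrative name.

```python
import numpy as np

def solve_via_qr(E, y):
    """Least-squares solution of E x = y for a tall matrix E, using E = Q R."""
    Q, R = np.linalg.qr(E)                       # reduced QR; R is square and upper triangular
    return np.linalg.solve(R, Q.conj().T @ y)    # solve R x = Q^H y

rng = np.random.default_rng(1)
n_coils, n_aliased = 8, 2                        # assumed sizes for the example
E = rng.normal(size=(n_coils, n_aliased)) + 1j * rng.normal(size=(n_coils, n_aliased))
x_true = np.array([1.0 + 2.0j, -0.5 + 0.25j])    # the two pixel values folded onto each other
y = E @ x_true                                   # aliased pixel value seen by each coil
x_rec = solve_via_qr(E, y)
```

In SENSE such a small system is solved independently for every aliased pixel location, which is the inherent parallelism a GPU can exploit.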
Parallel computing for event reconstruction in high-energy physics
International Nuclear Information System (INIS)
Wolbers, S.
1993-01-01
Parallel computing has been recognized as a solution to large computing problems. In High Energy Physics, offline event reconstruction of detector data is a very large computing problem that has been solved with parallel computing techniques. A review of the parallel programming package CPS (Cooperative Processes Software) developed and used at Fermilab for offline reconstruction of Terabytes of data requiring the delivery of hundreds of Vax-Years per experiment is given. The Fermilab UNIX farms, consisting of 180 Silicon Graphics workstations and 144 IBM RS6000 workstations, are used to provide the computing power for the experiments. Fermilab has had a long history of providing production parallel computing starting with the ACP (Advanced Computer Project) Farms in 1986. The Fermilab UNIX Farms have been in production for over 2 years with 24 hour/day service to experimental user groups. Additional tools for management, control and monitoring of these large systems will be described. Possible future directions for parallel computing in High Energy Physics will be given.
Instrument Variables for Reducing Noise in Parallel MRI Reconstruction
Directory of Open Access Journals (Sweden)
Yuchou Chang
2017-01-01
Generalized autocalibrating partially parallel acquisition (GRAPPA) has been a widely used parallel MRI technique. However, noise deteriorates the reconstructed image when the reduction factor increases, or even at a low reduction factor for some noisy datasets. Noise, initially generated by the scanner, propagates noise-related errors during the fitting and interpolation procedures of GRAPPA and distorts the final reconstructed image quality. The basic idea we propose to improve GRAPPA is to remove noise from a system identification perspective. In this paper, we first analyze the GRAPPA noise problem from a noisy input-output system perspective; then, a new framework based on the errors-in-variables (EIV) model is developed for analyzing the noise generation mechanism in GRAPPA and designing a concrete method, instrument variables (IV) GRAPPA, to remove noise. The proposed EIV framework opens the possibility that noiseless GRAPPA reconstruction could be achieved by existing methods that solve the EIV problem other than the IV method. Experimental results show that the proposed reconstruction algorithm removes noise better than conventional GRAPPA, as validated with both phantom and in vivo brain data.
Parallelization of the model-based iterative reconstruction algorithm DIRA
International Nuclear Information System (INIS)
Oertenberg, A.; Sandborg, M.; Alm Carlsson, G.; Malusek, A.; Magnusson, M.
2016-01-01
New paradigms for parallel programming have been devised to simplify software development on multi-core processors and many-core graphical processing units (GPU). Despite their obvious benefits, the parallelization of existing computer programs is not an easy task. In this work, the use of the Open Multiprocessing (OpenMP) and Open Computing Language (OpenCL) frameworks is considered for the parallelization of the model-based iterative reconstruction algorithm DIRA with the aim to significantly shorten the code's execution time. Selected routines were parallelized using OpenMP and OpenCL libraries; some routines were converted from MATLAB to C and optimised. Parallelization of the code with the OpenMP was easy and resulted in an overall speedup of 15 on a 16-core computer. Parallelization with OpenCL was more difficult owing to differences between the central processing unit and GPU architectures. The resulting speedup was substantially lower than the theoretical peak performance of the GPU; the cause was explained. (authors)
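The reported OpenMP result (a speedup of 15 on 16 cores) can be put into perspective with a short calculation. The sketch below computes the parallel efficiency and, assuming Amdahl's law applies (an assumption made here, not a claim of the abstract), the fraction of the run time that must have been parallelized. The function names are illustrative.

```python
def parallel_efficiency(speedup, cores):
    """Speedup per core; 1.0 would be ideal linear scaling."""
    return speedup / cores

def amdahl_parallel_fraction(speedup, cores):
    """Invert Amdahl's law S = 1 / ((1 - p) + p / N) for the parallel fraction p."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)

efficiency = parallel_efficiency(15, 16)        # 0.9375
fraction = amdahl_parallel_fraction(15, 16)     # 224/225, about 0.9956
```

Under this model, a speedup of 15 on 16 cores implies that less than half a percent of the original run time remained serial.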
PARALLEL ITERATIVE RECONSTRUCTION OF PHANTOM CATPHAN ON EXPERIMENTAL DATA
Directory of Open Access Journals (Sweden)
M. A. Mirzavand
2016-01-01
The principles of fast parallel iterative algorithms based on the use of graphics accelerators and the OpenGL library are considered in the paper. The proposed approach provides simultaneous minimization of the residuals of the desired solution and the total variation of the reconstructed three-dimensional image. The number of necessary input data, i.e. conical X-ray projections, can be reduced several times, which makes it possible to reduce the radiation exposure of the patient by a corresponding factor while maintaining the necessary contrast and spatial resolution of the three-dimensional image of the patient. The heuristic iterative algorithm can be used as an alternative to the well-known three-dimensional Feldkamp algorithm.
A parallel stereo reconstruction algorithm with applications in entomology (APSRA)
Bhasin, Rajesh; Jang, Won Jun; Hart, John C.
2012-03-01
We propose a fast parallel algorithm for the reconstruction of 3-dimensional point clouds of insects from binocular stereo image pairs using a hierarchical approach for disparity estimation. Entomologists study various features of insects to classify them, build their distribution maps, and discover genetic links between specimens, among various other essential tasks. This information is important to the pesticide and the pharmaceutical industries, among others. When considering the large collections of insects entomologists analyze, it becomes difficult to physically handle the entire collection and share the data with researchers across the world. With the method presented in our work, entomologists can create an image database for their collections and use the 3D models for studying the shape and structure of the insects, thus making the collections easier to maintain and share. Initial feedback shows that the reconstructed 3D models preserve the shape and size of the specimen. We further optimize our results to incorporate multiview stereo, which produces a better overall structure of the insects. Our main contribution is applying stereoscopic vision techniques to entomology to solve the problems faced by entomologists.
Parallelization of an existing high energy physics event reconstruction software package
International Nuclear Information System (INIS)
Schiefer, R.; Francis, D.
1996-01-01
Software parallelization allows an efficient use of available computing power to increase the performance of applications. In a case study the authors have investigated the parallelization of high energy physics event reconstruction software in terms of costs (effort, computing resource requirements), benefits (performance increase) and the feasibility of a systematic parallelization approach. Guidelines facilitating a parallel implementation are proposed for future software development
Energy Technology Data Exchange (ETDEWEB)
Bastiens, K; Lemahieu, I [University of Ghent - ELIS Department, St. Pietersnieuwstraat 41, B-9000 Ghent (Belgium)
1994-12-31
The application of a maximum entropy reconstruction algorithm to PET images requires a lot of computing resources. A parallel implementation could seriously reduce the execution time. However, programming a parallel application is still a non trivial task, needing specialized people. In this paper a programming environment based on a visual programming language is used for a parallel implementation of the reconstruction algorithm. This programming environment allows less experienced programmers to use the performance of multiprocessor systems. (authors). 8 refs, 3 figs, 1 tab.
A parallel implementation of 3-d CT image reconstruction on a hypercube multiprocessor
International Nuclear Information System (INIS)
Chen, C.M.; Lee, S.Y.; Cho, Z.H.
1990-01-01
In this paper, the authors describe how image reconstruction in computerized tomography (CT) can be parallelized on a message-passing multiprocessor. In particular, the results obtained from parallel implementation of 3-D CT image reconstruction for parallel beam geometries on the Intel hypercube, iPSC/2, are presented. A two stage pipelining approach is employed for filtering (convolution) and backprojection. The conventional sequential convolution algorithm is modified such that the symmetry of the filter kernel is fully utilized for parallelization. In the backprojection stage, the 3-D incremental algorithm, the authors' recently developed backprojection scheme which is shown to be faster than conventional algorithm, is parallelized
On-line event reconstruction using a parallel in-memory data base
Argante, E; Van der Stok, P D V; Willers, Ian Malcolm
1995-01-01
PORS is a system designed for on-line event reconstruction in high energy physics (HEP) experiments. It uses the CPREAD reconstruction program. Central to the system is a parallel in-memory database which is used as communication medium between parallel workers. A farming control structure is implemented with PORS in a natural way. The database provides structured storage of data with a short life time. PORS serves as a case study for the construction of a methodology on how to apply parallel...
International Nuclear Information System (INIS)
Park, Min Jae; Lee, Jae Sung; Kim, Soo Mee; Kang, Ji Yeon; Lee, Dong Soo; Park, Kwang Suk
2009-01-01
Conventional image reconstruction uses simplified physical models of projection. However, realistic physics, for example in 3D reconstruction, takes too long to process all the data in a clinical setting and cannot be run on a common reconstruction machine because complex physical models require a large amount of memory. We propose a realistic distributed-memory model for fast reconstruction using parallel processing on personal computers to enable large-scale technologies. Preliminary feasibility tests on virtual machines and various performance tests on the commercial supercomputer Tachyon were performed. The expectation maximization algorithm was tested with a common 2D projection and with realistic 3D lines of response. Since the processing time became slower (by up to 6 times) after a certain iteration, compiler optimization was performed to maximize the efficiency of parallelization. Parallel processing of a program on multiple computers was available on Linux with MPICH and NFS. We verified that the differences between the parallel-processed image and the single-processed image at the same iterations were below the significant digits of a floating point number, about 6 bit. Two processors showed good parallel computing efficiency (a 1.96-fold speedup). The delay phenomenon was solved by vectorization using SSE. Through this study, a realistic parallel computing system for clinical use was established that can reconstruct, with ample memory, using realistic physical models that could not be simplified.
Parallel MR image reconstruction using augmented Lagrangian methods.
Ramani, Sathish; Fessler, Jeffrey A
2011-03-01
Magnetic resonance image (MRI) reconstruction using SENSitivity Encoding (SENSE) requires regularization to suppress noise and aliasing effects. Edge-preserving and sparsity-based regularization criteria can improve image quality, but they demand computation-intensive nonlinear optimization. In this paper, we present novel methods for regularized MRI reconstruction from undersampled sensitivity encoded data--SENSE-reconstruction--using the augmented Lagrangian (AL) framework for solving large-scale constrained optimization problems. We first formulate regularized SENSE-reconstruction as an unconstrained optimization task and then convert it to a set of (equivalent) constrained problems using variable splitting. We then attack these constrained versions in an AL framework using an alternating minimization method, leading to algorithms that can be implemented easily. The proposed methods are applicable to a general class of regularizers that includes popular edge-preserving (e.g., total-variation) and sparsity-promoting (e.g., l(1)-norm of wavelet coefficients) criteria and combinations thereof. Numerical experiments with synthetic and in vivo human data illustrate that the proposed AL algorithms converge faster than both general-purpose optimization algorithms such as nonlinear conjugate gradient (NCG) and state-of-the-art MFISTA.
Fast implementations of 3D PET reconstruction using vector and parallel programming techniques
International Nuclear Information System (INIS)
Guerrero, T.M.; Cherry, S.R.; Dahlbom, M.; Ricci, A.R.; Hoffman, E.J.
1993-01-01
Computationally intensive techniques that offer potential clinical use have arisen in nuclear medicine. Examples include iterative reconstruction, 3D PET data acquisition and reconstruction, and 3D image volume manipulation including image registration. One obstacle in achieving clinical acceptance of these techniques is the computational time required. This study focuses on methods to reduce the computation time for 3D PET reconstruction through the use of fast computer hardware, vector and parallel programming techniques, and algorithm optimization. The strengths and weaknesses of i860 microprocessor based workstation accelerator boards are investigated in implementations of 3D PET reconstruction
Directory of Open Access Journals (Sweden)
Schöning André
2016-01-01
Track reconstruction in high track multiplicity environments at current and future high rate particle physics experiments is a big challenge and very time consuming. The search for track seeds and the fitting of track candidates are usually the most time consuming steps in the track reconstruction. Here, a new and fast track reconstruction method based on hit triplets is proposed which exploits a three-dimensional fit model including multiple scattering and hit uncertainties from the very start, including the search for track seeds. The hit triplet based reconstruction method assumes a homogeneous magnetic field, which allows an analytical solution for the triplet fit result. This method is highly parallelizable, needs fewer operations than other standard track reconstruction methods and is therefore ideal for implementation on parallel computing architectures. The proposed track reconstruction algorithm has been studied in the context of the Mu3e experiment and a typical LHC experiment.
International Nuclear Information System (INIS)
Choi, Joonsung; Kim, Dongchan; Oh, Changhyun; Han, Yeji; Park, HyunWook
2013-01-01
In MRI (magnetic resonance imaging), signal sampling along a radial k-space trajectory is preferred in certain applications due to its distinct advantages such as robustness to motion, and the radial sampling can be beneficial for reconstruction algorithms such as parallel MRI (pMRI) due to the incoherency. For radial MRI, the image is usually reconstructed from projection data using analytic methods such as filtered back-projection or Fourier reconstruction after gridding. However, the quality of the reconstructed image from these analytic methods can be degraded when the number of acquired projection views is insufficient. In this paper, we propose a novel reconstruction method based on the expectation maximization (EM) method, where the EM algorithm is remodeled for MRI so that complex images can be reconstructed. Then, to optimize the proposed method for radial pMRI, a reconstruction method that uses coil sensitivity information of multichannel RF coils is formulated. Experimental results from synthetic and in vivo data show that the proposed method produces better reconstructed images than the analytic methods, even from highly subsampled data, and provides monotonic convergence properties compared to the conjugate gradient based reconstruction method. (paper)
Tilted cone-beam reconstruction with row-wise fan-to-parallel rebinning
International Nuclear Information System (INIS)
Hsieh Jiang; Tang Xiangyang
2006-01-01
Reconstruction algorithms for cone-beam CT have been the focus of many studies. Several exact and approximate reconstruction algorithms were proposed for step-and-shoot and helical scanning trajectories to combat cone-beam related artefacts. In this paper, we present a new closed-form cone-beam reconstruction formula for tilted gantry data acquisition. Although several algorithms were proposed in the past to combat errors induced by the gantry tilt, none of the algorithms addresses the scenario in which the cone-beam geometry is first rebinned to a set of parallel beams prior to the filtered backprojection. We show that the image quality advantages of the rebinned parallel-beam reconstruction are significant, which makes the development of such an algorithm necessary. Because of the rebinning process, the reconstruction algorithm becomes more complex and the amount of iso-centre adjustment depends not only on the projection and tilt angles, but also on the reconstructed pixel location. In this paper, we first demonstrate the advantages of the row-wise fan-to-parallel rebinning and derive a closed-form solution for the reconstruction algorithm for the step-and-shoot and constant-pitch helical scans. The proposed algorithm requires the 'warping' of the reconstruction matrix on a view-by-view basis prior to the backprojection step. We further extend the algorithm to the variable-pitch helical scans in which the patient table travels at non-constant speeds. The algorithm was tested extensively on both the 16- and 64-slice CT scanners. The efficacy of the algorithm is clearly demonstrated by multiple experiments
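The basic coordinate relation behind fan-to-parallel rebinning can be sketched in a few lines. This shows only the standard untilted, two-dimensional mapping of a fan-beam ray to its parallel-beam coordinates; it does not include the gantry tilt, the row-wise cone-beam handling, or the iso-centre adjustment derived in the paper. The function and parameter names are hypothetical.

```python
import math

def fan_to_parallel(beta, gamma, source_distance):
    """Map a fan-beam ray to parallel-beam coordinates (angles in radians).

    beta: angular position of the source.
    gamma: angle of the ray within the fan, measured from the central ray.
    source_distance: distance from the source to the iso-centre.
    Returns (theta, t): the parallel view angle and the signed distance
    of the ray from the iso-centre.
    """
    theta = beta + gamma
    t = source_distance * math.sin(gamma)
    return theta, t

# The central ray (gamma = 0) passes through the iso-centre.
theta0, t0 = fan_to_parallel(0.3, 0.0, 500.0)
# A ray 30 degrees off the central ray, with the source at 500 mm (example values).
theta1, t1 = fan_to_parallel(0.3, math.pi / 6, 500.0)
```

Rebinning then consists of resampling the measured fan-beam data onto a regular grid in (theta, t), typically by interpolation.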
Non-Cartesian Parallel Imaging Reconstruction of Undersampled IDEAL Spiral 13C CSI Data
DEFF Research Database (Denmark)
Hansen, Rie Beck; Hanson, Lars G.; Ardenkjær-Larsen, Jan Henrik
scan times based on spatial information inherent to each coil element. In this work, we explored the combination of non-cartesian parallel imaging reconstruction and spatially undersampled IDEAL spiral CSI1 acquisition for efficient encoding of multiple chemical shifts within a large FOV with high...
Fast MR image reconstruction for partially parallel imaging with arbitrary k-space trajectories.
Ye, Xiaojing; Chen, Yunmei; Lin, Wei; Huang, Feng
2011-03-01
Both acquisition and reconstruction speed are crucial for magnetic resonance (MR) imaging in clinical applications. In this paper, we present a fast reconstruction algorithm for SENSE in partially parallel MR imaging with arbitrary k-space trajectories. The proposed method is a combination of variable splitting, the classical penalty technique and the optimal gradient method. Variable splitting and the penalty technique reformulate the SENSE model with sparsity regularization as an unconstrained minimization problem, which can be solved by alternating two simple minimizations: One is the total variation and wavelet based denoising that can be quickly solved by several recent numerical methods, whereas the other one involves a linear inversion which is solved by the optimal first order gradient method in our algorithm to significantly improve the performance. Comparisons with several recent parallel imaging algorithms indicate that the proposed method significantly improves the computation efficiency and achieves state-of-the-art reconstruction quality.
Machine Learning and Parallelism in the Reconstruction of LHCb and its Upgrade
International Nuclear Information System (INIS)
De Cian, Michel
2016-01-01
The LHCb detector at the LHC is a general purpose detector in the forward region with a focus on reconstructing decays of c- and b-hadrons. For Run II of the LHC, a new trigger strategy with a real-time reconstruction, alignment and calibration was employed. This was made possible by implementing an offline-like track reconstruction in the high level trigger. However, the ever increasing need for a higher throughput and the move to parallelism in the CPU architectures in the last years necessitated the use of vectorization techniques to achieve the desired speed and a more extensive use of machine learning to veto bad events early on. This document discusses selected improvements in computationally expensive parts of the track reconstruction, like the Kalman filter, as well as an improved approach to get rid of fake tracks using fast machine learning techniques. In the last part, a short overview of the track reconstruction challenges for the upgrade of LHCb, is given. Running a fully software-based trigger, a large gain in speed in the reconstruction has to be achieved to cope with the 40 MHz bunch-crossing rate. Two possible approaches for techniques exploiting massive parallelization are discussed
Directory of Open Access Journals (Sweden)
Abdul Majeed
Maxillofacial trauma is common, arising from road traffic accidents, sports injuries and falls, and requires sophisticated radiological imaging for precise diagnosis. Direct surgical reconstruction is complex and requires clinical expertise. Bio-modelling helps in reconstructing a surface model from 2D contours. In this manuscript we construct the 3D surface from 2D Computerized Tomography (CT) scan contours. The fractured part of the cranial vault is reconstructed using a GC1 rational cubic Ball curve with three free parameters; the 2D contours are then converted into 3D with an equidistant z component. The constructed surface is represented by a contour-blending interpolant. At the end of this manuscript, a case report of a parietal bone fracture is illustrated using this method, together with a Graphical User Interface (GUI) illustration.
Tao, Shengzhen; Trzasko, Joshua D; Shu, Yunhong; Weavers, Paul T; Huston, John; Gray, Erin M; Bernstein, Matt A
2016-06-01
To describe how integrated gradient nonlinearity (GNL) correction can be used within noniterative partial Fourier (homodyne) and parallel (SENSE and GRAPPA) MR image reconstruction strategies, and demonstrate that performing GNL correction during, rather than after, these routines mitigates the image blurring and resolution loss caused by postreconstruction image domain based GNL correction. Starting from partial Fourier and parallel magnetic resonance imaging signal models that explicitly account for GNL, noniterative image reconstruction strategies for each accelerated acquisition technique are derived under the same core mathematical assumptions as their standard counterparts. A series of phantom and in vivo experiments on retrospectively undersampled data were performed to investigate the spatial resolution benefit of integrated GNL correction over conventional postreconstruction correction. Phantom and in vivo results demonstrate that the integrated GNL correction reduces the image blurring introduced by the conventional GNL correction, while still correcting GNL-induced coarse-scale geometrical distortion. Images generated from undersampled data using the proposed integrated GNL strategies offer superior depiction of fine image detail, for example, phantom resolution inserts and anatomical tissue boundaries. Noniterative partial Fourier and parallel imaging reconstruction methods with integrated GNL correction reduce the resolution loss that occurs during conventional postreconstruction GNL correction while preserving the computational efficiency of standard reconstruction techniques. Magn Reson Med 75:2534-2544, 2016. © 2015 Wiley Periodicals, Inc.
Energy Technology Data Exchange (ETDEWEB)
Kerr, John Patrick [Iowa State Univ., Ames, IA (United States)
1992-01-01
The objective of this study was to determine the feasibility of using an Artificial Neural Network (ANN), in particular a backpropagation ANN, to improve the speed and quality of the reconstruction of three-dimensional SPECT (single photon emission computed tomography) images. In addition, since the processing elements (PEs) in each layer of an ANN are independent of each other, the speed and efficiency of the neural network architecture could be better optimized by implementing the ANN on a massively parallel computer. The specific goals of this research were: to implement a fully interconnected backpropagation neural network on a serial computer and a SIMD parallel computer, to identify any reduction in the time required to train these networks on the parallel machine versus the serial machine, to determine if these neural networks can learn to recognize SPECT data by training them on a section of an actual SPECT image, and to determine from the knowledge obtained in this research if full SPECT image reconstruction by an ANN implemented on a parallel computer is feasible both in time required to train the network, and in quality of the images reconstructed.
International Nuclear Information System (INIS)
Chaari, L.; Pesquet, J.Ch.; Ciuciu, Ph.; Benazza-Benyahia, A.
2011-01-01
To reduce scanning time and/or improve spatial/temporal resolution in some Magnetic Resonance Imaging (MRI) applications, parallel MRI acquisition techniques with multiple-coil acquisition have emerged since the early 1990s as powerful imaging methods that allow a faster acquisition process. In these techniques, the full FOV image has to be reconstructed from the acquired undersampled k-space data. To this end, several reconstruction techniques have been proposed, such as the widely used Sensitivity Encoding (SENSE) method. However, the reconstructed image generally presents artifacts when perturbations occur in both the measured data and the estimated coil sensitivity profiles. In this paper, we aim at achieving accurate image reconstruction under degraded experimental conditions (low magnetic field and high reduction factor), in which neither the SENSE method nor Tikhonov regularization in the image domain gives convincing results. To this end, we present a novel method for SENSE-based reconstruction which proceeds with regularization in the complex wavelet domain by promoting sparsity. The proposed approach relies on a fast algorithm that enables the minimization of regularized non-differentiable criteria including more general penalties than a classical ℓ1 term. To further enhance the reconstructed image quality, local convex constraints are added to the regularization process. In vivo human brain experiments carried out on Gradient-Echo (GRE) anatomical and Echo Planar Imaging (EPI) functional MRI data at 1.5 T indicate that our algorithm provides reconstructed images with reduced artifacts for high reduction factors.
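As a point of reference for the "classical ℓ1 term" mentioned in this abstract, the following minimal sketch shows the standard soft-thresholding operator, which is the proximal operator of an ℓ1 penalty applied to one real or complex coefficient. It is not the authors' algorithm, which is stated to handle more general penalties; the function name and the threshold parameter `lam` are illustrative assumptions.

```python
def soft_threshold(w, lam):
    """Proximal operator of lam * |w| for one real or complex coefficient.

    Illustrative only: shrinks the magnitude of w by lam and keeps its
    sign (real case) or phase (complex case).
    """
    mag = abs(w)
    if mag <= lam:
        # Coefficients at or below the threshold are set to zero.
        return 0.0 * w
    # Shrink the magnitude by lam while preserving the direction of w.
    return (mag - lam) * (w / mag)


# Small coefficients vanish, large ones are shrunk toward zero.
coeffs = [3.0, -0.5, 3 + 4j]
shrunk = [soft_threshold(c, 1.0) for c in coeffs]
```

Applied to every wavelet coefficient, this operator is what makes an ℓ1-regularized estimate sparse: coefficients that are small relative to `lam` are removed entirely.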
Dual-volume excitation and parallel reconstruction for J-difference-edited MR spectroscopy
DEFF Research Database (Denmark)
Oeltzschner, Georg; Puts, Nicolaas A J; Chan, Kimberly L
2017-01-01
successfully reconstructed with a mean in vivo g-factor of 1.025 (typical voxel-center separation: 7-8 cm). MEGA-PRIAM experiments showed higher signal-to-noise ratio than sequential single-voxel experiments of the same total duration (mean improvement 1.38 ± 0.24). CONCLUSIONS: Simultaneous acquisition of J......PURPOSE: To develop J-difference editing with parallel reconstruction in accelerated multivoxel (PRIAM) for simultaneous measurement in two separate brain regions of γ-aminobutyric acid (GABA) or glutathione. METHODS: PRIAM separates signals from two simultaneously excited voxels using receiver...
International Nuclear Information System (INIS)
Laurent, C.; Chassery, J.M.; Peyrin, F.; Girerd, C.
1996-01-01
This paper deals with the parallel implementations of reconstruction methods in 3D tomography. 3D tomography requires voluminous data and long computation times. Parallel computing, on MIMD computers, seems to be a good approach to manage this problem. In this study, we present the different steps of the parallelization on an abstract parallel computer. Depending on the method, we use two main approaches to parallelize the algorithms: the local approach and the global approach. Experimental results on MIMD computers are presented. Two 3D images reconstructed from realistic data are shown.
A parallel algorithm for 3D particle tracking and Lagrangian trajectory reconstruction
International Nuclear Information System (INIS)
Barker, Douglas; Zhang, Yuanhui; Lifflander, Jonathan; Arya, Anshu
2012-01-01
Particle-tracking methods are widely used in fluid mechanics and multi-target tracking research because of their unique ability to reconstruct long trajectories with high spatial and temporal resolution. Researchers have recently demonstrated 3D tracking of several objects in real time, but as the number of objects is increased, real-time tracking becomes impossible due to data transfer and processing bottlenecks. This problem may be solved by using parallel processing. In this paper, a parallel-processing framework has been developed based on frame decomposition and is programmed using the asynchronous object-oriented Charm++ paradigm. This framework can be a key step in achieving a scalable Lagrangian measurement system for particle-tracking velocimetry and may lead to real-time measurement capabilities. The parallel tracking algorithm was evaluated with three data sets including the particle image velocimetry standard 3D images data set #352, a uniform data set for optimal parallel performance and a computational-fluid-dynamics-generated non-uniform data set to test trajectory reconstruction accuracy, consistency with the sequential version and scalability to more than 500 processors. The algorithm showed strong scaling up to 512 processors and no inherent limits of scalability were seen. Ultimately, up to a 200-fold speedup was observed compared to the serial algorithm when 256 processors were used. The parallel algorithm is adaptable and could be easily modified to use any sequential tracking algorithm that inputs frames of 3D particle location data and outputs particle trajectories.
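The speedup quoted in this abstract can be put in terms of parallel efficiency, defined as speedup divided by processor count. The short calculation below uses only the two figures stated above (200-fold speedup, 256 processors); the function name is an illustrative assumption.

```python
def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / processors


# Figures taken from the abstract: up to 200x speedup on 256 processors.
eff = parallel_efficiency(200, 256)
print(f"{eff:.3f}")  # prints 0.781
```

An efficiency of roughly 78% at 256 processors is consistent with the abstract's description of strong scaling.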
International Nuclear Information System (INIS)
Dong, Xiangyuan; Guo, Shuqing
2008-01-01
In this paper, a novel image reconstruction method for electrical capacitance tomography (ECT) based on the combined series and parallel model is presented. A regularization technique is used to obtain a stabilized solution of the inverse problem. Also, the adaptive coefficient of the combined model is deduced by numerical optimization. Simulation results indicate that it can produce higher quality images when compared to the algorithm based on the parallel or series models for the cases tested in this paper. It provides a new algorithm for ECT application
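For orientation, the sketch below shows the parallel and series capacitance normalizations commonly used in ECT, together with one simple way of blending them. The abstract does not give the form of the combined model or how its adaptive coefficient is obtained, so the weighted sum and the parameter `alpha` are assumptions made purely for illustration. Here `c_low` and `c_high` denote the capacitances measured with the sensor full of the low- and high-permittivity material.

```python
def normalize_parallel(c, c_low, c_high):
    """Parallel model: normalized capacitance is linear in c."""
    return (c - c_low) / (c_high - c_low)


def normalize_series(c, c_low, c_high):
    """Series model: normalization is linear in 1/c."""
    return (1.0 / c - 1.0 / c_low) / (1.0 / c_high - 1.0 / c_low)


def normalize_combined(c, c_low, c_high, alpha):
    """Hypothetical blend of both models; alpha in [0, 1] is an assumed weight."""
    return (alpha * normalize_parallel(c, c_low, c_high)
            + (1.0 - alpha) * normalize_series(c, c_low, c_high))


# Both models agree at the calibration points and differ in between.
lam_p = normalize_parallel(1.5, 1.0, 2.0)
lam_s = normalize_series(1.5, 1.0, 2.0)
lam_c = normalize_combined(1.5, 1.0, 2.0, alpha=0.5)
```

Because the two normalizations agree at the calibration points and differ in between, the weight controls how intermediate measurements are interpreted.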
International Nuclear Information System (INIS)
Chen Jian-Lin; Li Lei; Wang Lin-Yuan; Cai Ai-Long; Xi Xiao-Qi; Zhang Han-Ming; Li Jian-Xin; Yan Bin
2015-01-01
The projection matrix model is used to describe the physical relationship between reconstructed object and projection. Such a model has a strong influence on projection and backprojection, two vital operations in iterative computed tomographic reconstruction. The distance-driven model (DDM) is a state-of-the-art technology that simulates forward and back projections. This model has a low computational complexity and a relatively high spatial resolution; however, it includes only a few methods in a parallel operation with a matched model scheme. This study introduces a fast and parallelizable algorithm to improve the traditional DDM for computing the parallel projection and backprojection operations. Our proposed model has been implemented on a GPU (graphics processing unit) platform and has achieved satisfactory computational efficiency with no approximation. The runtime for the projection and backprojection operations with our model is approximately 4.5 s and 10.5 s per loop, respectively, with an image size of 256×256×256 and 360 projections with a size of 512×512. We compare several general algorithms that have been proposed for maximizing GPU efficiency by using the unmatched projection/backprojection models in a parallel computation. The imaging resolution is not sacrificed and remains accurate during computed tomographic reconstruction.
STEP: Self-supporting tailored k-space estimation for parallel imaging reconstruction.
Zhou, Zechen; Wang, Jinnan; Balu, Niranjan; Li, Rui; Yuan, Chun
2016-02-01
A new subspace-based iterative reconstruction method, termed Self-supporting Tailored k-space Estimation for Parallel imaging reconstruction (STEP), is presented and evaluated in comparison to the existing autocalibrating method SPIRiT and calibrationless method SAKE. In STEP, two tailored schemes including k-space partition and basis selection are proposed to promote spatially variant signal subspace and incorporated into a self-supporting structured low rank model to enforce properties of locality, sparsity, and rank deficiency, which can be formulated into a constrained optimization problem and solved by an iterative algorithm. Simulated and in vivo datasets were used to investigate the performance of STEP in terms of overall image quality and detail structure preservation. The advantage of STEP on image quality is demonstrated by retrospectively undersampled multichannel Cartesian data with various patterns. Compared with SPIRiT and SAKE, STEP can provide more accurate reconstruction images with less residual aliasing artifacts and reduced noise amplification in simulation and in vivo experiments. In addition, STEP has the capability of combining compressed sensing with arbitrary sampling trajectory. Using k-space partition and basis selection can further improve the performance of parallel imaging reconstruction with or without calibration signals. © 2015 Wiley Periodicals, Inc.
International Nuclear Information System (INIS)
Wang Shi; Kang Kejun; Wang Jingjin
1995-01-01
Computerized Tomography (CT) is expected to become an indispensable diagnostic technique in the future. However, the long time required to reconstruct an image has been one of the major drawbacks associated with this technique. Parallel processing is one of the best ways to solve this problem. This paper gives the architecture and hardware design of PIRS-4 (4-processor Parallel Image Reconstruction System), a parallel processing system for fast 3D-CT image reconstruction based on a circular-shifting float memory architecture. It covers the structure and components of the system, the design of the crossbar switch and details of the control model. The test results are described.
International Nuclear Information System (INIS)
Kole, J S; Beekman, F J
2005-01-01
Statistical reconstruction methods offer possibilities of improving image quality as compared to analytical methods, but current reconstruction times prohibit routine clinical applications. To reduce reconstruction times we have parallelized a statistical reconstruction algorithm for cone-beam x-ray CT, the ordered subset convex algorithm (OSC), and evaluated it on a shared memory computer. Two different parallelization strategies were developed: one that employs parallelism by computing the work for all projections within a subset in parallel, and one that divides the total volume into parts and processes the work for each sub-volume in parallel. Both methods are used to reconstruct a three-dimensional mathematical phantom on two different grid densities. The reconstructed images are binary identical to the result of the serial (non-parallelized) algorithm. The speed-up factor is approximately 30 when using 32 to 40 processors, and scales almost linearly with the number of CPUs for both methods. The huge reduction in computation time allows us to apply statistical reconstruction to clinically relevant studies for the first time.
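The second strategy described above, dividing the volume into parts and processing each sub-volume in parallel, can be sketched as follows. This is only an illustration of the decomposition, not the authors' implementation: `update_slice` is a hypothetical placeholder for the per-slice reconstruction work, and Python threads are used for brevity even though they would not speed up CPU-bound work in CPython.

```python
from concurrent.futures import ThreadPoolExecutor


def split_slices(n_slices, n_parts):
    """Return contiguous (start, stop) ranges that together cover 0..n_slices."""
    base, extra = divmod(n_slices, n_parts)
    ranges, start = [], 0
    for p in range(n_parts):
        # The first `extra` parts receive one additional slice.
        stop = start + base + (1 if p < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def update_slice(z):
    """Hypothetical stand-in for the reconstruction work on slice z."""
    return z * z


def process_subvolume(bounds):
    start, stop = bounds
    return [update_slice(z) for z in range(start, stop)]


def process_volume(n_slices, n_workers):
    """Process all slices, one contiguous sub-volume per worker."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(process_subvolume, split_slices(n_slices, n_workers)))
    # Concatenate in sub-volume order so the result matches the serial order.
    return [value for part in parts for value in part]


serial = [update_slice(z) for z in range(10)]
parallel = process_volume(10, 3)
```

Because each sub-volume is computed independently and the results are concatenated in order, the parallel result equals the serial one, which mirrors the abstract's observation that the parallel reconstructions are binary identical to the serial result.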
BPF-type region-of-interest reconstruction for parallel translational computed tomography.
Wu, Weiwen; Yu, Hengyong; Wang, Shaoyu; Liu, Fenglin
2017-01-01
The objective of this study is to present and test a new ultra-low-cost linear scan based tomography architecture. Similar to linear tomosynthesis, the source and detector are translated in opposite directions and the data acquisition system targets a region-of-interest (ROI) to acquire data for image reconstruction. This kind of tomographic architecture was named parallel translational computed tomography (PTCT). In previous studies, filtered backprojection (FBP)-type algorithms were developed to reconstruct images from PTCT. However, the ROI images reconstructed from truncated projections have severe truncation artefacts. In order to overcome this limitation, we propose two backprojection filtering (BPF)-type algorithms, named MP-BPF and MZ-BPF, to reconstruct ROI images from truncated PTCT data. A weight function is constructed to deal with data redundancy for multi-linear translation modes. Extensive numerical simulations are performed to evaluate the proposed MP-BPF and MZ-BPF algorithms for PTCT in fan-beam geometry. Qualitative and quantitative results demonstrate that the proposed BPF-type algorithms can not only reconstruct ROI images from truncated projections more accurately but also generate high-quality images for the entire image support in some circumstances.
Implementation of GPU parallel equilibrium reconstruction for plasma control in EAST
Energy Technology Data Exchange (ETDEWEB)
Huang, Yao, E-mail: yaohuang@ipp.ac.cn [Institute of Plasma Physics, Chinese Academy of Sciences, Hefei (China); Xiao, B.J. [Institute of Plasma Physics, Chinese Academy of Sciences, Hefei (China); School of Nuclear Science & Technology, University of Science & Technology of China (China); Luo, Z.P.; Yuan, Q.P.; Pei, X.F. [Institute of Plasma Physics, Chinese Academy of Sciences, Hefei (China); Yue, X.N. [School of Nuclear Science & Technology, University of Science & Technology of China (China)
2016-11-15
Highlights: • We describe how the parallel equilibrium reconstruction code P-EFIT, running on a GPU, was integrated with the EAST plasma control system. • Compared with RT-EFIT used in EAST, P-EFIT has better spatial resolution and runs the full EFIT algorithm in each iteration. • With the data interface through RFM, P-EFIT on a 65 × 65 spatial grid can satisfy the accuracy and time feasibility requirements for plasma control. • Successful control using ISOFLUX/P-EFIT was established in the dedicated experiment during the EAST 2014 campaign. • This work is a stepping-stone towards versatile ISOFLUX/P-EFIT control, such as real-time equilibrium reconstruction with more diagnostics. - Abstract: Implementation of the P-EFIT code for plasma control in EAST is described. P-EFIT is based on the EFIT framework, but built with the CUDA™ architecture to take advantage of massively parallel Graphical Processing Unit (GPU) cores to significantly accelerate the computation. With a 65 × 65 grid, P-EFIT can complete one reconstruction iteration in 300 μs; with a one-iteration strategy, it can satisfy the needs of real-time plasma shape control. The data interface between P-EFIT and PCS is realized by transferring data through RFM. The first application of P-EFIT to discharge control in EAST is described.
SPECT reconstruction of combined cone beam and parallel hole collimation with experimental data
International Nuclear Information System (INIS)
Li, Jianying; Jaszczak, R.J.; Turkington, T.G.; Greer, K.L.; Coleman, R.E.
1993-01-01
The authors have developed three methods to combine parallel and cone beam (P and CB) SPECT data using modified Maximum Likelihood-Expectation Maximization (ML-EM) algorithms. The first combination method applies both parallel and cone beam data sets to reconstruct a single intermediate image after each iteration using the ML-EM algorithm. The other two iterative methods combine the intermediate parallel beam (PB) and cone beam (CB) source estimates to enhance the uniformity of images. These two methods are ad hoc methods. In earlier studies using computer Monte Carlo simulation, the authors suggested that improved images might be obtained by reconstructing combined P and CB SPECT data. These combined collimation methods are qualitatively evaluated using experimental data. Attenuation compensation is performed by including the effects of attenuation in the transition matrix as a multiplicative factor. The combined P and CB images are compared with CB-only images, and the results indicate that the combined P and CB approaches suppress artifacts caused by truncated projections and correct for the distortions of the CB-only images.
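The first combination method can be illustrated with the standard ML-EM update applied to both data sets at once. The sketch below stacks the rows of a parallel-beam and a cone-beam system matrix so that each iteration produces a single image estimate. This is one plausible reading of the abstract, not the authors' code: the function names, the tiny matrices and the noise-free data are illustrative assumptions, and in practice the matrix entries would also carry the attenuation factor mentioned above.

```python
def mlem_step(x, A, y):
    """One standard ML-EM update of image x for system matrix A and data y."""
    n_pix = len(x)
    # Forward projection of the current estimate.
    proj = [sum(a * xj for a, xj in zip(row, x)) for row in A]
    # Ratio of measured to estimated projections (guarding against zeros).
    ratio = [yi / pi if pi > 0 else 0.0 for yi, pi in zip(y, proj)]
    # Sensitivity image: column sums of A.
    sens = [sum(row[j] for row in A) for j in range(n_pix)]
    # Backprojection of the ratios.
    back = [sum(row[j] * r for row, r in zip(A, ratio)) for j in range(n_pix)]
    return [x[j] * back[j] / sens[j] if sens[j] > 0 else 0.0 for j in range(n_pix)]


def combined_mlem(A_par, y_par, A_cone, y_cone, n_iter=50):
    """ML-EM on stacked parallel-beam and cone-beam data (assumed combination)."""
    A = A_par + A_cone  # stack the rows of both system matrices
    y = y_par + y_cone  # stack the matching measurements
    x = [1.0] * len(A[0])  # uniform, strictly positive starting image
    for _ in range(n_iter):
        x = mlem_step(x, A, y)
    return x


# Toy two-pixel object [2, 3] seen by two "parallel" rays and one "cone" ray.
estimate = combined_mlem([[1.0, 0.0], [0.0, 1.0]], [2.0, 3.0],
                         [[1.0, 1.0]], [5.0])
```

With consistent, noise-free data as in this toy case, the iterates approach the true pixel values.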
International Nuclear Information System (INIS)
Wang Shi; Kang Kejun; Wang Jingjin
1996-01-01
Computerized Tomography (CT) is expected to become an indispensable diagnostic technique in the future. However, the long time required to reconstruct an image has been one of the major drawbacks associated with this technique. Parallel processing is one of the best ways to solve this problem. This paper gives the architecture, hardware and software design of PIRS-4 (4-processor Parallel Image Reconstruction System), a parallel processing system for fast 3D-CT image reconstruction based on a circular-shifting float memory architecture. It covers the structure and components of the system, the design of the crossbar switch and details of the control model, the description of RPBP image reconstruction, the choice of OS (operating system) and language, the principle of imitating EMS, direct memory R/W of floats and programming in protected mode. Finally, the test results are given.
Parallelized Kalman-Filter-Based Reconstruction of Particle Tracks on Many-Core Architectures
Energy Technology Data Exchange (ETDEWEB)
Cerati, Giuseppe [Fermilab; Elmer, Peter [Princeton U.; Krutelyov, Slava [UC, San Diego; Lantz, Steven [Cornell U., Phys. Dept.; Lefebvre, Matthieu [Princeton U.; Masciovecchio, Mario [UC, San Diego; McDermott, Kevin [Cornell U., Phys. Dept.; Riley, Daniel [Cornell U., Phys. Dept.; Tadel, Matevž [UC, San Diego; Wittich, Peter [Cornell U., Phys. Dept.; Würthwein, Frank [UC, San Diego; Yagil, Avi [UC, San Diego
2017-11-16
Faced with physical and energy density limitations on clock speed, contemporary microprocessor designers have increasingly turned to on-chip parallelism for performance gains. Examples include the Intel Xeon Phi, GPGPUs, and similar technologies. Algorithms should accordingly be designed with ample amounts of fine-grained parallelism if they are to realize the full performance of the hardware. This requirement can be challenging for algorithms that are naturally expressed as a sequence of small-matrix operations, such as the Kalman filter methods widely in use in high-energy physics experiments. In the High-Luminosity Large Hadron Collider (HL-LHC), for example, one of the dominant computational problems is expected to be finding and fitting charged-particle tracks during event reconstruction; today, the most common track-finding methods are those based on the Kalman filter. Experience at the LHC, both in the trigger and offline, has shown that these methods are robust and provide high physics performance. Previously we reported the significant parallel speedups that resulted from our efforts to adapt Kalman-filter-based tracking to many-core architectures such as Intel Xeon Phi. Here we report on how effectively those techniques can be applied to more realistic detector configurations and event complexity.
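To make the "sequence of small-matrix operations" concrete, the sketch below performs the Kalman measurement update for many independent tracks at once, using a structure-of-arrays layout in which element i of every list belongs to track i. It is a simplified illustration rather than the authors' code: the state has only two components (position and slope), the position is measured directly, and all names are assumptions. In a compiled language this layout is what lets one SIMD instruction update several tracks together; the Python loop only stands in for that.

```python
def kalman_update_batch(pos, slope, p00, p01, p11, z, r):
    """Measurement update for N independent 2-component track states.

    pos, slope    : state components, one list entry per track
    p00, p01, p11 : entries of each track's symmetric 2x2 covariance
    z             : measured position per track; r: measurement variance
    The measurement model is H = [1, 0] (position observed directly).
    """
    out = ([], [], [], [], [])
    for x0, x1, a, b, c, zi in zip(pos, slope, p00, p01, p11, z):
        s = a + r            # innovation variance  H P H^T + R
        k0, k1 = a / s, b / s  # Kalman gain         P H^T / S
        res = zi - x0        # measurement residual
        out[0].append(x0 + k0 * res)
        out[1].append(x1 + k1 * res)
        # Covariance update P' = (I - K H) P, written out element by element.
        out[2].append((1.0 - k0) * a)
        out[3].append((1.0 - k0) * b)
        out[4].append(c - k1 * b)
    return out


# Two tracks updated in lockstep with the same measurement variance.
new_pos, new_slope, n00, n01, n11 = kalman_update_batch(
    pos=[0.0, 0.0], slope=[0.0, 0.0],
    p00=[1.0, 1.0], p01=[0.0, 0.5], p11=[1.0, 1.0],
    z=[2.0, 2.0], r=1.0)
```

Every track executes the same arithmetic with no data-dependent branching, which is the property that makes this kind of update a natural fit for vector units.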
Hoge, W Scott; Brooks, Dana H
2008-08-01
Two strategies are widely used in parallel MRI to reconstruct subsampled multicoil image data. SENSE and related methods employ explicit receiver coil spatial response estimates to reconstruct an image. In contrast, coil-by-coil methods such as GRAPPA leverage correlations among the acquired multicoil data to reconstruct missing k-space lines. In self-referenced scenarios, both methods employ Nyquist-rate low-frequency k-space data to identify the reconstruction parameters. Because GRAPPA does not require explicit coil sensitivity estimates, it needs considerably fewer autocalibration signals than SENSE. However, SENSE methods allow greater opportunity to control reconstruction quality through regularization and thus may outperform GRAPPA in some imaging scenarios. Here, we employ GRAPPA to improve self-referenced coil sensitivity estimation in SENSE and related methods using very few autocalibration signals. This enables one to leverage each method's inherent strength and produce high quality self-referenced SENSE reconstructions. (c) 2008 Wiley-Liss, Inc.
Lartillot, Nicolas; Rodrigue, Nicolas; Stubbs, Daniel; Richer, Jacques
2013-07-01
Modeling across-site variation of the substitution process is increasingly recognized as important for obtaining more accurate phylogenetic reconstructions. Both finite and infinite mixture models have been proposed and have been shown to significantly improve on classical single-matrix models. Compared with their finite counterparts, infinite mixtures have a greater expressivity. However, they are computationally more challenging. This has resulted in practical compromises in the design of infinite mixture models. In particular, a fast but simplified version of a Dirichlet process model over equilibrium frequency profiles implemented in PhyloBayes has often been used in recent phylogenomics studies, while more refined model structures, more realistic and empirically more fit, have been practically out of reach. We introduce a message passing interface version of PhyloBayes, implementing the Dirichlet process mixture models as well as more classical empirical matrices and finite mixtures. The parallelization is made efficient thanks to the combination of two algorithmic strategies: a partial Gibbs sampling update of the tree topology and the use of a truncated stick-breaking representation for the Dirichlet process prior. The implementation shows close to linear gains in computational speed for up to 64 cores, thus allowing faster phylogenetic reconstruction under complex mixture models. PhyloBayes MPI is freely available from our website www.phylobayes.org.
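The truncated stick-breaking representation mentioned in this abstract can be sketched in a few lines. The code below draws mixture weights for a Dirichlet process prior truncated at a fixed number of components. It shows the generic textbook construction, not the PhyloBayes MPI implementation; the function name, the concentration parameter `alpha` and the truncation level `k_max` are illustrative assumptions.

```python
import random


def truncated_stick_breaking(alpha, k_max, rng):
    """Draw k_max mixture weights from a truncated stick-breaking prior.

    Each step breaks off a Beta(1, alpha) fraction of the remaining stick;
    the final component takes whatever is left so the weights sum to one.
    """
    weights = []
    remaining = 1.0
    for k in range(k_max):
        v = 1.0 if k == k_max - 1 else rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights


weights = truncated_stick_breaking(alpha=1.0, k_max=10, rng=random.Random(0))
```

Truncation fixes the number of components in advance, so the per-component work has a known, finite size, which is generally what makes this representation convenient to distribute across processes.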
Muckley, Matthew J; Noll, Douglas C; Fessler, Jeffrey A
2015-02-01
Sparsity-promoting regularization is useful for combining compressed sensing assumptions with parallel MRI for reducing scan time while preserving image quality. Variable splitting algorithms are the current state-of-the-art algorithms for SENSE-type MR image reconstruction with sparsity-promoting regularization. These methods are very general and have been observed to work with almost any regularizer; however, the tuning of associated convergence parameters is a commonly-cited hindrance in their adoption. Conversely, majorize-minimize algorithms based on a single Lipschitz constant have been observed to be slow in shift-variant applications such as SENSE-type MR image reconstruction since the associated Lipschitz constants are loose bounds for the shift-variant behavior. This paper bridges the gap between the Lipschitz constant and the shift-variant aspects of SENSE-type MR imaging by introducing majorizing matrices in the range of the regularizer matrix. The proposed majorize-minimize methods (called BARISTA) converge faster than state-of-the-art variable splitting algorithms when combined with momentum acceleration and adaptive momentum restarting. Furthermore, the tuning parameters associated with the proposed methods are unitless convergence tolerances that are easier to choose than the constraint penalty parameters required by variable splitting algorithms.
Parallelized Kalman-Filter-Based Reconstruction of Particle Tracks on Many-Core Processors and GPUs
Cerati, Giuseppe; Elmer, Peter; Krutelyov, Slava; Lantz, Steven; Lefebvre, Matthieu; Masciovecchio, Mario; McDermott, Kevin; Riley, Daniel; Tadel, Matevž; Wittich, Peter; Würthwein, Frank; Yagil, Avi
2017-08-01
For over a decade now, physical and energy constraints have limited clock speed improvements in commodity microprocessors. Instead, chipmakers have been pushed into producing lower-power, multi-core processors such as Graphics Processing Units (GPUs), ARM CPUs, and Intel MICs. Broad-based efforts from manufacturers and developers have been devoted to making these processors user-friendly enough to perform general computations. However, extracting performance from a larger number of cores, as well as specialized vector or SIMD units, requires special care in algorithm design and code optimization. One of the most computationally challenging problems in high-energy particle experiments is finding and fitting the charged-particle tracks during event reconstruction. This is expected to become by far the dominant problem at the High-Luminosity Large Hadron Collider (HL-LHC), for example. Today the most common track finding methods are those based on the Kalman filter. Experience with Kalman techniques on real tracking detector systems has shown that they are robust and provide high physics performance. This is why they are currently in use at the LHC, both in the trigger and offline. Previously we reported on the significant parallel speedups that resulted from our investigations to adapt Kalman filters to track fitting and track building on Intel Xeon and Xeon Phi. Here, we discuss our progress toward understanding these processors and the new developments in porting the Kalman filter to NVIDIA GPUs.
Dharmaraj, Christopher D; Thadikonda, Kishan; Fletcher, Anthony R; Doan, Phuc N; Devasahayam, Nallathamby; Matsumoto, Shingo; Johnson, Calvin A; Cook, John A; Mitchell, James B; Subramanian, Sankaran; Krishna, Murali C
2009-01-01
Three-dimensional Oximetric Electron Paramagnetic Resonance Imaging using the Single Point Imaging modality generates unpaired spin density and oxygen images that can readily distinguish between normal and tumor tissues in small animals. It is also possible with fast imaging to track the changes in tissue oxygenation in response to the oxygen content in the breathing air. However, this involves dealing with gigabytes of data for each 3D oximetric imaging experiment involving digital band pass filtering and background noise subtraction, followed by 3D Fourier reconstruction. This process is rather slow in a conventional uniprocessor system. This paper presents a parallelization framework using OpenMP runtime support and parallel MATLAB to execute such computationally intensive programs. The Intel compiler is used to develop a parallel C++ code based on OpenMP. The code is executed on four Dual-Core AMD Opteron shared memory processors, to reduce the computational burden of the filtration task significantly. The results show that the parallel code for filtration has achieved a speed up factor of 46.66 as against the equivalent serial MATLAB code. In addition, a parallel MATLAB code has been developed to perform 3D Fourier reconstruction. Speedup factors of 4.57 and 4.25 have been achieved during the reconstruction process and oximetry computation, for a data set with 23 x 23 x 23 gradient steps. The execution time has been computed for both the serial and parallel implementations using different dimensions of the data and presented for comparison. The reported system has been designed to be easily accessible even from low-cost personal computers through local internet (NIHnet). The experimental results demonstrate that the parallel computing provides a source of high computational power to obtain biophysical parameters from 3D EPR oximetric imaging, almost in real-time.
International Nuclear Information System (INIS)
Pedron, Antoine
2013-01-01
This thesis lies at the intersection of ultrasound non-destructive testing and algorithm-architecture adequation. Ultrasound non-destructive testing includes a group of analysis techniques used in science and industry to evaluate the properties of a material, component, or system without causing damage. In order to characterise possible defects, determining their position, size and shape, imaging and reconstruction tools have been developed at CEA-LIST, within the CIVA software platform. The evolution of acquisition sensors implies a continuous growth of datasets, and consequently more and more computing power is needed to maintain interactive reconstructions. General purpose processors (GPP) evolving towards parallelism and emerging architectures such as GPU offer large acceleration possibilities that can be applied to these algorithms. The main goal of the thesis is to evaluate the acceleration that can be obtained for two reconstruction algorithms on these architectures. These two algorithms differ in their parallelization scheme. The first one can be properly parallelized on GPP, whereas on GPU an intensive use of atomic instructions is required. Within the second algorithm, parallelism is easier to express, but loop ordering on GPP, as well as thread scheduling and a good use of shared memory on GPU, are necessary in order to obtain efficient results. Different APIs or libraries, such as OpenMP, CUDA and OpenCL, are evaluated through chosen benchmarks. An integration of both algorithms in the CIVA software platform is proposed, and different issues related to code maintenance and durability are discussed. (author)
Gong, Enhao; Huang, Feng; Ying, Kui; Wu, Wenchuan; Wang, Shi; Yuan, Chun
2015-02-01
A typical clinical MR examination includes multiple scans to acquire images with different contrasts for complementary diagnostic information. The multicontrast scheme requires long scanning time. The combination of partially parallel imaging and compressed sensing (CS-PPI) has been used to reconstruct accelerated scans. However, there are several unsolved problems in existing methods. The target of this work is to improve existing CS-PPI methods for multicontrast imaging, especially for two-dimensional imaging. If the same field of view is scanned in multicontrast imaging, there is a significant amount of sharable information. It is proposed in this study to use manifold sharable information among multicontrast images to enhance CS-PPI in a sequential way. Coil sensitivity information and structure-based adaptive regularization, which were extracted from previously reconstructed images, were applied to enhance the following reconstructions. The proposed method is called Parallel-imaging and compressed-sensing Reconstruction Of Multicontrast Imaging using SharablE information (PROMISE). Using L1-SPIRiT as a CS-PPI example, results on multicontrast brain and carotid scans demonstrated that a lower error level and better detail preservation can be achieved by exploiting manifold sharable information. Moreover, the advantage of PROMISE persists in the presence of interscan motion. Using the sharable information among multicontrast images can enhance CS-PPI with tolerance to motion. © 2014 Wiley Periodicals, Inc.
Ren, Zhong; Liu, Guodong; Huang, Zhen
2012-11-01
Image reconstruction is a key step in medical imaging (MI), and the performance of the reconstruction algorithm determines the quality and resolution of the reconstructed image. Although other algorithms have been used, the filtered back-projection (FBP) algorithm remains the classical and most commonly used algorithm in clinical MI. In the FBP algorithm, filtering the original projection data is a key step for suppressing artifacts in the reconstructed image. Simply applying classical filters such as the Shepp-Logan (SL) and Ram-Lak (RL) filters has drawbacks and limitations in practice, especially for projection data corrupted by non-stationary random noise. Therefore, in this paper an improved wavelet denoising method combined with the parallel-beam FBP algorithm is used to enhance the quality of the reconstructed image. In the experiments, the reconstruction results of the improved wavelet denoising were compared with those of other methods (direct FBP, mean filter combined with FBP, and median filter combined with FBP). To determine the optimum reconstruction, different algorithms and different wavelet bases combined with three filters were tested. Experimental results show that the improved FBP algorithm reconstructs better than the others. Comparing the results of the different algorithms on two evaluation metrics, mean-square error (MSE) and peak signal-to-noise ratio (PSNR), the improved FBP based on db2 and the Hanning filter at decomposition scale 2 performed best: its MSE was lower and its PSNR higher than those of the others. Therefore, this improved FBP algorithm has potential value in medical imaging.
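The two evaluation metrics named in this abstract, MSE and PSNR, can be sketched as follows. This is a generic illustration using the standard definitions, not the authors' code; the function names, the 8-bit peak value of 255 and the small test images are assumptions made for the example.

```python
import math

def mse(reference, reconstructed):
    """Mean-square error between two images given as lists of rows."""
    total = 0.0
    count = 0
    for ref_row, rec_row in zip(reference, reconstructed):
        for r, x in zip(ref_row, rec_row):
            total += (r - x) ** 2
            count += 1
    return total / count

def psnr(reference, reconstructed, peak=255.0):
    """PSNR in dB; peak is the maximum possible pixel value (assumed 8-bit here)."""
    err = mse(reference, reconstructed)
    if err == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / err)

# Hypothetical 2x2 reference image and reconstruction.
ref = [[0, 255], [128, 64]]
rec = [[2, 253], [128, 66]]
mse_value = mse(ref, rec)
psnr_value = psnr(ref, rec)
print(mse_value, psnr_value)
```

A lower MSE and a higher PSNR both indicate a reconstruction closer to the reference, which is the sense in which the abstract ranks the filter and wavelet combinations.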
International Nuclear Information System (INIS)
Jaszczak, R.J.; Jianying Li; Huili Wang; Coleman, R.E.
1992-01-01
Single photon emission computed tomography (SPECT) using cone beam (CB) collimation exhibits increased sensitivity compared with acquisition geometries using parallel (P) hole collimation. However, CB collimation has a smaller field-of-view which may result in truncated projections and image artifacts. A primary objective of this work is to investigate maximum likelihood-expectation maximization (ML-EM) methods to reconstruct simultaneously acquired parallel and cone beam (P and CB) SPECT data. Simultaneous P and CB acquisition can be performed with commercially available triple camera systems by using two cone-beam collimators and a single parallel-hole collimator. The loss in overall sensitivity (relative to the use of three CB collimators) is about 15 to 20%. The authors have developed three methods to combine P and CB data using modified ML-EM algorithms. (author)
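The basic ML-EM update that such methods build on can be sketched as below. The abstract does not describe the three modified algorithms, so this example makes a simple assumption: the parallel-hole and cone-beam measurements are combined by stacking their system matrices into one. The matrices, the cone-beam row with doubled sensitivity and a truncated field of view, and the noise-free counts are all hypothetical toy values.

```python
def mlem(A, y, n_iter=500):
    """Plain ML-EM: x_j <- x_j / s_j * sum_i A_ij * y_i / (A x)_i."""
    m, n = len(A), len(A[0])
    # Sensitivity of each voxel: column sums of the system matrix.
    sens = [sum(A[i][j] for i in range(m)) for j in range(n)]
    x = [1.0] * n  # strictly positive starting image
    for _ in range(n_iter):
        proj = [sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        ratio = [y[i] / proj[i] if proj[i] > 0 else 0.0 for i in range(m)]
        x = [x[j] / sens[j] * sum(A[i][j] * ratio[i] for i in range(m))
             for j in range(n)]
    return x

# Hypothetical system: three parallel-hole bins that see the whole object,
# plus one cone-beam bin that sees only the central voxel, with higher sensitivity.
A_parallel = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0]]
A_cone = [[0.0, 2.0, 0.0]]
A = A_parallel + A_cone  # stacking rows combines both acquisitions

x_true = [4.0, 2.0, 6.0]
y = [sum(a * x for a, x in zip(row, x_true)) for row in A]  # noise-free counts
x_hat = mlem(A, y)
print(x_hat)
```

In this toy setup the parallel-hole rows cover the voxels that the truncated cone-beam row cannot see, which mirrors the motivation given in the abstract.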
International Nuclear Information System (INIS)
1976-01-01
A tomograph is discussed that is capable of gathering divergent radiation and reconstructing it into signal profiles or images, each corresponding to a beam of parallel rays. This may eliminate the interfering point dispersion function that normally occurs.
International Nuclear Information System (INIS)
Ito, Satoshi; Kawawa, Yasuhiro; Yamada, Yoshifumi
2010-01-01
We propose an image reconstruction technique in which parallel image reconstruction is performed based on the sensitivity encoding (SENSE) algorithm using only a single set of signals. The signal obtained in the phase-scrambling Fourier transform (PSFT) imaging technique can be transformed to the signal described by the Fresnel transform of the objects, which is known as the diffracted wave-front equation of the object in acoustics or optics. Since the Fresnel transform is a convolution integral on the object space, the space where the PSFT signal exists can be considered as both in the Fourier domain and in the object domain. This notable feature indicates that weighting functions corresponding to the sensitivity of radiofrequency (RF) coils can be approximately given in the PSFT signal space. Therefore, we can obtain two folded images from a single set of signals with different weighting functions, and image reconstruction based on the SENSE parallel imaging algorithm is possible using a series of folded images. Simulation and experimental studies showed that almost alias-free images can be synthesized using a single signal that does not satisfy the sampling theorem. (author)
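The SENSE unfolding step on which this approach relies can be sketched for the simplest case: two folded images and two overlapping pixels per folded location. This is the standard textbook formulation, not the PSFT-specific procedure of the paper; the real-valued weights and pixel values are hypothetical, and in the paper's setting the weighting functions applied in the signal space would play the role of the coil sensitivities.

```python
def sense_unfold_pair(aliased, sens):
    """Separate two overlapped pixels from two folded measurements.

    aliased[c] is the folded value seen with weighting c;
    sens[c][k] is weighting c at overlapped pixel k.
    Solves the 2x2 linear system with Cramer's rule.
    """
    (s00, s01), (s10, s11) = sens
    det = s00 * s11 - s01 * s10
    if abs(det) < 1e-12:
        raise ValueError("weightings are not distinct enough to unfold")
    rho0 = (aliased[0] * s11 - s01 * aliased[1]) / det
    rho1 = (s00 * aliased[1] - s10 * aliased[0]) / det
    return rho0, rho1

# Hypothetical weights and true pixel values.
sens = [[1.0, 0.4], [0.3, 0.9]]
rho_true = (3.0, 5.0)
aliased = [sens[c][0] * rho_true[0] + sens[c][1] * rho_true[1] for c in range(2)]
rho = sense_unfold_pair(aliased, sens)
print(rho)
```

The unfolding only works when the two weightings differ enough at the overlapped pixels, which is why the determinant check is included.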
International Nuclear Information System (INIS)
Egger, M.L.; Scheurer, A.H.; Joseph, C.
1996-01-01
The issue of long reconstruction times in PET has been addressed from several points of view, resulting in an affordable dedicated system capable of handling routine 3D reconstruction in a few minutes per frame: on the hardware side using fast processors and a parallel architecture, and on the software side, using efficient implementations of computationally less intensive algorithms. Execution times obtained for the PRT-1 data set on a parallel system of five hybrid nodes, each combining an Alpha processor for computation and a transputer for communication, are the following (256 sinograms of 96 views by 128 radial samples): Ramp algorithm 56 s, Favor 81 s and reprojection algorithm of Kinahan and Rogers 187 s. The implementation of fast rebinning algorithms has shown our hardware platform to become communications-limited; they execute faster on a conventional single-processor Alpha workstation: single-slice rebinning 7 s, Fourier rebinning 22 s, 2D filtered backprojection 5 s. The scalability of the system has been demonstrated, and a saturation effect at network sizes above ten nodes has become visible; new T9000-based products lifting most of the constraints on network topology and link throughput are expected to result in improved parallel efficiency and scalability properties
Chen, Ying; Balla, Apuroop; Rayford II, Cleveland E; Zhou, Weihua; Fang, Jian; Cong, Linlin
2010-01-01
Digital tomosynthesis is a novel technology that has been developed for various clinical applications. A parallel imaging configuration is utilised in a few tomosynthesis imaging areas such as digital chest tomosynthesis. Recently, parallel imaging configurations for breast tomosynthesis have begun to appear as well. In this paper, we present a computational analysis of impulse response characterisation as the starting point of our research efforts to optimise parallel imaging configurations. Results suggest that impulse response computational analysis is an effective method to compare and optimise imaging configurations.
International Nuclear Information System (INIS)
Tavares, R S; Tsuzuki, M S G; Martins, T C
2012-01-01
Electrical Impedance Tomography (EIT) is an imaging technique that attempts to reconstruct the conductivity distribution inside an object from electrical currents and potentials applied and measured at its surface. The EIT reconstruction problem is approached as an optimization problem, where the difference between the simulated and measured distributions must be minimized. This optimization problem can be solved using Simulated Annealing (SA), but at a high computational cost. To reduce the computational load, it is possible to use an incomplete evaluation of the objective function. This algorithm was shown to exhibit an outside-in behavior, determining the impedance of the external elements first, similar to a layer stripping algorithm. A new outside-in heuristic that makes use of this property is proposed. The paper also presents the impact of using a GPU to parallelize matrix-vector multiplication and triangular solvers. Results with experimental data are presented. The outside-in heuristic was shown to be faster than the conventional SA algorithm.
Duan, Jizhong; Liu, Yu; Jing, Peiguang
2018-02-01
Self-consistent parallel imaging (SPIRiT) is an auto-calibrating model for the reconstruction of parallel magnetic resonance imaging, which can be formulated as a regularized SPIRiT problem. The Projection Onto Convex Sets (POCS) method has been used to solve the formulated regularized SPIRiT problem. However, the quality of the reconstructed image still needs to be improved. Though methods such as NonLinear Conjugate Gradients (NLCG) can achieve higher spatial resolution, these methods demand very complex computation and converge slowly. In this paper, we propose a new algorithm to solve the formulated Cartesian SPIRiT problem with the JTV and JL1 regularization terms. The proposed algorithm uses the operator splitting (OS) technique to decompose the problem into a gradient problem and a denoising problem with two regularization terms, which is solved by our proposed split Bregman based denoising algorithm, and adopts the Barzilai and Borwein method to update the step size. Simulation experiments on two in vivo data sets demonstrate that the proposed algorithm is 1.3 times faster than ADMM for datasets with 8 channels. In particular, it is 2 times faster than ADMM for the dataset with 32 channels. Copyright © 2017 Elsevier Inc. All rights reserved.
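The Barzilai and Borwein step-size rule mentioned in this abstract can be sketched on a small quadratic problem. This shows only the step-size update inside plain gradient descent, not the operator-splitting algorithm of the paper; the test function, starting point and initial step are assumptions chosen for illustration. With s the change in the iterate and y the change in the gradient, the step is alpha = (s·s)/(s·y).

```python
def bb_gradient_descent(grad, x0, alpha0=0.01, n_iter=100, tol=1e-20):
    """Gradient descent whose step size follows the Barzilai-Borwein rule."""
    x = list(x0)
    g = grad(x)
    alpha = alpha0  # the first step has no history, so a fixed value is assumed
    for _ in range(n_iter):
        x_new = [xi - alpha * gi for xi, gi in zip(x, g)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]   # change in the iterate
        y = [a - b for a, b in zip(g_new, g)]   # change in the gradient
        sy = sum(si * yi for si, yi in zip(s, y))
        x, g = x_new, g_new
        # Stop when the squared gradient norm is tiny or the BB ratio is undefined.
        if sum(gi * gi for gi in g) < tol or sy <= 0.0:
            break
        alpha = sum(si * si for si in s) / sy
    return x

def grad_quadratic(x):
    # Gradient of f(x) = 0.5*(x0^2 + 10*x1^2) - x0 - 20*x1, minimised at (1, 2).
    return [x[0] - 1.0, 10.0 * x[1] - 20.0]

x_min = bb_gradient_descent(grad_quadratic, [0.0, 0.0])
print(x_min)
```

The appeal of this rule is that it adapts the step to local curvature using only quantities a gradient method already computes, with no line search.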
Xu, Zheng; Wang, Sheng; Li, Yeqing; Zhu, Feiyun; Huang, Junzhou
2018-02-08
The most recent history of parallel Magnetic Resonance Imaging (pMRI) has in large part been devoted to finding ways to reduce acquisition time. While the joint total variation (JTV) regularized model has been demonstrated to be a powerful tool for increasing sampling speed in pMRI, the major bottleneck is the inefficiency of the optimization method. Whereas all present state-of-the-art optimization methods for the JTV model reach only a sublinear convergence rate, in this paper we improve performance by proposing a linearly convergent optimization method for the JTV model. The proposed method is based on the Iterative Reweighted Least Squares algorithm. Due to the complexity of the tangled JTV objective, we design a novel preconditioner to further accelerate the proposed method. Extensive experiments demonstrate the superior performance of the proposed algorithm for pMRI regarding both accuracy and efficiency compared with state-of-the-art methods.
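The Iterative Reweighted Least Squares idea can be sketched on a much simpler problem than the JTV model of the paper: one-dimensional total variation denoising. This is an illustrative assumption, not the authors' preconditioned method. Each iteration replaces the absolute differences by weighted squares, with weights taken from the current estimate, and solves the resulting tridiagonal linear system exactly. The values of lam, eps and the signal are hypothetical.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] multiplies x[i-1] and upper[i] multiplies x[i+1]."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def smoothed_objective(x, b, lam, eps):
    """0.5*||x - b||^2 + lam * sum sqrt((x[i+1] - x[i])^2 + eps)."""
    fit = 0.5 * sum((xi - bi) ** 2 for xi, bi in zip(x, b))
    tv = sum(math.sqrt((x[i + 1] - x[i]) ** 2 + eps) for i in range(len(x) - 1))
    return fit + lam * tv

def irls_tv_denoise(b, lam=0.2, eps=1e-6, n_iter=30):
    n = len(b)
    x = list(b)
    for _ in range(n_iter):
        # Reweighting: each difference gets weight 1/sqrt(d^2 + eps).
        w = [1.0 / math.sqrt((x[i + 1] - x[i]) ** 2 + eps) for i in range(n - 1)]
        # Normal equations (I + lam * D^T W D) x = b are tridiagonal.
        diag = [1.0 + lam * ((w[i - 1] if i > 0 else 0.0) + (w[i] if i < n - 1 else 0.0))
                for i in range(n)]
        lower = [0.0] + [-lam * w[i - 1] for i in range(1, n)]
        upper = [-lam * w[i] for i in range(n - 1)] + [0.0]
        x = solve_tridiagonal(lower, diag, upper, b)
    return x

noisy = [0.1, -0.2, 0.05, 0.0, 1.1, 0.9, 1.05, 1.0]  # hypothetical noisy step signal
denoised = irls_tv_denoise(noisy)
print(denoised)
```

In larger imaging problems the weighted least-squares step cannot be solved directly like this and is handled iteratively, which is where a preconditioner such as the one the abstract mentions becomes relevant.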
International Nuclear Information System (INIS)
Park, Sook Hee
2001-02-01
This thesis implements and analyzes parallel and networked computing libraries based on multiprocessor computer architectures as well as networked computers, aiming to improve the computation speed of ET (Electrical Tomography) systems, which require enormous CPU time to reconstruct the unknown internal state of the target object. As a typical tomography technology, ET partitions the cross-section of the target object into tiny elements and calculates their resistivity from signal values measured at the boundary electrodes surrounding the surface of the object after injecting a predetermined current pattern through the object. The number of elements is determined by the trade-off between the accuracy of the reconstructed image and the computation time. As the elements become finer, the number of elements increases and the system can obtain a better image. However, the reconstruction time increases polynomially with the number of partitioned elements, since the procedure consists of a number of time-consuming matrix operations such as multiplication, inverse, pseudo-inverse, Jacobian computation and so on. Consequently, the demand for improving computation speed via multiple processors inevitably grows. Moreover, currently released PCs can be equipped with up to 4 CPUs connected to shared memory, while some operating systems enable an application process to benefit from such a computer by allocating a threaded job to each CPU, resulting in concurrent processing. In addition, a networked or cluster computing environment is commonly available to almost every computer that supports a communication protocol and is connected to a local or global network. After partitioning the given job (numerical operation), each CPU or computer calculates a partial result independently, and the results are merged via common memory to produce the final result. It is desirable to adopt the commonly used library such as Matlab to
International Nuclear Information System (INIS)
Leggett, C; Jackson, K; Tatarkhanov, M; Yao, Y; Binet, S; Levinthal, D
2011-01-01
Thermal limitations have forced CPU manufacturers to shift from simply increasing clock speeds to improve processor performance, to producing chip designs with multi- and many-core architectures. Further, the cores themselves can run multiple threads as a zero-overhead context switch, allowing low-level resource sharing (Intel Hyperthreading). To maximize bandwidth and minimize memory latency, memory access has become non-uniform (NUMA). As manufacturers add more cores to each chip, a careful understanding of the underlying architecture is required in order to fully utilize the available resources. We present AthenaMP and the Atlas event loop manager, the driver of the simulation and reconstruction engines, which have been rewritten to make use of multiple cores by means of event-based parallelism and final-stage I/O synchronization. However, initial studies on 8 and 16 core Intel architectures have shown marked non-linearities as parallel process counts increase, with as much as 30% reductions in event throughput in some scenarios. Since the Intel Nehalem architecture (both Gainestown and Westmere) will be the most common choice for the next round of hardware procurements, an understanding of these scaling issues is essential. Using hardware-based event counters and Intel's Performance Tuning Utility, we have studied the performance bottlenecks at the hardware level, and discovered optimization schemes to maximize processor throughput. We have also produced optimization mechanisms, common to all large experiments, that address the extreme nature of today's HEP code, which due to its size places huge burdens on the memory infrastructure of today's processors.
Shibata, Koichi; Notohara, Daisuke; Sakai, Takihito
2014-11-01
Parallel-scanning tomosynthesis (PS-TS) is a novel technique that fuses the slot scanning technique and the conventional tomosynthesis (TS) technique. This approach allows one to obtain long-view tomosynthesis images in addition to normally sized tomosynthesis images, even when using a system that has no linear tomographic scanning function. The reconstruction technique and an evaluation of the resulting image quality for PS-TS are described in this paper. The PS-TS image-reconstruction technique consists of several steps: (1) the projection images are divided into strips, (2) the strips are stitched together to construct images corresponding to the reconstruction plane, (3) the stitched images are filtered, and (4) the filtered stitched images are back-projected. In the case of PS-TS using the fixed-focus reconstruction method (PS-TS-F), one set of stitched images is used for the reconstruction planes at all heights, thus avoiding the necessity of repeating steps (1)-(3). A physical evaluation of the image quality of PS-TS-F compared with that of the conventional linear TS was performed using an R/F table (Sonialvision safire, Shimadzu Corp., Kyoto, Japan). The tomographic plane with the best theoretical spatial resolution (the in-focus plane, IFP) was set at a height of 100 mm from the table top by adjusting the reconstruction program. First, the spatial frequency response was evaluated at heights of -100, -50, 0, 50, 100, and 150 mm from the IFP using the edge of a 0.3-mm-thick copper plate. Second, the spatial resolution at each height was visually evaluated using an x-ray test pattern (Model No. 38, PTW Freiburg, Germany). Third, the slice sensitivity at each height was evaluated via the wire method using a 0.1-mm-diameter tungsten wire. Phantom studies using a knee phantom and a whole-body phantom were also performed. The spatial frequency response of PS-TS-F yielded the best results at the IFP and degraded slightly as the distance from the IFP increased. A
Robson, Philip M; Grant, Aaron K; Madhuranthakam, Ananth J; Lattanzi, Riccardo; Sodickson, Daniel K; McKenzie, Charles A
2008-10-01
Parallel imaging reconstructions result in spatially varying noise amplification characterized by the g-factor, precluding conventional measurements of noise from the final image. A simple Monte Carlo based method is proposed for all linear image reconstruction algorithms, which allows measurement of signal-to-noise ratio and g-factor and is demonstrated for SENSE and GRAPPA reconstructions for accelerated acquisitions that have not previously been amenable to such assessment. Only a simple "prescan" measurement of noise amplitude and correlation in the phased-array receiver, and a single accelerated image acquisition are required, allowing robust assessment of signal-to-noise ratio and g-factor. The "pseudo multiple replica" method has been rigorously validated in phantoms and in vivo, showing excellent agreement with true multiple replica and analytical methods. This method is universally applicable to the parallel imaging reconstruction techniques used in clinical applications and will allow pixel-by-pixel image noise measurements for all parallel imaging strategies, allowing quantitative comparison between arbitrary k-space trajectories, image reconstruction, or noise conditioning techniques. (c) 2008 Wiley-Liss, Inc.
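The Monte Carlo idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the use of the real part of the image to measure noise, and the replica count are assumptions made for illustration. The only requirements taken from the abstract are a linear reconstruction, a prescan noise covariance for the receiver array, and a single accelerated acquisition.

```python
import numpy as np

def pseudo_multiple_replica(recon, kspace, noise_cov, n_replicas=200, seed=0):
    """Estimate pixel-wise SNR and noise for a linear reconstruction.

    recon     : callable mapping multi-coil k-space (coils, ...) to an image
                (hypothetical interface, assumed to be linear)
    kspace    : acquired multi-coil data, coil axis first
    noise_cov : coil noise covariance matrix from a prescan, shape (coils, coils)
    """
    rng = np.random.default_rng(seed)
    # The Cholesky factor colours white noise with the measured coil covariance.
    chol = np.linalg.cholesky(noise_cov)
    replicas = []
    for _ in range(n_replicas):
        white = (rng.standard_normal(kspace.shape)
                 + 1j * rng.standard_normal(kspace.shape)) / np.sqrt(2)
        noise = np.tensordot(chol, white, axes=(1, 0))
        replicas.append(recon(kspace + noise))
    replicas = np.stack(replicas)
    # Assumption: noise is measured as the std of the real part across replicas.
    noise_map = np.std(replicas.real, axis=0)
    snr_map = np.abs(recon(kspace)) / noise_map
    return snr_map, noise_map

def g_factor(noise_accel, noise_full, acceleration):
    """g = noise_accel / (noise_full * sqrt(R)), assuming both noise maps
    come from reconstructions with the same image scaling."""
    return noise_accel / (noise_full * np.sqrt(acceleration))
```

Because the reconstruction is linear, the spread of the replicas depends only on the injected noise, which is why a single accelerated acquisition is enough in this sketch.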
Schramm, Georg; Holler, Martin; Rezaei, Ahmadreza; Vunckx, Kathleen; Knoll, Florian; Bredies, Kristian; Boada, Fernando; Nuyts, Johan
2018-02-01
In this article, we evaluate Parallel Level Sets (PLS) and Bowsher's method as segmentation-free anatomical priors for regularized brain positron emission tomography (PET) reconstruction. We derive the proximity operators for two PLS priors and use the EM-TV algorithm in combination with the first order primal-dual algorithm by Chambolle and Pock to solve the non-smooth optimization problem for PET reconstruction with PLS regularization. In addition, we compare the performance of two PLS versions against the symmetric and asymmetric Bowsher priors with quadratic and relative difference penalty function. For this aim, we first evaluate reconstructions of 30 noise realizations of simulated PET data derived from a real hybrid positron emission tomography/magnetic resonance imaging (PET/MR) acquisition in terms of regional bias and noise. Second, we evaluate reconstructions of a real brain PET/MR data set acquired on a GE Signa time-of-flight PET/MR in a similar way. The reconstructions of simulated and real 3D PET/MR data show that all priors were superior to post-smoothed maximum likelihood expectation maximization with ordered subsets (OSEM) in terms of bias-noise characteristics in different regions of interest where the PET uptake follows anatomical boundaries. Our implementation of the asymmetric Bowsher prior showed slightly superior performance compared with the two versions of PLS and the symmetric Bowsher prior. At very high regularization weights, all investigated anatomical priors suffer from the transfer of non-shared gradients.
International Nuclear Information System (INIS)
Chen, C.M.; Lee, S.Y.
1995-01-01
The EM algorithm promises an estimated image with maximal likelihood for 3D PET image reconstruction. However, due to its long computation time, the EM algorithm has not been widely used in practice. While several parallel implementations of the EM algorithm have been developed to make it feasible, they do not guarantee optimal parallelization efficiency. In this paper, the authors propose a new parallel EM algorithm which maximizes performance by optimizing data replication on a mesh-connected message-passing multiprocessor. To optimize data replication, the authors have formally derived the optimal allocation of shared data, group sizes, integration and broadcasting of replicated data, as well as the scheduling of shared data accesses. The proposed parallel EM algorithm has been implemented on an iPSC/860 with 16 PEs. The experimental and theoretical results, which are consistent with each other, have shown that the proposed parallel EM algorithm could improve performance substantially over implementations using unoptimized data replication.
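For orientation, the serial update that such parallel schemes distribute can be sketched as follows. This is the textbook MLEM iteration with a dense system matrix, written in Python as an illustration only; it does not reproduce the data-replication strategy of the paper, and the function name and arguments are hypothetical.

```python
import numpy as np

def mlem(system_matrix, counts, n_iter=50):
    """Textbook MLEM: x <- x * A^T(y / (A x)) / A^T 1.

    system_matrix : (n_detector_bins, n_voxels) non-negative array A
    counts        : measured counts y, length n_detector_bins
    """
    A = np.asarray(system_matrix, dtype=float)
    y = np.asarray(counts, dtype=float)
    sensitivity = A.sum(axis=0)          # A^T 1, per-voxel sensitivity
    x = np.ones(A.shape[1])              # uniform positive starting image
    for _ in range(n_iter):
        expected = A @ x                 # forward projection
        # Bins with zero expected counts contribute nothing to the update.
        ratio = np.divide(y, expected, out=np.zeros_like(y), where=expected > 0)
        x = x * (A.T @ ratio) / np.maximum(sensitivity, 1e-12)
    return x
```

The forward projection `A @ x` and the back projection `A.T @ ratio` dominate the cost of each iteration, so they are the natural operations to partition across processing elements.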
CERN. Geneva
2012-01-01
side bus or processor interconnections. Parallelism can only result in a performance gain if memory usage is optimized, memory locality is improved and communication between threads is minimized. But the domain of concurrent programming has become a field for highly skilled experts, as the implementation of multithreading is difficult, error-prone and labor-intensive. A full re-implementation for parallel execution of existing offline frameworks, like AliRoot in ALICE, is thus unaffordable. An alternative method is to use a semi-automatic source-to-source transformation to obtain a simple parallel design with almost no interference between threads. This reduces the need of rewriting the develop...
Maldonado Puente, Bryan Patricio
2014-01-01
The inner detector of the ATLAS experiment has two types of silicon detectors used for tracking: the Pixel Detector and the SCT (semiconductor tracker). Once a proton-proton collision occurs, the resulting particles pass through these detectors and are recorded as hits on the detector surfaces. A medium to high energy particle passes through seven different surfaces of the two detectors, leaving seven hits, while lower energy particles can leave many more hits as they circle through the detector. For a typical event under the expected operational conditions, 30 000 hits on average are recorded by the sensors. Only high energy particles are of interest for physics analysis and are taken into account for the path reconstruction; thus, a filtering process helps to discard the low energy particles produced in the collision. The following report presents a solution for increasing the speed of the filtering process in the path reconstruction algorithm.
Fraser, Graham M.; Goldman, Daniel; Ellis, Christopher G.
2013-01-01
Objective: We compare Reconstructed Microvascular Networks (RMN) to Parallel Capillary Arrays (PCA) under several simulated physiological conditions to determine how the use of different vascular geometry affects oxygen transport solutions. Methods: Three discrete networks were reconstructed from intravital video microscopy of rat skeletal muscle (84×168×342 μm, 70×157×268 μm and 65×240×571 μm) and hemodynamic measurements were made in individual capillaries. PCAs were created based on statistical measurements from RMNs. Blood flow and O2 transport models were applied and the resulting solutions for RMN and PCA models were compared under 4 conditions (rest, exercise, ischemia and hypoxia). Results: Predicted tissue PO2 was consistently lower in all RMN simulations compared to the paired PCA. PO2 values for 3D reconstructions at rest were 28.2±4.8, 28.1±3.5, and 33.0±4.5 mmHg for networks I, II, and III compared to the PCA mean values of 31.2±4.5, 30.6±3.4, and 33.8±4.6 mmHg. Simulated exercise yielded mean tissue PO2 in the RMN of 10.1±5.4, 12.6±5.7, and 19.7±5.7 mmHg compared to 15.3±7.3, 18.8±5.3, and 21.7±6.0 mmHg in PCA. Conclusions: These findings suggest that volume-matched PCA yield different results compared to reconstructed microvascular geometries when applied to O2 transport modeling, the predominant characteristic of this difference being an overestimate of mean tissue PO2. Despite this limitation, PCA models remain important for theoretical studies as they produce PO2 distributions with similar shape and parameter dependence as RMN. PMID:23841679
International Nuclear Information System (INIS)
Qureshi, S.A.; Mirza, S.M.; Arif, M.
2007-01-01
This paper presents the effect of the number of projections on inverse Radon transform (IRT) estimation using the filtered back-projection (FBP) technique for parallel beam transmission tomography. The head phantom and the lung phantom have been used in this work. The filters used in this study include the Ram-Lak, Shepp-Logan, Cosine, Hamming and Hanning filters. The slices have been reconstructed with an increasing number of projections through parallel beam transmission tomography, keeping the projections uniformly distributed. The Euclidean and mean squared errors and the peak signal-to-noise ratio (PSNR) have been analyzed for their sensitivity as functions of the number of projections. It has been found that image quality improves with the number of projections, but at the cost of computer time. The error is minimized, giving the best approximation of the inverse Radon transform (IRT), as the number of projections is increased. The value of PSNR has been found to increase from 8.20 to 24.53 dB as the number of projections is raised from 5 to 180 for the head phantom. (author)
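The three figures of merit named in the abstract above can be computed as in the following Python sketch. The abstract does not state which peak value enters the PSNR, so the choice of the reference image maximum is an assumption, as is the function name.

```python
import numpy as np

def reconstruction_errors(reference, reconstruction):
    """Return (Euclidean error, mean squared error, PSNR in dB).

    Assumption: the PSNR peak is the maximum of the reference image.
    """
    ref = np.asarray(reference, dtype=float)
    rec = np.asarray(reconstruction, dtype=float)
    diff = ref - rec
    euclidean = float(np.sqrt(np.sum(diff ** 2)))
    mse = float(np.mean(diff ** 2))
    peak = float(ref.max())
    # A perfect reconstruction has zero MSE and therefore unbounded PSNR.
    psnr_db = float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))
    return euclidean, mse, psnr_db
```

Evaluating this function on reconstructions made from increasing numbers of projections would give curves of the kind the paper analyzes.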
Wan, Shixiang; Zou, Quan
2017-01-01
Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. The rapid growth of next-generation sequencing has led to a shortage of efficient approaches for aligning ultra-large sets of biological sequences of different types. Distributed and parallel computing represents a crucial technique for accelerating ultra-large (e.g. files of more than 1 GB) sequence analyses. Based on HAlign and the Spark distributed computing system, we implement a highly cost-efficient and time-efficient tool, HAlign-II, to address ultra-large multiple biological sequence alignment and phylogenetic tree construction. Experiments on large-scale DNA and protein data sets, with files of more than 1 GB, showed that HAlign-II could save time and space and that it outperformed current software tools. HAlign-II can efficiently carry out MSA and construct phylogenetic trees with ultra-large numbers of biological sequences. HAlign-II shows extremely high memory efficiency and scales well with increases in computing resources. HAlign-II provides a user-friendly web server based on our distributed computing infrastructure. HAlign-II, with open-source code and datasets, is available at http://lab.malab.cn/soft/halign.
Deshmane, Anagha; Gulani, Vikas; Griswold, Mark A; Seiberlich, Nicole
2012-07-01
Parallel imaging is a robust method for accelerating the acquisition of magnetic resonance imaging (MRI) data, and has made possible many new applications of MR imaging. Parallel imaging works by acquiring a reduced amount of k-space data with an array of receiver coils. These undersampled data can be acquired more quickly, but the undersampling leads to aliased images. One of several parallel imaging algorithms can then be used to reconstruct artifact-free images from either the aliased images (SENSE-type reconstruction) or from the undersampled data (GRAPPA-type reconstruction). The advantages of parallel imaging in a clinical setting include faster image acquisition, which can be used, for instance, to shorten breath-hold times resulting in fewer motion-corrupted examinations. In this article the basic concepts behind parallel imaging are introduced. The relationship between undersampling and aliasing is discussed and two commonly used parallel imaging methods, SENSE and GRAPPA, are explained in detail. Examples of artifacts arising from parallel imaging are shown and ways to detect and mitigate these artifacts are described. Finally, several current applications of parallel imaging are presented and recent advancements and promising research in parallel imaging are briefly reviewed. Copyright © 2012 Wiley Periodicals, Inc.
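The image-domain (SENSE-type) unfolding mentioned above can be sketched for the simplest case. The Python example below assumes an acceleration factor of 2 along the first image axis, so that each aliased pixel is the sum of two true pixels half a field of view apart, weighted by the coil sensitivities. It also assumes uncorrelated coil noise of equal variance, so plain least squares is used. The function name and array layout are hypothetical.

```python
import numpy as np

def sense_unfold_r2(aliased, sens):
    """Unfold R=2 aliased coil images by per-pixel least squares.

    aliased : (coils, N/2, M) aliased coil images
    sens    : (coils, N, M) coil sensitivity maps
    """
    n_coils, half, width = aliased.shape
    image = np.zeros((2 * half, width), dtype=complex)
    for y in range(half):
        for x in range(width):
            # Encoding matrix: one column per pixel folded onto (y, x).
            E = np.stack([sens[:, y, x], sens[:, y + half, x]], axis=1)
            m, *_ = np.linalg.lstsq(E, aliased[:, y, x], rcond=None)
            image[y, x], image[y + half, x] = m
    return image
```

With at least two coils whose sensitivities differ at the two folded locations, the 2-column system is solvable; poorly conditioned systems are where noise amplification (the g-factor) becomes large.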
International Nuclear Information System (INIS)
Ragoschke-Schumm, Andreas; Schmidt, Peter; Mayer, Thomas E.; Schumm, Julia; Reimann, Georg; Mentzel, Hans-Joachim; Kaiser, Werner A.
2011-01-01
The cervical spine is prone to artefacts in T2 MR-imaging due to patient movements and cerebrospinal fluid flow. The periodically rotated overlapping parallel lines with enhanced reconstruction (PROPELLER/BLADE) acquisition method was developed to reduce motion artefacts. We sought to determine if T2-BLADE is superior to T2-TSE with conventional k-space reading. Twenty-five patients were examined using a 1.5 T MR-scanner. T2-weighted imaging of the cervical spine in sagittal and axial orientation using conventional or BLADE k-space reading was performed. Spinal cord, subarachnoid space, vertebrae and discs were evaluated by two independent observers using a scale from 0 (non-diagnostic) to 3 (excellent). Interobserver agreement was assessed with Cohen's kappa. Results of the Mann-Whitney U test with p < 0.05 were regarded as significant. Furthermore, the investigators were asked for a subjective evaluation in consensus. An overall interobserver agreement of κ = 0.91 was obtained. Comparison of sagittal images showed better values for all investigated structures in T2-BLADE: spinal cord (TSE/BLADE: 1.52/2.04; p < 0.001), subarachnoid space (1.36/2.06; p < 0.001) and vertebrae/discs (1.66/2.86; p < 0.001). Comparison of axial images showed better values in T2-BLADE for spinal cord (1.68/1.86; p = 0.149) and vertebrae/discs (1.0/1.96; p < 0.001), while the subarachnoid space was better evaluated with conventional T2-TSE (1.94/1.12; p < 0.001). In sagittal orientation, motion and CSF-flow artefacts were reduced in T2-BLADE. In axial orientation, however, CSF-flow artefacts were pronounced in T2-BLADE. The image quality of the sagittal T2-BLADE sequences was significantly better than that of T2-TSE, and the images were acquired in less time. In axial orientation, increased CSF-flow artefacts may reduce diagnostic accuracy for structures in the subarachnoid space. (orig.)
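As a reminder of how the agreement statistic reported above is defined, here is a minimal Python sketch of unweighted Cohen's kappa for two raters scoring on an integer scale such as 0 to 3. Whether the study used the unweighted or a weighted variant is not stated in the abstract, so the unweighted form is an assumption, and the function name is hypothetical.

```python
import numpy as np

def cohens_kappa(rater_a, rater_b, n_categories):
    """Unweighted Cohen's kappa for integer ratings in 0..n_categories-1."""
    confusion = np.zeros((n_categories, n_categories))
    for i, j in zip(rater_a, rater_b):
        confusion[i, j] += 1                      # rows: rater A, columns: rater B
    total = confusion.sum()
    observed = np.trace(confusion) / total        # fraction of identical scores
    # Chance agreement from the two raters' marginal score frequencies.
    expected = (confusion.sum(axis=1) @ confusion.sum(axis=0)) / total ** 2
    return (observed - expected) / (1.0 - expected)
```

The statistic is undefined when chance agreement equals 1 (both raters always give the same single score), a case this sketch does not guard against.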
Ling, Hangjian; Katz, Joseph
2014-09-20
This paper deals with two issues affecting the application of digital holographic microscopy (DHM) for measuring the spatial distribution of particles in a dense suspension, namely discriminating between real and virtual images and accurate detection of the particle center. Previous methods to separate real and virtual fields have involved applications of multiple phase-shifted holograms, combining reconstructed fields of multiple axially displaced holograms, and analysis of intensity distributions of weakly scattering objects. Here, we introduce a simple approach based on simultaneously recording two in-line holograms, whose planes are separated by a short distance from each other. This distance is chosen to be longer than the elongated trace of the particle. During reconstruction, the real images overlap, whereas the virtual images are displaced by twice the distance between hologram planes. Data analysis is based on correlating the spatial intensity distributions of the two reconstructed fields to measure displacement between traces. This method has been implemented for both synthetic particles and a dense suspension of 2 μm particles. The correlation analysis readily discriminates between real and virtual images of a sample containing more than 1300 particles. Consequently, we can now implement DHM for three-dimensional tracking of particles when the hologram plane is located inside the sample volume. Spatial correlations within the same reconstructed field are also used to improve the detection of the axial location of the particle center, extending previously introduced procedures to suspensions of microscopic particles. For each cross section within a particle trace, we sum the correlations among intensity distributions in all planes located symmetrically on both sides of the section. This cumulative correlation has a sharp peak at the particle center. Using both synthetic and recorded particle fields, we show that the uncertainty in localizing the axial
Crockett, Thomas W.
1995-01-01
This article provides a broad introduction to the subject of parallel rendering, encompassing both hardware and software systems. The focus is on the underlying concepts and the issues which arise in the design of parallel rendering algorithms and systems. We examine the different types of parallelism and how they can be applied in rendering applications. Concepts from parallel computing, such as data decomposition, task granularity, scalability, and load balancing, are considered in relation to the rendering problem. We also explore concepts from computer graphics, such as coherence and projection, which have a significant impact on the structure of parallel rendering algorithms. Our survey covers a number of practical considerations as well, including the choice of architectural platform, communication and memory requirements, and the problem of image assembly and display. We illustrate the discussion with numerous examples from the parallel rendering literature, representing most of the principal rendering methods currently used in computer graphics.
1982-01-01
Parallel Computations focuses on parallel computation, with emphasis on algorithms used in a variety of numerical and physical applications and for many different types of parallel computers. Topics covered range from vectorization of fast Fourier transforms (FFTs) and of the incomplete Cholesky conjugate gradient (ICCG) algorithm on the Cray-1 to calculation of table lookups and piecewise functions. Single tridiagonal linear systems and vectorized computation of reactive flow are also discussed. Comprised of 13 chapters, this volume begins by classifying parallel computers and describing techn
Parallel magnetic resonance imaging
International Nuclear Information System (INIS)
Larkman, David J; Nunes, Rita G
2007-01-01
Parallel imaging has been the single biggest innovation in magnetic resonance imaging in the last decade. The use of multiple receiver coils to augment the time-consuming Fourier encoding has reduced acquisition times significantly. This increase in speed comes at a time when other approaches to acquisition time reduction were reaching engineering and human limits. A brief summary of spatial encoding in MRI is followed by an introduction to the problem parallel imaging is designed to solve. There are a large number of parallel reconstruction algorithms; this article reviews a cross-section, SENSE, SMASH, g-SMASH and GRAPPA, selected to demonstrate the different approaches. Theoretical (the g-factor) and practical (coil design) limits to acquisition speed are reviewed. The practical implementation of parallel imaging is also discussed, in particular coil calibration. How to recognize potential failure modes and their associated artefacts is shown. Well-established applications including angiography, cardiac imaging and applications using echo planar imaging are reviewed and we discuss what makes a good application for parallel imaging. Finally, active research areas where parallel imaging is being used to improve data quality by repairing artefacted images are also reviewed. (invited topical review)
Casanova, Henri; Robert, Yves
2008-01-01
"…The authors of the present book, who have extensive credentials in both research and instruction in the area of parallelism, present a sound, principled treatment of parallel algorithms. … This book is very well written and extremely well designed from an instructional point of view. … The authors have created an instructive and fascinating text. The book will serve researchers as well as instructors who need a solid, readable text for a course on parallelism in computing. Indeed, for anyone who wants an understandable text from which to acquire a current, rigorous, and broad vi
Parallel reconstruction in accelerated multivoxel MR spectroscopy
Boer, V. O.; Klomp, D. W. J.; Laterra, J.; Barker, P. B.
Purpose: To develop the simultaneous acquisition of multiple voxels in localized MR spectroscopy (MRS) using sensitivity encoding, allowing reduced total scan time compared to conventional sequential single voxel (SV) acquisition methods. Methods: Dual volume localization was used to simultaneously
International Nuclear Information System (INIS)
Jejcic, A.; Maillard, J.; Maurel, G.; Silva, J.; Wolff-Bacha, F.
1997-01-01
Work in the field of parallel processing has developed as research activity using several numerical Monte Carlo simulations related to basic or applied current problems of nuclear and particle physics. For the applications using the GEANT code, development or improvement work was done on the parts simulating low energy physical phenomena such as radiation, transport and interaction. The problem of actinide burning by means of accelerators was approached using a simulation with the GEANT code. A program for neutron tracking in the range of low energies down to the thermal region has been developed. It is coupled to the GEANT code and permits, in a single pass, the simulation of a hybrid reactor core receiving a proton burst. Other work in this field refers to simulations for nuclear medicine applications, for instance the development of biological probes, the evaluation and characterization of gamma cameras (collimators, crystal thickness), and methods for dosimetric calculations. In particular, these calculations are suited to a geometrical parallelization approach especially adapted to parallel machines of the TN310 type. Other work mentioned in the same field refers to the simulation of electron channelling in crystals and of the beam-beam interaction effect in colliders. The GEANT code was also used to simulate the operation of germanium detectors designed for monitoring natural and artificial radioactivity in the environment.
McCallum, Ethan
2011-01-01
It's tough to argue with R as a high-quality, cross-platform, open source statistical software product, unless you're in the business of crunching Big Data. This concise book introduces you to several strategies for using R to analyze large datasets. You'll learn the basics of Snow, Multicore, Parallel, and some Hadoop-related tools, including how to find them, how to use them, when they work well, and when they don't. With these packages, you can overcome R's single-threaded nature by spreading work across multiple CPUs, or offloading work to multiple machines to address R's memory barrier.
Directory of Open Access Journals (Sweden)
James G. Worner
2017-05-01
James Worner is an Australian-based writer and scholar currently pursuing a PhD at the University of Technology Sydney. His research seeks to expose masculinities lost in the shadow of Australia’s Anzac hegemony while exploring new opportunities for contemporary historiography. He is the recipient of the Doctoral Scholarship in Historical Consciousness at the university’s Australian Centre of Public History and will be hosted by the University of Bologna during 2017 on a doctoral research writing scholarship. ‘Parallel Lines’ is one of a collection of stories, The Shapes of Us, exploring liminal spaces of modern life: class, gender, sexuality, race, religion and education. It looks at lives, like lines, that do not meet but which travel in proximity, simultaneously attracted and repelled. James’ short stories have been published in various journals and anthologies.
New algorithms for parallel MRI
International Nuclear Information System (INIS)
Anzengruber, S; Ramlau, R; Bauer, F; Leitao, A
2008-01-01
Magnetic Resonance Imaging with parallel data acquisition requires algorithms for reconstructing the patient's image from a small number of measured lines of the Fourier domain (k-space). In contrast to well-known algorithms like SENSE and GRAPPA and their variants, we consider the problem as a non-linear inverse problem. To avoid costly derivative evaluations we use Landweber-Kaczmarz iteration, and to improve the overall results we add sparsity constraints.
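The structure of a Landweber-Kaczmarz iteration can be shown on a simplified problem. The Python sketch below handles a linear system split into blocks (for example, one block per receiver coil) and cycles through the blocks, applying one Landweber step per block. This is a deliberate simplification: the paper treats a non-linear problem and adds sparsity constraints, neither of which is reproduced here, and the function name and step-size rule are assumptions.

```python
import numpy as np

def landweber_kaczmarz(blocks, data, n_sweeps=200):
    """Cyclic Landweber steps for the block system A_i x = y_i.

    blocks : list of 2-D arrays A_i, all with the same number of columns
    data   : list of right-hand sides y_i
    """
    x = np.zeros(blocks[0].shape[1], dtype=complex)
    # Step size 1 / ||A_i||_2^2 keeps each block update non-expansive.
    steps = [1.0 / np.linalg.norm(A, 2) ** 2 for A in blocks]
    for _ in range(n_sweeps):
        for A, y, w in zip(blocks, data, steps):
            # Landweber step on block i only (the Kaczmarz cycling).
            x = x + w * (A.conj().T @ (y - A @ x))
    return x
```

Each update needs only one block of the operator and its adjoint, which is the property that makes the scheme attractive when the full problem is expensive to evaluate.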
Parallel imaging with phase scrambling.
Zaitsev, Maxim; Schultz, Gerrit; Hennig, Juergen; Gruetter, Rolf; Gallichan, Daniel
2015-04-01
Most existing methods for accelerated parallel imaging in MRI require additional data, which are used to derive information about the sensitivity profile of each radiofrequency (RF) channel. In this work, a method is presented to avoid the acquisition of separate coil calibration data for accelerated Cartesian trajectories. Quadratic phase is imparted to the image to spread the signals in k-space (aka phase scrambling). By rewriting the Fourier transform as a convolution operation, a window can be introduced to the convolved chirp function, allowing a low-resolution image to be reconstructed from phase-scrambled data without prominent aliasing. This image (for each RF channel) can be used to derive coil sensitivities to drive existing parallel imaging techniques. As a proof of concept, the quadratic phase was applied by introducing an offset to the x² - y² shim and the data were reconstructed using adapted versions of the image space-based sensitivity encoding and GeneRalized Autocalibrating Partially Parallel Acquisitions algorithms. The method is demonstrated in a phantom (1 × 2, 1 × 3, and 2 × 2 acceleration) and in vivo (2 × 2 acceleration) using a 3D gradient echo acquisition. Phase scrambling can be used to perform parallel imaging acceleration without acquisition of separate coil calibration data, demonstrated here for a 3D-Cartesian trajectory. Further research is required to prove the applicability to other 2D and 3D sampling schemes. © 2014 Wiley Periodicals, Inc.
Parallel Programming with Intel Parallel Studio XE
Blair-Chappell, Stephen
2012-01-01
Optimize code for multi-core processors with Intel's Parallel Studio. Parallel programming is rapidly becoming a "must-know" skill for developers. Yet, where to start? This teach-yourself tutorial is an ideal starting point for developers who already know Windows C and C++ and are eager to add parallelism to their code. With a focus on applying tools, techniques, and language extensions to implement parallelism, this essential resource teaches you how to write programs for multicore and leverage the power of multicore in your programs. Sharing hands-on case studies and real-world examples, the
Medical image reconstruction. A conceptual tutorial
International Nuclear Information System (INIS)
Zeng, Gengsheng Lawrence
2010-01-01
"Medical Image Reconstruction: A Conceptual Tutorial" introduces classical and modern image reconstruction technologies, such as two-dimensional (2D) parallel-beam and fan-beam imaging, three-dimensional (3D) parallel ray, parallel plane, and cone-beam imaging. This book presents both analytical and iterative methods of these technologies and their applications in X-ray CT (computed tomography), SPECT (single photon emission computed tomography), PET (positron emission tomography), and MRI (magnetic resonance imaging). Contemporary research results in exact region-of-interest (ROI) reconstruction with truncated projections, Katsevich's cone-beam filtered backprojection algorithm, and reconstruction with highly undersampled data with l0-minimization are also included. (orig.)
Institute of Scientific and Technical Information of China (English)
陈声富
2016-01-01
A stretch plaid fabric whose checks are formed by four parallel yarns was developed on a Toyota JAT-190TNT-T610 air-jet loom that had been modified for double-warp-beam weaving. The fabric has a clear check pattern and a full surface. Its style and quality fully meet market requirements, and it has become a highlight among the varieties sold on the market.
National Oceanic and Atmospheric Administration, Department of Commerce — The NOAA Paleoclimatology Program archives reconstructions of past climatic conditions derived from paleoclimate proxies, in addition to the Program's large holdings...
Morse, H Stephen
1994-01-01
Practical Parallel Computing provides information pertinent to the fundamental aspects of high-performance parallel processing. This book discusses the development of parallel applications on a variety of equipment. Organized into three parts encompassing 12 chapters, this book begins with an overview of the technology trends that converge to favor massively parallel hardware over traditional mainframes and vector machines. This text then gives a tutorial introduction to parallel hardware architectures. Other chapters provide worked-out examples of programs using several parallel languages. Thi
Akl, Selim G
1985-01-01
Parallel Sorting Algorithms explains how to use parallel algorithms to sort a sequence of items on a variety of parallel computers. The book reviews the sorting problem, the parallel models of computation, parallel algorithms, and the lower bounds on the parallel sorting problems. The text also presents twenty different algorithms for architectures such as linear arrays, mesh-connected computers, and cube-connected computers. Another example where an algorithm can be applied is on shared-memory SIMD (single instruction stream, multiple data stream) computers in which the whole sequence to be sorted can fit in the
Keppenne, C. L.; Rienecker, M.; Borovikov, A. Y.
1999-01-01
Two massively parallel data assimilation systems in which the model forecast-error covariances are estimated from the distribution of an ensemble of model integrations are applied to the assimilation of 97-98 TOPEX/POSEIDON altimetry and TOGA/TAO temperature data into a Pacific basin version of the NASA Seasonal to Interannual Prediction Project (NSIPP)'s quasi-isopycnal ocean general circulation model. In the first system, an ensemble of model runs forced by an ensemble of atmospheric model simulations is used to calculate asymptotic error statistics. The data assimilation then occurs in the reduced phase space spanned by the corresponding leading empirical orthogonal functions. The second system is an ensemble Kalman filter in which new error statistics are computed during each assimilation cycle from the time-dependent ensemble distribution. The data assimilation experiments are conducted on NSIPP's 512-processor CRAY T3E. The two data assimilation systems are validated by withholding part of the data and quantifying the extent to which the withheld information can be inferred from the assimilation of the remaining data. The pros and cons of each system are discussed.
Introduction to parallel programming
Brawer, Steven
1989-01-01
Introduction to Parallel Programming focuses on the techniques, processes, methodologies, and approaches involved in parallel programming. The book first offers information on Fortran, hardware and operating system models, and processes, shared memory, and simple parallel programs. Discussions focus on processes and processors, joining processes, shared memory, time-sharing with multiple processors, hardware, loops, passing arguments in function/subroutine calls, program structure, and arithmetic expressions. The text then elaborates on basic parallel programming techniques, barriers and race
Fox, Geoffrey C; Messina, Guiseppe C
2014-01-01
A clear illustration of how parallel computers can be successfully applied to large-scale scientific computations. This book demonstrates how a variety of applications in physics, biology, mathematics and other sciences were implemented on real parallel computers to produce new scientific results. It investigates issues of fine-grained parallelism relevant for future supercomputers with particular emphasis on hypercube architecture. The authors describe how they used an experimental approach to configure different massively parallel machines, design and implement basic system software, and develop
Parallel Atomistic Simulations
Energy Technology Data Exchange (ETDEWEB)
HEFFELFINGER, GRANT S.
2000-01-18
Algorithms developed to enable the use of atomistic molecular simulation methods with parallel computers are reviewed. Methods appropriate for bonded as well as non-bonded (and charged) interactions are included. While strategies for obtaining parallel molecular simulations have been developed for the full variety of atomistic simulation methods, molecular dynamics and Monte Carlo have received the most attention. Three main types of parallel molecular dynamics simulations have been developed, the replicated data decomposition, the spatial decomposition, and the force decomposition. For Monte Carlo simulations, parallel algorithms have been developed which can be divided into two categories, those which require a modified Markov chain and those which do not. Parallel algorithms developed for other simulation methods such as Gibbs ensemble Monte Carlo, grand canonical molecular dynamics, and Monte Carlo methods for protein structure determination are also reviewed and issues such as how to measure parallel efficiency, especially in the case of parallel Monte Carlo algorithms with modified Markov chains are discussed.
Quadriceps Tendon Autograft Medial Patellofemoral Ligament Reconstruction.
Fink, Christian; Steensen, Robert; Gföller, Peter; Lawton, Robert
2018-06-01
Critically evaluate the published literature related to quadriceps tendon (QT) medial patellofemoral ligament (MPFL) reconstruction. Hamstring tendon (HT) MPFL reconstruction techniques have been shown to successfully restore patella stability, but complications including patella fracture are reported. Quadriceps tendon (QT) reconstruction techniques with an intact graft pedicle on the patella side have the advantage that patella bone tunnel drilling and fixation are no longer needed, reducing risk of patella fracture. Several QT MPFL reconstruction techniques, including minimally invasive surgical (MIS) approaches, have been published with promising clinical results and fewer complications than with HT techniques. Parallel laboratory studies have shown macroscopic anatomy and biomechanical properties of QT are more similar to native MPFL than hamstring (HS) HT, suggesting QT may more accurately restore native joint kinematics. Quadriceps tendon MPFL reconstruction, via both open and MIS techniques, have promising clinical results and offer valuable alternatives to HS grafts for primary and revision MPFL reconstruction in both children and adults.
Method for position emission mammography image reconstruction
Smith, Mark Frederick
2004-10-12
An image reconstruction method comprising accepting coincidence data from either a data file or in real time from a pair of detector heads, culling event data that is outside a desired energy range, optionally saving the desired data for each detector position or for each pair of detector pixels on the two detector heads, and then reconstructing the image either by backprojection image reconstruction or by iterative image reconstruction. In the backprojection image reconstruction mode, rays are traced between centers of lines of response (LOR's), counts are then either allocated by nearest pixel interpolation or allocated by an overlap method and then corrected for geometric effects and attenuation and the data file updated. If the iterative image reconstruction option is selected, one implementation is to compute a grid Siddon retracing, and to perform maximum likelihood expectation maximization (MLEM) computed by either: a) tracing parallel rays between subpixels on opposite detector heads; or b) tracing rays between randomized endpoint locations on opposite detector heads.
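The MLEM step named in this abstract can be illustrated with a minimal sketch. It assumes a small, precomputed dense system matrix (the abstract instead obtains these weights by ray tracing), and the function name `mlem` and its parameters are hypothetical, chosen only for illustration.

```python
def mlem(system_matrix, counts, iterations=50):
    """Sketch of the MLEM update for counts ~ Poisson(A x).

    system_matrix[i][j] is the assumed probability that an emission in
    pixel j is recorded on line of response i; counts[i] is the measured
    number of coincidences on line of response i.
    """
    n_lors = len(system_matrix)
    n_pixels = len(system_matrix[0])
    # Sensitivity image: column sums of the system matrix.
    sensitivity = [sum(system_matrix[i][j] for i in range(n_lors))
                   for j in range(n_pixels)]
    image = [1.0] * n_pixels  # flat, strictly positive starting image
    for _ in range(iterations):
        # Forward-project the current image estimate.
        projection = [sum(system_matrix[i][j] * image[j] for j in range(n_pixels))
                      for i in range(n_lors)]
        # Ratio of measured to estimated counts on every line of response.
        ratio = [counts[i] / projection[i] if projection[i] > 0 else 0.0
                 for i in range(n_lors)]
        # Backproject the ratios and apply the multiplicative update.
        image = [
            image[j]
            * sum(system_matrix[i][j] * ratio[i] for i in range(n_lors))
            / sensitivity[j] if sensitivity[j] > 0 else 0.0
            for j in range(n_pixels)
        ]
    return image
```

The update is multiplicative, so a non-negative starting image stays non-negative, and the forward and backprojection sums are independent per line of response and per pixel, which is what makes the ray-tracing variants in the abstract amenable to parallel evaluation.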
International Nuclear Information System (INIS)
Lesavoy, M.A.
1985-01-01
Vaginal reconstruction can be an uncomplicated and straightforward procedure when attention to detail is maintained. The Abbe-McIndoe procedure of lining the neovaginal canal with split-thickness skin grafts has become standard. The use of the inflatable Heyer-Schulte vaginal stent provides comfort to the patient and ease to the surgeon in maintaining approximation of the skin graft. For large vaginal and perineal defects, myocutaneous flaps such as the gracilis island have been extremely useful for correction of radiation-damaged tissue of the perineum or for the reconstruction of large ablative defects. Minimal morbidity and scarring ensue because the donor site can be closed primarily. With all vaginal reconstruction, a compliant patient is a necessity. The patient must wear a vaginal obturator for a minimum of 3 to 6 months postoperatively and is encouraged to use intercourse as an excellent obturator. In general, vaginal reconstruction can be an extremely gratifying procedure for both the functional and emotional well-being of patients
... in moderate exercise and recreational activities, or play sports that put less stress on the knees. ACL reconstruction is generally recommended if: You're an athlete and want to continue in your sport, especially if the sport involves jumping, cutting or ...
CERN. Geneva
2016-01-01
The traditionally used and well established parallel programming models OpenMP and MPI are both targeting lower level parallelism and are meant to be as language agnostic as possible. For a long time, those models were the only widely available portable options for developing parallel C++ applications beyond using plain threads. This has strongly limited the optimization capabilities of compilers, has inhibited extensibility and genericity, and has restricted the use of those models together with other, modern higher level abstractions introduced by the C++11 and C++14 standards. The recent revival of interest in the industry and wider community for the C++ language has also spurred a remarkable amount of standardization proposals and technical specifications being developed. Those efforts however have so far failed to build a vision on how to seamlessly integrate various types of parallelism, such as iterative parallel execution, task-based parallelism, asynchronous many-task execution flows, continuation s...
Parallelism in matrix computations
Gallopoulos, Efstratios; Sameh, Ahmed H
2016-01-01
This book is primarily intended as a research monograph that could also be used in graduate courses for the design of parallel algorithms in matrix computations. It assumes general but not extensive knowledge of numerical linear algebra, parallel architectures, and parallel programming paradigms. The book consists of four parts: (I) Basics; (II) Dense and Special Matrix Computations; (III) Sparse Matrix Computations; and (IV) Matrix functions and characteristics. Part I deals with parallel programming paradigms and fundamental kernels, including reordering schemes for sparse matrices. Part II is devoted to dense matrix computations such as parallel algorithms for solving linear systems, linear least squares, the symmetric algebraic eigenvalue problem, and the singular-value decomposition. It also deals with the development of parallel algorithms for special linear systems such as banded, Vandermonde, Toeplitz, and block Toeplitz systems. Part III addresses sparse matrix computations: (a) the development of pa...
High performance parallel backprojection on FPGA
Energy Technology Data Exchange (ETDEWEB)
Pfanner, Florian; Knaup, Michael; Kachelriess, Marc [Erlangen-Nuernberg Univ., Erlangen (Germany). Inst. of Medical Physics (IMP)
2011-07-01
Reconstruction of tomographic images, i.e., images from a Computed Tomography scanner, is very time consuming. Most of the computing power is needed for the backprojection step. A closer inspection shows that the backprojection algorithm is easy to parallelize. FPGAs are able to execute many operations at the same time, so a highly parallel algorithm is a requirement for a powerful acceleration. To maximize the data flow rate, we realized the backprojection in a pipelined structure with a data throughput of one clock cycle. Due to the hardware limitations of the FPGA, it is not possible to reconstruct the image as a whole, so it is necessary to split up the image and reconstruct these parts separately. Despite that, a reconstruction of 512 projections into a 512² image is calculated within 13 ms on a Virtex 5 FPGA. To save hardware resources we use fixed-point arithmetic with an accuracy of 23 bits for the calculation. A comparison of the resulting image with an image calculated with floating-point arithmetic on a CPU shows that there are no differences between these images. (orig.)
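Why backprojection splits cleanly into separately reconstructed image parts can be seen in a small software sketch. It assumes a parallel-beam geometry, nearest-detector lookup and no filtering step; the function `backproject_rows` is hypothetical and is not the FPGA design described above.

```python
import math


def backproject_rows(sinogram, angles, size, rows):
    """Unfiltered parallel-beam backprojection of selected image rows.

    sinogram[a][k] is the value of detector bin k at projection angle
    angles[a] (radians). Each pixel is accumulated independently, so any
    subset of rows can be reconstructed on its own.
    """
    n_det = len(sinogram[0])
    centre = (size - 1) / 2.0          # image centre in pixel units
    det_centre = (n_det - 1) / 2.0     # detector centre in bin units
    tile = []
    for y in rows:
        line = []
        for x in range(size):
            acc = 0.0
            for projection, theta in zip(sinogram, angles):
                # Detector coordinate hit by the ray through pixel (x, y).
                t = ((x - centre) * math.cos(theta)
                     + (y - centre) * math.sin(theta) + det_centre)
                k = int(round(t))      # nearest detector bin
                if 0 <= k < n_det:
                    acc += projection[k]
            line.append(acc)
        tile.append(line)
    return tile
```

Calling the function once with `range(size)` or several times with disjoint row ranges yields the same pixels, because no pixel depends on any other; this independence is the property the abstract relies on when it reconstructs the image in parts.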
DEFF Research Database (Denmark)
Sitchinava, Nodar; Zeh, Norbert
2012-01-01
We present the parallel buffer tree, a parallel external memory (PEM) data structure for batched search problems. This data structure is a non-trivial extension of Arge's sequential buffer tree to a private-cache multiprocessor environment and reduces the number of I/O operations by the number of...... in the optimal O(psort(N) + K/PB) parallel I/O complexity, where K is the size of the output reported in the process and psort(N) is the parallel I/O complexity of sorting N elements using P processors....
Parallel Algorithms and Patterns
Energy Technology Data Exchange (ETDEWEB)
Robey, Robert W. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2016-06-16
This is a powerpoint presentation on parallel algorithms and patterns. A parallel algorithm is a well-defined, step-by-step computational procedure that emphasizes concurrency to solve a problem. Examples of problems include: Sorting, searching, optimization, matrix operations. A parallel pattern is a computational step in a sequence of independent, potentially concurrent operations that occurs in diverse scenarios with some frequency. Examples are: Reductions, prefix scans, ghost cell updates. We only touch on parallel patterns in this presentation. It really deserves its own detailed discussion which Gabe Rockefeller would like to develop.
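Two of the patterns listed here, reductions and prefix scans, can be sketched in a few lines. The code below runs serially and only mirrors the data dependencies of the parallel versions; the function names are hypothetical.

```python
def inclusive_scan(values):
    """Hillis-Steele style inclusive prefix sum.

    Takes about log2(n) sweeps; within a sweep every output element depends
    only on the previous sweep, so all elements could be updated concurrently.
    """
    result = list(values)
    offset = 1
    while offset < len(result):
        result = [
            result[i] + result[i - offset] if i >= offset else result[i]
            for i in range(len(result))
        ]
        offset *= 2
    return result


def tree_reduce(values):
    """Pairwise (tree) sum reduction; each level halves the number of partial sums."""
    level = list(values)
    if not level:
        return 0
    while len(level) > 1:
        paired = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            paired.append(level[-1])   # odd element is carried to the next level
        level = paired
    return level[0]


print(inclusive_scan([3, 1, 4, 1, 5]))  # [3, 4, 8, 9, 14]
print(tree_reduce([3, 1, 4, 1, 5]))     # 14
```

Both functions trade extra additions for a shorter dependency chain, which is the usual design choice when many processing elements are available.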
Application Portable Parallel Library
Cole, Gary L.; Blech, Richard A.; Quealy, Angela; Townsend, Scott
1995-01-01
Application Portable Parallel Library (APPL) computer program is subroutine-based message-passing software library intended to provide consistent interface to variety of multiprocessor computers on market today. Minimizes effort needed to move application program from one computer to another. User develops application program once and then easily moves application program from parallel computer on which created to another parallel computer. ("Parallel computer" also include heterogeneous collection of networked computers). Written in C language with one FORTRAN 77 subroutine for UNIX-based computers and callable from application programs written in C language or FORTRAN 77.
Kalman Filter Tracking on Parallel Architectures
International Nuclear Information System (INIS)
Cerati, Giuseppe; Elmer, Peter; Krutelyov, Slava; Lantz, Steven; Lefebvre, Matthieu; McDermott, Kevin; Riley, Daniel; Tadel, Matevž; Wittich, Peter; Würthwein, Frank; Yagil, Avi
2016-01-01
Power density constraints are limiting the performance improvements of modern CPUs. To address this we have seen the introduction of lower-power, multi-core processors such as GPGPU, ARM and Intel MIC. In order to achieve the theoretical performance gains of these processors, it will be necessary to parallelize algorithms to exploit larger numbers of lightweight cores and specialized functions like large vector units. Track finding and fitting is one of the most computationally challenging problems for event reconstruction in particle physics. At the High-Luminosity Large Hadron Collider (HL-LHC), for example, this will be by far the dominant problem. The need for greater parallelism has driven investigations of very different track finding techniques such as Cellular Automata or Hough Transforms. The most common track finding techniques in use today, however, are those based on a Kalman filter approach. Significant experience has been accumulated with these techniques on real tracking detector systems, both in the trigger and offline. They are known to provide high physics performance, are robust, and are in use today at the LHC. Given the utility of the Kalman filter in track finding, we have begun to port these algorithms to parallel architectures, namely Intel Xeon and Xeon Phi. We report here on our progress towards an end-to-end track reconstruction algorithm fully exploiting vectorization and parallelization techniques in a simplified experimental environment
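The predict and update cycle of a Kalman filter track fit can be sketched for the simplest possible case. This assumes a straight track, equally spaced detector planes, one position measurement per plane and no material effects; the function `kalman_line_fit` is hypothetical and is far simpler than the vectorized code the abstract reports on.

```python
def kalman_line_fit(hits, dz=1.0, meas_var=1.0):
    """Fit position and slope of a straight track with a Kalman filter.

    hits[k] is the measured position on plane k; planes are dz apart and
    every measurement has variance meas_var. Returns (position, slope)
    at the last plane.
    """
    # State (position, slope); symmetric covariance [[p00, p01], [p01, p11]].
    pos, slope = hits[0], 0.0
    p00, p01, p11 = meas_var, 0.0, 1e6   # slope starts essentially unknown
    for hit in hits[1:]:
        # Predict: propagate state and covariance to the next plane.
        pos = pos + dz * slope
        p00 = p00 + 2.0 * dz * p01 + dz * dz * p11
        p01 = p01 + dz * p11
        # Update: weigh the prediction against the new measurement.
        innovation_var = p00 + meas_var
        k0, k1 = p00 / innovation_var, p01 / innovation_var   # Kalman gain
        residual = hit - pos
        pos += k0 * residual
        slope += k1 * residual
        p00, p01, p11 = (1.0 - k0) * p00, (1.0 - k0) * p01, p11 - k1 * p01
    return pos, slope
```

Each track is fitted independently of all others, so many tracks can be processed side by side in vector lanes or threads, which is the kind of parallelism the abstract aims to exploit.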
Quick plasma equilibrium reconstruction based on GPU
International Nuclear Information System (INIS)
Xiao Bingjia; Huang, Y.; Luo, Z.P.; Yuan, Q.P.; Lao, L.
2014-01-01
A parallel code named P-EFIT, which can complete an equilibrium reconstruction iteration in 250 μs, is described. It is built on the CUDA™ architecture and runs on a Graphical Processing Unit (GPU). The optimization of middle-scale matrix multiplication on the GPU and an algorithm that efficiently solves block tri-diagonal linear systems in parallel are described. A benchmark test is conducted. A static test proves the accuracy of P-EFIT, and a simulation test proves the feasibility of using P-EFIT for real-time reconstruction on 65x65 computation grids. (author)
Directory of Open Access Journals (Sweden)
Brown James
2007-12-01
This article aims to discuss the various defects that occur with maxillectomy with a full review of the literature and discussion of the advantages and disadvantages of the various techniques described. Reconstruction of the maxilla can be relatively simple for the standard low maxillectomy that does not involve the orbital floor (Class 2). In this situation the structure of the face is less damaged and there are multiple reconstructive options for the restoration of the maxilla and dental alveolus. If the maxillectomy includes the orbit (Class 4) then problems involving the eye (enophthalmos, orbital dystopia, ectropion and diplopia) are avoided, which simplifies the reconstruction. Most controversy is associated with the maxillectomy that involves the orbital floor and dental alveolus (Class 3). A case is made for the use of the iliac crest with internal oblique as an ideal option but there are other methods, which may provide a similar result. A multidisciplinary approach to these patients is emphasised which should include a prosthodontist with a special expertise for these defects.
Parallel discrete event simulation
Overeinder, B.J.; Hertzberger, L.O.; Sloot, P.M.A.; Withagen, W.J.
1991-01-01
In simulating applications for execution on specific computing systems, the simulation performance figures must be known in a short period of time. One basic approach to the problem of reducing the required simulation time is the exploitation of parallelism. However, in parallelizing the simulation
Parallel reservoir simulator computations
International Nuclear Information System (INIS)
Hemanth-Kumar, K.; Young, L.C.
1995-01-01
The adaptation of a reservoir simulator for parallel computations is described. The simulator was originally designed for vector processors. It performs approximately 99% of its calculations in vector/parallel mode and relative to scalar calculations it achieves speedups of 65 and 81 for black oil and EOS simulations, respectively on the CRAY C-90
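The relation between the vectorized fraction and the achievable speedup can be checked with Amdahl's law. The sketch below assumes that the quoted 99% can be read as the fraction of scalar run time that benefits from vector/parallel execution, which the abstract does not state explicitly; the function names are hypothetical.

```python
def amdahl_speedup(parallel_fraction, unit_speedup):
    """Overall speedup when a fraction of the work runs unit_speedup times faster."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / unit_speedup)


def amdahl_limit(parallel_fraction):
    """Upper bound on the speedup as unit_speedup grows without bound."""
    return 1.0 / (1.0 - parallel_fraction)


# With 99% of the work accelerated, the overall speedup cannot exceed about 100.
print(amdahl_limit(0.99))
```

Under that reading, the reported speedups of 65 and 81 sit below the bound of roughly 100, as Amdahl's law requires.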
Murphy, Mark; Alley, Marcus; Demmel, James; Keutzer, Kurt; Vasanawala, Shreyas; Lustig, Michael
2012-01-01
We present ℓ1-SPIRiT, a simple algorithm for auto calibrating parallel imaging (acPI) and compressed sensing (CS) that permits an efficient implementation with clinically-feasible runtimes. We propose a CS objective function that minimizes cross-channel joint sparsity in the Wavelet domain. Our reconstruction minimizes this objective via iterative soft-thresholding, and integrates naturally with iterative Self-Consistent Parallel Imaging (SPIRiT). Like many iterative MRI reconstructions, ℓ1-SPIRiT’s image quality comes at a high computational cost. Excessively long runtimes are a barrier to the clinical use of any reconstruction approach, and thus we discuss our approach to efficiently parallelizing ℓ1-SPIRiT and to achieving clinically-feasible runtimes. We present parallelizations of ℓ1-SPIRiT for both multi-GPU systems and multi-core CPUs, and discuss the software optimization and parallelization decisions made in our implementation. The performance of these alternatives depends on the processor architecture, the size of the image matrix, and the number of parallel imaging channels. Fundamentally, achieving fast runtime requires the correct trade-off between cache usage and parallelization overheads. We demonstrate image quality via a case from our clinical experimentation, using a custom 3DFT Spoiled Gradient Echo (SPGR) sequence with up to 8× acceleration via poisson-disc undersampling in the two phase-encoded directions. PMID:22345529
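The soft-thresholding step with cross-channel joint sparsity can be sketched as follows. The exact operator used by ℓ1-SPIRiT is not given in the abstract, so this is an assumed, common form of joint shrinkage: the coefficients of all channels at one wavelet position are scaled by a shared factor derived from their combined magnitude. The function name is hypothetical.

```python
import math


def joint_soft_threshold(coeffs, threshold):
    """Shrink wavelet coefficients jointly across channels.

    coeffs[c][k] is the (possibly complex) wavelet coefficient k of
    channel c. At each position k the cross-channel l2 norm is reduced by
    `threshold`; positions whose norm is below the threshold are zeroed.
    """
    n_channels = len(coeffs)
    n_coeffs = len(coeffs[0])
    out = [[0j] * n_coeffs for _ in range(n_channels)]
    for k in range(n_coeffs):   # positions are independent of each other
        norm = math.sqrt(sum(abs(coeffs[c][k]) ** 2 for c in range(n_channels)))
        if norm > threshold:
            scale = 1.0 - threshold / norm
            for c in range(n_channels):
                out[c][k] = coeffs[c][k] * scale
    return out
```

Every wavelet position is processed independently, so the loop over `k` maps naturally onto GPU threads or CPU cores; the abstract's point about cache usage versus parallelization overhead concerns how such loops are blocked across the channels.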
Totally parallel multilevel algorithms
Frederickson, Paul O.
1988-01-01
Four totally parallel algorithms for the solution of a sparse linear system have common characteristics which become quite apparent when they are implemented on a highly parallel hypercube such as the CM2. These four algorithms are Parallel Superconvergent Multigrid (PSMG) of Frederickson and McBryan, Robust Multigrid (RMG) of Hackbusch, the FFT based Spectral Algorithm, and Parallel Cyclic Reduction. In fact, all four can be formulated as particular cases of the same totally parallel multilevel algorithm, which is referred to as TPMA. In certain cases the spectral radius of TPMA is zero, and it is recognized to be a direct algorithm. In many other cases the spectral radius, although not zero, is small enough that a single iteration per timestep keeps the local error within the required tolerance.
Energy Technology Data Exchange (ETDEWEB)
1991-10-23
An account of the Caltech Concurrent Computation Program (C³P), a five-year project that focused on answering the question: "Can parallel computers be used to do large-scale scientific computations?" As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C³P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C³P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.
Massively parallel mathematical sieves
Energy Technology Data Exchange (ETDEWEB)
Montry, G.R.
1989-01-01
The Sieve of Eratosthenes is a well-known algorithm for finding all prime numbers in a given subset of integers. A parallel version of the Sieve is described that produces computational speedups over 800 on a hypercube with 1,024 processing elements for problems of fixed size. Computational speedups as high as 980 are achieved when the problem size per processor is fixed. The method of parallelization generalizes to other sieves and will be efficient on any ensemble architecture. We investigate two highly parallel sieves using scattered decomposition and compare their performance on a hypercube multiprocessor. A comparison of different parallelization techniques for the sieve illustrates the trade-offs necessary in the design and implementation of massively parallel algorithms for large ensemble computers.
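How a sieve can be divided among processing elements is sketched below. Note the swap: the abstract studies scattered decompositions, whereas this sketch uses the simpler contiguous block decomposition, and the worker loop runs serially. All function names are hypothetical.

```python
import math


def small_primes(limit):
    """Serial Sieve of Eratosthenes for all primes <= limit (limit >= 1)."""
    is_prime = [False, False] + [True] * (limit - 1)
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def sieve_block(lo, hi, seeds):
    """Primes in [lo, hi), crossing out multiples of the seed primes only."""
    flags = [True] * (hi - lo)
    for p in seeds:
        # First multiple of p inside the block, but never below p * p.
        start = max(p * p, ((lo + p - 1) // p) * p)
        for multiple in range(start, hi, p):
            flags[multiple - lo] = False
    return [lo + i for i, flag in enumerate(flags) if flag and lo + i >= 2]


def blocked_sieve(n, n_workers):
    """Primes <= n, with [0, n] cut into n_workers independent blocks."""
    seeds = small_primes(math.isqrt(n))    # every worker needs these
    size = -(-(n + 1) // n_workers)        # ceiling division
    primes = []
    for rank in range(n_workers):          # iterations share no state
        lo = rank * size
        hi = min(lo + size, n + 1)
        primes.extend(sieve_block(lo, hi, seeds))
    return primes


print(blocked_sieve(30, 3))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

With contiguous blocks the low-numbered workers do more crossing-out per prime found than the high-numbered ones, a load imbalance that helps explain why a scattered assignment of integers to processors is of interest.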
Parallel computing of physical maps--a comparative study in SIMD and MIMD parallelism.
Bhandarkar, S M; Chirravuri, S; Arnold, J
1996-01-01
Ordering clones from a genomic library into physical maps of whole chromosomes presents a central computational problem in genetics. Chromosome reconstruction via clone ordering is usually isomorphic to the NP-complete Optimal Linear Arrangement problem. Parallel SIMD and MIMD algorithms for simulated annealing based on Markov chain distribution are proposed and applied to the problem of chromosome reconstruction via clone ordering. Perturbation methods and problem-specific annealing heuristics are proposed and described. The SIMD algorithms are implemented on a 2048 processor MasPar MP-2 system which is an SIMD 2-D toroidal mesh architecture whereas the MIMD algorithms are implemented on an 8 processor Intel iPSC/860 which is an MIMD hypercube architecture. A comparative analysis of the various SIMD and MIMD algorithms is presented in which the convergence, speedup, and scalability characteristics of the various algorithms are analyzed and discussed. On a fine-grained, massively parallel SIMD architecture with a low synchronization overhead such as the MasPar MP-2, a parallel simulated annealing algorithm based on multiple periodically interacting searches performs the best. For a coarse-grained MIMD architecture with high synchronization overhead such as the Intel iPSC/860, a parallel simulated annealing algorithm based on multiple independent searches yields the best results. In either case, distribution of clonal data across multiple processors is shown to exacerbate the tendency of the parallel simulated annealing algorithm to get trapped in a local optimum.
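The strategy of multiple independent searches can be sketched on a toy ordering problem. The cost function below is a stand-in for the clone-ordering objective, which the abstract does not specify, and the cooling schedule, step count and all names are assumptions made for illustration; the searches run one after another here rather than on separate processors.

```python
import math
import random


def adjacent_gap_cost(order):
    """Toy objective: sum of gaps between neighbouring items in the ordering."""
    return sum(abs(a - b) for a, b in zip(order, order[1:]))


def anneal(cost, n_items, seed, steps=2000, t_start=2.0, cooling=0.995):
    """One simulated annealing search over permutations, seeded for repeatability."""
    rng = random.Random(seed)
    order = list(range(n_items))
    rng.shuffle(order)
    current = cost(order)
    best, best_cost = list(order), current
    temperature = t_start
    for _ in range(steps):
        i, j = rng.randrange(n_items), rng.randrange(n_items)
        order[i], order[j] = order[j], order[i]          # perturb: swap two items
        candidate = cost(order)
        accept = candidate <= current or \
            rng.random() < math.exp((current - candidate) / temperature)
        if accept:
            current = candidate
            if current < best_cost:
                best, best_cost = list(order), current
        else:
            order[i], order[j] = order[j], order[i]      # reject: undo the swap
        temperature *= cooling
    return best_cost, best


def independent_searches(cost, n_items, n_searches):
    """Run several searches with different seeds and keep the best result."""
    return min(anneal(cost, n_items, seed) for seed in range(n_searches))
```

The searches never communicate, so the scheme needs no synchronization until the final `min`, which fits the abstract's finding that independent searches suit a coarse-grained machine with high synchronization overhead; the periodically interacting variant would additionally exchange the best ordering between searches at intervals.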
Calibrationless Parallel Magnetic Resonance Imaging: A Joint Sparsity Model
Directory of Open Access Journals (Sweden)
Angshul Majumdar
2013-12-01
State-of-the-art parallel MRI techniques either explicitly or implicitly require certain parameters to be estimated, e.g., the sensitivity map for SENSE, SMASH and interpolation weights for GRAPPA, SPIRiT. Thus all these techniques are sensitive to the calibration (parameter estimation) stage. In this work, we have proposed a parallel MRI technique that does not require any calibration but yields reconstruction results that are on par with (or even better than) state-of-the-art methods in parallel MRI. Our proposed method requires solving non-convex analysis and synthesis prior joint-sparsity problems. This work also derives the algorithms for solving them. Experimental validation was carried out on two datasets: eight-channel brain and eight-channel Shepp-Logan phantom. Two sampling methods were used: Variable Density Random sampling and non-Cartesian Radial sampling. For the brain data, an acceleration factor of 4 was used and for the other an acceleration factor of 6 was used. The reconstruction results were quantitatively evaluated based on the Normalised Mean Squared Error between the reconstructed image and the originals. The qualitative evaluation was based on the actual reconstructed images. We compared our work with four state-of-the-art parallel imaging techniques: two calibrated methods (CS SENSE and l1SPIRiT) and two calibration-free techniques (Distributed CS and SAKE). Our method yields better reconstruction results than all of them.
Algorithms for parallel computers
International Nuclear Information System (INIS)
Churchhouse, R.F.
1985-01-01
Until relatively recently almost all the algorithms for use on computers had been designed on the (usually unstated) assumption that they were to be run on single processor, serial machines. With the introduction of vector processors, array processors and interconnected systems of mainframes, minis and micros, however, various forms of parallelism have become available. The advantage of parallelism is that it offers increased overall processing speed but it also raises some fundamental questions, including: (i) which, if any, of the existing 'serial' algorithms can be adapted for use in the parallel mode. (ii) How close to optimal can such adapted algorithms be and, where relevant, what are the convergence criteria. (iii) How can we design new algorithms specifically for parallel systems. (iv) For multi-processor systems how can we handle the software aspects of the interprocessor communications. Aspects of these questions illustrated by examples are considered in these lectures. (orig.)
Parallelism and array processing
International Nuclear Information System (INIS)
Zacharov, V.
1983-01-01
Modern computing, as well as the historical development of computing, has been dominated by sequential monoprocessing. Yet there is the alternative of parallelism, where several processes may be in concurrent execution. This alternative is discussed in a series of lectures, in which the main developments involving parallelism are considered, both from the standpoint of computing systems and that of applications that can exploit such systems. The lectures seek to discuss parallelism in a historical context, and to identify all the main aspects of concurrency in computation right up to the present time. Included will be consideration of the important question as to what use parallelism might be in the field of data processing. (orig.)
3D fast reconstruction in positron emission tomography
International Nuclear Information System (INIS)
Egger, M.L.; Scheurer, A. Hermann; Joseph, C.; Morel, C.
1996-01-01
The issue of long reconstruction times in positron emission tomography (PET) has been addressed from several points of view, resulting in an affordable dedicated system capable of handling routine 3D reconstructions in a few minutes per frame: on the hardware side, using fast processors and a parallel architecture, and on the software side, using efficient implementation of computationally less intensive algorithms.
International Nuclear Information System (INIS)
O'Sullivan, F.; Pawitan, Y.; Harrison, R.L.; Lewellen, T.K.
1990-01-01
In statistical terms, filtered backprojection can be viewed as smoothed Least Squares (LS). In this paper, the authors report on improvement in LS resolution by: incorporating locally adaptive smoothers, imposing positivity and using statistical methods for optimal selection of the resolution parameter. The resulting algorithm has high computational efficiency relative to more elaborate Maximum Likelihood (ML) type techniques (i.e. EM with sieves). Practical aspects of the procedure are discussed in the context of PET and illustrations with computer simulated and real tomograph data are presented. The relative recovery coefficients for a 9mm sphere in a computer simulated hot-spot phantom range from .3 to .6 when the number of counts ranges from 10,000 to 640,000 respectively. The authors will also present results illustrating the relative efficacy of ML and LS reconstruction techniques
The STAPL Parallel Graph Library
Harshvardhan,; Fidel, Adam; Amato, Nancy M.; Rauchwerger, Lawrence
2013-01-01
This paper describes the stapl Parallel Graph Library, a high-level framework that abstracts the user from data-distribution and parallelism details and allows them to concentrate on parallel graph algorithm development. It includes a customizable
Breast Reconstruction After Mastectomy
... What is breast reconstruction? How do surgeons use implants to reconstruct a woman’s breast? How do surgeons ...
Breast reconstruction - implants
Breast implants surgery; Mastectomy - breast reconstruction with implants; Breast cancer - breast reconstruction with implants ... harder to find a tumor if your breast cancer comes back. Getting breast implants does not take as long as breast reconstruction ...
Improving image quality of parallel phase-shifting digital holography
International Nuclear Information System (INIS)
Awatsuji, Yasuhiro; Tahara, Tatsuki; Kaneko, Atsushi; Koyama, Takamasa; Nishio, Kenzo; Ura, Shogo; Kubota, Toshihiro; Matoba, Osamu
2008-01-01
The authors propose parallel two-step phase-shifting digital holography to improve the image quality of parallel phase-shifting digital holography. The proposed technique doubles the effective number of pixels of the hologram in comparison to the conventional parallel four-step technique. The increase in the number of pixels makes it possible to improve the quality of the image reconstructed by parallel phase-shifting digital holography. A numerical simulation and a preliminary experiment of the proposed technique were conducted and the effectiveness of the technique was confirmed. The proposed technique is more practical than the conventional parallel phase-shifting digital holography, because the composition of the digital holographic system based on the proposed technique is simpler.
Massively parallel multicanonical simulations
Gross, Jonathan; Zierenberg, Johannes; Weigel, Martin; Janke, Wolfhard
2018-03-01
Generalized-ensemble Monte Carlo simulations such as the multicanonical method and similar techniques are among the most efficient approaches for simulations of systems undergoing discontinuous phase transitions or with rugged free-energy landscapes. As Markov chain methods, they are inherently serial computationally. It was demonstrated recently, however, that a combination of independent simulations that communicate weight updates at variable intervals allows for the efficient utilization of parallel computational resources for multicanonical simulations. Implementing this approach for the many-thread architecture provided by current generations of graphics processing units (GPUs), we show how it can be efficiently employed with of the order of 10⁴ parallel walkers and beyond, thus constituting a versatile tool for Monte Carlo simulations in the era of massively parallel computing. We provide the fully documented source code for the approach applied to the paradigmatic example of the two-dimensional Ising model as starting point and reference for practitioners in the field.
Truong, T. K.; Reed, I.; Yeh, C.; Shao, H.
1985-01-01
A Fermat number transform convolves two digital data sequences. In very-large-scale integration (VLSI) applications such as image and radar signal processing, X-ray reconstruction, and spectrum shaping, linear convolution of two digital data sequences of arbitrary lengths is accomplished using the Fermat number transform (FNT).
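Linear convolution through a Fermat number transform can be sketched with a small modulus. The choices below (modulus 257, transform length 16, root 2, a direct O(N²) evaluation instead of a fast algorithm) are assumptions made for readability and are not taken from the abstract; `pow(x, -1, m)` requires Python 3.8 or later.

```python
MODULUS = 257   # Fermat number F_3 = 2**8 + 1
LENGTH = 16     # 2 has multiplicative order 16 modulo 257
ROOT = 2        # powers of 2, so hardware can multiply by shifting


def fnt(seq, root=ROOT):
    """Number-theoretic transform of a length-16 sequence modulo 257."""
    return [
        sum(x * pow(root, n * k, MODULUS) for n, x in enumerate(seq)) % MODULUS
        for k in range(LENGTH)
    ]


def inverse_fnt(spectrum):
    """Inverse transform: use the inverse root, then divide by the length."""
    inv_root = pow(ROOT, -1, MODULUS)
    inv_len = pow(LENGTH, -1, MODULUS)
    return [value * inv_len % MODULUS for value in fnt(spectrum, inv_root)]


def linear_convolution(a, b):
    """Linear convolution of two short non-negative integer sequences."""
    out_len = len(a) + len(b) - 1
    if out_len > LENGTH:
        raise ValueError("sequences too long for a length-16 transform")
    # Zero-padding turns the cyclic convolution into a linear one.
    padded_a = list(a) + [0] * (LENGTH - len(a))
    padded_b = list(b) + [0] * (LENGTH - len(b))
    product = [x * y % MODULUS for x, y in zip(fnt(padded_a), fnt(padded_b))]
    return inverse_fnt(product)[:out_len]


print(linear_convolution([1, 2, 3], [4, 5]))  # [4, 13, 22, 15]
```

All arithmetic is exact integer arithmetic, so there is no round-off error, but the result is only correct when every true output value lies between 0 and 256; longer or larger-valued sequences need a larger Fermat modulus or blockwise processing.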
SPINning parallel systems software
International Nuclear Information System (INIS)
Matlin, O.S.; Lusk, E.; McCune, W.
2002-01-01
We describe our experiences in using Spin to verify parts of the Multi Purpose Daemon (MPD) parallel process management system. MPD is a distributed collection of processes connected by Unix network sockets. MPD is dynamic: processes and connections among them are created and destroyed as MPD is initialized, runs user processes, recovers from faults, and terminates. This dynamic nature is easily expressible in the Spin/Promela framework but poses performance and scalability challenges. We present here the results of expressing some of the parallel algorithms of MPD and executing both simulation and verification runs with Spin.
Parallel programming with Python
Palach, Jan
2014-01-01
A fast, easy-to-follow and clear tutorial to help you develop Parallel computing systems using Python. Along with explaining the fundamentals, the book will also introduce you to slightly advanced concepts and will help you in implementing these techniques in the real world. If you are an experienced Python programmer and are willing to utilize the available computing resources by parallelizing applications in a simple way, then this book is for you. You are required to have a basic knowledge of Python development to get the most of this book.
Speeding up image reconstruction in computed tomography
CERN. Geneva
2018-01-01
Computed tomography (CT) is a technique for imaging cross-sections of an object using X-ray measurements taken from different angles. The field has progressed significantly in recent decades: today, advanced algorithms allow fast image reconstruction and yield high-quality images even with missing or dirty data, modern detectors provide high resolution without increasing the radiation dose, and high-performance multi-core computing devices help solve such tasks even faster. I will start with CT basics, then briefly present the existing classes of reconstruction algorithms and their differences. After that I will proceed to employing distinctive architectural features of modern multi-core devices (CPUs and GPUs) and popular programming interfaces (OpenMP, MPI, CUDA, OpenCL) for developing effective parallel implementations of image reconstruction algorithms. Decreasing the full reconstruction time from long hours down to minutes or even seconds has a revolutionary impact in diagnostic medicine and industria...
Expressing Parallelism with ROOT
Energy Technology Data Exchange (ETDEWEB)
Piparo, D. [CERN; Tejedor, E. [CERN; Guiraud, E. [CERN; Ganis, G. [CERN; Mato, P. [CERN; Moneta, L. [CERN; Valls Pla, X. [CERN; Canal, P. [Fermilab
2017-11-22
The need for processing the ever-increasing amount of data generated by the LHC experiments in a more efficient way has motivated ROOT to further develop its support for parallelism. Such support is being tackled both for shared-memory and distributed-memory environments. The incarnations of the aforementioned parallelism are multi-threading, multi-processing and cluster-wide executions. In the area of multi-threading, we discuss the new implicit parallelism and related interfaces, as well as the new building blocks to safely operate with ROOT objects in a multi-threaded environment. Regarding multi-processing, we review the new MultiProc framework, comparing it with similar tools (e.g. multiprocessing module in Python). Finally, as an alternative to PROOF for cluster-wide executions, we introduce the efforts on integrating ROOT with state-of-the-art distributed data processing technologies like Spark, both in terms of programming model and runtime design (with EOS as one of the main components). For all the levels of parallelism, we discuss, based on real-life examples and measurements, how our proposals can increase the productivity of scientists.
Parallel Fast Legendre Transform
Alves de Inda, M.; Bisseling, R.H.; Maslen, D.K.
1998-01-01
We discuss a parallel implementation of a fast algorithm for the discrete polynomial Legendre transform. We give an introduction to the Driscoll-Healy algorithm using polynomial arithmetic and present experimental results on the efficiency and accuracy of our implementation. The algorithms were
Practical parallel programming
Bauer, Barr E
2014-01-01
This is the book that will teach programmers to write faster, more efficient code for parallel processors. The reader is introduced to a vast array of procedures and paradigms on which actual coding may be based. Examples and real-life simulations using these devices are presented in C and FORTRAN.
Parallel hierarchical radiosity rendering
Energy Technology Data Exchange (ETDEWEB)
Carter, Michael [Iowa State Univ., Ames, IA (United States)
1993-07-01
In this dissertation, the step-by-step development of a scalable parallel hierarchical radiosity renderer is documented. First, a new look is taken at the traditional radiosity equation, and a new form is presented in which the matrix of linear system coefficients is transformed into a symmetric matrix, thereby simplifying the problem and enabling a new solution technique to be applied. Next, the state-of-the-art hierarchical radiosity methods are examined for their suitability to parallel implementation, and scalability. Significant enhancements are also discovered which both improve their theoretical foundations and improve the images they generate. The resultant hierarchical radiosity algorithm is then examined for sources of parallelism, and for an architectural mapping. Several architectural mappings are discussed. A few key algorithmic changes are suggested during the process of making the algorithm parallel. Next, the performance, efficiency, and scalability of the algorithm are analyzed. The dissertation closes with a discussion of several ideas which have the potential to further enhance the hierarchical radiosity method, or provide an entirely new forum for the application of hierarchical methods.
Parallel universes beguile science
2007-01-01
A staple of mind-bending science fiction, the possibility of multiple universes has long intrigued hard-nosed physicists, mathematicians and cosmologists too. We may not be able -- at least not yet -- to prove they exist, many serious scientists say, but there are plenty of reasons to think that parallel dimensions are more than figments of eggheaded imagination.
Energy Technology Data Exchange (ETDEWEB)
2017-04-04
A parallelization of the k-means++ seed selection algorithm on three distinct hardware platforms: GPU, multicore CPU, and multithreaded architecture. K-means++ was developed by David Arthur and Sergei Vassilvitskii in 2007 as an extension of the k-means data clustering technique. These algorithms allow people to cluster multidimensional data, by attempting to minimize the mean distance of data points within a cluster. K-means++ improved upon traditional k-means by using a more intelligent approach to selecting the initial seeds for the clustering process. While k-means++ has become a popular alternative to traditional k-means clustering, little work has been done to parallelize this technique. We have developed original C++ code for parallelizing the algorithm on three unique hardware architectures: GPU using NVidia's CUDA/Thrust framework, multicore CPU using OpenMP, and the Cray XMT multithreaded architecture. By parallelizing the process for these platforms, we are able to perform k-means++ clustering much more quickly than it could be done before.
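The seed selection step described above can be sketched as follows. This is a minimal serial illustration of the published k-means++ seeding rule (each new seed is drawn with probability proportional to its squared distance from the nearest seed already chosen), not the authors' C++ code. The names `sq_dist` and `kmeanspp_seeds` are hypothetical, and the sketch assumes the input holds at least k distinct points.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeanspp_seeds(points, k, rng=None):
    """Choose k initial seeds by D^2 weighting (k-means++ seeding).

    Assumes `points` contains at least k distinct points.
    """
    rng = rng or random.Random(0)
    seeds = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its nearest seed so far.
    d2 = [sq_dist(p, seeds[0]) for p in points]
    while len(seeds) < k:
        # Draw the next seed with probability proportional to d2.
        idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
        seeds.append(points[idx])
        # Each entry is updated independently of the others, which is the
        # part that lends itself to a parallel map on a GPU or with OpenMP.
        d2 = [min(d, sq_dist(p, points[idx])) for d, p in zip(d2, points)]
    return seeds
```

The distance update dominates the cost and touches every point once per seed, so it is the natural place to parallelize; the weighted draw would then need a parallel prefix sum or reduction, which this sketch leaves to `random.choices`.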
International Nuclear Information System (INIS)
Gardes, D.; Volkov, P.
1981-01-01
A 5×3 cm² (timing only) and a 15×5 cm² (timing and position) parallel plate avalanche counter (PPAC) are considered. The theory of operation and timing resolution is given. The measurement set-up and the curves of experimental results illustrate the possibilities of the two counters.
Parallel hierarchical global illumination
Energy Technology Data Exchange (ETDEWEB)
Snell, Quinn O. [Iowa State Univ., Ames, IA (United States)
1997-10-08
Solving the global illumination problem is equivalent to determining the intensity of every wavelength of light in all directions at every point in a given scene. The complexity of the problem has led researchers to use approximation methods for solving the problem on serial computers. Rather than using an approximation method, such as backward ray tracing or radiosity, the authors have chosen to solve the Rendering Equation by direct simulation of light transport from the light sources. This paper presents an algorithm that solves the Rendering Equation to any desired accuracy, and can be run in parallel on distributed memory or shared memory computer systems with excellent scaling properties. It appears superior in both speed and physical correctness to recent published methods involving bidirectional ray tracing or hybrid treatments of diffuse and specular surfaces. Like progressive radiosity methods, it dynamically refines the geometry decomposition where required, but does so without the excessive storage requirements for ray histories. The algorithm, called Photon, produces a scene which converges to the global illumination solution. This amounts to a huge task for a 1997-vintage serial computer, but using the power of a parallel supercomputer significantly reduces the time required to generate a solution. Currently, Photon can be run on most parallel environments from a shared memory multiprocessor to a parallel supercomputer, as well as on clusters of heterogeneous workstations.
Adaptive algebraic reconstruction technique
International Nuclear Information System (INIS)
Lu Wenkai; Yin Fangfang
2004-01-01
Algebraic reconstruction techniques (ART) are iterative procedures for reconstructing objects from their projections. It is proven that ART can be computationally efficient by carefully arranging the order in which the collected data are accessed during the reconstruction procedure and adaptively adjusting the relaxation parameters. In this paper, an adaptive algebraic reconstruction technique (AART), which adopts the same projection access scheme as the multilevel scheme algebraic reconstruction technique (MLS-ART), is proposed. By introducing adaptive adjustment of the relaxation parameters during the reconstruction procedure, one-iteration AART can produce reconstructions with better quality than one-iteration MLS-ART. Furthermore, AART outperforms MLS-ART with improved computational efficiency.
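For orientation, a basic ART sweep (the Kaczmarz row-action update) can be sketched as below. This is only the textbook scheme with a fixed relaxation parameter. The adaptive adjustment rule of AART and the MLS-ART access scheme are not given in the abstract and are not reproduced here. The `relax` and `order` arguments are hypothetical hooks showing where such choices would enter.

```python
def art_sweep(A, b, x, relax=1.0, order=None):
    """One ART (Kaczmarz) sweep over the rows of A x = b, updating x in place.

    A     : list of rows (each a list of floats), one row per projection ray
    b     : measured projection values
    relax : relaxation parameter (fixed here; an adaptive method would vary it)
    order : optional sequence of row indices giving the data access order
    """
    rows = order if order is not None else range(len(A))
    for i in rows:
        a = A[i]
        norm2 = sum(v * v for v in a)
        if norm2 == 0.0:
            continue  # a ray that crosses no pixel carries no information
        residual = b[i] - sum(av * xv for av, xv in zip(a, x))
        step = relax * residual / norm2
        # Project the current estimate toward the hyperplane of row i.
        for j, av in enumerate(a):
            x[j] += step * av
    return x
```

With `relax=1.0` each update lands exactly on the hyperplane of the current row; smaller values damp the update, which is the quantity an adaptive scheme would tune during reconstruction.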
Wald, Ingo; Ize, Santiago
2015-07-28
Parallel population of a grid with a plurality of objects using a plurality of processors. One example embodiment is a method for parallel population of a grid with a plurality of objects using a plurality of processors. The method includes a first act of dividing a grid into n distinct grid portions, where n is the number of processors available for populating the grid. The method also includes acts of dividing a plurality of objects into n distinct sets of objects, assigning a distinct set of objects to each processor such that each processor determines by which distinct grid portion(s) each object in its distinct set of objects is at least partially bounded, and assigning a distinct grid portion to each processor such that each processor populates its distinct grid portion with any objects that were previously determined to be at least partially bounded by its distinct grid portion.
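The two-phase scheme described above can be illustrated with a simplified sketch. It assumes a one-dimensional grid of unit-width cells and objects given as (lo, hi) intervals. The helper names (`split_range`, `populate_grid`) are hypothetical, and Python threads stand in for the n processors, so the sketch shows the structure of the method rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor
import math

def split_range(total, n):
    """Split range(total) into n contiguous, nearly equal (start, stop) pairs."""
    base, extra = divmod(total, n)
    bounds, start = [], 0
    for i in range(n):
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds

def populate_grid(objects, num_cells, n):
    """Populate a 1D grid of unit cells with interval objects using n workers."""
    portions = split_range(num_cells, n)        # n distinct grid portions
    object_sets = split_range(len(objects), n)  # n distinct sets of objects

    def cells_of(obj_id):
        lo, hi = objects[obj_id]
        first = max(0, math.floor(lo))
        last = min(num_cells - 1, math.floor(hi))
        return range(first, last + 1)

    def classify(worker):
        # Phase 1: for its own object set, each worker records which grid
        # portion(s) at least partially bound each object.
        start, stop = object_sets[worker]
        buckets = [[] for _ in range(n)]
        for obj_id in range(start, stop):
            cells = cells_of(obj_id)
            for p, (c0, c1) in enumerate(portions):
                if cells and cells[0] < c1 and cells[-1] >= c0:
                    buckets[p].append(obj_id)
        return buckets

    def populate(worker, all_buckets):
        # Phase 2: each worker fills only the cells of its own grid portion,
        # so no two workers ever write to the same cell.
        c0, c1 = portions[worker]
        cells = {c: [] for c in range(c0, c1)}
        for buckets in all_buckets:
            for obj_id in buckets[worker]:
                for c in cells_of(obj_id):
                    if c0 <= c < c1:
                        cells[c].append(obj_id)
        return cells

    with ThreadPoolExecutor(max_workers=n) as pool:
        all_buckets = list(pool.map(classify, range(n)))
        parts = list(pool.map(lambda w: populate(w, all_buckets), range(n)))
    grid = {}
    for part in parts:
        grid.update(part)
    return grid
```

Because every cell is owned by exactly one worker in the second phase, the population step needs no locks; the only synchronization point is the barrier between the two phases.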
Ultrascalable petaflop parallel supercomputer
Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton On Hudson, NY; Chiu, George [Cross River, NY; Cipolla, Thomas M [Katonah, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Hall, Shawn [Pleasantville, NY; Haring, Rudolf A [Cortlandt Manor, NY; Heidelberger, Philip [Cortlandt Manor, NY; Kopcsay, Gerard V [Yorktown Heights, NY; Ohmacht, Martin [Yorktown Heights, NY; Salapura, Valentina [Chappaqua, NY; Sugavanam, Krishnan [Mahopac, NY; Takken, Todd [Brewster, NY
2010-07-20
A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.
DEFF Research Database (Denmark)
Gregersen, Frans; Josephson, Olle; Kristoffersen, Gjert
More parallel, please is the result of the work of an Inter-Nordic group of experts on language policy financed by the Nordic Council of Ministers 2014-17. The book presents all that is needed to plan, practice and revise a university language policy which takes as its point of departure that English may be used in parallel with the various local, in this case Nordic, languages. As such, the book integrates the challenge of internationalization faced by any university with the wish to improve quality in research, education and administration based on the local language(s). There are three layers in the text: First, you may read the extremely brief version of the in total 11 recommendations for best practice. Second, you may acquaint yourself with the extended version of the recommendations and finally, you may study the reasoning behind each of them. At the end of the text, we give...
PARALLEL MOVING MECHANICAL SYSTEMS
Directory of Open Access Journals (Sweden)
Florian Ion Tiberius Petrescu
2014-09-01
Parallel moving mechanical systems are solid, fast, and accurate. Among parallel systems, Stewart platforms stand out as the oldest: fast, solid, and precise. This work outlines a few main elements of Stewart platforms. It begins with the geometry of the platform and its kinematic elements, and then presents a few items of dynamics. The primary dynamic element is the determination of the kinetic energy of the entire Stewart platform mechanism. The kinematics of the mobile platform are then recorded by a rotation matrix method. If a structural motor element consists of two moving elements that translate relative to each other, it is more convenient, for the drive train and especially for the dynamics, to represent the motor element as a single moving component. We thus have seven moving parts (the six motor elements, or legs, to which the mobile platform 7 is added) and one fixed part.
Xyce parallel electronic simulator.
Energy Technology Data Exchange (ETDEWEB)
Keiter, Eric R; Mei, Ting; Russo, Thomas V.; Rankin, Eric Lamont; Schiek, Richard Louis; Thornquist, Heidi K.; Fixel, Deborah A.; Coffey, Todd S; Pawlowski, Roger P; Santarelli, Keith R.
2010-05-01
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is to list exhaustively (to the extent possible) the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide.
Betchov, R
2012-01-01
Stability of Parallel Flows provides information pertinent to hydrodynamical stability. This book explores the stability problems that occur in various fields, including electronics, mechanics, oceanography, administration, economics, as well as naval and aeronautical engineering. Organized into two parts encompassing 10 chapters, this book starts with an overview of the general equations of a two-dimensional incompressible flow. This text then explores the stability of a laminar boundary layer and presents the equation of the inviscid approximation. Other chapters present the general equation
Algorithmically specialized parallel computers
Snyder, Lawrence; Gannon, Dennis B
1985-01-01
Algorithmically Specialized Parallel Computers focuses on the concept and characteristics of an algorithmically specialized computer. This book discusses algorithmically specialized computers, algorithmic specialization using VLSI, and innovative architectures. The architectures and algorithms for digital signal, speech, and image processing and specialized architectures for numerical computations are also elaborated. Other topics include the model for analyzing generalized inter-processor, pipelined architecture for search tree maintenance, and specialized computer organization for raster
Proton computed tomography images with algebraic reconstruction
Energy Technology Data Exchange (ETDEWEB)
Bruzzi, M. [Physics and Astronomy Department, University of Florence, Florence (Italy); Civinini, C.; Scaringella, M. [INFN - Florence Division, Florence (Italy); Bonanno, D. [INFN - Catania Division, Catania (Italy); Brianzi, M. [INFN - Florence Division, Florence (Italy); Carpinelli, M. [INFN - Laboratori Nazionali del Sud, Catania (Italy); Chemistry and Pharmacy Department, University of Sassari, Sassari (Italy); Cirrone, G.A.P.; Cuttone, G. [INFN - Laboratori Nazionali del Sud, Catania (Italy); Presti, D. Lo [INFN - Catania Division, Catania (Italy); Physics and Astronomy Department, University of Catania, Catania (Italy); Maccioni, G. [INFN – Cagliari Division, Cagliari (Italy); Pallotta, S. [INFN - Florence Division, Florence (Italy); Department of Biomedical, Experimental and Clinical Sciences, University of Florence, Florence (Italy); SOD Fisica Medica, Azienda Ospedaliero-Universitaria Careggi, Firenze (Italy); Randazzo, N. [INFN - Catania Division, Catania (Italy); Romano, F. [INFN - Laboratori Nazionali del Sud, Catania (Italy); Sipala, V. [INFN - Laboratori Nazionali del Sud, Catania (Italy); Chemistry and Pharmacy Department, University of Sassari, Sassari (Italy); Talamonti, C. [INFN - Florence Division, Florence (Italy); Department of Biomedical, Experimental and Clinical Sciences, University of Florence, Florence (Italy); SOD Fisica Medica, Azienda Ospedaliero-Universitaria Careggi, Firenze (Italy); Vanzi, E. [Fisica Sanitaria, Azienda Ospedaliero-Universitaria Senese, Siena (Italy)
2017-02-11
A prototype proton computed tomography (pCT) system for hadron therapy has been manufactured and tested in a 175 MeV proton beam with a non-homogeneous phantom designed to simulate high-contrast material. BI-SART reconstruction algorithms have been implemented with GPU parallelism, taking into account the most likely paths of protons in matter. Reconstructed tomography images with r.m.s. density resolutions down to ~1% and spatial resolutions <1 mm, achieved within processing times of ~15′ for a 512×512 pixel image, show that this technique will be beneficial if used instead of X-CT in hadron therapy.
Wang, Kun; Huang, Chao; Kao, Yu-Jiun; Chou, Cheng-Ying; Oraevsky, Alexander A; Anastasio, Mark A
2013-02-01
Optoacoustic tomography (OAT) is inherently a three-dimensional (3D) inverse problem. However, most studies of OAT image reconstruction still employ two-dimensional imaging models. One important reason is because 3D image reconstruction is computationally burdensome. The aim of this work is to accelerate existing image reconstruction algorithms for 3D OAT by use of parallel programming techniques. Parallelization strategies are proposed to accelerate a filtered backprojection (FBP) algorithm and two different pairs of projection/backprojection operations that correspond to two different numerical imaging models. The algorithms are designed to fully exploit the parallel computing power of graphics processing units (GPUs). In order to evaluate the parallelization strategies for the projection/backprojection pairs, an iterative image reconstruction algorithm is implemented. Computer simulation and experimental studies are conducted to investigate the computational efficiency and numerical accuracy of the developed algorithms. The GPU implementations improve the computational efficiency by factors of 1000, 125, and 250 for the FBP algorithm and the two pairs of projection/backprojection operators, respectively. Accurate images are reconstructed by use of the FBP and iterative image reconstruction algorithms from both computer-simulated and experimental data. Parallelization strategies for 3D OAT image reconstruction are proposed for the first time. These GPU-based implementations significantly reduce the computational time for 3D image reconstruction, complementing our earlier work on 3D OAT iterative image reconstruction.
Fast image processing on parallel hardware
International Nuclear Information System (INIS)
Bittner, U.
1988-01-01
Current digital imaging modalities in the medical field incorporate parallel hardware which is heavily used in the stage of image formation, such as CT/MR image reconstruction or DSA real-time subtraction. In order to make image post-processing as efficient as image acquisition, new software approaches have to be found which take full advantage of the parallel hardware architecture. This paper describes the implementation of a two-dimensional median filter which can serve as an example for the development of such an algorithm. The algorithm is analyzed by viewing it as a complete parallel sort of the k pixel values in the chosen window, which leads to a generalization to rank order operators and other closely related filters reported in the literature. A section about the theoretical basis of the algorithm gives hints on how to characterize operations suitable for implementation on pipeline processors and on how to find the appropriate algorithms. Finally, some results on the computation time and usefulness of median filtering in radiographic imaging are given.
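The view of the median filter as a sort of the k window values, and its generalization to rank order operators, can be illustrated with a short serial sketch. This is not the pipeline-processor implementation from the paper. The function name `rank_filter` and the border handling (edge pixels are replicated) are assumptions made for the example.

```python
def rank_filter(image, size=3, rank=None):
    """Rank-order filter over a size x size window (size odd).

    image : 2D list of pixel values
    rank  : index into the sorted window; None selects the median,
            0 the minimum and size*size - 1 the maximum.
    Border pixels are handled by replicating the nearest edge pixel.
    """
    h, w = len(image), len(image[0])
    r = size // 2
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            # Sorting the k = size*size window values is the core operation;
            # every output pixel is independent of the others, so the work
            # can be spread across pixels.
            window = sorted(
                image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            )
            k = len(window) // 2 if rank is None else rank
            out[y][x] = window[k]
    return out
```

Choosing a different `rank` turns the same code into a minimum (erosion-like) or maximum (dilation-like) filter, which is the sense in which the median is one member of the rank order family.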
Resistor Combinations for Parallel Circuits.
McTernan, James P.
1978-01-01
To help simplify both teaching and learning of parallel circuits, a high school electricity/electronics teacher presents and illustrates the use of tables of values for parallel resistive circuits in which total resistances are whole numbers. (MF)
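A table of this kind can be generated from the standard parallel resistance formula, 1/R_total = 1/R1 + 1/R2 + ... The sketch below is an illustration only, not the article's tables. The function names and the restriction to two whole-number resistors are assumptions made for the example.

```python
from fractions import Fraction

def parallel(*resistors):
    """Exact total resistance of whole-number resistors connected in parallel."""
    return 1 / sum(Fraction(1, r) for r in resistors)

def whole_number_pairs(max_ohms):
    """List (R1, R2, R_total) with R1 <= R2 <= max_ohms and a whole-number total."""
    return [
        (r1, r2, int(parallel(r1, r2)))
        for r1 in range(1, max_ohms + 1)
        for r2 in range(r1, max_ohms + 1)
        if parallel(r1, r2).denominator == 1
    ]
```

Using `Fraction` keeps the arithmetic exact, so the whole-number test does not depend on floating-point rounding. For example, `parallel(3, 6)` gives exactly 2 ohms.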
SOFTWARE FOR DESIGNING PARALLEL APPLICATIONS
Directory of Open Access Journals (Sweden)
M. K. Bouza
2017-01-01
The object of research is the tools that support the development of parallel programs in C/C++. Methods and software which automate the process of designing parallel applications are proposed.
Parallel External Memory Graph Algorithms
DEFF Research Database (Denmark)
Arge, Lars Allan; Goodrich, Michael T.; Sitchinava, Nodari
2010-01-01
In this paper, we study parallel I/O efficient graph algorithms in the Parallel External Memory (PEM) model, one of the private-cache chip multiprocessor (CMP) models. We study the fundamental problem of list ranking which leads to efficient solutions to problems on trees, such as computing lowest...... an optimal speedup of Θ(P) in parallel I/O complexity and parallel computation time, compared to the single-processor external memory counterparts....
Parallel inter channel interaction mechanisms
International Nuclear Information System (INIS)
Jovic, V.; Afgan, N.; Jovic, L.
1995-01-01
Interactions between parallel channels are examined. For experimental investigations of nonstationary flow regimes in three parallel vertical channels, results of the phenomenon analysis and the mechanisms of parallel channel interaction are shown for adiabatic conditions of single-phase fluid and two-phase mixture flow. (author)
International Nuclear Information System (INIS)
Soltz, R; Vranas, P; Blumrich, M; Chen, D; Gara, A; Giampap, M; Heidelberger, P; Salapura, V; Sexton, J; Bhanot, G
2007-01-01
The theory of the strong nuclear force, Quantum Chromodynamics (QCD), can be numerically simulated from first principles on massively-parallel supercomputers using the method of Lattice Gauge Theory. We describe the special programming requirements of lattice QCD (LQCD) as well as the optimal supercomputer hardware architectures that it suggests. We demonstrate these methods on the BlueGene massively-parallel supercomputer and argue that LQCD and the BlueGene architecture are a natural match. This can be traced to the simple fact that LQCD is a regular lattice discretization of space into lattice sites while the BlueGene supercomputer is a discretization of space into compute nodes, and that both are constrained by requirements of locality. This simple relation is both technologically important and theoretically intriguing. The main result of this paper is the speedup of LQCD using up to 131,072 CPUs on the largest BlueGene/L supercomputer. The speedup is perfect with sustained performance of about 20% of peak. This corresponds to a maximum of 70.5 sustained TFlop/s. At these speeds LQCD and BlueGene are poised to produce the next generation of strong interaction physics theoretical results
A Parallel Butterfly Algorithm
Poulson, Jack; Demanet, Laurent; Maxwell, Nicholas; Ying, Lexing
2014-01-01
The butterfly algorithm is a fast algorithm which approximately evaluates a discrete analogue of the integral transform (Equation Presented.) at large numbers of target points when the kernel, K(x, y), is approximately low-rank when restricted to subdomains satisfying a certain simple geometric condition. In d dimensions with O(N^d) quasi-uniformly distributed source and target points, when each appropriate submatrix of K is approximately rank-r, the running time of the algorithm is at most O(r^2 N^d log N). A parallelization of the butterfly algorithm is introduced which, assuming a message latency of α and per-process inverse bandwidth of β, executes in at most (Equation Presented.) time using p processes. This parallel algorithm was then instantiated in the form of the open-source DistButterfly library for the special case where K(x, y) = exp(iΦ(x, y)), where Φ(x, y) is a black-box, sufficiently smooth, real-valued phase function. Experiments on Blue Gene/Q demonstrate impressive strong-scaling results for important classes of phase functions. Using quasi-uniform sources, hyperbolic Radon transforms and an analogue of a three-dimensional generalized Radon transform were observed to strong-scale from 1-node/16-cores up to 1024-nodes/16,384-cores with greater than 90% and 82% efficiency, respectively. © 2014 Society for Industrial and Applied Mathematics.
Accelerated Compressed Sensing Based CT Image Reconstruction.
Hashemi, SayedMasoud; Beheshti, Soosan; Gill, Patrick R; Paul, Narinder S; Cobbold, Richard S C
2015-01-01
In X-ray computed tomography (CT) an important objective is to reduce the radiation dose without significantly degrading the image quality. Compressed sensing (CS) enables the radiation dose to be reduced by producing diagnostic images from a limited number of projections. However, conventional CS-based algorithms are computationally intensive and time-consuming. We propose a new algorithm that accelerates the CS-based reconstruction by using a fast pseudopolar Fourier based Radon transform and rebinning the diverging fan beams to parallel beams. The reconstruction process is analyzed using a maximum a posteriori approach, which is transformed into a weighted CS problem. The weights involved in the proposed model are calculated based on the statistical characteristics of the reconstruction process, which is formulated in terms of the measurement noise and rebinning interpolation error. Therefore, the proposed method not only accelerates the reconstruction, but also removes the rebinning and interpolation errors. Simulation results are shown for phantoms and a patient. For example, a 512 × 512 Shepp-Logan phantom reconstructed from 128 rebinned projections using a conventional CS method had 10% error, whereas with the proposed method the reconstruction error was less than 1%. Moreover, computation times of less than 30 sec were obtained using a standard desktop computer without numerical optimization.
Breast reconstruction - natural tissue
... flap; TRAM; Latissimus muscle flap with a breast implant; DIEP flap; DIEAP flap; Gluteal free flap; Transverse upper gracilis flap; TUG; Mastectomy - breast reconstruction with natural tissue; Breast cancer - breast reconstruction with natural tissue
Breast reconstruction after mastectomy
Directory of Open Access Journals (Sweden)
Daniel Schmauss
2016-01-01
Breast cancer is the leading cause of cancer death in women worldwide. Its surgical approach has become less and less mutilating in the last decades. However, the overall number of breast reconstructions has significantly increased lately. Nowadays breast reconstruction should be individualized as far as possible, first of all taking into consideration the oncological aspects of the tumor, neo-/adjuvant treatment and genetic predisposition, but also its timing (immediate versus delayed breast reconstruction), as well as the patient's condition and wishes. This article gives an overview of the various possibilities of breast reconstruction, including implant- and expander-based reconstruction, flap-based reconstruction (vascularized autologous tissue), the combination of implant and flap, reconstruction using non-vascularized autologous fat, as well as refinement surgery after breast reconstruction.
International Nuclear Information System (INIS)
DeHart, Mark D.; Williams, Mark L.; Bowman, Stephen M.
2010-01-01
The SCALE computational architecture has remained basically the same since its inception 30 years ago, although constituent modules and capabilities have changed significantly. This SCALE concept was intended to provide a framework whereby independent codes can be linked to provide a more comprehensive capability than possible with the individual programs - allowing flexibility to address a wide variety of applications. However, the current system was designed originally for mainframe computers with a single CPU and with significantly less memory than today's personal computers. It has been recognized that the present SCALE computation system could be restructured to take advantage of modern hardware and software capabilities, while retaining many of the modular features of the present system. Preliminary work is being done to define specifications and capabilities for a more advanced computational architecture. This paper describes the state of current SCALE development activities and plans for future development. With the release of SCALE 6.1 in 2010, a new phase of evolutionary development will be available to SCALE users within the TRITON and NEWT modules. The SCALE (Standardized Computer Analyses for Licensing Evaluation) code system developed by Oak Ridge National Laboratory (ORNL) provides a comprehensive and integrated package of codes and nuclear data for a wide range of applications in criticality safety, reactor physics, shielding, isotopic depletion and decay, and sensitivity/uncertainty (S/U) analysis. Over the last three years, since the release of version 5.1 in 2006, several important new codes have been introduced within SCALE, and significant advances applied to existing codes. Many of these new features became available with the release of SCALE 6.0 in early 2009. However, beginning with SCALE 6.1, a first generation of parallel computing is being introduced. In addition to near-term improvements, a plan for longer term SCALE enhancement
Parallel Polarization State Generation.
She, Alan; Capasso, Federico
2016-05-17
The control of polarization, an essential property of light, is of wide scientific and technological interest. The general problem of generating arbitrary time-varying states of polarization (SOP) has always been mathematically formulated by a series of linear transformations, i.e. a product of matrices, imposing a serial architecture. Here we show a parallel architecture described by a sum of matrices. The theory is experimentally demonstrated by modulating spatially-separated polarization components of a laser using a digital micromirror device that are subsequently beam combined. This method greatly expands the parameter space for engineering devices that control polarization. Consequently, performance characteristics, such as speed, stability, and spectral range, are entirely dictated by the technologies of optical intensity modulation, including absorption, reflection, emission, and scattering. This opens up important prospects for polarization state generation (PSG) with unique performance characteristics with applications in spectroscopic ellipsometry, spectropolarimetry, communications, imaging, and security.
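The contrast between the serial and parallel formulations can be written compactly. The following is only a notational sketch: the matrices $M_k$, the time-dependent weights $c_k(t)$, and the input state $\mathbf{s}_{\mathrm{in}}$ are symbols assumed here for illustration and are not taken from the paper.

```latex
% Serial architecture: a product of N transformations applied in sequence
\mathbf{s}_{\mathrm{out}}(t) = M_N(t)\, M_{N-1}(t) \cdots M_1(t)\, \mathbf{s}_{\mathrm{in}}

% Parallel architecture: a sum of fixed transformations, each weighted by an
% intensity modulation c_k(t) (assumed notation), then beam combined
\mathbf{s}_{\mathrm{out}}(t) = \Bigl( \sum_{k=1}^{N} c_k(t)\, M_k \Bigr)\, \mathbf{s}_{\mathrm{in}}
```

In the second form only the scalar weights vary in time, which is consistent with the abstract's statement that performance is dictated by the intensity-modulation technology.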
Parallel imaging microfluidic cytometer.
Ehrlich, Daniel J; McKenna, Brian K; Evans, James G; Belkina, Anna C; Denis, Gerald V; Sherr, David H; Cheung, Man Ching
2011-01-01
By adding an additional degree of freedom from multichannel flow, the parallel microfluidic cytometer (PMC) combines some of the best features of fluorescence-activated flow cytometry (FCM) and microscope-based high-content screening (HCS). The PMC (i) lends itself to fast processing of large numbers of samples, (ii) adds a 1D imaging capability for intracellular localization assays (HCS), (iii) has a high rare-cell sensitivity, and (iv) has an unusual capability for time-synchronized sampling. An inability to practically handle large sample numbers has restricted applications of conventional flow cytometers and microscopes in combinatorial cell assays, network biology, and drug discovery. The PMC promises to relieve a bottleneck in these previously constrained applications. The PMC may also be a powerful tool for finding rare primary cells in the clinic. The multichannel architecture of current PMC prototypes allows 384 unique samples for a cell-based screen to be read out in ∼6-10 min, about 30 times the speed of most current FCM systems. In 1D intracellular imaging, the PMC can obtain protein localization using HCS marker strategies at many times the sample throughput of charge-coupled device (CCD)-based microscopes or CCD-based single-channel flow cytometers. The PMC also permits the signal integration time to be varied over a larger range than is practical in conventional flow cytometers. The signal-to-noise advantages are useful, for example, in counting rare positive cells in the most difficult early stages of genome-wide screening. We review the status of parallel microfluidic cytometry and discuss some of the directions the new technology may take. Copyright © 2011 Elsevier Inc. All rights reserved.
The generalized back projection theorem for cone beam reconstruction
International Nuclear Information System (INIS)
Peyrin, F.C.
1985-01-01
The use of cone beam scanners raises the problem of three-dimensional reconstruction from divergent projections. After a survey of two-dimensional analytical reconstruction methods, we examine their application to the 3D problem. Finally, it is shown that the back projection theorem can be generalized to cone beam projections. This makes it possible to state a new inversion formula suitable for both the 4π parallel and divergent geometries. It leads to the generalization of the "rho-filtered back projection" algorithm, which is outlined.
Morphological evidence for parallel processing of information in rat macula
Ross, M. D.
1988-01-01
Study of montages, tracings and reconstructions prepared from a series of 570 consecutive ultrathin sections shows that rat maculas are morphologically organized for parallel processing of linear acceleratory information. Type II cells of one terminal field distribute information to neighboring terminals as well. The findings are examined in light of physiological data which indicate that macular receptor fields have a preferred directional vector, and are interpreted by analogy to a computer technology known as an information network.
About Parallel Programming: Paradigms, Parallel Execution and Collaborative Systems
Directory of Open Access Journals (Sweden)
Loredana MOCEAN
2009-01-01
In recent years, efforts have been made to delineate a stable and unified framework in which the problems of logical parallel processing can find solutions, at least at the level of imperative languages. The results obtained so far do not match the efforts made. This paper is intended as a small contribution to these efforts. We propose an overview of parallel programming, parallel execution and collaborative systems.
Parallel Framework for Cooperative Processes
Directory of Open Access Journals (Sweden)
Mitică Craus
2005-01-01
This paper describes the workings of an object-oriented framework designed to be used in the parallelization of a set of related algorithms. The idea behind the system is to have a reusable framework for running several sequential algorithms in a parallel environment. The algorithms that the framework can be used with have several things in common: they have to run in cycles, and it must be possible to split the work between several "processing units". The parallel framework uses the message-passing communication paradigm and is organized as a master-slave system. Two applications are presented: an Ant Colony Optimization (ACO) parallel algorithm for the Travelling Salesman Problem (TSP) and an Image Processing (IP) parallel algorithm for the Symmetrical Neighborhood Filter (SNF). The implementations of these applications by means of the parallel framework show good performance: approximately linear speedup and low communication cost.
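The cycle-based master-slave organization described above can be sketched as follows. This is a minimal illustration, not the paper's framework: the names `split`, `run_cycles`, `double_chunk` and `concat` are hypothetical, and a thread pool stands in for the message-passing layer so that the example stays self-contained.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, n_parts):
    """Split a list into n_parts nearly equal contiguous chunks."""
    size, rest = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < rest else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def run_cycles(state, work_fn, merge_fn, n_workers=4, n_cycles=3):
    """Master loop: in every cycle, scatter chunks to workers, gather, merge."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for _ in range(n_cycles):
            chunks = split(state, n_workers)
            results = list(pool.map(work_fn, chunks))  # result order is preserved
            state = merge_fn(results)
    return state


def double_chunk(chunk):
    """Toy worker task (assumption): double every element of the chunk."""
    return [2 * x for x in chunk]


def concat(results):
    """Toy merge step (assumption): concatenate the workers' partial results."""
    return [x for part in results for x in part]


final = run_cycles(list(range(10)), double_chunk, concat, n_workers=3, n_cycles=3)
```

The master owns the global state and the workers only ever see their chunk, which is the property that lets a sequential per-cycle algorithm be reused unchanged inside `work_fn`.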
Parallel Monte Carlo reactor neutronics
International Nuclear Information System (INIS)
Blomquist, R.N.; Brown, F.B.
1994-01-01
The issues affecting implementation of parallel algorithms for large-scale engineering Monte Carlo neutron transport simulations are discussed. For nuclear reactor calculations, these include load balancing, recoding effort, reproducibility, domain decomposition techniques, I/O minimization, and strategies for different parallel architectures. Two codes were parallelized and tested for performance. The architectures employed include SIMD, MIMD-distributed memory, and workstation network with uneven interactive load. Speedups linear with the number of nodes were achieved.
DEFF Research Database (Denmark)
Kosbar, Tamer R.; Sofan, Mamdouh A.; Waly, Mohamed A.
2015-01-01
The phosphoramidites of DNA monomers of 7-(3-aminopropyn-1-yl)-8-aza-7-deazaadenine (Y) and 7-(3-aminopropyn-1-yl)-8-aza-7-deazaadenine LNA (Z) are synthesized, and the thermal stability at pH 7.2 and 8.2 of anti-parallel triplexes modified with these two monomers is determined. When the anti... about 6.1 °C when the TFO strand was modified with Z and the Watson-Crick strand with adenine-LNA (AL). The molecular modeling results showed that, in the case of nucleobases Y and Z, a hydrogen bond (1.69 and 1.72 Å, respectively) was formed between the protonated 3-aminopropyn-1-yl chain and one... of the phosphate groups in the Watson-Crick strand. Also, it was shown that the nucleobase Y made a good stacking and binding with the other nucleobases in the TFO and Watson-Crick duplex, respectively. In contrast, the nucleobase Z with LNA moiety was forced to twist out of the plane of the Watson-Crick base pair, which...
Parallel consensual neural networks.
Benediktsson, J A; Sveinsson, J R; Ersoy, O K; Swain, P H
1997-01-01
A new type of a neural-network architecture, the parallel consensual neural network (PCNN), is introduced and applied in classification/data fusion of multisource remote sensing and geographic data. The PCNN architecture is based on statistical consensus theory and involves using stage neural networks with transformed input data. The input data are transformed several times and the different transformed data are used as if they were independent inputs. The independent inputs are first classified using the stage neural networks. The output responses from the stage networks are then weighted and combined to make a consensual decision. In this paper, optimization methods are used in order to weight the outputs from the stage networks. Two approaches are proposed to compute the data transforms for the PCNN, one for binary data and another for analog data. The analog approach uses wavelet packets. The experimental results obtained with the proposed approach show that the PCNN outperforms both a conjugate-gradient backpropagation neural network and conventional statistical methods in terms of overall classification accuracy of test data.
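The final step described above, weighting the stage-network outputs and combining them into a consensual decision, can be illustrated with a short sketch. It assumes each stage network already returns per-class scores; the function name `consensus`, the array shapes and the fixed weights are assumptions for illustration, whereas the paper obtains its weights by optimization.

```python
import numpy as np


def consensus(stage_outputs, weights):
    """Combine per-stage class scores with a weighted sum and pick a class.

    stage_outputs: array of shape (n_stages, n_samples, n_classes)
    weights:       array of shape (n_stages,), assumed non-negative
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the weights sum to one
    combined = np.tensordot(w, stage_outputs, axes=1)  # (n_samples, n_classes)
    return combined.argmax(axis=1)


# Three hypothetical stage networks scoring two samples over two classes.
outputs = np.array([
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.2, 0.8], [0.3, 0.7]],
    [[0.6, 0.4], [0.8, 0.2]],
])
labels = consensus(outputs, weights=[0.5, 0.3, 0.2])
```

With these weights the first sample is assigned to class 0 and the second to class 1, even though the individual stages disagree on both.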
Gadgetron: An Open Source Framework for Medical Image Reconstruction
DEFF Research Database (Denmark)
Hansen, Michael Schacht; Sørensen, Thomas Sangild
2013-01-01
This work presents a new open source framework for medical image reconstruction called the “Gadgetron.” The framework implements a flexible system for creating streaming data processing pipelines where data pass through a series of modules or “Gadgets” from raw data to reconstructed images...... with a set of dedicated toolboxes in shared libraries for medical image reconstruction. This includes generic toolboxes for data-parallel (e.g., GPU-based) execution of compute-intensive components. The basic framework architecture is independent of medical imaging modality, but this article focuses on its...
A filtered backprojection reconstruction algorithm for Compton camera
Energy Technology Data Exchange (ETDEWEB)
Lojacono, Xavier; Maxim, Voichita; Peyrin, Francoise; Prost, Remy [Lyon Univ., Villeurbanne (France). CNRS, Inserm, INSA-Lyon, CREATIS, UMR5220; Zoglauer, Andreas [California Univ., Berkeley, CA (United States). Space Sciences Lab.
2011-07-01
In this paper we present a filtered backprojection reconstruction algorithm for Compton Camera detectors of particles. Compared to iterative methods, widely used for the reconstruction of images from Compton camera data, analytical methods are fast, easy to implement and avoid convergence issues. The method we propose is exact for an idealized Compton camera composed of two parallel plates of infinite dimension. We show that it copes well with low number of detected photons simulated from a realistic device. Images reconstructed from both synthetic data and realistic ones obtained with Monte Carlo simulations demonstrate the efficiency of the algorithm. (orig.)
A Parallel Particle Swarm Optimizer
National Research Council Canada - National Science Library
Schutte, J. F; Fregly, B .J; Haftka, R. T; George, A. D
2003-01-01
.... Motivated by a computationally demanding biomechanical system identification problem, we introduce a parallel implementation of a stochastic population based global optimizer, the Particle Swarm...
Patterns for Parallel Software Design
Ortega-Arjona, Jorge Luis
2010-01-01
Essential reading to understand patterns for parallel programming Software patterns have revolutionized the way we think about how software is designed, built, and documented, and the design of parallel software requires you to consider other particular design aspects and special skills. From clusters to supercomputers, success heavily depends on the design skills of software developers. Patterns for Parallel Software Design presents a pattern-oriented software architecture approach to parallel software design. This approach is not a design method in the classic sense, but a new way of managin
DEFF Research Database (Denmark)
Christensen, Mark Schram; Ehrsson, H Henrik; Nielsen, Jens Bo
2013-01-01
...adduction-abduction movements symmetrically or in parallel with real-time congruent or incongruent visual feedback of the movements. One network, consisting of bilateral superior and middle frontal gyrus and supplementary motor area (SMA), was more active when subjects performed parallel movements, whereas... a different network, involving bilateral dorsal premotor cortex (PMd), primary motor cortex, and SMA, was more active when subjects viewed parallel movements while performing either symmetrical or parallel movements. Correlations between behavioral instability and brain activity were present in right lateral...
High temporal resolution functional MRI using parallel echo volumar imaging
International Nuclear Information System (INIS)
Rabrait, C.; Ciuciu, P.; Ribes, A.; Poupon, C.; Dehaine-Lambertz, G.; LeBihan, D.; Lethimonnier, F.; Le Roux, P.; Dehaine-Lambertz, G.
2008-01-01
Purpose: To combine parallel imaging with 3D single-shot acquisition (echo volumar imaging, EVI) in order to acquire high temporal resolution volumar functional MRI (fMRI) data. Materials and Methods: An improved EVI sequence was associated with parallel acquisition and field of view reduction in order to acquire a large brain volume in 200 msec. Temporal stability and functional sensitivity were increased through optimization of all imaging parameters and Tikhonov regularization of parallel reconstruction. Two human volunteers were scanned with parallel EVI in a 1.5 T whole-body MR system, while submitted to a slow event-related auditory paradigm. Results: Thanks to parallel acquisition, the EVI volumes display a low level of geometric distortions and signal losses. After removal of low-frequency drifts and physiological artifacts, activations were detected in the temporal lobes of both volunteers and voxel-wise hemodynamic response functions (HRF) could be computed. On these HRF, different habituation behaviors in response to sentence repetition could be identified. Conclusion: This work demonstrates the feasibility of high temporal resolution 3D fMRI with parallel EVI. Combined with advanced estimation tools, this acquisition method should prove useful to measure neural activity timing differences or study the nonlinearities and non-stationarities of the BOLD response. (authors)
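To make the phrase "Tikhonov regularization of parallel reconstruction" concrete, the sketch below solves a generic regularized unfolding problem for one group of aliased pixels, in the style of a SENSE reconstruction. It is an assumption-laden illustration rather than the authors' implementation: the function name `tikhonov_unfold`, the synthetic sensitivities `S` and the regularization weight `lam` are all hypothetical.

```python
import numpy as np


def tikhonov_unfold(S, y, lam):
    """Solve min_x ||S x - y||^2 + lam * ||x||^2 via the normal equations.

    S:   (n_coils, R) complex coil sensitivities at the R aliased positions
    y:   (n_coils,)   complex aliased pixel value seen by each coil
    lam: non-negative Tikhonov weight (hypothetical tuning parameter)
    """
    R = S.shape[1]
    A = S.conj().T @ S + lam * np.eye(R)
    return np.linalg.solve(A, S.conj().T @ y)


# Synthetic example: 8 coils, acceleration factor 2, noiseless data.
rng = np.random.default_rng(0)
S = rng.standard_normal((8, 2)) + 1j * rng.standard_normal((8, 2))
x_true = np.array([1.0 + 0.5j, -0.3 + 2.0j])
y = S @ x_true

x_hat = tikhonov_unfold(S, y, lam=1e-8)        # nearly unregularized
x_smooth = tikhonov_unfold(S, y, lam=10.0)     # strongly regularized
```

A larger `lam` trades fidelity for stability: the solution norm shrinks, which damps noise amplification when the sensitivity matrix is poorly conditioned.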
Parallel algorithms for online trackfinding at PANDA
Energy Technology Data Exchange (ETDEWEB)
Bianchi, Ludovico; Ritman, James; Stockmanns, Tobias [IKP, Forschungszentrum Juelich GmbH (Germany); Herten, Andreas [JSC, Forschungszentrum Juelich GmbH (Germany); Collaboration: PANDA-Collaboration
2016-07-01
The PANDA experiment, one of the four scientific pillars of the FAIR facility currently under construction in Darmstadt, is a next-generation particle detector that will study collisions of antiprotons with beam momenta of 1.5-15 GeV/c on a fixed proton target. Because of the broad physics scope and the similar signature of signal and background events, PANDA's strategy for data acquisition is to continuously record data from the whole detector and use this global information to perform online event reconstruction and filtering. A real-time rejection factor of up to 1000 must be achieved to match the incoming data rate for offline storage, making all components of the data processing system computationally very challenging. Online particle track identification and reconstruction is an essential step, since track information is used as input in all following phases. Online tracking algorithms must ensure a delicate balance between high tracking efficiency and quality, and minimal computational footprint. For this reason, a massively parallel solution exploiting multiple Graphics Processing Units (GPUs) is under investigation. The talk presents the core concepts of the algorithms being developed for primary trackfinding, along with details of their implementation on GPUs.
Fully 3D GPU PET reconstruction
Energy Technology Data Exchange (ETDEWEB)
Herraiz, J.L., E-mail: joaquin@nuclear.fis.ucm.es [Grupo de Fisica Nuclear, Departmento Fisica Atomica, Molecular y Nuclear, Universidad Complutense de Madrid (Spain); Espana, S. [Department of Radiation Oncology, Massachusetts General Hospital and Harvard Medical School, Boston, MA (United States); Cal-Gonzalez, J. [Grupo de Fisica Nuclear, Departmento Fisica Atomica, Molecular y Nuclear, Universidad Complutense de Madrid (Spain); Vaquero, J.J. [Departmento de Bioingenieria e Ingenieria Espacial, Universidad Carlos III, Madrid (Spain); Desco, M. [Departmento de Bioingenieria e Ingenieria Espacial, Universidad Carlos III, Madrid (Spain); Unidad de Medicina y Cirugia Experimental, Hospital General Universitario Gregorio Maranon, Madrid (Spain); Udias, J.M. [Grupo de Fisica Nuclear, Departmento Fisica Atomica, Molecular y Nuclear, Universidad Complutense de Madrid (Spain)
2011-08-21
Fully 3D iterative tomographic image reconstruction is computationally very demanding. Graphics Processing Units (GPUs) have been proposed for many years as potential accelerators in complex scientific problems, but it was not until the recent advances in the programmability of GPUs that the best available reconstruction codes started to be implemented to run on GPUs. This work presents GPU-based fully 3D PET iterative reconstruction software. This new code can reconstruct sinogram data from several commercially available PET scanners. The most important and time-consuming parts of the code, the forward and backward projection operations, are based on an accurate model of the scanner obtained with the Monte Carlo code PeneloPET, and they have been massively parallelized on the GPU. For the PET scanners considered, the GPU-based code is more than 70 times faster than a similar code running on a single core of a fast CPU, obtaining the same images in both cases. The code has been designed to be easily adapted to reconstruct sinograms from any other PET scanner, including scanner prototypes.
Dynamic dual-tracer PET reconstruction.
Gao, Fei; Liu, Huafeng; Jian, Yiqiang; Shi, Pengcheng
2009-01-01
Although of important medical implications, simultaneous dual-tracer positron emission tomography reconstruction remains a challenging problem, primarily because the photon measurements from dual tracers are overlapped. In this paper, we propose a simultaneous dynamic dual-tracer reconstruction of tissue activity maps based on guidance from tracer kinetics. The dual-tracer reconstruction problem is formulated in a state-space representation, where parallel compartment models serve as continuous-time system equation describing the tracer kinetic processes of dual tracers, and the imaging data is expressed as discrete sampling of the system states in measurement equation. The image reconstruction problem has therefore become a state estimation problem in a continuous-discrete hybrid paradigm, and H infinity filtering is adopted as the estimation strategy. As H infinity filtering makes no assumptions on the system and measurement statistics, robust reconstruction results can be obtained for the dual-tracer PET imaging system where the statistical properties of measurement data and system uncertainty are not available a priori, even when there are disturbances in the kinetic parameters. Experimental results on digital phantoms, Monte Carlo simulations and physical phantoms have demonstrated the superior performance.
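The continuous-discrete state-space formulation described above has the following generic shape. The symbols ($\mathbf{x}$ for the compartment activities, $A$ for the kinetic rate matrix, $\mathbf{u}$ for the input function, $D$ for the imaging system matrix, $\mathbf{v}$ and $\mathbf{e}_k$ for system and measurement disturbances) are assumptions chosen for illustration and are not necessarily the paper's notation.

```latex
% Continuous-time system equation: parallel compartment models for both tracers
\dot{\mathbf{x}}(t) = A\,\mathbf{x}(t) + B\,\mathbf{u}(t) + \mathbf{v}(t)

% Discrete measurement equation: imaging data sampled at frame times t_k
\mathbf{y}_k = D\,\mathbf{x}(t_k) + \mathbf{e}_k
```

Reconstruction then amounts to estimating $\mathbf{x}(t_k)$ from the measurements $\mathbf{y}_k$; an H-infinity filter does so without assuming particular statistics for $\mathbf{v}$ and $\mathbf{e}_k$.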
Detector independent cellular automaton algorithm for track reconstruction
Energy Technology Data Exchange (ETDEWEB)
Kisel, Ivan; Kulakov, Igor; Zyzak, Maksym [Goethe Univ. Frankfurt am Main (Germany); Frankfurt Institute for Advanced Studies, Frankfurt am Main (Germany); GSI Helmholtzzentrum fuer Schwerionenforschung GmbH (Germany); Collaboration: CBM-Collaboration
2013-07-01
Track reconstruction is one of the most challenging problems of data analysis in modern high energy physics (HEP) experiments, which have to process on the order of 10^7 events per second with high track multiplicity and density, registered by detectors of different types and, in many cases, located in a non-homogeneous magnetic field. The creation of a reconstruction package common to all experiments is considered important in order to consolidate efforts. The cellular automaton (CA) track reconstruction approach has been used successfully in many HEP experiments. It is very simple, efficient, local and parallel. It is also intrinsically independent of detector geometry and is a good candidate for common track reconstruction. The CA implementation for the CBM experiment has been generalized and applied to the ALICE ITS and STAR HFT detectors. Tests with simulated collisions have been performed. The track reconstruction efficiencies are at the level of 95% for the majority of the signal tracks for all detectors.
PARALLEL IMPORT: REALITY FOR RUSSIA
Directory of Open Access Journals (Sweden)
Т. А. Сухопарова
2014-01-01
The problem of parallel import is currently an urgent question. Legalization of parallel import in Russia is expedient. This statement is based on an analysis of opposing expert opinions. At the same time, it is necessary to consider the negative consequences of this decision and to apply remedies to minimize them.
The Galley Parallel File System
Nieuwejaar, Nils; Kotz, David
1996-01-01
Most current multiprocessor file systems are designed to use multiple disks in parallel, using the high aggregate bandwidth to meet the growing I/O requirements of parallel scientific applications. Many multiprocessor file systems provide applications with a conventional Unix-like interface, allowing the application to access multiple disks transparently. This interface conceals the parallelism within the file system, increasing the ease of programmability, but making it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. In addition to providing an insufficient interface, most current multiprocessor file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic scientific multiprocessor workloads. We discuss Galley's file structure and application interface, as well as the performance advantages offered by that interface.
Parallelization of the FLAPW method
International Nuclear Information System (INIS)
Canning, A.; Mannstadt, W.; Freeman, A.J.
1999-01-01
The FLAPW (full-potential linearized-augmented plane-wave) method is one of the most accurate first-principles methods for determining electronic and magnetic properties of crystals and surfaces. Until the present work, the FLAPW method has been limited to systems of less than about one hundred atoms due to a lack of an efficient parallel implementation to exploit the power and memory of parallel computers. In this work we present an efficient parallelization of the method by division among the processors of the plane-wave components for each state. The code is also optimized for RISC (reduced instruction set computer) architectures, such as those found on most parallel computers, making full use of BLAS (basic linear algebra subprograms) wherever possible. Scaling results are presented for systems of up to 686 silicon atoms and 343 palladium atoms per unit cell, running on up to 512 processors on a CRAY T3E parallel computer.
Parallelization of the FLAPW method
Canning, A.; Mannstadt, W.; Freeman, A. J.
2000-08-01
The FLAPW (full-potential linearized-augmented plane-wave) method is one of the most accurate first-principles methods for determining structural, electronic and magnetic properties of crystals and surfaces. Until the present work, the FLAPW method has been limited to systems of less than about a hundred atoms due to the lack of an efficient parallel implementation to exploit the power and memory of parallel computers. In this work, we present an efficient parallelization of the method by division among the processors of the plane-wave components for each state. The code is also optimized for RISC (reduced instruction set computer) architectures, such as those found on most parallel computers, making full use of BLAS (basic linear algebra subprograms) wherever possible. Scaling results are presented for systems of up to 686 silicon atoms and 343 palladium atoms per unit cell, running on up to 512 processors on a CRAY T3E parallel supercomputer.
A Method for Interactive 3D Reconstruction of Piecewise Planar Objects from Single Images
Sturm, Peter; Maybank, Steve
1999-01-01
We present an approach for 3D reconstruction of objects from a single image. Obviously, constraints on the 3D structure are needed to perform this task. Our approach is based on user-provided coplanarity, perpendicularity and parallelism constraints. These are used to calibrate the image and perform 3D reconstruction. The method is described in detail and results are provided.
International Nuclear Information System (INIS)
Qi Zhihua; Chen Guanghong
2007-01-01
Recently, x-ray differential phase contrast computed tomography (DPC-CT) has been experimentally implemented using a conventional source combined with several gratings. Images were reconstructed using a parallel-beam reconstruction formula. However, parallel-beam reconstruction formulae are not directly applicable for a large image object where the parallel-beam approximation fails. In this note, we present a new image reconstruction formula for fan-beam DPC-CT. There are two major features in this algorithm: (1) it enables the reconstruction of a local region of interest (ROI) using data acquired from an angular interval shorter than 180° + fan angle and (2) it still preserves the filtered backprojection structure. Numerical simulations have been conducted to validate the image reconstruction algorithm. (note)
Reconstruction of multiple line source attenuation maps
International Nuclear Information System (INIS)
Celler, A.; Sitek, A.; Harrop, R.
1996-01-01
A simple configuration for a transmission source for single photon emission computed tomography (SPECT) was proposed, which utilizes a series of collimated line sources parallel to the axis of rotation of a camera. The detector is equipped with a standard parallel hole collimator. We have demonstrated that this type of source configuration can be used to generate sufficient data for the reconstruction of the attenuation map when using 8-10 line sources spaced by 3.5-4.5 cm for a 30 x 40 cm detector at 65 cm distance from the sources. Transmission data for a nonuniform thorax phantom was simulated, then binned and reconstructed using filtered backprojection (FBP) and iterative methods. The optimum maps are obtained with data binned into 2-3 bins and FBP reconstruction. The activity in the source was investigated for uniform and exponential activity distributions, as well as the effect of gaps and overlaps of the neighboring fan beams. A prototype of the line source has been built and the experimental verification of the technique has started.
Image Reconstruction. Chapter 13
Energy Technology Data Exchange (ETDEWEB)
Nuyts, J. [Department of Nuclear Medicine and Medical Imaging Research Center, Katholieke Universiteit Leuven, Leuven (Belgium); Matej, S. [Medical Image Processing Group, Department of Radiology, University of Pennsylvania, Philadelphia, PA (United States)
2014-12-15
This chapter discusses how 2‑D or 3‑D images of tracer distribution can be reconstructed from a series of so-called projection images acquired with a gamma camera or a positron emission tomography (PET) system [13.1]. This is often called an ‘inverse problem’. The reconstruction is the inverse of the acquisition. The reconstruction is called an inverse problem because making software to compute the true tracer distribution from the acquired data turns out to be more difficult than the ‘forward’ direction, i.e. making software to simulate the acquisition. There are basically two approaches to image reconstruction: analytical reconstruction and iterative reconstruction. The analytical approach is based on mathematical inversion, yielding efficient, non-iterative reconstruction algorithms. In the iterative approach, the reconstruction problem is reduced to computing a finite number of image values from a finite number of measurements. That simplification enables the use of iterative instead of mathematical inversion. Iterative inversion tends to require more computer power, but it can cope with more complex (and hopefully more accurate) models of the acquisition process.
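As an illustration of the iterative approach described above, the sketch below applies the maximum-likelihood expectation-maximization (MLEM) update to a tiny discrete problem. MLEM is chosen here only as a familiar example of iterative inversion; the abstract itself does not name a specific algorithm, and the system matrix `A`, the data `y` and the function name `mlem` are illustrative assumptions.

```python
import numpy as np


def mlem(A, y, n_iter=50):
    """MLEM iterations for y ~ Poisson(A x), with x constrained to be >= 0.

    A: (n_measurements, n_pixels) non-negative system matrix
    y: (n_measurements,) measured counts
    """
    x = np.ones(A.shape[1])            # uniform positive starting image
    sens = A.sum(axis=0)               # sensitivity image: A^T applied to ones
    for _ in range(n_iter):
        proj = A @ x                   # forward projection (simulated acquisition)
        ratio = np.divide(y, proj, out=np.zeros_like(proj), where=proj > 0)
        x = x / sens * (A.T @ ratio)   # backproject the ratio and rescale
    return x


# Toy acquisition model: 4 measurements of a 3-pixel image, noiseless data.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
x_true = np.array([2.0, 0.5, 3.0])
y = A @ x_true

x_rec = mlem(A, y, n_iter=200)
```

Each iteration uses only a forward projection and a backprojection, which is why a more detailed acquisition model can be swapped in by changing `A` alone.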
Update on orbital reconstruction.
Chen, Chien-Tzung; Chen, Yu-Ray
2010-08-01
Orbital trauma is common and frequently complicated by ocular injuries. The recent literature on orbital fracture is analyzed with emphasis on epidemiological data assessment, surgical timing, method of approach and reconstruction materials. Computed tomographic (CT) scan has become a routine evaluation tool for orbital trauma, and mobile CT can be applied intraoperatively if necessary. Concomitant serious ocular injury should be carefully evaluated preoperatively. Patients presenting with nonresolving oculocardiac reflex, 'white-eyed' blowout fracture, or diplopia with a positive forced duction test and CT evidence of orbital tissue entrapment require early surgical repair. Otherwise, enophthalmos can be corrected by late surgery with a similar outcome to early surgery. The use of an endoscope-assisted approach for orbital reconstruction continues to grow, offering an alternative method. Advances in alloplastic materials have improved surgical outcome and shortened operating time. In this review of modern orbital reconstruction, several controversial issues such as surgical indication, surgical timing, method of approach and choice of reconstruction material are discussed. Preoperative fine-cut CT image and thorough ophthalmologic examination are key elements to determine surgical indications. The choice of surgical approach and reconstruction materials much depends on the surgeon's experience and the reconstruction area. Prefabricated alloplastic implants together with image software and stereolithographic models are significant advances that help to more accurately reconstruct the traumatized orbit. The recent evolution of orbit reconstruction improves functional and aesthetic results and minimizes surgical complications.
Is Monte Carlo embarrassingly parallel?
Energy Technology Data Exchange (ETDEWEB)
Hoogenboom, J. E. [Delft Univ. of Technology, Mekelweg 15, 2629 JB Delft (Netherlands); Delft Nuclear Consultancy, IJsselzoom 2, 2902 LB Capelle aan den IJssel (Netherlands)
2012-07-01
Monte Carlo is often stated as being embarrassingly parallel. However, running a Monte Carlo calculation, especially a reactor criticality calculation, in parallel using tens of processors shows a serious limitation in speedup and the execution time may even increase beyond a certain number of processors. In this paper the main causes of the loss of efficiency when using many processors are analyzed using a simple Monte Carlo program for criticality. The basic mechanism for parallel execution is MPI. One of the bottlenecks turns out to be the rendez-vous points in the parallel calculation used for synchronization and exchange of data between processors. This happens at least at the end of each cycle for fission source generation in order to collect the full fission source distribution for the next cycle and to estimate the effective multiplication factor, which is not only part of the requested results, but also input to the next cycle for population control. Basic improvements to overcome this limitation are suggested and tested. Other time losses in the parallel calculation are also identified. Moreover, the threading mechanism, which allows the parallel execution of tasks based on shared memory using OpenMP, is analyzed in detail. Recommendations are given to get the maximum efficiency out of a parallel Monte Carlo calculation. (authors)
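The claim that execution time can increase beyond a certain processor count can be illustrated with a toy timing model. This is only a sketch under stated assumptions, not the paper's analysis: the function names, the linear growth of the synchronization cost with processor count, and the numeric constants are all hypothetical.

```python
def cycle_time(n_procs, work=1.0, sync_cost=0.002):
    """Toy wall time for one fission-source cycle.

    Assumption: particle tracking divides perfectly over the processors,
    while the end-of-cycle rendez-vous costs sync_cost per processor.
    """
    return work / n_procs + sync_cost * n_procs


def speedup(n_procs, work=1.0, sync_cost=0.002):
    """Speedup relative to a single processor under the same toy model."""
    return cycle_time(1, work, sync_cost) / cycle_time(n_procs, work, sync_cost)


def best_proc_count(max_procs, work=1.0, sync_cost=0.002):
    """Processor count that minimizes the toy cycle time."""
    return min(range(1, max_procs + 1),
               key=lambda p: cycle_time(p, work, sync_cost))
```

In this model the cycle time has a minimum near sqrt(work / sync_cost) processors; adding processors beyond that point makes each cycle slower, which mirrors the qualitative behaviour the abstract reports.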
Is Monte Carlo embarrassingly parallel?
International Nuclear Information System (INIS)
Hoogenboom, J. E.
2012-01-01
Monte Carlo is often stated as being embarrassingly parallel. However, running a Monte Carlo calculation, especially a reactor criticality calculation, in parallel using tens of processors shows a serious limitation in speedup and the execution time may even increase beyond a certain number of processors. In this paper the main causes of the loss of efficiency when using many processors are analyzed using a simple Monte Carlo program for criticality. The basic mechanism for parallel execution is MPI. One of the bottlenecks turns out to be the rendez-vous points in the parallel calculation used for synchronization and exchange of data between processors. This happens at least at the end of each cycle for fission source generation in order to collect the full fission source distribution for the next cycle and to estimate the effective multiplication factor, which is not only part of the requested results, but also input to the next cycle for population control. Basic improvements to overcome this limitation are suggested and tested. Other time losses in the parallel calculation are also identified. Moreover, the threading mechanism, which allows the parallel execution of tasks based on shared memory using OpenMP, is analyzed in detail. Recommendations are given to get the maximum efficiency out of a parallel Monte Carlo calculation. (authors)
Parallel integer sorting with medium and fine-scale parallelism
Dagum, Leonardo
1993-01-01
Two new parallel integer sorting algorithms, queue-sort and barrel-sort, are presented and analyzed in detail. These algorithms do not have optimal parallel complexity, yet they show very good performance in practice. Queue-sort is designed for fine-scale parallel architectures which allow the queueing of multiple messages to the same destination. Barrel-sort is designed for medium-scale parallel architectures with a high message passing overhead. The performance results from the implementation of queue-sort on a Connection Machine CM-2 and barrel-sort on a 128 processor iPSC/860 are given. The two implementations are found to be comparable in performance but not as good as a fully vectorized bucket sort on the Cray YMP.
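For orientation, the bucket sort used as the baseline can be sketched serially as follows. This is a minimal illustration of the general technique for integer keys in a known range; it is neither queue-sort nor barrel-sort, it is not vectorized, and the function name and the `max_key` parameter are assumptions.

```python
def bucket_sort(keys, max_key):
    """Sort non-negative integers in the range 0..max_key.

    One bucket per key value: count the occurrences of each value,
    then emit the values in increasing order.
    """
    counts = [0] * (max_key + 1)
    for k in keys:
        counts[k] += 1
    out = []
    for value, count in enumerate(counts):
        out.extend([value] * count)
    return out
```

The counting pass has no comparisons between keys, which is why this family of sorts suits integer data with a bounded key range.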
Template based parallel checkpointing in a massively parallel computer system
Archer, Charles Jens [Rochester, MN]; Inglett, Todd Alan [Rochester, MN]
2009-01-13
A method and apparatus for a template based parallel checkpoint save for a massively parallel super computer system using a parallel variation of the rsync protocol, and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in the storage and that was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional non-lossy data compression algorithms to further reduce the overall checkpoint size.
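The idea of comparing each node's checkpoint against a stored template and keeping only the differing blocks can be sketched as below. This is a simplified illustration, not the patented method: it compares fixed-offset blocks by checksum, whereas rsync itself uses rolling checksums, and the block size, function names, and use of SHA-256 are all assumptions.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, unrealistically small block size for illustration


def split_blocks(data, size=BLOCK_SIZE):
    """Cut a bytes object into consecutive fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def checksum(block):
    """Checksum of one block (SHA-256 chosen only for the sketch)."""
    return hashlib.sha256(block).hexdigest()


def make_delta(node_data, template):
    """Return {block_index: block} for blocks that differ from the template."""
    template_sums = [checksum(b) for b in split_blocks(template)]
    delta = {}
    for i, block in enumerate(split_blocks(node_data)):
        if i >= len(template_sums) or checksum(block) != template_sums[i]:
            delta[i] = block
    return delta


def restore(template, delta, n_blocks):
    """Rebuild a node's checkpoint from the template plus its stored delta."""
    blocks = split_blocks(template)
    return b"".join(delta[i] if i in delta else blocks[i]
                    for i in range(n_blocks))
```

Only the delta needs to be transmitted and stored per node, which is where the reduction in checkpoint data described above comes from.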
Permutationally invariant state reconstruction
DEFF Research Database (Denmark)
Moroder, Tobias; Hyllus, Philipp; Tóth, Géza
2012-01-01
Feasible tomography schemes for large particle numbers must possess, besides an appropriate data acquisition protocol, an efficient way to reconstruct the density operator from the observed finite data set. Since state reconstruction typically requires the solution of a nonlinear large-scale optimization problem, this is a major challenge in the design of scalable tomography schemes. Here we present an efficient state reconstruction scheme for permutationally invariant quantum state tomography. It works for all common state-of-the-art reconstruction principles, including, in particular, maximum... optimization, which has clear advantages regarding speed, control and accuracy in comparison to commonly employed numerical routines. First prototype implementations easily allow reconstruction of a state of 20 qubits in a few minutes on a standard computer.
Parallel education: what is it?
Amos, Michelle Peta
2017-01-01
In the history of education it has long been discussed that single-sex and coeducation are the two models of education present in schools. With the introduction of parallel schools over the last 15 years, there has been very little research into this 'new model'. Many people do not understand what it means for a school to be parallel or they confuse a parallel model with co-education, due to the presence of both boys and girls within the one institution. Therefore, the main obj...
Balanced, parallel operation of flashlamps
International Nuclear Information System (INIS)
Carder, B.M.; Merritt, B.T.
1979-01-01
A new energy store, the Compensated Pulsed Alternator (CPA), promises to be a cost effective substitute for capacitors to drive flashlamps that pump large Nd:glass lasers. Because the CPA is large and discrete, it will be necessary that it drive many parallel flashlamp circuits, presenting a problem in equal current distribution. Current division to ±20% between parallel flashlamps has been achieved, but this is marginal for laser pumping. A method is presented here that provides equal current sharing to about 1%, and it includes fused protection against short circuit faults. The method was tested with eight parallel circuits, including both open-circuit and short-circuit fault tests.
Workspace Analysis for Parallel Robot
Directory of Open Access Journals (Sweden)
Ying Sun
2013-05-01
As a new type of robot, the parallel robot has many advantages over the serial robot, such as high rigidity, large load-carrying capacity, small error, high precision, a small self-weight/load ratio, good dynamic behavior and easy control; hence its range of application is widening. To find the workspace of a parallel mechanism, a numerical boundary-searching algorithm based on the inverse kinematic solution and the link-length limits is introduced. This paper analyses the position workspace and the orientation workspace of a parallel robot with six degrees of freedom. The results show that changing the branch length of the parallel mechanism is the main means of enlarging or reducing its workspace, and that the radius of the moving platform has no effect on the size of the workspace but changes its position.
"Feeling" Series and Parallel Resistances.
Morse, Robert A.
1993-01-01
Equipped with drinking straws and stirring straws, a teacher can help students understand how resistances in electric circuits combine in series and in parallel. Follow-up suggestions are provided. (ZWH)
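The two combination rules the activity is meant to convey can be written out as a short calculation. This is a generic sketch of the standard formulas for ideal resistors, not material from the article; the function names are assumptions.

```python
def series(*resistances):
    """Equivalent resistance of resistors in series: the values add."""
    return sum(resistances)


def parallel(*resistances):
    """Equivalent resistance of resistors in parallel:
    the reciprocals add, so the result is below the smallest resistor."""
    return 1.0 / sum(1.0 / r for r in resistances)
```

For two equal resistors, the series combination doubles the resistance and the parallel combination halves it.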
Parallel encoders for pixel detectors
International Nuclear Information System (INIS)
Nikityuk, N.M.
1991-01-01
A new method of fast encoding and determining the multiplicity and coordinates of fired pixels is described. A specific example construction of parallel encoders and MCC for n=49 and t=2 is given. 16 refs.; 6 figs.; 2 tabs.
Massively Parallel Finite Element Programming
Heister, Timo
2010-01-01
Today's large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data structures and algorithms to take advantage of the power of high performance computers in generic finite element codes. Existing generic finite element libraries often restrict the parallelization to parallel linear algebra routines. This is a limiting factor when solving on more than a few hundreds of cores. We describe routines for distributed storage of all major components coupled with efficient, scalable algorithms. We give an overview of our effort to enable the modern and generic finite element library deal.II to take advantage of the power of large clusters. In particular, we describe the construction of a distributed mesh and develop algorithms to fully parallelize the finite element calculation. Numerical results demonstrate good scalability. © 2010 Springer-Verlag.
Event monitoring of parallel computations
Directory of Open Access Journals (Sweden)
Gruzlikov Alexander M.
2015-06-01
The paper considers the monitoring of parallel computations for detection of abnormal events. It is assumed that computations are organized according to an event model, and monitoring is based on specific test sequences.
Massively Parallel Finite Element Programming
Heister, Timo; Kronbichler, Martin; Bangerth, Wolfgang
2010-01-01
Today's large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data structures and algorithms to take advantage of the power of high performance computers in generic finite element codes. Existing generic finite element libraries often restrict the parallelization to parallel linear algebra routines. This is a limiting factor when solving on more than a few hundreds of cores. We describe routines for distributed storage of all major components coupled with efficient, scalable algorithms. We give an overview of our effort to enable the modern and generic finite element library deal.II to take advantage of the power of large clusters. In particular, we describe the construction of a distributed mesh and develop algorithms to fully parallelize the finite element calculation. Numerical results demonstrate good scalability. © 2010 Springer-Verlag.
The STAPL Parallel Graph Library
Harshvardhan,
2013-01-01
This paper describes the stapl Parallel Graph Library, a high-level framework that abstracts the user from data-distribution and parallelism details and allows them to concentrate on parallel graph algorithm development. It includes a customizable distributed graph container and a collection of commonly used parallel graph algorithms. The library introduces pGraph pViews that separate algorithm design from the container implementation. It supports three graph processing algorithmic paradigms, level-synchronous, asynchronous and coarse-grained, and provides common graph algorithms based on them. Experimental results demonstrate improved scalability in performance and data size over existing graph libraries on more than 16,000 cores and on internet-scale graphs containing over 16 billion vertices and 250 billion edges. © Springer-Verlag Berlin Heidelberg 2013.
Integrated variable projection approach (IVAPA) for parallel magnetic resonance imaging.
Zhang, Qiao; Sheng, Jinhua
2012-10-01
Parallel magnetic resonance imaging (pMRI) is a fast method which requires algorithms for reconstructing the image from a small number of measured k-space lines. The accurate estimation of the coil sensitivity functions is still a challenging problem in parallel imaging. The joint estimation of the coil sensitivity functions and the desired image has recently been proposed to improve the situation by iteratively optimizing both the coil sensitivity functions and the image reconstruction. It regards both the coil sensitivities and the desired images as unknowns to be solved for jointly. In this paper, we propose an integrated variable projection approach (IVAPA) for pMRI, which integrates two individual processing steps (coil sensitivity estimation and image reconstruction) into a single processing step to improve the accuracy of the coil sensitivity estimation using the variable projection approach. The method is demonstrated to be able to give an optimal solution with considerably reduced artifacts for high reduction factors and a low number of auto-calibration signal (ACS) lines, and our implementation has a fast convergence rate. The performance of the proposed method is evaluated using a set of in vivo experiment data. Copyright © 2012 Elsevier Ltd. All rights reserved.
Construction and reconstruction concept in mathematics instruction
Mumu, Jeinne; Charitas Indra Prahmana, Rully; Tanujaya, Benidiktus
2017-12-01
The purpose of this paper is to describe two learning activities undertaken by lecturers so that students can understand a mathematical concept. The mathematical concept studied in this research is the vector space in linear algebra instruction. Classroom Action Research is used as the research method, with pre-service mathematics teachers at the University of Papua as the research subjects. Student participants are divided into two parallel classes: a regular class of 24 students and a remedial class of 18 students. Both approaches, concept construction and concept reconstruction, are implemented in both classes. The results show that concept construction works only in the regular class, while in the remedial class learning with the concept construction approach does not increase students' understanding of the concept taught. Students in the remedial class come to understand the concept only through the concept reconstruction approach.
Equilibrium reconstruction in stellarators: V3FIT
Energy Technology Data Exchange (ETDEWEB)
Hanson, J.D.; Knowlton, S.F. [Physics Department, Auburn University, Auburn, AL (United States); Hirshman, S.P.; Lazarus, E.A. [Oak Ridge National Laboratory, Oak Ridge, TN (United States); Lao, L.L. [General Atomics, San Diego, CA (United States)
2003-07-01
The first section describes a general response function formalism for computing stellarator magnetic diagnostic signals, which is the first step in developing a reconstruction capability. The approach parallels that used in the EFIT two-dimensional (2-D) equilibrium reconstruction code. The second section describes the two codes we have written, V3RFUN and V3POST. V3RFUN computes the response functions for a specified magnetic diagnostic coil, and V3POST uses the response functions calculated by V3RFUN, along with the plasma current information supplied by the equilibrium code VMEC, to compute the expected magnetic diagnostic signals. These two codes are currently being used to design magnetic diagnostics for the NCSX stellarator (at PPPL) and the CTH toroidal hybrid stellarator (at Auburn University). The last section of the paper describes plans for the V3FIT code. (orig.)
Prompt data reconstruction at the ATLAS experiment
International Nuclear Information System (INIS)
Andrew Stewart, Graeme; Boyd, Jamie; Unal, Guillaume; Firmino da Costa, João; Tuggle, Joseph
2012-01-01
The ATLAS experiment at the LHC collider recorded more than 5 fb⁻¹ of pp collision data at a centre-of-mass energy of 7 TeV during 2011. The recorded data are promptly reconstructed in two steps at a large computing farm at CERN to provide fast access to high quality data for physics analysis. In the first step, a subset of the data, corresponding to the express stream with an event rate of 10 Hz, is processed in parallel with data taking. Data quality, detector calibration constants, and the beam spot position are determined using the reconstructed data within 48 hours. In the second step all recorded data are processed with the updated parameters. The LHC significantly increased the instantaneous luminosity and the number of interactions per bunch crossing in 2011; the data recording rate by ATLAS exceeds 400 Hz. To cope with these challenges the performance and reliability of the ATLAS reconstruction software have been improved. In this paper we describe how the prompt data reconstruction system quickly and stably provides high quality data to analysers.
Writing parallel programs that work
CERN. Geneva
2012-01-01
Serial algorithms typically run inefficiently on parallel machines. This may sound like an obvious statement, but it is the root cause of why parallel programming is considered to be difficult. The current state of the computer industry is still that almost all programs in existence are serial. This talk will describe the techniques used in the Intel Parallel Studio to provide a developer with the tools necessary to understand the behaviors and limitations of the existing serial programs. Once the limitations are known the developer can refactor the algorithms and reanalyze the resulting programs with the tools in the Intel Parallel Studio to create parallel programs that work. About the speaker Paul Petersen is a Sr. Principal Engineer in the Software and Solutions Group (SSG) at Intel. He received a Ph.D. degree in Computer Science from the University of Illinois in 1993. After UIUC, he was employed at Kuck and Associates, Inc. (KAI) working on auto-parallelizing compiler (KAP), and was involved in th...
Exploiting Symmetry on Parallel Architectures.
Stiller, Lewis Benjamin
1995-01-01
This thesis describes techniques for the design of parallel programs that solve well-structured problems with inherent symmetry. Part I demonstrates the reduction of such problems to generalized matrix multiplication by a group-equivariant matrix. Fast techniques for this multiplication are described, including factorization, orbit decomposition, and Fourier transforms over finite groups. Our algorithms entail interaction between two symmetry groups: one arising at the software level from the problem's symmetry and the other arising at the hardware level from the processors' communication network. Part II illustrates the applicability of our symmetry-exploitation techniques by presenting a series of case studies of the design and implementation of parallel programs. First, a parallel program that solves chess endgames by factorization of an associated dihedral group-equivariant matrix is described. This code runs faster than previous serial programs, and it discovered a number of results. Second, parallel algorithms for Fourier transforms for finite groups are developed, and preliminary parallel implementations for group transforms of dihedral and of symmetric groups are described. Applications in learning, vision, pattern recognition, and statistics are proposed. Third, parallel implementations solving several computational science problems are described, including the direct n-body problem, convolutions arising from molecular biology, and some communication primitives such as broadcast and reduce. Some of our implementations ran orders of magnitude faster than previous techniques, and were used in the investigation of various physical phenomena.
Parallel algorithms for continuum dynamics
International Nuclear Information System (INIS)
Hicks, D.L.; Liebrock, L.M.
1987-01-01
Simply porting existing parallel programs to a new parallel processor may not achieve the full speedup possible; to achieve the maximum efficiency may require redesigning the parallel algorithms for the specific architecture. The authors discuss here parallel algorithms that were developed first for the HEP processor and then ported to the CRAY X-MP/4, the ELXSI/10, and the Intel iPSC/32. Focus is mainly on the most recent parallel processing results produced, i.e., those on the Intel Hypercube. The applications are simulations of continuum dynamics in which the momentum and stress gradients are important. Examples of these are inertial confinement fusion experiments, severe breaks in the coolant system of a reactor, weapons physics, shock-wave physics. Speedup efficiencies on the Intel iPSC Hypercube are very sensitive to the ratio of communication to computation. Great care must be taken in designing algorithms for this machine to avoid global communication. This is much more critical on the iPSC than it was on the three previous parallel processors
Hybrid spectral CT reconstruction.
Directory of Open Access Journals (Sweden)
Darin P Clark
Current photon counting x-ray detector (PCD) technology faces limitations associated with spectral fidelity and photon starvation. One strategy for addressing these limitations is to supplement PCD data with high-resolution, low-noise data acquired with an energy-integrating detector (EID). In this work, we propose an iterative, hybrid reconstruction technique which combines the spectral properties of PCD data with the resolution and signal-to-noise characteristics of EID data. Our hybrid reconstruction technique is based on an algebraic model of data fidelity which substitutes the EID data into the data fidelity term associated with the PCD reconstruction, resulting in a joint reconstruction problem. Within the split Bregman framework, these data fidelity constraints are minimized subject to additional constraints on spectral rank and on joint intensity-gradient sparsity measured between the reconstructions of the EID and PCD data. Following a derivation of the proposed technique, we apply it to the reconstruction of a digital phantom which contains realistic concentrations of iodine, barium, and calcium encountered in small-animal micro-CT. The results of this experiment suggest reliable separation and detection of iodine at concentrations ≥ 5 mg/ml and barium at concentrations ≥ 10 mg/ml in 2-mm features for EID and PCD data reconstructed with inherent spatial resolutions of 176 μm and 254 μm, respectively (point spread function, FWHM). Furthermore, hybrid reconstruction is demonstrated to enhance spatial resolution within material decomposition results and to improve low-contrast detectability by as much as 2.6 times relative to reconstruction with PCD data only. The parameters of the simulation experiment are based on an in vivo micro-CT experiment conducted in a mouse model of soft-tissue sarcoma. Material decomposition results produced from this in vivo data demonstrate the feasibility of distinguishing two K-edge contrast agents with
Hybrid spectral CT reconstruction
Clark, Darin P.
2017-01-01
Current photon counting x-ray detector (PCD) technology faces limitations associated with spectral fidelity and photon starvation. One strategy for addressing these limitations is to supplement PCD data with high-resolution, low-noise data acquired with an energy-integrating detector (EID). In this work, we propose an iterative, hybrid reconstruction technique which combines the spectral properties of PCD data with the resolution and signal-to-noise characteristics of EID data. Our hybrid reconstruction technique is based on an algebraic model of data fidelity which substitutes the EID data into the data fidelity term associated with the PCD reconstruction, resulting in a joint reconstruction problem. Within the split Bregman framework, these data fidelity constraints are minimized subject to additional constraints on spectral rank and on joint intensity-gradient sparsity measured between the reconstructions of the EID and PCD data. Following a derivation of the proposed technique, we apply it to the reconstruction of a digital phantom which contains realistic concentrations of iodine, barium, and calcium encountered in small-animal micro-CT. The results of this experiment suggest reliable separation and detection of iodine at concentrations ≥ 5 mg/ml and barium at concentrations ≥ 10 mg/ml in 2-mm features for EID and PCD data reconstructed with inherent spatial resolutions of 176 μm and 254 μm, respectively (point spread function, FWHM). Furthermore, hybrid reconstruction is demonstrated to enhance spatial resolution within material decomposition results and to improve low-contrast detectability by as much as 2.6 times relative to reconstruction with PCD data only. The parameters of the simulation experiment are based on an in vivo micro-CT experiment conducted in a mouse model of soft-tissue sarcoma. Material decomposition results produced from this in vivo data demonstrate the feasibility of distinguishing two K-edge contrast agents with a spectral
Tomographic reconstruction by using FPSIRT (Fast Particle System Iterative Reconstruction Technique)
Energy Technology Data Exchange (ETDEWEB)
Moreira, Icaro Valgueiro M.; Melo, Silvio de Barros; Dantas, Carlos; Lima, Emerson Alexandre; Silva, Ricardo Martins; Cardoso, Halisson Alberdan C., E-mail: ivmm@cin.ufpe.br, E-mail: sbm@cin.ufpe.br, E-mail: rmas@cin.ufpe.br, E-mail: hacc@cin.ufpe.br, E-mail: ccd@ufpe.br, E-mail: eal@cin.ufpe.br [Universidade Federal de Pernambuco (UFPE), Recife, PE (Brazil)
2015-07-01
The PSIRT (Particle System Iterative Reconstruction Technique) is a method of tomographic image reconstruction primarily designed to work with configurations suitable for industrial applications. A particle system is an optimization technique inspired by real physical systems that associates with the material being reconstructed a set of particles with certain physical features, subject to a force field, which can produce movement. The system constantly updates the set of particles by repositioning them in such a way as to approach the equilibrium. The elastic potential along a trajectory is a function of the difference between the attenuation coefficient in the current configuration and the corresponding input data. PSIRT has been successfully used to reconstruct simulated and real objects subject to sets of parallel and fan-beam lines at different angles, representing typical gamma-ray tomographic arrangements. One of PSIRT's limitations was its performance, too slow for real-time scenarios. In this work, a reformulation of PSIRT's computational model is presented, which gives the new algorithm, FPSIRT (Fast Particle System Iterative Reconstruction Technique), a performance up to 200 times faster than PSIRT's. A comparison of their application to real and simulated data from the HSGT, High Speed Gamma Tomograph, is also presented. (author)
Tomographic reconstruction by using FPSIRT (Fast Particle System Iterative Reconstruction Technique)
International Nuclear Information System (INIS)
Moreira, Icaro Valgueiro M.; Melo, Silvio de Barros; Dantas, Carlos; Lima, Emerson Alexandre; Silva, Ricardo Martins; Cardoso, Halisson Alberdan C.
2015-01-01
The PSIRT (Particle System Iterative Reconstruction Technique) is a method of tomographic image reconstruction primarily designed to work with configurations suitable for industrial applications. A particle system is an optimization technique inspired by real physical systems that associates with the material being reconstructed a set of particles with certain physical features, subject to a force field, which can produce movement. The system constantly updates the set of particles by repositioning them in such a way as to approach the equilibrium. The elastic potential along a trajectory is a function of the difference between the attenuation coefficient in the current configuration and the corresponding input data. PSIRT has been successfully used to reconstruct simulated and real objects subject to sets of parallel and fan-beam lines at different angles, representing typical gamma-ray tomographic arrangements. One of PSIRT's limitations was its performance, too slow for real-time scenarios. In this work, a reformulation of PSIRT's computational model is presented, which gives the new algorithm, FPSIRT (Fast Particle System Iterative Reconstruction Technique), a performance up to 200 times faster than PSIRT's. A comparison of their application to real and simulated data from the HSGT, High Speed Gamma Tomograph, is also presented. (author)
Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.
2014-08-12
Endpoint-based parallel data processing in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.
Parallel Implicit Algorithms for CFD
Keyes, David E.
1998-01-01
The main goal of this project was efficient distributed parallel and workstation cluster implementations of Newton-Krylov-Schwarz (NKS) solvers for implicit Computational Fluid Dynamics (CFD). "Newton" refers to a quadratically convergent nonlinear iteration using gradient information based on the true residual, "Krylov" to an inner linear iteration that accesses the Jacobian matrix only through highly parallelizable sparse matrix-vector products, and "Schwarz" to a domain decomposition form of preconditioning the inner Krylov iterations with primarily neighbor-only exchange of data between the processors. Prior experience has established that Newton-Krylov methods are competitive solvers in the CFD context and that Krylov-Schwarz methods port well to distributed memory computers. The combination of the techniques into Newton-Krylov-Schwarz was implemented on 2D and 3D unstructured Euler codes on the parallel testbeds that used to be at LaRC and on several other parallel computers operated by other agencies or made available by the vendors. Early implementations were made directly in the Message Passing Interface (MPI) with parallel solvers we adapted from legacy NASA codes and enhanced for full NKS functionality. Later implementations were made in the framework of the PETSc library from Argonne National Laboratory, which now includes pseudo-transient continuation Newton-Krylov-Schwarz solver capability (as a result of demands we made upon PETSc during our early porting experiences). A secondary project pursued with funding from this contract was parallel implicit solvers in acoustics, specifically in the Helmholtz formulation. A 2D acoustic inverse problem has been solved in parallel within the PETSc framework.
Second derivative parallel block backward differentiation type ...
African Journals Online (AJOL)
Second derivative parallel block backward differentiation type formulas for Stiff ODEs. ... and the methods are inherently parallel and can be distributed over parallel processors. They are ...
A Parallel Approach to Fractal Image Compression
Lubomir Dedera
2004-01-01
The paper deals with a parallel approach to coding and decoding algorithms in fractal image compression and presents experimental results comparing sequential and parallel algorithms in terms of both the coding and decoding times achieved and the effectiveness of parallelization.
Overview of image reconstruction
International Nuclear Information System (INIS)
Marr, R.B.
1980-04-01
Image reconstruction (or computerized tomography, etc.) is any process whereby a function, f, on R^n is estimated from empirical data pertaining to its integrals, ∫f(x) dx, for some collection of hyperplanes of dimension k < n. The paper begins with background information on how image reconstruction problems have arisen in practice, and describes some of the application areas of past or current interest; these include radioastronomy, optics, radiology and nuclear medicine, electron microscopy, acoustical imaging, geophysical tomography, nondestructive testing, and NMR zeugmatography. Then the various reconstruction algorithms are discussed in five classes: summation, or simple back-projection; convolution, or filtered back-projection; Fourier and other functional transforms; orthogonal function series expansion; and iterative methods. Certain more technical mathematical aspects of image reconstruction are considered from the standpoint of uniqueness, consistency, and stability of solution. The paper concludes by presenting certain open problems. 73 references
The evolving breast reconstruction
DEFF Research Database (Denmark)
Thomsen, Jørn Bo; Gunnarsson, Gudjon Leifur
2014-01-01
The aim of this editorial is to give an update on the use of the propeller thoracodorsal artery perforator flap (TAP/TDAP-flap) within the field of breast reconstruction. The TAP-flap can be dissected by a combined use of a monopolar cautery and a scalpel. Microsurgical instruments are generally...... not needed. The propeller TAP-flap can be designed in different ways, three of these have been published: (I) an oblique upwards design; (II) a horizontal design; (III) an oblique downward design. The latissimus dorsi-flap is a good and reliable option for breast reconstruction, but has been criticized...... for oncoplastic and reconstructive breast surgery and will certainly become an invaluable addition to breast reconstructive methods....
Forging Provincial Reconstruction Teams
National Research Council Canada - National Science Library
Honore, Russel L; Boslego, David V
2007-01-01
The Provincial Reconstruction Team (PRT) training mission completed by First U.S. Army in April 2006 was a joint Service effort to meet a requirement from the combatant commander to support goals in Afghanistan...
Breast Reconstruction with Implants
... your surgical options and discuss the advantages and disadvantages of implant-based reconstruction, and may show you ...
Accelerated 3D-OSEM image reconstruction using a Beowulf PC cluster for pinhole SPECT
International Nuclear Information System (INIS)
Zeniya, Tsutomu; Watabe, Hiroshi; Sohlberg, Antti; Iida, Hidehiro
2007-01-01
A conventional pinhole single-photon emission computed tomography (SPECT) with a single circular orbit has limitations associated with non-uniform spatial resolution or axial blurring. Recently, we demonstrated that three-dimensional (3D) images with uniform spatial resolution and no blurring can be obtained from complete data acquired using two circular orbits, combined with the 3D ordered subsets expectation maximization (OSEM) reconstruction method. However, a long computation time is required to obtain the reconstructed image, because 3D-OSEM is an iterative method and two-orbit acquisition doubles the size of the projection data. To reduce the long reconstruction time, we parallelized the two-orbit pinhole 3D-OSEM reconstruction process by using a Beowulf personal computer (PC) cluster. The Beowulf PC cluster consists of seven PCs connected to Gbit Ethernet switches. The message passing interface protocol was utilized for parallelizing the reconstruction process. The projection data in a subset are distributed to each PC. The partial image forward- and back-projected in each PC is transferred to all PCs. The current image estimate on each PC is updated after summing the partial images. The performance of parallelization on the PC cluster was evaluated using two independent projection data sets acquired by a pinhole SPECT system with two different circular orbits. Parallelization using the PC cluster reduced the reconstruction time as the number of PCs increased. The reconstruction time of 54 min on the single PC was decreased to 10 min when six or seven PCs were used. The speed-up factor was 5.4. The image reconstructed by the PC cluster was virtually identical to that from the single PC. Parallelization of 3D-OSEM reconstruction for pinhole SPECT using the PC cluster can significantly reduce the computation time, whereas its implementation is simple and inexpensive. (author)
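The data flow described above (distribute the projections of a subset, back-project a partial image on each PC, sum the partial images, update the estimate) can be sketched in a few lines. This is a minimal in-process illustration, not the authors' code: the "workers" are simulated by a loop instead of MPI processes, the system matrix `A` is a tiny dense list of lists, and all function names are hypothetical.

```python
def partial_backprojection(A, y, x, rows):
    """Work of one PC: forward-project x along `rows`, then back-project
    the measured/estimated ratios into a partial image."""
    back = [0.0] * len(x)
    for i in rows:
        estimate = sum(a * xj for a, xj in zip(A[i], x))
        ratio = y[i] / estimate if estimate > 0 else 0.0
        for j, a in enumerate(A[i]):
            back[j] += a * ratio
    return back

def osem(A, y, n_subsets, n_workers, n_iter):
    n_pix = len(A[0])
    x = [1.0] * n_pix  # uniform initial estimate
    subsets = [list(range(s, len(A), n_subsets)) for s in range(n_subsets)]
    for _ in range(n_iter):
        for subset in subsets:
            # Distribute the projections of this subset among the workers.
            chunks = [subset[w::n_workers] for w in range(n_workers)]
            partials = [partial_backprojection(A, y, x, c) for c in chunks]
            # Sum of partial images (an all-reduce in a message-passing code).
            total = [sum(p[j] for p in partials) for j in range(n_pix)]
            # Sensitivity of each pixel for this subset.
            sens = [sum(A[i][j] for i in subset) for j in range(n_pix)]
            x = [xj * t / s if s > 0 else 0.0
                 for xj, t, s in zip(x, total, sens)]
    return x

A = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
y = [3.0, 4.0, 5.0, 6.0]  # noiseless projections of the image [2, 3, 1]
image_one_pc = osem(A, y, n_subsets=2, n_workers=1, n_iter=20)
image_three_pcs = osem(A, y, n_subsets=2, n_workers=3, n_iter=20)
```

Because the partial images are simply added, the result should not depend on the number of workers beyond floating-point rounding, which is consistent with the abstract's remark that the cluster image was virtually identical to the single-PC image.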
Parallel fabrication of macroporous scaffolds.
Dobos, Andrew; Grandhi, Taraka Sai Pavan; Godeshala, Sudhakar; Meldrum, Deirdre R; Rege, Kaushal
2018-07-01
Scaffolds generated from naturally occurring and synthetic polymers have been investigated in several applications because of their biocompatibility and tunable chemo-mechanical properties. Existing methods for generation of 3D polymeric scaffolds typically cannot be parallelized, suffer from low throughputs, and do not allow for quick and easy removal of the fragile structures that are formed. Current molds used in hydrogel and scaffold fabrication using solvent casting and porogen leaching are often single-use and do not facilitate 3D scaffold formation in parallel. Here, we describe a simple device and related approaches for the parallel fabrication of macroporous scaffolds. This approach was employed for the generation of macroporous and non-macroporous materials in parallel, in higher throughput and allowed for easy retrieval of these 3D scaffolds once formed. In addition, macroporous scaffolds with interconnected as well as non-interconnected pores were generated, and the versatility of this approach was employed for the generation of 3D scaffolds from diverse materials including an aminoglycoside-derived cationic hydrogel ("Amikagel"), poly(lactic-co-glycolic acid) or PLGA, and collagen. Macroporous scaffolds generated using the device were investigated for plasmid DNA binding and cell loading, indicating the use of this approach for developing materials for different applications in biotechnology. Our results demonstrate that the device-based approach is a simple technology for generating scaffolds in parallel, which can enhance the toolbox of current fabrication techniques. © 2018 Wiley Periodicals, Inc.
Parallel plasma fluid turbulence calculations
International Nuclear Information System (INIS)
Leboeuf, J.N.; Carreras, B.A.; Charlton, L.A.; Drake, J.B.; Lynch, V.E.; Newman, D.E.; Sidikman, K.L.; Spong, D.A.
1994-01-01
The study of plasma turbulence and transport is a complex problem of critical importance for fusion-relevant plasmas. To this day, the fluid treatment of plasma dynamics is the best approach to realistic physics at the high resolution required for certain experimentally relevant calculations. Core and edge turbulence in a magnetic fusion device have been modeled using state-of-the-art, nonlinear, three-dimensional, initial-value fluid and gyrofluid codes. Parallel implementation of these models on diverse platforms--vector parallel (National Energy Research Supercomputer Center's CRAY Y-MP C90), massively parallel (Intel Paragon XP/S 35), and serial parallel (clusters of high-performance workstations using the Parallel Virtual Machine protocol)--offers a variety of paths to high resolution and significant improvements in real-time efficiency, each with its own advantages. The largest and most efficient calculations have been performed at the 200 Mword memory limit on the C90 in dedicated mode, where an overlap of 12 to 13 out of a maximum of 16 processors has been achieved with a gyrofluid model of core fluctuations. The richness of the physics captured by these calculations is commensurate with the increased resolution and efficiency and is limited only by the ingenuity brought to the analysis of the massive amounts of data generated
Evaluating parallel optimization on transputers
Directory of Open Access Journals (Sweden)
A.G. Chalmers
2003-12-01
The faster processing power of modern computers and the development of efficient algorithms have made it possible for operations researchers to tackle a much wider range of problems than ever before. Further improvements in processing speed can be achieved by utilising relatively inexpensive transputers to process components of an algorithm in parallel. The Davidon-Fletcher-Powell method is one of the most successful and widely used optimisation algorithms for unconstrained problems. This paper examines the algorithm and identifies the components that can be processed in parallel. The results of some experiments with these components are presented, which indicate under what conditions parallel processing with an inexpensive configuration is likely to be faster than the traditional sequential implementations. The performance of the whole algorithm with its parallel components is then compared with the original sequential algorithm. The implementation serves to illustrate the practicalities of speeding up typical OR algorithms in terms of difficulty, effort and cost. The results give an indication of the savings in time a given parallel implementation can be expected to yield.
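For readers unfamiliar with the Davidon-Fletcher-Powell (DFP) method, the following serial sketch shows its core: the rank-two update of the inverse-Hessian approximation H. It is restricted, as an assumption for brevity, to a convex quadratic f(x) = 0.5 x^T Q x - b^T x so that the line search has a closed form; the function name and test problem are hypothetical. The abstract does not list which components the paper parallelises; the comments only mark loops whose iterations are independent of one another.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(M, v):
    # Each row product is independent of the others.
    return [dot(row, v) for row in M]

def dfp_quadratic(Q, b, x0, tol=1e-10, max_iter=50):
    """Minimise 0.5 x^T Q x - b^T x (Q symmetric positive definite) with DFP."""
    n = len(x0)
    x = list(x0)
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = [q - bi for q, bi in zip(matvec(Q, x), b)]  # gradient Q x - b
    for _ in range(max_iter):
        if math.sqrt(dot(g, g)) < tol:
            break
        d = [-v for v in matvec(H, g)]            # search direction
        alpha = -dot(g, d) / dot(d, matvec(Q, d))  # exact step for a quadratic
        s = [alpha * di for di in d]
        x = [xi + si for xi, si in zip(x, s)]
        g_new = [q - bi for q, bi in zip(matvec(Q, x), b)]
        y = [gn - go for gn, go in zip(g_new, g)]
        Hy = matvec(H, y)
        sy, yHy = dot(s, y), dot(y, Hy)
        # DFP update: H + s s^T / (s^T y) - (H y)(H y)^T / (y^T H y).
        # Every entry (i, j) can be computed independently.
        H = [[H[i][j] + s[i] * s[j] / sy - Hy[i] * Hy[j] / yHy
              for j in range(n)] for i in range(n)]
        g = g_new
    return x

x_min = dfp_quadratic([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0], [0.0, 0.0])
```

For this two-variable problem the exact minimiser is (1/11, 7/11), which gives a quick check of the update formula.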
Pattern-Driven Automatic Parallelization
Directory of Open Access Journals (Sweden)
Christoph W. Kessler
1996-01-01
This article describes a knowledge-based system for automatic parallelization of a wide class of sequential numerical codes operating on vectors and dense matrices, and for execution on distributed memory message-passing multiprocessors. Its main feature is a fast and powerful pattern recognition tool that locally identifies frequently occurring computations and programming concepts in the source code. This tool also works for dusty deck codes that have been "encrypted" by former machine-specific code transformations. Successful pattern recognition guides sophisticated code transformations including local algorithm replacement such that the parallelized code need not emerge from the sequential program structure by just parallelizing the loops. It allows access to an expert's knowledge on useful parallel algorithms, available machine-specific library routines, and powerful program transformations. The partially restored program semantics also supports local array alignment, distribution, and redistribution, and allows for faster and more exact prediction of the performance of the parallelized target code than is usually possible.
Noise simulation in cone beam CT imaging with parallel computing
International Nuclear Information System (INIS)
Tu, S.-J.; Shaw, Chris C; Chen, Lingyun
2006-01-01
We developed a computer noise simulation model for cone beam computed tomography imaging using a general purpose PC cluster. This model uses a mono-energetic x-ray approximation and allows us to investigate three primary performance components, specifically quantum noise, detector blurring and additive system noise. A parallel random number generator based on the Weyl sequence was implemented in the noise simulation and a visualization technique was accordingly developed to validate the quality of the parallel random number generator. In our computer simulation model, three-dimensional (3D) phantoms were mathematically modelled and used to create 450 analytical projections, which were then sampled into digital image data. Quantum noise was simulated and added to the analytical projection image data, which were then filtered to incorporate flat panel detector blurring. Additive system noise was generated and added to form the final projection images. The Feldkamp algorithm was implemented and used to reconstruct the 3D images of the phantoms. A 24 dual-Xeon PC cluster was used to compute the projections and reconstructed images in parallel with each CPU processing 10 projection views for a total of 450 views. Based on this computer simulation system, simulated cone beam CT images were generated for various phantoms and technique settings. Noise power spectra for the flat panel x-ray detector and reconstructed images were then computed to characterize the noise properties. As an example among the potential applications of our noise simulation model, we showed that images of low contrast objects can be produced and used for image quality evaluation
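The abstract does not give the details of its Weyl-sequence generator, so the following is only a generic sketch of why such a sequence parallelises easily. A Weyl sequence takes the fractional parts of n times a fixed constant; here it is written in 64-bit integer form, with an odd multiplier chosen as an assumption (the commonly used golden-ratio constant). Because element n is computed directly from n, each worker can generate its own interleaved ("leapfrog") share without communication. Raw Weyl sequences are equidistributed but strongly correlated, so a practical generator would add further scrambling.

```python
MASK64 = (1 << 64) - 1
INCREMENT = 0x9E3779B97F4A7C15  # odd 64-bit constant (assumed; about 2**64 / golden ratio)

def weyl(n):
    """n-th element of the integer Weyl sequence, mapped to [0, 1)."""
    state = (n * INCREMENT) & MASK64
    return (state >> 11) / float(1 << 53)  # keep the top 53 bits

def worker_stream(worker, n_workers, count):
    """Leapfrog partition: worker w produces elements w, w + W, w + 2W, ..."""
    return [weyl(worker + k * n_workers) for k in range(count)]

# Three workers, four numbers each, reproduce the first 12 serial values.
serial = [weyl(n) for n in range(12)]
streams = [worker_stream(w, 3, 4) for w in range(3)]
merged = [streams[n % 3][n // 3] for n in range(12)]
```

Reproducing the serial stream exactly from the per-worker streams is one simple way to validate a parallel generator before looking at statistical quality.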
Evaluation of a 3D point cloud tetrahedral tomographic reconstruction method
Energy Technology Data Exchange (ETDEWEB)
Pereira, N F; Sitek, A, E-mail: nfp4@bwh.harvard.ed, E-mail: asitek@bwh.harvard.ed [Department of Radiology, Brigham and Women's Hospital-Harvard Medical School Boston, MA (United States)]
2010-09-21
Tomographic reconstruction on an irregular grid may be superior to reconstruction on a regular grid. This is achieved through an appropriate choice of the image space model, the selection of an optimal set of points and the use of any available prior information during the reconstruction process. Accordingly, a number of reconstruction-related parameters must be optimized for best performance. In this work, a 3D point cloud tetrahedral mesh reconstruction method is evaluated for quantitative tasks. A linear image model is employed to obtain the reconstruction system matrix and five point generation strategies are studied. The evaluation is performed using the recovery coefficient, as well as voxel- and template-based estimates of bias and variance measures, computed over specific regions in the reconstructed image. A similar analysis is performed for regular grid reconstructions that use voxel basis functions. The maximum likelihood expectation maximization reconstruction algorithm is used. For the tetrahedral reconstructions, of the five point generation methods that are evaluated, three use image priors. For evaluation purposes, an object consisting of overlapping spheres with varying activity is simulated. The exact parallel projection data of this object are obtained analytically using a parallel projector, and multiple Poisson noise realizations of these exact data are generated and reconstructed using the different point generation strategies. The unconstrained nature of point placement in some of the irregular mesh-based reconstruction strategies has superior activity recovery for small, low-contrast image regions. The results show that, with an appropriately generated set of mesh points, the irregular grid reconstruction methods can out-perform reconstructions on a regular grid for mathematical phantoms, in terms of the performance measures evaluated.
Par@Graph - a parallel toolbox for the construction and analysis of large complex climate networks
Tantet, A.J.J.
2015-01-01
In this paper, we present Par@Graph, a software toolbox to reconstruct and analyze complex climate networks having a large number of nodes (up to at least 10^6) and edges (up to at least 10^12). The key innovation is an efficient set of parallel software tools designed to leverage the inherited hybrid
Matthew Parks; Richard Cronn; Aaron Liston
2009-01-01
We reconstruct the infrageneric phylogeny of Pinus from 37 nearly-complete chloroplast genomes (average 109 kilobases each of an approximately 120 kilobase genome) generated using multiplexed massively parallel sequencing. We found that 30/33 ingroup nodes resolved with > 95-percent bootstrap support; this is a substantial improvement relative...
Limited angle tomographic breast imaging: A comparison of parallel beam and pinhole collimation
International Nuclear Information System (INIS)
Wessell, D.E.; Kadrmas, D.J.; Frey, E.C.
1996-01-01
Results from clinical trials have suggested no improvement in lesion detection with parallel hole SPECT scintimammography (SM) with Tc-99m over parallel hole planar SM. In this initial investigation, we have elucidated some of the unique requirements of SPECT SM. With these requirements in mind, we have begun to develop practical data acquisition and reconstruction strategies that can reduce image artifacts and improve image quality. In this paper we investigate limited angle orbits for both parallel hole and pinhole SPECT SM. Singular Value Decomposition (SVD) is used to analyze the artifacts associated with the limited angle orbits. Maximum likelihood expectation maximization (MLEM) reconstructions are then used to examine the effects of attenuation compensation on the quality of the reconstructed image. All simulations are performed using the 3D-MCAT breast phantom. The results of these simulation studies demonstrate that limited angle SPECT SM is feasible, that attenuation correction is needed for accurate reconstructions, and that pinhole SPECT SM may have an advantage over parallel hole SPECT SM in terms of improved image quality and reduced image artifacts
Feed-forward volume rendering algorithm for moderately parallel MIMD machines
Yagel, Roni
1993-01-01
Algorithms for direct volume rendering on parallel and vector processors are investigated. Volumes are transformed efficiently on parallel processors by dividing the data into slices and beams of voxels. Equal-sized sets of slices along one axis are distributed to processors. Parallelism is achieved at two levels. Because each slice can be transformed independently of the others, processors transform their assigned slices with no communication, thus providing the maximum possible parallelism at the first level. Within each slice, consecutive beams are incrementally transformed using coherency in the transformation computation. Also, coherency across slices can be exploited to further enhance performance. This coherency yields the second level of parallelism through the use of vector processing or pipelining. Other ongoing efforts include investigations into image reconstruction techniques, load balancing strategies, and improving performance.
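The two levels described above can be sketched as follows, under the assumption that the transformation is affine (p' = M p + t). Slices are dealt out to processors in contiguous, nearly equal sets, and within a beam the coherency is that consecutive voxels differ by a constant increment, namely the first column of M, so only the first voxel needs a full matrix product. Function names and the example matrix are hypothetical.

```python
def transform_point(M, t, p):
    """Full affine transform of one voxel position p = (x, y, z)."""
    return tuple(sum(M[r][c] * p[c] for c in range(3)) + t[r] for r in range(3))

def transform_beam(M, t, y, z, nx):
    """Transform the nx voxels of the beam (0..nx-1, y, z) incrementally."""
    current = transform_point(M, t, (0, y, z))   # one full transform
    step = (M[0][0], M[1][0], M[2][0])           # change per unit step in x
    out = [current]
    for _ in range(nx - 1):
        current = tuple(c + s for c, s in zip(current, step))  # three additions
        out.append(current)
    return out

def assign_slices(n_slices, n_procs):
    """Contiguous, nearly equal sets of slice indices, one set per processor."""
    return [list(range(p * n_slices // n_procs, (p + 1) * n_slices // n_procs))
            for p in range(n_procs)]

M = [[0.8, -0.6, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]  # rotation about z
t = (1.0, 2.0, 3.0)
beam = transform_beam(M, t, y=2, z=1, nx=5)
slices_per_proc = assign_slices(10, 3)
```

The incremental loop replaces nine multiplications per voxel with three additions, and its regular structure is the kind of computation that vector units or pipelines handle well.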
Last Glacial Maximum Salinity Reconstruction
Homola, K.; Spivack, A. J.
2016-12-01
determined experimentally. We compare the high precision salinity profiles determined using our new method to profiles determined from the traditional chloride titrations of parallel samples. Our technique provides a more accurate reconstruction of past salinity, informing questions of water mass composition and distribution during the LGM.
Parallel artificial liquid membrane extraction
DEFF Research Database (Denmark)
Gjelstad, Astrid; Rasmussen, Knut Einar; Parmer, Marthe Petrine
2013-01-01
This paper reports development of a new approach towards analytical liquid-liquid-liquid membrane extraction termed parallel artificial liquid membrane extraction. A donor plate and acceptor plate create a sandwich, in which each sample (human plasma) and acceptor solution is separated by an artificial liquid membrane. Parallel artificial liquid membrane extraction is a modification of hollow-fiber liquid-phase microextraction, where the hollow fibers are replaced by flat membranes in a 96-well plate format.
2D-RBUC for efficient parallel compression of residuals
Đurđević, Đorđe M.; Tartalja, Igor I.
2018-02-01
In this paper, we present a method for lossless compression of residuals with an efficient SIMD parallel decompression. The residuals originate from lossy or near-lossless compression of height fields, which are commonly used to represent models of terrains. The algorithm is founded on the existing RBUC method for compression of non-uniform data sources. We have adapted the method to capture 2D spatial locality of height fields, and developed the data decompression algorithm for modern GPU architectures already present even in home computers. In combination with the point-level SIMD-parallel lossless/lossy height field compression method HFPaC, characterized by fast progressive decompression and seamlessly reconstructed surface, the newly proposed method trades off a small efficiency degradation for a non-negligible compression ratio benefit (measured up to 91%).
Parallel application of plasma equilibrium fitting based on inhomogeneous platforms
International Nuclear Information System (INIS)
Liao Min; Zhang Jinhua; Chen Liaoyuan; Li Yongge; Pan Wei; Pan Li
2008-01-01
This paper introduces EFIT, an online analysis and online display platform based on the equilibrium-fitting mode. The application transfers large volumes of data between inhomogeneous (heterogeneous) platforms through a communication mechanism built on sockets. A finite state machine describes the management node, and several node computers of a cluster system carry out the parallel computation; the equilibrium fitting reconstruction completes in approximately one minute, which is fast enough for online display during the discharge interval. An effective communication model between inhomogeneous platforms is provided, which transports the computing results from the Linux platform to the Windows platform for online analysis and display. (authors)
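The abstract does not describe its socket protocol, so the sketch below only illustrates one common way to move numerical results between machines that may differ in operating system or byte order: a length-prefixed message whose payload is packed in network byte order. The message layout and function names are assumptions, and `socket.socketpair()` stands in for a real network connection so that the example is self-contained.

```python
import socket
import struct

def send_doubles(sock, values):
    """Send a list of floats as: 4-byte length (network order) + packed doubles."""
    payload = struct.pack("!%dd" % len(values), *values)
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exact(sock, n):
    """Read exactly n bytes; recv may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        buf.extend(chunk)
    return bytes(buf)

def recv_doubles(sock):
    (size,) = struct.unpack("!I", recv_exact(sock, 4))
    payload = recv_exact(sock, size)
    return list(struct.unpack("!%dd" % (size // 8), payload))

# Round trip over a connected socket pair (stand-in for compute node -> display node).
sender, receiver = socket.socketpair()
send_doubles(sender, [1.5, -2.25, 3.0])
received = recv_doubles(receiver)
sender.close()
receiver.close()
```

Fixing the byte order and the field widths in the protocol, instead of sending raw in-memory structures, is what makes the exchange independent of the platforms at either end.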
Reconstruction algorithm medical imaging DRR; Algoritmo de construccion de imagenes medicas DRR
Energy Technology Data Exchange (ETDEWEB)
Estrada Espinosa, J. C.
2013-07-01
The reconstruction method for digitally reconstructed radiographs (DRR) is based on two orthogonal images of the simulation, in the dorsal and lateral decubitus positions. DRR images are reconstructed with an algorithm that simulates a conventional X-ray exposure in which the emitted beam is not divergent; the rays are considered to be parallel. For this purpose, it is necessary to use the Hounsfield unit (HU) values of every voxel in all the axial slices that form the CT study; the reconstructed DRR image is finally obtained by performing a transformation from 3D to 2D. (Author)
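With parallel rays aligned to a volume axis, the 3D-to-2D transformation described above reduces to summing voxel values along that axis. The sketch below is an illustration under stated assumptions, not the author's algorithm: HU values are converted to linear attenuation with the standard relation mu = mu_water * (1 + HU / 1000), the value of `mu_water` is a placeholder, and the volume is indexed as `volume[z][y][x]` with rays along y for the frontal view and along x for the lateral view.

```python
def hu_to_mu(hu, mu_water=0.02):
    """Linear attenuation (per mm) from Hounsfield units; mu_water is illustrative."""
    return max(0.0, mu_water * (1.0 + hu / 1000.0))

def drr_parallel(volume, axis, voxel_size_mm=1.0):
    """Parallel-ray DRR: line integrals of attenuation along one volume axis.

    axis="y" gives a frontal image indexed [z][x];
    axis="x" gives a lateral image indexed [z][y].
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    if axis == "y":
        return [[sum(hu_to_mu(volume[z][y][x]) for y in range(ny)) * voxel_size_mm
                 for x in range(nx)] for z in range(nz)]
    if axis == "x":
        return [[sum(hu_to_mu(volume[z][y][x]) for x in range(nx)) * voxel_size_mm
                 for y in range(ny)] for z in range(nz)]
    raise ValueError("axis must be 'x' or 'y'")

# A 2 x 3 x 4 block of water (HU = 0): every ray crosses 3 or 4 voxels.
water = [[[0.0] * 4 for _ in range(3)] for _ in range(2)]
frontal = drr_parallel(water, "y")
lateral = drr_parallel(water, "x")
```

Each output pixel depends only on its own ray, so the two orthogonal images, and the pixels within each, can be computed independently of one another.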
Parallel algorithms for mapping pipelined and parallel computations
Nicol, David M.
1988-01-01
Many computational problems in image processing, signal processing, and scientific computing are naturally structured for either pipelined or parallel computation. When mapping such problems onto a parallel architecture it is often necessary to aggregate an obvious problem decomposition. Even in this context the general mapping problem is known to be computationally intractable, but recent advances have been made in identifying classes of problems and architectures for which optimal solutions can be found in polynomial time. Among these, the mapping of pipelined or parallel computations onto linear array, shared memory, and host-satellite systems figures prominently. This paper extends that work first by showing how to improve existing serial mapping algorithms. These improvements have significantly lower time and space complexities: in one case a published O(nm^3) time algorithm for mapping m modules onto n processors is reduced to an O(nm log m) time complexity, and its space requirements reduced from O(nm^2) to O(m). Run time complexity is further reduced with parallel mapping algorithms based on these improvements, which run on the architecture for which they create the mappings.
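One simple instance of the mapping problem discussed above is assigning a chain of m pipeline modules to n processors of a linear array so that each processor receives a contiguous block and the most heavily loaded processor (the bottleneck) is as light as possible. The sketch below uses a standard technique, binary search on the bottleneck with a greedy feasibility probe, and assumes positive integer module weights. It is not the algorithm of the paper and does not reproduce its O(nm log m) bound; the names are hypothetical.

```python
def fits(weights, n_procs, cap):
    """Greedy probe: can the chain be cut into at most n_procs contiguous
    blocks, each with total weight <= cap?"""
    used, load = 1, 0
    for w in weights:
        if w > cap:
            return False
        if load + w > cap:
            used += 1      # start a new block on the next processor
            load = w
        else:
            load += w
    return used <= n_procs

def min_bottleneck(weights, n_procs):
    """Smallest achievable maximum block weight (positive integer weights)."""
    lo, hi = max(weights), sum(weights)
    while lo < hi:
        mid = (lo + hi) // 2
        if fits(weights, n_procs, mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

# Six modules on three processors: the blocks [4, 2], [7, 1], [3, 5] give bottleneck 8.
best = min_bottleneck([4, 2, 7, 1, 3, 5], 3)
```

The probe is correct because filling each processor as far as the cap allows never uses more blocks than any other feasible cut, so feasibility is monotone in the cap and binary search applies.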
Cellular automata a parallel model
Mazoyer, J
1999-01-01
Cellular automata can be viewed both as computational models and as modelling systems for real processes. This volume emphasises the first aspect. In articles written by leading researchers, sophisticated massively parallel algorithms (firing squad, life, Fischer's primes recognition) are treated. Their computational power and the specific complexity classes they determine are surveyed, while some recent results in relation to chaos from a new dynamic systems point of view are also presented. Audience: This book will be of interest to specialists in theoretical computer science and the parallelism challenge.
Qualitative and quantitative improvements of PET reconstruction on GPU architecture
International Nuclear Information System (INIS)
Autret, Awen
2016-01-01
In positron emission tomography, reconstructed images suffer from a high noise level and a low resolution. Iterative reconstruction processes require an estimation of the system response (scanner and patient), and the quality of the images depends on the accuracy of this estimate. Accurate and fast-to-compute models already exist for attenuation, scattering, random coincidences and dead times. Thus, this thesis focuses on modeling the system components associated with the detector response and the positron range. A new multi-GPU parallelization of the reconstruction, based on partitioning the volume, is also proposed to speed up the reconstruction by exploiting the computing power of such architectures. The proposed detector response model is based on a multi-ray approach that includes all the detector effects, such as the geometry and the scattering in the crystals. An evaluation study based on data obtained through Monte Carlo simulation (MCS) showed that this model provides reconstructed images with a better contrast-to-noise ratio and resolution compared with those of state-of-the-art methods. The proposed positron range model is based on a simplified MCS, integrated into the forward projector during the reconstruction. A GPU implementation of this method allows running MCS three orders of magnitude faster than the same simulation on GATE, while providing similar results. An evaluation study shows that this model, integrated in the reconstruction, gives images with better contrast recovery and resolution while avoiding artifacts. (author)
User-friendly parallelization of GAUDI applications with Python
International Nuclear Information System (INIS)
Mato, Pere; Smith, Eoin
2010-01-01
GAUDI is a software framework in C++ used to build event data processing applications using a set of standard components with well-defined interfaces. Simulation, high-level trigger, reconstruction, and analysis programs used by several experiments are developed using GAUDI. These applications can be configured and driven by simple Python scripts. Given that a considerable amount of existing software has been developed using serial methodology, and has existed in some cases for many years, implementation of parallelisation techniques at the framework level may offer a way of exploiting current multi-core technologies to maximize performance and reduce latencies without re-writing thousands or millions of lines of code. In the solution we have developed, the parallelization techniques are introduced into the high-level Python scripts which configure and drive the applications, such that the core C++ application code requires no modification, and end users need make only minimal changes to their scripts. The developed solution leverages existing generic Python modules that support parallel processing. Naturally, the parallel version of a given program should produce results consistent with its serial execution. The evaluation of several prototypes incorporating various parallelization techniques is presented and discussed.
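The idea of parallelising at the level of the driving Python script, leaving the per-event code untouched, can be sketched with the standard library alone. Nothing here is GAUDI API: `process_event` is a hypothetical stand-in for the unchanged per-event work, and the events are plain lists of numbers. A thread pool is used so that the example runs anywhere; `concurrent.futures.ProcessPoolExecutor` offers the same interface and is the natural choice for CPU-bound work spread over several cores.

```python
from concurrent.futures import ThreadPoolExecutor

def process_event(event):
    """Hypothetical per-event computation (stand-in for the unchanged core code)."""
    return sum(hit * hit for hit in event)

def run_serial(events):
    return [process_event(e) for e in events]

def run_parallel(events, n_workers=4, executor_cls=ThreadPoolExecutor):
    """Same event loop, with events handed to a pool of workers.

    Executor.map returns results in input order, so the output can be
    compared element by element with the serial run.
    """
    with executor_cls(max_workers=n_workers) as pool:
        return list(pool.map(process_event, events))

events = [[float(i + j) for j in range(5)] for i in range(20)]
serial_results = run_serial(events)
parallel_results = run_parallel(events)
```

Comparing `parallel_results` with `serial_results` mirrors the consistency requirement stated in the abstract: the parallel run should reproduce what the serial run produces.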
User-friendly parallelization of GAUDI applications with Python
Energy Technology Data Exchange (ETDEWEB)
Mato, Pere; Smith, Eoin, E-mail: pere.mato@cern.c [PH Department, CERN, 1211 Geneva 23 (Switzerland)
2010-04-01
GAUDI is a software framework in C++ used to build event data processing applications using a set of standard components with well-defined interfaces. Simulation, high-level trigger, reconstruction, and analysis programs used by several experiments are developed using GAUDI. These applications can be configured and driven by simple Python scripts. Given the fact that a considerable amount of existing software has been developed using serial methodology, and has existed in some cases for many years, implementation of parallelisation techniques at the framework level may offer a way of exploiting current multi-core technologies to maximize performance and reduce latencies without re-writing thousands/millions of lines of code. In the solution we have developed, the parallelization techniques are introduced to the high-level Python scripts which configure and drive the applications, such that the core C++ application code requires no modification, and that end users need make only minimal changes to their scripts. The developed solution leverages existing generic Python modules that support parallel processing. Naturally, the parallel version of a given program should produce results consistent with its serial execution. The evaluation of several prototypes incorporating various parallelization techniques is presented and discussed.
Lyu, Jingyuan; Nakarmi, Ukash; Zhang, Chaoyi; Ying, Leslie
2016-05-01
This paper presents a new approach to highly accelerated dynamic parallel MRI using low-rank matrix completion and the partial separability (PS) model. In data acquisition, k-space data are moderately and randomly undersampled at the central k-space navigator locations, but highly undersampled in the outer k-space for each temporal frame. In reconstruction, the navigator data are reconstructed from the undersampled data using structured low-rank matrix completion. After all the unacquired navigator data have been estimated, the partially separable model is used to obtain partial k-t data. The parallel imaging method is then used to obtain the entire dynamic image series from the highly undersampled data. The proposed method has been shown to achieve high-quality reconstructions with reduction factors up to 31 and a temporal resolution of 29 ms, where the conventional PS method fails.
Industrial dynamic tomographic reconstruction
International Nuclear Information System (INIS)
Oliveira, Eric Ferreira de
2016-01-01
The state-of-the-art methods applied to industrial processes are currently based on the principles of classical tomographic reconstruction developed for tomographic patterns of static distributions, or are limited to cases of low variability of the density distribution function of the tomographed object. Noise and motion artifacts are the main problems caused by a mismatch in the data from views acquired at different instants. All of these add to the known fact that using a limited amount of data can result in noise, artifacts and some inconsistencies with the distribution under study. One objective of the present work is to discuss the difficulties that arise when reconstruction algorithms originally developed for static distributions are implemented in dynamic tomography. Another objective is to propose solutions that aim to reduce the temporal information loss caused by applying regular acquisition systems to dynamic processes. With respect to dynamic image reconstruction, a comparison was conducted between different static reconstruction methods, such as MART and FBP, when used in dynamic scenarios. This comparison was based on an MCNPx simulation as well as an analytical setup of an aluminum cylinder that moves along the section of a riser during acquisition, and also on cross-section images from CFD techniques. As for the adaptation of current tomographic acquisition systems to dynamic processes, this work established a sequence of tomographic views in a just-in-time fashion for visualization purposes, a way of visually displaying density information as soon as it becomes amenable to image reconstruction. A third contribution was to take advantage of the three color channels needed to display color images on most displays, so that, by appropriately scaling the acquired values of each view in the linear system of the reconstruction, it was possible to imprint a temporal trace into the regularly
Alternative reconstruction after pancreaticoduodenectomy
Directory of Open Access Journals (Sweden)
Cooperman Avram M
2008-01-01
Background: Pancreaticoduodenectomy is the procedure of choice for tumors of the head of the pancreas and periampulla. Despite advances in surgical technique and postoperative care, the procedure continues to carry a high morbidity rate. One of the most common morbidities is delayed gastric emptying, with rates of 15%–40%. Following two prolonged cases of delayed gastric emptying, we altered our reconstruction to avoid this complication altogether. Subsequently, our patients underwent a classic pancreaticoduodenectomy with an undivided Roux-en-Y technique for reconstruction. Methods: We reviewed the charts of our last 13 Whipple procedures, evaluating them for complications, specifically delayed gastric emptying. We compared the outcomes of those patients to a control group of 15 patients who underwent the Whipple procedure with standard reconstruction. Results: No instances of delayed gastric emptying occurred in patients who underwent an undivided Roux-en-Y technique for reconstruction. There was 1 wound infection (8%), 1 instance of pneumonia (8%), and 1 instance of bleeding from the gastrojejunal staple line (8%). There was no operative mortality. Conclusion: Use of the undivided Roux-en-Y technique for reconstruction following the Whipple procedure may decrease the incidence of delayed gastric emptying. In addition, it has the added benefit of eliminating bile reflux gastritis. Future randomized controlled trials are recommended to further evaluate the efficacy of the procedure.
Moeller polarimeter in Hall A of Jefferson Lab after reconstruction
International Nuclear Information System (INIS)
Pomatsalyuk, R.I.
2016-01-01
The Moller polarimeter in Hall A of Jefferson Lab was reconstructed in order to expand the energy range of the polarimeter, allowing it to measure the polarization of the electron beam at energies up to 11.5 GeV. The paper describes the main results of testing the Moller polarimeter after reconstruction. The electron polarization measurements were carried out by two data acquisition systems operating in parallel. The shielding insertion of the magnetic dipole was tested. A way to eliminate the deviations detected in the operation of the polarimeter during the tests is shown.
3D Tomographic Image Reconstruction using CUDA C
International Nuclear Information System (INIS)
Dominguez, J. S.; Assis, J. T.; Oliveira, L. F. de
2011-01-01
This paper presents the study and implementation of software for three-dimensional reconstruction of images obtained with a tomographic system, using the capabilities of graphics processing units (GPUs). Reconstruction by the filtered back-projection method was developed in CUDA C to make maximum use of the processing capabilities of GPUs for problems that are computationally expensive and highly parallelizable. The potential of GPUs is discussed and their advantages for solving this kind of problem are shown. The runtime results are to be compared with non-parallelized implementations and are expected to show a large reduction in processing time. (Author)
Parallel Sparse Matrix - Vector Product
DEFF Research Database (Denmark)
Alexandersen, Joe; Lazarov, Boyan Stefanov; Dammann, Bernd
This technical report contains a case study of a sparse matrix-vector product routine, implemented for parallel execution on a compute cluster with both pure MPI and hybrid MPI-OpenMP solutions. C++ classes for sparse data types were developed and the report shows how these classes can be used...
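As a rough illustration of the kernel under study, the sketch below computes a sparse matrix-vector product in compressed sparse row (CSR) form and splits the rows into contiguous blocks, which is one common way to distribute the work across processes or threads. It is a serial Python sketch, not the report's C++ classes: the function names, the CSR layout and the block partitioning are assumptions made for the example, and the loop over blocks merely marks where each MPI rank or OpenMP thread would take its share.

```python
def csr_matvec_rows(values, col_idx, row_ptr, x, row_start, row_end):
    """Multiply rows [row_start, row_end) of a CSR matrix by the vector x."""
    y = []
    for row in range(row_start, row_end):
        acc = 0.0
        # Nonzeros of this row are stored in values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def csr_matvec_blocked(values, col_idx, row_ptr, x, n_blocks):
    """Split the rows into n_blocks contiguous blocks and concatenate the results.

    Each block is independent of the others, so in a parallel setting
    every block could be computed by a different worker.
    """
    n_rows = len(row_ptr) - 1
    bounds = [n_rows * b // n_blocks for b in range(n_blocks + 1)]
    y = []
    for b in range(n_blocks):
        y.extend(csr_matvec_rows(values, col_idx, row_ptr, x, bounds[b], bounds[b + 1]))
    return y


# CSR storage of the 3x3 matrix [[2, 0, 1], [0, 3, 0], [4, 0, 5]].
values = [2.0, 1.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = csr_matvec_blocked(values, col_idx, row_ptr, x, n_blocks=2)
print(y)  # [5.0, 6.0, 19.0]
```

Row-block partitioning keeps each worker's output disjoint, so no synchronization is needed on `y`; only the input vector `x` has to be available to every worker.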
[Falsified medicines in parallel trade].
Muckenfuß, Heide
2017-11-01
The number of falsified medicines on the German market has distinctly increased over the past few years. In particular, stolen pharmaceutical products, a form of falsified medicines, have increasingly been introduced into the legal supply chain via parallel trading. The reasons why parallel trading serves as a gateway for falsified medicines are most likely the complex supply chains and routes of transport. It is hardly possible for national authorities to trace the history of a medicinal product that was bought and sold by several intermediaries in different EU member states. In addition, the heterogeneous outward appearance of imported and relabelled pharmaceutical products facilitates the introduction of illegal products onto the market. Official batch release at the Paul-Ehrlich-Institut offers the possibility of checking some aspects that might provide an indication of a falsified medicine. In some circumstances, this may allow the identification of falsified medicines before they come onto the German market. However, this control is only possible for biomedicinal products that have not received a waiver regarding official batch release. For improved control of parallel trade, better networking among the EU member states would be beneficial. European-wide regulations, e. g., for disclosure of the complete supply chain, would help to minimise the risks of parallel trading and hinder the marketing of falsified medicines.
The parallel adult education system
DEFF Research Database (Denmark)
Wahlgren, Bjarne
2015-01-01
for competence development. The Danish university educational system includes two parallel programs: a traditional academic track (candidatus) and an alternative practice-based track (master). The practice-based program was established in 2001 and organized as part time. The total program takes half the time...
Where are the parallel algorithms?
Voigt, R. G.
1985-01-01
Four paradigms that can be useful in developing parallel algorithms are discussed. These include computational complexity analysis, changing the order of computation, asynchronous computation, and divide and conquer. Each is illustrated with an example from scientific computation, and it is shown that computational complexity must be used with great care or an inefficient algorithm may be selected.
Parallel plate transmission line transformer
Voeten, S.J.; Brussaard, G.J.H.; Pemen, A.J.M.
2011-01-01
A Transmission Line Transformer (TLT) can be used to transform high-voltage nanosecond pulses. These transformers rely on the fact that the length of the pulse is shorter than the transmission lines used. This allows connecting the transmission lines in parallel at the input and in series at the
Matpar: Parallel Extensions for MATLAB
Springer, P. L.
1998-01-01
Matpar is a set of client/server software that allows a MATLAB user to take advantage of a parallel computer for very large problems. The user can replace calls to certain built-in MATLAB functions with calls to Matpar functions.
Massively parallel quantum computer simulator
De Raedt, K.; Michielsen, K.; De Raedt, H.; Trieu, B.; Arnold, G.; Richter, M.; Lippert, Th.; Watanabe, H.; Ito, N.
2007-01-01
We describe portable software to simulate universal quantum computers on massively parallel computers. We illustrate the use of the simulation software by running various quantum algorithms on different computer architectures, such as an IBM BlueGene/L, an IBM Regatta p690+, a Hitachi SR11000/J1, a Cray
Near real-time digital holographic microscope based on GPU parallel computing
Zhu, Gang; Zhao, Zhixiong; Wang, Huarui; Yang, Yan
2018-01-01
A transmission near real-time digital holographic microscope with in-line and off-axis light paths is presented, in which parallel computing technology based on the compute unified device architecture (CUDA) is combined with digital holographic microscopy. Other holographic microscopes have to perform reconstruction in multiple focal planes, which is time-consuming; with CUDA-based parallel computing, the reconstruction speed of the near real-time digital holographic microscope is greatly improved, so it is especially suitable for measurements of particle fields at the micrometer and nanometer scale. Simulations and experiments show that the proposed transmission digital holographic microscope can accurately measure and display the velocity of a particle field at the micrometer scale, and the average velocity error is lower than 10%. With graphics processing units (GPUs), the computing time for 100 reconstruction planes (512×512 grids) is lower than 120 ms, while it is 4.9 s using the traditional reconstruction method on a CPU. The reconstruction speed has been raised by a factor of 40. In other words, the system can handle holograms at 8.3 frames per second, and near real-time measurement and display of the particle velocity field are realized. Real-time three-dimensional reconstruction of the particle velocity field is expected to be achieved through further optimization of software and hardware. Keywords: digital holographic microscope,
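The quoted speedup and frame rate follow directly from the two timings in the abstract. The short check below assumes that one hologram corresponds to one full set of 100 reconstruction planes and takes the stated upper bound of 120 ms as the GPU time; the variable names are chosen for this sketch only.

```python
# Timings quoted in the abstract, converted to seconds.
gpu_time_s = 0.120  # 100 reconstruction planes on the GPU (upper bound)
cpu_time_s = 4.9    # same workload with the traditional CPU method

# Speedup of the GPU implementation over the CPU implementation.
speedup = cpu_time_s / gpu_time_s

# Holograms per second, assuming one hologram needs one full set of planes.
frame_rate = 1.0 / gpu_time_s

print(f"speedup    = {speedup:.1f}x")        # about 40.8x, quoted as 40 times
print(f"frame rate = {frame_rate:.1f} fps")  # about 8.3 frames per second
```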
Parallel computing: numerics, applications, and trends
National Research Council Canada - National Science Library
Trobec, Roman; Vajteršic, Marián; Zinterhof, Peter
2009-01-01
... and/or distributed systems. The contributions to this book are focused on topics most concerned in the trends of today's parallel computing. These range from parallel algorithmics, programming, tools, network computing to future parallel computing. Particular attention is paid to parallel numerics: linear algebra, differential equations, numerica...
Experiments with parallel algorithms for combinatorial problems
G.A.P. Kindervater (Gerard); H.W.J.M. Trienekens
1985-01-01
In the last decade many models for parallel computation have been proposed and many parallel algorithms have been developed. However, few of these models have been realized and most of these algorithms are supposed to run on idealized, unrealistic parallel machines. The parallel machines
International Nuclear Information System (INIS)
Heggarty, J.W.
1999-06-01
For almost thirty years, sequential R-matrix computation has been used by atomic physics research groups from around the world to model collision phenomena involving the scattering of electrons or positrons with atomic or molecular targets. As considerable progress has been made in the understanding of fundamental scattering processes, new data, obtained from more complex calculations, are of current interest to experimentalists. Performing such calculations, however, places considerable demands on the computational resources to be provided by the target machine, in terms of both processor speed and memory requirement. Indeed, in some instances the computational requirements are so great that the proposed R-matrix calculations are intractable, even when utilising contemporary classic supercomputers. Historically, increases in the computational requirements of R-matrix computation were accommodated by porting the problem codes to a more powerful classic supercomputer. Although this approach has been successful in the past, it is no longer considered to be a satisfactory solution due to the limitations of current (and future) Von Neumann machines. As a consequence, there has been considerable interest in the high performance multicomputers that have emerged over the last decade and that appear to offer the computational resources required by contemporary R-matrix research. Unfortunately, developing codes for these machines is not as simple a task as it was to develop codes for successive classic supercomputers. The difficulty arises from the considerable differences in the computing models of the two types of machine, and results in the programming of multicomputers being widely acknowledged as a difficult, time-consuming and error-prone task. Nevertheless, unless parallel R-matrix computation is realised, important theoretical and experimental atomic physics research will continue to be hindered. This thesis describes work that was undertaken in
International Nuclear Information System (INIS)
Yeong, C.L.; Torquato, S.
1998-01-01
We formulate a procedure to reconstruct the structure of general random heterogeneous media from limited morphological information by extending the methodology of Rintoul and Torquato [J. Colloid Interface Sci. 186, 467 (1997)] developed for dispersions. The procedure has the advantages that it is simple to implement and generally applicable to multidimensional, multiphase, and anisotropic structures. Furthermore, an extremely useful feature is that it can incorporate any type and number of correlation functions in order to provide as much morphological information as is necessary for accurate reconstruction. We consider a variety of one- and two-dimensional reconstructions, including periodic and random arrays of rods, various distributions of disks, Debye random media, and a Fontainebleau sandstone sample. We also use our algorithm to construct heterogeneous media from specified hypothetical correlation functions, including an exponentially damped, oscillating function as well as physically unrealizable ones. © 1998 The American Physical Society
Delayed breast implant reconstruction
DEFF Research Database (Denmark)
Hvilsom, Gitte B.; Hölmich, Lisbet R.; Steding-Jessen, Marianne
2012-01-01
We evaluated the association between radiation therapy and severe capsular contracture or reoperation after 717 delayed breast implant reconstruction procedures (288 1- and 429 2-stage procedures) identified in the prospective database of the Danish Registry for Plastic Surgery of the Breast during...... of radiation therapy was associated with a non-significantly increased risk of reoperation after both 1-stage (HR = 1.4; 95% CI: 0.7-2.5) and 2-stage (HR = 1.6; 95% CI: 0.9-3.1) procedures. Reconstruction failure was highest (13.2%) in the 2-stage procedures with a history of radiation therapy. Breast...... reconstruction approaches other than implants should be seriously considered among women who have received radiation therapy....
The numerical parallel computing of photon transport
International Nuclear Information System (INIS)
Huang Qingnan; Liang Xiaoguang; Zhang Lifa
1998-12-01
The parallel computing of photon transport is investigated, and the parallel algorithm and the parallelization of programs on parallel computers with both shared memory and distributed memory are discussed. By analyzing the inherent laws of the mathematical and physical model of photon transport in light of the structural features of parallel computers, applying a divide-and-conquer strategy, adjusting the algorithm structure of the program, removing data dependencies, finding parallelizable components and creating coarse-grained parallel subtasks, the sequential computation of photon transport is efficiently transformed into parallel and vector computation. The program was run on various HP parallel computers such as the HY-1 (PVP), the Challenge (SMP) and the YH-3 (MPP), and very good parallel speedup was obtained.
HEEL BONE RECONSTRUCTIVE OSTEOSYNTHESIS
Directory of Open Access Journals (Sweden)
A. N. Svetashov
2010-01-01
To identify the variants of reconstructive osteosynthesis most appropriate to the severity of heel bone injury, the treatment results of 56 patients were analyzed. In 15 (26.8%) patients classic surgical methods were applied; in 41 (73.2%) cases porous implants were used to restore the defect. Osteosynthesis without plastic restoration of the heel bone was ineffective in 60% of patients in the control group. The reconstructive osteosynthesis method ensured a good long-term functional outcome of rehabilitation in 96.4% of patients in the main group.
International Nuclear Information System (INIS)
Chabanat, E.; D'Hondt, J.; Estre, N.; Fruehwirth, R.; Prokofiev, K.; Speer, T.; Vanlaer, P.; Waltenberger, W.
2005-01-01
Due to the high track multiplicity in the final states expected in proton collisions at the LHC experiments, novel vertex reconstruction algorithms are required. The vertex reconstruction problem can be decomposed into a pattern recognition problem ('vertex finding') and an estimation problem ('vertex fitting'). Starting from least-squares methods, robustifications of the classical algorithms are discussed and the statistical properties of the novel methods are shown. A whole set of different approaches for the vertex finding problem is presented and compared in relevant physics channels
Chabanat, E; D'Hondt, J; Vanlaer, P; Prokofiev, K; Speer, T; Frühwirth, R; Waltenberger, W
2005-01-01
Because of the high track multiplicity in the final states expected in proton collisions at the LHC experiments, novel vertex reconstruction algorithms are required. The vertex reconstruction problem can be decomposed into a pattern recognition problem ("vertex finding") and an estimation problem ("vertex fitting"). Starting from least-square methods, ways to render the classical algorithms more robust are discussed and the statistical properties of the novel methods are shown. A whole set of different approaches for the vertex finding problem is presented and compared in relevant physics channels.
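To make the step from a least-squares fit to a more robust one concrete, the sketch below estimates a one-dimensional vertex position from track measurements, first as an inverse-variance weighted mean and then with iterative down-weighting of incompatible tracks. The abstract does not say which robustification is used, so the soft weight function, the `chi2_cut` and `temperature` parameters, the function names and the toy data are all assumptions made for illustration.

```python
import math


def least_squares_vertex(positions, sigmas):
    """Inverse-variance weighted mean: the least-squares estimate in 1D."""
    weights = [1.0 / (s * s) for s in sigmas]
    return sum(w * z for w, z in zip(weights, positions)) / sum(weights)


def robust_vertex(positions, sigmas, chi2_cut=9.0, temperature=1.0, n_iter=20):
    """Iteratively reweighted fit that down-weights outlying tracks.

    Each track receives a soft weight that is close to 1 when its chi2
    with respect to the current vertex is well below chi2_cut and close
    to 0 when it is well above.
    """
    vertex = least_squares_vertex(positions, sigmas)
    for _ in range(n_iter):
        num = 0.0
        den = 0.0
        for z, s in zip(positions, sigmas):
            chi2 = ((z - vertex) / s) ** 2
            # Clamp the exponent so math.exp cannot overflow for far outliers.
            arg = min((chi2 - chi2_cut) / (2.0 * temperature), 700.0)
            soft = 1.0 / (1.0 + math.exp(arg))
            weight = soft / (s * s)
            num += weight * z
            den += weight
        vertex = num / den
    return vertex


# Four compatible tracks near z = 0 and one outlier at z = 5 (made-up data).
positions = [0.01, -0.02, 0.00, 0.03, 5.0]
sigmas = [0.1, 0.1, 0.1, 0.1, 0.1]

ls_estimate = least_squares_vertex(positions, sigmas)
robust_estimate = robust_vertex(positions, sigmas)
print(ls_estimate, robust_estimate)
```

In this toy case the plain least-squares estimate is pulled toward the outlier, while the reweighted estimate stays close to the cluster of compatible tracks.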
Automatic Parallelization Tool: Classification of Program Code for Parallel Computing
Directory of Open Access Journals (Sweden)
Mustafa Basthikodi
2016-04-01
Performance growth of single-core processors came to a halt in the past decade, but was re-enabled by the introduction of parallelism in processors. Multicore frameworks, along with graphics processing units, have broadly enhanced parallelism. A couple of compilers have been updated to address the emerging challenges of synchronization and threading. Appropriate program and algorithm classifications will greatly help software engineers find opportunities for effective parallelization. In the present work we investigated current species for the classification of algorithms; related work on classification is discussed, along with a comparison of the issues that challenge classification. A set of algorithms was chosen whose structure matches different issues and which perform a given task. We tested these algorithms using existing automatic species extraction tools along with the Bones compiler. We added functionality to the existing tool, providing a more detailed characterization. The contributions of our work include support for pointer arithmetic, conditional and incremental statements, user-defined types, constants and mathematical functions. With this, we can retain significant data that is not captured by the original species of algorithms. We implemented the new theories in the tool, enabling automatic characterization of program code.
3D dictionary learning based iterative cone beam CT reconstruction
Directory of Open Access Journals (Sweden)
Ti Bai
2014-03-01
Purpose: This work develops a 3D dictionary learning based cone beam CT (CBCT) reconstruction algorithm on graphics processing units (GPUs) to improve the quality of sparse-view CBCT reconstruction with high efficiency. Methods: A 3D dictionary containing 256 small volumes (atoms) of 3 × 3 × 3 was trained from a large number of blocks extracted from a high-quality volume image. On this basis, we used a Cholesky decomposition based orthogonal matching pursuit algorithm to find the sparse representation of each block. To accelerate the time-consuming sparse coding in the 3D case, we implemented the sparse coding in a parallel fashion by taking advantage of the tremendous computational power of the GPU. A conjugate gradient least squares algorithm was adopted to minimize the data fidelity term. Evaluations were performed on a head-neck patient case. An FDK reconstruction with the full dataset of 364 projections was used as the reference. We compared the proposed 3D dictionary learning based method with the tight frame (TF) method by performing reconstructions on a subset of 121 projections. Results: Compared to TF based CBCT reconstruction, which shows good overall performance, our experiments indicated that 3D dictionary learning based CBCT reconstruction is able to recover finer structures, remove more streaking artifacts and induce fewer blocky artifacts. Conclusion: The 3D dictionary learning based CBCT reconstruction algorithm is able to capture structural information while suppressing noise, and hence achieves high-quality reconstruction in the sparse-view case. The GPU realization of the whole algorithm offers a significant efficiency enhancement, making this algorithm more feasible for potential clinical application. Cite this article as: Bai T, Yan H, Shi F, Jia X, Lou Y, Xu Q, Jiang S, Mou X. 3D dictionary learning based iterative cone beam CT reconstruction. Int J Cancer Ther Oncol 2014; 2(2):020240. DOI: 10
CHOLLA: A NEW MASSIVELY PARALLEL HYDRODYNAMICS CODE FOR ASTROPHYSICAL SIMULATION
Energy Technology Data Exchange (ETDEWEB)
Schneider, Evan E.; Robertson, Brant E. [Steward Observatory, University of Arizona, 933 North Cherry Avenue, Tucson, AZ 85721 (United States)
2015-04-15
We present Computational Hydrodynamics On ParaLLel Architectures (Cholla), a new three-dimensional hydrodynamics code that harnesses the power of graphics processing units (GPUs) to accelerate astrophysical simulations. Cholla models the Euler equations on a static mesh using state-of-the-art techniques, including the unsplit Corner Transport Upwind algorithm, a variety of exact and approximate Riemann solvers, and multiple spatial reconstruction techniques including the piecewise parabolic method (PPM). Using GPUs, Cholla evolves the fluid properties of thousands of cells simultaneously and can update over 10 million cells per GPU-second while using an exact Riemann solver and PPM reconstruction. Owing to the massively parallel architecture of GPUs and the design of the Cholla code, astrophysical simulations with physically interesting grid resolutions (≳256^3) can easily be computed on a single device. We use the Message Passing Interface library to extend calculations onto multiple devices and demonstrate nearly ideal scaling beyond 64 GPUs. A suite of test problems highlights the physical accuracy of our modeling and provides a useful comparison to other codes. We then use Cholla to simulate the interaction of a shock wave with a gas cloud in the interstellar medium, showing that the evolution of the cloud is highly dependent on its density structure. We reconcile the computed mixing time of a turbulent cloud with a realistic density distribution destroyed by a strong shock with the existing analytic theory for spherical cloud destruction by describing the system in terms of its median gas density.
CHOLLA: A NEW MASSIVELY PARALLEL HYDRODYNAMICS CODE FOR ASTROPHYSICAL SIMULATION
International Nuclear Information System (INIS)
Schneider, Evan E.; Robertson, Brant E.
2015-01-01
We present Computational Hydrodynamics On ParaLLel Architectures (Cholla), a new three-dimensional hydrodynamics code that harnesses the power of graphics processing units (GPUs) to accelerate astrophysical simulations. Cholla models the Euler equations on a static mesh using state-of-the-art techniques, including the unsplit Corner Transport Upwind algorithm, a variety of exact and approximate Riemann solvers, and multiple spatial reconstruction techniques including the piecewise parabolic method (PPM). Using GPUs, Cholla evolves the fluid properties of thousands of cells simultaneously and can update over 10 million cells per GPU-second while using an exact Riemann solver and PPM reconstruction. Owing to the massively parallel architecture of GPUs and the design of the Cholla code, astrophysical simulations with physically interesting grid resolutions (≳256^3) can easily be computed on a single device. We use the Message Passing Interface library to extend calculations onto multiple devices and demonstrate nearly ideal scaling beyond 64 GPUs. A suite of test problems highlights the physical accuracy of our modeling and provides a useful comparison to other codes. We then use Cholla to simulate the interaction of a shock wave with a gas cloud in the interstellar medium, showing that the evolution of the cloud is highly dependent on its density structure. We reconcile the computed mixing time of a turbulent cloud with a realistic density distribution destroyed by a strong shock with the existing analytic theory for spherical cloud destruction by describing the system in terms of its median gas density.
Acceleration of iterative tomographic reconstruction using graphics processors
International Nuclear Information System (INIS)
Belzunce, M.A.; Osorio, A.; Verrastro, C.A.
2009-01-01
Using iterative algorithms for image reconstruction in 3D positron emission tomography has been shown to produce images of better quality than analytical methods. However, these algorithms are computationally expensive. New graphics processing units (GPUs) provide high performance at low cost, together with programming tools that make it easy to run parallel algorithms in scientific applications. In this work, we try to accelerate image reconstruction algorithms in 3D PET by using a GPU. A parallel implementation of the 3D ML-EM algorithm was developed using the Siddon algorithm as projector and back-projector. Results show that accelerations of more than one order of magnitude can be achieved while keeping similar image quality. (author)
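The ML-EM update itself is compact: forward-project the current image, compare with the measured data, back-project the ratio, and normalize by the sensitivity. The sketch below shows this on a tiny explicit system matrix in plain Python. It is only an illustration: the work described above computes projections with the Siddon algorithm on a GPU, whereas here the system matrix, the function name `mlem` and the toy data are assumptions made for the example.

```python
def mlem(system_matrix, measured, n_iter=50):
    """Plain ML-EM for a small dense system matrix.

    system_matrix[i][j] is the contribution of voxel j to detector bin i.
    """
    n_bins = len(system_matrix)
    n_vox = len(system_matrix[0])
    # Sensitivity image: back-projection of a vector of ones.
    sensitivity = [sum(system_matrix[i][j] for i in range(n_bins)) for j in range(n_vox)]
    image = [1.0] * n_vox  # uniform, strictly positive starting image
    for _ in range(n_iter):
        # Forward projection of the current estimate.
        expected = [sum(a * x for a, x in zip(row, image)) for row in system_matrix]
        # Ratio of measured to expected counts in each bin.
        ratio = [m / e if e > 0 else 0.0 for m, e in zip(measured, expected)]
        # Back-projection of the ratio.
        back = [sum(system_matrix[i][j] * ratio[i] for i in range(n_bins)) for j in range(n_vox)]
        # Multiplicative update, normalized by the sensitivity.
        image = [x * b / s if s > 0 else 0.0 for x, b, s in zip(image, back, sensitivity)]
    return image


# Two voxels seen by three detector bins; noise-free data from the image [2, 3].
system_matrix = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
measured = [2.0, 3.0, 5.0]

image = mlem(system_matrix, measured)
print(image)
```

The forward and back projections are independent per bin and per voxel respectively, which is what makes the algorithm a natural fit for GPU execution.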
Reconstructing Neutrino Mass Spectrum
Smirnov, A. Yu.
1999-01-01
Reconstruction of the neutrino mass spectrum and lepton mixing is one of the fundamental problems of particle physics. In this connection we consider two central topics: (i) the origin of large lepton mixing, and (ii) the possible existence of new (sterile) neutrino states. We also discuss a possible relation between large mixing and the existence of sterile neutrinos.
Position reconstruction in LUX
Akerib, D. S.; Alsum, S.; Araújo, H. M.; Bai, X.; Bailey, A. J.; Balajthy, J.; Beltrame, P.; Bernard, E. P.; Bernstein, A.; Biesiadzinski, T. P.; Boulton, E. M.; Brás, P.; Byram, D.; Cahn, S. B.; Carmona-Benitez, M. C.; Chan, C.; Currie, A.; Cutter, J. E.; Davison, T. J. R.; Dobi, A.; Druszkiewicz, E.; Edwards, B. N.; Fallon, S. R.; Fan, A.; Fiorucci, S.; Gaitskell, R. J.; Genovesi, J.; Ghag, C.; Gilchriese, M. G. D.; Hall, C. R.; Hanhardt, M.; Haselschwardt, S. J.; Hertel, S. A.; Hogan, D. P.; Horn, M.; Huang, D. Q.; Ignarra, C. M.; Jacobsen, R. G.; Ji, W.; Kamdin, K.; Kazkaz, K.; Khaitan, D.; Knoche, R.; Larsen, N. A.; Lenardo, B. G.; Lesko, K. T.; Lindote, A.; Lopes, M. I.; Manalaysay, A.; Mannino, R. L.; Marzioni, M. F.; McKinsey, D. N.; Mei, D.-M.; Mock, J.; Moongweluwan, M.; Morad, J. A.; Murphy, A. St. J.; Nehrkorn, C.; Nelson, H. N.; Neves, F.; O'Sullivan, K.; Oliver-Mallory, K. C.; Palladino, K. J.; Pease, E. K.; Rhyne, C.; Shaw, S.; Shutt, T. A.; Silva, C.; Solmaz, M.; Solovov, V. N.; Sorensen, P.; Sumner, T. J.; Szydagis, M.; Taylor, D. J.; Taylor, W. C.; Tennyson, B. P.; Terman, P. A.; Tiedt, D. R.; To, W. H.; Tripathi, M.; Tvrznikova, L.; Uvarov, S.; Velan, V.; Verbus, J. R.; Webb, R. C.; White, J. T.; Whitis, T. J.; Witherell, M. S.; Wolfs, F. L. H.; Xu, J.; Yazdani, K.; Young, S. K.; Zhang, C.
2018-02-01
The (x, y) position reconstruction method used in the analysis of the complete exposure of the Large Underground Xenon (LUX) experiment is presented. The algorithm is based on a statistical test that makes use of an iterative method to recover the photomultiplier tube (PMT) light response directly from the calibration data. The light response functions make use of a two-dimensional functional form to account for the photons reflected on the inner walls of the detector. To increase the resolution for small pulses, a photon counting technique was employed to describe the response of the PMTs. The reconstruction was assessed with calibration data including 83mKr (releasing a total energy of 41.5 keV) and 3H (β- with Q = 18.6 keV) decays, and a deuterium-deuterium (D-D) neutron beam (2.45 MeV). Within the detector's fiducial volume, the reconstruction has achieved an (x, y) position uncertainty of σ = 0.82 cm and σ = 0.17 cm for events of only 200 and 4,000 detected electroluminescence photons respectively. Such signals are associated with electron recoils of energies ~0.25 keV and ~10 keV, respectively. The reconstructed position of the smallest events with a single electron emitted from the liquid surface (22 detected photons) has a horizontal (x, y) uncertainty of 2.13 cm.
Structural synthesis of parallel robots
Gogu, Grigore
This book represents the fifth part of a larger work dedicated to the structural synthesis of parallel robots. The originality of this work resides in the fact that it combines new formulae for mobility, connectivity, redundancy and overconstraints with evolutionary morphology in a unified structural synthesis approach that yields interesting and innovative solutions for parallel robotic manipulators. This is the first book on robotics that presents solutions for coupled, decoupled, uncoupled, fully-isotropic and maximally regular robotic manipulators with Schönflies motions systematically generated by using the structural synthesis approach proposed in Part 1. Overconstrained non-redundant/overactuated/redundantly actuated solutions with simple/complex limbs are proposed. Many solutions are presented here for the first time in the literature. The author had to make a difficult and challenging choice between protecting these solutions through patents and releasing them directly into the public domain. T...
GPU Parallel Bundle Block Adjustment
Directory of Open Access Journals (Sweden)
ZHENG Maoteng
2017-09-01
To deal with massive data in photogrammetry, we introduce GPU parallel computing technology. The preconditioned conjugate gradient and inexact Newton methods are also applied to reduce the number of iterations needed to solve the normal equation. A new bundle adjustment workflow is developed to make use of GPU parallel computing. Our method avoids the storage and inversion of the large normal matrix, and computes the normal matrix in real time. The proposed method not only greatly decreases the memory requirement of the normal matrix, but also greatly improves the efficiency of bundle adjustment. It also achieves the same accuracy as the conventional method. Preliminary experimental results show that the bundle adjustment of a dataset with about 4500 images and 9 million image points can be done in only 1.5 minutes while achieving sub-pixel accuracy.
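The preconditioned conjugate gradient step mentioned in this abstract can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the Jacobi (diagonal) preconditioner, the dense matrix `N`, and the function name `pcg_jacobi` are illustrative choices, not details taken from the paper, which avoids storing the normal matrix at all.

```python
def pcg_jacobi(N, b, tol=1e-10, max_iter=100):
    """Solve N x = b for a symmetric positive definite N with Jacobi-preconditioned CG.

    N is stored densely here only to keep the sketch short (an assumption);
    a matrix-free code would replace matvec() with on-the-fly products.
    """
    n = len(b)

    def matvec(v):
        return [sum(N[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(a * c for a, c in zip(u, v))

    x = [0.0] * n
    r = list(b)                               # residual b - N x for x = 0
    z = [r[i] / N[i][i] for i in range(n)]    # apply the diagonal preconditioner
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:            # converged: residual norm is small
            break
        Np = matvec(p)
        alpha = rz / dot(p, Np)               # step length along search direction
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Np[i] for i in range(n)]
        z = [r[i] / N[i][i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x
```

Only matrix-vector products with the normal matrix are required, which is what makes it possible in principle to skip forming and inverting that matrix explicitly.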
A tandem parallel plate analyzer
International Nuclear Information System (INIS)
Hamada, Y.; Fujisawa, A.; Iguchi, H.; Nishizawa, A.; Kawasumi, Y.
1996-11-01
With a new modification of a parallel plate analyzer, second-order focusing is obtained for an arbitrary injection angle. This kind of analyzer, with a small injection angle, has the advantage of a small operational voltage compared to the Proca and Green analyzer, where the injection angle is 30 degrees. Thus, the newly proposed analyzer will be very useful for the precise energy measurement of high energy particles in the MeV range. (author)
International Nuclear Information System (INIS)
Gus'kov, B.N.; Kalinnikov, V.A.; Krastev, V.R.; Maksimov, A.N.; Nikityuk, N.M.
1985-01-01
This paper describes a high-speed parallel counter that contains 31 inputs and 15 outputs and is implemented with integrated circuits of series 500. The counter is designed for fast sampling of events according to the number of particles that pass simultaneously through the hodoscopic plane of the detector. The minimum delay of the output signals relative to the input is 43 nsec. The duration of the output signals can be varied from 75 to 120 nsec.
An anthropologist in parallel structure
Directory of Open Access Journals (Sweden)
Noelle Molé Liston
2016-08-01
The essay examines the parallels between Molé Liston’s studies on labor and precarity in Italy and the United States’ anthropology job market. Probing the way economic shift reshaped the field of anthropology of Europe in the late 2000s, the piece explores how the neoliberalization of the American academy increased the value in studying the hardships and daily lives of non-western populations in Europe.
Combinatorics of spreads and parallelisms
Johnson, Norman
2010-01-01
Partitions of Vector Spaces; Quasi-Subgeometry Partitions; Finite Focal-Spreads; Generalizing André Spreads; The Going Up Construction for Focal-Spreads; Subgeometry Partitions; Subgeometry and Quasi-Subgeometry Partitions; Subgeometries from Focal-Spreads; Extended André Subgeometries; Kantor's Flag-Transitive Designs; Maximal Additive Partial Spreads; Subplane Covered Nets and Baer Groups; Partial Desarguesian t-Parallelisms; Direct Products of Affine Planes; Jha-Johnson SL(2,
Wakefield calculations on parallel computers
International Nuclear Information System (INIS)
Schoessow, P.
1990-01-01
The use of parallelism in the solution of wakefield problems is illustrated for two different computer architectures (SIMD and MIMD). Results are given for finite difference codes which have been implemented on a Connection Machine and an Alliant FX/8 and which are used to compute wakefields in dielectric loaded structures. Benchmarks on code performance are presented for both cases. 4 refs., 3 figs., 2 tabs
High-speed parallel implementation of a modified PBR algorithm on DSP-based EH topology
Rajan, K.; Patnaik, L. M.; Ramakrishna, J.
1997-08-01
Algebraic Reconstruction Technique (ART) is an age-old method used for solving the problem of three-dimensional (3-D) reconstruction from projections in electron microscopy and radiology. In medical applications, direct 3-D reconstruction is at the forefront of investigation. The simultaneous iterative reconstruction technique (SIRT) is an ART-type algorithm with the potential of generating in a few iterations tomographic images of a quality comparable to that of convolution backprojection (CBP) methods. Pixel-based reconstruction (PBR) is similar to SIRT reconstruction, and it has been shown that PBR algorithms give better quality pictures compared to those produced by SIRT algorithms. In this work, we propose a few modifications to the PBR algorithms. The modified algorithms are shown to give better quality pictures compared to PBR algorithms. The PBR algorithm and the modified PBR algorithms are highly compute intensive. Not many attempts have been made to reconstruct objects in the true 3-D sense because of the high computational overhead. In this study, we have developed parallel two-dimensional (2-D) and 3-D reconstruction algorithms based on modified PBR. We attempt to solve the two problems encountered by the PBR and modified PBR algorithms, i.e., the long computational time and the large memory requirements, by parallelizing the algorithm on a multiprocessor system. We investigate the possible task and data partitioning schemes by exploiting the potential parallelism in the PBR algorithm subject to minimizing the memory requirement. We have implemented an extended hypercube (EH) architecture for the high-speed execution of the 3-D reconstruction algorithm using the commercially available fast floating point digital signal processor (DSP) chips as the processing elements (PEs) and dual-port random access memories (DPR) as channels between the PEs. We discuss and compare the performances of the PBR algorithm on an IBM 6000 RISC workstation, on a Silicon
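The abstract does not spell out the PBR update or its modifications, so the sketch below shows only the SIRT iteration it is compared with, in a common row- and column-normalised form. The function name `sirt`, the relaxation parameter, and the dense non-negative matrix `A` are assumptions made for illustration.

```python
def sirt(A, y, n_iter=50, relax=1.0):
    """SIRT: x <- x + relax * C * A^T * R * (y - A x).

    R and C are diagonal matrices holding inverse row sums and inverse column
    sums of A. Assumes A has non-negative entries; A and y are toy inputs.
    """
    n_rows, n_cols = len(A), len(A[0])
    row_sum = [sum(A[i]) for i in range(n_rows)]
    col_sum = [sum(A[i][j] for i in range(n_rows)) for j in range(n_cols)]
    x = [0.0] * n_cols
    for _ in range(n_iter):
        # Residual of every ray, computed from the same x before any pixel changes.
        res = []
        for i in range(n_rows):
            if row_sum[i] > 0:
                est = sum(A[i][j] * x[j] for j in range(n_cols))
                res.append((y[i] - est) / row_sum[i])
            else:
                res.append(0.0)
        # Simultaneous correction of all pixels from the stored residuals.
        for j in range(n_cols):
            if col_sum[j] > 0:
                corr = sum(A[i][j] * res[i] for i in range(n_rows))
                x[j] += relax * corr / col_sum[j]
    return x
```

Because every pixel correction depends only on the residuals of the previous iterate, the inner loops can be split across processing elements by rays or by pixels, which is the kind of task and data partitioning the abstract refers to.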
Aspects of computation on asynchronous parallel processors
International Nuclear Information System (INIS)
Wright, M.
1989-01-01
The increasing availability of asynchronous parallel processors has provided opportunities for original and useful work in scientific computing. However, the field of parallel computing is still in a highly volatile state, and researchers display a wide range of opinion about many fundamental questions such as models of parallelism, approaches for detecting and analyzing parallelism of algorithms, and tools that allow software developers and users to make effective use of diverse forms of complex hardware. This volume collects the work of researchers specializing in different aspects of parallel computing, who met to discuss the framework and the mechanics of numerical computing. The far-reaching impact of high-performance asynchronous systems is reflected in the wide variety of topics, which include scientific applications (e.g. linear algebra, lattice gauge simulation, ordinary and partial differential equations), models of parallelism, parallel language features, task scheduling, automatic parallelization techniques, tools for algorithm development in parallel environments, and system design issues
Parallel processing of genomics data
Agapito, Giuseppe; Guzzi, Pietro Hiram; Cannataro, Mario
2016-10-01
The availability of high-throughput experimental platforms for the analysis of biological samples, such as mass spectrometry, microarrays and Next Generation Sequencing, has made it possible to analyze a whole genome in a single experiment. Such platforms produce an enormous volume of data per experiment, so the analysis of this flow of data poses several challenges in terms of data storage, preprocessing, and analysis. To face those issues, efficient, possibly parallel, bioinformatics software needs to be used to preprocess and analyze data, for instance to highlight genetic variation associated with complex diseases. In this paper we present a parallel algorithm for the preprocessing and statistical analysis of genomics data, able to cope with high-dimensional data while giving good response times. The proposed system is able to find statistically significant biological markers that discriminate classes of patients who respond to drugs in different ways. Experiments performed on real and synthetic genomic datasets show good speed-up and scalability.
Fast data reconstructed method of Fourier transform imaging spectrometer based on multi-core CPU
Yu, Chunchao; Du, Debiao; Xia, Zongze; Song, Li; Zheng, Weijian; Yan, Min; Lei, Zhenggang
2017-10-01
An imaging spectrometer acquires a two-dimensional spatial image and a one-dimensional spectrum at the same time, which makes it highly useful in color and spectral measurement, true color image synthesis, military reconnaissance and so on. To achieve fast reconstruction of Fourier transform imaging spectrometer data, the paper designs an optimized reconstruction algorithm using OpenMP parallel computing, which was further used to optimize processing for the HyperSpectral Imager of the `HJ-1' Chinese satellite. The results show that the method based on multi-core parallel computing makes effective use of multi-core CPU hardware resources and significantly improves the efficiency of spectrum reconstruction processing. If the technology is applied on a workstation with more cores, it will be possible to complete real-time data processing for the Fourier transform imaging spectrometer on a single computer.
An Implementation and Parallelization of the Scale Space Meshing Algorithm
Directory of Open Access Journals (Sweden)
Julie Digne
2015-11-01
Creating an interpolating mesh from an unorganized set of oriented points is a difficult problem which is often overlooked. Most methods focus indeed on building a watertight smoothed mesh by defining some function whose zero level set is the surface of the object. However in some cases it is crucial to build a mesh that interpolates the points and does not fill the acquisition holes: either because the data are sparse and trying to fill the holes would create spurious artifacts or because the goal is to explore visually the data exactly as they were acquired without any smoothing process. In this paper we detail a parallel implementation of the Scale-Space Meshing algorithm, which builds on the scale-space framework for reconstructing a high precision mesh from an input oriented point set. This algorithm first smoothes the point set, producing a singularity free shape. It then uses a standard mesh reconstruction technique, the Ball Pivoting Algorithm, to build a mesh from the smoothed point set. The final step consists in back-projecting the mesh built on the smoothed positions onto the original point set. The result of this process is an interpolating, hole-preserving surface mesh reconstruction.
International Nuclear Information System (INIS)
Shen Le; Xing Yuxiang
2010-01-01
The derivative back-projection filtered algorithm for helical cone-beam CT is a newly developed exact reconstruction method. Due to its large computational complexity, the reconstruction is rather slow for practical use. The general-purpose graphics processing unit (GPGPU) is a SIMD parallel hardware architecture with powerful floating-point capacity. In this paper, we propose a new method for PI-line choice and sampling grid, and a parallel PI-line reconstruction algorithm implemented on NVIDIA's Compute Unified Device Architecture (CUDA). Numerical simulation studies are carried out to validate our method. Compared with a conventional CPU implementation, the CUDA accelerated method provides images of the same quality with a speedup factor of 318. Optimization strategies for the GPU acceleration are presented. Finally, the influence of the parameters of the PI-line samples on the reconstruction speed and image quality is discussed. (authors)
Matrix-based image reconstruction methods for tomography
International Nuclear Information System (INIS)
Llacer, J.; Meng, J.D.
1984-10-01
Matrix methods of image reconstruction have not been used, in general, because of the large size of practical matrices, ill-conditioning upon inversion and the success of Fourier-based techniques. An exception is the work that has been done at the Lawrence Berkeley Laboratory for imaging with accelerated radioactive ions. An extension of that work into more general imaging problems shows that, with a correct formulation of the problem, positron tomography with ring geometries results in well-behaved matrices which can be used for image reconstruction with no distortion of the point response in the field of view and flexibility in the design of the instrument. Maximum Likelihood Estimator methods of reconstruction, which use the system matrices tailored to specific instruments and do not need matrix inversion, are shown to result in good preliminary images. A parallel processing computer structure based on multiple inexpensive microprocessors is proposed as a system to implement the matrix-MLE methods. 14 references, 7 figures
Genome rearrangements and phylogeny reconstruction in Yersinia pestis.
Bochkareva, Olga O; Dranenko, Natalia O; Ocheredko, Elena S; Kanevsky, German M; Lozinsky, Yaroslav N; Khalaycheva, Vera A; Artamonova, Irena I; Gelfand, Mikhail S
2018-01-01
Genome rearrangements have played an important role in the evolution of Yersinia pestis from its progenitor Yersinia pseudotuberculosis. Traditional phylogenetic trees for Y. pestis based on sequence comparison have short internal branches and low bootstrap supports as only a small number of nucleotide substitutions have occurred. On the other hand, even a small number of genome rearrangements may resolve topological ambiguities in a phylogenetic tree. We reconstructed phylogenetic trees based on genome rearrangements using several popular approaches such as Maximum likelihood for Gene Order and the Bayesian model of genome rearrangements by inversions. We also reconciled phylogenetic trees for each of the three CRISPR loci to obtain an integrated scenario of the CRISPR cassette evolution. Analysis of contradictions between the obtained evolutionary trees yielded numerous parallel inversions and gain/loss events. Our data indicate that an integrated analysis of sequence-based and inversion-based trees enhances the resolution of phylogenetic reconstruction. In contrast, reconstructions of strain relationships based solely on CRISPR loci may not be reliable, as the history is obscured by large deletions, obliterating the order of spacer gains. Similarly, numerous parallel gene losses preclude reconstruction of phylogeny based on gene content.
Algebraic reconstruction techniques for spectral reconstruction in diffuse optical tomography
International Nuclear Information System (INIS)
Brendel, Bernhard; Ziegler, Ronny; Nielsen, Tim
2008-01-01
Reconstruction in diffuse optical tomography (DOT) necessitates solving the diffusion equation, which is nonlinear with respect to the parameters that have to be reconstructed. Currently applied solving methods are based on the linearization of the equation. For spectral three-dimensional reconstruction, the emerging equation system is too large for direct inversion, but the application of iterative methods is feasible. Computational effort and speed of convergence of these iterative methods are crucial since they determine the computation time of the reconstruction. In this paper, the iterative methods algebraic reconstruction technique (ART) and conjugate gradients (CG) as well as a new modified ART method are investigated for spectral DOT reconstruction. The aim of the modified ART scheme is to speed up the convergence by considering the specific conditions of spectral reconstruction. As a result, it converges much faster to favorable results than conventional ART and CG methods.
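For orientation, conventional ART (the Kaczmarz row-action method) applied to a linearized system can be sketched in Python as below. The modified ART scheme of the paper is not described in the abstract and is not reproduced; the names `art`, `A`, `y` and the relaxation parameter are hypothetical.

```python
def art(A, y, n_sweeps=100, relax=1.0):
    """Conventional ART (Kaczmarz): project the estimate onto one equation at a time.

    A is a small dense list-of-lists matrix standing in for the linearized
    forward model (an assumption for illustration).
    """
    x = [0.0] * len(A[0])
    for _ in range(n_sweeps):
        for row, y_i in zip(A, y):
            norm2 = sum(a * a for a in row)
            if norm2 == 0.0:
                continue  # an all-zero row carries no information
            # Signed distance of x from the hyperplane row . x = y_i, scaled by relax.
            step = relax * (y_i - sum(a * v for a, v in zip(row, x))) / norm2
            x = [v + step * a for v, a in zip(x, row)]
    return x
```

Each update touches a single equation, so memory use stays low, but convergence depends strongly on how close to orthogonal the rows are, which is one reason accelerated variants are of interest.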
Hybrid parallel computing architecture for multiview phase shifting
Zhong, Kai; Li, Zhongwei; Zhou, Xiaohui; Shi, Yusheng; Wang, Congjun
2014-11-01
The multiview phase-shifting method is a powerful means of achieving high-resolution three-dimensional (3-D) shape measurement. Unfortunately, this capability comes at a very high computation cost, and 3-D computations have to be processed offline. To realize real-time 3-D shape measurement, a hybrid parallel computing architecture is proposed for multiview phase shifting. In this architecture, the central processing unit co-operates with the graphics processing unit (GPU) to achieve hybrid parallel computing. The computationally expensive procedures, including lens distortion rectification, phase computation, correspondence, and 3-D reconstruction, are implemented on the GPU, and a three-layer kernel function model is designed to realize coarse-grained and fine-grained parallel computing simultaneously. Experimental results verify that the developed system can perform real-time 3-D measurement at 50 fps (frames per second) with 260 K 3-D points per frame. A speedup of up to 180 times is obtained for the proposed technique using an NVIDIA GT560Ti graphics card compared with a sequential C implementation on a 3.4 GHz Intel Core i7 3770.
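The phase computation step listed above is, in the conventional N-step formulation, a per-pixel arctangent. The Python sketch below assumes the textbook intensity model I_k = A + B cos(φ + 2πk/N) with equally spaced shifts; that model and the function name `wrapped_phase` are assumptions, since the abstract does not state which variant the authors use.

```python
import math

def wrapped_phase(intensities):
    """Recover the wrapped phase of one pixel from N equally shifted samples.

    Assumes I_k = A + B*cos(phi + 2*pi*k/N) for k = 0..N-1 with N >= 3.
    """
    n = len(intensities)
    if n < 3:
        raise ValueError("need at least three phase-shifted samples")
    s = sum(i_k * math.sin(2 * math.pi * k / n) for k, i_k in enumerate(intensities))
    c = sum(i_k * math.cos(2 * math.pi * k / n) for k, i_k in enumerate(intensities))
    # For this model, s = -(N/2)*B*sin(phi) and c = (N/2)*B*cos(phi).
    return math.atan2(-s, c)
```

Every pixel is processed independently of its neighbours, which makes this step a natural candidate for fine-grained GPU parallelism.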
FPGA Hardware Acceleration of a Phylogenetic Tree Reconstruction with Maximum Parsimony Algorithm
BLOCK, Henry; MARUYAMA, Tsutomu
2017-01-01
In this paper, we present an FPGA hardware implementation for a phylogenetic tree reconstruction with a maximum parsimony algorithm. We base our approach on a particular stochastic local search algorithm that uses the Progressive Neighborhood and the Indirect Calculation of Tree Lengths method. This method is widely used for the acceleration of the phylogenetic tree reconstruction algorithm in software. In our implementation, we define a tree structure and accelerate the search by parallel an...
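As background for the maximum parsimony criterion, the length of a fixed tree can be computed with Fitch's classical algorithm, sketched here in Python. This is the standard textbook scoring routine, not the Indirect Calculation of Tree Lengths method or the FPGA design of the paper; the nested-tuple tree encoding and the function names are assumptions.

```python
def fitch_site(tree, site):
    """Return (state_set, cost) for one alignment column on a rooted binary tree.

    A tree is either a leaf name (str) or a pair (left, right) of subtrees.
    """
    if isinstance(tree, str):
        return {site[tree]}, 0
    left, right = tree
    l_states, l_cost = fitch_site(left, site)
    r_states, r_cost = fitch_site(right, site)
    common = l_states & r_states
    if common:
        return common, l_cost + r_cost                   # no change needed here
    return l_states | r_states, l_cost + r_cost + 1      # one substitution

def tree_length(tree, sequences):
    """Sum the Fitch cost over all columns of equally long sequences."""
    n_sites = len(next(iter(sequences.values())))
    total = 0
    for k in range(n_sites):
        site = {name: seq[k] for name, seq in sequences.items()}
        total += fitch_site(tree, site)[1]
    return total
```

A local search such as the one described in the abstract evaluates many candidate trees with this kind of score, and the columns are independent of each other, so the per-site loop is a natural target for parallel hardware.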
Arctic Sea Level Reconstruction
DEFF Research Database (Denmark)
Svendsen, Peter Limkilde
Reconstruction of historical Arctic sea level is very difficult due to the limited coverage and quality of tide gauge and altimetry data in the area. This thesis addresses many of these issues, and discusses strategies to help achieve a stable and plausible reconstruction of Arctic sea level from...... 1950 to today. The primary record of historical sea level, on the order of several decades to a few centuries, is tide gauges. Tide gauge records from around the world are collected in the Permanent Service for Mean Sea Level (PSMSL) database, which includes data along the Arctic coasts. A reasonable...... amount of data is available along the Norwegian and Russian coasts since 1950, and most published research on Arctic sea level extends cautiously from these areas. Very little tide gauge data is available elsewhere in the Arctic, and records of a length of several decades, as generally recommended for sea...
Herrera, Ramón
2018-03-01
The reconstruction of a warm inflationary universe model from the scalar spectral index n_S(N) and the tensor to scalar ratio r(N) as a function of the number of e-folds N is studied. Under a general formalism we find the effective potential and the dissipative coefficient in terms of the cosmological parameters n_S and r considering the weak and strong dissipative stages under the slow roll approximation. As a specific example, we study the attractors for the index n_S given by n_S - 1 ∝ N^{-1} and for the ratio r ∝ N^{-2}, in order to reconstruct the model of warm inflation. Here, expressions for the effective potential V(φ) and the dissipation coefficient Γ(φ) are obtained.
Jet Vertex Charge Reconstruction
Nektarijevic, Snezana; The ATLAS collaboration
2015-01-01
A newly developed algorithm called the jet vertex charge tagger, aimed at identifying the sign of the charge of jets containing $b$-hadrons, referred to as $b$-jets, is presented. In addition to the well established track-based jet charge determination, this algorithm introduces the so-called jet vertex charge reconstruction, which exploits the charge information associated to the displaced vertices within the jet. Furthermore, the charge of a soft muon contained in the jet is taken into account when available. All available information is combined into a multivariate discriminator. The algorithm has been developed on jets matched to generator level $b$-hadrons provided by $t\bar{t}$ events simulated at $\sqrt{s} = 13$ TeV using the full ATLAS detector simulation and reconstruction.
Overview of the Force Scientific Parallel Language
Directory of Open Access Journals (Sweden)
Gita Alaghband
1994-01-01
The Force parallel programming language, designed for large-scale shared-memory multiprocessors, is presented. The language provides a number of parallel constructs as extensions to the ordinary Fortran language and is implemented as a two-level macro preprocessor to support portability across shared memory multiprocessors. The global parallelism model on which the Force is based provides a powerful parallel language. The parallel constructs, generic synchronization, and freedom from process management supported by the Force have resulted in structured parallel programs that are ported to the many multiprocessors on which the Force is implemented. Two new parallel constructs for looping and functional decomposition are discussed. Several programming examples to illustrate some parallel programming approaches using the Force are also presented.
Automatic Loop Parallelization via Compiler Guided Refactoring
DEFF Research Database (Denmark)
Larsen, Per; Ladelsky, Razya; Lidman, Jacob
For many parallel applications, performance relies not on instruction-level parallelism, but on loop-level parallelism. Unfortunately, many modern applications are written in ways that obstruct automatic loop parallelization. Since we cannot identify sufficient parallelization opportunities...... for these codes in a static, off-line compiler, we developed an interactive compilation feedback system that guides the programmer in iteratively modifying application source, thereby improving the compiler’s ability to generate loop-parallel code. We use this compilation system to modify two sequential...... benchmarks, finding that the code parallelized in this way runs up to 8.3 times faster on an octo-core Intel Xeon 5570 system and up to 12.5 times faster on a quad-core IBM POWER6 system. Benchmark performance varies significantly between the systems. This suggests that semi-automatic parallelization should...
Parallel kinematics type, kinematics, and optimal design
Liu, Xin-Jun
2014-01-01
Parallel Kinematics: Type, Kinematics, and Optimal Design presents the results of 15 years' research on parallel mechanisms and parallel kinematics machines. This book covers the systematic classification of parallel mechanisms (PMs) and provides a large number of mechanical architectures of PMs available for use in practical applications. It focuses on the kinematic design of parallel robots. One successful application of parallel mechanisms is in the field of machine tools, where they are also called parallel kinematics machines and have been an emerging trend in advanced machine tools. The book describes not only the main aspects and important topics in parallel kinematics, but also references novel concepts and approaches, i.e. type synthesis based on evolution, performance evaluation and optimization based on screw theory, singularity model taking into account motion and force transmissibility, and others. This book is intended for researchers, scientists, engineers and postgraduates or above with interes...
Applied Parallel Computing Industrial Computation and Optimization
DEFF Research Database (Denmark)
Madsen, Kaj; Olesen, Dorte
Proceedings and the Third International Workshop on Applied Parallel Computing in Industrial Problems and Optimization (PARA96)
Parallel algorithms and cluster computing
Hoffmann, Karl Heinz
2007-01-01
This book presents major advances in high performance computing as well as major advances due to high performance computing. It contains a collection of papers in which results achieved in the collaboration of scientists from computer science, mathematics, physics, and mechanical engineering are presented. From the science problems to the mathematical algorithms and on to the effective implementation of these algorithms on massively parallel and cluster computers we present state-of-the-art methods and technology as well as exemplary results in these fields. This book shows that problems which seem superficially distinct become intimately connected on a computational level.
Parallel computation of rotating flows
DEFF Research Database (Denmark)
Lundin, Lars Kristian; Barker, Vincent A.; Sørensen, Jens Nørkær
1999-01-01
This paper deals with the simulation of 3‐D rotating flows based on the velocity‐vorticity formulation of the Navier‐Stokes equations in cylindrical coordinates. The governing equations are discretized by a finite difference method. The solution is advanced to a new time level by a two‐step process...... is that of solving a singular, large, sparse, over‐determined linear system of equations, and the iterative method CGLS is applied for this purpose. We discuss some of the mathematical and numerical aspects of this procedure and report on the performance of our software on a wide range of parallel computers. Darbe...
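The CGLS iteration named in this abstract solves a least-squares problem using only products with the matrix and its transpose. The Python sketch below follows the usual textbook form of the method on a small dense matrix; the function name `cgls` and the toy storage format are assumptions, and nothing here reflects the authors' sparse, parallel implementation.

```python
def cgls(A, b, tol=1e-12, max_iter=100):
    """CGLS: minimise ||A x - b||_2 without forming A^T A.

    A is a dense list-of-lists matrix with at least as many rows as columns
    (toy storage chosen for brevity, an assumption).
    """
    m, n = len(A), len(A[0])

    def matvec(v):       # A v
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]

    def rmatvec(u):      # A^T u
        return [sum(A[i][j] * u[i] for i in range(m)) for j in range(n)]

    def dot(u, v):
        return sum(a * c for a, c in zip(u, v))

    x = [0.0] * n
    r = list(b)          # residual b - A x for x = 0
    s = rmatvec(r)       # gradient direction A^T r
    p = list(s)
    gamma = dot(s, s)
    for _ in range(max_iter):
        if gamma ** 0.5 < tol:       # normal-equation residual is small enough
            break
        q = matvec(p)
        alpha = gamma / dot(q, q)
        x = [x_i + alpha * p_i for x_i, p_i in zip(x, p)]
        r = [r_i - alpha * q_i for r_i, q_i in zip(r, q)]
        s = rmatvec(r)
        gamma_new = dot(s, s)
        p = [s_i + (gamma_new / gamma) * p_i for s_i, p_i in zip(s, p)]
        gamma = gamma_new
    return x
```

The two matrix-vector products per iteration are the only operations that touch the matrix, so on a parallel machine the work divides by rows or by blocks of a sparse matrix.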
Segmentation-Driven Tomographic Reconstruction
DEFF Research Database (Denmark)
Kongskov, Rasmus Dalgas
such that the segmentation subsequently can be carried out by use of a simple segmentation method, for instance just a thresholding method. We tested the advantages of going from a two-stage reconstruction method to a one stage segmentation-driven reconstruction method for the phase contrast tomography reconstruction......The tomographic reconstruction problem is concerned with creating a model of the interior of an object from some measured data, typically projections of the object. After reconstructing an object it is often desired to segment it, either automatically or manually. For computed tomography (CT...
International Nuclear Information System (INIS)
Francisco, Oscar; Rangel, Murilo; Barter, William; Bursche, Albert; Potterat, Cedric; Coco, Victor
2012-01-01
Full text: The Large Hadron Collider (LHC) is the most powerful particle accelerator in the world. It has been designed to collide proton beams at an energy up to 14 TeV in the center of mass. In 2011, data taking was done with a center of mass energy of 7 TeV; the instantaneous luminosity reached values greater than 4 × 10^32 cm^-2 s^-1 and the integrated luminosity reached 1.02 fb^-1 at LHCb. Jet reconstruction is fundamental to observing events that can be used to test perturbative QCD (pQCD). It also provides a way to observe standard model channels and to search for new physics such as SUSY. The anti-kt algorithm is a jet reconstruction algorithm based on the distance between particles in η × φ space and on the transverse momentum of the particles. To maximize the energy resolution, all information from the trackers and the calorimeters is used in the LHCb experiment to create objects called particle flow objects, which are used as input to the anti-kt algorithm. LHCb is especially interesting for jet studies because its η region is complementary to those of the other main LHC experiments. We will present the first results of jet reconstruction using 2011 LHCb data. (author)
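The anti-kt distance measure referred to above can be illustrated with a naive sequential clustering loop in Python. This is a sketch under stated assumptions: the dictionary layout, the radius value, and above all the pT-weighted recombination in `merge` are simplifications chosen for brevity (production codes normally sum four-momenta), and the cubic-time search is for clarity only.

```python
import math

def delta_r2(a, b):
    """Squared distance in the (eta, phi) plane, with phi wrapped to (-pi, pi]."""
    dphi = (a["phi"] - b["phi"] + math.pi) % (2 * math.pi) - math.pi
    return (a["eta"] - b["eta"]) ** 2 + dphi ** 2

def merge(a, b):
    """Simplified pT-weighted recombination (assumption, not the usual E-scheme)."""
    pt = a["pt"] + b["pt"]
    dphi = (b["phi"] - a["phi"] + math.pi) % (2 * math.pi) - math.pi
    phi = (a["phi"] + dphi * b["pt"] / pt + math.pi) % (2 * math.pi) - math.pi
    eta = (a["pt"] * a["eta"] + b["pt"] * b["eta"]) / pt
    return {"pt": pt, "eta": eta, "phi": phi}

def anti_kt(particles, radius=0.5):
    """Cluster until no objects remain; returns jets sorted by decreasing pT."""
    objs = [dict(p) for p in particles]
    jets = []
    while objs:
        best_d, best_i, best_j = None, None, None
        for i, a in enumerate(objs):
            d_beam = a["pt"] ** -2                      # beam distance d_iB
            if best_d is None or d_beam < best_d:
                best_d, best_i, best_j = d_beam, i, None
            for j in range(i + 1, len(objs)):
                b = objs[j]
                d_ij = min(a["pt"] ** -2, b["pt"] ** -2) * delta_r2(a, b) / radius ** 2
                if d_ij < best_d:
                    best_d, best_i, best_j = d_ij, i, j
        if best_j is None:
            jets.append(objs.pop(best_i))               # closest to the beam: final jet
        else:
            b = objs.pop(best_j)                        # j > i, so remove j first
            a = objs.pop(best_i)
            objs.append(merge(a, b))
    return sorted(jets, key=lambda jet: jet["pt"], reverse=True)
```

Because the inverse square of the transverse momentum enters the measure, soft particles cluster around hard ones before they cluster with each other, which gives the regular jet shapes the algorithm is known for.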
Energy Technology Data Exchange (ETDEWEB)
Francisco, Oscar; Rangel, Murilo [Universidade Federal do Rio de Janeiro (UFRJ), RJ (Brazil); Barter, William [University of Cambridge, Cambridge (United Kingdom); Bursche, Albert [Universitat Zurich, Zurich (Switzerland); Potterat, Cedric [Universitat de Barcelona, Barcelona (Spain); Coco, Victor [Nikhef National Institute for Subatomic Physics, Amsterdam (Netherlands)
2012-07-01
Full text: The Large Hadron Collider (LHC) is the most powerful particle accelerator in the world. It has been designed to collide proton beams at an energy up to 14 TeV in the center of mass. In 2011, data taking was done with a center of mass energy of 7 TeV; the instantaneous luminosity reached values greater than 4 × 10^32 cm^-2 s^-1 and the integrated luminosity reached 1.02 fb^-1 at LHCb. Jet reconstruction is fundamental to observing events that can be used to test perturbative QCD (pQCD). It also provides a way to observe standard model channels and to search for new physics such as SUSY. The anti-kt algorithm is a jet reconstruction algorithm based on the distance between particles in η × φ space and on the transverse momentum of the particles. To maximize the energy resolution, all information from the trackers and the calorimeters is used in the LHCb experiment to create objects called particle flow objects, which are used as input to the anti-kt algorithm. LHCb is especially interesting for jet studies because its η region is complementary to those of the other main LHC experiments. We will present the first results of jet reconstruction using 2011 LHCb data. (author)
The parallel volume at large distances
DEFF Research Database (Denmark)
Kampf, Jürgen
In this paper we examine the asymptotic behavior of the parallel volume of planar non-convex bodies as the distance tends to infinity. We show that the difference between the parallel volume of the convex hull of a body and the parallel volume of the body itself tends to 0. This yields a new proof...... for the fact that a planar body can only have polynomial parallel volume, if it is convex. Extensions to Minkowski spaces and random sets are also discussed....
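As background for the statement about polynomial parallel volume, the convex case can be written out with the classical Steiner formula in the plane. This is a standard result quoted here for orientation, not a formula taken from the paper, and the symbols ($K$ for the body, $B$ for the unit disc, $r$ for the distance, $A$ for area, $P$ for perimeter) are notation assumed for this note:

$$A(K + rB) = A(K) + P(K)\,r + \pi r^2, \qquad r \ge 0.$$

For convex $K$ the parallel volume is therefore a quadratic polynomial in $r$; the abstract's claim is that, among planar bodies, only convex ones have this property.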
[Reconstructive methods after Fournier gangrene].
Wallner, C; Behr, B; Ring, A; Mikhail, B D; Lehnhardt, M; Daigeler, A
2016-04-01
Fournier's gangrene is a variant of necrotizing fasciitis restricted to the perineal and genital region. It presents as an acute life-threatening disease and demands rapid surgical debridement, resulting in large soft tissue defects. Various reconstructive methods have to be applied to restore functionality and aesthetics. The objective of this work is to identify different reconstructive methods in the literature and compare them to our current concepts for reconstructing defects caused by Fournier gangrene. The current literature and our own reconstructive methods for Fournier gangrene were analyzed. Fournier gangrene is an emergency requiring rapid, calculated antibiotic treatment and radical surgical debridement. After the acute phase of the disease, appropriate reconstructive methods are indicated. The planning of the reconstruction of the defect depends on many factors, especially functional and aesthetic demands. Scrotal reconstruction places higher aesthetic and functional demands on the reconstruction than perineal cutaneous wounds. In general, thorough wound hygiene, proper pre-operative planning, and careful consideration of the patient's demands are essential for successful reconstruction. In the literature, various methods for reconstruction after Fournier gangrene are described. Reconstruction with a flap is required for a good functional result in complex regions such as the scrotum and penis, while cutaneous wounds can be managed through skin grafting. Patient compliance and tissue demand are crucial factors in the decision-making process.
A Parallel Approach to Fractal Image Compression
Directory of Open Access Journals (Sweden)
Lubomir Dedera
2004-01-01
The paper deals with a parallel approach to coding and decoding algorithms in fractal image compression and presents experimental results comparing sequential and parallel algorithms in terms of both coding and decoding time and the effectiveness of parallelization.
Parallel Computing Using Web Servers and "Servlets".
Lo, Alfred; Bloor, Chris; Choi, Y. K.
2000-01-01
Describes parallel computing and presents inexpensive ways to implement a virtual parallel computer with multiple Web servers. Highlights include performance measurement of parallel systems; models for using Java and intranet technology including single server, multiple clients and multiple servers, single client; and a comparison of CGI (common…
An Introduction to Parallel Computation R
Indian Academy of Sciences (India)
How are they programmed? This article provides an introduction. A parallel computer is a network of processors built for ... and have been used to solve problems much faster than a single ... in parallel computer design is to select an organization which ..... The most ambitious approach to parallel computing is to develop.
Comparison of parallel viscosity with neoclassical theory
International Nuclear Information System (INIS)
Ida, K.; Nakajima, N.
1996-04-01
Toroidal rotation profiles are measured with charge exchange spectroscopy for plasmas heated with tangential NBI in the CHS heliotron/torsatron device to estimate the parallel viscosity. The parallel viscosity derived from the toroidal rotation velocity shows good agreement with the neoclassical parallel viscosity plus the perpendicular viscosity ($\mu_\perp = 2$ m$^2$/s). (author)
HeinzelCluster: accelerated reconstruction for FORE and OSEM3D.
Vollmar, S; Michel, C; Treffert, J T; Newport, D F; Casey, M; Knöss, C; Wienhard, K; Liu, X; Defrise, M; Heiss, W D
2002-08-07
Using iterative three-dimensional (3D) reconstruction techniques for reconstruction of positron emission tomography (PET) is not feasible on most single-processor machines due to the excessive computing time needed, especially so for the large sinogram sizes of our high-resolution research tomograph (HRRT). In our first approach to speed up reconstruction time we transform the 3D scan into the format of a two-dimensional (2D) scan with sinograms that can be reconstructed independently using Fourier rebinning (FORE) and a fast 2D reconstruction method. On our dedicated reconstruction cluster (seven four-processor systems, Intel PIII@700 MHz, switched fast ethernet and Myrinet, Windows NT Server), we process these 2D sinograms in parallel. We have achieved a speedup > 23 using 26 processors and also compared results for different communication methods (RPC, Syngo, Myrinet GM). The other approach is to parallelize OSEM3D (implementation of C Michel), which has produced the best results for HRRT data so far and is more suitable for an adequate treatment of the sinogram gaps that result from the detector geometry of the HRRT. We have implemented two levels of parallelization for our dedicated cluster (a shared memory fine-grain level on each node utilizing all four processors and a coarse-grain level allowing for 15 nodes), reducing the time for one core iteration from over 7 h to about 35 min.
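The figures quoted in this abstract can be put into the usual speedup and parallel-efficiency terms. The sketch below only restates the reported numbers (speedup above 23 on 26 processors; one OSEM3D iteration going from over 7 h to about 35 min); the function names are chosen for this example, and the rounded times are treated as exact for the arithmetic.

```python
def speedup(t_serial, t_parallel):
    """Ratio of serial to parallel run time (same time unit for both)."""
    return t_serial / t_parallel

def efficiency(measured_speedup, n_processors):
    """Fraction of ideal linear speedup achieved on n_processors."""
    return measured_speedup / n_processors

# FORE + 2D reconstruction: reported speedup > 23 on 26 processors.
fore_efficiency = efficiency(23.0, 26)      # lower bound, about 0.88

# OSEM3D: "over 7 h" down to "about 35 min", both expressed in minutes.
osem_speedup = speedup(7 * 60, 35)          # roughly a factor of 12
```

No efficiency is computed for the OSEM3D case, because the abstract does not say how many processors were in use when the 35 min figure was measured.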
Advances in randomized parallel computing
Rajasekaran, Sanguthevar
1999-01-01
The technique of randomization has been employed to solve numerous problems of computing both sequentially and in parallel. Examples of randomized algorithms that are asymptotically better than their deterministic counterparts in solving various fundamental problems abound. Randomized algorithms have the advantages of simplicity and better performance both in theory and often in practice. This book is a collection of articles written by renowned experts in the area of randomized parallel computing. A brief introduction to randomized algorithms: In the analysis of algorithms, at least three different measures of performance can be used: the best case, the worst case, and the average case. Often, the average case run time of an algorithm is much smaller than the worst case. For instance, the worst case run time of Hoare's quicksort is O(n^2), whereas its average case run time is only O(n log n). The average case analysis is conducted with an assumption on the input space. The assumption made to arrive at t...
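The quicksort example can be made concrete with a randomized variant, which replaces the assumption on the input space by a random choice inside the algorithm. The following is a minimal sketch written for this listing, not code from the book; it picks a uniformly random pivot and uses list partitioning rather than Hoare's original in-place scheme, and the function name and the optional `rng` parameter are assumptions of the example.

```python
import random

def randomized_quicksort(items, rng=None):
    """Return a sorted copy of items using a uniformly random pivot.

    The expected number of comparisons is O(n log n) for every input,
    because the randomness comes from the pivot choice, not the data.
    """
    if rng is None:
        rng = random.Random()
    if len(items) <= 1:
        return list(items)
    pivot = items[rng.randrange(len(items))]
    less = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    return randomized_quicksort(less, rng) + equal + randomized_quicksort(greater, rng)
```

With a fixed pivot rule, an adversarial input such as an already sorted list can force the quadratic worst case; with a random pivot, no single input is bad on every run.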
Xyce parallel electronic simulator design.
Energy Technology Data Exchange (ETDEWEB)
Thornquist, Heidi K.; Rankin, Eric Lamont; Mei, Ting; Schiek, Richard Louis; Keiter, Eric Richard; Russo, Thomas V.
2010-09-01
This document is the Xyce Circuit Simulator developer guide. Xyce has been designed from the 'ground up' to be a SPICE-compatible, distributed memory parallel circuit simulator. While it is in many respects a research code, Xyce is intended to be a production simulator. As such, having software quality engineering (SQE) procedures in place to ensure a high level of code quality and robustness is essential. Version control, issue tracking, customer support, C++ style guidelines and the Xyce release process are all described. The Xyce Parallel Electronic Simulator has been under development at Sandia since 1999. Historically, Xyce has mostly been funded by ASC, and the original focus of Xyce development has primarily been related to circuits for nuclear weapons. However, this has not been the only focus and it is expected that the project will diversify. Like many ASC projects, Xyce is a group development effort, which involves a number of researchers, engineers, scientists, mathematicians and computer scientists. In addition to diversity of background, a certain amount of staff turnover is to be expected on long-term projects, as people move on to different projects. As a result, it is very important that the project maintain high software quality standards. The point of this document is to formally document a number of the software quality practices followed by the Xyce team in one place. Also, it is hoped that this document will be a good source of information for new developers.
Online Plasma Shape Reconstruction for EAST Tokamak
International Nuclear Information System (INIS)
Luo Zhengping; Xiao Bingjia; Zhu Yingfei; Yang Fei
2010-01-01
An online plasma shape reconstruction, based on the offline version of the EFIT code and the MPI library, can be carried out between two adjacent shots in EAST. It combines online data acquisition, parallel calculation, and data storage. The program on the master node of the cluster detects the termination of the discharge promptly, reads diagnostic data from the EAST MDSplus server once data storage is complete, and writes the results to the EFIT MDSplus server after the calculation is finished. These processes run automatically on a nine-node IBM blade center. The total time elapsed ranges from about 1 second to several minutes, depending on the duration of the shot. With the results stored in the MDSplus server, it is convenient for operators and physicists to analyze the behavior of the plasma using visualization tools.
PDDP, A Data Parallel Programming Model
Directory of Open Access Journals (Sweden)
Karen H. Warren
1996-01-01
PDDP, the parallel data distribution preprocessor, is a data parallel programming model for distributed memory parallel computers. PDDP implements High Performance Fortran-compatible data distribution directives and parallelism expressed by the use of Fortran 90 array syntax, the FORALL statement, and the WHERE construct. Distributed data objects belong to a global name space; other data objects are treated as local and replicated on each processor. PDDP allows the user to program in a shared memory style and generates codes that are portable to a variety of parallel machines. For interprocessor communication, PDDP uses the fastest communication primitives on each platform.
Parallelization of quantum molecular dynamics simulation code
International Nuclear Information System (INIS)
Kato, Kaori; Kunugi, Tomoaki; Shibahara, Masahiko; Kotake, Susumu
1998-02-01
A quantum molecular dynamics simulation code has been developed at the Kansai Research Establishment for the analysis of the thermalization of photon energies in molecules or materials. The simulation code is parallelized for both a scalar massively parallel computer (Intel Paragon XP/S75) and a vector parallel computer (Fujitsu VPP300/12). Scalable speed-up has been obtained on both parallel computers by distributing particle groups to processor units. By distributing to processor units not only the particle groups but also the fine-grained per-particle calculations, high parallelization performance is achieved on the Intel Paragon XP/S75. (author)
Implementation and performance of parallelized elegant
International Nuclear Information System (INIS)
Wang, Y.; Borland, M.
2008-01-01
The program elegant is widely used for design and modeling of linacs for free-electron lasers and energy recovery linacs, as well as storage rings and other applications. As part of a multi-year effort, we have parallelized many aspects of the code, including single-particle dynamics, wakefields, and coherent synchrotron radiation. We report on the approach used for gradual parallelization, which proved very beneficial in getting parallel features into the hands of users quickly. We also report details of parallelization of collective effects. Finally, we discuss performance of the parallelized code in various applications.
Parallelization of 2-D lattice Boltzmann codes
International Nuclear Information System (INIS)
Suzuki, Soichiro; Kaburaki, Hideo; Yokokawa, Mitsuo.
1996-03-01
Lattice Boltzmann (LB) codes for simulating two-dimensional fluid flow were developed on the vector parallel computer Fujitsu VPP500 and the scalar parallel computer Intel Paragon XP/S. While a 2-D domain decomposition method is used for the scalar parallel LB code, a 1-D domain decomposition method is used for the vector parallel LB code so that it can be vectorized along the axis perpendicular to the direction of the decomposition. High parallel efficiencies are obtained: 95.1% for the vector parallel calculation on 16 processors with a 1152x1152 grid and 88.6% for the scalar parallel calculation on 100 processors with an 800x800 grid. Performance models are developed to analyze the performance of the LB codes. Our performance models show that the vector parallel code executes about one hundred times faster than the scalar parallel code with the same number of processors, up to 100 processors. We also analyze the scalability when the memory used on each processor element is kept at the maximum available. Our performance model predicts that the execution time of the vector parallel code increases by about 3% on 500 processors. Although 1-D domain decomposition in general has a drawback in interprocessor communication, the vector parallel LB code is still suitable for large-scale and/or high-resolution simulations. (author)
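The communication drawback of 1-D decomposition mentioned in this abstract can be illustrated by counting boundary cells. The sketch below is a simplified model written for this listing, not the authors' performance model: it assumes a one-cell-wide halo, an interior processor with neighbours on every side, a grid that divides evenly, and a 10x10 processor layout for the 100-processor case; it also ignores how many distribution functions are sent per cell.

```python
def halo_cells_1d(nx, ny, n_procs):
    """Cells exchanged per step by an interior strip of a 1-D decomposition.

    The grid is cut along y into n_procs strips; each strip has two
    neighbours and sends one full row of nx cells to each of them.
    """
    if n_procs < 2:
        return 0
    return 2 * nx

def halo_cells_2d(nx, ny, px, py):
    """Cells exchanged per step by an interior block of a px-by-py decomposition.

    Each block sends two edges of length nx // px and two of length ny // py.
    """
    return 2 * (nx // px) + 2 * (ny // py)

# Scalar code: 800x800 grid on 100 processors, assumed to be laid out as 10x10.
cells_2d = halo_cells_2d(800, 800, 10, 10)   # 320 cells per processor
cells_1d = halo_cells_1d(800, 800, 100)      # 1600 cells per processor
```

In this model the 1-D halo does not shrink as processors are added, while the 2-D halo does, which is the general drawback the abstract refers to; the 1-D layout keeps long contiguous rows, which is what the vector machine needs.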
Arkin, Ethem; Tekinerdogan, Bedir; Imre, Kayhan M.
2017-01-01
The need for high-performance computing together with the increasing trend from single processor to parallel computer architectures has leveraged the adoption of parallel computing. To benefit from parallel computing power, usually parallel algorithms are defined that can be mapped and executed
Experiences in Data-Parallel Programming
Directory of Open Access Journals (Sweden)
Terry W. Clark
1997-01-01
To efficiently parallelize a scientific application with a data-parallel compiler requires certain structural properties in the source program, and conversely, the absence of others. A recent parallelization effort of ours reinforced this observation and motivated this correspondence. Specifically, we have transformed a Fortran 77 version of GROMOS, a popular dusty-deck program for molecular dynamics, into Fortran D, a data-parallel dialect of Fortran. During this transformation we have encountered a number of difficulties that probably are neither limited to this particular application nor do they seem likely to be addressed by improved compiler technology in the near future. Our experience with GROMOS suggests a number of points to keep in mind when developing software that may at some time in its life cycle be parallelized with a data-parallel compiler. This note presents some guidelines for engineering data-parallel applications that are compatible with Fortran D or High Performance Fortran compilers.
Streaming for Functional Data-Parallel Languages
DEFF Research Database (Denmark)
Madsen, Frederik Meisner
In this thesis, we investigate streaming as a general solution to the space inefficiency commonly found in functional data-parallel programming languages. The data-parallel paradigm maps well to parallel SIMD-style hardware. However, the traditional fully materializing execution strategy...... by extending two existing data-parallel languages: NESL and Accelerate. In the extensions we map bulk operations to data-parallel streams that can evaluate fully sequential, fully parallel or anything in between. By a dataflow, piecewise parallel execution strategy, the runtime system can adjust to any target...... flattening necessitates all sub-computations to materialize at the same time. For example, naive n by n matrix multiplication requires n^3 space in NESL because the algorithm contains n^3 independent scalar multiplications. For large values of n, this is completely unacceptable. We address the problem...
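The space problem in the matrix multiplication example can be illustrated outside NESL or Accelerate. The sketch below uses plain Python generators as a stand-in for the streaming idea and is not the thesis's implementation: the n^3 scalar products are produced lazily and consumed in bounded chunks, so at most `chunk_size` products exist at once. The function names and the `chunk_size` parameter are assumptions of this example.

```python
from itertools import islice

def products(a, b):
    """Lazily yield (i, j, a[i][k] * b[k][j]) for every i, j, k."""
    n = len(a)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                yield i, j, a[i][k] * b[k][j]

def matmul_streamed(a, b, chunk_size=4):
    """Multiply square matrices while materializing only chunk_size products.

    Each chunk is a batch of independent multiplications, so in a
    data-parallel setting a chunk could be evaluated in parallel.
    """
    n = len(a)
    c = [[0] * n for _ in range(n)]
    stream = products(a, b)
    while True:
        chunk = list(islice(stream, chunk_size))
        if not chunk:
            break
        for i, j, value in chunk:
            c[i][j] += value
    return c
```

Setting `chunk_size` to 1 gives fully sequential evaluation and setting it to n^3 gives the fully materialized strategy, which mirrors the range between fully sequential and fully parallel that the abstract describes.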
Massively parallel diffuse optical tomography
Energy Technology Data Exchange (ETDEWEB)
Sandusky, John V.; Pitts, Todd A.
2017-09-05
Diffuse optical tomography systems and methods are described herein. In a general embodiment, the diffuse optical tomography system comprises a plurality of sensor heads, the plurality of sensor heads comprising respective optical emitter systems and respective sensor systems. A sensor head in the plurality of sensor heads is caused to act as an illuminator, such that its optical emitter system transmits a transillumination beam towards a portion of a sample. Other sensor heads in the plurality of sensor heads act as observers, detecting portions of the transillumination beam that radiate from the sample in the fields of view of the respective sensor systems of the other sensor heads. Thus, sensor heads in the plurality of sensor heads generate sensor data in parallel.
Embodied and Distributed Parallel DJing.
Cappelen, Birgitta; Andersson, Anders-Petter
2016-01-01
Everyone has a right to take part in cultural events and activities, such as music performances and music making. Enforcing that right, within Universal Design, is often limited to a focus on physical access to public areas, hearing aids etc., or groups of persons with special needs performing in traditional ways. The latter might be people with disabilities who are musicians playing traditional instruments, or actors performing theatre. In this paper we focus on the innovative potential of including people with special needs when creating new cultural activities. In our project RHYME our goal was to create health-promoting activities for children with severe disabilities by developing new musical and multimedia technologies. Because of the users' extreme demands and rich contribution, we ended up creating both a new genre of musical instruments and a new art form. We call this new art form Embodied and Distributed Parallel DJing, and the new genre of instruments Empowering Multi-Sensorial Things.
Device for balancing parallel strings
Mashikian, Matthew S.
1985-01-01
A battery plant is described which features magnetic circuit means in association with each of the battery strings in the battery plant for balancing the electrical current flow through the battery strings by equalizing the voltage across each of the battery strings. Each of the magnetic circuit means generally comprises means for sensing the electrical current flow through one of the battery strings, and a saturable reactor having a main winding connected electrically in series with the battery string, a bias winding connected to a source of alternating current and a control winding connected to a variable source of direct current controlled by the sensing means. Each of the battery strings is formed by a plurality of batteries connected electrically in series, and these battery strings are connected electrically in parallel across common bus conductors.
Photometric Lunar Surface Reconstruction
Nefian, Ara V.; Alexandrov, Oleg; Morattlo, Zachary; Kim, Taemin; Beyer, Ross A.
2013-01-01
Accurate photometric reconstruction of the Lunar surface is important in the context of upcoming NASA robotic missions to the Moon and in giving a more accurate understanding of the Lunar soil composition. This paper describes a novel approach for joint estimation of Lunar albedo, camera exposure time, and photometric parameters that utilizes an accurate Lunar-Lambertian reflectance model and previously derived Lunar topography of the area visualized during the Apollo missions. The method introduced here is used in creating the largest Lunar albedo map (16% of the Lunar surface) at the resolution of 10 meters/pixel.
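A Lunar-Lambertian reflectance model of the kind named in this abstract is commonly written as a weighted blend of a Lambertian term and a Lommel-Seeliger (lunar) term. The sketch below shows that commonly cited form as an illustration; it is an assumption that the paper uses exactly this expression, and in practice the blend weight usually depends on the phase angle, whereas here `weight_l` is a plain number supplied by the caller. All names are chosen for this example.

```python
import math

def lunar_lambert(albedo, incidence_deg, emission_deg, weight_l):
    """Reflected intensity under a lunar-Lambert blend.

    weight_l = 0 gives a pure Lambertian surface (albedo * cos(i));
    weight_l = 1 gives the pure lunar (Lommel-Seeliger) term.
    Angles are measured from the surface normal, in degrees.
    """
    mu0 = math.cos(math.radians(incidence_deg))   # cosine of incidence angle
    mu = math.cos(math.radians(emission_deg))     # cosine of emission angle
    lunar_term = 2.0 * weight_l * mu0 / (mu0 + mu)
    lambert_term = (1.0 - weight_l) * mu0
    return albedo * (lunar_term + lambert_term)
```

Given measured image intensities, known topography (which fixes the two angles per pixel), and an exposure time, a model like this can be inverted per pixel for the albedo, which is the quantity the abstract's map reports.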
Penile surgery and reconstruction.
Perovic, Sava V; Djordjevic, Miroslav L J; Kekic, Zoran K; Djakovic, Nenad G
2002-05-01
This review will highlight recent advances in the field of penile reconstructive surgery in the paediatric and adult population. It is based on the work published during the year 2001. Besides the anatomical and histological studies of the penis, major contributions have been described in congenital and acquired penile anomalies. Also, a few new techniques and modifications of old procedures are described in order to improve the final functional and aesthetic outcome. The techniques for penile enlargement present a trend in the new millennium, but are still at the stage of investigation.
ACTS: from ATLAS software towards a common track reconstruction software
Gumpert, C.; Salzburger, A.; Kiehn, M.; Hrdinka, J.; Calace, N.; ATLAS Collaboration
2017-10-01
Reconstruction of charged particles’ trajectories is a crucial task for most particle physics experiments. The high instantaneous luminosity achieved at the LHC leads to a high number of proton-proton collisions per bunch crossing, which has put the track reconstruction software of the LHC experiments through a thorough test. Preserving track reconstruction performance under increasingly difficult experimental conditions, while keeping the usage of computational resources at a reasonable level, is an inherent problem for many HEP experiments. Exploiting concurrent algorithms and using multivariate techniques for track identification are the primary strategies to achieve that goal. Starting from current ATLAS software, the ACTS project aims to encapsulate track reconstruction software into a generic, framework- and experiment-independent software package. It provides a set of high-level algorithms and data structures for performing track reconstruction tasks as well as fast track simulation. The software is developed with special emphasis on thread-safety to support parallel execution of the code and data structures are optimised for vectorisation to speed up linear algebra operations. The implementation is agnostic to the details of the detection technologies and magnetic field configuration which makes it applicable to many different experiments.
Fan-beam filtered-backprojection reconstruction without backprojection weight
International Nuclear Information System (INIS)
Dennerlein, Frank; Noo, Frederic; Hornegger, Joachim; Lauritsch, Guenter
2007-01-01
In this paper, we address the problem of two-dimensional image reconstruction from fan-beam data acquired along a full 2π scan. Conventional approaches that follow the filtered-backprojection (FBP) structure require a weighted backprojection with the weight depending on the point to be reconstructed and also on the source position; this weight appears only in the case of divergent beam geometries. Compared to reconstruction from parallel-beam data, the backprojection weight implies an increase in computational effort and is also thought to have some negative impacts on noise properties of the reconstructed images. We demonstrate here that direct FBP reconstruction from full-scan fan-beam data is possible with no backprojection weight. Using computer-simulated, realistic fan-beam data, we compared our novel FBP formula with no backprojection weight to the use of an FBP formula based on equal weighting of all data. Comparisons in terms of signal-to-noise ratio, spatial resolution and computational efficiency are presented. These studies show that the formula we suggest yields images with a reduced noise level, at almost identical spatial resolution. This effect increases quickly with the distance from the center of the field of view, from 0% at the center to 20% less noise at 20 cm, and to 40% less noise at 25 cm. Furthermore, the suggested method is computationally less demanding and reduces computation time with a gain that was found to vary between 12% and 43% on the computers used for evaluation
Generalized Fourier slice theorem for cone-beam image reconstruction.
Zhao, Shuang-Ren; Jiang, Dazong; Yang, Kevin; Yang, Kang
2015-01-01
Cone-beam reconstruction theory was proposed by Kirillov in 1961, Tuy in 1983, Feldkamp in 1984, Smith in 1985, and Pierre Grangeat in 1990. The Fourier slice theorem was proposed by Bracewell in 1956 and leads to the Fourier image reconstruction method for parallel-beam geometry. The Fourier slice theorem was extended to fan-beam geometry by Zhao in 1993 and 1995. By combining the above-mentioned cone-beam image reconstruction theory with the Fourier slice theory for fan-beam geometry, Zhao proposed the Fourier slice theorem in cone-beam geometry in a short conference publication in 1995. This article offers the details of the derivation and implementation of this Fourier slice theorem for cone-beam geometry. In particular, a problem of reconstruction from the Fourier domain has been overcome, namely that the value at the origin of Fourier space is of the form 0/0; this 0/0 limit is handled properly. As examples, the implementation results for the single-circle and two-perpendicular-circle source orbits are shown. If an interpolation process is used in the cone-beam reconstruction, the number of calculations for the generalized Fourier slice theorem algorithm is O(N^4), which is close to that of the filtered back-projection method; here N is the image size in one dimension. However, the interpolation process can be avoided, in which case the number of calculations is O(N^5).
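The classical parallel-beam Fourier slice theorem that this article generalizes can be checked numerically in a few lines. The sketch below covers only the discrete, zero-angle, parallel-beam case and is not the cone-beam algorithm of the article: it sums a small image along one axis and compares the 1-D DFT of that projection with the corresponding line of the 2-D DFT. The function names and the test image are made up for this example.

```python
import cmath

def dft1(signal):
    """Plain O(n^2) one-dimensional discrete Fourier transform."""
    n = len(signal)
    return [sum(signal[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def dft2_at(img, kx, ky):
    """Two-dimensional DFT of img[y][x] evaluated at one frequency (kx, ky)."""
    ny, nx = len(img), len(img[0])
    return sum(img[y][x] * cmath.exp(-2j * cmath.pi * (kx * x / nx + ky * y / ny))
               for y in range(ny) for x in range(nx))

def slice_theorem_holds(img, tol=1e-9):
    """Check that the DFT of the projection along y equals the ky = 0 slice."""
    nx = len(img[0])
    projection = [sum(row[x] for row in img) for x in range(nx)]
    spectrum = dft1(projection)
    return all(abs(spectrum[kx] - dft2_at(img, kx, 0)) < tol for kx in range(nx))

image = [[1.0, 2.0, 0.0, 4.0],
         [0.5, 3.0, 1.0, 0.0],
         [2.0, 0.0, 1.5, 1.0]]
```

Projections at other angles fill in other lines through the origin of Fourier space, which is why Fourier-domain reconstruction needs interpolation onto a regular grid; the abstract's operation counts with and without interpolation refer to that step.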
Progressive Reconstruction: A Methodology for Stabilization and Reconstruction Operations
National Research Council Canada - National Science Library
Rohr, Karl C
2006-01-01
... these nations in accordance with stated United States' goals. The argument follows closely current and developing United States military doctrine on stabilization, reconstruction, and counterinsurgency operations...
Linear parallel processing machines I
Energy Technology Data Exchange (ETDEWEB)
Von Kunze, M
1984-01-01
As is well known, non-context-free grammars for generating formal languages have a certain intrinsic computational power that presents serious difficulties for efficient parsing algorithms as well as for the development of an algebraic theory of context-sensitive languages. In this paper a framework is given for the investigation of the computational power of formal grammars, in order to start a thorough analysis of grammars consisting of derivation rules of the form $aB \to A_1 \ldots A_n b_1 \ldots b_m$. These grammars may be thought of as automata operating by means of parallel processing, if one considers the variables as operators acting on the terminals while reading them right-to-left. This kind of automaton and its 2-dimensional programming language prove to be useful by allowing a concise linear-time algorithm for integer multiplication. Linear parallel processing machines (LP-machines), which are, in their general form, equivalent to Turing machines, include finite automata and pushdown automata (with states encoded) as special cases. Bounded LP-machines yield deterministic accepting automata for nondeterministic context-free languages, and they define an interesting class of context-sensitive languages. A characterization of this class in terms of generating grammars is established by using derivation trees with crossings as a helpful tool. From the algebraic point of view, deterministic LP-machines are effectively represented semigroups with distinguished subsets. Concerning the dualism between generating and accepting devices of formal languages within the algebraic setting, the concept of accepting automata turns out to reduce essentially to embeddability in an effectively represented extension monoid, even in the classical cases.
Parallel computing in enterprise modeling.
Energy Technology Data Exchange (ETDEWEB)
Goldsby, Michael E.; Armstrong, Robert C.; Shneider, Max S.; Vanderveen, Keith; Ray, Jaideep; Heath, Zach; Allan, Benjamin A.
2008-08-01
This report presents the results of our efforts to apply high-performance computing to entity-based simulations with a multi-use plugin for parallel computing. We use the term 'entity-based simulation' to describe a class of simulation which includes both discrete event simulation and agent-based simulation. What simulations of this class share, and what distinguishes them from more traditional models, is that the result sought is emergent from a large number of contributing entities. Logistic, economic and social simulations are members of this class, in which things or people are organized or self-organize to produce a solution. Entity-based problems never have an a priori ergodic principle that will greatly simplify calculations. Because the results of entity-based simulations can only be realized at scale, scalable computing is de rigueur for large problems. Having said that, the absence of a spatial organizing principle makes the decomposition of the problem onto processors problematic. In addition, practitioners in this domain commonly use the Java programming language, which presents its own problems in a high-performance setting. The plugin we have developed, called the Parallel Particle Data Model, overcomes both of these obstacles and is now being used by two Sandia frameworks: the Decision Analysis Center and the Seldon social simulation facility. While the ability to engage U.S.-sized problems is now available to the Decision Analysis Center, this plugin is central to the success of Seldon. Because Seldon relies on computationally intensive cognitive sub-models, this work is necessary to achieve the scale required for realistic results. With the recent upheavals in the financial markets, and the inscrutability of terrorist activity, this simulation domain will likely need a capability with ever greater fidelity. High-performance computing will play an important part in enabling that greater fidelity.
Single-spin stochastic optical reconstruction microscopy.
Pfender, Matthias; Aslam, Nabeel; Waldherr, Gerald; Neumann, Philipp; Wrachtrup, Jörg
2014-10-14
We experimentally demonstrate precision addressing of single quantum emitters by combined optical microscopy and spin resonance techniques. To this end, we use nitrogen vacancy (NV) color centers in diamond confined within a few tens of nanometers as individually resolvable quantum systems. By developing a stochastic optical reconstruction microscopy (STORM) technique for NV centers, we are able to simultaneously perform sub-diffraction-limit imaging and optically detected spin resonance (ODMR) measurements on NV spins. This allows the assignment of spin resonance spectra to individual NV center locations with nanometer-scale resolution and thus further improves spatial discrimination. For example, we resolved formerly indistinguishable emitters by their spectra. Furthermore, ODMR spectra contain metrology information allowing for sub-diffraction-limit sensing of, for instance, magnetic or electric fields with inherently parallel data acquisition. As an example, we have detected nuclear spins with nanometer-scale precision. Finally, we give prospects of how this technique can evolve into a fully parallel quantum sensor for nanometer resolution imaging of delocalized quantum correlations.
Synchronized dynamic dose reconstruction
International Nuclear Information System (INIS)
Litzenberg, Dale W.; Hadley, Scott W.; Tyagi, Neelam; Balter, James M.; Ten Haken, Randall K.; Chetty, Indrin J.
2007-01-01
Variations in target volume position between and during treatment fractions can lead to measurable differences in the dose distribution delivered to each patient. Current methods to estimate the ongoing cumulative delivered dose distribution make idealized assumptions about individual patient motion based on average motions observed in a population of patients. In the delivery of intensity modulated radiation therapy (IMRT) with a multi-leaf collimator (MLC), errors are introduced in both the implementation and delivery processes. In addition, target motion and MLC motion can lead to dosimetric errors from interplay effects. All of these effects may be of clinical importance. Here we present a method to compute delivered dose distributions for each treatment beam and fraction, which explicitly incorporates synchronized real-time patient motion data and real-time fluence and machine configuration data. This synchronized dynamic dose reconstruction method properly accounts for the two primary classes of errors that arise from delivering IMRT with an MLC: (a) Interplay errors between target volume motion and MLC motion, and (b) Implementation errors, such as dropped segments, dose over/under shoot, faulty leaf motors, tongue-and-groove effect, rounded leaf ends, and communications delays. These reconstructed dose fractions can then be combined to produce high-quality determinations of the dose distribution actually received to date, from which individualized adaptive treatment strategies can be determined
Augusto, O
2012-01-01
The Large Hadron Collider (LHC) is the most powerful particle accelerator in the world. It has been designed to collide proton beams at a center-of-mass energy of up to 14 TeV. In 2011, data were taken at a center-of-mass energy of 7 TeV; the instantaneous luminosity reached values greater than $4 \times 10^{32}$ cm$^{-2}$ s$^{-1}$ and the integrated luminosity reached 1.02 fb$^{-1}$ at LHCb. Jet reconstruction is fundamental to observing events that can be used to test perturbative QCD (pQCD). It also provides a way to observe standard model channels and to search for new physics such as SUSY. The anti-kt algorithm is a jet reconstruction algorithm based on the distance between particles in the $\eta \times \phi$ space and on the transverse momentum of the particles. To maximize the energy resolution all information about the trackers and the calo...
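As a brief illustration of the distance measure mentioned in this abstract, the anti-kt algorithm is commonly written with the following pairwise and beam distances. The symbols ($d_{ij}$, $d_{iB}$, the radius parameter $R$) follow the usual convention and are not taken from the abstract; the standard definition uses rapidity $y$ where the abstract speaks of $\eta$.

```latex
d_{ij} = \min\!\left(p_{T,i}^{-2},\, p_{T,j}^{-2}\right) \frac{\Delta R_{ij}^{2}}{R^{2}},
\qquad
\Delta R_{ij}^{2} = (\eta_i - \eta_j)^{2} + (\phi_i - \phi_j)^{2},
\qquad
d_{iB} = p_{T,i}^{-2}
```

At each step the smallest of all $d_{ij}$ and $d_{iB}$ is found: if it is a $d_{ij}$, particles $i$ and $j$ are merged; if it is a $d_{iB}$, object $i$ is declared a jet and removed from the list.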
Three-dimensional ICT reconstruction
International Nuclear Information System (INIS)
Zhang Aidong; Li Ju; Chen Fa; Sun Lingxia
2005-01-01
Three-dimensional ICT reconstruction is a hot topic in recent ICT technology research. In this work, satisfactory three-dimensional ICT images are obtained by stacking multiple two-dimensional slice images in order, combined with thresholding and linear interpolation. Views of the reconstructed volume in different directions and at different positions are obtained by rotation and sectioning, respectively. This convenient and fast method is instructive for more complicated three-dimensional reconstruction of ICT images. (authors)
Three-dimensional ICT reconstruction
International Nuclear Information System (INIS)
Zhang Aidong; Li Ju; Chen Fa; Sun Lingxia
2004-01-01
Three-dimensional ICT reconstruction is a hot topic in recent ICT technology research. In this work, satisfactory three-dimensional ICT images are obtained by stacking multiple two-dimensional slice images in order, combined with thresholding and linear interpolation. Views of the reconstructed volume in different directions and at different positions are obtained by rotation and sectioning, respectively. This convenient and fast method is instructive for more complicated three-dimensional reconstruction of ICT images. (authors)
General surface reconstruction for cone-beam multislice spiral computed tomography
International Nuclear Information System (INIS)
Chen Laigao; Liang Yun; Heuscher, Dominic J.
2003-01-01
A new family of cone-beam reconstruction algorithms, the General Surface Reconstruction (GSR), is proposed and formulated in this paper for multislice spiral computed tomography (CT) reconstructions. It provides a general framework to allow the reconstruction of planar or nonplanar surfaces on a set of rebinned short-scan parallel beam projection data. An iterative surface formation method is proposed as an example to show the possibility to form nonplanar reconstruction surfaces to minimize the adverse effect between the collected cone-beam projection data and the reconstruction surfaces. The improvement in accuracy of the nonplanar surfaces over planar surfaces in the two-dimensional approximate cone-beam reconstructions is mathematically proved and demonstrated using numerical simulations. The proposed GSR algorithm is evaluated by the computer simulation of cone-beam spiral scanning geometry and various mathematical phantoms. The results demonstrate that the GSR algorithm generates much better image quality compared to conventional multislice reconstruction algorithms. For a table speed up to 100 mm per rotation, GSR demonstrates good image quality for both the low-contrast ball phantom and thorax phantom. All other performance parameters are comparable to the single-slice 180 deg. LI (linear interpolation) algorithm, which is considered the 'gold standard'. GSR also achieves high computing efficiency and good temporal resolution, making it an attractive alternative for the reconstruction of next generation multislice spiral CT data.
Compiler Technology for Parallel Scientific Computation
Directory of Open Access Journals (Sweden)
Can Özturan
1994-01-01
There is a need for compiler technology that, given the source program, will generate efficient parallel codes for different architectures with minimal user involvement. Parallel computation is becoming indispensable in solving large-scale problems in science and engineering. Yet, the use of parallel computation is limited by the high costs of developing the needed software. To overcome this difficulty we advocate a comprehensive approach to the development of scalable architecture-independent software for scientific computation based on our experience with equational programming language (EPL). Our approach is based on a program decomposition, parallel code synthesis, and run-time support for parallel scientific computation. The program decomposition is guided by the source program annotations provided by the user. The synthesis of parallel code is based on configurations that describe the overall computation as a set of interacting components. Run-time support is provided by the compiler-generated code that redistributes computation and data during object program execution. The generated parallel code is optimized using techniques of data alignment, operator placement, wavefront determination, and memory optimization. In this article we discuss annotations, configurations, parallel code generation, and run-time support suitable for parallel programs written in the functional parallel programming language EPL and in Fortran.
Computer-Aided Parallelizer and Optimizer
Jin, Haoqiang
2011-01-01
The Computer-Aided Parallelizer and Optimizer (CAPO) automates the insertion of compiler directives (see figure) to facilitate parallel processing on Shared Memory Parallel (SMP) machines. While CAPO currently is integrated seamlessly into CAPTools (developed at the University of Greenwich, now marketed as ParaWise), CAPO was independently developed at Ames Research Center as one of the components for the Legacy Code Modernization (LCM) project. The current version takes serial FORTRAN programs, performs interprocedural data dependence analysis, and generates OpenMP directives. Due to the widely supported OpenMP standard, the generated OpenMP codes have the potential to run on a wide range of SMP machines. CAPO relies on accurate interprocedural data dependence information currently provided by CAPTools. Compiler directives are generated through identification of parallel loops in the outermost level, construction of parallel regions around parallel loops and optimization of parallel regions, and insertion of directives with automatic identification of private, reduction, induction, and shared variables. Attempts also have been made to identify potential pipeline parallelism (implemented with point-to-point synchronization). Although directives are generated automatically, user interaction with the tool is still important for producing good parallel codes. A comprehensive graphical user interface is included for users to interact with the parallelization process.
A tomograph VMEbus parallel processing data acquisition system
International Nuclear Information System (INIS)
Wilkinson, N.A.; Rogers, J.G.; Atkins, M.S.
1989-01-01
This paper describes a VME based data acquisition system suitable for the development of Positron Volume Imaging tomographs which use 3-D data for improved image resolution over slice-oriented tomographs. The data acquisition must be flexible enough to accommodate several 3-D reconstruction algorithms; hence, a software-based system is most suitable. Furthermore, because of the increased dimensions and resolution of volume imaging tomographs, the raw data event rate is greater than that of slice-oriented machines. These dual requirements are met by our data acquisition system. Flexibility is achieved through an array of processors connected over a VMEbus, operating asynchronously and in parallel. High raw data throughput is achieved using a dedicated high speed data transfer device available for the VMEbus. The device can attain a raw data rate of 2.5 million coincidence events per second for raw events which are 64 bits wide.
Virtual 3-D Facial Reconstruction
Directory of Open Access Journals (Sweden)
Martin Paul Evison
2000-06-01
Facial reconstructions in archaeology allow empathy with people who lived in the past and enjoy considerable popularity with the public. It is a common misconception that facial reconstruction will produce an exact likeness; a resemblance is the best that can be hoped for. Research at Sheffield University is aimed at the development of a computer system for facial reconstruction that will be accurate, rapid, repeatable, accessible and flexible. This research is described and prototypical 3-D facial reconstructions are presented. Interpolation models simulating obesity, ageing and ethnic affiliation are also described. Some strengths and weaknesses in the models, and their potential for application in archaeology are discussed.
Entropy and transverse section reconstruction
International Nuclear Information System (INIS)
Gullberg, G.T.
1976-01-01
A new approach to the reconstruction of a transverse section using projection data from multiple views incorporates the concept of maximum entropy. The principle of maximizing information entropy embodies the assurance of minimizing bias or prejudice in the reconstruction. Using maximum entropy is a necessary condition for the reconstructed image. This entropy criterion is most appropriate for 3-D reconstruction of objects from projections where the system is underdetermined or the data are limited statistically. This is the case in nuclear medicine, where time limitations in patient studies do not yield sufficient projections.
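For orientation, a maximum-entropy reconstruction of this kind is usually posed as a constrained optimization. The formulation below is a generic sketch, not necessarily the one used in the report; the symbols $f_j$ (image values), $p_i$ (measured projections) and $a_{ij}$ (projection weights) are assumptions introduced here for illustration.

```latex
\max_{f \ge 0} \; H(f) = -\sum_{j} f_j \ln f_j
\quad \text{subject to} \quad
\sum_{j} a_{ij} f_j = p_i \quad \text{for all } i .
```

Introducing Lagrange multipliers $\lambda_i$ for the projection constraints and setting the derivative with respect to $f_j$ to zero gives $f_j = \exp\!\left(-1 - \sum_i \lambda_i a_{ij}\right)$, so the solution is positive by construction, which is one reason the criterion suits underdetermined problems.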
Comparison of phase-constrained parallel MRI approaches: Analogies and differences.
Blaimer, Martin; Heim, Marius; Neumann, Daniel; Jakob, Peter M; Kannengiesser, Stephan; Breuer, Felix A
2016-03-01
Phase-constrained parallel MRI approaches have the potential for significantly improving the image quality of accelerated MRI scans. The purpose of this study was to investigate the properties of two different phase-constrained parallel MRI formulations, namely the standard phase-constrained approach and the virtual conjugate coil (VCC) concept utilizing conjugate k-space symmetry. Both formulations were combined with image-domain algorithms (SENSE) and a mathematical analysis was performed. Furthermore, the VCC concept was combined with k-space algorithms (GRAPPA and ESPIRiT) for image reconstruction. In vivo experiments were conducted to illustrate analogies and differences between the individual methods. Furthermore, a simple method of improving the signal-to-noise ratio by modifying the sampling scheme was implemented. For SENSE, the VCC concept was mathematically equivalent to the standard phase-constrained formulation and therefore yielded identical results. In conjunction with k-space algorithms, the VCC concept provided more robust results when only a limited amount of calibration data were available. Additionally, VCC-GRAPPA reconstructed images provided spatial phase information with full resolution. Although both phase-constrained parallel MRI formulations are very similar conceptually, there exist important differences between image-domain and k-space domain reconstructions regarding the calibration robustness and the availability of high-resolution phase information. © 2015 Wiley Periodicals, Inc.
Learning Joint-Sparse Codes for Calibration-Free Parallel MR Imaging.
Wang, Shanshan; Tan, Sha; Gao, Yuan; Liu, Qiegen; Ying, Leslie; Xiao, Taohui; Liu, Yuanyuan; Liu, Xin; Zheng, Hairong; Liang, Dong
2018-01-01
The integration of compressed sensing and parallel imaging (CS-PI) has become increasingly popular in recent years for accelerating magnetic resonance (MR) imaging. Among CS-PI methods, calibration-free techniques have shown encouraging performance owing to their capability to handle the sensitivity information robustly. Unfortunately, existing calibration-free methods have only explored joint-sparsity with direct analysis transform projections. To further exploit joint-sparsity and improve reconstruction accuracy, this paper proposes to Learn joINt-sparse coDes for caliBration-free parallEl mR imaGing (LINDBERG) by modeling the parallel MR imaging problem as an $\ell_2$-$\ell_F$-$\ell_{2,1}$ minimization objective with an $\ell_2$ norm constraining data fidelity, Frobenius norm enforcing sparse representation error and the mixed $\ell_{2,1}$ norm triggering joint sparsity across multichannels. A corresponding algorithm has been developed to alternately update the sparse representation, sensitivity encoded images and K-space data. Then, the final image is produced as the square root of sum of squares of all channel images. Experimental results on both physical phantom and in vivo data sets show that the proposed method is comparable or even superior to state-of-the-art CS-PI reconstruction approaches. Specifically, LINDBERG has presented strong capability in suppressing noise and artifacts while reconstructing MR images from highly undersampled multichannel measurements.
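The final combination step named in this abstract, the square root of the sum of squares over all channel images, can be sketched as follows. This is a minimal pure-Python illustration; the function name `rss_combine` and the nested-list image layout are assumptions made here, not part of the LINDBERG method.

```python
import math


def rss_combine(channel_images):
    """Combine per-channel images into one magnitude image.

    channel_images: list of 2-D nested lists (one per coil channel),
    all with the same shape, holding complex or real pixel values.
    Returns a 2-D nested list of floats.
    """
    rows = len(channel_images[0])
    cols = len(channel_images[0][0])
    # Accumulate the squared magnitude of every channel, pixel by pixel.
    acc = [[0.0] * cols for _ in range(rows)]
    for img in channel_images:
        for r in range(rows):
            for c in range(cols):
                acc[r][c] += abs(img[r][c]) ** 2
    # Take the square root of the accumulated sum.
    return [[math.sqrt(v) for v in row] for row in acc]


if __name__ == "__main__":
    # Two single-pixel channels with magnitudes 3 and 4.
    print(rss_combine([[[3 + 0j]], [[4j]]]))
```

Because only magnitudes enter the sum, this combination needs no coil sensitivity maps, which fits a calibration-free pipeline.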
Lee, Wei-Po; Hsiao, Yu-Ting; Hwang, Wei-Che
2014-01-16
To improve the tedious task of reconstructing gene networks through testing experimentally the possible interactions between genes, it becomes a trend to adopt the automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks by the evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, it is advocated to adopt the mechanism of cloud computing as a promising solution: most popular is the method of MapReduce programming model, a fault-tolerant framework to implement parallel algorithms for inferring large gene networks. This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use a well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation. By coupling the parallel model population-based optimization method and the parallel computational framework, high
Advances in non-Cartesian parallel magnetic resonance imaging using the GRAPPA operator
Energy Technology Data Exchange (ETDEWEB)
Seiberlich, Nicole
2008-07-21
This thesis has presented several new non-Cartesian parallel imaging methods which simplify both gridding and the reconstruction of images from undersampled data. A novel approach which uses the concepts of parallel imaging to grid data sampled along a non-Cartesian trajectory called GRAPPA Operator Gridding (GROG) is described. GROG shifts any acquired k-space data point to its nearest Cartesian location, thereby converting non-Cartesian to Cartesian data. The only requirements for GROG are a multi-channel acquisition and a calibration dataset for the determination of the GROG weights. Then an extension of GRAPPA Operator Gridding, namely Self-Calibrating GRAPPA Operator Gridding (SC-GROG) is discussed. SC-GROG is a method by which non-Cartesian data can be gridded using spatial information from a multi-channel coil array without the need for an additional calibration dataset, as required in standard GROG. Although GROG can be used to grid undersampled datasets, it is important to note that this method uses parallel imaging only for gridding, and not to reconstruct artifact-free images from undersampled data. Thereafter a simple, novel method for performing modified Cartesian GRAPPA reconstructions on undersampled non-Cartesian k-space data gridded using GROG to arrive at a non-aliased image is introduced. Because the undersampled non-Cartesian data cannot be reconstructed using a single GRAPPA kernel, several Cartesian patterns are selected for the reconstruction. Finally a novel method of using GROG to mimic the bunched phase encoding acquisition (BPE) scheme is discussed. In MRI, it is generally assumed that an artifact-free image can be reconstructed only from sampled points which fulfill the Nyquist criterion. However, the BPE reconstruction is based on the Generalized Sampling Theorem of Papoulis, which states that a continuous signal can be reconstructed from sampled points as long as the points are on average sampled at the Nyquist frequency. A novel
Advances in non-Cartesian parallel magnetic resonance imaging using the GRAPPA operator
International Nuclear Information System (INIS)
Seiberlich, Nicole
2008-01-01
This thesis has presented several new non-Cartesian parallel imaging methods which simplify both gridding and the reconstruction of images from undersampled data. A novel approach which uses the concepts of parallel imaging to grid data sampled along a non-Cartesian trajectory called GRAPPA Operator Gridding (GROG) is described. GROG shifts any acquired k-space data point to its nearest Cartesian location, thereby converting non-Cartesian to Cartesian data. The only requirements for GROG are a multi-channel acquisition and a calibration dataset for the determination of the GROG weights. Then an extension of GRAPPA Operator Gridding, namely Self-Calibrating GRAPPA Operator Gridding (SC-GROG) is discussed. SC-GROG is a method by which non-Cartesian data can be gridded using spatial information from a multi-channel coil array without the need for an additional calibration dataset, as required in standard GROG. Although GROG can be used to grid undersampled datasets, it is important to note that this method uses parallel imaging only for gridding, and not to reconstruct artifact-free images from undersampled data. Thereafter a simple, novel method for performing modified Cartesian GRAPPA reconstructions on undersampled non-Cartesian k-space data gridded using GROG to arrive at a non-aliased image is introduced. Because the undersampled non-Cartesian data cannot be reconstructed using a single GRAPPA kernel, several Cartesian patterns are selected for the reconstruction. Finally a novel method of using GROG to mimic the bunched phase encoding acquisition (BPE) scheme is discussed. In MRI, it is generally assumed that an artifact-free image can be reconstructed only from sampled points which fulfill the Nyquist criterion. However, the BPE reconstruction is based on the Generalized Sampling Theorem of Papoulis, which states that a continuous signal can be reconstructed from sampled points as long as the points are on average sampled at the Nyquist frequency. A novel
Influence of rebinning on the reconstructed resolution of fan-beam SPECT
International Nuclear Information System (INIS)
Koole, M.; D'Asseler, Y.; Staelens, S.; Vandenberghe, S.; Eede, I. van den; Walle, R. van de; Lemahieu, I.
2002-01-01
Aim: Fan-beam projection data can be rebinned to a parallel-beam geometry. This rebinning operation allows these data to be reconstructed with algorithms for parallel-beam projection data. The advantage of such an operation is that a dedicated projection/backprojection step for fan-beam geometry doesn't need to be developed. In clinical practice bilinear interpolation is often used for this rebinning operation. The aim of this study is to investigate the influence of the rebinning operation on the resolution properties of the reconstructed SPECT-image. Materials and methods: We have simulated the resolution properties of a fan-beam collimator, used in clinical routine, by means of a dedicated projector operation which models the distance dependent sensitivity and resolution of the collimator. With this projector, we generated noise-free sinograms for a point source located at various distances from the center of rotation. The number of angles of these sinograms varied from 60 to 180, corresponding to a step angle of 6 to 2 degrees. These generated fan-beam projection data were reconstructed directly with a filtered backprojection algorithm for fan-beam projection data, which consists of weighting and filtering the projection data with a ramp filter and of a weighted backprojection. Next, the generated fan-beam projection data were rebinned by means of bilinear interpolation and reconstructed with standard filtered backprojection for parallel-beam data. A two-dimensional Gaussian was fitted to the two point sources, one reconstructed with FBP for fan-beam and one reconstructed with FBP for parallel-beam after rebinning, yielding an estimate for the reconstructed Full Width at Half Maximum (FWHM) in the radial and tangential direction, for different locations in the field of view. Results: Results show little difference in resolution degradation in the radial direction between direct reconstruction and reconstruction after rebinning. However, significant loss in
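The rebinning operation described in this abstract can be sketched as below. The sketch assumes an equiangular fan geometry in which a ray at focal-point angle `beta` and fan angle `gamma` corresponds to the parallel ray with `theta = beta + gamma` and `s = D * sin(gamma)`. The sign convention, the function names and the sampling parameters (`d_beta`, `gamma_min`, `d_gamma`) are illustrative assumptions, not details of the study, and a full 360 degree acquisition is assumed so that the view index wraps around.

```python
import math


def parallel_to_fan(theta, s, focal_distance):
    """Fan-beam coordinates (beta, gamma) of the parallel ray (theta, s)."""
    gamma = math.asin(s / focal_distance)
    beta = theta - gamma
    return beta, gamma


def rebin_sample(fan_sino, theta, s, focal_distance, d_beta, gamma_min, d_gamma):
    """Bilinearly interpolate fan_sino[view][bin] at the parallel ray (theta, s).

    fan_sino: views sampled at beta = k * d_beta over a full rotation,
    bins sampled at gamma = gamma_min + m * d_gamma.
    """
    beta, gamma = parallel_to_fan(theta, s, focal_distance)
    n_views = len(fan_sino)
    n_bins = len(fan_sino[0])
    # Fractional indices: views wrap around, bins are clamped to the detector.
    bi = (beta / d_beta) % n_views
    gi = min(max((gamma - gamma_min) / d_gamma, 0.0), n_bins - 1)
    b_floor, g0 = int(bi), int(gi)
    fb, fg = bi - b_floor, gi - g0
    b0 = b_floor % n_views
    b1 = (b0 + 1) % n_views
    g1 = min(g0 + 1, n_bins - 1)
    # Interpolate along the bin axis first, then along the view axis.
    low = fan_sino[b0][g0] * (1 - fg) + fan_sino[b0][g1] * fg
    high = fan_sino[b1][g0] * (1 - fg) + fan_sino[b1][g1] * fg
    return low * (1 - fb) + high * fb
```

Each rebinned sample is a weighted average of four neighbouring fan-beam samples, which acts as a mild low-pass filter; this is the mechanism through which rebinning can affect the reconstructed resolution.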
Parallel processing for fluid dynamics applications
International Nuclear Information System (INIS)
Johnson, G.M.
1989-01-01
The impact of parallel processing on computational science and, in particular, on computational fluid dynamics is growing rapidly. In this paper, particular emphasis is given to developments which have occurred within the past two years. Parallel processing is defined and the reasons for its importance in high-performance computing are reviewed. Parallel computer architectures are classified according to the number and power of their processing units, their memory, and the nature of their connection scheme. Architectures which show promise for fluid dynamics applications are emphasized. Fluid dynamics problems are examined for parallelism inherent at the physical level. CFD algorithms and their mappings onto parallel architectures are discussed. Several examples are presented to document the performance of fluid dynamics applications on present-generation parallel processing devices.
Design considerations for parallel graphics libraries
Crockett, Thomas W.
1994-01-01
Applications which run on parallel supercomputers are often characterized by massive datasets. Converting these vast collections of numbers to visual form has proven to be a powerful aid to comprehension. For a variety of reasons, it may be desirable to provide this visual feedback at runtime. One way to accomplish this is to exploit the available parallelism to perform graphics operations in place. In order to do this, we need appropriate parallel rendering algorithms and library interfaces. This paper provides a tutorial introduction to some of the issues which arise in designing parallel graphics libraries and their underlying rendering algorithms. The focus is on polygon rendering for distributed memory message-passing systems. We illustrate our discussion with examples from PGL, a parallel graphics library which has been developed on the Intel family of parallel systems.
Super-Resolution Image Reconstruction Applied to Medical Ultrasound
Ellis, Michael
Ultrasound is the preferred imaging modality for many diagnostic applications due to its real-time image reconstruction and low cost. Nonetheless, conventional ultrasound is not used in many applications because of limited spatial resolution and soft tissue contrast. Most commercial ultrasound systems reconstruct images using a simple delay-and-sum architecture on receive, which is fast and robust but does not utilize all information available in the raw data. Recently, more sophisticated image reconstruction methods have been developed that make use of far more information in the raw data to improve resolution and contrast. One such method is the Time-Domain Optimized Near-Field Estimator (TONE), which employs a maximum a priori estimation to solve a highly underdetermined problem, given a well-defined system model. TONE has been shown to significantly improve both the contrast and resolution of ultrasound images when compared to conventional methods. However, TONE's lack of robustness to variations from the system model and extremely high computational cost hinder it from being readily adopted in clinical scanners. This dissertation aims to reduce the impact of TONE's shortcomings, transforming it from an academic construct to a clinically viable image reconstruction algorithm. By altering the system model from a collection of individual hypothetical scatterers to a collection of weighted, diffuse regions, dTONE is able to achieve much greater robustness to modeling errors. A method for efficient parallelization of dTONE is presented that reduces reconstruction time by more than an order of magnitude with little loss in image fidelity. An alternative reconstruction algorithm, called qTONE, is also developed and is able to reduce reconstruction times by another two orders of magnitude while simultaneously improving image contrast. Each of these methods for improving TONE is presented, their limitations are explored, and all are used in concert to reconstruct in
Practical implementation of tetrahedral mesh reconstruction in emission tomography
Boutchko, R.; Sitek, A.; Gullberg, G. T.
2013-05-01
This paper presents a practical implementation of image reconstruction on tetrahedral meshes optimized for emission computed tomography with parallel beam geometry. Tetrahedral mesh built on a point cloud is a convenient image representation method, intrinsically three-dimensional and with a multi-level resolution property. Image intensities are defined at the mesh nodes and linearly interpolated inside each tetrahedron. For the given mesh geometry, the intensities can be computed directly from tomographic projections using iterative reconstruction algorithms with a system matrix calculated using an exact analytical formula. The mesh geometry is optimized for a specific patient using a two stage process. First, a noisy image is reconstructed on a finely-spaced uniform cloud. Then, the geometry of the representation is adaptively transformed through boundary-preserving node motion and elimination. Nodes are removed in constant intensity regions, merged along the boundaries, and moved in the direction of the mean local intensity gradient in order to provide higher node density in the boundary regions. Attenuation correction and detector geometric response are included in the system matrix. Once the mesh geometry is optimized, it is used to generate the final system matrix for ML-EM reconstruction of node intensities and for visualization of the reconstructed images. In dynamic PET or SPECT imaging, the system matrix generation procedure is performed using a quasi-static sinogram, generated by summing projection data from multiple time frames. This system matrix is then used to reconstruct the individual time frame projections. Performance of the new method is evaluated by reconstructing simulated projections of the NCAT phantom and the method is then applied to dynamic SPECT phantom and patient studies and to a dynamic microPET rat study. Tetrahedral mesh-based images are compared to the standard voxel-based reconstruction for both high and low signal-to-noise ratio
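The ML-EM reconstruction of node intensities mentioned in this abstract follows the standard multiplicative update. The sketch below is a generic dense-matrix version for illustration only; the function name `mlem`, the nested-list system matrix and the uniform starting image are assumptions, and the paper's analytically computed tetrahedral-mesh system matrix is not reproduced here.

```python
def mlem(system_matrix, projections, n_iter=10):
    """Generic ML-EM iteration.

    system_matrix[i][j]: contribution of node (or voxel) j to projection bin i.
    projections[i]: measured counts in bin i.
    Returns the estimated intensities as a list of floats.
    """
    n_bins = len(system_matrix)
    n_nodes = len(system_matrix[0])
    # Sensitivity of each node: column sums of the system matrix.
    sens = [sum(system_matrix[i][j] for i in range(n_bins)) for j in range(n_nodes)]
    f = [1.0] * n_nodes  # uniform, strictly positive starting image
    for _ in range(n_iter):
        # Forward-project the current estimate.
        fwd = [sum(system_matrix[i][j] * f[j] for j in range(n_nodes))
               for i in range(n_bins)]
        # Ratio of measured to estimated projections (0 where undefined).
        ratio = [projections[i] / fwd[i] if fwd[i] > 0 else 0.0
                 for i in range(n_bins)]
        # Back-project the ratio and apply the multiplicative update.
        f = [
            f[j] / sens[j] * sum(system_matrix[i][j] * ratio[i] for i in range(n_bins))
            if sens[j] > 0 else 0.0
            for j in range(n_nodes)
        ]
    return f


if __name__ == "__main__":
    # With an identity system matrix the estimate equals the data.
    print(mlem([[1, 0], [0, 1]], [3.0, 5.0]))
```

Because the update is multiplicative, a positive starting image stays non-negative, and the same loop applies whether the unknowns are voxel values or mesh-node intensities; only the system matrix changes.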
Practical implementation of tetrahedral mesh reconstruction in emission tomography
International Nuclear Information System (INIS)
Boutchko, R; Gullberg, G T; Sitek, A
2013-01-01
This paper presents a practical implementation of image reconstruction on tetrahedral meshes optimized for emission computed tomography with parallel beam geometry. Tetrahedral mesh built on a point cloud is a convenient image representation method, intrinsically three-dimensional and with a multi-level resolution property. Image intensities are defined at the mesh nodes and linearly interpolated inside each tetrahedron. For the given mesh geometry, the intensities can be computed directly from tomographic projections using iterative reconstruction algorithms with a system matrix calculated using an exact analytical formula. The mesh geometry is optimized for a specific patient using a two stage process. First, a noisy image is reconstructed on a finely-spaced uniform cloud. Then, the geometry of the representation is adaptively transformed through boundary-preserving node motion and elimination. Nodes are removed in constant intensity regions, merged along the boundaries, and moved in the direction of the mean local intensity gradient in order to provide higher node density in the boundary regions. Attenuation correction and detector geometric response are included in the system matrix. Once the mesh geometry is optimized, it is used to generate the final system matrix for ML-EM reconstruction of node intensities and for visualization of the reconstructed images. In dynamic PET or SPECT imaging, the system matrix generation procedure is performed using a quasi-static sinogram, generated by summing projection data from multiple time frames. This system matrix is then used to reconstruct the individual time frame projections. Performance of the new method is evaluated by reconstructing simulated projections of the NCAT phantom and the method is then applied to dynamic SPECT phantom and patient studies and to a dynamic microPET rat study. Tetrahedral mesh-based images are compared to the standard voxel-based reconstruction for both high and low signal-to-noise ratio
Direct Fourier method reconstruction based on unequally spaced fast Fourier transform
International Nuclear Information System (INIS)
Wu Xiaofeng; Zhao Ming; Liu Li
2003-01-01
First, we give an Unequally Spaced Fast Fourier Transform (USFFT) method, which is more exact and theoretically more comprehensible than its former counterpart. Then, with an interesting interpolation scheme, we discuss how to apply USFFT to Direct Fourier Method (DFM) reconstruction of parallel projection data. Finally, the result of a simulation experiment is given. (authors)
Using next-generation sequencing for molecular reconstruction of past Arctic vegetation and climate
DEFF Research Database (Denmark)
Sønstebø, J. H.; Gielly, L.; Brysting, A. K.
2010-01-01
is demonstrated by high-throughput parallel pyrosequencing of permafrost-preserved DNA and reconstruction of two plant communities from the last glacial period. Our approach opens new possibilities for DNA-based assessment of ancient as well as modern biodiversity of many groups of organisms using environmental...
Synchronization Techniques in Parallel Discrete Event Simulation
Lindén, Jonatan
2018-01-01
Discrete event simulation is an important tool for evaluating system models in many fields of science and engineering. To improve the performance of large-scale discrete event simulations, several techniques to parallelize discrete event simulation have been developed. In parallel discrete event simulation, the work of a single discrete event simulation is distributed over multiple processing elements. A key challenge in parallel discrete event simulation is to ensure that causally dependent ...
Parallel processing from applications to systems
Moldovan, Dan I
1993-01-01
This text provides one of the broadest presentations of parallel processing available, including the structure of parallel processors and parallel algorithms. The emphasis is on mapping algorithms to highly parallel computers, with extensive coverage of array and multiprocessor architectures. Early chapters provide insightful coverage on the analysis of parallel algorithms and program transformations, effectively integrating a variety of material previously scattered throughout the literature. Theory and practice are well balanced across diverse topics in this concise presentation. For exceptional cla
Parallel processing for artificial intelligence 1
Kanal, LN; Kumar, V; Suttner, CB
1994-01-01
Parallel processing for AI problems is of great current interest because of its potential for alleviating the computational demands of AI procedures. The articles in this book consider parallel processing for problems in several areas of artificial intelligence: image processing, knowledge representation in semantic networks, production rules, mechanization of logic, constraint satisfaction, parsing of natural language, data filtering and data mining. The publication is divided into six sections. The first addresses parallel computing for processing and understanding images. The second discus
A survey of parallel multigrid algorithms
Chan, Tony F.; Tuminaro, Ray S.
1987-01-01
A typical multigrid algorithm applied to well-behaved linear-elliptic partial-differential equations (PDEs) is described. Criteria for designing and evaluating parallel algorithms are presented. Before evaluating the performance of some parallel multigrid algorithms, consideration is given to some theoretical complexity results for solving PDEs in parallel and for executing the multigrid algorithm. The effect of mapping and load imbalance on the partial efficiency of the algorithm is studied.
Refinement of Parallel and Reactive Programs
Back, R. J. R.
1992-01-01
We show how to apply the refinement calculus to stepwise refinement of parallel and reactive programs. We use action systems as our basic program model. Action systems are sequential programs which can be implemented in a parallel fashion. Hence refinement calculus methods, originally developed for sequential programs, carry over to the derivation of parallel programs. Refinement of reactive programs is handled by data refinement techniques originally developed for the sequential refinement c...
Parallel Prediction of Stock Volatility
Directory of Open Access Journals (Sweden)
Priscilla Jenq
2017-10-01
Volatility is a measurement of the risk of financial products. A stock will hit new highs and lows over time, and if these highs and lows fluctuate wildly, it is considered a highly volatile stock. Such a stock is considered riskier than a stock whose volatility is low. Although highly volatile stocks are riskier, the returns that they generate for investors can be quite high. Of course, with a riskier stock also comes the chance of losing money and yielding negative returns. In this project, we will use historic stock data to help us forecast volatility. Since the financial industry usually uses the S&P 500 as the indicator of the market, we will use the S&P 500 as a benchmark to compute the risk. We will also use artificial neural networks as a tool to predict volatilities for a specific time frame that will be set when we configure the neural network. There have been reports that neural networks with different numbers of layers and different numbers of hidden nodes may generate varying results. In fact, we may be able to find the best configuration of a neural network to compute volatilities. We will implement this system using a parallel approach. The system can be used as a tool for investors to allocate and hedge assets.
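The abstract does not say how volatility is measured before it is fed to the neural network. One common definition, assumed here purely for illustration, is the annualized standard deviation of log returns. The function names, the 252-trading-day convention and the price series below are assumptions, not details from the article.

```python
import math


def log_returns(prices):
    """Log return between each pair of consecutive closing prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def historical_volatility(prices, periods_per_year=252):
    """Annualized sample standard deviation of log returns.

    periods_per_year=252 assumes daily closing prices and the usual
    trading-day convention; this is an assumption, not a value from
    the article.
    """
    returns = log_returns(prices)
    if len(returns) < 2:
        raise ValueError("need at least three prices")
    mean = sum(returns) / len(returns)
    variance = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return math.sqrt(variance * periods_per_year)


# Made-up price series: one calm, one that swings widely.
calm = [100.0, 100.5, 100.2, 100.8, 100.6, 101.0]
wild = [100.0, 110.0, 95.0, 112.0, 90.0, 108.0]
calm_vol = historical_volatility(calm)
wild_vol = historical_volatility(wild)
```

A series that grows by a constant factor each period has zero volatility under this definition, which is why the measure captures fluctuation rather than trend.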
Vectoring of parallel synthetic jets
Berk, Tim; Ganapathisubramani, Bharathram; Gomit, Guillaume
2015-11-01
A pair of parallel synthetic jets can be vectored by applying a phase difference between the two driving signals. The resulting jet can be merged or bifurcated and either vectored towards the actuator leading in phase or the actuator lagging in phase. In the present study, the influence of phase difference and Strouhal number on the vectoring behaviour is examined experimentally. Phase-locked vorticity fields, measured using Particle Image Velocimetry (PIV), are used to track vortex pairs. The physical mechanisms that explain the diversity in vectoring behaviour are observed based on the vortex trajectories. For a fixed phase difference, the vectoring behaviour is shown to be primarily influenced by pinch-off time of vortex rings generated by the synthetic jets. Beyond a certain formation number, the pinch-off timescale becomes invariant. In this region, the vectoring behaviour is determined by the distance between subsequent vortex rings. We acknowledge the financial support from the European Research Council (ERC grant agreement no. 277472).
A Soft Parallel Kinematic Mechanism.
White, Edward L; Case, Jennifer C; Kramer-Bottiglio, Rebecca
2018-02-01
In this article, we describe a novel holonomic soft robotic structure based on a parallel kinematic mechanism. The design is based on the Stewart platform, which uses six sensors and actuators to achieve full six-degree-of-freedom motion. Our design is much less complex than a traditional platform, since it replaces the 12 spherical and universal joints found in a traditional Stewart platform with a single highly deformable elastomer body and flexible actuators. This reduces the total number of parts in the system and simplifies the assembly process. Actuation is achieved through coiled-shape memory alloy actuators. State observation and feedback is accomplished through the use of capacitive elastomer strain gauges. The main structural element is an elastomer joint that provides antagonistic force. We report the response of the actuators and sensors individually, then report the response of the complete assembly. We show that the completed robotic system is able to achieve full position control, and we discuss the limitations associated with using responsive material actuators. We believe that control demonstrated on a single body in this work could be extended to chains of such bodies to create complex soft robots.
Productive Parallel Programming: The PCN Approach
Directory of Open Access Journals (Sweden)
Ian Foster
1992-01-01
We describe the PCN programming system, focusing on those features designed to improve the productivity of scientists and engineers using parallel supercomputers. These features include a simple notation for the concise specification of concurrent algorithms, the ability to incorporate existing Fortran and C code into parallel applications, facilities for reusing parallel program components, a portable toolkit that allows applications to be developed on a workstation or small parallel computer and run unchanged on supercomputers, and integrated debugging and performance analysis tools. We survey representative scientific applications and identify problem classes for which PCN has proved particularly useful.
Prabhat
2014-01-01
Gain Critical Insight into the Parallel I/O Ecosystem. Parallel I/O is an integral component of modern high performance computing (HPC), especially in storing and processing very large datasets to facilitate scientific discovery. Revealing the state of the art in this field, High Performance Parallel I/O draws on insights from leading practitioners, researchers, software architects, developers, and scientists who shed light on the parallel I/O ecosystem. The first part of the book explains how large-scale HPC facilities scope, configure, and operate systems, with an emphasis on choices of I/O har
Parallel, Rapid Diffuse Optical Tomography of Breast
National Research Council Canada - National Science Library
Yodh, Arjun
2001-01-01
During the last year we have experimentally and computationally investigated rapid acquisition and analysis of informationally dense diffuse optical data sets in the parallel plate compressed breast geometry...
Parallel, Rapid Diffuse Optical Tomography of Breast
National Research Council Canada - National Science Library
Yodh, Arjun
2002-01-01
During the last year we have experimentally and computationally investigated rapid acquisition and analysis of informationally dense diffuse optical data sets in the parallel plate compressed breast geometry...
Parallel auto-correlative statistics with VTK.
Energy Technology Data Exchange (ETDEWEB)
Pebay, Philippe Pierre; Bennett, Janine Camille
2013-08-01
This report summarizes existing statistical engines in VTK and presents both the serial and parallel auto-correlative statistics engines. It is a sequel to [PT08, BPRT09b, PT09, BPT09, PT10], which studied the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, and order statistics engines. The ease of use of the new parallel auto-correlative statistics engine is illustrated by means of C++ code snippets, and algorithm verification is provided. This report justifies the design of the statistics engines with parallel scalability in mind, and provides scalability and speed-up analysis results for the auto-correlative statistics engine.
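For readers unfamiliar with the quantity involved, the sketch below computes the textbook sample autocorrelation of a series at a given lag. It is not the VTK engine and does not use the VTK API; the function name and the normalization (lag-0 sum of squares in the denominator) are assumptions chosen for illustration.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a numeric series at a non-negative lag.

    Uses the common estimator that divides the lagged covariance sum by
    the lag-0 sum of squared deviations, so the result is 1.0 at lag 0.
    """
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant")
    numer = sum((series[i] - mean) * (series[i + lag] - mean)
                for i in range(n - lag))
    return numer / denom


# Made-up alternating signal: strongly anti-correlated at lag 1.
signal = [1.0, -1.0] * 10
r0 = autocorrelation(signal, 0)
r1 = autocorrelation(signal, 1)
r2 = autocorrelation(signal, 2)
```

In a parallel setting, the sums in the numerator and denominator are the kind of quantities that can be accumulated per process and then combined, which is the general pattern the report's scalability analysis is concerned with.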
Conformal pure radiation with parallel rays
International Nuclear Information System (INIS)
Leistner, Thomas; Nurowski, Paweł
2012-01-01
We define pure radiation metrics with parallel rays to be n-dimensional pseudo-Riemannian metrics that admit a parallel null line bundle K and whose Ricci tensor vanishes on vectors that are orthogonal to K. We give necessary conditions in terms of the Weyl, Cotton and Bach tensors for a pseudo-Riemannian metric to be conformal to a pure radiation metric with parallel rays. Then, we derive conditions in terms of the tractor calculus that are equivalent to the existence of a pure radiation metric with parallel rays in a conformal class. We also give analogous results for n-dimensional pseudo-Riemannian pp-waves. (paper)
Compiling Scientific Programs for Scalable Parallel Systems
National Research Council Canada - National Science Library
Kennedy, Ken
2001-01-01
...). The research performed in this project included new techniques for recognizing implicit parallelism in sequential programs, a powerful and precise set-based framework for analysis and transformation...
Parallel thermal radiation transport in two dimensions
International Nuclear Information System (INIS)
Smedley-Stevenson, R.P.; Ball, S.R.
2003-01-01
This paper describes the distributed memory parallel implementation of a deterministic thermal radiation transport algorithm in a 2-dimensional ALE hydrodynamics code. The parallel algorithm consists of a variety of components which are combined in order to produce a state of the art computational capability, capable of solving large thermal radiation transport problems using Blue-Oak, the 3 Tera-Flop MPP (massive parallel processors) computing facility at AWE (United Kingdom). Particular aspects of the parallel algorithm are described together with examples of the performance on some challenging applications. (author)
Parallel Algorithms for the Exascale Era
Energy Technology Data Exchange (ETDEWEB)
Robey, Robert W. [Los Alamos National Laboratory
2016-10-19
New parallel algorithms are needed to reach the Exascale level of parallelism with millions of cores. We look at some of the research developed by students in projects at LANL. The research blends ideas from the early days of computing while weaving in the fresh approach brought by students new to the field of high performance computing. We look at reproducibility of global sums and why it is important to parallel computing. Next we look at how the concept of hashing has led to the development of more scalable algorithms suitable for next-generation parallel computers. Nearly all of this work has been done by undergraduates and published in leading scientific journals.
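The reproducibility issue mentioned for global sums comes from floating-point addition not being associative: a different reduction order, as happens when the number of processes changes, can give a different result. The sketch below demonstrates the effect and one possible remedy using Python's `math.fsum`, which returns a correctly rounded sum. It illustrates the problem only; it is not the method developed in the LANL projects, and the data values are made up.

```python
import math


def naive_sum(values):
    """Left-to-right floating-point accumulation, as a simple loop would do."""
    total = 0.0
    for v in values:
        total += v
    return total


# The same four numbers in two different orders, as two different
# reduction orders might deliver them.
order_a = [1.0e16, 1.0, -1.0e16, 1.0]
order_b = [1.0, 1.0, 1.0e16, -1.0e16]

naive_a = naive_sum(order_a)  # the first 1.0 is lost to rounding
naive_b = naive_sum(order_b)  # both 1.0 terms survive

# math.fsum tracks the rounding error exactly, so the result does not
# depend on the order of the inputs.
exact_a = math.fsum(order_a)
exact_b = math.fsum(order_b)
```

Here the naive sums differ (1.0 versus 2.0) although the inputs are identical as sets, while the correctly rounded sums agree.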
Parallel thermal radiation transport in two dimensions
Energy Technology Data Exchange (ETDEWEB)
Smedley-Stevenson, R.P.; Ball, S.R. [AWE Aldermaston (United Kingdom)
2003-07-01
This paper describes the distributed memory parallel implementation of a deterministic thermal radiation transport algorithm in a 2-dimensional ALE hydrodynamics code. The parallel algorithm consists of a variety of components which are combined in order to produce a state of the art computational capability, capable of solving large thermal radiation transport problems using Blue-Oak, the 3 Tera-Flop MPP (massive parallel processors) computing facility at AWE (United Kingdom). Particular aspects of the parallel algorithm are described together with examples of the performance on some challenging applications. (author)
Structured Parallel Programming Patterns for Efficient Computation
McCool, Michael; Robison, Arch
2012-01-01
Programming is now parallel programming. Much as structured programming revolutionized traditional serial programming decades ago, a new kind of structured programming, based on patterns, is relevant to parallel programming today. Parallel computing experts and industry insiders Michael McCool, Arch Robison, and James Reinders describe how to design and implement maintainable and efficient parallel algorithms using a pattern-based approach. They present both theory and practice, and give detailed concrete examples using multiple programming models. Examples are primarily given using two of th
International Nuclear Information System (INIS)
Cook, G.O. Jr.; Knight, L.
1979-07-01
The question of optimal projection angles has recently become of interest in the field of reconstruction from projections. Here, studies are concentrated on the n x n pixel space, where iterative algorithms such as ART and direct matrix techniques due to Katz are considered. The best angles are determined in a Gauss-Markov statistical sense as well as with respect to a function-theoretical error bound. The possibility of making photon intensity a function of angle is also examined. Finally, the best angles to use in an ART-like algorithm are studied. A certain set of unequally spaced angles was found to be preferred in several contexts. 15 figures, 6 tables
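ART, mentioned above, is commonly formulated as the Kaczmarz method: each ray equation is visited in turn and the image is projected onto the hyperplane that satisfies it. The sketch below shows that basic update on a made-up 2 x 2 image with five rays; the function name, the relaxation parameter and the test geometry are illustrative assumptions and say nothing about the angle sets studied in the report.

```python
def art(system, measurements, sweeps=200, relax=1.0):
    """Basic ART (Kaczmarz) iteration for system * x = measurements.

    system is a list of rows (one per ray), each a list of pixel weights.
    relax is the relaxation factor; 1.0 gives the plain projection step.
    """
    n = len(system[0])
    x = [0.0] * n
    for _ in range(sweeps):
        for row, b in zip(system, measurements):
            norm2 = sum(a * a for a in row)
            if norm2 == 0.0:
                continue  # ray misses every pixel
            residual = b - sum(a * xj for a, xj in zip(row, x))
            step = relax * residual / norm2
            x = [xj + step * a for a, xj in zip(row, x)]
    return x


# Hypothetical 2 x 2 image, pixels ordered [top-left, top-right,
# bottom-left, bottom-right]; rays are two row sums, two column sums
# and one diagonal.
rays = [
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
]
truth = [1.0, 2.0, 3.0, 4.0]
data = [sum(a * t for a, t in zip(row, truth)) for row in rays]
estimate = art(rays, data)
```

Without the diagonal ray the row and column sums alone leave the image underdetermined, which is a small-scale instance of why the choice of projection directions matters.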
The parallel dynamics of drift wave turbulence in the WEGA stellarator
Energy Technology Data Exchange (ETDEWEB)
Marsen, S; Endler, M; Otte, M; Wagner, F, E-mail: stefan.marsen@ipp.mpg.d [Max-Planck-Institut fuer Plasmaphysik, EURATOM Association, Wendelsteinstrasse 1, 17491 Greifswald (Germany)
2009-08-15
The three-dimensional structure of turbulence in the edge (inside the last closed flux surface) of the WEGA stellarator is studied focusing on the parallel dynamics. WEGA as a small stellarator with moderate plasma parameters offers the opportunity to study turbulence with Langmuir probes providing high spatial and temporal resolution. Multiple probes with radial, poloidal and toroidal resolution are used to measure density fluctuations. Correlation analysis is used to reconstruct a 3D picture of turbulent structures. We find that these structures originate predominantly on the low field side and have a three-dimensional character with a finite averaged parallel wavenumber. The ratio between the parallel and perpendicular wavenumber component is in the order of $10^{-2}$. The parallel dynamics are compared at magnetic inductions of 57 and 500 mT. At 500 mT, the parallel wavelength is in the order of the field line connection length $2\pi R\bar{\iota}$. A frequency resolved measure of $k_{\parallel}/k_{\theta}$ shows a constant ratio in this case. At 57 mT the observed $k_{\parallel}$ is much smaller than at 500 mT. However, the observed small average value is due to an averaging over positive and negative components pointing parallel and antiparallel to the magnetic field vector.
CRUCIATE LIGAMENT RECONSTRUCTION
Directory of Open Access Journals (Sweden)
A. V. Korolev
2016-01-01
Purpose: To evaluate long-term results of meniscal repair during arthroscopic ACL reconstruction. Materials and methods: 45 patients who underwent meniscal repair during arthroscopic ACL reconstruction between 2007 and 2013 by the same surgeon were included in the study. In total, fifty menisci were repaired (26 medial and 24 lateral). Procedures included use of one to four Fast-Fix implants (Smith & Nephew). In five cases both the medial and lateral menisci were repaired. Cincinnati, IKDC and Lysholm scales were used for long-term outcome analysis. Results: 19 male and 26 female patients aged 15 to 59 years (mean age 33.2±1.5) were included in the study. Median time from injury to surgical procedure was zero months (range zero to one). Mean time from surgery to scale analysis was 55.9±3 months (range 20-102). Median Cincinnati score was 97 (range 90-100), with excellent results in 93% of cases (43 patients) and good results in 7% (3 patients). Median IKDC score was 90.8 (range 86.2-95.4), with excellent outcomes in 51% of cases (23 patients), good in 33% (15 patients) and satisfactory in 16% (7 patients). Median Lysholm score was 95 (range 90-100), with excellent outcomes in 76% of cases (34 patients) and good in 24% (11 patients). The authors identified no statistically significant differences in survey results by age, sex or time from trauma to surgery. Conclusions: The results of the present study match data from the orthopedic literature showing that meniscal repair is a safe and efficient procedure with good and excellent outcomes. All-inside meniscal repair can be used irrespective of patients' age and is efficient even in cases of delayed procedures.
Feasibility study of segmented-parallel-hole collimator for stationary cardiac SPECT
Energy Technology Data Exchange (ETDEWEB)
Mao, Yanfei [Utah Univ., Salt Lake City, UT (United States). Center for Advanced Imaging Research (UCAIR); Utah Univ., Salt Lake City, UT (United States). Dept. of Bioengineering; Zeng, Gengsheng L. [Utah Univ., Salt Lake City, UT (United States). Center for Advanced Imaging Research (UCAIR)
2011-07-01
The goal of this research is to propose a stationary cardiac SPECT system using the segmented parallel-beam collimator and to perform some computer simulations to test the feasibility. A stationary system has a benefit of acquiring temporally consistent projections. The most challenging issue in building a stationary system is to provide sufficient projection view-angles. A 2-detector, multi-segment collimator system with 14 view-angles over 180° in the transaxial direction and 3 view-angles in the axial directions was designed, where the two detectors are configured 90° apart in an L-shape. We applied the parallel-beam imaging geometry and used segmented parallel-hole collimator to acquire SPECT data. To improve the system condition due to data truncation, we measured more rays within the field-of-view (FOV) of the detector by using a relatively small detector bin-size. In image reconstruction, we used the maximum-likelihood expectation-maximization (ML-EM) algorithm. The criterion for evaluating the system is the summed pixel-to-pixel distance that measures the discrepancy between the 3D gold-standard image and the reconstructed 3D region of interest (ROI) with truncated data. Effects of limited number of view-angles, data truncation, varying body habitus, attenuation, and noise were considered in the system design. As a result, our segmented-parallel-beam stationary cardiac SPECT system is able to acquire sufficient data for cardiac imaging and has a high sensitivity gain. (orig.)
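The ML-EM algorithm named above has a compact multiplicative update: each pixel is scaled by the back-projected ratio of measured to predicted counts, normalized by the pixel's sensitivity. The sketch below applies the standard textbook update to a small dense system matrix. The function name, the tiny 2 x 2 geometry and the noiseless data are assumptions for illustration and do not model the segmented collimator.

```python
def mlem(system, counts, iterations=100):
    """Standard ML-EM update for emission tomography.

    system[i][j] is the (non-negative) contribution of pixel j to
    detector bin i; counts[i] is the measured value in bin i.
    """
    m, n = len(system), len(system[0])
    # Sensitivity of each pixel: total weight with which it is seen.
    sens = [sum(system[i][j] for i in range(m)) for j in range(n)]
    x = [1.0] * n  # strictly positive starting image
    for _ in range(iterations):
        # Forward projection of the current estimate.
        proj = [sum(system[i][j] * x[j] for j in range(n)) for i in range(m)]
        # Ratio of measured to predicted counts, guarded against 0/0.
        ratio = [counts[i] / proj[i] if proj[i] > 0.0 else 0.0
                 for i in range(m)]
        # Back-project the ratio and apply the multiplicative update.
        x = [x[j] / sens[j] * sum(system[i][j] * ratio[i] for i in range(m))
             if sens[j] > 0.0 else 0.0
             for j in range(n)]
    return x


# Hypothetical 2 x 2 activity map seen by five parallel rays.
bins = [
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
]
activity = [1.0, 2.0, 3.0, 4.0]
measured = [sum(a * v for a, v in zip(row, activity)) for row in bins]
recon = mlem(bins, measured)
```

Two properties of the update are visible even in this toy case: the estimate stays non-negative, and after every iteration the sensitivity-weighted total of the image equals the total measured counts.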
Reconstruction of electric systems (ELE)
International Nuclear Information System (INIS)
Kohutovic, P.
2001-01-01
The original design of WWER-230 units consisted of a single common system EEPS (essential electric power supply system) per unit. The establishment of redundancy 2 x 100% EEPS was a global task. The task was started during the 'Small reconstruction' - MR V1, continued in 'Gradual reconstruction' and finished in the year 2000. (author)
Traditional Tracking with Kalman Filter on Parallel Architectures
Cerati, Giuseppe; Elmer, Peter; Lantz, Steven; MacNeill, Ian; McDermott, Kevin; Riley, Dan; Tadel, Matevž; Wittich, Peter; Würthwein, Frank; Yagil, Avi
2015-05-01
Power density constraints are limiting the performance improvements of modern CPUs. To address this, we have seen the introduction of lower-power, multi-core processors, but the future will be even more exciting. In order to stay within the power density limits but still obtain Moore's Law performance/price gains, it will be necessary to parallelize algorithms to exploit larger numbers of lightweight cores and specialized functions like large vector units. Example technologies today include Intel's Xeon Phi and GPGPUs. Track finding and fitting is one of the most computationally challenging problems for event reconstruction in particle physics. At the High Luminosity LHC, for example, this will be by far the dominant problem. The most common track finding techniques in use today are however those based on the Kalman Filter. Significant experience has been accumulated with these techniques on real tracking detector systems, both in the trigger and offline. We report the results of our investigations into the potential and limitations of these algorithms on the new parallel hardware.
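To make the Kalman Filter track fit concrete, the sketch below fits a straight line through hits on successive detector planes with a two-parameter state (position and slope). It is a deliberately simplified, serial illustration: there is no magnetic field, no material effects and no process noise, and all names (`propagate`, `update`, `fit_track`) and numbers are assumptions rather than anything from the cited work.

```python
def propagate(state, cov, dz):
    """Transport (position, slope) and its covariance by a distance dz.

    cov holds the symmetric 2x2 covariance as (c00, c01, c11).
    """
    y, t = state
    c00, c01, c11 = cov
    y_pred = y + t * dz
    c00_pred = c00 + 2.0 * dz * c01 + dz * dz * c11
    c01_pred = c01 + dz * c11
    return (y_pred, t), (c00_pred, c01_pred, c11)


def update(state, cov, measured_y, meas_var):
    """Kalman update with a position-only measurement (H = [1, 0])."""
    y, t = state
    c00, c01, c11 = cov
    s = c00 + meas_var          # innovation variance
    k0, k1 = c00 / s, c01 / s   # Kalman gain
    residual = measured_y - y
    new_state = (y + k0 * residual, t + k1 * residual)
    new_cov = ((1.0 - k0) * c00, (1.0 - k0) * c01, c11 - k1 * c01)
    return new_state, new_cov


def fit_track(hits, meas_var, prior_var=1.0e4):
    """Filter a list of (z, y) hits ordered by z; returns state and covariance."""
    state = (0.0, 0.0)
    cov = (prior_var, 0.0, prior_var)  # vague prior on position and slope
    z_prev = hits[0][0]
    for z, measured_y in hits:
        state, cov = propagate(state, cov, z - z_prev)
        state, cov = update(state, cov, measured_y, meas_var)
        z_prev = z
    return state, cov


# Made-up noiseless hits on the line y = 0.5 + 0.1 * z.
hits = [(float(z), 0.5 + 0.1 * z) for z in range(5)]
fitted_state, fitted_cov = fit_track(hits, meas_var=0.01)
```

Because every track goes through the same short sequence of small matrix operations, many tracks can in principle be processed in lockstep, which is what makes the algorithm a candidate for vector units and many-core hardware.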
Breast Reconstruction Following Cancer Treatment.
Gerber, Bernd; Marx, Mario; Untch, Michael; Faridi, Andree
2015-08-31
About 8000 breast reconstructions after mastectomy are performed in Germany each year. It has become more difficult to advise patients because of the wide variety of heterologous and autologous techniques that are now available and because of changes in the recommendations about radiotherapy. This article is based on a review of pertinent articles (2005-2014) that were retrieved by a selective search employing the search terms "mastectomy" and "breast reconstruction." The goal of reconstruction is to achieve an oncologically safe and aesthetically satisfactory result for the patient over the long term. Heterologous, i.e., implant-based, breast reconstruction (IBR) and autologous breast reconstruction (ABR) are complementary techniques. Immediate reconstruction preserves the skin of the breast and its natural form and prevents the psychological trauma associated with mastectomy. If post-mastectomy radiotherapy (PMRT) is not indicated, implant-based reconstruction with or without a net/acellular dermal matrix (ADM) is a common option. Complications such as seroma formation, infection, and explantation are significantly more common when an ADM is used (15.3% vs. 5.4%). If PMRT is performed, then the complication rate of implant-based breast reconstruction is 1 to 48%; in particular, Baker grade III/IV capsular fibrosis occurs in 7 to 22% of patients, and the prosthesis must be explanted in 9 to 41%. Primary or, preferably, secondary autologous reconstruction is an alternative. The results of ABR are more stable over the long term, but the operation is markedly more complex. Autologous breast reconstruction after PMRT does not increase the risk of serious complications (20.5% vs. 17.9% without radiotherapy). No randomized controlled trials have yet been conducted to compare the reconstructive techniques with each other. If radiotherapy will not be performed, immediate reconstruction with an implant is recommended. On the other hand, if post-mastectomy radiotherapy
International Nuclear Information System (INIS)
Lee, H.; Brandyberry, M.; Tudor, A.; Matous, K.
2009-01-01
In this paper, we present a systematic approach for characterization and reconstruction of statistically optimal representative unit cells of polydisperse particulate composites. Microtomography is used to gather rich three-dimensional data of a packed glass bead system. First-, second-, and third-order probability functions are used to characterize the morphology of the material, and the parallel augmented simulated annealing algorithm is employed for reconstruction of the statistically equivalent medium. Both the fully resolved probability spectrum and the geometrically exact particle shapes are considered in this study, rendering the optimization problem multidimensional with a highly complex objective function. A ten-phase particulate composite composed of packed glass beads in a cylindrical specimen is investigated, and a unit cell is reconstructed on massively parallel computers. Further, rigorous error analysis of the statistical descriptors (probability functions) is presented and a detailed comparison between statistics of the voxel-derived pack and the representative cell is made.
Parallel Computing for Brain Simulation.
Pastur-Romay, L A; Porto-Pazos, A B; Cedron, F; Pazos, A
2017-01-01
The human brain is the most complex system in the known universe and is therefore one of the greatest mysteries. It provides human beings with extraordinary abilities. However, it is not yet understood how and why most of these abilities are produced. For decades, researchers have been trying to make computers reproduce these abilities, focusing both on understanding the nervous system and on processing data more efficiently than before. Their aim is to make computers process information similarly to the brain. Important technological developments and vast multidisciplinary projects have made it possible to create the first simulation with a number of neurons similar to that of a human brain. This paper presents an up-to-date review of the main research projects that are trying to simulate and/or emulate the human brain. They employ different types of computational models using parallel computing: digital models, analog models and hybrid models. This review includes the current applications of these works, as well as future trends. It covers works that seek progress in Neuroscience and others that seek new discoveries in Computer Science (neuromorphic hardware, machine learning techniques). Their most outstanding characteristics are summarized and the latest advances and future plans are presented. In addition, this review points out the importance of considering not only neurons: computational models of the brain should also include glial cells, given the proven importance of astrocytes in information processing.
New partially parallel acquisition technique in cerebral imaging: preliminary findings
International Nuclear Information System (INIS)
Tintera, Jaroslav; Gawehn, Joachim; Bauermann, Thomas; Vucurevic, Goran; Stoeter, Peter
2004-01-01
In MRI applications where short acquisition time is necessary, the increase of acquisition speed is often at the expense of image resolution and SNR. In such cases, the newly developed parallel acquisition techniques could provide images without mentioned limitations and in reasonably shortened measurement time. A newly designed eight-channel head coil array (i-PAT coil) allowing for parallel acquisition of independently reconstructed images (GRAPPA mode) has been tested for its applicability in neuroradiology. Image homogeneity was tested in standard phantom and healthy volunteers. BOLD signal changes were studied in a group of six volunteers using finger tapping stimulation. Phantom studies revealed an important drop of signal even after the use of a normalization filter in the center of the image and an important increase of artifact power with reduction of measurement time strongly depending on the combination of acceleration parameters. The additional application of a parallel acquisition technique such as GRAPPA decreases measurement time in the range of about 30%, but further reduction is often possible only at the expense of SNR. This technique performs best in conditions in which imaging speed is important, such as CE MRA, but time resolution still does not allow the acquisition of angiograms separating the arterial and venous phase. Significantly larger areas of BOLD activation were found using the i-PAT coil compared to the standard head coil. Being an eight-channel surface coil array, peripheral cortical structures profit from high SNR as high-resolution imaging of small cortical dysplasias and functional activation of cortical areas imaged by BOLD contrast. In BOLD contrast imaging, susceptibility artifacts are reduced, but only if an appropriate combination of acceleration parameters is used. (orig.)
New partially parallel acquisition technique in cerebral imaging: preliminary findings
Energy Technology Data Exchange (ETDEWEB)
Tintera, Jaroslav [Institute for Clinical and Experimental Medicine, Prague (Czech Republic); Gawehn, Joachim; Bauermann, Thomas; Vucurevic, Goran; Stoeter, Peter [University Clinic Mainz, Institute of Neuroradiology, Mainz (Germany)
2004-12-01
In MRI applications where short acquisition time is necessary, the increase of acquisition speed is often at the expense of image resolution and SNR. In such cases, the newly developed parallel acquisition techniques could provide images without mentioned limitations and in reasonably shortened measurement time. A newly designed eight-channel head coil array (i-PAT coil) allowing for parallel acquisition of independently reconstructed images (GRAPPA mode) has been tested for its applicability in neuroradiology. Image homogeneity was tested in standard phantom and healthy volunteers. BOLD signal changes were studied in a group of six volunteers using finger tapping stimulation. Phantom studies revealed an important drop of signal even after the use of a normalization filter in the center of the image and an important increase of artifact power with reduction of measurement time strongly depending on the combination of acceleration parameters. The additional application of a parallel acquisition technique such as GRAPPA decreases measurement time in the range of about 30%, but further reduction is often possible only at the expense of SNR. This technique performs best in conditions in which imaging speed is important, such as CE MRA, but time resolution still does not allow the acquisition of angiograms separating the arterial and venous phase. Significantly larger areas of BOLD activation were found using the i-PAT coil compared to the standard head coil. Being an eight-channel surface coil array, peripheral cortical structures profit from high SNR as high-resolution imaging of small cortical dysplasias and functional activation of cortical areas imaged by BOLD contrast. In BOLD contrast imaging, susceptibility artifacts are reduced, but only if an appropriate combination of acceleration parameters is used. (orig.)
von Davier, Matthias
2016-01-01
This report presents results on a parallel implementation of the expectation-maximization (EM) algorithm for multidimensional latent variable models. The developments presented here are based on code that parallelizes both the E step and the M step of the parallel-E parallel-M algorithm. Examples presented in this report include item response…
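The idea of a parallel E step followed by an M step can be sketched on a toy problem. The example below is only an illustration under stated assumptions: a two-component 1D Gaussian mixture with fixed standard deviation stands in for the multidimensional latent variable models of the report, the function names `e_step_chunk` and `em_fit` are hypothetical, and only the E step is distributed over workers (the M step here is a cheap serial reduction, whereas the report parallelizes both steps).

```python
import math
from concurrent.futures import ThreadPoolExecutor


def e_step_chunk(chunk, weights, means, sigma):
    """E step on one data chunk: accumulate per-component sufficient statistics."""
    k = len(means)
    resp_sum = [0.0] * k   # sum of responsibilities per component
    x_sum = [0.0] * k      # responsibility-weighted sum of observations
    for x in chunk:
        dens = [w * math.exp(-0.5 * ((x - m) / sigma) ** 2)
                for w, m in zip(weights, means)]
        total = sum(dens)
        for j in range(k):
            r = dens[j] / total
            resp_sum[j] += r
            x_sum[j] += r * x
    return resp_sum, x_sum


def em_fit(data, means, sigma=1.0, n_workers=4, n_iter=50):
    """Fit mixture weights and means; chunks are processed concurrently in the E step."""
    k = len(means)
    weights = [1.0 / k] * k
    chunks = [data[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for _ in range(n_iter):
            # Parallel E step: every chunk is independent given the current parameters.
            stats = list(pool.map(
                lambda c: e_step_chunk(c, weights, means, sigma), chunks))
            # M step: combine the chunk statistics and update the parameters.
            resp = [sum(s[0][j] for s in stats) for j in range(k)]
            xs = [sum(s[1][j] for s in stats) for j in range(k)]
            weights = [r / len(data) for r in resp]
            means = [x / r for x, r in zip(xs, resp)]
    return weights, means


if __name__ == "__main__":
    sample = [-0.5, 0.0, 0.5, 4.5, 5.0, 5.5] * 20
    print(em_fit(sample, means=[1.0, 4.0]))
```

The design point is that the E step only needs additive sufficient statistics, so chunks can be processed in any order and merged afterwards. In CPython, threads mainly illustrate the structure; a process pool or native code would be needed for real speedup.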
The language parallel Pascal and other aspects of the massively parallel processor
Reeves, A. P.; Bruner, J. D.
1982-01-01
A high level language for the Massively Parallel Processor (MPP) was designed. This language, called Parallel Pascal, is described in detail. A description of the language design, a description of the intermediate language, Parallel P-Code, and details for the MPP implementation are included. Formal descriptions of Parallel Pascal and Parallel P-Code are given. A compiler was developed which converts programs in Parallel Pascal into the intermediate Parallel P-Code language. The code generator to complete the compiler for the MPP is being developed independently. A Parallel Pascal to Pascal translator was also developed. The architecture design for a VLSI version of the MPP was completed with a description of fault tolerant interconnection networks. The memory arrangement aspects of the MPP are discussed and a survey of other high level languages is given.
Parallel Boltzmann machines : a mathematical model
Zwietering, P.J.; Aarts, E.H.L.
1991-01-01
A mathematical model is presented for the description of parallel Boltzmann machines. The framework is based on the theory of Markov chains and combines a number of previously known results into one generic model. It is argued that parallel Boltzmann machines maximize a function consisting of a
The convergence of parallel Boltzmann machines
Zwietering, P.J.; Aarts, E.H.L.; Eckmiller, R.; Hartmann, G.; Hauske, G.
1990-01-01
We discuss the main results obtained in a study of a mathematical model of synchronously parallel Boltzmann machines. We present supporting evidence for the conjecture that a synchronously parallel Boltzmann machine maximizes a consensus function that consists of a weighted sum of the regular
Customizable Memory Schemes for Data Parallel Architectures
Gou, C.
2011-01-01
Memory system efficiency is crucial for any processor to achieve high performance, especially in the case of data parallel machines. Processing capabilities of parallel lanes will be wasted, when data requests are not accomplished in a sustainable and timely manner. Irregular vector memory accesses
Parallel Narrative Structure in Paul Harding's "Tinkers"
Çirakli, Mustafa Zeki
2014-01-01
The present paper explores the implications of parallel narrative structure in Paul Harding's "Tinkers" (2009). Besides primarily recounting the two sets of parallel narratives, "Tinkers" also comprises seemingly unrelated fragments such as excerpts from clock repair manuals and diaries. The main stories, however, told…
Streaming nested data parallelism on multicores
DEFF Research Database (Denmark)
Madsen, Frederik Meisner; Filinski, Andrzej
2016-01-01
The paradigm of nested data parallelism (NDP) allows a variety of semi-regular computation tasks to be mapped onto SIMD-style hardware, including GPUs and vector units. However, some care is needed to keep down space consumption in situations where the available parallelism may vastly exceed...
Bayer image parallel decoding based on GPU
Hu, Rihui; Xu, Zhiyong; Wei, Yuxing; Sun, Shaohua
2012-11-01
In photoelectrical tracking systems, Bayer images are traditionally decompressed on the CPU. However, this is too slow when the images become large, for example 2K×2K×16bit. To accelerate Bayer image decoding, this paper introduces a parallel speedup method for NVIDIA's Graphics Processing Unit (GPU), which supports the CUDA architecture. The decoding procedure can be divided into three parts: a serial part, a task-parallel part, and a data-parallel part comprising inverse quantization, the inverse discrete wavelet transform (IDWT), and image post-processing. To reduce execution time, the task-parallel part is optimized with OpenMP techniques. The data-parallel part gains efficiency by executing on the GPU as a CUDA parallel program. The optimization techniques include instruction optimization, shared memory access optimization, coalesced memory access optimization, and texture memory optimization. In particular, the IDWT can be sped up significantly by rewriting the 2D (two-dimensional) serial IDWT as a 1D parallel IDWT. In experiments with a 1K×1K×16bit Bayer image, the data-parallel part is more than 10 times faster than the CPU-based implementation. Finally, a CPU+GPU heterogeneous decompression system was designed. The experimental results show that it achieves a 3 to 5 times speed increase compared to the CPU serial method.
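Rewriting a 2D inverse wavelet transform as independent 1D transforms relies on separability: a pass over all rows followed by a pass over all columns, where each row or column can be handled by its own thread. The sketch below shows this structure in plain Python. It assumes a single-level Haar wavelet with average/difference coefficients, which the abstract does not specify, and all function names are hypothetical.

```python
def haar_forward_1d(x):
    """One Haar level on an even-length list: averages first, then differences."""
    half = len(x) // 2
    avg = [(x[2 * i] + x[2 * i + 1]) / 2 for i in range(half)]
    diff = [(x[2 * i] - x[2 * i + 1]) / 2 for i in range(half)]
    return avg + diff


def haar_inverse_1d(c):
    """Invert haar_forward_1d: rebuild each pair from its average and difference."""
    half = len(c) // 2
    out = []
    for a, d in zip(c[:half], c[half:]):
        out += [a + d, a - d]
    return out


def apply_rows(img, f):
    # Every row is independent, so on a GPU each one could map to a thread or block.
    return [f(row) for row in img]


def transpose(img):
    return [list(col) for col in zip(*img)]


def apply_cols(img, f):
    # A column pass is a row pass on the transposed image.
    return transpose(apply_rows(transpose(img), f))


def haar_forward_2d(img):
    return apply_cols(apply_rows(img, haar_forward_1d), haar_forward_1d)


def haar_inverse_2d(coeffs):
    # 2D inverse expressed purely as batches of independent 1D inverses.
    return apply_rows(apply_cols(coeffs, haar_inverse_1d), haar_inverse_1d)


if __name__ == "__main__":
    image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    print(haar_inverse_2d(haar_forward_2d(image)))
```

Because the row and column passes are linear and act on different axes, their order can be swapped, which leaves freedom to choose the pass order that gives the better memory access pattern.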
Parallelization of TMVA Machine Learning Algorithms
Hajili, Mammad
2017-01-01
This report reflects my work on the parallelization of TMVA machine learning algorithms integrated into the ROOT data analysis framework during a summer internship at CERN. The report consists of 4 important parts: the data set used in training and validation, the algorithms to which multiprocessing was applied, the parallelization techniques, and the results on execution time changes due to the number of workers.
17 CFR 12.24 - Parallel proceedings.
2010-04-01
...) Definition. For purposes of this section, a parallel proceeding shall include: (1) An arbitration proceeding... the receivership includes the resolution of claims made by customers; or (3) A petition filed under... any of the foregoing with knowledge of a parallel proceeding shall promptly notify the Commission, by...
Parallel $S_n$ iteration schemes
International Nuclear Information System (INIS)
Wienke, B.R.; Hiromoto, R.E.
1986-01-01
The iterative, multigroup, discrete ordinates ($S_n$) technique for solving the linear transport equation enjoys widespread usage and appeal. Serial iteration schemes and numerical algorithms developed over the years provide a timely framework for parallel extension. On the Denelcor HEP, the authors investigate three parallel iteration schemes for solving the one-dimensional $S_n$ transport equation. The multigroup representation and serial iteration methods are also reviewed. This analysis represents a first attempt to extend serial $S_n$ algorithms to parallel environments and provides good baseline estimates on ease of parallel implementation, relative algorithm efficiency, comparative speedup, and some future directions. The authors examine ordered and chaotic versions of these strategies, with and without concurrent rebalance and diffusion acceleration. Two strategies efficiently support high degrees of parallelization and appear to be robust parallel iteration techniques. The third strategy is a weaker parallel algorithm. Chaotic iteration, difficult to simulate on serial machines, holds promise and converges faster than ordered versions of the schemes. Actual parallel speedup and efficiency are high and payoff appears substantial.
Parallel Computing Strategies for Irregular Algorithms
Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)
2002-01-01
Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.
Parallel fuzzy connected image segmentation on GPU
Zhuge, Ying; Cao, Yong; Udupa, Jayaram K.; Miller, Robert W.
2011-01-01
Purpose: Image segmentation techniques using fuzzy connectedness (FC) principles have shown their effectiveness in segmenting a variety of objects in several large applications. However, one challenge in these algorithms has been their excessive computational requirements when processing large image datasets. Nowadays, commodity graphics hardware provides a highly parallel computing environment. In this paper, the authors present a parallel fuzzy connected image segmentation algorithm impleme...
Parallel Algorithms for Groebner-Basis Reduction
1987-09-25
Parallel Algorithms for Groebner-Basis Reduction. Technical Report, Productivity Engineering in the UNIX Environment.
Parallel knock-out schemes in networks
Broersma, H.J.; Fomin, F.V.; Woeginger, G.J.
2004-01-01
We consider parallel knock-out schemes, a procedure on graphs introduced by Lampert and Slater in 1997 in which each vertex eliminates exactly one of its neighbors in each round. We are considering cases in which after a finite number of rounds, where the minimum number is called the parallel
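A single round of such a scheme can be sketched as follows, under the assumption that the graph is stored as a dictionary mapping each vertex to the set of its neighbors. The function name `knock_out_round` and the `target` mapping are hypothetical illustrations, not notation from the paper.

```python
def knock_out_round(adj, target):
    """Perform one parallel knock-out round.

    adj    -- dict: vertex -> set of neighboring vertices still present
    target -- dict: vertex -> the one neighbor that vertex eliminates
    Returns the graph induced on the vertices that survive the round.
    """
    for v, nbrs in adj.items():
        if target.get(v) not in nbrs:
            raise ValueError(f"vertex {v!r} must eliminate one of its neighbors")
    # All eliminations happen simultaneously, so collect the victims first.
    eliminated = {target[v] for v in adj}
    return {v: nbrs - eliminated
            for v, nbrs in adj.items() if v not in eliminated}


if __name__ == "__main__":
    # 4-cycle: every vertex eliminates its successor, so nobody survives.
    cycle = {i: {(i - 1) % 4, (i + 1) % 4} for i in range(4)}
    print(knock_out_round(cycle, {i: (i + 1) % 4 for i in range(4)}))

    # Path 0-1-2: vertex 2 survives but is isolated, so no further round is possible.
    path = {0: {1}, 1: {0, 2}, 2: {1}}
    print(knock_out_round(path, {0: 1, 1: 0, 2: 1}))
```

In the second example the surviving vertex has no neighbor left to eliminate, which shows why not every choice of targets leads to the elimination of the whole graph.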
Building a parallel file system simulator
International Nuclear Information System (INIS)
Molina-Estolano, E; Maltzahn, C; Brandt, S A; Bent, J
2009-01-01
Parallel file systems are gaining in popularity in high-end computing centers as well as commercial data centers. High-end computing systems are expected to scale exponentially and to pose new challenges to their storage scalability in terms of cost and power. To address these challenges, scientists and file system designers will need a thorough understanding of the design space of parallel file systems. Yet there exist few systematic studies of parallel file system behavior at petabyte and exabyte scale. An important reason is the significant cost of getting access to large-scale hardware to test parallel file systems. To contribute to this understanding, we are building a parallel file system simulator that can simulate parallel file systems at very large scale. Our goal is to simulate petabyte-scale parallel file systems on a small cluster or even a single machine in reasonable time and with reasonable fidelity. With this simulator, file system experts will be able to tune existing file systems for specific workloads, scientists and file system deployment engineers will be able to better communicate workload requirements, file system designers and researchers will be able to try out design alternatives and innovations at scale, and instructors will be able to study very large-scale parallel file system behavior in the classroom. In this paper we describe our approach and provide preliminary results that are encouraging both in terms of fidelity and simulation scalability.
Online Event Reconstruction in the CBM Experiment at FAIR
Akishina, Valentina; Kisel, Ivan
2018-02-01
Targeting rare observables, the CBM experiment will operate at high interaction rates of up to 10 MHz, which is unprecedented in heavy-ion experiments so far. This requires a novel free-streaming readout system and a new concept of data processing. The huge data rates of the CBM experiment will be reduced online to the recordable rate before the data are saved to mass storage. Full collision reconstruction and selection will be performed online in a dedicated processor farm. In order to make an efficient event selection online, a clean sample of particles has to be provided by the reconstruction package called First Level Event Selection (FLES). The FLES reconstruction and selection package consists of several modules: track finding, track fitting, event building, short-lived particle finding, and event selection. Since detector measurements also contain time information, the event building is done at all stages of the reconstruction process. The input data are distributed within the FLES farm in the form of time-slices. A time-slice is reconstructed in parallel across processor cores. After all tracks of the whole time-slice are found and fitted, they are collected into clusters of tracks originating from common primary vertices, which are then fitted, thus identifying the interaction points. Secondary tracks are associated with primary vertices according to their estimated production time. After that, short-lived particles are found and the full event building process is finished. The last stage of the FLES package is a selection of events according to the requested trigger signatures. The event reconstruction procedure and the results of its application to simulated collisions in the CBM detector setup are presented and discussed in detail.
Broadcasting a message in a parallel computer
Berg, Jeremy E [Rochester, MN; Faraj, Ahmad A [Rochester, MN
2011-08-02
Methods, systems, and products are disclosed for broadcasting a message in a parallel computer. The parallel computer includes a plurality of compute nodes connected together using a data communications network. The data communications network is optimized for point to point data communications and is characterized by at least two dimensions. The compute nodes are organized into at least one operational group of compute nodes for collective parallel operations of the parallel computer. One compute node of the operational group is assigned to be a logical root. Broadcasting a message in a parallel computer includes: establishing a Hamiltonian path along all of the compute nodes in at least one plane of the data communications network and in the operational group; and broadcasting, by the logical root to the remaining compute nodes, the logical root's message along the established Hamiltonian path.
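Broadcasting along a Hamiltonian path in one plane of the network can be illustrated with a small simulation. The sketch below makes several assumptions that the abstract does not state: the plane is a rectangular `rows` x `cols` mesh, the path is a row-by-row "snake" ordering, and the logical root sits at the corner where the path starts. The names `snake_path` and `broadcast` are hypothetical.

```python
def snake_path(rows, cols):
    """Hamiltonian path over a rows x cols mesh: left-to-right, then right-to-left."""
    path = []
    for r in range(rows):
        cs = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        path.extend((r, c) for c in cs)
    return path


def broadcast(path, message):
    """Forward the root's message hop by hop; path[0] is the assumed logical root."""
    received = {path[0]: message}
    for sender, receiver in zip(path, path[1:]):
        # Each hop is a point-to-point transfer between neighboring nodes.
        received[receiver] = received[sender]
    return received


if __name__ == "__main__":
    path = snake_path(3, 4)
    print(path)
    print(len(broadcast(path, "hello")))
```

Every hop in the snake ordering connects two mesh neighbors, so the broadcast uses only point-to-point links, and each node is visited exactly once.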
Advanced parallel processing with supercomputer architectures
International Nuclear Information System (INIS)
Hwang, K.
1987-01-01
This paper investigates advanced parallel processing techniques and innovative hardware/software architectures that can be applied to boost the performance of supercomputers. Critical issues on architectural choices, parallel languages, compiling techniques, resource management, concurrency control, programming environment, parallel algorithms, and performance enhancement methods are examined and the best answers are presented. The authors cover advanced processing techniques suitable for supercomputers, high-end mainframes, minisupers, and array processors. The coverage emphasizes vectorization, multitasking, multiprocessing, and distributed computing. In order to achieve these operation modes, parallel languages, smart compilers, synchronization mechanisms, load balancing methods, mapping of parallel algorithms, operating system functions, application libraries, and multidiscipline interactions are investigated to ensure high performance. At the end, they assess the potential of optical and neural technologies for developing future supercomputers.
Differences Between Distributed and Parallel Systems
Energy Technology Data Exchange (ETDEWEB)
Brightwell, R.; Maccabe, A.B.; Rissen, R.
1998-10-01
Distributed systems have been studied for twenty years and are now coming into wider use as fast networks and powerful workstations become more readily available. In many respects a massively parallel computer resembles a network of workstations and it is tempting to port a distributed operating system to such a machine. However, there are significant differences between these two environments and a parallel operating system is needed to get the best performance out of a massively parallel system. This report characterizes the differences between distributed systems, networks of workstations, and massively parallel systems and analyzes the impact of these differences on operating system design. In the second part of the report, we introduce Puma, an operating system specifically developed for massively parallel systems. We describe Puma portals, the basic building blocks for message passing paradigms implemented on top of Puma, and show how the differences observed in the first part of the report have influenced the design and implementation of Puma.
Parallel-In-Time For Moving Meshes
Energy Technology Data Exchange (ETDEWEB)
Falgout, R. D. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Manteuffel, T. A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Southworth, B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Schroder, J. B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
2016-02-04
With steadily growing computational resources available, scientists must develop effective ways to utilize the increased resources. High performance, highly parallel software has become a standard. However, until recent years parallelism has focused primarily on the spatial domain. When solving a space-time partial differential equation (PDE), this leads to a sequential bottleneck in the temporal dimension, particularly when taking a large number of time steps. The XBraid parallel-in-time library was developed as a practical way to add temporal parallelism to existing sequential codes with only minor modifications. In this work, a rezoning-type moving mesh is applied to a diffusion problem and formulated in a parallel-in-time framework. Tests and scaling studies are run using XBraid and demonstrate excellent results for the simple model problem considered herein.
Parallel programming with Easy Java Simulations
Esquembre, F.; Christian, W.; Belloni, M.
2018-01-01
Nearly all of today's processors are multicore, and ideally programming and algorithm development utilizing the entire processor should be introduced early in the computational physics curriculum. Parallel programming is often not introduced because it requires a new programming environment and uses constructs that are unfamiliar to many teachers. We describe how we lower the barrier to parallel programming by using a Java-based programming environment to treat problems in the usual undergraduate curriculum. We use the Easy Java Simulations programming and authoring tool to create the program's graphical user interface together with objects based on those developed by Kaminsky [Building Parallel Programs (Course Technology, Boston, 2010)] to handle common parallel programming tasks. Shared-memory parallel implementations of physics problems, such as time evolution of the Schrödinger equation, are available as source code and as ready-to-run programs from the AAPT-ComPADRE digital library.
Arkin, Ethem; Tekinerdogan, Bedir
2016-01-01
Mapping parallel algorithms to parallel computing platforms requires several activities such as the analysis of the parallel algorithm, the definition of the logical configuration of the platform, the mapping of the algorithm to the logical configuration platform and the implementation of the
Use of the omentum in chest-wall reconstruction
International Nuclear Information System (INIS)
Fix, R.J.; Vasconez, L.O.
1989-01-01
Increased use of the omentum in chest-wall reconstruction has paralleled the refinement of anatomic knowledge and the development of safe mobilization techniques. Important anatomic points are the omental attachments to surrounding structures, the major blood supply from the left and right gastroepiploic vessels, and the collateral circulation via the gastroepiploic arch and Barkow's marginal artery. Mobilization of the omentum to the thorax involves division of its attachments to the transverse colon and separation from the greater curvature to fabricate a bipedicled flap. Most anterior chest wounds and virtually all mediastinal wounds can be covered with the omentum based on both sets of gastroepiploic vessels. The arc of transposition is increased when the omentum is based on a single pedicle, allowing coverage of virtually all chest-wall defects. The final method of increasing flap length involves division of the gastroepiploic arch and reliance on Barkow's marginal artery as collateral circulation to maintain flap viability. With regard to chest-wall reconstruction, we have included the omentum in the armamentarium of flaps used to cover mediastinal wounds. The omentum is our flap of choice for the reconstruction of most radiation injuries of the chest wall. The omentum may also be used to provide protection to visceral anastomoses, vascular conduits, and damaged structures in the chest, as well as to cover defects secondary to tumor excision or trauma. In brief, the omentum has proved to be a most dependable and versatile flap, particularly applicable to chest-wall reconstruction
CT image reconstruction system based on hardware implementation
International Nuclear Information System (INIS)
Silva, Hamilton P. da; Evseev, Ivan; Schelin, Hugo R.; Paschuk, Sergei A.; Milhoretto, Edney; Setti, Joao A.P.; Zibetti, Marcelo; Hormaza, Joel M.; Lopes, Ricardo T.
2009-01-01
The timing factor is very important for medical imaging systems, which can nowadays be synchronized with vital human signals such as heartbeat or breathing. The use of hardware-implemented devices in such a system has advantages, given the high speed of information processing combined with an arbitrarily low cost on the market. This article describes a hardware system based on programmable logic (an FPGA), model Cyclone II from ALTERA Corporation. The hardware was implemented on the UP3 ALTERA Kit. A partially connected neural network with unitary weights was programmed. The system was tested with 60 tomographic projections, 100 points in each, of the Shepp and Logan phantom created by MATLAB. The main restriction was found to be the memory size available on the device: the dynamic range of the reconstructed image was limited to 0–65535. Also, the normalization factor must be observed in order not to saturate the image during the reconstruction and filtering process. The test shows that it is possible in principle to build CT image reconstruction systems for any reasonable amount of input data by arranging the hardware units to work in parallel, as in the tested setup. However, further studies are necessary for a better understanding of the error propagation from tomographic projections to the reconstructed image within the implemented method.
Towards an inline reconstruction architecture for micro-CT systems
International Nuclear Information System (INIS)
Brasse, David; Humbert, Bernard; Mathelin, Carole; Rio, Marie-Christine; Guyonnet, Jean-Louis
2005-01-01
Recent developments in micro-CT have revolutionized the ability to examine living experimental animal models such as mice in vivo with a spatial resolution of less than 50 μm. The main requirements of in vivo imaging for biological researchers are a good spatial resolution, a low dose delivered to the animal during the full examination, and a reduced acquisition and reconstruction time for screening purposes. We introduce an inline acquisition and reconstruction architecture to obtain the 3D attenuation map of the animal in real time while fulfilling the three previous requirements. The micro-CT system is based on a commercially available x-ray detector and micro-focus x-ray source. The reconstruction architecture is based on a cluster of PCs where a dedicated communication scheme combining serial and parallel treatments is implemented. In order to obtain a high transmission rate between the detector and the reconstruction architecture, a dedicated data acquisition system is also developed. With the proposed solution, the time required to filter and backproject a projection of 2048 x 2048 pixels inside a volume of 140 mega voxels using the Feldkamp algorithm is approximately 500 ms, the time needed to acquire the same projection.
A three-dimensional reconstruction algorithm for an inverse-geometry volumetric CT system
International Nuclear Information System (INIS)
Schmidt, Taly Gilat; Fahrig, Rebecca; Pelc, Norbert J.
2005-01-01
An inverse-geometry volumetric computed tomography (IGCT) system has been proposed that is capable of rapidly acquiring sufficient data to reconstruct a thick volume in one circular scan. The system uses a large-area scanned source opposite a smaller detector. The source and detector have the same extent in the axial, or slice, direction, thus providing sufficient volumetric sampling and avoiding cone-beam artifacts. This paper describes a reconstruction algorithm for the IGCT system. The algorithm first rebins the acquired data into two-dimensional (2D) parallel-ray projections at multiple tilt and azimuthal angles, followed by a 3D filtered backprojection. The rebinning step is performed by gridding the data onto a Cartesian grid in a 4D projection space. We present a new method for correcting the gridding error caused by the finite and asymmetric sampling in the neighborhood of each output grid point in the projection space. The reconstruction algorithm was implemented and tested on simulated IGCT data. Results show that the gridding correction reduces the gridding errors to below one Hounsfield unit. With this correction, the reconstruction algorithm does not introduce significant artifacts or blurring when compared to images reconstructed from simulated 2D parallel-ray projections. We also present an investigation of the noise behavior of the method, which verifies that the proposed reconstruction algorithm utilizes cross-plane rays as efficiently as in-plane rays and can provide noise comparable to an in-plane parallel-ray geometry for the same number of photons. Simulations of a resolution test pattern and the modulation transfer function demonstrate that the IGCT system, using the proposed algorithm, is capable of 0.4 mm isotropic resolution. The successful implementation of the reconstruction algorithm is an important step in establishing the feasibility of the IGCT system.
Evidence-Based ACL Reconstruction
Directory of Open Access Journals (Sweden)
E. Carlos RODRIGUEZ-MERCHAN
2015-01-01
There is controversy in the literature regarding a number of topics related to anterior cruciate ligament (ACL) reconstruction. The purpose of this article is to answer the following questions: 1) Bone patellar tendon bone (BPTB) reconstruction or hamstring reconstruction (HR); 2) Double bundle or single bundle; 3) Allograft or autograft; 4) Early or late reconstruction; 5) Rate of return to sports after ACL reconstruction; 6) Rate of osteoarthritis after ACL reconstruction. A Cochrane Library and PubMed (MEDLINE) search of systematic reviews and meta-analyses related to ACL reconstruction was performed. The key words were: ACL reconstruction, systematic reviews and meta-analysis. The main criteria for selection were that the articles were systematic reviews and meta-analyses focused on the aforementioned questions. Sixty-nine articles were found, but only 26 were selected and reviewed because they had a high grade (I-II) of evidence. BPTB-R was associated with better postoperative knee stability but with a higher rate of morbidity. However, the results of both procedures in terms of long-term functional outcome were similar. The double-bundle ACL reconstruction technique showed better outcomes in rotational laxity, although functional recovery was similar between single-bundle and double-bundle. Autograft yielded better results than allograft. There was no difference between early and delayed reconstruction. 82% of patients were able to return to some kind of sport participation. 28% of patients presented radiological signs of osteoarthritis with a follow-up of minimum 10 years.
Reconstructing human evolution
AUTHOR|(CDS)2074069
1999-01-01
One can reconstruct human evolution using modern genetic data and models based on the mathematical theory of evolution and its four major factors: mutation, natural selection, statistical fluctuations in finite populations (random genetic drift), and migration. Archaeology gives some help on the major dates and events of the process. Chances of studying ancient DNA are very limited but there have been a few successful results. Studying DNA instead of proteins, as was done until a few years ago, and in particular the DNA of mitochondria and of the Y chromosome which are transmitted, respectively, by the maternal line and the paternal line, has greatly simplified the analysis. It is now possible to carry the analysis on individuals, while earlier studies were of necessity based on populations. Also the evolution of “culture” (i.e. what we learn from others), in particular that of languages, gives some help and can be greatly enlightened by genetic studies. Even though it is largely based on mechanisms of mut...
Reconstructing Topological Graphs and Continua
Gartside, Paul; Pitz, Max F.; Suabedissen, Rolf
2015-01-01
The deck of a topological space $X$ is the set $\mathcal{D}(X)=\{[X \setminus \{x\}] \colon x \in X\}$, where $[Z]$ denotes the homeomorphism class of $Z$. A space $X$ is topologically reconstructible if whenever $\mathcal{D}(X)=\mathcal{D}(Y)$ then $X$ is homeomorphic to $Y$. It is shown that all metrizable compact connected spaces are reconstructible. It follows that all finite graphs, when viewed as a 1-dimensional cell-complex, are reconstructible in the topological sense, and more genera...
Tomographic reconstruction of binary fields
International Nuclear Information System (INIS)
Roux, Stéphane; Leclerc, Hugo; Hild, François
2012-01-01
A novel algorithm is proposed for reconstructing binary images from their projection along a set of different orientations. Based on a nonlinear transformation of the projection data, classical back-projection procedures can be used iteratively to converge to the sought image. A multiscale implementation allows for a faster convergence. The algorithm is tested on images up to 1 Mb definition, and an error free reconstruction is achieved with a very limited number of projection data, saving a factor of about 100 on the number of projections required for classical reconstruction algorithms.
A distributed multi-GPU system for high speed electron microscopic tomographic reconstruction
International Nuclear Information System (INIS)
Zheng, Shawn Q.; Branlund, Eric; Kesthelyi, Bettina; Braunfeld, Michael B.; Cheng, Yifan; Sedat, John W.; Agard, David A.
2011-01-01
Full resolution electron microscopic tomographic (EMT) reconstruction of large-scale tilt series requires significant computing power. The desire to perform multiple cycles of iterative reconstruction and realignment dramatically increases the pressing need to improve reconstruction performance. This has motivated us to develop a distributed multi-GPU (graphics processing unit) system to provide the required computing power for rapid constrained, iterative reconstructions of very large three-dimensional (3D) volumes. The participating GPUs reconstruct segments of the volume in parallel, and subsequently, the segments are assembled to form the complete 3D volume. Owing to its power and versatility, the CUDA (NVIDIA, USA) platform was selected for GPU implementation of the EMT reconstruction. For a system containing 10 GPUs provided by 5 GTX295 cards, 10 cycles of SIRT reconstruction for a tomogram of 4096² × 512 voxels from an input tilt series containing 122 projection images of 4096² pixels (single precision float) takes a total of 1845 s, of which 1032 s are for computation with the remainder being the system overhead. The same system takes only 39 s total to reconstruct 1024² × 256 voxels from 122 projections of 1024² pixels. While the system overhead is non-trivial, performance analysis indicates that adding extra GPUs to the system would lead to steadily enhanced overall performance. Therefore, this system can be easily expanded to generate superior computing power for very large tomographic reconstructions and especially to empower iterative cycles of reconstruction and realignment. -- Highlights: → A distributed multi-GPU system has been developed for electron microscopic tomography (EMT). → This system allows for rapid constrained, iterative reconstruction of very large volumes. → This system can be easily expanded to generate superior computing power for large-scale iterative EMT realignment.
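The division of a volume into segments, one per GPU, and the overhead implied by the quoted timings can be sketched as below. This is an illustration under assumptions: the abstract does not say along which axis the volume is segmented or how segment sizes are balanced, so the sketch simply splits one axis of a given length into near-equal contiguous slabs. The function names are hypothetical.

```python
def split_slabs(axis_length, n_devices):
    """Split one volume axis into contiguous, near-equal [start, stop) slabs."""
    base, extra = divmod(axis_length, n_devices)
    slabs, start = [], 0
    for d in range(n_devices):
        size = base + (1 if d < extra else 0)  # first `extra` slabs get one more slice
        slabs.append((start, start + size))
        start += size
    return slabs


def overhead_seconds(total_s, compute_s):
    """System overhead is whatever part of the total time is not computation."""
    return total_s - compute_s


if __name__ == "__main__":
    # Assumed example: one 4096-long axis of the large tomogram over 10 GPUs.
    for device, (lo, hi) in enumerate(split_slabs(4096, 10)):
        print(f"GPU {device}: slices [{lo}, {hi})")

    # Timings quoted in the abstract: 1845 s total, 1032 s of computation.
    overhead = overhead_seconds(1845, 1032)
    print(overhead, round(overhead / 1845, 2))
```

With the quoted figures the overhead comes to 813 s, roughly 44% of the total run time, which is why the abstract calls it non-trivial.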
A distributed multi-GPU system for high speed electron microscopic tomographic reconstruction
Energy Technology Data Exchange (ETDEWEB)
Zheng, Shawn Q.; Branlund, Eric; Kesthelyi, Bettina; Braunfeld, Michael B.; Cheng, Yifan; Sedat, John W. [The Howard Hughes Medical Institute and the W.M. Keck Advanced Microscopy Laboratory, Department of Biochemistry and Biophysics, University of California, San Francisco, 600, 16th Street, Room S412D, CA 94158-2517 (United States); Agard, David A., E-mail: agard@msg.ucsf.edu [The Howard Hughes Medical Institute and the W.M. Keck Advanced Microscopy Laboratory, Department of Biochemistry and Biophysics, University of California, San Francisco, 600, 16th Street, Room S412D, CA 94158-2517 (United States)
2011-07-15
Full resolution electron microscopic tomographic (EMT) reconstruction of large-scale tilt series requires significant computing power. The desire to perform multiple cycles of iterative reconstruction and realignment dramatically increases the pressing need to improve reconstruction performance. This has motivated us to develop a distributed multi-GPU (graphics processing unit) system to provide the required computing power for rapid constrained, iterative reconstructions of very large three-dimensional (3D) volumes. The participating GPUs reconstruct segments of the volume in parallel, and subsequently, the segments are assembled to form the complete 3D volume. Owing to its power and versatility, the CUDA (NVIDIA, USA) platform was selected for GPU implementation of the EMT reconstruction. For a system containing 10 GPUs provided by 5 GTX295 cards, 10 cycles of SIRT reconstruction for a tomogram of 4096² × 512 voxels from an input tilt series containing 122 projection images of 4096² pixels (single precision float) takes a total of 1845 s of which 1032 s are for computation with the remainder being the system overhead. The same system takes only 39 s total to reconstruct 1024² × 256 voxels from 122 projections of 1024² pixels. While the system overhead is non-trivial, performance analysis indicates that adding extra GPUs to the system would lead to steadily enhanced overall performance. Therefore, this system can be easily expanded to generate superior computing power for very large tomographic reconstructions and especially to empower iterative cycles of reconstruction and realignment. -- Highlights: → A distributed multi-GPU system has been developed for electron microscopic tomography (EMT). → This system allows for rapid constrained, iterative reconstruction of very large volumes. → This system can be easily expanded to generate superior computing power for large-scale iterative EMT realignment.
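As a rough illustration of the segment-wise decomposition and the timing figures quoted above, the following Python sketch splits the 512 slices of the large tomogram into contiguous slabs, one per GPU, and derives the overhead share from the reported 1845 s total and 1032 s of computation. The split along a single axis and the helper name `split_slabs` are assumptions made for illustration only; the abstract does not state how the segments are actually defined.

```python
def split_slabs(n_slices, n_workers):
    """Return (start, stop) slice ranges, one contiguous slab per worker.

    Hypothetical helper: slab sizes differ by at most one slice.
    """
    base, extra = divmod(n_slices, n_workers)
    slabs = []
    start = 0
    for worker in range(n_workers):
        size = base + (1 if worker < extra else 0)
        slabs.append((start, start + size))
        start += size
    return slabs


# Timing figures quoted in the abstract for the 4096^2 x 512 tomogram.
TOTAL_S = 1845
COMPUTE_S = 1032
overhead_s = TOTAL_S - COMPUTE_S          # time not spent computing
overhead_fraction = overhead_s / TOTAL_S  # share of the total run time

if __name__ == "__main__":
    for gpu, (lo, hi) in enumerate(split_slabs(512, 10)):
        print(f"GPU {gpu}: slices {lo}..{hi - 1}")
    print(f"overhead: {overhead_s} s ({overhead_fraction:.1%} of total)")
```

Under these assumptions each GPU handles 51 or 52 slices, and somewhat under half of the quoted wall-clock time is system overhead rather than computation.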
Electro-optical system for the high speed reconstruction of computed tomography images
International Nuclear Information System (INIS)
Tresp, V.
1989-01-01
An electro-optical system for the high-speed reconstruction of computed tomography (CT) images has been built and studied. The system is capable of reconstructing high-contrast and high-resolution images at video rate (30 images per second), which is more than two orders of magnitude faster than the reconstruction rate achieved by special purpose digital computers used in commercial CT systems. The filtered back-projection algorithm which was implemented in the reconstruction system requires the filtering of all projections with a prescribed filter function. A space-integrating acousto-optical convolver, a surface acoustic wave filter and a digital finite-impulse response filter were used for this purpose and their performances were compared. The second part of the reconstruction, the back projection of the filtered projections, is computationally very expensive. An optical back projector has been built which maps the filtered projections onto the two-dimensional image space using an anamorphic lens system and a prism image rotator. The reconstructed image is viewed by a video camera, routed through a real-time image-enhancement system, and displayed on a TV monitor. The system reconstructs parallel-beam projection data, and in a modified version, is also capable of reconstructing fan-beam projection data. This extension is important since the latter are the kind of projection data actually acquired in high-speed X-ray CT scanners. The reconstruction system was tested by reconstructing precomputed projection data of phantom images. These were stored in a special purpose projection memory and transmitted to the reconstruction system as an electronic signal. In this way, a projection measurement system that acquires projections sequentially was simulated
Nebula: reconstruction and visualization of scattering data in reciprocal space.
Reiten, Andreas; Chernyshov, Dmitry; Mathiesen, Ragnvald H
2015-04-01
Two-dimensional solid-state X-ray detectors can now operate at considerable data throughput rates that allow full three-dimensional sampling of scattering data from extended volumes of reciprocal space within second to minute time-scales. For such experiments, simultaneous analysis and visualization allows for remeasurements and a more dynamic measurement strategy. A new software, Nebula, is presented. It efficiently reconstructs X-ray scattering data, generates three-dimensional reciprocal space data sets that can be visualized interactively, and aims to enable real-time processing in high-throughput measurements by employing parallel computing on commodity hardware.
A Solution to Hammer's X-ray Reconstruction Problem
DEFF Research Database (Denmark)
Gardner, Richard J.; Kiderlen, Markus
2007-01-01
We propose algorithms for reconstructing a planar convex body K from possibly noisy measurements of either its parallel X-rays taken in a fixed finite set of directions or its point X-rays taken at a fixed finite set of points, in known situations that guarantee a unique solution when the data is...... to K in the Hausdorff metric as k tends to infinity. This solves, for the first time in the strongest sense, Hammer’s X-ray problem published in 1963....
SIRFING: Sparse Image Reconstruction For INterferometry using GPUs
Cranmer, Miles; Garsden, Hugh; Mitchell, Daniel A.; Greenhill, Lincoln
2018-01-01
We present a deconvolution code for radio interferometric imaging based on the compressed sensing algorithms in Garsden et al. (2015). Being computationally intensive, compressed sensing is ripe for parallelization over GPUs. Our compressed sensing implementation generates images using wavelets, and we have ported the underlying wavelet library to CUDA, targeting the spline filter reconstruction part of the algorithm. The speedup achieved is almost an order of magnitude. The code is modular but is also being integrated into the calibration and imaging pipeline in use by the LEDA project at the Long Wavelength Array (LWA) as well as by the Murchison Widefield Array (MWA).
Portable parallel programming in a Fortran environment
International Nuclear Information System (INIS)
May, E.N.
1989-01-01
Experience using the Argonne-developed PARMACs macro package to implement a portable parallel programming environment is described. Fortran programs with intrinsic parallelism of coarse and medium granularity are easily converted to parallel programs which are portable among a number of commercially available parallel processors in the class of shared-memory bus-based and local-memory network-based MIMD processors. The parallelism is implemented using standard UNIX (tm) tools and a small number of easily understood synchronization concepts (monitors and message-passing techniques) to construct and coordinate multiple cooperating processes on one or many processors. Benchmark results are presented for parallel computers such as the Alliant FX/8, the Encore MultiMax, the Sequent Balance, the Intel iPSC/2 Hypercube and a network of Sun 3 workstations. These parallel machines are typical MIMD types with 8 to 30 processors, each rated at 1 to 10 MIPS processing power. The demonstration code used for this work is a Monte Carlo simulation of the response to photons of a "nearly realistic" lead, iron and plastic electromagnetic and hadronic calorimeter, using the EGS4 code system. 6 refs., 2 figs., 2 tabs
Performance of the Galley Parallel File System
Nieuwejaar, Nils; Kotz, David
1996-01-01
As the input/output (I/O) needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. This interface conceals the parallelism within the file system, which increases the ease of programmability, but makes it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. Furthermore, most current parallel file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic parallel workloads. Initial experiments, reported in this paper, indicate that Galley is capable of providing high-performance I/O to applications that access data in patterns that have been observed to be common.
Parallelized Bayesian inversion for three-dimensional dental X-ray imaging.
Kolehmainen, Ville; Vanne, Antti; Siltanen, Samuli; Järvenpää, Seppo; Kaipio, Jari P; Lassas, Matti; Kalke, Martti
2006-02-01
Diagnostic and operational tasks based on dental radiology often require three-dimensional (3-D) information that is not available in a single X-ray projection image. Comprehensive 3-D information about tissues can be obtained by computerized tomography (CT) imaging. However, in dental imaging a conventional CT scan may not be available or practical because of high radiation dose, low resolution or the cost of the CT scanner equipment. In this paper, we consider a novel type of 3-D imaging modality for dental radiology. We consider situations in which projection images of the teeth are taken from a few sparsely distributed projection directions using the dentist's regular (digital) X-ray equipment and the 3-D X-ray attenuation function is reconstructed. A complication in these experiments is that the reconstruction of the 3-D structure based on a few projection images becomes an ill-posed inverse problem. Bayesian inversion is a well suited framework for reconstruction from such incomplete data. In Bayesian inversion, the ill-posed reconstruction problem is formulated in a well-posed probabilistic form in which a priori information is used to compensate for the incomplete information of the projection data. In this paper we propose a Bayesian method for 3-D reconstruction in dental radiology. The method is partially based on Kolehmainen et al. 2003. The prior model for dental structures consists of a weighted l1 and total variation (TV)-prior together with the positivity prior. The inverse problem is stated as finding the maximum a posteriori (MAP) estimate. To make the 3-D reconstruction computationally feasible, a parallelized version of an optimization algorithm is implemented for a Beowulf cluster computer. The method is tested with projection data from dental specimens and patient data. Tomosynthetic reconstructions are given as reference for the proposed method.
Herlin, Christian; Doucet, Jean Charles; Bigorre, Michèle; Khelifa, Hatem Cheikh; Captier, Guillaume
2013-10-01
Treacher Collins syndrome (TCS) is a severe and complex craniofacial malformation affecting the facial skeleton and soft tissues. The palate as well as the external and middle ear are also affected, but the prognosis is mainly related to neonatal airway management. Methods of zygomatico-orbital reconstruction are numerous and currently use primarily autologous bone, lyophilized cartilage, alloplastic implants, or even free flaps. This work developed a reliable "customized" method of zygomatico-orbital bony reconstruction using a generic reference model tailored to each patient. From a standard computed tomography (CT) acquisition, we studied qualitatively and quantitatively the skeleton of four individuals with TCS whose age was between 6 and 20 years. In parallel, we studied 40 controls of the same age to obtain a morphometric reference database. Surgical simulation was carried out using validated software used in craniofacial surgery. The zygomatic hypoplasia was very important quantitatively and morphologically in all TCS individuals. Orbital involvement was mainly morphological, with volumes comparable to the controls of the same age. The control database was used to create three-dimensional computer models to be used in the manufacture of cutting guides for autologous cranial bone grafts or alloplastic implants perfectly adapted to each patient's morphology. Presurgical simulation was also used to fabricate custom positioning guides permitting a simple and reliable surgical procedure. The use of a virtual database allowed us to design a reliable and reproducible skeletal reconstruction method for this rare and complex syndrome. The use of presurgical simulation tools seems essential in this type of craniofacial malformation to increase the reliability of these uncommon and complex surgical procedures, and to ensure stable results over time. Copyright © 2013 European Association for Cranio-Maxillo-Facial Surgery. Published by Elsevier Ltd. All rights
Study of DNA reconstruction enzymes
Energy Technology Data Exchange (ETDEWEB)
Sekiguchi, M [Kyushu Univ., Fukuoka (Japan). Faculty of Science
1976-12-01
The characteristics and mechanisms of three reconstruction enzymes are described. The enzymes were obtained from M. luteus, E. coli, or T4 and have been identified as reconstruction enzymes for DNA irradiated with ultraviolet rays. The characteristics covered are the cleavage site, the reaction, the molecular weight, the electric charge at neutrality, and specific binding to ultraviolet-irradiated DNA. For ultraviolet-sensitive mutants, the genetic control mechanism of removal and reconstruction by endonuclease activation is described, with remarks on removal and reconstruction in cells from xeroderma pigmentosum, a hereditary human disease. The mechanism of exonuclease activation, which selectively separates dimers from irradiated DNA, is also described.
Quantum Logic and Quantum Reconstruction
Stairs, Allen
2015-01-01
Quantum logic understood as a reconstruction program had real successes and genuine limitations. This paper offers a synopsis of both and suggests a way of seeing quantum logic in a larger, still thriving context.
International Nuclear Information System (INIS)
Ibarra, Alejandro
2007-01-01
In this talk we discuss the prospects to reconstruct the high-energy see-saw Lagrangian from low energy experiments in supersymmetric scenarios. We show that the model with three right-handed neutrinos could be reconstructed in theory, but not in practice. Then, we discuss the prospects to reconstruct the model with two right-handed neutrinos, which is the minimal see-saw model able to accommodate neutrino observations. We identify the relevant processes to achieve this goal, and comment on the sensitivity of future experiments to them. We find the prospects much more promising and we emphasize in particular the importance of the observation of rare leptonic decays for the reconstruction of the right-handed neutrino masses
Breast Reconstruction with Flap Surgery
... augmented with a breast implant to achieve the desired breast size. Surgical methods Autologous tissue breast reconstruction ... as long as a year or two before feeling completely healed and back to normal. Future breast ...
Rational reconstructions of modern physics
Mittelstaedt, Peter
2013-01-01
Newton’s classical physics and its underlying ontology are loaded with several metaphysical hypotheses that cannot be justified by rational reasoning nor by experimental evidence. Furthermore, it is well known that some of these hypotheses are not contained in the great theories of Modern Physics, such as the theory of Special Relativity and Quantum Mechanics. This book shows that, on the basis of Newton’s classical physics and by rational reconstruction, the theory of Special Relativity as well as Quantum Mechanics can be obtained by partly eliminating or attenuating the metaphysical hypotheses. Moreover, it is shown that these reconstructions do not require additional hypotheses or new experimental results. In the second edition the rational reconstructions are completed with respect to General Relativity and Cosmology. In addition, the statistics of quantum objects is elaborated in more detail with respect to the rational reconstruction of quantum mechanics. The new material completes the approach of t...
A two-step Hilbert transform method for 2D image reconstruction
International Nuclear Information System (INIS)
Noo, Frederic; Clackdoyle, Rolf; Pack, Jed D
2004-01-01
The paper describes a new accurate two-dimensional (2D) image reconstruction method consisting of two steps. In the first step, the backprojected image is formed after taking the derivative of the parallel projection data. In the second step, a Hilbert filtering is applied along certain lines in the differentiated backprojection (DBP) image. Formulae for performing the DBP step in fan-beam geometry are also presented. The advantage of this two-step Hilbert transform approach is that in certain situations, regions of interest (ROIs) can be reconstructed from truncated projection data. Simulation results are presented that illustrate very similar reconstructed image quality using the new method compared to standard filtered backprojection, and that show the capability to correctly handle truncated projections. In particular, a simulation is presented of a wide patient whose projections are truncated laterally yet for which highly accurate ROI reconstruction is obtained
Online real-time reconstruction of adaptive TSENSE with commodity CPU / GPU hardware
DEFF Research Database (Denmark)
Roujol, Sebastien; de Senneville, Baudouin Denis; Vahalla, Erkki
2009-01-01
Adaptive temporal sensitivity encoding (TSENSE) has been suggested as a robust parallel imaging method suitable for MR guidance of interventional procedures. However, in practice, the reconstruction of adaptive TSENSE images obtained with large coil arrays leads to long reconstruction times...... image sizes used in interventional imaging (128 × 96, 16 channels, sensitivity encoding (SENSE) factor 2-4), the pipeline is able to reconstruct adaptive TSENSE images with image latencies below 90 ms at frame rates of up to 40 images/s, rendering the MR performance in practice limited...... by the constraints of the MR acquisition. Its performance is demonstrated by the online reconstruction of in vivo MR images for rapid temperature mapping of the kidney and for cardiac catheterization....
The kpx, a program analyzer for parallelization
International Nuclear Information System (INIS)
Matsuyama, Yuji; Orii, Shigeo; Ota, Toshiro; Kume, Etsuo; Aikawa, Hiroshi.
1997-03-01
The kpx is a program analyzer, developed as a common technological basis for promoting parallel processing. The kpx consists of three tools. The first is ktool, which shows how much execution time is spent in program segments. The second is ptool, which shows parallelization overhead on the Paragon system. The last is xtool, which shows parallelization overhead on the VPP system. The kpx, designed to work for any FORTRAN code on any UNIX computer, is confirmed to work well after testing on Paragon, SP2, SR2201, VPP500, VPP300, Monte-4, SX-4 and T90. (author)
Synchronization Of Parallel Discrete Event Simulations
Steinman, Jeffrey S.
1992-01-01
Adaptive, parallel, discrete-event-simulation-synchronization algorithm, Breathing Time Buckets, developed in Synchronous Parallel Environment for Emulation and Discrete Event Simulation (SPEEDES) operating system. Algorithm allows parallel simulations to process events optimistically in fluctuating time cycles that naturally adapt while simulation in progress. Combines best of optimistic and conservative synchronization strategies while avoiding major disadvantages. Algorithm processes events optimistically in time cycles adapting while simulation in progress. Well suited for modeling communication networks, for large-scale war games, for simulated flights of aircraft, for simulations of computer equipment, for mathematical modeling, for interactive engineering simulations, and for depictions of flows of information.
Multistage parallel-serial time averaging filters
International Nuclear Information System (INIS)
Theodosiou, G.E.
1980-01-01
Here, a new time averaging circuit design, the 'parallel filter', is presented, which can reduce the time jitter introduced in time measurements using counters of large dimensions. This parallel filter can be considered a single-stage unit circuit that can be repeated an arbitrary number of times in series, yielding a parallel-serial filter. The main advantages of such a filter over a serial one are much less electronic gate jitter and time delay for the same amount of total time uncertainty reduction. (orig.)
Implementations of BLAST for parallel computers.
Jülich, A
1995-02-01
The BLAST sequence comparison programs have been ported to a variety of parallel computers: the shared memory machine Cray Y-MP 8/864 and the distributed memory architectures Intel iPSC/860 and nCUBE. Additionally, the programs were ported to run on workstation clusters. We explain the parallelization techniques and consider the pros and cons of these methods. The BLAST programs are very well suited for parallelization for a moderate number of processors. We illustrate our results using the program blastp as an example. As input data for blastp, a 799 residue protein query sequence and the protein database PIR were used.
Speedup predictions on large scientific parallel programs
International Nuclear Information System (INIS)
Williams, E.; Bobrowicz, F.
1985-01-01
How much speedup can we expect for large scientific parallel programs running on supercomputers? For insight into this problem we extend the parallel processing environment currently existing on the Cray X-MP (a shared memory multiprocessor with at most four processors) to a simulated N-processor environment, where N ≥ 1. Several large scientific parallel programs from Los Alamos National Laboratory were run in this simulated environment, and speedups were predicted. A speedup of 14.4 on 16 processors was measured for one of the three most used codes at the Laboratory
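To put the quoted figure of 14.4 on 16 processors in context, the short Python sketch below computes the parallel efficiency and the serial fraction that Amdahl's law would imply for that measurement (the Karp-Flatt metric). Using Amdahl's law here is an assumption made for illustration; the abstract does not say which model the authors used, and the function names are hypothetical.

```python
def efficiency(speedup, n_procs):
    """Parallel efficiency: achieved speedup divided by processor count."""
    return speedup / n_procs


def serial_fraction(speedup, n_procs):
    """Serial fraction implied by a measured speedup (Karp-Flatt metric)."""
    return (1.0 / speedup - 1.0 / n_procs) / (1.0 - 1.0 / n_procs)


def amdahl_speedup(fraction_serial, n_procs):
    """Speedup predicted by Amdahl's law for a given serial fraction."""
    return 1.0 / (fraction_serial + (1.0 - fraction_serial) / n_procs)


if __name__ == "__main__":
    s, n = 14.4, 16  # figures quoted in the abstract
    f = serial_fraction(s, n)
    print(f"efficiency      = {efficiency(s, n):.3f}")
    print(f"serial fraction = {f:.5f}")
    print(f"Amdahl check    = {amdahl_speedup(f, n):.2f}")
```

Under this model the measurement corresponds to 90% efficiency and a serial fraction below 1%.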
Language constructs for modular parallel programs
Energy Technology Data Exchange (ETDEWEB)
Foster, I.
1996-03-01
We describe programming language constructs that facilitate the application of modular design techniques in parallel programming. These constructs allow us to isolate resource management and processor scheduling decisions from the specification of individual modules, which can themselves encapsulate design decisions concerned with concurrency, communication, process mapping, and data distribution. This approach permits development of libraries of reusable parallel program components and the reuse of these components in different contexts. In particular, alternative mapping strategies can be explored without modifying other aspects of program logic. We describe how these constructs are incorporated in two practical parallel programming languages, PCN and Fortran M. Compilers have been developed for both languages, allowing experimentation in substantial applications.
Distributed parallel messaging for multiprocessor systems
Chen, Dong; Heidelberger, Philip; Salapura, Valentina; Senger, Robert M; Steinmacher-Burrow, Burhard; Sugawara, Yutaka
2013-06-04
A method and apparatus for distributed parallel messaging in a parallel computing system. The apparatus includes, at each node of a multiprocessor network, multiple injection messaging engine units and reception messaging engine units, each implementing a DMA engine and each supporting both multiple packet injection into and multiple reception from a network, in parallel. The reception side of the messaging unit (MU) includes a switch interface enabling writing of data of a packet received from the network to the memory system. The transmission side of the messaging unit includes a switch interface for reading from the memory system when injecting packets into the network.
Massively parallel Fokker-Planck code ALLAp
International Nuclear Information System (INIS)
Batishcheva, A.A.; Krasheninnikov, S.I.; Craddock, G.G.; Djordjevic, V.
1996-01-01
The Fokker-Planck code ALLA, recently developed for workstations, simulates the temporal evolution of 1V, 2V and 1D2V collisional edge plasmas. In this work we present the results of code parallelization on the CRI T3D massively parallel platform (ALLAp version). Simultaneously we benchmark the 1D2V parallel version against an analytic self-similar solution of the collisional kinetic equation. This test is not trivial as it demands a very strong spatial temperature and density variation within the simulation domain. (orig.)
Petz recovery versus matrix reconstruction
Holzäpfel, Milan; Cramer, Marcus; Datta, Nilanjana; Plenio, Martin B.
2018-04-01
The reconstruction of the state of a multipartite quantum mechanical system represents a fundamental task in quantum information science. At its most basic, it concerns a state of a bipartite quantum system whose subsystems are subjected to local operations. We compare two different methods for obtaining the original state from the state resulting from the action of these operations. The first method involves quantum operations called Petz recovery maps, acting locally on the two subsystems. The second method is called matrix (or state) reconstruction and involves local, linear maps that are not necessarily completely positive. Moreover, we compare the quantities on which the maps employed in the two methods depend. We show that any state that admits Petz recovery also admits state reconstruction. However, the latter is successful for a strictly larger set of states. We also compare these methods in the context of a finite spin chain. Here, the state of a finite spin chain is reconstructed from the reduced states of a few neighbouring spins. In this setting, state reconstruction is the same as the matrix product operator reconstruction proposed by Baumgratz et al. [Phys. Rev. Lett. 111, 020401 (2013)]. Finally, we generalize both these methods so that they employ long-range measurements instead of relying solely on short-range correlations embodied in such local reduced states. Long-range measurements enable the reconstruction of states which cannot be reconstructed from measurements of local few-body observables alone and hereby we improve existing methods for quantum state tomography of quantum many-body systems.
Animated Reconstruction of Forensic Animation
Hala, Albert; Unver, Ertu
1998-01-01
An animated accident display in court can be a significant evidentiary tool. Computer graphics animation reconstructions which can be shown in court are cost effective, save valuable time and illustrate complex and technical issues, are realistic and can prove or disprove arguments or theories with reference to the perplexing Newtonian physics involved in many accidents: this technology may well revolutionise accident reconstruction, thus enabling prosecution and defence to be more effective in...
Equilibrium Reconstruction in EAST Tokamak
International Nuclear Information System (INIS)
Qian Jinping; Wan Baonian; Shen Biao; Sun Youwen; Liu Dongmei; Xiao Bingjia; Ren Qilong; Gong Xianzu; Li Jiangang; Lao, L. L.; Sabbagh, S. A.
2009-01-01
Reconstruction of experimental axisymmetric equilibria is an important part of tokamak data analysis. Fourier expansion is applied to reconstruct the vessel current distribution in EFIT code. Benchmarking and testing calculations are performed to evaluate and validate this algorithm. Two cases for circular and non-circular plasma discharges are presented. Fourier expansion used to fit the eddy current is a robust method and the real time EFIT can be introduced to the plasma control system in the coming campaign. (magnetically confined plasma)
Benkert, Thomas; Tian, Ye; Huang, Chenchan; DiBella, Edward V R; Chandarana, Hersh; Feng, Li
2018-07-01
Golden-angle radial sparse parallel (GRASP) MRI reconstruction requires gridding and regridding to transform data between radial and Cartesian k-space. These operations are repeatedly performed in each iteration, which makes the reconstruction computationally demanding. This work aimed to accelerate GRASP reconstruction using self-calibrating GRAPPA operator gridding (GROG) and to validate its performance in clinical imaging. GROG is an alternative gridding approach based on parallel imaging, in which k-space data acquired on a non-Cartesian grid are shifted onto a Cartesian k-space grid using information from multicoil arrays. For iterative non-Cartesian image reconstruction, GROG is performed only once as a preprocessing step. Therefore, the subsequent iterative reconstruction can be performed directly in Cartesian space, which significantly reduces computational burden. Here, a framework combining GROG with GRASP (GROG-GRASP) is first optimized and then compared with standard GRASP reconstruction in 22 prostate patients. GROG-GRASP achieved approximately 4.2-fold reduction in reconstruction time compared with GRASP (∼333 min versus ∼78 min) while maintaining image quality (structural similarity index ≈ 0.97 and root mean square error ≈ 0.007). Visual image quality assessment by two experienced radiologists did not show significant differences between the two reconstruction schemes. With a graphics processing unit implementation, image reconstruction time can be further reduced to approximately 14 min. The GRASP reconstruction can be substantially accelerated using GROG. This framework is promising toward broader clinical application of GRASP and other iterative non-Cartesian reconstruction methods. Magn Reson Med 80:286-293, 2018. © 2017 International Society for Magnetic Resonance in Medicine.
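The reduction factors follow directly from the quoted reconstruction times, as the small Python check below shows. It assumes the 14 min GPU figure refers to GROG-GRASP (the abstract says only "further reduced"), and the constant and function names are hypothetical.

```python
# Reconstruction times quoted in the abstract, in minutes (all approximate).
GRASP_MIN = 333.0           # standard GRASP
GROG_GRASP_MIN = 78.0       # GROG-GRASP
GROG_GRASP_GPU_MIN = 14.0   # GPU implementation (assumed to be GROG-GRASP)


def fold_reduction(before_min, after_min):
    """How many times shorter the second reconstruction time is."""
    return before_min / after_min


if __name__ == "__main__":
    print(f"GROG-GRASP vs GRASP: {fold_reduction(GRASP_MIN, GROG_GRASP_MIN):.2f}x")
    print(f"GPU vs GRASP:        {fold_reduction(GRASP_MIN, GROG_GRASP_GPU_MIN):.1f}x")
    print(f"GPU vs GROG-GRASP:   {fold_reduction(GROG_GRASP_MIN, GROG_GRASP_GPU_MIN):.2f}x")
```

The first ratio comes out slightly above the quoted "approximately 4.2-fold", which is consistent with the minute values themselves being rounded.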
International Nuclear Information System (INIS)
Xia Hui-Hui; Kan Rui-Feng; Liu Jian-Guo; Xu Zhen-Yu; He Ya-Bai
2016-01-01
An improved algebraic reconstruction technique (ART) combined with tunable diode laser absorption spectroscopy (TDLAS) is presented in this paper for determining the two-dimensional (2D) distribution of H2O concentration and temperature in a simulated combustion flame. This work aims to simulate the reconstruction of spectroscopic measurements by a multi-view parallel-beam scanning geometry and analyze the effects of projection rays on reconstruction accuracy. It shows that reconstruction quality improves dramatically as the number of projection rays increases until it exceeds 180 for a 20 × 20 grid; beyond that point, the number of projection rays has little influence on reconstruction accuracy. It is clear that the temperature reconstruction results are more accurate than the water vapor concentration obtained by the traditional concentration calculation method. In the present study an innovative way to reduce the error of concentration reconstruction and greatly improve the reconstruction quality is also proposed, and the capability of this new method is evaluated by using appropriate assessment parameters. With this new approach, not only is the concentration reconstruction accuracy greatly improved, but a suitable parallel-beam arrangement is also put forward for high reconstruction accuracy and simplicity of experimental validation. Finally, a bimodal structure of the combustion region is assumed to demonstrate the robustness and universality of the proposed method. Numerical investigation indicates that the proposed TDLAS tomographic algorithm is capable of detecting accurate temperature and concentration profiles. This feasible formula for reconstruction research is expected to resolve several key issues in practical combustion devices. (paper)
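For orientation, the basic ART update (the classic Kaczmarz row-action iteration) can be sketched in a few lines of Python. This is the textbook scheme, not the improved variant the abstract refers to, whose details are not given here; the 2 × 2 grid, the five rays and the names `art_reconstruct`, `RAYS` and `TRUE_FIELD` are all hypothetical illustration values.

```python
def art_reconstruct(rays, measurements, n_cells, sweeps=500, relax=1.0):
    """Classic ART: project the estimate onto one ray equation at a time.

    rays[i][j] is the weight of grid cell j along ray i, and
    measurements[i] is the path-integrated value observed along ray i.
    """
    x = [0.0] * n_cells
    for _ in range(sweeps):
        for weights, measured in zip(rays, measurements):
            norm_sq = sum(w * w for w in weights)
            if norm_sq == 0.0:
                continue  # ray misses the grid entirely
            predicted = sum(w * xi for w, xi in zip(weights, x))
            step = relax * (measured - predicted) / norm_sq
            x = [xi + step * w for xi, w in zip(x, weights)]
    return x


# Toy 2 x 2 grid (cells numbered row by row) probed by five rays:
# two horizontal, two vertical and one diagonal.
RAYS = [
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
]
TRUE_FIELD = [1.0, 2.0, 3.0, 4.0]
MEASUREMENTS = [sum(w * v for w, v in zip(ray, TRUE_FIELD)) for ray in RAYS]

if __name__ == "__main__":
    estimate = art_reconstruct(RAYS, MEASUREMENTS, n_cells=4)
    print([round(v, 4) for v in estimate])
```

The diagonal ray matters in this toy case: the four horizontal and vertical rays alone leave the system rank-deficient, which mirrors the abstract's point that reconstruction accuracy depends on how many projection rays are available.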
Secondary reconstruction of maxillofacial trauma.
Castro-Núñez, Jaime; Van Sickels, Joseph E
2017-08-01
Craniomaxillofacial trauma is one of the most complex clinical conditions in contemporary maxillofacial surgery. Vital structures and possible functional and esthetic sequelae are important considerations following this type of trauma and intervention. Despite the best efforts of the primary surgery, there is a group of patients who will have poor outcomes requiring secondary reconstruction to restore form and function. The purpose of this study is to review current concepts on secondary reconstruction of the maxillofacial complex. The evaluation of a posttraumatic patient for a secondary reconstruction must include an assessment of the different subunits of the upper face, middle face, and lower face. Virtual surgical planning and surgical guides represent the most important innovations in secondary reconstruction over the past few years. Intraoperative navigational surgery/computer-assisted navigation is used in complex cases. Facial asymmetry can be corrected or significantly improved by segmentation of the computerized tomography dataset and mirroring of the unaffected side by means of virtual surgical planning. Navigational surgery/computer-assisted navigation allows for a more precise surgical correction when secondary reconstruction involves the replacement of extensive anatomical areas. The use of technology can result in custom-made replacements and prebent plates, which are more stable and more resistant to fracture caused by metal fatigue. Careful perioperative evaluation is the key to positive outcomes of secondary reconstruction after trauma. The advent of technological tools has played a capital role in helping the surgical team perform a given treatment plan in a more precise and predictable manner.
Technical basis for dose reconstruction
International Nuclear Information System (INIS)
Anspaugh, L.R.
1996-01-01
The purpose of this paper is to consider two general topics: Technical considerations of why dose-reconstruction studies should or should not be performed and methods of dose reconstruction. The first topic is of general and growing interest as the number of dose-reconstruction studies increases, and one asks the question whether it is necessary to perform a dose reconstruction for virtually every site at which, for example, the Department of Energy (DOE) has operated a nuclear-related facility. And there is the broader question of how one might logically draw the line at performing or not performing dose-reconstruction (radiological and chemical) studies for virtually every industrial complex in the entire country. The second question is also of general interest. There is no single correct way to perform a dose-reconstruction study, and it is important not to follow blindly a single method to the point that cheaper, faster, more accurate, and more transparent methods might not be developed and applied. 90 refs., 4 tabs
International Nuclear Information System (INIS)
Satake, Shin-ichi; Kanamori, Hiroyuki; Kunugi, Tomoaki; Sato, Kazuho; Ito, Tomoyoshi; Yamamoto, Keisuke
2007-01-01
We have developed a parallel algorithm for microdigital-holographic particle-tracking velocimetry. The algorithm is used in (1) numerical reconstruction of a particle image by computer from a digital hologram, and (2) searching for particles. The numerical reconstruction from the digital hologram makes use of the Fresnel diffraction equation and the FFT (fast Fourier transform), whereas the particle search algorithm looks for local maxima of gradation in a reconstruction field represented by a 3D matrix. To achieve high-performance computing for both calculations (reconstruction and particle search), two memory partitions are allocated to the 3D matrix. In this matrix, the reconstruction part consists of horizontally placed 2D memory partitions on the x-y plane for the FFT, whereas the particle search part consists of vertically placed 2D memory partitions set along the z axis. Consequently, scalability with the number of processor elements is obtained in benchmarks of the parallel computation carried out on an SGI Altix machine.
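The particle search step can be pictured with a small sketch. It assumes that a particle shows up as a voxel brighter than all 26 of its neighbours, and that the 3D field can be split into z ranges handed to different workers. The function name `find_local_maxima`, the threshold, and the `z_start`/`z_stop` slab arguments are hypothetical and do not come from the paper.

```python
def find_local_maxima(volume, threshold, z_start=1, z_stop=None):
    """Return (z, y, x) of interior voxels above threshold that exceed all 26 neighbours.

    z_start/z_stop select a slab of planes, so separate workers could each scan
    their own part of the reconstruction field (an assumed decomposition).
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    if z_stop is None:
        z_stop = nz - 1
    peaks = []
    for z in range(max(z_start, 1), min(z_stop, nz - 1)):
        for y in range(1, ny - 1):
            for x in range(1, nx - 1):
                value = volume[z][y][x]
                if value <= threshold:
                    continue
                neighbours = (
                    volume[z + dz][y + dy][x + dx]
                    for dz in (-1, 0, 1)
                    for dy in (-1, 0, 1)
                    for dx in (-1, 0, 1)
                    if (dz, dy, dx) != (0, 0, 0)
                )
                if all(value > other for other in neighbours):
                    peaks.append((z, y, x))
    return peaks


# Toy 5 x 5 x 5 field with two bright voxels standing in for particles.
volume = [[[0.0] * 5 for _ in range(5)] for _ in range(5)]
volume[2][1][3] = 9.0
volume[3][3][1] = 7.0
peaks = find_local_maxima(volume, threshold=1.0)
```

Each slab still reads one plane on either side of its range, so a real decomposition would need those boundary planes to be available to the neighbouring worker.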
Oblique reconstructions in tomosynthesis. II. Super-resolution
International Nuclear Information System (INIS)
Acciavatti, Raymond J.; Maidment, Andrew D. A.
2013-01-01
Purpose: In tomosynthesis, super-resolution has been demonstrated using reconstruction planes parallel to the detector. Super-resolution allows for subpixel resolution relative to the detector. The purpose of this work is to develop an analytical model that generalizes super-resolution to oblique reconstruction planes. Methods: In a digital tomosynthesis system, a sinusoidal test object is modeled along oblique angles (i.e., “pitches”) relative to the plane of the detector in a 3D divergent-beam acquisition geometry. To investigate the potential for super-resolution, the input frequency is specified to be greater than the alias frequency of the detector. Reconstructions are evaluated in an oblique plane along the extent of the object using simple backprojection (SBP) and filtered backprojection (FBP). By comparing the amplitude of the reconstruction against the attenuation coefficient of the object at various frequencies, the modulation transfer function (MTF) is calculated to determine whether modulation is within detectable limits for super-resolution. For experimental validation of super-resolution, a goniometry stand was used to orient a bar pattern phantom along various pitches relative to the breast support in a commercial digital breast tomosynthesis system. Results: Using theoretical modeling, it is shown that a single projection image cannot resolve a sine input whose frequency exceeds the detector alias frequency. The high frequency input is correctly visualized in SBP or FBP reconstruction using a slice along the pitch of the object. The Fourier transform of this reconstructed slice is maximized at the input frequency as proof that the object is resolved. Consistent with the theoretical results, experimental images of a bar pattern phantom showed super-resolution in oblique reconstructions. At various pitches, the highest frequency with detectable modulation was determined by visual inspection of the bar patterns. The dependency of the highest
A new asynchronous parallel algorithm for inferring large-scale gene regulatory networks.
Xiao, Xiangyun; Zhang, Wei; Zou, Xiufen
2015-01-01
The reconstruction of gene regulatory networks (GRNs) from high-throughput experimental data has been considered one of the most important issues in systems biology research. With the development of high-throughput technology and the complexity of biological problems, we need to reconstruct GRNs that contain thousands of genes. However, when many existing algorithms are used to handle these large-scale problems, they will encounter two important issues: low accuracy and high computational cost. To overcome these difficulties, the main goal of this study is to design an effective parallel algorithm to infer large-scale GRNs based on high-performance parallel computing environments. In this study, we proposed a novel asynchronous parallel framework to improve the accuracy and lower the time complexity of large-scale GRN inference by combining splitting technology and ordinary differential equation (ODE)-based optimization. The presented algorithm uses the sparsity and modularity of GRNs to split whole large-scale GRNs into many small-scale modular subnetworks. Through the ODE-based optimization of all subnetworks in parallel and their asynchronous communications, we can easily obtain the parameters of the whole network. To test the performance of the proposed approach, we used well-known benchmark datasets from Dialogue for Reverse Engineering Assessments and Methods challenge (DREAM), experimentally determined GRN of Escherichia coli and one published dataset that contains more than 10 thousand genes to compare the proposed approach with several popular algorithms on the same high-performance computing environments in terms of both accuracy and time complexity. The numerical results demonstrate that our parallel algorithm exhibits obvious superiority in inferring large-scale GRNs.
Directory of Open Access Journals (Sweden)
Sebastian Schaetz
2017-01-01
Purpose. To develop generic optimization strategies for image reconstruction using graphical processing units (GPUs) in magnetic resonance imaging (MRI) and to report, as an example, on our experience with a highly accelerated implementation of the nonlinear inversion (NLINV) algorithm for dynamic MRI with high frame rates. Methods. The NLINV algorithm is optimized and ported to run on a multi-GPU single-node server. The algorithm is mapped to multiple GPUs by decomposing the data domain along the channel dimension. Furthermore, the algorithm is decomposed along the temporal domain by relaxing a temporal regularization constraint, allowing the algorithm to work on multiple frames in parallel. Finally, an autotuning method is presented that is capable of combining different decomposition variants to achieve optimal algorithm performance in different imaging scenarios. Results. The algorithm is successfully ported to a multi-GPU system and allows online image reconstruction with high frame rates. Real-time reconstruction with low latency and frame rates up to 30 frames per second is demonstrated. Conclusion. Novel parallel decomposition methods are presented which are applicable to many iterative algorithms for dynamic MRI. Using these methods to parallelize the NLINV algorithm on multiple GPUs, it is possible to achieve online image reconstruction with high frame rates.
Massively Parallel Computing: A Sandia Perspective
Energy Technology Data Exchange (ETDEWEB)
Dosanjh, Sudip S.; Greenberg, David S.; Hendrickson, Bruce; Heroux, Michael A.; Plimpton, Steve J.; Tomkins, James L.; Womble, David E.
1999-05-06
The computing power available to scientists and engineers has increased dramatically in the past decade, due in part to progress in making massively parallel computing practical and available. The expectation for these machines has been great. The reality is that progress has been slower than expected. Nevertheless, massively parallel computing is beginning to realize its potential for enabling significant breakthroughs in science and engineering. This paper provides a perspective on the state of the field, colored by the authors' experiences using large-scale parallel machines at Sandia National Laboratories. We address trends in hardware, system software and algorithms, and we also offer our view of the forces shaping the parallel computing industry.
Parallel generation of architecture on the GPU
Steinberger, Markus; Kenzel, Michael; Kainz, Bernhard K.; Müller, Jörg; Wonka, Peter; Schmalstieg, Dieter
2014-01-01
they can take advantage of, or both, our method supports state of the art procedural modeling including stochasticity and context-sensitivity. To increase parallelism, we explicitly express independence in the grammar, reduce inter-rule dependencies
New high voltage parallel plate analyzer
International Nuclear Information System (INIS)
Hamada, Y.; Kawasumi, Y.; Masai, K.; Iguchi, H.; Fujisawa, A.; Abe, Y.
1992-01-01
A new modification of the parallel plate analyzer for 500 keV heavy ions, intended to eliminate the effect of intense UV and visible radiation, has been successfully carried out. Its principle and results are discussed. (author)
Parallel data encryption with RSA algorithm
Неретин, А. А.
2016-01-01
In this paper, a parallel RSA algorithm with preliminary shuffling of the source text is presented. The dependence of encryption speed on the number of encryption nodes has been analysed. The proposed algorithm was implemented in C#.
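Because RSA encrypts each block independently, the blocks can be handed to separate workers. The sketch below shows only that idea: it uses Python instead of the authors' C#, a deliberately tiny textbook key (p = 61, q = 53, e = 17, d = 2753) that offers no security, and it omits the preliminary shuffling step, whose details the abstract does not give. The helper names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy textbook key, far too small for real use (assumption for illustration).
P, Q = 61, 53
N = P * Q   # modulus, 3233
E = 17      # public exponent
D = 2753    # private exponent, E * D = 1 (mod (P - 1) * (Q - 1))


def encrypt_block(m):
    """Encrypt one integer block m < N with the public key."""
    return pow(m, E, N)


def decrypt_block(c):
    """Decrypt one integer block with the private key."""
    return pow(c, D, N)


def parallel_map(func, blocks, workers=4):
    """Apply func to every block using a pool of workers, preserving order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, blocks))


plaintext = [ord(ch) for ch in "parallel RSA"]   # one character per block
ciphertext = parallel_map(encrypt_block, plaintext)
recovered = parallel_map(decrypt_block, ciphertext)
```

In CPython, threads mainly illustrate the structure here. To measure how speed depends on the number of workers, as the paper does for encryption nodes, separate processes or machines would be the more likely choice.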
Data parallel sorting for particle simulation
Dagum, Leonardo
1992-01-01
Sorting on a parallel architecture is a communications-intensive event which can incur a high penalty in applications where it is required. In the case of particle simulation, only integer sorting is necessary, and sequential implementations easily attain the minimum performance bound of O(N) for N particles. Parallel implementations, however, have to cope with the parallel sorting problem which, in addition to incurring a heavy communications cost, can make the minimum performance bound difficult to attain. This paper demonstrates how the sorting problem in a particle simulation can be reduced to a merging problem, and describes an efficient data parallel algorithm to solve this merging problem in a particle simulation. The new algorithm is shown to be optimal under conditions usual for particle simulation, and its fieldwise implementation on the Connection Machine is analyzed in detail. The new algorithm is about four times faster than a fieldwise implementation of radix sort on the Connection Machine.
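The O(N) sequential bound mentioned in this record can be made concrete with a counting sort on integer cell indices. This sketch shows only that sequential baseline, not the data parallel merging algorithm of the paper; the function name `sort_particles_by_cell` and the sample data are assumptions.

```python
def sort_particles_by_cell(cell_ids, n_cells):
    """Return particle indices ordered by cell index in O(N + n_cells) time."""
    # Histogram: counts[c + 1] holds the number of particles in cell c.
    counts = [0] * (n_cells + 1)
    for c in cell_ids:
        counts[c + 1] += 1
    # Prefix sum: counts[c] becomes the first output slot of cell c.
    for c in range(n_cells):
        counts[c + 1] += counts[c]
    next_slot = counts[:-1]
    order = [0] * len(cell_ids)
    # Stable scatter: particles keep their relative order within a cell.
    for particle, c in enumerate(cell_ids):
        order[next_slot[c]] = particle
        next_slot[c] += 1
    return order


cell_ids = [2, 0, 1, 0, 2]   # cell index of particles 0..4
order = sort_particles_by_cell(cell_ids, n_cells=3)
```

For the sample data, `order` lists the two particles of cell 0 first, then the particle of cell 1, then the two particles of cell 2.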
Parallel debt in the Serbian finance law
Directory of Open Access Journals (Sweden)
Kuzman Miloš
2014-01-01
The purpose of this paper is to present the mechanism of parallel debt in Serbian financial law. In considering whether the mechanism of parallel debt exists under Serbian law, the Anglo-Saxon mechanism of trust is presented, and it is explained why the mechanism of trust is not allowed under Serbian law. The mechanism of parallel debt is then introduced, together with a debate on the permissibility of its cause in Serbian law. Comparative legal arguments about this issue are also presented. In conclusion, the author suggests, on the basis of the conclusions drawn in this paper, that the parallel debt mechanism should be declared admissible if it is ever taken into consideration by the Serbian courts.
Parallel Monte Carlo simulation of aerosol dynamics
Zhou, K.; He, Z.; Xiao, M.; Zhang, Z.
2014-01-01
is simulated with a stochastic method (Marcus-Lushnikov stochastic process). Operator splitting techniques are used to synthesize the deterministic and stochastic parts in the algorithm. The algorithm is parallelized using the Message Passing Interface (MPI
Stranger than fiction: parallel universes beguile science
2007-01-01
We may not be able - at least not yet - to prove they exist, many serious scientists say, but there are plenty of reasons to think that parallel dimensions are more than figments of effeaded imagination. (1/2 page)
Parallel computation of nondeterministic algorithms in VLSI
Energy Technology Data Exchange (ETDEWEB)
Hortensius, P D
1987-01-01
This work examines parallel VLSI implementations of nondeterministic algorithms. It is demonstrated that conventional pseudorandom number generators are unsuitable for highly parallel applications. Efficient parallel pseudorandom sequence generation can be accomplished using certain classes of elementary one-dimensional cellular automata. The pseudorandom numbers appear in parallel on each clock cycle. Extensive study of the properties of these new pseudorandom number generators is made using standard empirical random number tests, cycle length tests, and implementation considerations. Furthermore, it is shown these particular cellular automata can form the basis of efficient VLSI architectures for computations involved in the Monte Carlo simulation of both the percolation and Ising models from statistical mechanics. Finally, a variation on a Built-In Self-Test technique based upon cellular automata is presented. These Cellular Automata-Logic-Block-Observation (CALBO) circuits improve upon conventional design for testability circuitry.
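How a one-dimensional cellular automaton produces a fresh bit in every cell on each clock cycle can be sketched as follows. The abstract does not say which rules are used, so the mix of elementary rules 90 and 150, the null boundary, and the rule assignment below are assumptions; the sketch makes no claim about cycle length or statistical quality.

```python
def ca_step(cells, rules):
    """Advance a 1D binary cellular automaton by one clock cycle.

    Rule 90:  next = left XOR right
    Rule 150: next = left XOR self XOR right
    Cells beyond either end are treated as 0 (null boundary, an assumption).
    """
    n = len(cells)
    nxt = []
    for i in range(n):
        left = cells[i - 1] if i > 0 else 0
        right = cells[i + 1] if i < n - 1 else 0
        bit = left ^ right
        if rules[i] == 150:
            bit ^= cells[i]
        nxt.append(bit)
    return nxt


rules = [90, 150, 90, 150]   # hypothetical rule assignment, one rule per cell
state = [0, 1, 0, 0]         # any nonzero seed
sequence = []
for _ in range(5):
    state = ca_step(state, rules)
    sequence.append(state)   # all cells update at once, one word per cycle
```

Each cell depends only on its nearest neighbours, which is the property that makes such generators attractive for the locally connected VLSI layouts discussed in the abstract.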
Adapting algorithms to massively parallel hardware
Sioulas, Panagiotis
2016-01-01
In recent years, the trend in computing has shifted from delivering processors with faster clock speeds to increasing the number of cores per processor. This marks a paradigm shift towards parallel programming, in which applications are programmed to exploit the power provided by multiple cores. Usually there is a gain in terms of the time-to-solution and the memory footprint. Specifically, this trend has sparked an interest in massively parallel systems that can provide a large number of processors, and possibly computing nodes, as in GPUs and MPPAs (Massively Parallel Processor Arrays). In this project, the focus was on two distinct computing problems: k-d tree searches and track seeding cellular automata. The goal was to adapt the algorithms to parallel systems and evaluate their performance in different cases.
Implementing Shared Memory Parallelism in MCBEND
Directory of Open Access Journals (Sweden)
Bird Adam
2017-01-01
MCBEND is a general-purpose radiation transport Monte Carlo code from AMEC Foster Wheeler’s ANSWERS® Software Service. MCBEND is well established in the UK shielding community for radiation shielding and dosimetry assessments. The existing MCBEND parallel capability effectively involves running the same calculation on many processors. This works very well except when the memory requirements of a model restrict the number of instances of a calculation that will fit on a machine. To utilise parallel hardware more effectively, OpenMP has been used to implement shared memory parallelism in MCBEND. This paper describes the reasoning behind the choice of OpenMP, notes some of the challenges of multi-threading an established code such as MCBEND and assesses the performance of the parallel method implemented in MCBEND.
Domain decomposition methods and parallel computing
International Nuclear Information System (INIS)
Meurant, G.
1991-01-01
In this paper, we show how to efficiently solve large linear systems on parallel computers. These linear systems arise from discretization of scientific computing problems described by systems of partial differential equations. We show how to get a discrete finite dimensional system from the continuous problem, and the chosen conjugate gradient iterative algorithm is briefly described. Then, the different kinds of parallel architectures are reviewed and their advantages and deficiencies are emphasized. We sketch the problems found in programming the conjugate gradient method on parallel computers. For this algorithm to be efficient on parallel machines, domain decomposition techniques are introduced. We give results of numerical experiments showing that these techniques allow a good rate of convergence for the conjugate gradient algorithm as well as computational speeds in excess of a billion floating-point operations per second. (author). 5 refs., 11 figs., 2 tabs., 1 inset
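For orientation, the unpreconditioned conjugate gradient iteration referred to in this record can be sketched in a few lines. The example assumes a symmetric positive definite system given through a matrix-vector product, here a 1D discrete Laplacian. The function names and the test problem are hypothetical, and no domain decomposition preconditioner is included.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def laplacian_matvec(v):
    """Apply the tridiagonal matrix (-1, 2, -1) of a 1D Poisson problem."""
    n = len(v)
    return [
        2.0 * v[i] - (v[i - 1] if i > 0 else 0.0) - (v[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A, given only x -> A x."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x for the initial guess x = 0
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


b = [1.0, 0.0, 0.0, 1.0]
solution = conjugate_gradient(laplacian_matvec, b)
```

On a parallel machine the matrix-vector product and the two dot products per iteration are where the work is distributed. The dot products need global reductions, which is one of the programming issues the abstract alludes to.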
6th International Parallel Tools Workshop
Brinkmann, Steffen; Gracia, José; Resch, Michael; Nagel, Wolfgang
2013-01-01
The latest advances in the High Performance Computing hardware have significantly raised the level of available compute performance. At the same time, the growing hardware capabilities of modern supercomputing architectures have caused an increasing complexity of the parallel application development. Despite numerous efforts to improve and simplify parallel programming, there is still a lot of manual debugging and tuning work required. This process is supported by special software tools, facilitating debugging, performance analysis, and optimization and thus making a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools, which were presented and discussed at the 6th International Parallel Tools Workshop, held in Stuttgart, Germany, 25-26 September 2012.
Parallel processor programs in the Federal Government
Schneck, P. B.; Austin, D.; Squires, S. L.; Lehmann, J.; Mizell, D.; Wallgren, K.
1985-01-01
In 1982, a report dealing with the nation's research needs in high-speed computing called for increased access to supercomputing resources for the research community, research in computational mathematics, and increased research in the technology base needed for the next generation of supercomputers. Since that time a number of programs addressing future generations of computers, particularly parallel processors, have been started by U.S. government agencies. The present paper provides a description of the largest government programs in parallel processing. Established in fiscal year 1985 by the Institute for Defense Analyses for the National Security Agency, the Supercomputing Research Center will pursue research to advance the state of the art in supercomputing. Attention is also given to the DOE applied mathematical sciences research program, the NYU Ultracomputer project, the DARPA multiprocessor system architectures program, NSF research on multiprocessor systems, ONR activities in parallel computing, and NASA parallel processor projects.
Density functional theory and parallel processing
International Nuclear Information System (INIS)
Ward, R.C.; Geist, G.A.; Butler, W.H.
1987-01-01
The authors demonstrate a method for obtaining the ground state energies and charge densities of a system of atoms described within density functional theory using simulated annealing on a parallel computer
High performance parallel computers for science
International Nuclear Information System (INIS)
Nash, T.; Areti, H.; Atac, R.; Biel, J.; Cook, A.; Deppe, J.; Edel, M.; Fischler, M.; Gaines, I.; Hance, R.
1989-01-01
This paper reports that Fermilab's Advanced Computer Program (ACP) has been developing cost-effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors, each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor has been developed for theoretical physics. Each $4000 node is a FORTRAN or C programmable pipelined 20 Mflops (peak), 10 MByte single board computer. These are plugged into a 16 port crossbar switch crate which handles both inter- and intra-crate communication. The crates are connected in a hypercube. Site oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256 node, 5 GFlop system is under construction.
Parallel magnetic resonance imaging as approximation in a reproducing kernel Hilbert space
International Nuclear Information System (INIS)
Athalye, Vivek; Lustig, Michael; Uecker, Martin
2015-01-01
In magnetic resonance imaging, data samples are collected in the spatial frequency domain (k-space), typically by time-consuming line-by-line scanning on a Cartesian grid. Scans can be accelerated by simultaneous acquisition of data using multiple receivers (parallel imaging), and by using more efficient non-Cartesian sampling schemes. To understand and design k-space sampling patterns, a theoretical framework is needed to analyze how well arbitrary sampling patterns reconstruct unsampled k-space using receive coil information. As shown here, reconstruction from samples at arbitrary locations can be understood as approximation of vector-valued functions from the acquired samples and formulated using a reproducing kernel Hilbert space with a matrix-valued kernel defined by the spatial sensitivities of the receive coils. This establishes a formal connection between approximation theory and parallel imaging. Theoretical tools from approximation theory can then be used to understand reconstruction in k-space and to extend the analysis of the effects of sample selection beyond the traditional image-domain g-factor noise analysis to both noise amplification and approximation errors in k-space. This is demonstrated with numerical examples. (paper)
Massively parallel evolutionary computation on GPGPUs
Tsutsui, Shigeyoshi
2013-01-01
Evolutionary algorithms (EAs) are metaheuristics that learn from natural collective behavior and are applied to solve optimization problems in domains such as scheduling, engineering, bioinformatics, and finance. Such applications demand acceptable solutions with high-speed execution using finite computational resources. Therefore, there have been many attempts to develop platforms for running parallel EAs using multicore machines, massively parallel cluster machines, or grid computing environments. Recent advances in general-purpose computing on graphics processing units (GPGPU) have opened u
Freeman, Bryan
2013-01-01
This book contains practical recipes on everything you will need to create task-based parallel programs using C#, .NET 4.5, and Visual Studio. The book is packed with illustrated code examples to create scalable programs. This book is intended to help experienced C# developers write applications that leverage the power of modern multicore processors. It provides the necessary knowledge for an experienced C# developer to work with .NET parallelism APIs. Previous experience of writing multithreaded applications is not necessary.
Simulation Exploration through Immersive Parallel Planes: Preprint
Energy Technology Data Exchange (ETDEWEB)
Brunhart-Lupo, Nicholas; Bush, Brian W.; Gruchalla, Kenny; Smith, Steve
2016-03-01
We present a visualization-driven simulation system that tightly couples systems dynamics simulations with an immersive virtual environment to allow analysts to rapidly develop and test hypotheses in a high-dimensional parameter space. To accomplish this, we generalize the two-dimensional parallel-coordinates statistical graphic as an immersive 'parallel-planes' visualization for multivariate time series emitted by simulations running in parallel with the visualization. In contrast to traditional parallel coordinates, which map the multivariate dimensions onto coordinate axes represented by a series of parallel lines, we map pairs of the multivariate dimensions onto a series of parallel rectangles. As in the case of parallel coordinates, each individual observation in the dataset is mapped to a polyline whose vertices coincide with its coordinate values. Regions of the rectangles can be 'brushed' to highlight and select observations of interest; a 'slider' control allows the user to filter the observations by their time coordinate. In an immersive virtual environment, users interact with the parallel planes using a joystick that can select regions on the planes, manipulate selection, and filter time. The brushing and selection actions are used both to explore existing data and to launch additional simulations corresponding to the visually selected portions of the input parameter space. As soon as the new simulations complete, their resulting observations are displayed in the virtual environment. This tight feedback loop between simulation and immersive analytics accelerates users' realization of insights about the simulation and its output.
Alternative derivation of the parallel ion viscosity
International Nuclear Information System (INIS)
Bravenec, R.V.; Berk, H.L.; Hammer, J.H.
1982-01-01
A set of double-adiabatic fluid equations with additional collisional relaxation between the ion temperatures parallel and perpendicular to a magnetic field is shown to reduce to a set involving a single temperature and a parallel viscosity. This result is applied to a recently published paper [R. V. Bravenec, A. J. Lichtenberg, M. A. Lieberman, and H. L. Berk, Phys. Fluids 24, 1320 (1981)] on viscous flow in a multiple-mirror configuration.
Acoustic simulation in architecture with parallel algorithm
Li, Xiaohong; Zhang, Xinrong; Li, Dan
2004-03-01
To address the complexity of architectural environments and the need for real-time simulation of architectural acoustics, a parallel radiosity algorithm was developed. The distribution of sound energy in a scene is solved with this method. The impulse responses between sources and receivers in each frequency segment, which are calculated by multiple processes, are then combined into the whole frequency response. The numerical experiment shows that parallel computation can improve the efficiency of acoustic simulation for complex scenes.
PARALLEL SOLUTION METHODS OF PARTIAL DIFFERENTIAL EQUATIONS
Directory of Open Access Journals (Sweden)
Korhan KARABULUT
1998-03-01
Partial differential equations arise in almost all fields of science and engineering. The computer time spent solving partial differential equations is much greater than that spent on any other problem class. For this reason, partial differential equations are well suited to parallel computers, which offer great computational power. In this study, the parallel solution of partial differential equations with the Jacobi, Gauss-Seidel, SOR (Successive Over-Relaxation) and SSOR (Symmetric SOR) algorithms is studied.
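As a rough illustration of three of the iterations named in this abstract, the following Python sketch applies Jacobi, Gauss-Seidel and SOR sweeps to the tridiagonal system from a one-dimensional finite-difference Poisson problem. The test problem, the function names, the tolerance and the relaxation factor 1.5 are assumptions chosen for illustration, not details taken from the study. SSOR (a forward sweep followed by a backward sweep) is not shown.

```python
def jacobi_step(A, b, x):
    """One Jacobi sweep: every component uses only the previous iterate,
    so all components could be updated independently (in parallel)."""
    n = len(b)
    return [
        (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
        for i in range(n)
    ]


def sor_step(A, b, x, omega=1.0):
    """One SOR sweep; omega = 1.0 reduces to Gauss-Seidel.
    Each component uses values already updated in the same sweep."""
    n = len(b)
    x = list(x)
    for i in range(n):
        sigma = sum(A[i][j] * x[j] for j in range(n) if j != i)
        gauss_seidel_value = (b[i] - sigma) / A[i][i]
        x[i] = (1.0 - omega) * x[i] + omega * gauss_seidel_value
    return x


def solve(step, A, b, tol=1e-10, max_iter=10_000, **kwargs):
    """Iterate from zero until successive iterates differ by less than tol."""
    x = [0.0] * len(b)
    for k in range(1, max_iter + 1):
        x_new = step(A, b, x, **kwargs)
        if max(abs(u - v) for u, v in zip(x_new, x)) < tol:
            return x_new, k
        x = x_new
    return x, max_iter


def poisson_1d(n):
    """Tridiagonal matrix with 2 on the diagonal and -1 next to it."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 2.0
        if i > 0:
            A[i][i - 1] = -1.0
        if i < n - 1:
            A[i][i + 1] = -1.0
    return A


A = poisson_1d(8)
b = [1.0] * 8
x_jacobi, it_jacobi = solve(jacobi_step, A, b)
x_gs, it_gs = solve(sor_step, A, b, omega=1.0)
x_sor, it_sor = solve(sor_step, A, b, omega=1.5)
```

The Jacobi sweep parallelizes directly because it reads only the previous iterate. Gauss-Seidel and SOR carry a dependency within each sweep, which parallel implementations commonly break with a reordering of the unknowns such as red-black coloring.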
Simulation Exploration through Immersive Parallel Planes
Energy Technology Data Exchange (ETDEWEB)
Brunhart-Lupo, Nicholas J [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Bush, Brian W [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Gruchalla, Kenny M [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Smith, Steve [Los Alamos Visualization Associates
2017-05-25
We present a visualization-driven simulation system that tightly couples systems dynamics simulations with an immersive virtual environment to allow analysts to rapidly develop and test hypotheses in a high-dimensional parameter space. To accomplish this, we generalize the two-dimensional parallel-coordinates statistical graphic as an immersive 'parallel-planes' visualization for multivariate time series emitted by simulations running in parallel with the visualization. In contrast to traditional parallel coordinates, which map the multivariate dimensions onto coordinate axes represented by a series of parallel lines, we map pairs of the multivariate dimensions onto a series of parallel rectangles. As in the case of parallel coordinates, each individual observation in the dataset is mapped to a polyline whose vertices coincide with its coordinate values. Regions of the rectangles can be 'brushed' to highlight and select observations of interest; a 'slider' control allows the user to filter the observations by their time coordinate. In an immersive virtual environment, users interact with the parallel planes using a joystick that can select regions on the planes, manipulate selection, and filter time. The brushing and selection actions are used both to explore existing data and to launch additional simulations corresponding to the visually selected portions of the input parameter space. As soon as the new simulations complete, their resulting observations are displayed in the virtual environment. This tight feedback loop between simulation and immersive analytics accelerates users' realization of insights about the simulation and its output.
Current distribution characteristics of superconducting parallel circuits
International Nuclear Information System (INIS)
Mori, K.; Suzuki, Y.; Hara, N.; Kitamura, M.; Tominaka, T.
1994-01-01
In order to increase the current-carrying capacity of the current path in a superconducting magnet system, parallel circuits such as insulated multi-strand cables or parallel persistent current switches (PCS) are used. In superconducting parallel circuits of an insulated multi-strand cable or a parallel PCS, the current distribution during the current sweep, the persistent mode, and the quench process was investigated. Two methods were used to measure the current distribution. (1) Each strand was surrounded with a pure iron core with an air gap, in which a Hall probe was located. The accuracy of this method was degraded by the magnetic hysteresis of iron. (2) A Rogowski coil without iron was used for the current measurement of each path in a 4-parallel PCS. As a result, it was shown that the current distribution characteristics of a parallel PCS are very similar to those of an insulated multi-strand cable for the quench process.
Parallel processing of structural integrity analysis codes
International Nuclear Information System (INIS)
Swami Prasad, P.; Dutta, B.K.; Kushwaha, H.S.
1996-01-01
Structural integrity analysis plays an important role in assessing and demonstrating the safety of nuclear reactor components. This analysis is performed using analytical tools such as the Finite Element Method (FEM) with the help of digital computers. The complexity of the problems involved in nuclear engineering demands high-speed computation facilities to obtain solutions in a reasonable amount of time. Parallel processing systems such as ANUPAM provide an efficient platform for realising high-speed computation. The development and implementation of software on parallel processing systems is an interesting and challenging task. The data and algorithm structure of the codes plays an important role in exploiting the capabilities of the parallel processing system. Structural analysis codes based on FEM can be divided into two categories with respect to their implementation on parallel processing systems. Codes in the first category, such as those used for harmonic analysis and mechanistic fuel performance codes, do not require parallelisation of individual modules. Codes in the second category, such as conventional FEM codes, require parallelisation of individual modules. In this category, parallelisation of the equation solution module poses major difficulties. Different solution schemes such as the domain decomposition method (DDM), the parallel active column solver and the substructuring method are currently used on parallel processing systems. Two codes, FAIR and TABS, representing these two categories, have been implemented on ANUPAM. The implementation details of these codes and the performance of different equation solvers are highlighted. (author). 5 refs., 12 figs., 1 tab
A New Tool for Intelligent Parallel Processing of Radar/SAR Remotely Sensed Imagery
Directory of Open Access Journals (Sweden)
A. Castillo Atoche
2013-01-01
A novel parallel tool for large-scale image enhancement/reconstruction and postprocessing of radar/SAR sensor systems is presented. The proposed parallel tool performs the following intelligent processing steps: image formation, for the application of different system-level effects of image degradation with a particular remote sensing (RS) system and simulation of random noising effects; enhancement/reconstruction by employing nonparametric robust high-resolution techniques; and image postprocessing using the fuzzy anisotropic diffusion technique, which incorporates a better edge-preserving noise removal effect and a faster diffusion process. This innovative tool allows the processing of high-resolution images provided by different radar/SAR sensor systems as required by RS end users for environmental monitoring, risk prevention, and resource management. To verify the performance of the proposed parallel framework, the processing steps are developed and specifically tested on graphics processing units (GPUs), achieving considerable speedups compared to the serial version of the same techniques implemented in the C language.
International Nuclear Information System (INIS)
Li Liang; Chen Zhiqiang; Xing Yuxiang; Zhang Li; Kang Kejun; Wang Ge
2006-01-01
In recent years, image reconstruction methods for cone-beam computed tomography (CT) have been extensively studied. However, few of these studies discussed computing parallel-beam projections from cone-beam projections. In this paper, we focus on the exact synthesis of complete or incomplete parallel-beam projections from cone-beam projections. First, an extended central slice theorem is described to establish a relationship between the Radon space and the Fourier space. Then, data sufficiency conditions are proposed for computing parallel-beam projection data from cone-beam data. Using these results, a general filtered backprojection algorithm is formulated that can exactly synthesize parallel-beam projection data from cone-beam projection data. As an example, we prove that parallel-beam projections can be exactly synthesized in an angular range in the case of circular cone-beam scanning. Interestingly, this angular range is larger than that derived in the Feldkamp reconstruction framework. Numerical experiments are performed in the circular scanning case to verify our method.
Reconstruction of multiple-pinhole micro-SPECT data using origin ensembles.
Lyon, Morgan C; Sitek, Arkadiusz; Metzler, Scott D; Moore, Stephen C
2016-10-01
The authors are currently developing a dual-resolution multiple-pinhole microSPECT imaging system based on three large NaI(Tl) gamma cameras. Two multiple-pinhole tungsten collimator tubes will be used sequentially for whole-body "scout" imaging of a mouse, followed by high-resolution (hi-res) imaging of an organ of interest, such as the heart or brain. Ideally, the whole-body image will be reconstructed in real time such that data need only be acquired until the area of interest can be visualized well enough to determine positioning for the hi-res scan. The authors investigated the utility of the origin ensemble (OE) algorithm for online and offline reconstructions of the scout data. This algorithm operates directly in image space, and can provide estimates of image uncertainty, along with reconstructed images. Techniques for accelerating the OE reconstruction were also introduced and evaluated. System matrices were calculated for the authors' 39-pinhole scout collimator design. SPECT projections were simulated for a range of count levels using the MOBY digital mouse phantom. Simulated data were used for a comparison of OE and maximum-likelihood expectation maximization (MLEM) reconstructions. The OE algorithm convergence was evaluated by calculating the total-image entropy and by measuring the counts in a volume-of-interest (VOI) containing the heart. Total-image entropy was also calculated for simulated MOBY data reconstructed using OE with various levels of parallelization. For VOI measurements in the heart, liver, bladder, and soft tissue, MLEM and OE reconstructed images agreed within 6%. Image entropy converged after ∼2000 iterations of OE, while the counts in the heart converged earlier at ∼200 iterations of OE. An accelerated version of OE completed 1000 iterations in <9 min for a 6.8M count data set, with some loss of image entropy performance, whereas the same dataset required ∼79 min to complete 1000 iterations of conventional OE. A combination of the two
A distributed multi-GPU system for high speed electron microscopic tomographic reconstruction.
Zheng, Shawn Q; Branlund, Eric; Kesthelyi, Bettina; Braunfeld, Michael B; Cheng, Yifan; Sedat, John W; Agard, David A
2011-07-01
Full resolution electron microscopic tomographic (EMT) reconstruction of large-scale tilt series requires significant computing power. The desire to perform multiple cycles of iterative reconstruction and realignment dramatically increases the pressing need to improve reconstruction performance. This has motivated us to develop a distributed multi-GPU (graphics processing unit) system to provide the required computing power for rapid constrained, iterative reconstructions of very large three-dimensional (3D) volumes. The participating GPUs reconstruct segments of the volume in parallel, and subsequently, the segments are assembled to form the complete 3D volume. Owing to its power and versatility, the CUDA (NVIDIA, USA) platform was selected for GPU implementation of the EMT reconstruction. For a system containing 10 GPUs provided by 5 GTX295 cards, 10 cycles of SIRT reconstruction for a tomogram of 4096² × 512 voxels from an input tilt series containing 122 projection images of 4096² pixels (single precision float) takes a total of 1845 s of which 1032 s are for computation with the remainder being the system overhead. The same system takes only 39 s total to reconstruct 1024² × 256 voxels from 122 1024² pixel projections. While the system overhead is non-trivial, performance analysis indicates that adding extra GPUs to the system would lead to steadily enhanced overall performance. Therefore, this system can be easily expanded to generate superior computing power for very large tomographic reconstructions and especially to empower iterative cycles of reconstruction and realignment. Copyright © 2011 Elsevier B.V. All rights reserved.
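The segment-wise scheme in this abstract can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' CUDA system: worker threads stand in for GPUs, the volume is assumed to split into slices that can be reconstructed independently with one shared system matrix, and the 2 × 2-pixel geometry and all function names are hypothetical. The SIRT update used is the common form x ← x + C Aᵀ R (b − A x), where R and C hold the inverse row and column sums of A.

```python
from concurrent.futures import ThreadPoolExecutor


def sirt(A, b, n_iter=100):
    """SIRT for one slice: x <- x + C A^T R (b - A x)."""
    m, n = len(A), len(A[0])
    row_sum = [sum(A[i]) for i in range(m)]
    col_sum = [sum(A[i][j] for i in range(m)) for j in range(n)]
    x = [0.0] * n
    for _ in range(n_iter):
        # Row-normalised residual R (b - A x).
        r = [(b[i] - sum(A[i][j] * x[j] for j in range(n))) / row_sum[i]
             for i in range(m)]
        # Column-normalised backprojection added to the current estimate.
        x = [x[j] + sum(A[i][j] * r[i] for i in range(m)) / col_sum[j]
             for j in range(n)]
    return x


def reconstruct_segment(A, sinograms):
    """Reconstruct every slice of one segment (the work of one 'GPU')."""
    return [sirt(A, b) for b in sinograms]


def reconstruct_volume(A, sinograms, n_workers=2):
    """Split the slices into contiguous segments, reconstruct them in
    parallel, then assemble the segments in their original order."""
    size = -(-len(sinograms) // n_workers)  # ceiling division
    segments = [sinograms[k:k + size] for k in range(0, len(sinograms), size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(lambda seg: reconstruct_segment(A, seg), segments))
    return [slice_ for part in parts for slice_ in part]


# Toy geometry: six rays through a 2 x 2 image (two rows, two columns, two diagonals).
A = [
    [1, 1, 0, 0], [0, 0, 1, 1],
    [1, 0, 1, 0], [0, 1, 0, 1],
    [1, 0, 0, 1], [0, 1, 1, 0],
]
volume = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.0, 2.0, 1.0], [4.0, 4.0, 1.0, 0.0]]
# Noise-free projections of each slice.
sinograms = [[sum(a * v for a, v in zip(row, s)) for row in A] for s in volume]
recon = reconstruct_volume(A, sinograms, n_workers=2)
```

Because the segments share no state, the only coordination needed in this sketch is the final assembly step. In the system described above, data transfer and assembly are presumably part of the reported overhead, which the figures quoted put at 1845 s − 1032 s = 813 s for the large tomogram.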