WorldWideScience

Sample records for distributed memory computers

  1. Distributed-memory matrix computations

    DEFF Research Database (Denmark)

    Balle, Susanne Mølleskov

    1995-01-01

    ...in these algorithms is that many scientific applications rely heavily on the performance of the involved dense linear algebra building blocks. Even though we consider the distributed-memory as well as the shared-memory programming paradigm, the major part of the thesis is dedicated to distributed-memory architectures. We emphasize distributed-memory massively parallel computers - such as the Connection Machines model CM-200 and model CM-5/CM-5E - available to us at UNI-C and at Thinking Machines Corporation. The CM-200 was, at the time this project started, one of the few existing massively parallel computers...

  2. Distributed-memory matrix computations

    DEFF Research Database (Denmark)

    Balle, Susanne Mølleskov

    1995-01-01

    ...in these algorithms is that many scientific applications rely heavily on the performance of the involved dense linear algebra building blocks. Even though we consider the distributed-memory as well as the shared-memory programming paradigm, the major part of the thesis is dedicated to distributed-memory architectures. We emphasize distributed-memory massively parallel computers - such as the Connection Machines model CM-200 and model CM-5/CM-5E - available to us at UNI-C and at Thinking Machines Corporation. The CM-200 was, at the time this project started, one of the few existing massively parallel computers... performance can we expect to achieve? Why? 2. Solving systems of linear equations using a Strassen-type matrix-inversion algorithm. A good way to solve systems of linear equations on massively parallel computers? 3. Aspects of computing the singular value decomposition on the Connection Machine CM-5/CM-5E...

  3. In-Memory Computing Architectures for Sparse Distributed Memory.

    Science.gov (United States)

    Kang, Mingu; Shanbhag, Naresh R

    2016-08-01

    This paper presents an energy-efficient and high-throughput architecture for Sparse Distributed Memory (SDM)-a computational model of the human brain [1]. The proposed SDM architecture is based on the recently proposed in-memory computing kernel for machine learning applications called Compute Memory (CM) [2], [3]. CM achieves energy and throughput efficiencies by deeply embedding computation into the memory array. SDM-specific techniques such as hierarchical binary decision (HBD) are employed to reduce the delay and energy further. The CM-based SDM (CM-SDM) is a mixed-signal circuit, and hence circuit-aware behavioral, energy, and delay models in a 65 nm CMOS process are developed in order to predict system performance of SDM architectures in the auto- and hetero-associative modes. The delay and energy models indicate that CM-SDM, in general, can achieve up to 25 × and 12 × delay and energy reduction, respectively, over conventional SDM. When classifying 16 × 16 binary images with high noise levels (input bad pixel ratios: 15%-25%) into nine classes, all SDM architectures are able to generate output bad pixel ratios (Bo) ≤ 2%. The CM-SDM exhibits negligible loss in accuracy, i.e., its Bo degradation is within 0.4% as compared to that of the conventional SDM.

  4. Communication Lower Bounds for Distributed-Memory Computations

    DEFF Research Database (Denmark)

    Scquizzato, Michele; Silvestri, Francesco

    2014-01-01

    In this paper we propose a new approach to the study of the communication requirements of distributed computations, which advocates for the removal of the restrictive assumptions under which earlier results were derived. We illustrate our approach by giving tight lower bounds on the communication complexity required to solve several computational problems in a distributed-memory parallel machine, namely standard matrix multiplication, stencil computations, comparison sorting, and the Fast Fourier Transform. Our bounds rely only on a mild assumption on work distribution, and significantly strengthen previous results which require either the computation to be balanced among the processors, or specific initial distributions of the input data, or an upper bound on the size of processors' local memories.

  5. Parallel matrix transpose algorithms on distributed memory concurrent computers

    Energy Technology Data Exchange (ETDEWEB)

    Choi, J.; Walker, D.W. [Oak Ridge National Lab., TN (United States)]; Dongarra, J.J. [Oak Ridge National Lab., TN (United States); Univ. of Tennessee, Knoxville, TN (United States). Dept. of Computer Science]

    1993-10-01

    This paper describes parallel matrix transpose algorithms on distributed memory concurrent processors. It is assumed that the matrix is distributed over a P x Q processor template with a block scattered data distribution. P, Q, and the block size can be arbitrary, so the algorithms have wide applicability. The communication schemes of the algorithms are determined by the greatest common divisor (GCD) of P and Q. If P and Q are relatively prime, the matrix transpose algorithm involves complete exchange communication. If P and Q are not relatively prime, processors are divided into GCD groups and the communication operations are overlapped for different groups of processors. Processors transpose GCD wrapped diagonal blocks simultaneously, and the matrix can be transposed with LCM/GCD steps, where LCM is the least common multiple of P and Q. The algorithms make use of non-blocking, point-to-point communication between processors. The use of nonblocking communication allows a processor to overlap the messages that it sends to different processors, thereby avoiding unnecessary synchronization. Combined with the matrix multiplication routine, C = A·B, the algorithms are used to compute parallel multiplications of transposed matrices, C = Aᵀ·Bᵀ, in the PUMMA package. Details of the parallel implementation of the algorithms are given, and results are presented for runs on the Intel Touchstone Delta computer.
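
    To make the step counting above concrete, the following Python sketch (hypothetical, not the authors' code) computes the schedule sizes for a P x Q template; the grouping rule used for the GCD groups is an illustrative assumption.

    ```python
    # Hypothetical sketch: derive the schedule sizes described above for a
    # P x Q processor template. The grouping rule (p - q) mod GCD is an
    # illustrative guess at the "GCD wrapped diagonal" groups.
    from math import gcd

    def transpose_schedule(P, Q):
        g = gcd(P, Q)
        lcm = P * Q // g
        steps = lcm // g                      # matrix transposed in LCM/GCD steps
        groups = [[(p, q) for p in range(P) for q in range(Q) if (p - q) % g == r]
                  for r in range(g)]          # processors split into GCD groups
        return g, steps, groups

    g, steps, groups = transpose_schedule(P=4, Q=6)
    print(g, steps, [len(grp) for grp in groups])   # 2 6 [12, 12]
    # With P and Q relatively prime, g == 1 and every processor exchanges data
    # with every other one: the complete-exchange case mentioned above.
    ```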

  6. Adaptive Dynamic Process Scheduling on Distributed Memory Parallel Computers

    Directory of Open Access Journals (Sweden)

    Wei Shu

    1994-01-01

    Full Text Available One of the challenges in programming distributed memory parallel machines is deciding how to allocate work to processors. This problem is particularly important for computations with unpredictable dynamic behaviors or irregular structures. We present a scheme for dynamic scheduling of medium-grained processes that is useful in this context. The adaptive contracting within neighborhood (ACWN) is a dynamic, distributed, load-dependent, and scalable scheme. It deals with dynamic and unpredictable creation of processes and adapts to different systems. The scheme is described and contrasted with two other schemes that have been proposed in this context, namely the randomized allocation and the gradient model. The performance of the three schemes on an Intel iPSC/2 hypercube is presented and analyzed. The experimental results show that even though the ACWN algorithm incurs somewhat larger overhead than the randomized allocation, it achieves better performance in most cases due to its adaptiveness. Its feature of quickly spreading the work helps it outperform the gradient model in performance and scalability.

  7. Parallelizing Sylvester-like operations on a distributed memory computer

    Energy Technology Data Exchange (ETDEWEB)

    Hu, D.Y.; Sorensen, D.C. [Rice Univ., Houston, TX (United States)

    1994-12-31

    Discretization of linear operators arising in applied mathematics often leads to matrices with the following structure: M(x) = (D ⊗ A + B ⊗ I_n + V)x, where x ∈ R^(mn), B, D ∈ R^(n×n), A ∈ R^(m×m) and V ∈ R^(mn×mn); both D and V are diagonal. For notational convenience, the authors assume that both A and B are symmetric. All the results in this paper can be easily extended to the case of general A and B. The linear operator on R^(mn) defined above can be viewed as a generalization of the Sylvester operator: S(x) = (I_m ⊗ A + B ⊗ I_n)x. The authors therefore refer to it as a Sylvester-like operator. The schemes discussed in this paper therefore also apply to the Sylvester operator. In this paper, the authors present the SIMD scheme for parallelization of the Sylvester-like operator on a distributed memory computer. This scheme is designed to approach the best possible efficiency by avoiding unnecessary communication among processors.

  8. Numerics of High Performance Computers and Benchmark Evaluation of Distributed Memory Computers

    Directory of Open Access Journals (Sweden)

    H. S. Krishna

    2004-07-01

    Full Text Available The internal representation of numerical data, their speed of manipulation to generate the desired result through efficient utilisation of central processing unit, memory, and communication links are essential steps of all high performance scientific computations. Machine parameters, in particular, reveal accuracy and error bounds of computation, required for performance tuning of codes. This paper reports diagnosis of machine parameters, measurement of computing power of several workstations, serial and parallel computers, and a component-wise test procedure for distributed memory computers. Hierarchical memory structure is illustrated by block copying and unrolling techniques. Locality of reference for cache reuse of data is amply demonstrated by fast Fourier transform codes. Cache and register-blocking technique results in their optimum utilisation with consequent gain in throughput during vector-matrix operations. Implementation of these memory management techniques reduces cache inefficiency loss, which is known to be proportional to the number of processors. Of the two Linux clusters-ANUP16, HPC22 and HPC64, it has been found from the measurement of intrinsic parameters and from application benchmark of multi-block Euler code test run that ANUP16 is suitable for problems that exhibit fine-grained parallelism. The delivered performance of ANUP16 is of immense utility for developing high-end PC clusters like HPC64 and customised parallel computers with added advantage of speed and high degree of parallelism.

  9. ScaLAPACK: A scalable linear algebra library for distributed memory concurrent computers

    Energy Technology Data Exchange (ETDEWEB)

    Choi, Jaeyoung; Walker, D.W. (Oak Ridge National Lab., TN (United States)); Dongarra, J.J. (Oak Ridge National Lab., TN (United States); Tennessee Univ., Knoxville, TN (United States). Dept. of Computer Science); Pozo, R. (Tennessee Univ., Knoxville, TN (United States). Dept. of Computer Science)

    1992-01-01

    This paper describes ScaLAPACK, a distributed memory version of the LAPACK software package for dense and banded matrix computations. Key design features are the use of distributed versions of the Level 3 BLAS as building blocks, and an object-based interface to the library routines. The square block scattered decomposition is described. The implementation of a distributed memory version of the right-looking LU factorization algorithm on the Intel Delta multicomputer is discussed, and performance results are presented that demonstrate the scalability of the algorithm.

  10. ScaLAPACK: A scalable linear algebra library for distributed memory concurrent computers

    Energy Technology Data Exchange (ETDEWEB)

    Choi, Jaeyoung; Walker, D.W. [Oak Ridge National Lab., TN (United States)]; Dongarra, J.J. [Oak Ridge National Lab., TN (United States); Tennessee Univ., Knoxville, TN (United States). Dept. of Computer Science]; Pozo, R. [Tennessee Univ., Knoxville, TN (United States). Dept. of Computer Science]

    1992-09-01

    This paper describes ScaLAPACK, a distributed memory version of the LAPACK software package for dense and banded matrix computations. Key design features are the use of distributed versions of the Level 3 BLAS as building blocks, and an object-based interface to the library routines. The square block scattered decomposition is described. The implementation of a distributed memory version of the right-looking LU factorization algorithm on the Intel Delta multicomputer is discussed, and performance results are presented that demonstrate the scalability of the algorithm.

  11. The design of a standard message passing interface for distributed memory concurrent computers

    Energy Technology Data Exchange (ETDEWEB)

    Walker, D.W.

    1993-10-01

    This paper presents an overview of MPI, a proposed standard message passing interface for MIMD distributed memory concurrent computers. The design of MPI has been a collective effort involving researchers in the United States and Europe from many organizations and institutions. MPI includes point-to-point and collective communication routines, as well as support for process groups, communication contexts, and application topologies. While making use of new ideas where appropriate, the MPI standard is based largely on current practice.
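
    As a concrete illustration of the point-to-point and collective routines mentioned above, here is a minimal sketch using the mpi4py bindings (assuming an MPI installation is available); it is an illustrative example, not part of the MPI standard itself.

    ```python
    # Minimal mpi4py sketch of the two communication styles mentioned above:
    # point-to-point send/recv and a collective reduction over a communicator.
    # Run with, e.g.: mpiexec -n 4 python mpi_demo.py
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    # point-to-point: rank 0 sends a message to rank 1
    if rank == 0 and size > 1:
        comm.send({"payload": 42}, dest=1, tag=7)
    elif rank == 1:
        data = comm.recv(source=0, tag=7)
        print("rank 1 received", data)

    # collective: every rank contributes to a global sum
    total = comm.allreduce(rank, op=MPI.SUM)
    if rank == 0:
        print("sum of ranks =", total)
    ```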

  12. High-Performance Computation of Distributed-Memory Parallel 3D Voronoi and Delaunay Tessellation

    Energy Technology Data Exchange (ETDEWEB)

    Peterka, Tom; Morozov, Dmitriy; Phillips, Carolyn

    2014-11-14

    Computing a Voronoi or Delaunay tessellation from a set of points is a core part of the analysis of many simulated and measured datasets: N-body simulations, molecular dynamics codes, and LIDAR point clouds are just a few examples. Such computational geometry methods are common in data analysis and visualization; but as the scale of simulations and observations surpasses billions of particles, the existing serial and shared-memory algorithms no longer suffice. A distributed-memory scalable parallel algorithm is the only feasible approach. The primary contribution of this paper is a new parallel Delaunay and Voronoi tessellation algorithm that automatically determines which neighbor points need to be exchanged among the subdomains of a spatial decomposition. Other contributions include periodic and wall boundary conditions, comparison of our method using two popular serial libraries, and application to numerous science datasets.

  13. pyCTQW: A continuous-time quantum walk simulator on distributed memory computers

    Science.gov (United States)

    Izaac, Josh A.; Wang, Jingbo B.

    2015-01-01

    In the general field of quantum information and computation, quantum walks are playing an increasingly important role in constructing physical models and quantum algorithms. We have recently developed a distributed memory software package pyCTQW, with an object-oriented Python interface, that allows efficient simulation of large multi-particle CTQW (continuous-time quantum walk)-based systems. In this paper, we present an introduction to the Python and Fortran interfaces of pyCTQW, discuss various numerical methods of calculating the matrix exponential, and demonstrate the performance behavior of pyCTQW on a distributed memory cluster. In particular, the Chebyshev and Krylov-subspace methods for calculating the quantum walk propagation are provided, as well as methods for visualization and data analysis.

  14. Computational cost of isogeometric multi-frontal solvers on parallel distributed memory machines

    KAUST Repository

    Woźniak, Maciej

    2015-02-01

    This paper derives theoretical estimates of the computational cost for an isogeometric multi-frontal direct solver executed on parallel distributed memory machines. We show theoretically that for C^(p-1) global continuity of the isogeometric solution, both the computational cost and the communication cost of a direct solver are of order O(log(N) p^2) for the one dimensional (1D) case, O(N p^2) for the two dimensional (2D) case, and O(N^(4/3) p^2) for the three dimensional (3D) case, where N is the number of degrees of freedom and p is the polynomial order of the B-spline basis functions. The theoretical estimates are verified by numerical experiments performed with three parallel multi-frontal direct solvers: MUMPS, PaStiX and SuperLU, available through the PETIGA toolkit built on top of PETSc. Numerical results confirm these theoretical estimates both in terms of p and N. For a given problem size, the strong efficiency rapidly decreases as the number of processors increases, becoming about 20% for 256 processors for a 3D example with 128^3 unknowns and linear B-splines with C^0 global continuity, and 15% for a 3D example with 64^3 unknowns and quartic B-splines with C^3 global continuity. At the same time, one cannot arbitrarily increase the problem size, since the memory required by higher order continuity spaces is large, quickly consuming all the available memory resources even in the parallel distributed memory version. Numerical results also suggest that the use of distributed parallel machines is highly beneficial when solving higher order continuity spaces, although the number of processors that one can efficiently employ is somehow limited.
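
    The asymptotic estimates quoted above can be tabulated with a few lines of Python; the function below simply restates the O(...) expressions with constants omitted, for a quick comparison of 1D/2D/3D scaling in N and p.

    ```python
    # Order-of-magnitude restatement of the cost estimates quoted above
    # (constants omitted), per spatial dimension.
    from math import log

    def multifrontal_cost(N, p, dim):
        """Asymptotic cost estimate O(...) as a function of N, p and dimension."""
        if dim == 1:
            return log(N) * p**2
        if dim == 2:
            return N * p**2
        if dim == 3:
            return N**(4.0 / 3.0) * p**2
        raise ValueError("dim must be 1, 2 or 3")

    for dim in (1, 2, 3):
        print(dim, multifrontal_cost(N=64**3, p=4, dim=dim))
    ```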

  15. PUMMA: Parallel Universal Matrix Multiplication Algorithms on distributed memory concurrent computers

    Energy Technology Data Exchange (ETDEWEB)

    Choi, Jaeyoung; Walker, D.W. [Oak Ridge National Lab., TN (US)]; Dongarra, J.J. [Oak Ridge National Lab., TN (US); Univ. of Tennessee, Knoxville, TN (US). Dept. of Computer Science]

    1993-08-01

    This paper describes the Parallel Universal Matrix Multiplication Algorithms (PUMMA) on distributed memory concurrent computers. The PUMMA package includes not only the non-transposed matrix multiplication routine C = A·B, but also transposed multiplication routines C = Aᵀ·B, C = A·Bᵀ, and C = Aᵀ·Bᵀ, for a block scattered data distribution. The routines perform efficiently for a wide range of processor configurations and block sizes. Together, the PUMMA routines provide the same functionality as the Level 3 BLAS routine xGEMM. Details of the parallel implementation of the routines are given, and results are presented for runs on the Intel Touchstone Delta computer.
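
    For orientation, the following serial numpy/scipy sketch shows the four product variants and the xGEMM transpose flags referred to above; it illustrates the functionality only, not PUMMA's parallel block-scattered implementation.

    ```python
    # Serial illustration of the four products that PUMMA provides in
    # distributed form, plus the equivalent xGEMM call via scipy's BLAS wrapper.
    import numpy as np
    from scipy.linalg.blas import dgemm

    rng = np.random.default_rng(0)
    n = 4
    A = rng.standard_normal((n, n))
    B = rng.standard_normal((n, n))

    C1 = A @ B        # C = A·B
    C2 = A.T @ B      # C = Aᵀ·B
    C3 = A @ B.T      # C = A·Bᵀ
    C4 = A.T @ B.T    # C = Aᵀ·Bᵀ

    # xGEMM covers all four cases through its transpose flags:
    assert np.allclose(C4, dgemm(1.0, A, B, trans_a=True, trans_b=True))
    ```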

  16. A new parallel-vector finite element analysis software on distributed-memory computers

    Science.gov (United States)

    Qin, Jiangning; Nguyen, Duc T.

    1993-01-01

    A new parallel-vector finite element analysis software package MPFEA (Massively Parallel-vector Finite Element Analysis) is developed for large-scale structural analysis on massively parallel computers with distributed memory. MPFEA is designed for parallel generation and assembly of the global finite element stiffness matrices as well as parallel solution of the simultaneous linear equations, since these are often the major time-consuming parts of a finite element analysis. A block-skyline storage scheme along with vector-unrolling techniques is used to enhance the vector performance. Communications among processors are carried out concurrently with arithmetic operations to reduce the total execution time. Numerical results on the Intel iPSC/860 computers (such as the Intel Gamma with 128 processors and the Intel Touchstone Delta with 512 processors) are presented, including an aircraft structure and some very large truss structures, to demonstrate the efficiency and accuracy of MPFEA.

  17. ClimateSpark: An In-memory Distributed Computing Framework for Big Climate Data Analytics

    Science.gov (United States)

    Hu, F.; Yang, C. P.; Duffy, D.; Schnase, J. L.; Li, Z.

    2016-12-01

    Massive array-based climate data is being generated from global surveillance systems and model simulations. These data are widely used to analyze environmental problems, such as climate change, natural hazards, and public health. However, extracting the underlying information from these big climate datasets is challenging due to both data- and computing-intensive issues in data processing and analysis. To tackle these challenges, this paper proposes ClimateSpark, an in-memory distributed computing framework to support big climate data processing. In ClimateSpark, a spatiotemporal index is developed to enable Apache Spark to treat array-based climate data (e.g. netCDF4, HDF4) as native formats, which are stored in the Hadoop Distributed File System (HDFS) without any preprocessing. Based on the index, spatiotemporal query services are provided to retrieve datasets according to a defined geospatial and temporal bounding box. The data subsets are read out, and a data partition strategy is applied to split the queried data evenly across the computing nodes and store them in memory as climateRDDs for processing. By leveraging Spark SQL and User Defined Functions (UDFs), climate data analysis operations can be conducted in an intuitive SQL language. ClimateSpark is evaluated with two use cases using the NASA Modern-Era Retrospective Analysis for Research and Applications (MERRA) climate reanalysis dataset. One use case conducts a spatiotemporal query and visualizes the subset results as an animation; the other compares different climate model outputs using a Taylor-diagram service. Experimental results show that ClimateSpark can significantly accelerate data query and processing, and enables complex analysis services to be served in a SQL-style fashion.
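
    The SQL-plus-UDF style of analysis described above can be sketched generically with PySpark; the table name, column names and file path below (grid_cells, temperature, lat, lon, time) are hypothetical illustrations and are not ClimateSpark's actual API.

    ```python
    # Generic PySpark sketch of a spatiotemporal bounding-box query with a UDF,
    # in the spirit of the SQL-style analysis described above (hypothetical schema).
    from pyspark.sql import SparkSession
    from pyspark.sql.types import DoubleType

    spark = SparkSession.builder.appName("climate-sketch").getOrCreate()

    # pre-extracted array chunks stored as a columnar table (hypothetical path)
    df = spark.read.parquet("hdfs:///data/grid_cells")
    df.createOrReplaceTempView("grid_cells")

    # register a small UDF, e.g. Kelvin to Celsius
    spark.udf.register("to_celsius", lambda k: k - 273.15, DoubleType())

    result = spark.sql("""
        SELECT time, AVG(to_celsius(temperature)) AS mean_t_celsius
        FROM grid_cells
        WHERE lat BETWEEN 30 AND 60 AND lon BETWEEN -10 AND 40
          AND time BETWEEN '1980-01-01' AND '1989-12-31'
        GROUP BY time ORDER BY time
    """)
    result.show()
    ```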

  18. Distributed Computing.

    Science.gov (United States)

    Ryland, Jane N.

    1988-01-01

    The microcomputer revolution, in which small and large computers have gained tremendously in capability, has created a distributed computing environment. This circumstance presents administrators with the opportunities and the dilemmas of choosing appropriate computing resources for each situation. (Author/MSE)

  19. Distributed computing

    CERN Document Server

    Van Renesse, R

    1991-01-01

    This series will start with an introduction to distributed computing systems. Distributed computing paradigms will be presented followed by a discussion on how several important contemporary distributed operating systems use these paradigms. Topics will include processing paradigms, storage paradigms, scalability and robustness. Throughout the course everything will be illustrated by modern distributed systems notably the Amoeba distributed operating system of the Free University in Amsterdam and the Plan 9 operating system of AT&T Bell Laboratories. Plan 9 is partly designed and implemented by Ken Thompson, the main person behind the successful UNIX operating system.

  20. Distributed computing and data storage in proteomics: many hands make light work, and a stronger memory.

    Science.gov (United States)

    Verheggen, Kenneth; Barsnes, Harald; Martens, Lennart

    2014-03-01

    Modern day proteomics generates ever more complex data, causing the requirements on the storage and processing of such data to outgrow the capacity of most desktop computers. To cope with the increased computational demands, distributed architectures have gained substantial popularity in recent years. In this review, we provide an overview of the current techniques for distributed computing, along with examples of how the techniques are currently being employed in the field of proteomics. We thus underline the benefits of distributed computing in proteomics, while also pointing out the potential issues and pitfalls involved.

  1. Efficient implementation of a multidimensional fast fourier transform on a distributed-memory parallel multi-node computer

    Science.gov (United States)

    Bhanot, Gyan V.; Chen, Dong; Gara, Alan G.; Giampapa, Mark E.; Heidelberger, Philip; Steinmacher-Burow, Burkhard D.; Vranas, Pavlos M.

    2008-01-01

    The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.

  2. Efficient implementation of multidimensional fast fourier transform on a distributed-memory parallel multi-node computer

    Science.gov (United States)

    Bhanot, Gyan V [Princeton, NJ; Chen, Dong [Croton-On-Hudson, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY

    2012-01-10

    The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.
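
    The transpose-based pattern described in these two records can be sketched with mpi4py and numpy as below; this sketch uses the standard deterministic Alltoall rather than the patented random-order redistribution, and is only an illustration of the general technique (local 1-D FFTs, all-to-all transpose, second set of 1-D FFTs).

    ```python
    # Sketch of a distributed 2-D FFT: FFT along the locally complete dimension,
    # all-to-all tile exchange to transpose the distribution, then FFT again.
    # Run under MPI; the global size N is chosen divisible by the number of ranks.
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    P, r = comm.Get_size(), comm.Get_rank()
    N = 4 * P                                   # global N x N array
    m = N // P                                  # rows owned by each rank

    A = np.arange(N * N, dtype=complex).reshape(N, N)   # same test array on every rank
    local = A[r * m:(r + 1) * m, :].copy()               # this rank's row block

    local = np.fft.fft(local, axis=1)           # first 1-D FFT along the local dimension

    # all-to-all: ship m x m tiles so the other dimension becomes local (a transpose)
    send = np.ascontiguousarray(local.reshape(m, P, m).transpose(1, 0, 2))
    recv = np.empty_like(send)
    comm.Alltoall(send, recv)
    localT = np.ascontiguousarray(recv.transpose(2, 0, 1)).reshape(m, N)

    localT = np.fft.fft(localT, axis=1)         # second 1-D FFT; result is fft2(A).T, row-distributed

    gathered = np.empty((N, N), dtype=complex) if r == 0 else None
    comm.Gather(localT, gathered, root=0)
    if r == 0:
        assert np.allclose(gathered, np.fft.fft2(A).T)
        print("distributed 2-D FFT matches numpy fft2")
    ```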

  3. Computational principles of memory.

    Science.gov (United States)

    Chaudhuri, Rishidev; Fiete, Ila

    2016-03-01

    The ability to store and later use information is essential for a variety of adaptive behaviors, including integration, learning, generalization, prediction and inference. In this Review, we survey theoretical principles that can allow the brain to construct persistent states for memory. We identify requirements that a memory system must satisfy and analyze existing models and hypothesized biological substrates in light of these requirements. We also highlight open questions, theoretical puzzles and problems shared with computer science and information theory.

  4. Solution of large nonlinear quasistatic structural mechanics problems on distributed-memory multiprocessor computers

    Energy Technology Data Exchange (ETDEWEB)

    Blanford, M. [Sandia National Labs., Albuquerque, NM (United States)

    1997-12-31

    Most commercially-available quasistatic finite element programs assemble element stiffnesses into a global stiffness matrix, then use a direct linear equation solver to obtain nodal displacements. However, for large problems (greater than a few hundred thousand degrees of freedom), the memory size and computation time required for this approach becomes prohibitive. Moreover, direct solution does not lend itself to the parallel processing needed for today's multiprocessor systems. This talk gives an overview of the iterative solution strategy of JAS3D, the nonlinear large-deformation quasistatic finite element program. Because its architecture is derived from an explicit transient-dynamics code, it does not ever assemble a global stiffness matrix. The author describes the approach he used to implement the solver on multiprocessor computers, and shows examples of problems run on hundreds of processors and more than a million degrees of freedom. Finally, he describes some of the work he is presently doing to address the challenges of iterative convergence for ill-conditioned problems.
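
    The matrix-free idea described above (the iterative solver only ever needs the action of the stiffness operator, never the assembled matrix) can be illustrated with a scipy LinearOperator; the 1-D operator below is a stand-in for a real element-by-element finite element operator, not JAS3D's implementation.

    ```python
    # Matrix-free iterative solve: the solver receives only a matvec callback,
    # so no global stiffness matrix is ever assembled.
    import numpy as np
    from scipy.sparse.linalg import LinearOperator, cg

    n = 1000

    def apply_stiffness(u):
        # action of a tridiagonal "stiffness" operator (1-D Laplacian stand-in)
        Ku = 2.0 * u
        Ku[:-1] -= u[1:]
        Ku[1:] -= u[:-1]
        return Ku

    K = LinearOperator((n, n), matvec=apply_stiffness)
    f = np.ones(n)
    u, info = cg(K, f)
    print("cg info:", info, "residual:", np.linalg.norm(apply_stiffness(u) - f))
    ```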

  5. 3-D parallel program for numerical calculation of gas dynamics problems with heat conductivity on distributed memory computational systems (CS)

    Energy Technology Data Exchange (ETDEWEB)

    Sofronov, I.D.; Voronin, B.L.; Butnev, O.I. [VNIIEF (Russian Federation)] [and others]

    1997-12-31

    The aim of the work performed is to develop a 3D parallel program for the numerical calculation of gas dynamics problems with heat conductivity on distributed memory computational systems (CS), satisfying the condition that the numerical results be independent of the number of processors involved. Two basically different approaches to the structure of massively parallel computations have been developed. The first approach uses a 3D data matrix decomposition reconstructed at each temporal cycle and is a development of parallelization algorithms for multiprocessor CS with shared memory. The second approach is based on using a 3D data matrix decomposition that is not reconstructed during a temporal cycle. The program was developed on the 8-processor CS MP-3 made at VNIIEF and was adapted to the massively parallel CS Meiko-2 at LLNL by joint efforts of the VNIIEF and LLNL staffs. A large number of numerical experiments has been carried out with different numbers of processors up to 256, and the efficiency of parallelization has been evaluated as a function of processor number and parameters.

  6. Time complexity analysis for distributed memory computers: implementation of parallel conjugate gradient method

    NARCIS (Netherlands)

    Hoekstra, A.G.; Sloot, P.M.A.; Haan, M.J.; Hertzberger, L.O.; van Leeuwen, J.

    1991-01-01

    New developments in Computer Science, both hardware and software, offer researchers, such as physicists, unprecedented possibilities to solve their computationally intensive problems. However, full exploitation of e.g. new massively parallel computers, parallel languages or runtime environments require

  7. Distributed Computing: An Overview

    OpenAIRE

    Md. Firoj Ali; Rafiqul Zaman Khan

    2015-01-01

    Decrease in hardware costs and advances in computer networking technologies have led to increased interest in the use of large-scale parallel and distributed computing systems. Distributed computing systems offer the potential for improved performance and resource sharing. In this paper we give an overview of distributed computing. We study the difference between parallel and distributed computing, terminologies used in distributed computing, task allocation in distribute...

  8. Distributed memory compiler design for sparse problems

    Science.gov (United States)

    Wu, Janet; Saltz, Joel; Berryman, Harry; Hiranandani, Seema

    1991-01-01

    A compiler and runtime support mechanism is described and demonstrated. The methods presented are capable of solving a wide range of sparse and unstructured problems in scientific computing. The compiler takes as input a FORTRAN 77 program enhanced with specifications for distributing data, and the compiler outputs a message passing program that runs on a distributed memory computer. The runtime support for this compiler is a library of primitives designed to efficiently support irregular patterns of distributed array accesses and irregular distributed array partitions. A variety of Intel iPSC/860 performance results obtained through the use of this compiler are presented.

  9. Program Facilitates Distributed Computing

    Science.gov (United States)

    Hui, Joseph

    1993-01-01

    KNET computer program facilitates distribution of computing between UNIX-compatible local host computer and remote host computer, which may or may not be UNIX-compatible. Capable of automatic remote log-in. User communicates interactively with remote host computer. Data output from remote host computer directed to local screen, to local file, and/or to local process. Conversely, data input from keyboard, local file, or local process directed to remote host computer. Written in ANSI standard C language.

  10. Distributed computing in bioinformatics.

    Science.gov (United States)

    Jain, Eric

    2002-01-01

    This paper provides an overview of methods and current applications of distributed computing in bioinformatics. Distributed computing is a strategy of dividing a large workload among multiple computers to reduce processing time, or to make use of resources such as programs and databases that are not available on all computers. Participating computers may be connected either through a local high-speed network or through the Internet.

  11. Distributed computer control systems

    Energy Technology Data Exchange (ETDEWEB)

    Suski, G.J.

    1986-01-01

    This book focuses on recent advances in the theory, applications and techniques for distributed computer control systems. Contents (partial): Real-time distributed computer control in a flexible manufacturing system. Semantics and implementation problems of channels in a DCCS specification. Broadcast protocols in distributed computer control systems. Design considerations of distributed control architecture for a thermal power plant. The conic toolset for building distributed systems. Network management issues in distributed control systems. Interprocessor communication system architecture in a distributed control system environment. Uni-level homogenous distributed computer control system and optimal system design. A-nets for DCCS design. A methodology for the specification and design of fault tolerant real time systems. An integrated computer control system - architecture design, engineering methodology and practical experience.

  12. Intelligent distributed computing

    CERN Document Server

    Thampi, Sabu

    2015-01-01

    This book contains a selection of refereed and revised papers of the Intelligent Distributed Computing Track originally presented at the third International Symposium on Intelligent Informatics (ISI-2014), September 24-27, 2014, Delhi, India.  The papers selected for this Track cover several Distributed Computing and related topics including Peer-to-Peer Networks, Cloud Computing, Mobile Clouds, Wireless Sensor Networks, and their applications.

  13. Shark: Fast Data Analysis Using Coarse-grained Distributed Memory

    Science.gov (United States)

    2013-05-01

    Shark: Fast Data Analysis Using Coarse-grained Distributed Memory. Clifford Engle, Electrical Engineering and Computer Sciences, University of ... Dates covered: 2013. Distribution/availability statement: Approved for public release; distribution unlimited. Abstract: Shark is a

  14. Combined shared and distributed memory ab-initio computations of molecular-hydrogen systems in the correlated state: Process pool solution and two-level parallelism

    Science.gov (United States)

    Biborski, Andrzej; Kądzielawa, Andrzej P.; Spałek, Józef

    2015-12-01

    An efficient computational scheme devised for investigations of the ground state properties of electronically correlated systems is presented. As an example, an (H2)n chain is considered with the long-range electron-electron interactions taken into account. The implemented procedure covers: (i) single-particle Wannier wave-function basis construction in the correlated state, (ii) microscopic parameter calculation, and (iii) ground state energy optimization. The optimization loop is based on a highly effective process-pool solution - a specific root-workers approach. Hierarchical, two-level parallelism was applied: both shared (by use of Open Multi-Processing) and distributed (by use of Message Passing Interface) memory models were utilized. We discuss in detail how this approach results in a substantial increase of the calculation speed, reaching a factor of 300 for the fully parallelized solution. The scheme elaborated in detail reflects the situation in which the most demanding task is the single-particle basis optimization.
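
    The root-workers (process-pool) pattern mentioned above can be sketched with mpi4py.futures; the toy eigenvalue evaluation below is a hypothetical stand-in for the authors' single-particle basis optimization, and numpy's threaded BLAS plays the role of the shared-memory level inside each worker.

    ```python
    # Root-workers sketch: a root process farms out independent evaluations to
    # MPI worker processes. Launch with:
    #   mpiexec -n 5 python -m mpi4py.futures pool_sketch.py
    import numpy as np
    from mpi4py.futures import MPIPoolExecutor

    def ground_state_energy(alpha):
        """Stand-in for one expensive evaluation of a parameterized toy Hamiltonian."""
        rng = np.random.default_rng(0)
        h = rng.standard_normal((200, 200))
        h = 0.5 * (h + h.T) + alpha * np.eye(200)
        return float(np.linalg.eigvalsh(h)[0])    # lowest eigenvalue

    if __name__ == "__main__":
        alphas = np.linspace(0.1, 2.0, 32)
        with MPIPoolExecutor() as pool:            # workers = MPI ranks beyond the root
            energies = list(pool.map(ground_state_energy, alphas))
        best = min(zip(energies, alphas))
        print("minimum energy %.4f at alpha = %.3f" % best)
    ```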

  15. ATLAS Distributed Computing Automation

    CERN Document Server

    Schovancova, J; The ATLAS collaboration; Borrego, C; Campana, S; Di Girolamo, A; Elmsheuser, J; Hejbal, J; Kouba, T; Legger, F; Magradze, E; Medrano Llamas, R; Negri, G; Rinaldi, L; Sciacca, G; Serfon, C; Van Der Ster, D C

    2012-01-01

    The ATLAS Experiment benefits from computing resources distributed worldwide at more than 100 WLCG sites. The ATLAS Grid sites provide over 100k CPU job slots, over 100 PB of storage space on disk or tape. Monitoring of status of such a complex infrastructure is essential. The ATLAS Grid infrastructure is monitored 24/7 by two teams of shifters distributed world-wide, by the ATLAS Distributed Computing experts, and by site administrators. In this paper we summarize automation efforts performed within the ATLAS Distributed Computing team in order to reduce manpower costs and improve the reliability of the system. Different aspects of the automation process are described: from the ATLAS Grid site topology provided by the ATLAS Grid Information System, via automatic site testing by the HammerCloud, to automatic exclusion from production or analysis activities.

  16. Total recall in distributive associative memories

    Science.gov (United States)

    Danforth, Douglas G.

    1991-01-01

    Iterative error correction of asymptotically large associative memories is equivalent to a one-step learning rule. This rule is the inverse of the activation function of the memory. Spectral representations of nonlinear activation functions are used to obtain the inverse in closed form for Sparse Distributed Memory, Selected-Coordinate Design, and Radial Basis Functions.

  17. The Science of Computing: Virtual Memory

    Science.gov (United States)

    Denning, Peter J.

    1986-01-01

    In the March-April issue, I described how a computer's storage system is organized as a hierarchy consisting of cache, main memory, and secondary memory (e.g., disk). The cache and main memory form a subsystem that functions like main memory but attains speeds approaching cache. What happens if a program and its data are too large for the main memory? This is not a frivolous question. Every generation of computer users has been frustrated by insufficient memory. A new line of computers may have sufficient storage for the computations of its predecessor, but new programs will soon exhaust its capacity. In 1960, a long-range planning committee at MIT dared to dream of a computer with 1 million words of main memory. In 1985, the Cray-2 was delivered with 256 million words. Computational physicists dream of computers with 1 billion words. Computer architects have done an outstanding job of enlarging main memories yet they have never kept up with demand. Only the shortsighted believe they can.

  18. Reliability of computer memories in radiation environment

    Directory of Open Access Journals (Sweden)

    Fetahović Irfan S.

    2016-01-01

    Full Text Available The aim of this paper is to examine the radiation hardness of magnetic (Toshiba MK4007 GAL) and semiconductor (AT 27C010 EPROM and AT 28C010 EEPROM) computer memories in a radiation field. The magnetic memories have been examined in a neutron radiation field, and the semiconductor memories in a gamma radiation field. The obtained results have shown a high radiation hardness of magnetic memories. On the other hand, it has been shown that semiconductor memories are significantly more sensitive and that radiation can lead to significant damage of their functionality. [Projekat Ministarstva nauke Republike Srbije, br. 171007

  19. The parallel processing of EGS4 code on distributed memory scalar parallel computer:Intel Paragon XP/S15-256

    Energy Technology Data Exchange (ETDEWEB)

    Takemiya, Hiroshi; Ohta, Hirofumi; Honma, Ichirou

    1996-03-01

    The parallelization of the electromagnetic cascade Monte Carlo simulation code EGS4 on the distributed memory scalar parallel computer Intel Paragon XP/S15-256 is described. EGS4 has the feature that the calculation time for one incident particle differs considerably from particle to particle because of the dynamic generation of secondary particles and the different behavior of each particle. Granularity for parallel processing, the parallel programming model, and the algorithm of parallel random number generation are discussed, and two kinds of methods, which allocate particles either dynamically or statically, are used for the purpose of realizing high-speed parallel processing of this code. Among the four problems chosen for performance evaluation, the speedup factors for three problems reach nearly 100 with 128 processors. It has been found that when both the calculation time for each incident particle and its dispersion are large, it is preferable to use the dynamic particle allocation method, which can average the load over the processors. It has also been found that when they are small, it is preferable to use the static particle allocation method, which reduces the communication overhead. Moreover, it is pointed out that to obtain accurate results, it is necessary to use double precision variables in the EGS4 code. Finally, the workflow of program parallelization is analyzed and tools for program parallelization are discussed through the experience of the EGS4 parallelization. (author).

  20. Simplified Distributed Computing

    Science.gov (United States)

    Li, G. G.

    2006-05-01

    Distributed computing ranges from high performance parallel computing and GRID computing to an environment where idle CPU cycles and storage space of numerous networked systems are harnessed to work together through the Internet. In this work we focus on building an easy and affordable solution for computationally intensive problems in scientific applications based on existing technology and hardware resources. This system consists of a series of controllers. When a job request is detected by a monitor or initialized by an end user, the job manager launches the specific job handler for this job. The job handler pre-processes the job, partitions the job into relatively independent tasks, and distributes the tasks into the processing queue. The task handler picks up the related tasks, processes the tasks, and puts the results back into the processing queue. The job handler also monitors and examines the tasks and the results, and assembles the task results into the overall solution for the job request when all tasks are finished for each job. A resource manager configures and monitors all participating nodes. A distributed agent is deployed on all participating nodes to manage the software download and report the status. The processing queue is the key to the success of this distributed system. We use BEA's Weblogic JMS queue in our implementation. It guarantees message delivery and has message priority and re-try features so that tasks never get lost. The entire system is built on J2EE technology and it can be deployed on heterogeneous platforms. It can handle algorithms and applications developed in any language on any platform. J2EE adaptors are provided to manage and communicate with existing applications so that applications and algorithms running on Unix, Linux and Windows can all work together. This system is easy and fast to develop based on the industry's well-adopted technology. It is highly scalable and heterogeneous. It is

  1. Computing betweenness centrality in external memory

    DEFF Research Database (Denmark)

    Arge, Lars; Goodrich, Michael T.; Walderveen, Freek van

    2013-01-01

    Betweenness centrality is one of the most well-known measures of the importance of nodes in a social-network graph. In this paper we describe the first known external-memory and cache-oblivious algorithms for computing betweenness centrality. We present four different external-memory algorithms...
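
    For a small-scale, in-memory reference point for the quantity this paper computes out of core, Brandes-style betweenness centrality is available in networkx; the snippet below is only a correctness baseline on a small random graph, not the external-memory algorithms of the paper.

    ```python
    # In-memory baseline for betweenness centrality (Brandes' algorithm via networkx).
    import networkx as nx

    G = nx.erdos_renyi_graph(n=200, p=0.05, seed=42)
    bc = nx.betweenness_centrality(G, normalized=True)
    top5 = sorted(bc, key=bc.get, reverse=True)[:5]
    print("most central nodes:", top5)
    ```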

  2. Computer Icons and the Art of Memory.

    Science.gov (United States)

    McNair, John R.

    1996-01-01

    States that key aspects of "memoria," the ancient Art of Memory, especially its focus on vivid representational images set against distinct backgrounds, can be helpful in creating memorable, universal, and easily retrievable computer icons. (PA)

  3. Execution time support for scientific programs on distributed memory machines

    Science.gov (United States)

    Berryman, Harry; Saltz, Joel; Scroggs, Jeffrey

    1990-01-01

    Optimizations are considered that are required for efficient execution of code segments that consist of loops over distributed data structures. The PARTI (Parallel Automated Runtime Toolkit at ICASE) execution time primitives are designed to carry out these optimizations and can be used to implement a wide range of scientific algorithms on distributed memory machines. These primitives allow the user to control array mappings in a way that gives the appearance of shared memory. Computations can be based on a global index set. Primitives are used to carry out gather and scatter operations on distributed arrays. Communication patterns are derived at runtime, and the appropriate send and receive messages are automatically generated.
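
    The runtime "inspector" idea behind such primitives can be sketched in a few lines: given a block distribution and the global indices an irregular loop will touch, derive which elements must be fetched from which owner before the loop runs. This is an illustrative sketch of the concept, not the PARTI code.

    ```python
    # Inspector sketch: build a gather schedule (owner rank -> remote indices)
    # for a block-distributed array accessed through an irregular index list.
    from collections import defaultdict

    def build_gather_schedule(global_indices, n_global, n_procs, my_rank):
        """Owner rank -> list of remote elements this processor must fetch."""
        block = (n_global + n_procs - 1) // n_procs   # block distribution
        schedule = defaultdict(list)
        for g in global_indices:
            owner = g // block
            if owner != my_rank:
                schedule[owner].append(g)
        return dict(schedule)

    # indices referenced by an irregular loop, e.g. x[col[i]] in a sparse mat-vec
    cols = [3, 97, 42, 5, 880, 881, 7]
    print(build_gather_schedule(cols, n_global=1000, n_procs=4, my_rank=0))
    # -> {3: [880, 881]}: only two off-processor values, both owned by rank 3
    ```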

  4. Distributed measurement-based quantum computation

    CERN Document Server

    Danos, V; Kashefi, E; Panangaden, P; Danos, Vincent; Hondt, Ellie D'; Kashefi, Elham; Panangaden, Prakash

    2005-01-01

    We develop a formal model for distributed measurement-based quantum computations, adopting an agent-based view, such that computations are described locally where possible. Because the network quantum state is in general entangled, we need to model it as a global structure, reminiscent of global memory in classical agent systems. Local quantum computations are described as measurement patterns. Since measurement-based quantum computation is inherently distributed, this allows us to extend naturally several concepts of the measurement calculus, a formal model for such computations. Our goal is to define an assembly language, i.e. we assume that computations are well-defined and we do not concern ourselves with verification techniques. The operational semantics for systems of agents is given by a probabilistic transition system, and we define operational equivalence in a way that it corresponds to the notion of bisimilarity. With this in place, we prove that teleportation is bisimilar to a direct quantum channe...

  5. Integer sparse distributed memory: analysis and results.

    Science.gov (United States)

    Snaider, Javier; Franklin, Stan; Strain, Steve; George, E Olusegun

    2013-10-01

    Sparse distributed memory is an auto-associative memory system that stores high dimensional Boolean vectors. Here we present an extension of the original SDM, the Integer SDM that uses modular arithmetic integer vectors rather than binary vectors. This extension preserves many of the desirable properties of the original SDM: auto-associativity, content addressability, distributed storage, and robustness over noisy inputs. In addition, it improves the representation capabilities of the memory and is more robust over normalization. It can also be extended to support forgetting and reliable sequence storage. We performed several simulations that test the noise robustness property and capacity of the memory. Theoretical analyses of the memory's fidelity and capacity are also presented.
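
    A minimal numpy sketch of the classic binary SDM write/read cycle is given below for orientation; the Integer SDM of this paper replaces the binary vectors with modular-arithmetic integer vectors but keeps the same activate-by-distance structure. This is an illustration, not the authors' implementation, and the sizes are arbitrary.

    ```python
    # Minimal binary SDM sketch: random hard locations, Hamming-distance
    # activation, counter updates on write, majority vote on read.
    import numpy as np

    rng = np.random.default_rng(1)
    n_bits, n_locations, radius = 256, 2000, 112      # radius = activation threshold

    hard_addresses = rng.integers(0, 2, size=(n_locations, n_bits))
    counters = np.zeros((n_locations, n_bits), dtype=int)

    def activated(address):
        dist = np.count_nonzero(hard_addresses != address, axis=1)  # Hamming distances
        return dist <= radius

    def write(address, word):
        counters[activated(address)] += np.where(word == 1, 1, -1)

    def read(address):
        return (counters[activated(address)].sum(axis=0) > 0).astype(int)

    word = rng.integers(0, 2, size=n_bits)
    write(word, word)                       # auto-associative store
    noisy = word.copy()
    flip = rng.choice(n_bits, size=20, replace=False)
    noisy[flip] ^= 1                        # corrupt 20 bits
    print("bits recovered:", np.count_nonzero(read(noisy) == word), "/", n_bits)
    ```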

  6. Programming distributed memory architectures using Kali

    Science.gov (United States)

    Mehrotra, Piyush; Vanrosendale, John

    1990-01-01

    Programming nonshared memory systems is more difficult than programming shared memory systems, in part because of the relatively low level of current programming environments for such machines. A new programming environment is presented, Kali, which provides a global name space and allows direct access to remote data values. In order to retain efficiency, Kali provides a system of annotations, allowing the user to control those aspects of the program critical to performance, such as data distribution and load balancing. The primitives and constructs provided by the language are described, and some of the issues raised in translating a Kali program for execution on distributed memory systems are also discussed.

  7. Lifetime-Based Memory Management for Distributed Data Processing Systems

    DEFF Research Database (Denmark)

    Lu, Lu; Shi, Xuanhua; Zhou, Yongluan

    2016-01-01

    In-memory caching of intermediate data and eager combining of data in shuffle buffers have been shown to be very effective in minimizing the re-computation and I/O cost in distributed data processing systems like Spark and Flink. However, it has also been widely reported that these techniques would...

  8. Memory Reconsolidation and Computational Learning

    Science.gov (United States)

    2010-03-01

    Siegelmann-Danieli and H.T. Siegelmann, "Robust Artificial Life Via Artificial Programmed Death," Artificial Intelligence 172(6-7), April 2008: 884-898. ... Distribution statement: Unrestricted. Abstract: Memory models are central to Artificial Intelligence and Machine ... beyond [1]. The advances cited are a significant step toward creating Artificial Intelligence via neural networks at the human level. Our network

  9. Resistive content addressable memory based in-memory computation architecture

    KAUST Repository

    Salama, Khaled N.

    2016-12-08

    Various examples are provided related to resistive content addressable memory (RCAM) based in-memory computation architectures. In one example, a system includes a content addressable memory (CAM) including an array of cells having a memristor based crossbar and an interconnection switch matrix having a gateless memristor array, which is coupled to an output of the CAM. In another example, a method includes comparing activated bit values stored in a key register with corresponding bit values in a row of a CAM, setting a tag bit value to indicate that the activated bit values match the corresponding bit values, and writing masked key bit values to corresponding bit locations in the row of the CAM based on the tag bit value.
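
    A software-only behavioral sketch of the claimed search-then-write flow is given below; the choice of which bits are compared versus written is an assumption made for illustration and is not taken from the patent text.

    ```python
    # Behavioral sketch: compare activated key bits against every CAM row,
    # raise a tag where they match, then write masked key bits into tagged rows.
    import numpy as np

    rng = np.random.default_rng(3)
    rows, width = 32, 16
    cam = rng.integers(0, 2, size=(rows, width))     # stand-in for the memristor CAM array

    key = rng.integers(0, 2, size=width)             # key register
    active = np.zeros(width, dtype=bool)
    active[:4] = True                                # only the first 4 bits are "activated"

    # search phase: a row's tag bit is set when it matches the key on every activated bit
    tags = np.all(cam[:, active] == key[active], axis=1)

    # write phase (assumed interpretation): key bits at the non-activated positions
    # are written into every tagged row
    write_mask = ~active
    cam[np.ix_(tags, write_mask)] = key[write_mask]

    print("tagged rows:", np.flatnonzero(tags))
    ```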

  10. Implementation of Parallel Dynamic Simulation on Shared-Memory vs. Distributed-Memory Environments

    Energy Technology Data Exchange (ETDEWEB)

    Jin, Shuangshuang; Chen, Yousu; Wu, Di; Diao, Ruisheng; Huang, Zhenyu

    2015-12-09

    Power system dynamic simulation computes the system response to a sequence of large disturbances, such as sudden changes in generation or load, or a network short circuit followed by protective branch switching operations. It consists of a large set of differential and algebraic equations, which is computationally intensive and challenging to solve using a single-processor based dynamic simulation solution. High-performance computing (HPC) based parallel computing is a very promising technology to speed up the computation and facilitate the simulation process. This paper presents two different parallel implementations of power grid dynamic simulation using Open Multi-processing (OpenMP) on a shared-memory platform, and Message Passing Interface (MPI) on distributed-memory clusters, respectively. The differences between the parallel simulation algorithms and architectures of the two HPC technologies are illustrated, and their performances for running parallel dynamic simulation are compared and demonstrated.

  11. Distributed computing for global health

    CERN Document Server

    CERN. Geneva; Schwede, Torsten; Moore, Celia; Smith, Thomas E; Williams, Brian; Grey, François

    2005-01-01

    Distributed computing harnesses the power of thousands of computers within organisations or over the Internet. In order to tackle global health problems, several groups of researchers have begun to use this approach to exceed by far the computing power of a single lab. This event illustrates how companies, research institutes and the general public are contributing their computing power to these efforts, and what impact this may have on a range of world health issues. Grids for neglected diseases Vincent Breton, CNRS/EGEE This talk introduces the topic of distributed computing, explaining the similarities and differences between Grid computing, volunteer computing and supercomputing, and outlines the potential of Grid computing for tackling neglected diseases where there is little economic incentive for private R&D efforts. Recent results on malaria drug design using the Grid infrastructure of the EU-funded EGEE project, which is coordinated by CERN and involves 70 partners in Europe, the US and Russi...

  12. Computing Battery Lifetime Distributions

    NARCIS (Netherlands)

    Cloth, L.; Haverkort, Boudewijn R.H.M.; Jongerden, M.R.

    The usage of mobile devices like cell phones, navigation systems, or laptop computers, is limited by the lifetime of the included batteries. This lifetime depends naturally on the rate at which energy is consumed, however, it also depends on the usage pattern of the battery. Continuous drawing of a

  13. Computing Battery Lifetime Distributions

    NARCIS (Netherlands)

    Cloth, Lucia; Jongerden, Marijn R.; Haverkort, Boudewijn R.

    2007-01-01

    The usage of mobile devices like cell phones, navigation systems, or laptop computers, is limited by the lifetime of the included batteries. This lifetime depends naturally on the rate at which energy is consumed, however, it also depends on the usage pattern of the battery. Continuous drawing of a

  14. Computing compound distributions faster

    NARCIS (Netherlands)

    P. den Iseger; M.A.J. Smith; R. Dekker (Rommert)

    1997-01-01

    The use of Panjer's algorithm has meanwhile become a widespread standard technique for actuaries (Kuon et al., 1955). Panjer's recursion formula is used for the evaluation of compound distributions and can be applied to life and general insurance problems. The discrete version of Panjer'
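
    Panjer's recursion itself is short; the following Python sketch evaluates a compound Poisson distribution with it (claim count Poisson with mean lam, severity pmf f[j] = P(X = j)), purely as an illustration of the formula the abstract refers to, not the paper's accelerated method.

    ```python
    # Panjer's recursion for a compound Poisson distribution:
    #   g_0 = exp(-lam*(1 - f_0)),  g_s = sum_{j>=1} (a + b*j/s) * f_j * g_{s-j} / (1 - a*f_0)
    # with (a, b) = (0, lam) for the Poisson case.
    from math import exp

    def panjer_compound_poisson(lam, f, s_max):
        """Return g[0..s_max], the pmf of the aggregate claim amount."""
        a, b = 0.0, lam
        f0 = f[0] if len(f) > 0 else 0.0
        g = [exp(-lam * (1.0 - f0))]
        for s in range(1, s_max + 1):
            acc = 0.0
            for j in range(1, min(s, len(f) - 1) + 1):
                acc += (a + b * j / s) * f[j] * g[s - j]
            g.append(acc / (1.0 - a * f0))       # denominator is 1 for Poisson
        return g

    f = [0.0, 0.5, 0.3, 0.2]                     # severities on {1, 2, 3}
    g = panjer_compound_poisson(lam=2.0, f=f, s_max=10)
    print(sum(g))                                # close to 1 once s_max is large enough
    print(g[:4])
    ```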

  15. Heterogeneous Distributed Computing for Computational Aerosciences

    Science.gov (United States)

    Sunderam, Vaidy S.

    1998-01-01

    The research supported under this award focuses on heterogeneous distributed computing for high-performance applications, with particular emphasis on computational aerosciences. The overall goal of this project was to investigate issues in, and develop solutions for, the efficient execution of computational aeroscience codes in heterogeneous concurrent computing environments. In particular, we worked in the context of the PVM [1] system and, subsequent to detailed conversion efforts and performance benchmarking, devised novel techniques to increase the efficacy of heterogeneous networked environments for computational aerosciences. Our work has been based upon the NAS Parallel Benchmark suite, but has also recently expanded in scope to include the NAS I/O benchmarks as specified in the NHT-1 document. In this report we summarize our research accomplishments under the auspices of the grant.

  16. Associative Memory computing power and its simulation

    CERN Document Server

    Ancu, L S; The ATLAS collaboration; Britzger, D; Giannetti, P; Howarth, J W; Luongo, C; Pandini, C; Schmitt, S; Volpi, G

    2014-01-01

    The associative memory (AM) system is a computing device made of hundreds of AM ASIC chips designed to perform “pattern matching” at very high speed. Since each AM chip stores a database of 130000 pre-calculated patterns and large numbers of chips can be easily assembled together, it is possible to produce huge AM banks. Speed and size of the system are crucial for real-time High Energy Physics applications, such as the ATLAS Fast TracKer (FTK) Processor. Using 80 million channels of the ATLAS tracker, FTK finds tracks within 100 microseconds. The simulation of such a parallelized system is an extremely complex task if executed in commercial computers based on normal CPUs. The algorithm performance is limited, due to the lack of parallelism, and in addition the memory requirement is very large. In fact the AM chip uses a content addressable memory (CAM) architecture. Any data inquiry is broadcast to all memory elements simultaneously, thus data retrieval time is independent of the database size. The gr...

  17. Associative Memory Computing Power and Its Simulation

    CERN Document Server

    Volpi, G; The ATLAS collaboration

    2014-01-01

    The associative memory (AM) system is a computing device made of hundreds of AM ASIC chips designed to perform “pattern matching” at very high speed. Since each AM chip stores a database of 130000 pre-calculated patterns and large numbers of chips can be easily assembled together, it is possible to produce huge AM banks. Speed and size of the system are crucial for real-time High Energy Physics applications, such as the ATLAS Fast TracKer (FTK) Processor. Using 80 million channels of the ATLAS tracker, FTK finds tracks within 100 microseconds. The simulation of such a parallelized system is an extremely complex task if executed in commercial computers based on normal CPUs. The algorithm performance is limited, due to the lack of parallelism, and in addition the memory requirement is very large. In fact the AM chip uses a content addressable memory (CAM) architecture. Any data inquiry is broadcast to all memory elements simultaneously, thus data retrieval time is independent of the database size. The gr...

  18. Lifetime-Based Memory Management for Distributed Data Processing Systems

    DEFF Research Database (Denmark)

    Lu, Lu; Shi, Xuanhua; Zhou, Yongluan;

    2016-01-01

    In-memory caching of intermediate data and eager combining of data in shuffle buffers have been shown to be very effective in minimizing the re-computation and I/O cost in distributed data processing systems like Spark and Flink. However, it has also been widely reported that these techniques would...... create a large amount of long-living data objects in the heap, which may quickly saturate the garbage collector, especially when handling a large dataset, and hence would limit the scalability of the system. To eliminate this problem, we propose a lifetime-based memory management framework, which......, by automatically analyzing the user-defined functions and data types, obtains the expected lifetime of the data objects, and then allocates and releases memory space accordingly to minimize the garbage collection overhead. In particular, we present Deca, a concrete implementation of our proposal on top of Spark...

  19. Lifetime-Based Memory Management for Distributed Data Processing Systems

    DEFF Research Database (Denmark)

    Lu, Lu; Shi, Xuanhua; Zhou, Yongluan;

    2016-01-01

    , by automatically analyzing the user-defined functions and data types, obtains the expected lifetime of the data objects, and then allocates and releases memory space accordingly to minimize the garbage collection overhead. In particular, we present Deca, a concrete implementation of our proposal on top of Spark......In-memory caching of intermediate data and eager combining of data in shuffle buffers have been shown to be very effective in minimizing the re-computation and I/O cost in distributed data processing systems like Spark and Flink. However, it has also been widely reported that these techniques would...... create a large amount of long-living data objects in the heap, which may quickly saturate the garbage collector, especially when handling a large dataset, and hence would limit the scalability of the system. To eliminate this problem, we propose a lifetime-based memory management framework, which...

  20. Distributed GPU Computing in GIScience

    Science.gov (United States)

    Jiang, Y.; Yang, C.; Huang, Q.; Li, J.; Sun, M.

    2013-12-01

    Geoscientists strive to discover principles and patterns hidden inside ever-growing Big Data for scientific discoveries. To better achieve this objective, more capable computing resources are required to process, analyze and visualize Big Data (Ferreira et al., 2003; Li et al., 2013). Current CPU-based computing techniques cannot promptly meet the computing challenges caused by the increasing amount of data from different domains, such as social media, earth observation, and environmental sensing (Li et al., 2013). Meanwhile, CPU-based computing resources structured as clusters or supercomputers are costly. In the past several years, as GPU-based technology has matured in both capability and performance, GPU-based computing has emerged as a new computing paradigm. Compared to the traditional microprocessor, the modern GPU is a compelling alternative with outstanding parallel processing capability, cost-effectiveness and efficiency (Owens et al., 2008), although it was initially designed for graphics rendering in the visualization pipeline. This presentation reports a distributed GPU computing framework for integrating GPU-based computing within a distributed environment. Within this framework, 1) on each single computer, both GPU-based and CPU-based computing resources can be fully utilized to improve the performance of visualizing and processing Big Data; 2) within a network environment, a variety of computers can be combined into a virtual supercomputer that supports CPU-based and GPU-based computing in a distributed computing environment; 3) GPUs, as graphics-targeted devices, are used to greatly improve the rendering efficiency in distributed geo-visualization, especially for 3D/4D visualization. Key words: Geovisualization, GIScience, Spatiotemporal Studies Reference: 1. Ferreira de Oliveira, M. C., & Levkowitz, H. (2003). From visual data exploration to visual data mining: A survey. Visualization and Computer Graphics, IEEE

  1. Energy efficient distributed computing systems

    CERN Document Server

    Lee, Young-Choon

    2012-01-01

    The energy consumption issue in distributed computing systems raises various monetary, environmental and system performance concerns. Electricity consumption in the US doubled from 2000 to 2005. From a financial and environmental standpoint, reducing the consumption of electricity is important, yet these reforms must not lead to performance degradation of the computing systems. These conflicting constraints create a suite of complex problems that need to be resolved in order to lead to 'greener' distributed computing systems. This book brings together a group of outsta

  2. Distributed terascale volume visualization using distributed shared virtual memory

    KAUST Repository

    Beyer, Johanna

    2011-10-01

    Table 1 illustrates the impact of different distribution unit sizes, different screen resolutions, and numbers of GPU nodes. We use two and four GPUs (NVIDIA Quadro 5000 with 2.5 GB memory) and a mouse cortex EM dataset (see Figure 2) of resolution 21,494 x 25,790 x 1,850 = 955GB. The size of the virtual distribution units significantly influences the data distribution between nodes. Small distribution units result in a high depth complexity for compositing. Large distribution units lead to a low utilization of GPUs, because in the worst case only a single distribution unit will be in view, which is rendered by only a single node. The choice of an optimal distribution unit size depends on three major factors: the output screen resolution, the block cache size on each node, and the number of nodes. Currently, we are working on optimizing the compositing step and network communication between nodes. © 2011 IEEE.

  3. Large Data Visualization on Distributed Memory Multi-GPU Clusters

    Energy Technology Data Exchange (ETDEWEB)

    Childs, Henry R.

    2010-03-01

    Data sets of immense size are regularly generated on large scale computing resources. Even among more traditional methods for acquisition of volume data, such as MRI and CT scanners, data which is too large to be effectively visualized on standard workstations is now commonplace. One solution to this problem is to employ a 'visualization cluster,' a small to medium scale cluster dedicated to performing visualization and analysis of massive data sets generated on larger scale supercomputers. These clusters are designed to fit a different need than traditional supercomputers, and therefore their design mandates different hardware choices, such as increased memory, and more recently, graphics processing units (GPUs). While there has been much previous work on distributed memory visualization as well as GPU visualization, there is a relative dearth of algorithms which effectively use GPUs at a large scale in a distributed memory environment. In this work, we study a common visualization technique in a GPU-accelerated, distributed memory setting, and present performance characteristics when scaling to extremely large data sets.

  4. Comprehensive Memory-Bound Simulations on Single Board Computers

    OpenAIRE

    Himpe, Christian; Leibner, Tobias; Rave, Stephan

    2017-01-01

    Numerical simulations of increasingly complex models demand growing amounts of (main) memory. Typically, large quantities of memory are provided by workstation- and server-type computers, but in turn consume massive amounts of power. Model order reduction can reduce the memory requirements of simulations by constructing reduced order models, yet the assembly of these surrogate models itself often requires memory-rich compute environments. We resolve this deadlock by careful algorithmic desig...

  5. A general purpose subroutine for fast fourier transform on a distributed memory parallel machine

    Science.gov (United States)

    Dubey, A.; Zubair, M.; Grosch, C. E.

    1992-01-01

    One issue which is central in developing a general purpose Fast Fourier Transform (FFT) subroutine on a distributed memory parallel machine is the data distribution. It is possible that different users would like to use the FFT routine with different data distributions. Thus, there is a need to design FFT schemes on distributed memory parallel machines which can support a variety of data distributions. An FFT implementation on a distributed memory parallel machine which works for a number of data distributions commonly encountered in scientific applications is presented. The problem of rearranging the data after computing the FFT is also addressed. The performance of the implementation on a distributed memory parallel machine Intel iPSC/860 is evaluated.
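
    The data-distribution issue this abstract raises can be pictured with a small sketch of two common layouts and the redistribution an FFT routine may need around the transform; the block/cyclic helpers and sizes below are illustrative and are not the paper's implementation.

```python
# Two common data distributions for an N-element array over P ranks, and the
# redistribution step an FFT routine may need before or after the transform.
def block_owner(i, n, p):
    """Owner of global index i under a block distribution (ceil(n/p) per rank)."""
    return i // (-(-n // p))

def cyclic_owner(i, n, p):
    """Owner of global index i under a cyclic (round-robin) distribution."""
    return i % p

def redistribute(local, n, p, dst_owner):
    """Re-bucket locally held (global index, value) pairs under a new distribution."""
    outgoing = [[] for _ in range(p)]
    for rank in range(p):
        for i, v in local[rank]:
            outgoing[dst_owner(i, n, p)].append((i, v))
    return outgoing

n, p = 16, 4
x = list(range(n))
block_local = [[(i, x[i]) for i in range(n) if block_owner(i, n, p) == r] for r in range(p)]
cyclic_local = redistribute(block_local, n, p, cyclic_owner)
print([sorted(i for i, _ in cyclic_local[r]) for r in range(p)])
# -> [[0, 4, 8, 12], [1, 5, 9, 13], [2, 6, 10, 14], [3, 7, 11, 15]]
```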

  6. Paging memory from random access memory to backing storage in a parallel computer

    Science.gov (United States)

    Archer, Charles J; Blocksome, Michael A; Inglett, Todd A; Ratterman, Joseph D; Smith, Brian E

    2013-05-21

    Paging memory from random access memory (`RAM`) to backing storage in a parallel computer that includes a plurality of compute nodes, including: executing a data processing application on a virtual machine operating system in a virtual machine on a first compute node; providing, by a second compute node, backing storage for the contents of RAM on the first compute node; and swapping, by the virtual machine operating system in the virtual machine on the first compute node, a page of memory from RAM on the first compute node to the backing storage on the second compute node.
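
    A toy model of the mechanism this record describes, with the second node's backing store reduced to a dictionary; the class name, the LRU eviction policy, and the page contents are illustrative assumptions rather than details of the patented design.

```python
from collections import OrderedDict

class RemoteBackedPager:
    """Fixed number of RAM page frames; evicted pages go to a 'remote' store."""

    def __init__(self, ram_frames):
        self.ram = OrderedDict()   # page id -> contents, kept in LRU order
        self.remote = {}           # backing storage on the second compute node
        self.ram_frames = ram_frames

    def touch(self, page, data=None):
        """Bring `page` into RAM (optionally writing `data`), evicting if needed."""
        if page in self.ram:
            self.ram.move_to_end(page)               # refresh LRU position
        elif page in self.remote:
            self.ram[page] = self.remote.pop(page)   # swap in from the remote node
        else:
            self.ram[page] = data                    # first touch of a new page
        if data is not None:
            self.ram[page] = data
        if len(self.ram) > self.ram_frames:          # swap out the coldest page
            victim, contents = self.ram.popitem(last=False)
            self.remote[victim] = contents
        return self.ram[page]

pager = RemoteBackedPager(ram_frames=2)
for p in ["a", "b", "c", "a"]:
    pager.touch(p, data=f"page-{p}")
print(sorted(pager.ram), sorted(pager.remote))   # ['a', 'c'] ['b']
```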

  7. Scalable distributed computing hierarchy: cloud, fog and dew computing

    OpenAIRE

    Skala, Karolj; Davidović, Davor; Afgan, Enis; Sović, Ivan; Šojat, Zorislav

    2015-01-01

    The paper considers the conceptual approach for organization of the vertical hierarchical links between the scalable distributed computing paradigms: Cloud Computing, Fog Computing and Dew Computing. In this paper, the Dew Computing is described and recognized as a new structural layer in the existing distributed computing hierarchy. In the existing computing hierarchy, the Dew computing is positioned as the ground level for the Cloud and Fog computing paradigms. Vertical, complementary, hier...

  8. Computer memory power control for the Galileo spacecraft

    Science.gov (United States)

    Detwiler, R. C.

    1983-01-01

    The developmental history, major design drivers, and final topology of the computer memory power system on the Galileo spacecraft are described. A unique method of generating memory backup power directly from the fault current drawn during a spacecraft power overload or fault condition allows this system to provide continuous memory power. This concept provides a unique solution to the problem of volatile memory loss without the use of a battery or other large energy storage elements usually associated with uninterruptible power supply designs.

  9. Languages, compilers and run-time environments for distributed memory machines

    CERN Document Server

    Saltz, J

    1992-01-01

    Papers presented within this volume cover a wide range of topics related to programming distributed memory machines. Distributed memory architectures, although having the potential to supply the very high levels of performance required to support future computing needs, present awkward programming problems. The major issue is to design methods which enable compilers to generate efficient distributed memory programs from relatively machine independent program specifications. This book is the compilation of papers describing a wide range of research efforts aimed at easing the task of programmin

  10. Hydronic distribution system computer model

    Energy Technology Data Exchange (ETDEWEB)

    Andrews, J.W.; Strasser, J.J.

    1994-10-01

    A computer model of a hot-water boiler and its associated hydronic thermal distribution loop has been developed at Brookhaven National Laboratory (BNL). It is intended to be incorporated as a submodel in a comprehensive model of residential-scale thermal distribution systems developed at Lawrence Berkeley. This will give the combined model the capability of modeling forced-air and hydronic distribution systems in the same house using the same supporting software. This report describes the development of the BNL hydronics model, initial results and internal consistency checks, and its intended relationship to the LBL model. A method of interacting with the LBL model that does not require physical integration of the two codes is described. This will provide capability now, with reduced up-front cost, as long as the number of runs required is not large.

  11. Migration of vectorized iterative solvers to distributed memory architectures

    Energy Technology Data Exchange (ETDEWEB)

    Pommerell, C. [AT& T Bell Labs., Murray Hill, NJ (United States); Ruehl, R. [CSCS-ETH, Manno (Switzerland)

    1994-12-31

    Both necessity and opportunity motivate the use of high-performance computers for iterative linear solvers. Necessity results from the size of the problems being solved: smaller problems are often better handled by direct methods. Opportunity arises from the formulation of the iterative methods in terms of simple linear algebra operations, even if this 'natural' parallelism is not easy to exploit in irregularly structured sparse matrices and with good preconditioners. As a result, high-performance implementations of iterative solvers have attracted a lot of interest in recent years. Most efforts are geared to vectorize or parallelize the dominating operation, structured or unstructured sparse matrix-vector multiplication, or to increase locality and parallelism by reformulating the algorithm, reducing global synchronization in inner products or local data exchange in preconditioners. Target architectures for iterative solvers currently include mostly vector supercomputers and architectures with one or few optimized (e.g., super-scalar and/or super-pipelined RISC) processors and hierarchical memory systems. More recently, parallel computers with physically distributed memory and a better price/performance ratio have been offered by vendors as a very interesting alternative to vector supercomputers. However, programming comfort on such distributed memory parallel processors (DMPPs) still lags behind. Here the authors are concerned with iterative solvers and their changing computing environment. In particular, they are considering migration from traditional vector supercomputers to DMPPs. Application requirements force one to use flexible and portable libraries. They want to extend the portability of iterative solvers rather than reimplementing everything for each new machine, or even for each new architecture.

  12. A Fundamental Tradeoff between Computation and Communication in Distributed Computing

    OpenAIRE

    Li, Songze; Maddah-Ali, Mohammad Ali; Yu, Qian; Avestimehr, A. Salman

    2016-01-01

    How can we optimally trade extra computing power to reduce the communication load in distributed computing? We answer this question by characterizing a fundamental tradeoff relationship between computation and communication in distributed computing, i.e., the two are inverse-linearly proportional to each other. More specifically, a general distributed computing framework, motivated by commonly used structures like MapReduce, is considered, where the goal is to compute $Q$ arbitrary output fun...
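
    To picture the inverse-linear relationship the abstract states, the sketch below evaluates the communication load as a function of the computation load in the coded-MapReduce setting; the closed form L(r) = (1/r)(1 - r/K) is quoted from the published version of this line of work and should be checked against it, and the parameter K = 10 is an arbitrary example.

```python
def comm_load(r, k):
    """Normalized communication load at computation load r with k nodes."""
    if not 1 <= r <= k:
        raise ValueError("computation load r must lie in [1, k]")
    return (1.0 / r) * (1.0 - r / k)

K = 10
for r in range(1, K + 1):
    print(f"r = {r:2d}  ->  L(r) = {comm_load(r, K):.3f}")   # decreases roughly as 1/r
```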

  13. A method to compute SEU fault probabilities in memory arrays with error correction

    Science.gov (United States)

    Gercek, Gokhan

    1994-01-01

    With the increasing packing densities in VLSI technology, Single Event Upsets (SEU) due to cosmic radiation are becoming more of a critical issue in the design of space avionics systems. In this paper, a method is introduced to compute the fault (mishap) probability for a computer memory of size M words. It is assumed that a Hamming code is used for each word to provide single error correction. It is also assumed that every time a memory location is read, single errors are corrected. Memory is read at random times whose distribution is assumed to be known. In such a scenario, a mishap is defined as two SEUs corrupting the same memory location prior to a read. The paper introduces a method to compute the overall mishap probability for the entire memory for a mission duration of T hours.
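
    A Monte Carlo sketch of that mishap definition, not the paper's analytic method: upsets and corrective reads of a single word are modeled as competing Poisson processes, and the per-word result is lifted to an M-word memory assuming independence. All rates, the uniform-read assumption, and the trial count are illustrative.

```python
import random

def word_mishap_prob(seu_rate, read_rate, mission_hours, trials=20000):
    """Estimate P(two SEUs hit one word between corrective reads) by simulation."""
    mishaps = 0
    for _ in range(trials):
        t, pending = 0.0, 0                      # pending = uncorrected upsets
        while True:
            dt_seu = random.expovariate(seu_rate)
            dt_read = random.expovariate(read_rate)
            t += min(dt_seu, dt_read)
            if t > mission_hours:
                break                            # mission ends without a mishap
            if dt_seu < dt_read:
                pending += 1
                if pending >= 2:                 # second upset before a read
                    mishaps += 1
                    break
            else:
                pending = 0                      # read corrects the single error
    return mishaps / trials

M = 4096                                          # words in the memory array
p_word = word_mishap_prob(seu_rate=1e-4, read_rate=1e-2, mission_hours=1000)
p_memory = 1.0 - (1.0 - p_word) ** M              # any word failing, if independent
print(p_word, p_memory)
```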

  14. Distributed Computing in Universities and Colleges.

    Science.gov (United States)

    Sircar, Sumit

    1979-01-01

    Analyzes the implications of distributed computing in institutions of higher education. Discusses (1) the extent to which the quality of computing might be enhanced by adopting a distributed computing approach, (2) variations in distributed systems design and the cost of adoption, and (3) administration of distributed systems. (Author/CMV)

  15. Associative Memory computing power and its simulation.

    CERN Document Server

    Ancu, L S; Britzger, D; Giannetti, P; Howarth, J W; Luongo, C; Pandini, C; Schmitt, S; Volpi, G

    2015-01-01

    An important step in the ATLAS upgrade program is the installation of a tracking processor, the Fast Tracker (FTK), with the goal of identifying the tracks generated by charged particles originating from the LHC 14 TeV proton-proton collisions. The collisions will generate thousands of hits in each layer of the silicon tracker detector, and track identification is a very challenging computational problem. At the core of the FTK there is the associative memory (AM) system, made of hundreds of AM ASIC chips, specifically designed to allow pattern identification in high density environments at very high speed. This component is able to organize the following steps of the track identification, providing huge computing power for a specific application. The AM system will in fact be able to reconstruct tracks in tens of microseconds. Within the FTK team there has also been a constant effort to maintain a detailed emulation of the system, to predict the impact of single component features on the final performance and in the ATLAS da...

  16. Associative memory model with long-tail-distributed Hebbian synaptic connections

    Directory of Open Access Journals (Sweden)

    Naoki Hiratani

    2013-02-01

    Full Text Available The postsynaptic potentials of pyramidal neurons have a non-Gaussian amplitude distribution with a heavy tail in both hippocampus and neocortex. Such distributions of synaptic weights were recently shown to generate spontaneous internal noise optimal for spike propagation in recurrent cortical circuits. However, whether this internal noise generation by heavy-tailed weight distributions is possible for and beneficial to other computational functions remains unknown. To clarify this point, we construct an associative memory network model of spiking neurons that stores multiple memory patterns in a connection matrix with a lognormal weight distribution. In associative memory networks, non-retrieved memory patterns generate a cross-talk noise that severely disturbs memory recall. We demonstrate that neurons encoding a retrieved memory pattern and those encoding non-retrieved memory patterns have different subthreshold membrane-potential distributions in our model. Consequently, the probability of responding to inputs at strong synapses increases for the encoding neurons, whereas it decreases for the non-encoding neurons. Our results imply that heavy-tailed distributions of connection weights can generate noise useful for associative memory recall.

  17. Optimizing NEURON Simulation Environment Using Remote Memory Access with Recursive Doubling on Distributed Memory Systems

    Directory of Open Access Journals (Sweden)

    Danish Shehzad

    2016-01-01

    Full Text Available The increase in complexity of neuronal network models has escalated the efforts to make the NEURON simulation environment efficient. Computational neuroscientists divide the equations into subnets amongst multiple processors to achieve better hardware performance. On parallel machines for neuronal networks, interprocessor spike exchange consumes a large section of the overall simulation time. In NEURON, the Message Passing Interface (MPI) is used for communication between processors. The MPI_Allgather collective is exercised for spike exchange after each interval across distributed memory systems. Increasing the number of processors achieves concurrency and better performance, but it adversely affects MPI_Allgather, which increases communication time between processors. This necessitates improving the communication methodology to decrease the spike exchange time over distributed memory systems. This work has improved the MPI_Allgather method using Remote Memory Access (RMA) by moving two-sided communication to one-sided communication, and the use of a recursive doubling mechanism facilitates efficient communication between the processors in precise steps. This approach enhanced communication concurrency and has improved the overall runtime, making NEURON more efficient for the simulation of large neuronal network models.

  18. Optimizing NEURON Simulation Environment Using Remote Memory Access with Recursive Doubling on Distributed Memory Systems.

    Science.gov (United States)

    Shehzad, Danish; Bozkuş, Zeki

    2016-01-01

    The increase in complexity of neuronal network models has escalated the efforts to make the NEURON simulation environment efficient. Computational neuroscientists divide the equations into subnets amongst multiple processors to achieve better hardware performance. On parallel machines for neuronal networks, interprocessor spike exchange consumes a large section of the overall simulation time. In NEURON, the Message Passing Interface (MPI) is used for communication between processors. The MPI_Allgather collective is exercised for spike exchange after each interval across distributed memory systems. Increasing the number of processors achieves concurrency and better performance, but it adversely affects MPI_Allgather, which increases communication time between processors. This necessitates improving the communication methodology to decrease the spike exchange time over distributed memory systems. This work has improved the MPI_Allgather method using Remote Memory Access (RMA) by moving two-sided communication to one-sided communication, and the use of a recursive doubling mechanism facilitates efficient communication between the processors in precise steps. This approach enhanced communication concurrency and has improved the overall runtime, making NEURON more efficient for the simulation of large neuronal network models.
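
    The recursive doubling exchange pattern mentioned above can be illustrated in a few lines; the sketch below simulates the ranks inside one Python process with plain dictionary copies instead of MPI RMA, so it only demonstrates the log2(P)-round pattern, not real one-sided communication.

```python
def recursive_doubling_allgather(local_items):
    """local_items[r] is the block produced by rank r; return what each rank ends up holding."""
    p = len(local_items)
    assert p & (p - 1) == 0, "sketch assumes a power-of-two number of ranks"
    held = [{r: local_items[r]} for r in range(p)]
    step = 1
    while step < p:
        # every rank swaps everything accumulated so far with partner r XOR step
        new_held = [dict(h) for h in held]
        for r in range(p):
            new_held[r].update(held[r ^ step])
        held = new_held
        step *= 2
    return held

held = recursive_doubling_allgather([f"spikes-{r}" for r in range(8)])
print(all(len(h) == 8 for h in held))   # True after log2(8) = 3 rounds
```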

  19. Optical neural computing for associative memories

    Energy Technology Data Exchange (ETDEWEB)

    Hsu, Ken Yuh.

    1990-01-01

    Optical techniques for implementing neural computers are presented. In particular, holographic associative memories with feedback are investigated. Characteristics of optical neurons and optical interconnections are discussed. An LCLV is used for simulating a 2-D array of approximately 160,000 optical neurons. Thermoplastic plates are used for providing holographic interconnections among these neurons. The problem of degenerate readout in holographic interconnections and the method of sampling grids to solve this problem are presented. Two optical neural networks for associative memories are implemented and demonstrated. The first one is an optical implementation of the Hopfield network. It performs the function of auto-association that recognizes 2-D images from a distorted or partially blocked input. The trade-off between distortion tolerance and discrimination capability against new images is discussed. The second optical loop is a 2-layer network with feedback. It performs the function of hetero-association, which locks the recognized input and its associated image as a stable state in the loop. In both optical loops, it is shown that the neural gain and the similarity between the input and the stored images are the main factors that determine the dynamics of the network. Neural network models for the optical loops are presented. Equations of motion for describing the dynamical behavior of the systems are derived. The reciprocal vector basis corresponding to stored images is derived. A geometrical method is then introduced which allows us to inspect the convergence property of the system. It is also shown that the main factors that determine the system dynamics are the neural gain and the initial conditions. Photorefractive holography for optical interconnections and sampling grids for volume holographic interconnections are presented.
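
    Since the first optical loop implements a Hopfield-style auto-associative memory, a compact numerical stand-in can show the store/recall behaviour described here; the pattern sizes, noise level, and synchronous update rule are illustrative and say nothing about the optical hardware.

```python
import numpy as np

def store(patterns):
    """Hebbian outer-product learning; patterns is a (num_patterns, n) array of +/-1."""
    n = patterns.shape[1]
    w = patterns.T @ patterns / n
    np.fill_diagonal(w, 0.0)
    return w

def recall(w, probe, steps=10):
    """Synchronous updates until (hopefully) a stored pattern is reached."""
    state = probe.copy()
    for _ in range(steps):
        state = np.sign(w @ state)
        state[state == 0] = 1
    return state

rng = np.random.default_rng(0)
patterns = rng.choice([-1, 1], size=(3, 100)).astype(float)   # three stored 100-bit images
w = store(patterns)

noisy = patterns[0].copy()
noisy[rng.choice(100, size=15, replace=False)] *= -1          # distort 15% of the input
print(np.array_equal(recall(w, noisy), patterns[0]))          # usually True
```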

  20. Parallel and distributed computation for fault-tolerant object recognition

    Science.gov (United States)

    Wechsler, Harry

    1988-01-01

    The distributed associative memory (DAM) model is suggested for distributed and fault-tolerant computation as it relates to object recognition tasks. The fault-tolerance is with respect to geometrical distortions (scale and rotation), noisy inputs, occlusion/overlap, and memory faults. An experimental system was developed for fault-tolerant structure recognition which shows the feasibility of such an approach. The approach is further extended to the problem of multisensory data integration and applied successfully to the recognition of colored polyhedral objects.

  1. A simplified computational memory model from information processing

    Science.gov (United States)

    Zhang, Lanhua; Zhang, Dongsheng; Deng, Yuqin; Ding, Xiaoqian; Wang, Yan; Tang, Yiyuan; Sun, Baoliang

    2016-11-01

    This paper proposes a computational model for memory from the viewpoint of information processing. The model, called the simplified memory information retrieval network (SMIRN), is a bi-modular hierarchical functional memory network built by abstracting memory function and simulating memory information processing. First, a meta-memory is defined to represent neurons or brain cortices based on biology and graph theory; we then develop an intra-modular network with the modeling algorithm by mapping nodes and edges, and the bi-modular network is delineated with intra-modular and inter-modular connections. Finally, a polynomial retrieval algorithm is introduced. In this paper we simulate the memory phenomena and the functions of memorization and strengthening by information processing algorithms. The theoretical analysis and the simulation results show that the model is in accordance with memory phenomena from an information processing view.

  2. Multiple core computer processor with globally-accessible local memories

    Energy Technology Data Exchange (ETDEWEB)

    Shalf, John; Donofrio, David; Oliker, Leonid

    2016-09-20

    A multi-core computer processor including a plurality of processor cores interconnected in a Network-on-Chip (NoC) architecture, a plurality of caches, each of the plurality of caches being associated with one and only one of the plurality of processor cores, and a plurality of memories, each of the plurality of memories being associated with a different set of at least one of the plurality of processor cores and each of the plurality of memories being configured to be visible in a global memory address space such that the plurality of memories are visible to two or more of the plurality of processor cores.

  3. Distributed Shared Memory for the Cell Broadband Engine (DSMCBE)

    DEFF Research Database (Denmark)

    Larsen, Morten Nørgaard; Skovhede, Kenneth; Vinter, Brian

    2009-01-01

    in and out of non-coherent local storage blocks for each special processor element. In this paper we present a software library, namely the Distributed Shared Memory for the Cell Broadband Engine (DSMCBE). By using techniques known from distributed shared memory DSMCBE allows programmers to program the CELL...

  4. Parallel Breadth-First Search on Distributed Memory Systems

    Energy Technology Data Exchange (ETDEWEB)

    Computational Research Division; Buluc, Aydin; Madduri, Kamesh

    2011-04-15

    Data-intensive, graph-based computations are pervasive in several scientific applications, and are known to be quite challenging to implement on distributed memory systems. In this work, we explore the design space of parallel algorithms for Breadth-First Search (BFS), a key subroutine in several graph algorithms. We present two highly-tuned parallel approaches for BFS on large parallel systems: a level-synchronous strategy that relies on a simple vertex-based partitioning of the graph, and a two-dimensional sparse matrix-partitioning-based approach that mitigates parallel communication overhead. For both approaches, we also present hybrid versions with intra-node multithreading. Our novel hybrid two-dimensional algorithm reduces communication times by up to a factor of 3.5, relative to a common vertex-based approach. Our experimental study identifies execution regimes in which these approaches will be competitive, and we demonstrate extremely high performance on leading distributed-memory parallel systems. For instance, for a 40,000-core parallel execution on Hopper, an AMD Magny-Cours based system, we achieve a BFS performance rate of 17.8 billion edge visits per second on an undirected graph of 4.3 billion vertices and 68.7 billion edges with skewed degree distribution.
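
    A toy, single-process rendition of the level-synchronous strategy with 1-D vertex partitioning; the block-owner rule and the simulated per-level exchange are illustrative stand-ins for the paper's MPI implementation.

```python
def distributed_bfs(adj, n, p, source):
    """Level-synchronous BFS; vertex v is 'owned' by rank v*p//n (1-D block partitioning)."""
    owner = lambda v: min(v * p // n, p - 1)
    dist = [-1] * n
    dist[source] = 0
    frontier = {r: [] for r in range(p)}
    frontier[owner(source)] = [source]
    level = 0
    while any(frontier.values()):
        outgoing = {r: [] for r in range(p)}       # buckets keyed by destination rank
        for r in range(p):
            for u in frontier[r]:                  # each rank expands its local frontier
                for v in adj[u]:
                    outgoing[owner(v)].append(v)
        level += 1
        frontier = {r: [] for r in range(p)}       # owners filter already-visited vertices
        for r in range(p):
            for v in outgoing[r]:
                if dist[v] == -1:
                    dist[v] = level
                    frontier[r].append(v)
    return dist

adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3], 5: []}
print(distributed_bfs(adj, n=6, p=2, source=0))    # [0, 1, 1, 2, 3, -1]
```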

  5. Memory-assisted measurement-device-independent quantum key distribution

    Science.gov (United States)

    Panayi, Christiana; Razavi, Mohsen; Ma, Xiongfeng; Lütkenhaus, Norbert

    2014-04-01

    A protocol with the potential of beating the existing distance records for conventional quantum key distribution (QKD) systems is proposed. It borrows ideas from quantum repeaters by using memories in the middle of the link, and that of measurement-device-independent QKD, which only requires optical source equipment at the user's end. For certain memories with short access times, our scheme allows a higher repetition rate than that of quantum repeaters with single-mode memories, thereby requiring lower coherence times. By accounting for various sources of nonideality, such as memory decoherence, dark counts, misalignment errors, and background noise, as well as timing issues with memories, we develop a mathematical framework within which we can compare QKD systems with and without memories. In particular, we show that with the state-of-the-art technology for quantum memories, it is potentially possible to devise memory-assisted QKD systems that, at certain distances of practical interest, outperform current QKD implementations.

  6. Parallel implementation of inverse adding-doubling and Monte Carlo multi-layered programs for high performance computing systems with shared and distributed memory

    Science.gov (United States)

    Chugunov, Svyatoslav; Li, Changying

    2015-09-01

    Parallel implementation of two numerical tools popular in optical studies of biological materials, the Inverse Adding-Doubling (IAD) program and the Monte Carlo Multi-Layered (MCML) program, was developed and tested in this study. The implementation was based on Message Passing Interface (MPI) and standard C-language. Parallel versions of IAD and MCML programs were compared to their sequential counterparts in validation and performance tests. Additionally, the portability of the programs was tested using a local high performance computing (HPC) cluster, Penguin-On-Demand HPC cluster, and Amazon EC2 cluster. Parallel IAD was tested with up to 150 parallel cores using 1223 input datasets. It demonstrated linear scalability and the speedup was proportional to the number of parallel cores (up to 150x). Parallel MCML was tested with up to 1001 parallel cores using problem sizes of 10^4-10^9 photon packets. It demonstrated classical performance curves featuring communication overhead and performance saturation point. Optimal performance curve was derived for parallel MCML as a function of problem size. Typical speedup achieved for parallel MCML (up to 326x) demonstrated linear increase with problem size. Precision of MCML results was estimated in a series of tests - problem size of 10^6 photon packets was found optimal for calculations of total optical response and 10^8 photon packets for spatially-resolved results. The presented parallel versions of MCML and IAD programs are portable on multiple computing platforms. The parallel programs could significantly speed up the simulation for scientists and be utilized to their full potential in computing systems that are readily available without additional costs.

  7. Towards distributed multiscale computing for the VPH

    NARCIS (Netherlands)

    Hoekstra, A.G.; Coveney, P.

    2010-01-01

    Multiscale modeling is fundamental to the Virtual Physiological Human (VPH) initiative. Most detailed three-dimensional multiscale models lead to prohibitive computational demands. As a possible solution we present MAPPER, a computational science infrastructure for Distributed Multiscale Computing o

  8. A Review on Modern Distributed Computing Paradigms: Cloud Computing, Jungle Computing and Fog Computing

    OpenAIRE

    Hajibaba, Majid; Gorgin, Saeid

    2014-01-01

    Distributed computing attempts to improve performance in large-scale computing problems by resource sharing. Moreover, rising low-cost computing power coupled with advances in communications/networking and the advent of big data now enables new distributed computing paradigms such as Cloud, Jungle and Fog computing. Cloud computing brings a number of advantages to consumers in terms of accessibility and elasticity. It is based on centralization of resources that possess huge processing po...

  9. Distributed representations in memory: insights from functional brain imaging.

    Science.gov (United States)

    Rissman, Jesse; Wagner, Anthony D

    2012-01-01

    Forging new memories for facts and events, holding critical details in mind on a moment-to-moment basis, and retrieving knowledge in the service of current goals all depend on a complex interplay between neural ensembles throughout the brain. Over the past decade, researchers have increasingly utilized powerful analytical tools (e.g., multivoxel pattern analysis) to decode the information represented within distributed functional magnetic resonance imaging activity patterns. In this review, we discuss how these methods can sensitively index neural representations of perceptual and semantic content and how leverage on the engagement of distributed representations provides unique insights into distinct aspects of memory-guided behavior. We emphasize that, in addition to characterizing the contents of memories, analyses of distributed patterns shed light on the processes that influence how information is encoded, maintained, or retrieved, and thus inform memory theory. We conclude by highlighting open questions about memory that can be addressed through distributed pattern analyses.

  10. Cooperative Fault Tolerant Distributed Computing

    Energy Technology Data Exchange (ETDEWEB)

    Fagg, Graham E.

    2006-03-15

    HARNESS was proposed as a system that combined the best of emerging technologies found in current distributed computing research and commercial products into a very flexible, dynamically adaptable framework that could be used by applications to allow them to evolve and better handle their execution environment. The HARNESS system was designed using the considerable experience from previous projects such as PVM, MPI, IceT and Cumulvs. As such, the system was designed to avoid the common problems found with these current systems, providing, for example, no single point of failure and the ability to survive machine, node and software failures. Additional features included improved inter-component connectivity, with full support for dynamic downloading of additional components at run-time, thus reducing the stress on application developers to build in all the libraries they need in advance.

  11. Implementation of Computational Electromagnetic on Distributed Systems

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The new generation of technology could raise the bar for distributed computing. It has become a trend to carry out computational electromagnetics work on a distributed system with parallel computing techniques. In this paper, we analyze the parallel characteristics of the distributed system and the possibility of setting up a tightly coupled distributed system using the LAN in our lab. An analysis of the performance of different computational methods, such as FEM, MOM, FDTD and the finite difference method, is given. Our work on setting up a distributed system and the performance of the test bed is also included. Finally, we mention the implementation of one of our computational electromagnetics codes.

  12. Static Memory Deduplication for Performance Optimization in Cloud Computing

    Directory of Open Access Journals (Sweden)

    Gangyong Jia

    2017-04-01

    Full Text Available In a cloud computing environment, the number of virtual machines (VMs) on a single physical server and the number of applications running on each VM are continuously growing. This has led to an enormous increase in the demand of memory capacity and subsequent increase in the energy consumption in the cloud. Lack of enough memory has become a major bottleneck for scalability and performance of virtualization interfaces in cloud computing. To address this problem, memory deduplication techniques which reduce memory demand through page sharing are being adopted. However, such techniques suffer from overheads in terms of number of online comparisons required for the memory deduplication. In this paper, we propose a static memory deduplication (SMD) technique which can reduce memory capacity requirement and provide performance optimization in cloud computing. The main innovation of SMD is that the process of page detection is performed offline, thus potentially reducing the performance cost, especially in terms of response time. In SMD, page comparisons are restricted to the code segment, which has the highest shared content. Our experimental results show that SMD efficiently reduces memory capacity requirement and improves performance. We demonstrate that, compared to other approaches, the cost in terms of the response time is negligible.

  13. Static Memory Deduplication for Performance Optimization in Cloud Computing.

    Science.gov (United States)

    Jia, Gangyong; Han, Guangjie; Wang, Hao; Yang, Xuan

    2017-04-27

    In a cloud computing environment, the number of virtual machines (VMs) on a single physical server and the number of applications running on each VM are continuously growing. This has led to an enormous increase in the demand of memory capacity and subsequent increase in the energy consumption in the cloud. Lack of enough memory has become a major bottleneck for scalability and performance of virtualization interfaces in cloud computing. To address this problem, memory deduplication techniques which reduce memory demand through page sharing are being adopted. However, such techniques suffer from overheads in terms of number of online comparisons required for the memory deduplication. In this paper, we propose a static memory deduplication (SMD) technique which can reduce memory capacity requirement and provide performance optimization in cloud computing. The main innovation of SMD is that the process of page detection is performed offline, thus potentially reducing the performance cost, especially in terms of response time. In SMD, page comparisons are restricted to the code segment, which has the highest shared content. Our experimental results show that SMD efficiently reduces memory capacity requirement and improves performance. We demonstrate that, compared to other approaches, the cost in terms of the response time is negligible.
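
    The offline page-comparison step can be pictured with a hash-based sketch: pages of each VM's code segment are hashed, and pages whose hashes were already seen would be mapped to a single physical copy. The page size, the fake "shared library" bytes, and the savings counter are illustrative and not SMD's actual mechanism.

```python
import hashlib
import random

PAGE = 4096

def pages(segment):
    return [segment[i:i + PAGE] for i in range(0, len(segment), PAGE)]

def dedup_savings(code_segments):
    """code_segments: one bytes object per VM; count pages that hash to an existing page."""
    seen, total, shared = {}, 0, 0
    for seg in code_segments:
        for page in pages(seg):
            total += 1
            digest = hashlib.sha256(page).hexdigest()
            if digest in seen:
                shared += 1               # would be mapped to the existing physical page
            else:
                seen[digest] = page
    return total, shared

random.seed(0)
libc_like = bytes(random.getrandbits(8) for _ in range(4 * PAGE))   # 16 KiB "shared library"
vm_a = libc_like + b"A" * PAGE
vm_b = libc_like + b"B" * PAGE
total, shared = dedup_savings([vm_a, vm_b])
print(f"{shared}/{total} pages deduplicated")    # 4/10 here
```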

  14. Smart Memory Systems: Polymorphous Computing Architectures

    Science.gov (United States)

    2007-11-02

    this to check three types of code (in order of increasing difficulty): • AODV ad hoc network routing protocol. We checked three implementations of... by the authors of the AODV specification) [11]. • TCP. We checked the Linux TCP stack, which was roughly 10 times bigger than any AODV protocol. We... quad cache controller (CC) must be configured appropriately to send/route memory operations to the right memory mat or mats according to the operation

  15. Execution time supports for adaptive scientific algorithms on distributed memory machines

    Science.gov (United States)

    Berryman, Harry; Saltz, Joel; Scroggs, Jeffrey

    1990-01-01

    Optimizations are considered that are required for efficient execution of code segments that consist of loops over distributed data structures. The PARTI (Parallel Automated Runtime Toolkit at ICASE) execution time primitives are designed to carry out these optimizations and can be used to implement a wide range of scientific algorithms on distributed memory machines. These primitives allow the user to control array mappings in a way that gives an appearance of shared memory. Computations can be based on a global index set. Primitives are used to carry out gather and scatter operations on distributed arrays. Communication patterns are derived at runtime, and the appropriate send and receive messages are automatically generated.
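
    In the spirit of these primitives, the sketch below shows an inspector/executor split for a gather over a block-distributed array: the inspector derives, at run time, which rank owns each requested global index, and the executor collects the values. The block rule, rank simulation, and function names are illustrative, not PARTI's API.

```python
def block_owner(i, n, p):
    """Map global index i to (owning rank, local offset) under a block distribution."""
    block = -(-n // p)                       # ceil(n / p)
    return i // block, i % block

def inspector(global_indices, n, p):
    """Derive the communication schedule: which indices must be fetched from which rank."""
    schedule = {r: [] for r in range(p)}
    for i in global_indices:
        rank, offset = block_owner(i, n, p)
        schedule[rank].append((i, offset))
    return schedule

def executor_gather(schedule, local_arrays):
    """'Send' the requests and collect the owning ranks' replies."""
    return {i: local_arrays[rank][offset]
            for rank, reqs in schedule.items() for i, offset in reqs}

n, p = 12, 3
local_arrays = [[10 * r + k for k in range(4)] for r in range(p)]   # block-distributed data
wanted = [0, 5, 7, 11]                        # indices touched by some irregular loop
sched = inspector(wanted, n, p)
print(executor_gather(sched, local_arrays))   # {0: 0, 5: 11, 7: 13, 11: 23}
```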

  16. Distributed trace using central performance counter memory

    Science.gov (United States)

    Satterfield, David L.; Sexton, James C.

    2013-01-22

    A plurality of processing cores and a central storage unit having at least one memory are connected in a daisy chain manner, forming a daisy chain ring layout on an integrated chip. At least one of the plurality of processing cores places trace data on the daisy chain connection for transmitting the trace data to the central storage unit, and the central storage unit detects the trace data and stores the trace data in the memory co-located with the central storage unit.

  17. Memory Benchmarks for SMP-Based High Performance Parallel Computers

    Energy Technology Data Exchange (ETDEWEB)

    Yoo, A B; de Supinski, B; Mueller, F; Mckee, S A

    2001-11-20

    As the speed gap between CPU and main memory continues to grow, memory accesses increasingly dominate the performance of many applications. The problem is particularly acute for symmetric multiprocessor (SMP) systems, where the shared memory may be accessed concurrently by a group of threads running on separate CPUs. Unfortunately, several key issues governing memory system performance in current systems are not well understood. Complex interactions between the levels of the memory hierarchy, buses or switches, DRAM back-ends, system software, and application access patterns can make it difficult to pinpoint bottlenecks and determine appropriate optimizations, and the situation is even more complex for SMP systems. To partially address this problem, we formulated a set of multi-threaded microbenchmarks for characterizing and measuring the performance of the underlying memory system in SMP-based high-performance computers. We report our use of these microbenchmarks on two important SMP-based machines. This paper has four primary contributions. First, we introduce a microbenchmark suite to systematically assess and compare the performance of different levels in SMP memory hierarchies. Second, we present a new tool based on hardware performance monitors to determine a wide array of memory system characteristics, such as cache sizes, quickly and easily; by using this tool, memory performance studies can be targeted to the full spectrum of performance regimes with many fewer data points than is otherwise required. Third, we present experimental results indicating that the performance of applications with large memory footprints remains largely constrained by memory. Fourth, we demonstrate that thread-level parallelism further degrades memory performance, even for the latest SMPs with hardware prefetching and switch-based memory interconnects.

  18. High Performance Polar Decomposition on Distributed Memory Systems

    KAUST Repository

    Sukkari, Dalal

    2016-08-08

    The polar decomposition of a dense matrix is an important operation in linear algebra. It can be directly calculated through the singular value decomposition (SVD) or iteratively using the QR dynamically-weighted Halley algorithm (QDWH). The former is difficult to parallelize due to the preponderant number of memory-bound operations during the bidiagonal reduction. We investigate the latter scenario, which performs more floating-point operations but exposes at the same time more parallelism, and therefore, runs closer to the theoretical peak performance of the system, thanks to more compute-bound matrix operations. Profiling results show the performance scalability of QDWH for calculating the polar decomposition using around 9200 MPI processes on well and ill-conditioned matrices of 100K×100K problem size. We study then the performance impact of the QDWH-based polar decomposition as a pre-processing step toward calculating the SVD itself. The new distributed-memory implementation of the QDWH-SVD solver achieves up to five-fold speedup against current state-of-the-art vendor SVD implementations. © Springer International Publishing Switzerland 2016.
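
    For reference, the SVD route mentioned in the first sentence of this record is easy to state in a few lines of NumPy; this is the textbook construction A = U H with U = W @ Vt and H = Vt.T @ diag(s) @ Vt, not the paper's QDWH implementation.

```python
import numpy as np

def polar_via_svd(a):
    """Return (U, H) with A = U @ H, U having orthonormal columns, H symmetric PSD."""
    w, s, vt = np.linalg.svd(a, full_matrices=False)
    u = w @ vt
    h = vt.conj().T @ np.diag(s) @ vt
    return u, h

rng = np.random.default_rng(1)
a = rng.standard_normal((200, 200))
u, h = polar_via_svd(a)
print(np.allclose(u @ h, a), np.allclose(u.T @ u, np.eye(200)))   # True True
```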

  19. Memory management and compiler support for rapid recovery from failures in computer systems

    Science.gov (United States)

    Fuchs, W. K.

    1991-01-01

    This paper describes recent developments in the use of memory management and compiler technology to support rapid recovery from failures in computer systems. The techniques described include cache coherence protocols for user transparent checkpointing in multiprocessor systems, compiler-based checkpoint placement, compiler-based code modification for multiple instruction retry, and forward recovery in distributed systems utilizing optimistic execution.

  20. Parallel Implementation of a Semidefinite Programming Solver based on CSDP in a distributed memory cluster

    NARCIS (Netherlands)

    Ivanov, I.D.; de Klerk, E.

    2007-01-01

    In this paper we present the algorithmic framework and practical aspects of implementing a parallel version of a primal-dual semidefinite programming solver on a distributed memory computer cluster. Our implementation is based on the CSDP solver and uses a message passing interface (MPI), and the Sc

  1. Parallel Implementation of a Semidefinite Programming Solver based on CSDP in a distributed memory cluster

    NARCIS (Netherlands)

    Ivanov, I.D.; de Klerk, E.

    2007-01-01

    In this paper we present the algorithmic framework and practical aspects of implementing a parallel version of a primal-dual semidefinite programming solver on a distributed memory computer cluster. Our implementation is based on the CSDP solver and uses a message passing interface (MPI), and the Sc

  2. Distributed Cloud Computing (Dagstuhl Seminar 15072)

    OpenAIRE

    Coady, Yvonne; Kempf, James; McGeer, Rick; Schmid, Stefan

    2015-01-01

    This report documents the program and the outcomes of Dagstuhl Seminar 15072 "Distributed Cloud Computing". A distributed cloud connecting multiple, geographically distributed and smaller datacenters can be an attractive alternative to today's massive, centralized datacenters. A distributed cloud can reduce communication overheads, costs, and latencies by offering nearby computation and storage resources. Better data locality can also improve privacy. In this seminar, we revisit the vision o...

  3. Performance Assessment of OVERFLOW on Distributed Computing Environment

    Science.gov (United States)

    Djomehri, M. Jahed; Rizk, Yehia M.

    2000-01-01

    The aerodynamic computer code, OVERFLOW, with a multi-zone overset grid feature, has been parallelized to enhance its performance on distributed and shared memory paradigms. Practical application benchmarks have been set to assess the efficiency of the code's parallelism on high-performance architectures. The code's performance has also been evaluated in the context of the distributed computing paradigm on distant computer resources using the Information Power Grid (IPG) toolkit, Globus. Two parallel versions of the code, namely OVERFLOW-MPI and -MLP, have been developed around the natural coarse-grained parallelism inherent in a multi-zonal domain decomposition paradigm. The algorithm invokes a strategy that forms a number of groups, each consisting of a zone, a cluster of zones and/or a partition of a large zone. Each group can be thought of as a process with one or multiple threads assigned to it, and all groups run in parallel. The -MPI version of the code uses explicit message-passing based on the standard MPI library for sending and receiving interzonal boundary data across processors. The -MLP version employs no message-passing paradigm; the boundary data is transferred through the shared memory. The -MPI code is suited for both distributed and shared memory architectures, while the -MLP code can only be used on shared memory platforms. The IPG applications are implemented by the -MPI code using the Globus toolkit. While a computational task is distributed across multiple computer resources, the parallelism can be exploited on each resource alone. Performance studies are achieved with some practical aerodynamic problems with complex geometries, consisting of 2.5 up to 33 million grid points and a large number of zonal blocks. The computations were executed primarily on SGI Origin 2000 multiprocessors and on the Cray T3E. OVERFLOW's IPG applications are carried out on NASA homogeneous metacomputing machines located at three sites, Ames, Langley and Glenn. Plans

  4. Automaticity and control in prospective memory: a computational model.

    Directory of Open Access Journals (Sweden)

    Sam J Gilbert

    Full Text Available Prospective memory (PM) refers to our ability to realize delayed intentions. In event-based PM paradigms, participants must act on an intention when they detect the occurrence of a pre-established cue. Some theorists propose that in such paradigms PM responding can only occur when participants deliberately initiate processes for monitoring their environment for appropriate cues. Others propose that perceptual processing of PM cues can directly trigger PM responding in the absence of strategic monitoring, at least under some circumstances. In order to address this debate, we present a computational model implementing the latter account, using a parallel distributed processing (interactive activation) framework. In this model PM responses can be triggered directly as a result of spreading activation from units representing perceptual inputs. PM responding can also be promoted by top-down monitoring for PM targets. The model fits a wide variety of empirical findings from PM paradigms, including the effect of maintaining PM intentions on ongoing response time and the intention superiority effect. The model also makes novel predictions concerning the effect of stimulus degradation on PM performance, the shape of response time distributions on ongoing and prospective memory trials, and the effects of instructing participants to make PM responses instead of ongoing responses or alongside them. These predictions were confirmed in two empirical experiments. We therefore suggest that PM should be considered to result from the interplay between bottom-up triggering of PM responses by perceptual input, and top-down monitoring for appropriate cues. We also show how the model can be extended to simulate encoding new intentions and subsequently deactivating them, and consider links between the model's performance and results from neuroimaging.

  5. A comparison of distributed memory and virtual shared memory parallel programming models

    Energy Technology Data Exchange (ETDEWEB)

    Keane, J.A. [Univ. of Manchester (United Kingdom). Dept. of Computer Science; Grant, A.J. [Univ. of Manchester (United Kingdom). Computer Graphics Unit; Xu, M.Q. [Argonne National Lab., IL (United States)

    1993-04-01

    The virtues of the different parallel programming models, shared memory and distributed memory, have been much debated. Conventionally the debate could be reduced to programming convenience on the one hand, and high scalability factors on the other. More recently the debate has become somewhat blurred with the provision of virtual shared memory models built on machines with physically distributed memory. The intention of such models/machines is to provide scalable shared memory, i.e. to provide both programmer convenience and high scalability. In this paper, the different models are considered from experiences gained with a number of systems ranging from applications in both commerce and science to languages and operating systems. Case studies are introduced as appropriate.

  6. Distributed Function Computation in Asymmetric Communication Scenarios

    CERN Document Server

    Agnihotri, Samar

    2009-01-01

    We consider the distributed function computation problem in asymmetric communication scenarios, where the sink computes some deterministic function of the data split among N correlated informants. The distributed function computation problem is addressed as a generalization of the distributed source coding (DSC) problem. We are mainly interested in minimizing the number of informant bits required, in the worst case, to allow the sink to exactly compute the function. We provide a constructive solution for this in terms of an interactive communication protocol and prove its optimality. The proposed protocol also allows us to compute the worst-case achievable rate-region for the computation of any function. We define two classes of functions: lossy and lossless. We show that, in general, the lossy functions can be computed at the sink with fewer informant bits than the DSC problem, while computation of the lossless functions requires as many informant bits as the DSC problem.

  7. Translation techniques for distributed-shared memory programming models

    Energy Technology Data Exchange (ETDEWEB)

    Fuller, Douglas James [Iowa State Univ., Ames, IA (United States)

    2005-01-01

    The high performance computing community has experienced an explosive improvement in distributed-shared memory hardware. Driven by increasing real-world problem complexity, this explosion has ushered in vast numbers of new systems. Each new system presents new challenges to programmers and application developers. Part of the challenge is adapting to new architectures with new performance characteristics. Different vendors release systems with widely varying architectures that perform differently in different situations. Furthermore, since vendors need only provide a single performance number (total MFLOPS, typically for a single benchmark), they only have strong incentive initially to optimize the API of their choice. Consequently, only a fraction of the available APIs are well optimized on most systems. This causes issues porting and writing maintainable software, let alone issues for programmers burdened with mastering each new API as it is released. Also, programmers wishing to use a certain machine must choose their API based on the underlying hardware instead of the application. This thesis argues that a flexible, extensible translator for distributed-shared memory APIs can help address some of these issues. For example, a translator might take as input code in one API and output an equivalent program in another. Such a translator could provide instant porting for applications to new systems that do not support the application's library or language natively. While open-source APIs are abundant, they do not perform optimally everywhere. A translator would also allow performance testing using a single base code translated to a number of different APIs. Most significantly, this type of translator frees programmers to select the most appropriate API for a given application based on the application (and developer) itself instead of the underlying hardware.

  8. Translation techniques for distributed-shared memory programming models

    Energy Technology Data Exchange (ETDEWEB)

    Fuller, Douglas James

    2005-08-01

    The high performance computing community has experienced an explosive improvement in distributed-shared memory hardware. Driven by increasing real-world problem complexity, this explosion has ushered in vast numbers of new systems. Each new system presents new challenges to programmers and application developers. Part of the challenge is adapting to new architectures with new performance characteristics. Different vendors release systems with widely varying architectures that perform differently in different situations. Furthermore, since vendors need only provide a single performance number (total MFLOPS, typically for a single benchmark), they only have strong incentive initially to optimize the API of their choice. Consequently, only a fraction of the available APIs are well optimized on most systems. This causes issues porting and writing maintainable software, let alone issues for programmers burdened with mastering each new API as it is released. Also, programmers wishing to use a certain machine must choose their API based on the underlying hardware instead of the application. This thesis argues that a flexible, extensible translator for distributed-shared memory APIs can help address some of these issues. For example, a translator might take as input code in one API and output an equivalent program in another. Such a translator could provide instant porting for applications to new systems that do not support the application's library or language natively. While open-source APIs are abundant, they do not perform optimally everywhere. A translator would also allow performance testing using a single base code translated to a number of different APIs. Most significantly, this type of translator frees programmers to select the most appropriate API for a given application based on the application (and developer) itself instead of the underlying hardware.

  9. A system for simulating shared memory in heterogeneous distributed-memory networks with specialization for robotics applications

    Energy Technology Data Exchange (ETDEWEB)

    Jones, J.P.; Bangs, A.L.; Butler, P.L.

    1991-01-01

    Hetero Helix is a programming environment which simulates shared memory on a heterogeneous network of distributed-memory computers. The machines in the network may vary with respect to their native operating systems and internal representation of numbers. Hetero Helix presents a simple programming model to developers, and also considers the needs of designers, system integrators, and maintainers. The key software technology underlying Hetero Helix is the use of a "compiler" which analyzes the data structures in shared memory and automatically generates code which translates data representations from the format native to each machine into a common format, and vice versa. The design of Hetero Helix was motivated in particular by the requirements of robotics applications. Hetero Helix has been used successfully in an integration effort involving 27 CPUs in a heterogeneous network and a body of software totaling roughly 100,000 lines of code. 25 refs., 6 figs.
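    To make the idea of automatically generated representation-translation code concrete, the sketch below (not Hetero Helix's actual output; the "pose" record layout and field names are invented for illustration) packs native values into a fixed big-endian common format and unpacks them again, which is the kind of marshalling code such a compiler would emit for each shared data structure.

      import struct

      # Illustrative sketch only: a data-structure "compiler" would generate this
      # kind of marshalling code from the shared-memory declarations; the record
      # layout ("pose": 3 doubles + 1 int32) is a made-up example.
      POSE_FORMAT = ">dddi"   # '>' forces big-endian as the common wire format

      def to_common(x, y, z, frame_id):
          """Translate native values into the machine-independent byte layout."""
          return struct.pack(POSE_FORMAT, x, y, z, frame_id)

      def from_common(buf):
          """Translate the common byte layout back into native Python values."""
          return struct.unpack(POSE_FORMAT, buf)

      if __name__ == "__main__":
          wire = to_common(1.0, 2.5, -0.75, 42)
          print(from_common(wire))   # (1.0, 2.5, -0.75, 42) on any host byte order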

  10. Associative Memory computing power and its simulation.

    CERN Document Server

    Volpi, G; The ATLAS collaboration

    2014-01-01

    The associative memory (AM) chip is an ASIC device specifically designed to perform "pattern matching" at very high speed and with parallel access to memory locations. The most extensive use for such a device will be the ATLAS Fast Tracker (FTK) processor, where more than 8000 chips will be installed in 128 VME boards, specifically designed for high throughput in order to exploit the chip's features. Each AM chip will store a database of about 130000 pre-calculated patterns, allowing FTK to use about 1 billion patterns for the whole system, with any data inquiry broadcast to all memory elements simultaneously within the same clock cycle (10 ns), so that data retrieval time is independent of the database size. Speed and size of the system are crucial for real-time High Energy Physics applications, such as the ATLAS FTK processor. Using 80 million channels of the ATLAS tracker, FTK finds tracks within 100 $\mathrm{\mu s}$. The simulation of such a parallelized system is an extremely complex task when executed in comm...
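    As a rough software analogue of this behaviour (purely illustrative: the bank size and 64-bit pattern width are arbitrary, and a real AM chip performs the comparison in custom silicon with don't-care bits), a query can be checked against every stored pattern in a single vectorised operation:

      import numpy as np

      # Software analogue of associative-memory pattern matching: the query is
      # "broadcast" to every stored pattern in one vectorised comparison.
      # Pattern width and bank size are arbitrary illustrative choices.
      rng = np.random.default_rng(0)
      bank = rng.integers(0, 2, size=(130_000, 64), dtype=np.uint8)  # stored patterns
      query = bank[12345].copy()                                     # a known pattern

      matches = np.where((bank == query).all(axis=1))[0]
      print(matches)   # indices of all patterns equal to the query, e.g. [12345]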

  11. Brain computer interface to enhance episodic memory in human participants

    Directory of Open Access Journals (Sweden)

    John F Burke

    2015-01-01

    Full Text Available Recent research has revealed that neural oscillations in the theta (4-8 Hz) and alpha (9-14 Hz) bands are predictive of future success in memory encoding. Because these signals occur before the presentation of an upcoming stimulus, they are considered stimulus-independent in that they correlate with enhanced memory encoding independent of the item being encoded. Thus, such stimulus-independent activity has important implications for the neural mechanisms underlying episodic memory as well as the development of cognitive neural prosthetics. Here, we developed a brain computer interface (BCI) to test the ability of such pre-stimulus activity to modulate subsequent memory encoding. We recorded intracranial electroencephalography (iEEG) in neurosurgical patients as they performed a free recall memory task, and detected iEEG theta and alpha oscillations that correlated with optimal memory encoding. We then used these detected oscillatory changes to trigger the presentation of items in the free recall task. We found that item presentation contingent upon the presence of prestimulus theta and alpha oscillations modulated memory performance in more sessions than expected by chance. Our results suggest that an electrophysiological signal may be causally linked to a specific behavioral condition, and contingent stimulus presentation has the potential to modulate human memory encoding.
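    A minimal sketch of the closed-loop principle, assuming an invented sampling rate, window length and thresholds rather than the study's actual detection parameters, is to estimate theta and alpha band power in the most recent window and gate item presentation on it:

      import numpy as np

      # Minimal sketch of the closed-loop idea: estimate theta (4-8 Hz) and alpha
      # (9-14 Hz) power in the most recent window of a signal and "present an item"
      # when both exceed a threshold. Sampling rate, window length and thresholds
      # are illustrative assumptions, not the study's parameters.
      FS = 500            # samples per second (assumed)
      WIN = FS            # 1-second analysis window

      def band_power(window, lo, hi, fs=FS):
          spectrum = np.abs(np.fft.rfft(window)) ** 2
          freqs = np.fft.rfftfreq(window.size, d=1.0 / fs)
          return spectrum[(freqs >= lo) & (freqs <= hi)].mean()

      def should_present(window, theta_thresh=1.0, alpha_thresh=1.0):
          return (band_power(window, 4, 8) > theta_thresh and
                  band_power(window, 9, 14) > alpha_thresh)

      if __name__ == "__main__":
          t = np.arange(WIN) / FS
          ieeg_window = np.sin(2 * np.pi * 6 * t) + np.sin(2 * np.pi * 11 * t)
          print(should_present(ieeg_window))   # True for this synthetic oscillation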

  12. Reducing communication costs in the conjugate gradient algorithm on distributed memory multiprocessors

    Energy Technology Data Exchange (ETDEWEB)

    D'Azevedo, E.F.; Romine, C.H.

    1992-09-01

    The standard formulation of the conjugate gradient algorithm involves two inner product computations. The results of these two inner products are needed to update the search direction and the computed solution. In a distributed memory parallel environment, the computation and subsequent distribution of these two values requires two separate communication and synchronization phases. In this paper, we present a mathematically equivalent rearrangement of the standard algorithm that reduces the number of communication phases. We give a second derivation of the modified conjugate gradient algorithm in terms of the natural relationship with the underlying Lanczos process. We also present empirical evidence of the stability of this modified algorithm.
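    The two synchronisation points are easy to see in a sketch of a standard distributed conjugate gradient iteration, shown here with mpi4py and, for simplicity, a block-diagonal matrix so that the matrix-vector product needs no communication; this illustrates the baseline only, not the paper's rearranged algorithm, which combines the two reductions into a single phase.

      from mpi4py import MPI
      import numpy as np

      # Sketch of a distributed CG iteration: each rank owns a block of the vectors
      # and (as a simplifying assumption) a block-diagonal SPD matrix, so the matvec
      # is purely local. The two allreduce calls are the two inner-product
      # synchronisation points targeted by the paper.
      comm = MPI.COMM_WORLD
      rng = np.random.default_rng(comm.Get_rank())

      n_local = 100
      M = rng.standard_normal((n_local, n_local))
      A_local = M @ M.T + n_local * np.eye(n_local)     # SPD local block (assumed)
      b_local = rng.standard_normal(n_local)

      x = np.zeros(n_local)
      r = b_local.copy()
      p = r.copy()
      rho = comm.allreduce(r @ r, op=MPI.SUM)           # communication phase 1

      for _ in range(50):
          q = A_local @ p                               # local matvec (block diagonal)
          alpha = rho / comm.allreduce(p @ q, op=MPI.SUM)   # communication phase 2
          x += alpha * p
          r -= alpha * q
          rho_new = comm.allreduce(r @ r, op=MPI.SUM)       # phase 1, next iteration
          p = r + (rho_new / rho) * p
          rho = rho_new

      if comm.Get_rank() == 0:
          print("global residual norm:", np.sqrt(rho))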

  13. Supporting shared data structures on distributed memory architectures

    Science.gov (United States)

    Koelbel, Charles; Mehrotra, Piyush; Vanrosendale, John

    1990-01-01

    Programming nonshared memory systems is more difficult than programming shared memory systems, since there is no support for shared data structures. Current programming languages for distributed memory architectures force the user to decompose all data structures into separate pieces, with each piece owned by one of the processors in the machine, and with all communication explicitly specified by low-level message-passing primitives. A new programming environment is presented for distributed memory architectures, providing a global name space and allowing direct access to remote parts of data values. The analysis and program transformations required to implement this environment are described, and the efficiency of the resulting code on the NCUBE/7 and IPSC/2 hypercubes is discussed.
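    A tiny sketch of the bookkeeping such an environment hides from the user, assuming a plain one-dimensional block distribution with invented sizes: a global index is mapped to its owning processor and a local offset, so that a reference to a remote part of a data value can be turned into a message to that owner.

      # Minimal sketch of global-name-space bookkeeping: map a global array index
      # onto (owning processor, local index) for a block distribution. The array
      # size and processor count below are illustrative.
      def block_owner(global_index, n_global, n_procs):
          """Return (owner, local_index) for a block-distributed 1-D array."""
          block = -(-n_global // n_procs)          # ceiling division: block size
          owner = global_index // block
          return owner, global_index - owner * block

      if __name__ == "__main__":
          # A remote read of A[g] becomes: ask `owner` for its element `local`.
          for g in (0, 499, 500, 999):
              print(g, block_owner(g, n_global=1000, n_procs=4))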

  14. PGHPF – An Optimizing High Performance Fortran Compiler for Distributed Memory Machines

    Directory of Open Access Journals (Sweden)

    Zeki Bozkus

    1997-01-01

    Full Text Available High Performance Fortran (HPF) is the first widely supported, efficient, and portable parallel programming language for shared and distributed memory systems. HPF is realized through a set of directive-based extensions to Fortran 90. It enables application developers and Fortran end-users to write compact, portable, and efficient software that will compile and execute on workstations, shared memory servers, clusters, traditional supercomputers, or massively parallel processors. This article describes a production-quality HPF compiler for a set of parallel machines. Compilation techniques such as data and computation distribution, communication generation, run-time support, and optimization issues are elaborated as the basis for an HPF compiler implementation on distributed memory machines. The performance of this compiler on benchmark programs demonstrates that high efficiency can be achieved executing HPF code on parallel architectures.

  15. Distributed computing system with dual independent communications paths between computers and employing split tokens

    Science.gov (United States)

    Rasmussen, Robert D. (Inventor); Manning, Robert M. (Inventor); Lewis, Blair F. (Inventor); Bolotin, Gary S. (Inventor); Ward, Richard S. (Inventor)

    1990-01-01

    This is a distributed computing system providing flexible fault tolerance; ease of software design and concurrency specification; and dynamic balance of the loads. The system comprises a plurality of computers each having a first input/output interface and a second input/output interface for interfacing to communications networks, each second input/output interface including a bypass for bypassing the associated computer. A global communications network interconnects the first input/output interfaces for providing each computer the ability to broadcast messages simultaneously to the remainder of the computers. A meshwork communications network interconnects the second input/output interfaces providing each computer with the ability to establish a communications link with another of the computers bypassing the remainder of the computers. Each computer is controlled by a resident copy of a common operating system. Communications between respective ones of the computers are by means of split tokens, each having a moving first portion which is sent from computer to computer and a resident second portion which is disposed in the memory of at least one of the computers, and wherein the location of the second portion is part of the first portion. The split tokens represent both functions to be executed by the computers and data to be employed in the execution of the functions. The first input/output interfaces each include logic for detecting a collision between messages and for terminating the broadcasting of a message whereby collisions between messages are detected and avoided.

  16. Hybrid computing using a neural network with dynamic external memory.

    Science.gov (United States)

    Graves, Alex; Wayne, Greg; Reynolds, Malcolm; Harley, Tim; Danihelka, Ivo; Grabska-Barwińska, Agnieszka; Colmenarejo, Sergio Gómez; Grefenstette, Edward; Ramalho, Tiago; Agapiou, John; Badia, Adrià Puigdomènech; Hermann, Karl Moritz; Zwols, Yori; Ostrovski, Georg; Cain, Adam; King, Helen; Summerfield, Christopher; Blunsom, Phil; Kavukcuoglu, Koray; Hassabis, Demis

    2016-10-27

    Artificial neural networks are remarkably adept at sensory processing, sequence learning and reinforcement learning, but are limited in their ability to represent variables and data structures and to store data over long timescales, owing to the lack of an external memory. Here we introduce a machine learning model called a differentiable neural computer (DNC), which consists of a neural network that can read from and write to an external memory matrix, analogous to the random-access memory in a conventional computer. Like a conventional computer, it can use its memory to represent and manipulate complex data structures, but, like a neural network, it can learn to do so from data. When trained with supervised learning, we demonstrate that a DNC can successfully answer synthetic questions designed to emulate reasoning and inference problems in natural language. We show that it can learn tasks such as finding the shortest path between specified points and inferring the missing links in randomly generated graphs, and then generalize these tasks to specific graphs such as transport networks and family trees. When trained with reinforcement learning, a DNC can complete a moving blocks puzzle in which changing goals are specified by sequences of symbols. Taken together, our results demonstrate that DNCs have the capacity to solve complex, structured tasks that are inaccessible to neural networks without external read-write memory.

  17. Computing with memory for energy-efficient robust systems

    CERN Document Server

    Paul, Somnath

    2013-01-01

    This book analyzes energy and reliability as major challenges faced by designers of computing frameworks in the nanometer technology regime.  The authors describe the existing solutions to address these challenges and then reveal a new reconfigurable computing platform, which leverages high-density nanoscale memory for both data storage and computation to maximize the energy-efficiency and reliability. The energy and reliability benefits of this new paradigm are illustrated and the design challenges are discussed. Various hardware and software aspects of this exciting computing paradigm are de

  18. Computational modeling of memory allocation in neuronal and dendritic populations

    Directory of Open Access Journals (Sweden)

    George I Kastellakis

    2014-03-01

    Full Text Available Recent studies using molecular and cellular approaches have established that memory is supported by distributed and sparse populations of neurons. The allocation of neurons and synapses to store a long term memory engram is not random, but depends on properties such as neuronal excitability and CREB activation. The consolidation of synaptic plasticity, which is believed to serve long-term memory storage, is dependent on protein availability, and shaped by the mechanism of synaptic tagging and capture. In addition, dendritic protein synthesis allows for compartmentalized plasticity and synapse clustering. The implications of the rules governing long-term memory allocation in neurons and their dendrites are not yet known. To this aim, we present a model that incorporates multiple plasticity-related mechanisms which are known to be active during memory allocation and consolidation. Using this model, we show that memory allocation in neurons and their dendrites is affected by dendritic protein synthesis, and that the late-LTP associativity mechanisms allow related memories to be stored in overlapping populations of neurons.

  19. Experiences with distributed computing for meteorological applications: grid computing and cloud computing

    OpenAIRE

    Oesterle, F.; Ostermann, S; R. Prodan; G. J. Mayr

    2015-01-01

    Experiences with three practical meteorological applications with different characteristics are used to highlight the core computer science aspects and applicability of distributed computing to meteorology. Through presenting cloud and grid computing this paper shows use case scenarios fitting a wide range of meteorological applications from operational to research studies. The paper concludes that distributed computing complements and extends existing high performance comput...

  20. Multiple-User, Multitasking, Virtual-Memory Computer System

    Science.gov (United States)

    Generazio, Edward R.; Roth, Don J.; Stang, David B.

    1993-01-01

    Computer system designed and programmed to serve multiple users in research laboratory. Provides for computer control and monitoring of laboratory instruments, acquisition and analysis of data from those instruments, and interaction with users via remote terminals. System provides fast access to shared central processing units and associated large (from megabytes to gigabytes) memories. Underlying concept of system also applicable to monitoring and control of industrial processes.

  1. Graphical Visualization on Computational Simulation Using Shared Memory

    Science.gov (United States)

    Lima, A. B.; Correa, Eberth

    2014-03-01

    The Shared Memory technique is a powerful tool for parallelizing computer codes. In particular, it can be used to visualize the results "on the fly" without stopping the running simulation. In this presentation we discuss and show how to use the technique in conjunction with a visualization code using OpenGL.
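    The sketch below illustrates the same idea in Python terms rather than the authors' OpenGL code (the array size and update rule are invented): a simulation process keeps updating a field that lives in shared memory while a separate viewer samples it on the fly, without pausing the run.

      import time
      import numpy as np
      from multiprocessing import Process, shared_memory

      # Sketch only: the "simulation" writes into a shared-memory array while the
      # parent process reads it live. Array size and the sine-wave update rule are
      # invented for the example.
      def simulate(name, steps):
          shm = shared_memory.SharedMemory(name=name)
          field = np.ndarray((100,), dtype=np.float64, buffer=shm.buf)
          for step in range(steps):
              field[:] = np.sin(np.linspace(0.0, 2.0 * np.pi, 100) + 0.1 * step)
              time.sleep(0.01)
          del field                      # drop the view before detaching
          shm.close()

      if __name__ == "__main__":
          shm = shared_memory.SharedMemory(create=True, size=100 * 8)
          field = np.ndarray((100,), dtype=np.float64, buffer=shm.buf)
          sim = Process(target=simulate, args=(shm.name, 100))
          sim.start()
          for _ in range(5):             # the "viewer" samples the running state
              time.sleep(0.05)
              print("current max:", field.max())
          sim.join()
          del field
          shm.close()
          shm.unlink()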

  2. Next generation distributed computing for cancer research.

    Science.gov (United States)

    Agarwal, Pankaj; Owzar, Kouros

    2014-01-01

    Advances in next generation sequencing (NGS) and mass spectrometry (MS) technologies have provided many new opportunities and angles for extending the scope of translational cancer research while creating tremendous challenges in data management and analysis. The resulting informatics challenge is invariably not amenable to the use of traditional computing models. Recent advances in scalable computing and associated infrastructure, particularly distributed computing for Big Data, can provide solutions for addressing these challenges. In this review, the next generation of distributed computing technologies that can address these informatics problems is described from the perspective of three key components of a computational platform, namely computing, data storage and management, and networking. A broad overview of scalable computing is provided to set the context for a detailed description of Hadoop, a technology that is being rapidly adopted for large-scale distributed computing. A proof-of-concept Hadoop cluster, set up for performance benchmarking of NGS read alignment, is described as an example of how to work with Hadoop. Finally, Hadoop is compared with a number of other current technologies for distributed computing.

  3. Magnetic Random Access Memory for Embedded Computing

    Science.gov (United States)

    2007-10-29

  4. Distributed computing by oblivious mobile robots

    CERN Document Server

    Flocchini, Paola; Santoro, Nicola

    2012-01-01

    The study of what can be computed by a team of autonomous mobile robots, originally started in robotics and AI, has become increasingly popular in theoretical computer science (especially in distributed computing), where it is now an integral part of the investigations on computability by mobile entities. The robots are identical computational entities located and able to move in a spatial universe; they operate without explicit communication and are usually unable to remember the past; they are extremely simple, with limited resources, and individually quite weak. However, collectively the ro

  5. Impossibility results for distributed computing

    CERN Document Server

    Attiya, Hagit

    2014-01-01

    To understand the power of distributed systems, it is necessary to understand their inherent limitations: what problems cannot be solved in particular systems, or without sufficient resources (such as time or space). This book presents key techniques for proving such impossibility results and applies them to a variety of different problems in a variety of different system models. Insights gained from these results are highlighted, aspects of a problem that make it difficult are isolated, features of an architecture that make it inadequate for solving certain problems efficiently are identified

  6. Task allocation in a distributed computing system

    Science.gov (United States)

    Seward, Walter D.

    1987-01-01

    A conceptual framework is examined for task allocation in distributed systems. Application and computing system parameters critical to task allocation decision processes are discussed. Task allocation techniques are addressed which focus on achieving a balance in the load distribution among the system's processors. Equalization of computing load among the processing elements is the goal. Examples of system performance are presented for specific applications. Both static and dynamic allocation of tasks are considered and system performance is evaluated using different task allocation methodologies.

  7. Distributed computer systems theory and practice

    CERN Document Server

    Zedan, H S M

    2014-01-01

    Distributed Computer Systems: Theory and Practice is a collection of papers dealing with the design and implementation of operating systems, including distributed systems, such as the amoeba system, argus, Andrew, and grapevine. One paper discusses the concepts and notations for concurrent programming, particularly language notation used in computer programming, synchronization methods, and also compares three classes of languages. Another paper explains load balancing or load redistribution to improve system performance, namely, static balancing and adaptive load balancing. For program effici

  8. ATLAS Distributed Computing in LHC Run2

    CERN Document Server

    Campana, Simone; The ATLAS collaboration

    2015-01-01

    The ATLAS Distributed Computing infrastructure has evolved after the first period of LHC data taking in order to cope with the challenges of the upcoming LHC Run2. An increased data rate and computing demands of the Monte-Carlo simulation, as well as new approaches to ATLAS analysis, dictated a more dynamic workload management system (ProdSys2) and data management system (Rucio), overcoming the boundaries imposed by the design of the old computing model. In particular, the commissioning of new central computing system components was the core part of the migration toward the flexible computing model. The flexible computing utilization exploring the opportunistic resources such as HPC, cloud, and volunteer computing is embedded in the new computing model, the data access mechanisms have been enhanced with the remote access, and the network topology and performance is deeply integrated into the core of the system. Moreover a new data management strategy, based on defined lifetime for each dataset, has been defin...

  9. LHCb: LHCb Distributed Computing Operations

    CERN Multimedia

    Stagni, F

    2011-01-01

    The proliferation of tools for monitoring both activities and infrastructure, together with the pressing need for prompt reaction to problems impacting data taking, data reconstruction, data reprocessing and user analysis, led to the need to better organize the huge amount of information available. The monitoring system for LHCb Grid Computing relies on many heterogeneous and independent sources of information offering different views for a better understanding of problems, while an operations team and defined procedures have been put in place to handle them. This work summarizes the state of the art of LHCb Grid operations, emphasizing the reasons behind various choices and the tools currently in use to run our daily activities. We highlight the most common problems experienced across years of activity on the WLCG infrastructure, the services with their criticality, the procedures in place, the relevant metrics, and the tools available as well as the ones still missing.

  10. Rendezvous Facilities in a Distributed Computer System

    Institute of Scientific and Technical Information of China (English)

    廖先Zhi; 金兰

    1995-01-01

    The distributed computer system described in this paper is a set of computer nodes interconnected in an interconnection network via packet-switching interfaces.The nodes communicate with each other by means of message-passing protocols.This paper presents the implementation of rendezvous facilities as high-level primitives provided by a parallel programming language to support interprocess communication and synchronization.
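    As a toy illustration of rendezvous semantics (threads stand in for nodes, and the class below is an invented sketch rather than the paper's primitives), the caller blocks until the acceptor has taken the request and produced a reply, so both sides synchronise at the interaction point:

      import queue
      import threading

      # Invented sketch of a rendezvous primitive: the caller blocks until the
      # acceptor has processed the request, so both sides meet at the interaction
      # point. Real implementations run over message-passing interfaces.
      class Rendezvous:
          def __init__(self):
              self._calls = queue.Queue()

          def call(self, request):                 # caller side: blocks for reply
              reply_box = queue.Queue(maxsize=1)
              self._calls.put((request, reply_box))
              return reply_box.get()

          def accept(self, handler):               # acceptor side: blocks for call
              request, reply_box = self._calls.get()
              reply_box.put(handler(request))

      if __name__ == "__main__":
          rv = Rendezvous()
          server = threading.Thread(target=lambda: rv.accept(lambda x: x * 2))
          server.start()
          print(rv.call(21))                       # prints 42 after the rendezvous
          server.join()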

  11. Mobile Agents in Networking and Distributed Computing

    CERN Document Server

    Cao, Jiannong

    2012-01-01

    The book focuses on mobile agents, which are computer programs that can autonomously migrate between network sites. This text introduces the concepts and principles of mobile agents, provides an overview of mobile agent technology, and focuses on applications in networking and distributed computing.

  12. A Software Rejuvenation Framework for Distributed Computing

    Science.gov (United States)

    Chau, Savio

    2009-01-01

    A performability-oriented conceptual framework for software rejuvenation has been constructed as a means of increasing levels of reliability and performance in distributed stateful computing. As used here, performability-oriented signifies that the construction of the framework is guided by the concept of analyzing the ability of a given computing system to deliver services with gracefully degradable performance. The framework is especially intended to support applications that involve stateful replicas of server computers.

  13. Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.

    Science.gov (United States)

    Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias

    2011-01-01

    The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E.coli, Shigella and S.pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.

  14. Quantum associative memory with improved distributed queries

    CERN Document Server

    Njafa, J -P Tchapet; Woafo, Paul

    2012-01-01

    The paper proposes an improved quantum associative algorithm with distributed query, based on the model proposed by Ezhov et al. We introduce two modifications of the query that optimize the retrieval of the correct patterns simultaneously, for any ratio of the number of recognition patterns to the total number of stored patterns. Simulation results are given.

  15. PRISMA database machine: A distributed, main-memory approach

    NARCIS (Netherlands)

    Schmidt, J.W.; Apers, Peter M.G.; Ceri, S.; Kersten, Martin L.; Oerlemans, Hans C.M.; Missikoff, M.

    1988-01-01

    The PRISMA project is a large-scale research effort in the design and implementation of a highly parallel machine for data and knowledge processing. The PRISMA database machine is a distributed, main-memory database management system implemented in an object-oriented language that runs on top of a m

  16. PRISMA database machine: A distributed, main-memory approach

    NARCIS (Netherlands)

    Apers, Peter M.G.; Kersten, Martin L.; Oerlemans, Hans C.M.; Schmidt, J.W.; Ceri, S.; Missikoff, M.

    1988-01-01

    The PRISMA project is a large-scale research effort in the design and implementation of a highly parallel machine for data and knowledge processing. The PRISMA database machine is a distributed, main-memory database management system implemented in an object-oriented language that runs on top of a m

  17. Object-oriented Tools for Distributed Computing

    Science.gov (United States)

    Adler, Richard M.

    1993-01-01

    Distributed computing systems are proliferating, owing to the availability of powerful, affordable microcomputers and inexpensive communication networks. A critical problem in developing such systems is getting application programs to interact with one another across a computer network. Remote interprogram connectivity is particularly challenging across heterogeneous environments, where applications run on different kinds of computers and operating systems. NetWorks! (trademark) is an innovative software product that provides an object-oriented messaging solution to these problems. This paper describes the design and functionality of NetWorks! and illustrates how it is being used to build complex distributed applications for NASA and in the commercial sector.

  18. Sparse Distributed Memory: understanding the speed and robustness of expert memory

    Directory of Open Access Journals (Sweden)

    Marcelo Salhab Brogliato

    2014-04-01

    Full Text Available How can experts, sometimes in exacting detail, almost immediately and very precisely recall memory items from a vast repertoire? The problem in which we will be interested concerns models of theoretical neuroscience that could explain the speed and robustness of an expert's recollection. The approach is based on Sparse Distributed Memory, which has been shown to be plausible, both in a neuroscientific and in a psychological manner, in a number of ways. A crucial characteristic concerns the limits of human recollection, the `tip-of-the-tongue' memory event, which is found at a non-linearity in the model. We expand the theoretical framework, deriving an optimization formula to solve this non-linearity. Numerical results demonstrate how a higher frequency of rehearsal, through work or study, immediately increases the robustness and speed associated with expert memory.
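    For readers unfamiliar with the model, a minimal Kanerva-style Sparse Distributed Memory can be sketched as follows; the dimensions and activation radius are small illustrative choices, not the parameters analysed in the paper.

      import numpy as np

      # Minimal Sparse Distributed Memory sketch (Kanerva-style): writes spread a
      # word over all hard locations within a Hamming radius of the address; reads
      # sum the activated counters and threshold the result. Dimensions and radius
      # are illustrative assumptions.
      rng = np.random.default_rng(1)
      N, M, RADIUS = 256, 2000, 112              # word length, hard locations, radius

      hard_addresses = rng.integers(0, 2, size=(M, N), dtype=np.int8)
      counters = np.zeros((M, N), dtype=np.int32)

      def _active(address):
          return np.count_nonzero(hard_addresses != address, axis=1) <= RADIUS

      def write(address, word):
          counters[_active(address)] += np.where(word == 1, 1, -1).astype(np.int32)

      def read(address):
          return (counters[_active(address)].sum(axis=0) >= 0).astype(np.int8)

      if __name__ == "__main__":
          item = rng.integers(0, 2, size=N, dtype=np.int8)
          write(item, item)                      # autoassociative storage
          noisy = item.copy()
          flip = rng.choice(N, size=20, replace=False)
          noisy[flip] ^= 1                       # corrupt 20 bits of the cue
          print("bits recovered:", np.count_nonzero(read(noisy) == item), "of", N)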

  19. Memory Allocation in Distributed Storage Networks

    CERN Document Server

    Sardari, Mohsen; Fekri, Faramarz; Soljanin, Emina

    2010-01-01

    We consider the problem of distributing a file in a network of storage nodes whose storage budget is limited but at least equals the size of the file. We first generate $T$ encoded symbols (from the file) which are then distributed among the nodes. We investigate the optimal allocation of $T$ encoded packets to the storage nodes such that the probability of reconstructing the file by using any $r$ out of $n$ nodes is maximized. Since the optimal allocation of encoded packets is difficult to find in general, we find another objective function which well approximates the original problem and yet is easier to optimize. We find the optimal symmetric allocation for all coding redundancy constraints using the equivalent approximate problem. We also investigate the optimal allocation in random graphs. Finally, we provide simulations to verify the theoretical results.

  20. Fault-Tolerant Mechanism of the Distributed Cluster Computers

    Institute of Scientific and Technical Information of China (English)

    SHANG Yizi; JIN Yang; WU Baosheng

    2007-01-01

    Distributed systems with high performance and stability are commonly adopted in large-scale scientific and engineering computing. In this paper, we discuss a fault-tolerant mechanism under Linux to improve the fault tolerance of the system, namely a scheme and framework for forming a stable computing platform. In terms of the structure and function of the distributed system, active-list and file-invocation strategies are employed in task management. Multilevel fault tolerance can be achieved by repeating processes on a single node and migrating tasks across nodes. The manager node agent introduced in this paper administrates the nodes using the list and assigns tasks according to the nodes' performance, and hence can make full use of the cluster resources. An evaluation method is proposed to appraise the performance. The results show the usefulness of the proposed scheme, apart from some additional memory overhead.

  1. The impact of distributed computing on education

    Science.gov (United States)

    Utku, S.; Lestingi, J.; Salama, M.

    1982-01-01

    In this paper, developments in digital computer technology since the early Fifties are reviewed briefly, and the parallelism which exists between these developments and developments in analysis and design procedures of structural engineering is identified. The recent trends in digital computer technology are examined in order to establish the fact that distributed processing is now an accepted philosophy for further developments. The impact of this on the analysis and design practices of structural engineering is assessed by first examining these practices from a data processing standpoint to identify the key operations and data bases, and then fitting them to the characteristics of distributed processing. The merits and drawbacks of the present philosophy in educating structural engineers are discussed and projections are made for the industry-academia relations in the distributed processing environment of structural analysis and design. An ongoing experiment of distributed computing in a university environment is described.

  2. Fast Computation of Wigner-Ville Distribution

    Institute of Scientific and Technical Information of China (English)

    曹家麟; 陈光化

    2003-01-01

    The paper proposes a new method for computing the Wigner-Ville distribution (WVD) that takes account of the conjugate symmetry of the WVD kernel function and the periodicity and symmetry of the trigonometric functions. The method transfers the computation of the WVD from the complex field to the real field to remove the redundancies in the fast Fourier transform (FFT) computation. The realization of fast cosine and fast sine operations considerably reduces the computational cost. Theoretical analysis shows that the algorithm provided in the paper is much more advantageous than existing ones.
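    For reference, a basic discrete Wigner-Ville distribution can be computed as an FFT over the instantaneous autocorrelation, as in the sketch below; this is only the textbook starting point, not the paper's optimized real-field algorithm.

      import numpy as np

      # Basic discrete Wigner-Ville distribution via an FFT of the instantaneous
      # autocorrelation, shown only to fix ideas; the paper's faster real-field
      # reformulation is not reproduced here.
      def wvd(x):
          x = np.asarray(x, dtype=complex)
          n_samples = x.size
          W = np.zeros((n_samples, n_samples))
          for n in range(n_samples):
              max_lag = min(n, n_samples - 1 - n)
              kernel = np.zeros(n_samples, dtype=complex)
              for m in range(-max_lag, max_lag + 1):
                  kernel[m % n_samples] = x[n + m] * np.conj(x[n - m])
              W[n] = np.fft.fft(kernel).real   # bin k maps to frequency k*fs/(2*N)
          return W

      if __name__ == "__main__":
          t = np.arange(256) / 256.0
          chirp = np.exp(2j * np.pi * (20 * t + 40 * t ** 2))  # analytic test signal
          tfr = wvd(chirp)
          print(tfr.shape, tfr.max())          # 256x256 time-frequency map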

  3. Parallel calculations on shared memory, NUMA-based computers using MATLAB

    Science.gov (United States)

    Krotkiewski, Marcin; Dabrowski, Marcin

    2014-05-01

    Achieving satisfactory computational performance in numerical simulations on modern computer architectures can be a complex task. Multi-core design makes it necessary to parallelize the code. Efficient parallelization on NUMA (Non-Uniform Memory Access) shared memory architectures necessitates explicit placement of the data in the memory close to the CPU that uses it. In addition, using more than 8 CPUs (~100 cores) requires a cluster solution of interconnected nodes, which involves (expensive) communication between the processors. It takes significant effort to overcome these challenges even when programming in low-level languages, which give the programmer full control over data placement and work distribution. Instead, many modelers use high-level tools such as MATLAB, which severely limit the optimization/tuning options available. Nonetheless, the advantage of programming simplicity and a large available code base can tip the scale in favor of MATLAB. We investigate whether MATLAB can be used for efficient, parallel computations on modern shared memory architectures. A common approach to performance optimization of MATLAB programs is to identify a bottleneck and migrate the corresponding code block to a MEX file implemented in, e.g. C. Instead, we aim at achieving a scalable parallel performance of MATLAB's core functionality. Some of MATLAB's internal functions (e.g., bsxfun, sort, BLAS3, operations on vectors) are multi-threaded. Achieving high parallel efficiency of those may potentially improve the performance of a significant portion of MATLAB's code base. Since we do not have MATLAB's source code, our performance tuning relies on the tools provided by the operating system alone. Most importantly, we use custom memory allocation routines, thread-to-CPU binding, and memory page migration. The performance tests are carried out on multi-socket shared memory systems (2- and 4-way Intel-based computers), as well as a Distributed Shared Memory machine with 96 CPU

  4. A Screen Space GPGPU Surface LIC Algorithm for Distributed Memory Data Parallel Sort Last Rendering Infrastructures

    Energy Technology Data Exchange (ETDEWEB)

    Loring, Burlen; Karimabadi, Homa; Rortershteyn, Vadim

    2014-07-01

    The surface line integral convolution (LIC) visualization technique produces dense visualization of vector fields on arbitrary surfaces. We present a screen space surface LIC algorithm for use in distributed memory data parallel sort last rendering infrastructures. The motivations for our work are to support analysis of datasets that are too large to fit in the main memory of a single computer and to maintain compatibility with prevalent parallel scientific visualization tools such as ParaView and VisIt. By working in screen space using OpenGL we can leverage the computational power of GPUs when they are available and run without them when they are not. We address efficiency and performance issues that arise from the transformation of data from physical to screen space by selecting an alternate screen space domain decomposition. We analyze the algorithm's scaling behavior with and without GPUs on two high performance computing systems using data from turbulent plasma simulations.

  5. Intelligent Distributed Computing V : Proceedings of the 5th International Symposium on Intelligent Distributed Computing

    CERN Document Server

    Nieuwenhuis, Kees; Pavlin, Gregor; Warnier, Martijn; Badica, Costin

    2012-01-01

    This book represents the combined peer-reviewed proceedings of the Fifth International Symposium on Intelligent Distributed Computing -- IDC 2011 and of the Third International Workshop on Multi-Agent Systems Technology and Semantics -- MASTS 2011. Both events were held in Delft, The Netherlands during October 5-7, 2011. The 33 contributions published in this book address many topics related to theory and applications of intelligent distributed computing and multi-agent systems, including: adaptive and autonomous distributed systems, agent programming, ambient assisted living systems, business process modeling and verification, cloud computing, coalition formation, decision support systems, distributed optimization and constraint satisfaction, gesture recognition, intelligent energy management in WSNs, intelligent logistics, machine learning, mobile agents, parallel and distributed computational intelligence, parallel evolutionary computing, trust metrics and security, scheduling in distributed heterogenous c...

  6. Injecting Artificial Memory Errors Into a Running Computer Program

    Science.gov (United States)

    Bornstein, Benjamin J.; Granat, Robert A.; Wagstaff, Kiri L.

    2008-01-01

    Single-event upsets (SEUs) or bitflips are computer memory errors caused by radiation. BITFLIPS (Basic Instrumentation Tool for Fault Localized Injection of Probabilistic SEUs) is a computer program that deliberately injects SEUs into another computer program, while the latter is running, for the purpose of evaluating the fault tolerance of that program. BITFLIPS was written as a plug-in extension of the open-source Valgrind debugging and profiling software. BITFLIPS can inject SEUs into any program that can be run on the Linux operating system, without needing to modify the program's source code. Further, if access to the original program source code is available, BITFLIPS offers fine-grained control over exactly when and which areas of memory (as specified via program variables) will be subjected to SEUs. The rate of injection of SEUs is controlled by specifying either a fault probability or a fault rate based on memory size and radiation exposure time, in units of SEUs per byte per second. BITFLIPS can also log each SEU that it injects and, if program source code is available, report the magnitude of the effect of the SEU on a floating-point value or other program variable.
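    A toy analogue of the injection step is sketched below; BITFLIPS itself operates on a running process inside Valgrind, whereas this sketch merely flips one randomly chosen bit in a NumPy array's storage to show the effect on a floating-point variable.

      import numpy as np

      # Toy analogue of SEU injection: flip one randomly chosen bit in a program
      # variable's underlying bytes. This is not the BITFLIPS/Valgrind mechanism,
      # only an illustration of the resulting corruption.
      def inject_seu(array, rng):
          raw = array.view(np.uint8)               # reinterpret storage as bytes
          byte = rng.integers(raw.size)
          bit = rng.integers(8)
          raw[byte] ^= np.uint8(1 << bit)          # single-event upset

      if __name__ == "__main__":
          rng = np.random.default_rng(7)
          data = np.ones(4, dtype=np.float64)
          inject_seu(data, rng)
          print(data)                               # one element is now corrupted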

  7. Pattern recognition and massively distributed computing.

    Science.gov (United States)

    Davies, E Keith; Glick, Meir; Harrison, Karl N; Richards, W Graham

    2002-12-01

    A feature of Peter Kollman's research was his exploitation of the latest computational techniques to devise novel applications of the free energy perturbation method. He would certainly have seized upon the opportunities offered by massively distributed computing. Here we describe the use of over a million personal computers to perform virtual screening of 3.5 billion druglike molecules against protein targets by pharmacophore pattern matching, together with other applications of pattern recognition such as docking ligands without any a priori knowledge about the binding site location.

  8. ATLAS distributed computing: experience and evolution

    Science.gov (United States)

    Nairz, A.; Atlas Collaboration

    2014-06-01

    The ATLAS experiment has just concluded its first running period which commenced in 2010. After two years of remarkable performance from the LHC and ATLAS, the experiment has accumulated more than 25 fb-1 of data. The total volume of beam and simulated data products exceeds 100 PB distributed across more than 150 computing centres around the world, managed by the experiment's distributed data management system. These sites have provided up to 150,000 computing cores to ATLAS's global production and analysis processing system, enabling a rich physics programme including the discovery of the Higgs-like boson in 2012. The wealth of accumulated experience in global data-intensive computing at this massive scale, and the considerably more challenging requirements of LHC computing from 2015 when the LHC resumes operation, are driving a comprehensive design and development cycle to prepare a revised computing model together with data processing and management systems able to meet the demands of higher trigger rates, energies and event complexities. An essential requirement will be the efficient utilisation of current and future processor technologies as well as a broad range of computing platforms, including supercomputing and cloud resources. We will report on experience gained thus far and our progress in preparing ATLAS computing for the future.

  9. Intelligent Distributed Computing VI : Proceedings of the 6th International Symposium on Intelligent Distributed Computing

    CERN Document Server

    Badica, Costin; Malgeri, Michele; Unland, Rainer

    2013-01-01

    This book represents the combined peer-reviewed proceedings of the Sixth International Symposium on Intelligent Distributed Computing -- IDC~2012, of the International Workshop on Agents for Cloud -- A4C~2012 and of the Fourth International Workshop on Multi-Agent Systems Technology and Semantics -- MASTS~2012. All the events were held in Calabria, Italy during September 24-26, 2012. The 37 contributions published in this book address many topics related to theory and applications of intelligent distributed computing and multi-agent systems, including: adaptive and autonomous distributed systems, agent programming, ambient assisted living systems, business process modeling and verification, cloud computing, coalition formation, decision support systems, distributed optimization and constraint satisfaction, gesture recognition, intelligent energy management in WSNs, intelligent logistics, machine learning, mobile agents, parallel and distributed computational intelligence, parallel evolutionary computing, trus...

  10. Distributed Shared Memory for the Cell Broadband Engine (DSMCBE)

    DEFF Research Database (Denmark)

    Larsen, Morten Nørgaard; Skovhede, Kenneth; Vinter, Brian

    2009-01-01

    The CELL-BE processor provides high performance and has been shown to reach a performance close to the theoretical peak; however, the high performance comes at the price of a quite complex programming model. Central to the complexity of the CELL-BE programming model is the need to move data in and out of non-coherent local storage blocks for each special processor element. In this paper we present a software library, namely the Distributed Shared Memory for the Cell Broadband Engine (DSMCBE). By using techniques known from distributed shared memory, DSMCBE allows programmers to program the CELL-BE with relative ease and in addition scale their applications to use multiple CELL-BE processors in a network. Performance experiments show that quite high performance can be obtained with DSMCBE even in a cluster environment.

  11. A portable implementation of ARPACK for distributed memory parallel architectures

    Energy Technology Data Exchange (ETDEWEB)

    Maschhoff, K.J.; Sorensen, D.C.

    1996-12-31

    ARPACK is a package of Fortran 77 subroutines which implement the Implicitly Restarted Arnoldi Method used for solving large sparse eigenvalue problems. A parallel implementation of ARPACK is presented which is portable across a wide range of distributed memory platforms and requires minimal changes to the serial code. The communication layers used for message passing are the Basic Linear Algebra Communication Subprograms (BLACS) developed for the ScaLAPACK project and the Message Passing Interface (MPI).
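    To show what ARPACK computes, the snippet below calls the ordinary serial ARPACK through SciPy's eigsh interface on a sparse one-dimensional Laplacian (the matrix and the choice of the four smallest eigenvalues are illustrative); the parallel implementation described above distributes the Arnoldi basis vectors across processes instead.

      import numpy as np
      from scipy.sparse import diags
      from scipy.sparse.linalg import eigsh   # SciPy's interface to serial ARPACK

      # Illustration of what ARPACK computes: a few eigenpairs of a large sparse
      # symmetric matrix with the implicitly restarted Arnoldi/Lanczos method.
      n = 10_000
      laplacian = diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n)).tocsc()

      # Shift-invert around zero to find the four smallest eigenvalues quickly.
      vals, vecs = eigsh(laplacian, k=4, sigma=0)
      print(vals)                  # close to (j*pi/(n+1))**2 for j = 1, 2, 3, 4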

  12. Distributed Computing: Options in the Eighties.

    Science.gov (United States)

    Klingenstein, Kenneth; Devine, Gary D.

    1985-01-01

    University administrative data processing is moving toward a more distributed environment. An architecture must be established that incorporates central sites, campus centers, and end users in a networked pool of computer systems, with applications located at appropriate nodes in the network. (Author/MLW)

  13. Masterless Distributed Computing Over Mobile Devices

    Science.gov (United States)

    2012-09-01

    ...which are commonly used to analyze large amounts of information?” We answer this question by modifying Google’s well-known MapReduce system to have...

    MapReduce is a programming framework that distributes computations across a heterogeneous mixture of machines in a manner that hides

    NARCIS (Netherlands)

    Cushing, R.; Herawan, G.; Putra, H.; Belloum, A.; Bubak, M.; de Laat, C.

    2013-01-01

    In this article, the authors propose a new approach to distributed computing with Web browsers and introduce the WeevilScout prototype framework. The proliferation of Web browsers and the performance gains being achieved by current JavaScript virtual machines raises the question whether Internet bro

  15. Computer Systems for Distributed and Distance Learning.

    Science.gov (United States)

    Anderson, M.; Jackson, David

    2000-01-01

    Discussion of network-based learning focuses on a survey of computer systems for distributed and distance learning. Both Web-based systems and non-Web-based systems are reviewed in order to highlight some of the major trends of past projects and to suggest ways in which progress may be made in the future. (Contains 92 references.) (Author/LRW)

  16. Data-centric computing on distributed resources

    NARCIS (Netherlands)

    Cushing, R.S.

    2015-01-01

    Distributed computing has always been a challenge due to the NP-completeness of finding optimal underlying management routines. The advent of big data increases the dimensionality of the problem whereby data partitionability, processing complexity and locality play a crucial role in the effectivenes

  17. A Computational Fluid Dynamics (CFD) Analysis of an Undulatory Mechanical Fin Driven by Shape Memory Alloy

    Institute of Scientific and Technical Information of China (English)

    Yong-Hua Zhang; Jian-Hui He; Jie Yang; Shi-Wu Zhang; Kin Huat Low

    2006-01-01

    Many fishes use undulatory fins to propel themselves in the underwater environment. These locomotor mechanisms are of great interest to many researchers. In the present study, we perform a three-dimensional unsteady computation of an undulatory mechanical fin that is driven by Shape Memory Alloy (SMA). The objective of the computation is to investigate the fluid dynamics of force production associated with the undulatory mechanical fin. An unstructured, grid-based, unsteady Navier-Stokes solver with automatic adaptive remeshing is used to compute the unsteady flow around the fin through five complete cycles. The pressure distribution on the fin surface is computed and integrated to provide the fin forces, which are decomposed into lift and thrust. The velocity field is also computed throughout the swimming cycle. Finally, a comparison is conducted to reveal the dynamics of force generation according to the kinematic parameters of the undulatory fin (amplitude, frequency and wavelength).

  18. ATLAS Distributed Computing in LHC Run2

    Science.gov (United States)

    Campana, Simone

    2015-12-01

    The ATLAS Distributed Computing infrastructure has evolved after the first period of LHC data taking in order to cope with the challenges of the upcoming LHC Run-2. An increase in both the data rate and the computing demands of the Monte-Carlo simulation, as well as new approaches to ATLAS analysis, dictated a more dynamic workload management system (Prodsys-2) and data management system (Rucio), overcoming the boundaries imposed by the design of the old computing model. In particular, the commissioning of new central computing system components was the core part of the migration toward a flexible computing model. A flexible computing utilization exploring the use of opportunistic resources such as HPC, cloud, and volunteer computing is embedded in the new computing model; the data access mechanisms have been enhanced with the remote access, and the network topology and performance is deeply integrated into the core of the system. Moreover, a new data management strategy, based on a defined lifetime for each dataset, has been defined to better manage the lifecycle of the data. In this note, an overview of an operational experience of the new system and its evolution is presented.

  19. Distributed Computing and its Scope in Defence Applications

    Directory of Open Access Journals (Sweden)

    B.V. George

    2005-10-01

    Full Text Available Distributed computing is one of the paradigms in the world of information technology. Middleware is the essential tool for implementing distributed computing and for overcoming the heterogeneity of platform and language. DRDO’s intranet, DRONA, has the potential of hosting distributed applications across the network. This paper deals with the essentials of distributed computing, the architecture of the DRONA network, and the scope of distributed computing in Defence applications. It also suggests a few possible applications of distributed computing.

  20. Computing the Exit Complexity of Knowledge in Distributed Quantum Computers

    Directory of Open Access Journals (Sweden)

    M.A.Abbas

    2013-01-01

    Full Text Available Distributed Quantum computers abide from the exit complexity of the knowledge. The exit complexity is the accrue of the nodal information needed to clarify the total egress system with deference to a distinguished exit node. The core objective of this paper is to compile an arrogant methodology for assessing the exit complexity of the knowledge in distributed quantum computers. The proposed methodology is based on contouring the knowledge using the unlabeled binary trees, hence building an benchmarked and a computer based model. The proposed methodology dramatizes knowledge autocratically calculates the exit complexity. The methodology consists of several amphitheaters, starting with detecting the baron aspect of the tree of others entitled express knowledge and then measure the volume of information and the complexity of behavior destining from the bargain of information. Then calculate egress resulting from episodes that do not lead to the withdrawal of the information. In the end is calculated total egress complexity and then appraised total exit complexity of the system. Given the complexity of the operations within the Distributed Computing Quantity, this research addresses effective transactions that could affect the three-dimensional behavior of knowledge. The results materialized that the best affair where total exit complexity as minimal as possible is a picture of a binary tree is entitled at the rate of positive and negative cardinal points medium value. It could be argued that these cardinal points should not amass the upper bound apex or minimum.

  1. Distributed Computing Framework for Synthetic Radar Application

    Science.gov (United States)

    Gurrola, Eric M.; Rosen, Paul A.; Aivazis, Michael

    2006-01-01

    We are developing an extensible software framework in response to Air Force and NASA needs for distributed computing facilities for a variety of radar applications. The objective of this work is to develop a Python-based software framework, that is, the framework elements of the middleware that allow developers to control processing flow on a grid in a distributed computing environment. Framework architectures to date allow developers to connect processing functions together as interchangeable objects, thereby allowing a data flow graph to be devised for a specific problem to be solved. The Pyre framework, developed at the California Institute of Technology (Caltech), and now being used as the basis for next-generation radar processing at JPL, is a Python-based software framework. We have extended the Pyre framework to include new facilities to deploy processing components as services, including components that monitor and assess the state of the distributed network for eventual real-time control of grid resources.

  2. Research computing in a distributed cloud environment

    Energy Technology Data Exchange (ETDEWEB)

    Fransham, K; Agarwal, A; Armstrong, P; Bishop, A; Charbonneau, A; Desmarais, R; Hill, N; Gable, I; Gaudet, S; Goliath, S; Impey, R; Leavett-Brown, C; Ouellete, J; Paterson, M; Pritchet, C; Penfold-Brown, D; Podaima, W; Schade, D; Sobie, R J, E-mail: fransham@uvic.ca

    2010-11-01

    The recent increase in availability of Infrastructure-as-a-Service (IaaS) computing clouds provides a new way for researchers to run complex scientific applications. However, using cloud resources for a large number of research jobs requires significant effort and expertise. Furthermore, running jobs on many different clouds presents even more difficulty. In order to make it easy for researchers to deploy scientific applications across many cloud resources, we have developed a virtual machine resource manager (Cloud Scheduler) for distributed compute clouds. In response to a user's job submission to a batch system, the Cloud Scheduler manages the distribution and deployment of user-customized virtual machines across multiple clouds. We describe the motivation for and implementation of a distributed cloud using the Cloud Scheduler that is spread across both commercial and dedicated private sites, and present some early results of scientific data analysis using the system.

  3. Research computing in a distributed cloud environment

    Science.gov (United States)

    Fransham, K.; Agarwal, A.; Armstrong, P.; Bishop, A.; Charbonneau, A.; Desmarais, R.; Hill, N.; Gable, I.; Gaudet, S.; Goliath, S.; Impey, R.; Leavett-Brown, C.; Ouellete, J.; Paterson, M.; Pritchet, C.; Penfold-Brown, D.; Podaima, W.; Schade, D.; Sobie, R. J.

    2010-11-01

    The recent increase in availability of Infrastructure-as-a-Service (IaaS) computing clouds provides a new way for researchers to run complex scientific applications. However, using cloud resources for a large number of research jobs requires significant effort and expertise. Furthermore, running jobs on many different clouds presents even more difficulty. In order to make it easy for researchers to deploy scientific applications across many cloud resources, we have developed a virtual machine resource manager (Cloud Scheduler) for distributed compute clouds. In response to a user's job submission to a batch system, the Cloud Scheduler manages the distribution and deployment of user-customized virtual machines across multiple clouds. We describe the motivation for and implementation of a distributed cloud using the Cloud Scheduler that is spread across both commercial and dedicated private sites, and present some early results of scientific data analysis using the system.

  4. Contention Modeling for Multithreaded Distributed Shared Memory Machines: The Cray XMT

    Energy Technology Data Exchange (ETDEWEB)

    Secchi, Simone; Tumeo, Antonino; Villa, Oreste

    2011-07-27

    Distributed Shared Memory (DSM) machines are a wide class of multi-processor computing systems where a large virtually-shared address space is mapped on a network of physically distributed memories. High memory latency and network contention are two of the main factors that limit performance scaling of such architectures. Modern high-performance computing DSM systems have evolved toward exploitation of massive hardware multi-threading and fine-grained memory hashing to tolerate irregular latencies, avoid network hot-spots and enable high scaling. In order to model the performance of such large-scale machines, parallel simulation has been proved to be a promising approach to achieve good accuracy in reasonable times. One of the most critical factors in solving the simulation speed-accuracy trade-off is network modeling. The Cray XMT is a massively multi-threaded supercomputing architecture that belongs to the DSM class, since it implements a globally-shared address space abstraction on top of a physically distributed memory substrate. In this paper, we discuss the development of a contention-aware network model intended to be integrated in a full-system XMT simulator. We start by measuring the effects of network contention in a 128-processor XMT machine and then investigate the trade-off that exists between simulation accuracy and speed, by comparing three network models which operate at different levels of accuracy. The comparison and model validation is performed by executing a string-matching algorithm on the full-system simulator and on the XMT, using three datasets that generate noticeably different contention patterns.

  5. An object oriented design for high performance linear algebra on distributed memory architectures

    Energy Technology Data Exchange (ETDEWEB)

    Dongarra, J.J. [Oak Ridge National Lab., TN (United States)]|[Univ. of Tennessee, Knoxville, TN (United States). Dept. of Computer Science; Walker, D.W. [Oak Ridge National Lab., TN (United States); Pozo, R. [Univ. of Tennessee, Knoxville, TN (United States). Dept. of Computer Science

    1993-12-31

    We describe the design of ScaLAPACK++, an object oriented C++ library for implementing linear algebra computations on distributed memory multicomputers. This package, when complete, will support distributed dense, banded, and sparse matrix operations for symmetric, positive-definite, and non-symmetric cases. In ScaLAPACK++ we have employed object oriented design methods to enhance scalability, portability, flexibility, and ease-of-use. We illustrate some of these points by describing the implementation of a right-looking LU factorization for dense systems in ScaLAPACK++.

  6. LDRD final report : managing shared memory data distribution in hybrid HPC applications.

    Energy Technology Data Exchange (ETDEWEB)

    Merritt, Alexander M. (Georgia Institute of Technology, Atlanta, GA); Pedretti, Kevin Thomas Tauke

    2010-09-01

    MPI is the dominant programming model for distributed memory parallel computers, and is often used as the intra-node programming model on multi-core compute nodes. However, application developers are increasingly turning to hybrid models that use threading within a node and MPI between nodes. In contrast to MPI, most current threaded models do not require application developers to deal explicitly with data locality. With increasing core counts and deeper NUMA hierarchies seen in the upcoming LANL/SNL 'Cielo' capability supercomputer, data distribution imposes an upper bound on intra-node scalability within threaded applications. Data locality therefore has to be identified at runtime using static memory allocation policies such as first-touch or next-touch, or specified by the application user at launch time. We evaluate several existing techniques for managing data distribution using micro-benchmarks on an AMD 'Magny-Cours' system with 24 cores among 4 NUMA domains and argue for the adoption of a dynamic runtime system implemented at the kernel level, employing a novel page table replication scheme to gather per-NUMA domain memory access traces.

  7. Translation Memory and Computer Assisted Translation Tool for Medieval Texts

    Directory of Open Access Journals (Sweden)

    Törcsvári Attila

    2013-05-01

    Full Text Available Translation memories (TMs), as part of Computer Assisted Translation (CAT) tools, support translators in reusing portions of formerly translated text. Fencing books are good candidates for using TMs due to the high number of repeated terms. Medieval texts suffer from a number of drawbacks that make even a “simple” rewording into the modern version of the same language hard. The analyzed difficulties are: lack of systematic spelling, unusual word orders and typos in the original. A hypothesis is made and verified that even simple modernization increases legibility, that it is feasible, and that it is worthwhile to apply translation memories due to the numerous and even extremely long repeated terms. Therefore, methods and algorithms are presented 1. for automated transcription of medieval texts (when a limited training set is available), and 2. for the collection of repeated patterns. The efficiency of the algorithms is analyzed for recall and precision.

  8. Cloud Computing as Evolution of Distributed Computing – A Case Study for SlapOS Distributed Cloud Computing Platform

    Directory of Open Access Journals (Sweden)

    George SUCIU

    2013-01-01

    Full Text Available The cloud computing paradigm has been defined from several points of view, the main two directions being either as an evolution of the grid and distributed computing paradigm, or, on the contrary, as a disruptive revolution in the classical paradigms of operating systems, network layers and web applications. This paper presents a distributed cloud computing platform called SlapOS, which unifies technologies and communication protocols into a new technology model for offering any application as a service. Both cloud and distributed computing can be efficient methods for optimizing resources that are aggregated from a grid of standard PCs hosted in homes, offices and small data centers. The paper fills a gap in the existing distributed computing literature by providing a distributed cloud computing model which can be applied for deploying various applications.

  9. Wireless distributed computing for cyclostationary feature detection

    Directory of Open Access Journals (Sweden)

    Mohammed I.M. Alfaqawi

    2016-02-01

    Full Text Available Recently, the wireless distributed computing (WDC) concept has emerged, promising manifold improvements to current wireless technologies. Despite the various expected benefits of this concept, significant drawbacks have been addressed in the open literature. One of WDC's key challenges is the impact of wireless channel quality on the load of distributed computations. Therefore, this research investigates the wireless channel impact on WDC performance when the latter is applied to spectrum sensing in cognitive radio (CR) technology. There is, however, a trade-off between accuracy and computational complexity in spectrum sensing approaches: increasing the accuracy of these approaches is accompanied by an increase in computational complexity, which results in greater power consumption and processing time. A novel WDC scheme for the cyclostationary feature detection spectrum sensing approach is proposed in this paper and thoroughly investigated. The benefits of the proposed scheme are firstly presented. Then, the impact of the wireless channel on the proposed scheme is addressed considering two scenarios. In the first scenario, workload matrices are distributed over the wireless channel and a fusion center combines these matrices in order to make a decision. Meanwhile, in the second scenario, local decisions are made by the CRs and only a binary flag is sent to the fusion center.

  10. Simulation of Human Episodic Memory by Using a Computational Model of the Hippocampus

    Directory of Open Access Journals (Sweden)

    Naoyuki Sato

    2010-01-01

    Full Text Available Episodic memory, the memory of personal events and history, is essential for understanding the mechanism of human intelligence. Neuroscience evidence has shown that the hippocampus, a part of the limbic system, plays an important role in the encoding and retrieval of episodic memory. This paper reviews computational models of the hippocampus and introduces our own computational model of human episodic memory based on neural synchronization. Results from computer simulations demonstrate that our model provides an advantage for instantaneous memory formation and selective retrieval enabling memory search. Moreover, this model was found to have the ability to predict human memory recall by integrating human eye movement data during encoding. The combined approach of computational models and experiments is effective for theorizing about human episodic memory.

  11. Open Source Live Distributions for Computer Forensics

    Science.gov (United States)

    Giustini, Giancarlo; Andreolini, Mauro; Colajanni, Michele

    Current distributions of open source forensic software provide digital investigators with a large set of heterogeneous tools. Their use is not always focused on the target and requires high technical expertise. We present a new GNU/Linux live distribution, named CAINE (Computer Aided INvestigative Environment) that contains a collection of tools wrapped up into a user friendly environment. The CAINE forensic framework introduces novel important features, aimed at filling the interoperability gap across different forensic tools. Moreover, it provides a homogeneous graphical interface that drives digital investigators during the acquisition and analysis of electronic evidence, and it offers a semi-automatic mechanism for the creation of the final report.

  12. Robust dynamical decoupling for quantum computing and quantum memory.

    Science.gov (United States)

    Souza, Alexandre M; Alvarez, Gonzalo A; Suter, Dieter

    2011-06-17

    Dynamical decoupling (DD) is a popular technique for protecting qubits from the environment. However, unless special care is taken, experimental errors in the control pulses used in this technique can destroy the quantum information instead of preserving it. Here, we investigate techniques for making DD sequences robust against different types of experimental errors while retaining good decoupling efficiency in a fluctuating environment. We present experimental data from solid-state nuclear spin qubits and introduce a new DD sequence that is suitable for quantum computing and quantum memory.

  13. A Case Study on Neural Inspired Dynamic Memory Management Strategies for High Performance Computing.

    Energy Technology Data Exchange (ETDEWEB)

    Vineyard, Craig Michael; Verzi, Stephen Joseph

    2017-09-01

    As high performance computing architectures pursue more computational power there is a need for increased memory capacity and bandwidth as well. A multi-level memory (MLM) architecture addresses this need by combining multiple memory types with different characteristics as varying levels of the same architecture. How to efficiently utilize this memory infrastructure is an unknown challenge, and in this research we sought to investigate whether neural inspired approaches can meaningfully help with memory management. In particular we explored neurogenesis inspired resource allocation, and were able to show a neural inspired mixed controller policy can beneficially impact how MLM architectures utilize memory.

  14. BES-III distributed computing status

    Science.gov (United States)

    Belov, S. D.; Deng, Z. Y.; Korenkov, V. V.; Li, W. D.; Lin, T.; Ma, Z. T.; Nicholson, C.; Pelevanyuk, I. S.; Suo, B.; Trofimov, V. V.; Tsaregorodtsev, A. U.; Uzhinskiy, A. V.; Yan, T.; Yan, X. F.; Zhang, X. M.; Zhemchugov, A. S.

    2016-09-01

    The BES-III experiment at the Institute of High Energy Physics (Beijing, China) is aimed at precision measurements in e+e- annihilation in the energy range from 2.0 to 4.6 GeV. The world's largest samples of J/psi and psi' events and unique samples of XYZ data have already been collected. The expected increase of the data volume in the coming years required a significant evolution of the computing model, namely a shift from centralized data processing to a distributed one. This report summarizes the current design of the BES-III distributed computing system, some of the key decisions, and the experience gained during two years of operations.

  15. Distributed Computation in the Digitized Battlefield

    Science.gov (United States)

    1999-01-01

    the digitized battlefield differs from the general model of distributed computation. • Transaction oriented state machine model of a node is not always...The transaction-oriented state machine model of a node is not always applicable in battlefield situations, and the number of participating nodes in...other. This means that the processing order of these events in the state machine model does not have any effect on the global parameters. These kinds of

  16. A new metric enabling an exact hypergraph model for the communication volume in distributed-memory parallel applications

    NARCIS (Netherlands)

    Fortmeier, O.; Bücker, H.M.; Fagginger Auer, B.O.; Bisseling, R.H.

    2013-01-01

    A hypergraph model for mapping applications with an all-neighbor communication pattern to distributed-memory computers is proposed, which originated in finite element triangulations. Rather than approximating the communication volume for linear algebra operations, this new model represents the communication volume exactly.

  17. Enhanced computation method of topological smoothing on shared memory parallel machines

    Directory of Open Access Journals (Sweden)

    Mahmoudi Ramzi

    2011-01-01

    Full Text Available To prepare images for better segmentation, we need preprocessing applications, such as smoothing, to reduce noise. In this paper, we present an enhanced computation method for smoothing 2D objects in the binary case. Unlike existing approaches, the proposed method provides parallel computation and better memory management, while preserving the topology (number of connected components) of the original image by using homotopic transformations defined in the framework of digital topology. We introduce an adapted parallelization strategy called split, distribute and merge (SDM), which allows efficient parallelization of a large class of topological operators. To achieve a good speedup and better memory allocation, we paid particular attention to task scheduling and management. The work distributed during the smoothing process is carried out by a variable number of threads. Tests on a 2D grayscale image (512*512), using a shared-memory parallel machine (SMPM) with 8 CPU cores (2× Xeon E5405 running at a frequency of 2 GHz), showed a speedup of 5.2 with a cache success rate of 70%.
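
    As a rough illustration of the split, distribute and merge idea, the sketch below cuts an image into horizontal strips with a one-pixel halo, smooths each strip in a thread pool, and merges the results. The plain mean filter stands in for the paper's topology-preserving homotopic operator, and all names and parameters are illustrative assumptions.

```python
# Split-distribute-merge sketch for image smoothing on a shared-memory machine.
# Illustrative only: the 3x3 mean filter is a stand-in for a homotopic operator.
import numpy as np
from concurrent.futures import ThreadPoolExecutor
from scipy.ndimage import uniform_filter

def smooth_strip(args):
    image, y0, y1 = args
    pad = 1                                    # halo rows for the 3x3 neighbourhood
    lo, hi = max(0, y0 - pad), min(image.shape[0], y1 + pad)
    smoothed = uniform_filter(image[lo:hi].astype(float), size=3)
    return y0, smoothed[y0 - lo: y0 - lo + (y1 - y0)]   # drop the halo rows again

def sdm_smooth(image, n_threads=8):
    bounds = np.linspace(0, image.shape[0], n_threads + 1, dtype=int)   # split
    jobs = [(image, bounds[i], bounds[i + 1]) for i in range(n_threads)]
    out = np.empty(image.shape, dtype=float)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:             # distribute
        for y0, strip in pool.map(smooth_strip, jobs):                  # merge
            out[y0:y0 + strip.shape[0]] = strip
    return out

image = (np.random.rand(512, 512) > 0.5).astype(np.uint8)   # binary test image
print(sdm_smooth(image).shape)
```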

  18. Computational performance of a smoothed particle hydrodynamics simulation for shared-memory parallel computing

    Science.gov (United States)

    Nishiura, Daisuke; Furuichi, Mikito; Sakaguchi, Hide

    2015-09-01

    The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code.

  19. Data analytics in the ATLAS Distributed Computing

    CERN Document Server

    Vukotic, Ilija; The ATLAS collaboration; Bryant, Lincoln

    2015-01-01

    The ATLAS Data analytics effort is focused on creating systems which provide the ATLAS ADC with new capabilities for understanding distributed systems and overall operational performance. These capabilities include: warehousing information from multiple systems (the production and distributed analysis system - PanDA, the distributed data management system - Rucio, the file transfer system, various monitoring services, etc.); providing a platform to execute arbitrary data mining and machine learning algorithms over aggregated data; satisfying a variety of use cases for different user roles; and hosting new third-party analytics services on a scalable compute platform. We describe the implemented system where: data sources are existing RDBMS (Oracle) and Flume collectors; a Hadoop cluster is used to store the data; native Hadoop and Apache Pig scripts are used for data aggregation; and R for in-depth analytics. Part of the data is indexed in ElasticSearch so both simpler investigations and complex dashboards can be made ...

  20. Optimized Parallel Execution of Declarative Programs on Distributed Memory Multiprocessors

    Institute of Scientific and Technical Information of China (English)

    沈美明; 田新民; 等

    1993-01-01

    In this paper, we focus on the compiling implementation of the parallel logic language PARLOG and the functional language ML on distributed memory multiprocessors. Under the graph rewriting framework, a Heterogeneous Parallel Graph Rewriting Execution Model (HPGREM) is presented first. Then, based on HPGREM, a parallel abstract machine PAM/TGR is described. Furthermore, several optimizing compilation schemes for executing declarative programs on a transputer array are proposed. The performance statistics on the transputer array demonstrate the effectiveness of our model, parallel abstract machine, optimizing compilation strategies and compiler.

  1. Reservoir Computing Beyond Memory-Nonlinearity Trade-off.

    Science.gov (United States)

    Inubushi, Masanobu; Yoshimura, Kazuyuki

    2017-08-31

    Reservoir computing is a brain-inspired machine learning framework that employs a signal-driven dynamical system, in particular harnessing common-signal-induced synchronization, a widely observed nonlinear phenomenon. A basic understanding of the working principle of reservoir computing can be expected to shed light on how information is stored and processed in nonlinear dynamical systems, potentially leading to progress in a broad range of nonlinear sciences. As a first step toward this goal, from the viewpoint of nonlinear physics and information theory, we study the memory-nonlinearity trade-off uncovered by Dambre et al. (2012). Focusing on a variational equation, we clarify a dynamical mechanism behind the trade-off, which illustrates why nonlinear dynamics degrades the memory stored in a dynamical system in general. Moreover, based on the trade-off, we propose a mixture reservoir endowed with both linear and nonlinear dynamics and show that it improves the performance of information processing. Interestingly, for some tasks, significant improvements are observed by adding a few linear dynamics to the nonlinear dynamical system. By employing the echo state network model, the effect of the mixture reservoir is numerically verified for a simple function approximation task and for more complex tasks.
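
    The following sketch illustrates the mixture-reservoir idea with a small echo state network in which a fraction of the reservoir units evolve linearly and the rest through a tanh nonlinearity, trained with a ridge-regression readout on a toy delayed-recall task. It is not the authors' implementation; the sizes, the linear fraction and the task are assumptions chosen only for illustration.

```python
# Minimal echo state network with a "mixture" reservoir: some units are linear
# (identity activation), the rest are nonlinear (tanh). Illustrative sketch only.
import numpy as np

rng = np.random.default_rng(0)
n_res, n_in, linear_fraction = 200, 1, 0.3
n_linear = int(linear_fraction * n_res)

# Random input and reservoir weights, rescaled to a target spectral radius.
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.normal(0, 1, (n_res, n_res))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))

def run_reservoir(u):
    """Drive the reservoir with input sequence u and collect its states."""
    x = np.zeros(n_res)
    states = np.empty((len(u), n_res))
    for t, u_t in enumerate(u):
        pre = W @ x + W_in @ np.atleast_1d(u_t)
        x = np.concatenate([pre[:n_linear],            # linear units
                            np.tanh(pre[n_linear:])])  # nonlinear units
        states[t] = x
    return states

# Toy task: reproduce the input delayed by k steps (a pure memory task).
T, k = 2000, 10
u = rng.uniform(-1, 1, T)
target = np.roll(u, k)

X = run_reservoir(u)[100:]        # discard a warm-up transient
y = target[100:]

# Ridge-regression readout.
ridge = 1e-6
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ y)
pred = X @ W_out
print("training NRMSE:", np.sqrt(np.mean((pred - y) ** 2)) / np.std(y))
```

    Varying `linear_fraction` between 0 and 1 exposes the memory-nonlinearity trade-off discussed in the abstract: purely nonlinear reservoirs tend to perform worse on this delay task than mixtures containing some linear units.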

  2. Distributing Working Memory Resources and the Use of External Representations on Problem Solving

    OpenAIRE

    大塚, 一徳; 宮谷, 真人

    2007-01-01

    This study examines how problem solvers distribute working memory resources over internal and external representations. Participants played three-dimensional versions of number guessing games. Playing number guessing games is directly related to the consumption of working memory resources. Participants could arbitrarily use the game record windows, which are the external representations of these games, and thus could distribute working memory demands over internal working memory resource...

  3. Distributed Computing for the Pierre Auger Observatory

    Science.gov (United States)

    Chudoba, J.

    2015-12-01

    The Pierre Auger Observatory operates the largest system of detectors for ultra-high energy cosmic ray measurements. Comparison of theoretical models of interactions with recorded data requires thousands of computing cores for Monte Carlo simulations. Since 2007, distributed resources connected via the EGI grid have been used successfully. The first and second versions of the production system, based on bash scripts and a MySQL database, were able to submit jobs to all reliable sites supporting the Virtual Organization auger. For many years VO auger has been among the top ten EGI users in terms of total computing time used. Migration of the production system to the DIRAC interware started in 2014. Pilot jobs improve the efficiency of computing jobs and eliminate problems with the small and less reliable sites used for the bulk production. The new system also has the possibility of using available resources in clouds. The Dirac File Catalog replaced LFC for new files, which are organized in datasets defined via metadata. CVMFS has been used for software distribution since 2014. In the presentation we give a comparison of the old and the new production systems and report the experience of migrating to the new system.

  4. Pseudo-interactive monitoring in distributed computing

    Energy Technology Data Exchange (ETDEWEB)

    Sfiligoi, I.; /Fermilab; Bradley, D.; Livny, M.; /Wisconsin U., Madison

    2009-05-01

    Distributed computing, and in particular Grid computing, enables physicists to use thousands of CPU days worth of computing every day, by submitting thousands of compute jobs. Unfortunately, a small fraction of such jobs regularly fail; the reasons vary from disk and network problems to bugs in the user code. A subset of these failures result in jobs being stuck for long periods of time. In order to debug such failures, interactive monitoring is highly desirable; users need to browse through the job log files and check the status of the running processes. Batch systems typically don't provide such services; at best, users get job logs at job termination, and even this may not be possible if the job is stuck in an infinite loop. In this paper we present a novel approach of using regular batch system capabilities of Condor to enable users to access the logs and processes of any running job. This does not provide true interactive access, so commands like vi are not viable, but it does allow operations like ls, cat, top, ps, lsof, netstat and dumping the stack of any process owned by the user; we call this pseudo-interactive monitoring. It is worth noting that the same method can be used to monitor Grid jobs in a glidein-based environment. We further believe that the same mechanism could be applied to many other batch systems.

  5. Automating usability of ATLAS Distributed Computing resources

    CERN Document Server

    "Tupputi, S A; The ATLAS collaboration

    2013-01-01

    The automation of ATLAS Distributed Computing (ADC) operations is essential to reduce manpower costs and allow performance-enhancing actions, which improve the reliability of the system. In this perspective a crucial case is the automatic exclusion/recovery of ATLAS computing sites' storage resources, which are continuously exploited at the edge of their capabilities. It is challenging to adopt unambiguous decision criteria for storage resources which feature non-homogeneous types, sizes and roles. The recently developed Storage Area Automatic Blacklisting (SAAB) tool has provided a suitable solution, by employing an inference algorithm which processes the outcome of SAM (Site Availability Test) site-by-site SRM tests. SAAB accomplishes both the task of providing global monitoring and that of performing automatic operations on single sites.

  6. Fast Distributed Computation of Distances in Networks

    CERN Document Server

    Almeida, Paulo Sérgio; Cunha, Alcino

    2011-01-01

    This paper presents a distributed algorithm to simultaneously compute the diameter, radius and node eccentricity in all nodes of a synchronous network. Such topological information may be useful as input to configure other algorithms. Previous approaches have been modular, progressing in sequential phases using building blocks such as BFS tree construction, thus incurring longer executions than strictly required. We present an algorithm that, by timely propagation of available estimations, achieves a faster convergence to the correct values. We show local criteria for detecting convergence in each node. The algorithm avoids the creation of BFS trees and simply manipulates sets of node ids and hop counts. For the worst scenario of variable start times, each node i with eccentricity ecc(i) can compute: the node eccentricity in diam(G)+ecc(i)+2 rounds; the diameter in 2*diam(G)+ecc(i)+2 rounds; and the radius in diam(G)+ecc(i)+2*radius(G) rounds.
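
    A simplified, round-based simulation of the underlying idea — each node repeatedly merges (node id, hop count) estimates received from its neighbours until no estimate improves, then takes its eccentricity as the largest known hop count — is sketched below. It is not the paper's exact protocol and it ignores variable start times; node names and the example graph are hypothetical.

```python
# Round-based simulation of propagating (node id, hop count) estimates in a
# synchronous network. Simplified illustration, not the paper's exact algorithm.
def eccentricities(adj):
    """adj: dict mapping node id -> iterable of neighbour ids (undirected graph)."""
    # Each node initially knows only itself, at distance 0.
    known = {v: {v: 0} for v in adj}
    changed = True
    while changed:                                     # one iteration = one round
        changed = False
        outgoing = {v: dict(known[v]) for v in adj}    # messages sent this round
        for v in adj:
            for u in adj[v]:
                for origin, hops in outgoing[u].items():
                    if hops + 1 < known[v].get(origin, float("inf")):
                        known[v][origin] = hops + 1    # better estimate received
                        changed = True
    return {v: max(known[v].values()) for v in adj}

# Example: a path graph 0-1-2-3.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
ecc = eccentricities(adj)
print("eccentricities:", ecc)                          # {0: 3, 1: 2, 2: 2, 3: 3}
print("diameter:", max(ecc.values()), "radius:", min(ecc.values()))
```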

  7. Computational Modeling of the Negative Priming Effect Based on Inhibition Patterns and Working Memory

    Directory of Open Access Journals (Sweden)

    Dongil Chung

    2013-11-01

    Full Text Available Negative priming (NP), slowing down of the response for target stimuli that have been previously exposed, but ignored, has been reported in multiple psychological paradigms including the Stroop task. Although NP likely results from the interplay of selective attention, episodic memory retrieval, working memory, and inhibition mechanisms, a comprehensive theoretical account of NP is currently unavailable. This lacuna may result from the complexity of stimuli combinations in NP. Thus, we aimed to investigate the presence of different degrees of the NP effect according to prime-probe combinations within a classic Stroop task. We recorded reaction times (RTs) from 66 healthy participants during Stroop task performance and examined three different NP subtypes, defined according to the type of the Stroop probe in prime-probe pairs. Our findings show significant RT differences among NP subtypes that are putatively due to the presence of differential disinhibition, i.e., release from inhibition. Among the several potential origins for differential subtypes of NP, we investigated the involvement of selective attention and/or working memory using a parallel distributed processing (PDP) model (employing selective attention only) and a modified PDP model with working memory (PDP-WM, employing both selective attention and working memory). Our findings demonstrate that, unlike the conventional PDP model, the PDP-WM successfully simulates different levels of NP effects that closely follow the behavioral data. This outcome suggests that working memory engages in the re-accumulation of the evidence for target response and induces differential NP effects. Our computational model complements earlier efforts and may pave the road to further insights into an integrated theoretical account of complex NP effects.

  8. Computational modeling of the negative priming effect based on inhibition patterns and working memory.

    Science.gov (United States)

    Chung, Dongil; Raz, Amir; Lee, Jaewon; Jeong, Jaeseung

    2013-01-01

    Negative priming (NP), slowing down of the response for target stimuli that have been previously exposed, but ignored, has been reported in multiple psychological paradigms including the Stroop task. Although NP likely results from the interplay of selective attention, episodic memory retrieval, working memory, and inhibition mechanisms, a comprehensive theoretical account of NP is currently unavailable. This lacuna may result from the complexity of stimuli combinations in NP. Thus, we aimed to investigate the presence of different degrees of the NP effect according to prime-probe combinations within a classic Stroop task. We recorded reaction times (RTs) from 66 healthy participants during Stroop task performance and examined three different NP subtypes, defined according to the type of the Stroop probe in prime-probe pairs. Our findings show significant RT differences among NP subtypes that are putatively due to the presence of differential disinhibition, i.e., release from inhibition. Among the several potential origins for differential subtypes of NP, we investigated the involvement of selective attention and/or working memory using a parallel distributed processing (PDP) model (employing selective attention only) and a modified PDP model with working memory (PDP-WM, employing both selective attention and working memory). Our findings demonstrate that, unlike the conventional PDP model, the PDP-WM successfully simulates different levels of NP effects that closely follow the behavioral data. This outcome suggests that working memory engages in the re-accumulation of the evidence for target response and induces differential NP effects. Our computational model complements earlier efforts and may pave the road to further insights into an integrated theoretical account of complex NP effects.

  9. Computer Use and Its Effect on the Memory Process in Young and Adults

    Science.gov (United States)

    Alliprandini, Paula Mariza Zedu; Straub, Sandra Luzia Wrobel; Brugnera, Elisangela; de Oliveira, Tânia Pitombo; Souza, Isabela Augusta Andrade

    2013-01-01

    This work investigates the effect of computer use on the memory process in young people and adults under the Perceptual and Memory experimental conditions. The memory condition involved the phases of acquisition of information and recovery, over time intervals (2 min, 24 hours and 1 week) in situations of pre- and post-test (before and after the participants…

  10. MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems.

    Science.gov (United States)

    González-Domínguez, Jorge; Liu, Yongchao; Touriño, Juan; Schmidt, Bertil

    2016-12-15

    MSAProbs is a state-of-the-art protein multiple sequence alignment tool based on hidden Markov models. It can achieve high alignment accuracy at the expense of relatively long runtimes for large-scale input datasets. In this work we present MSAProbs-MPI, a distributed-memory parallel version of the multithreaded MSAProbs tool that is able to reduce runtimes by exploiting the compute capabilities of common multicore CPU clusters. Our performance evaluation on a cluster with 32 nodes (each containing two Intel Haswell processors) shows reductions in execution time of over one order of magnitude for typical input datasets. Furthermore, MSAProbs-MPI using eight nodes is faster than the GPU-accelerated QuickProbs running on a Tesla K20. Another strong point is that MSAProbs-MPI can deal with large datasets for which MSAProbs and QuickProbs might fail due to time and memory constraints, respectively.

  11. Shared and Distributed Memory Parallel Security Analysis of Large-Scale Source Code and Binary Applications

    Energy Technology Data Exchange (ETDEWEB)

    Quinlan, D; Barany, G; Panas, T

    2007-08-30

    Many forms of security analysis on large scale applications can be substantially automated but the size and complexity can exceed the time and memory available on conventional desktop computers. Most commercial tools are understandably focused on such conventional desktop resources. This paper presents research work on the parallelization of security analysis of both source code and binaries within our Compass tool, which is implemented using the ROSE source-to-source open compiler infrastructure. We have focused on both shared and distributed memory parallelization of the evaluation of rules implemented as checkers for a wide range of secure programming rules, applicable to desktop machines, networks of workstations and dedicated clusters. While Compass as a tool focuses on source code analysis and reports violations of an extensible set of rules, the binary analysis work uses the exact same infrastructure but is less well developed into an equivalent final tool.

  12. Skyline View: Efficient Distributed Subspace Skyline Computation

    Science.gov (United States)

    Kim, Jinhan; Lee, Jongwuk; Hwang, Seung-Won

    Skyline queries have gained much attention as alternative query semantics with pros (e.g., low query formulation overhead) and cons (e.g., lack of control over result size). To overcome the cons, subspace skyline queries have been recently studied, where users iteratively specify relevant feature subspaces on the search space. However, existing works mainly focus on centralized databases. This paper aims to extend subspace skyline computation to distributed environments such as the Web, where the most important issue is to minimize the cost of accessing vertically distributed objects. Toward this goal, we exploit prior skylines that have overlapped subspaces with the given subspace. In particular, we develop algorithms for three scenarios: when the subspace of prior skylines is a superspace, a subspace, or neither. Our experimental results validate that our proposed algorithm shows significantly better performance than the state-of-the-art algorithms.
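
    To make the query semantics concrete, the sketch below computes a plain (centralized) subspace skyline: the set of objects not dominated by any other object on the chosen dimensions, assuming that smaller attribute values are preferred. The distributed optimizations that are the paper's contribution are not reproduced here, and the example data are hypothetical.

```python
# Basic subspace skyline (Pareto set) computation; smaller values preferred.
def dominates(a, b, dims):
    """a dominates b on dims: a is <= b everywhere and strictly < somewhere."""
    return all(a[d] <= b[d] for d in dims) and any(a[d] < b[d] for d in dims)

def subspace_skyline(objects, dims):
    """Return the objects not dominated by any other object on `dims`."""
    return [o for o in objects
            if not any(dominates(p, o, dims) for p in objects if p is not o)]

# Example: hotels described by (price, distance_to_beach, rating_rank).
hotels = [(120, 2.0, 1), (90, 3.5, 2), (150, 1.0, 3), (90, 2.0, 4)]
print(subspace_skyline(hotels, dims=(0, 1)))   # skyline on price and distance only
```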

  13. Computation of current distributions using FEMLAB

    Energy Technology Data Exchange (ETDEWEB)

    Shankar, M.S.; Pullabhotla, S.R. [Vellore Institute of Technology, Tamilnadu (India); Vijayasekaran, B. [Central Electrochemical Research Institute, Tamilnadu (India); Chemical Engineering, Tennessee Technological University, Tennessee (United States); Basha, C.A.

    2009-04-15

    An efficient method for the computation of current density and surface concentration distributions in electrochemical processes is analyzed using the commercial mathematical software FEMLAB. To illustrate the utility of the software, the procedure is applied to some realistic problems encountered in electrochemical engineering, such as current distribution in a continuous moving electrode, parallel plate electrode, hull cell, curvilinear hull cell, thin layer galvanic cell, through-hole plating, and a recessed disc electrode. The model equations of the above cases are considered and their implementations into the software, FEMLAB, are analyzed. The technique is attractive because it involves a systematic way of coupling equations to perform case studies. (Abstract Copyright [2009], Wiley Periodicals, Inc.)

  14. Computing shifts to monitor ATLAS distributed computing infrastructure and operations

    CERN Document Server

    Bourdarios, Claire; The ATLAS collaboration; Crepe-Renaudin, Sabine Chrystel; De, Kaushik

    2017-01-01

    The ATLAS Distributed Computing (ADC) group established a new Computing Run Coordinator (CRC) shift at the start of LHC Run2 in 2015. The main goal was to rely on a person with a good overview of the ADC activities to ease the ADC experts' workload. The CRC shifter keeps track of ADC tasks related to their fields of expertise and responsibility. At the same time, the shifter maintains a global view of the day-to-day operations of the ADC system. During Run1, this task was accomplished by the ADC Manager on Duty (AMOD), a position that was removed during the shutdown period due to the reduced number and availability of ADC experts foreseen for Run2. The CRC position was proposed to cover some of the AMOD’s former functions, while allowing more people involved in computing to participate. In this way, CRC shifters help train future ADC experts. The CRC shifters coordinate daily ADC shift operations, including tracking open issues, reporting, and representing ADC in relevant meetings. The CRC also facilitates ...

  15. Computing shifts to monitor ATLAS distributed computing infrastructure and operations

    CERN Document Server

    Adam Bourdarios, Claire; The ATLAS collaboration

    2016-01-01

    The ATLAS Distributed Computing (ADC) group established a new Computing Run Coordinator (CRC) shift at the start of LHC Run2 in 2015. The main goal was to rely on a person with a good overview of the ADC activities to ease the ADC experts' workload. The CRC shifter keeps track of ADC tasks related to their fields of expertise and responsibility. At the same time, the shifter maintains a global view of the day-to-day operations of the ADC system. During Run1, this task was accomplished by the ADC Manager on Duty (AMOD), a position that was removed during the shutdown period due to the reduced number and availability of ADC experts foreseen for Run2. The CRC position was proposed to cover some of the AMOD’s former functions, while allowing more people involved in computing to participate. In this way, CRC shifters help train future ADC experts. The CRC shifters coordinate daily ADC shift operations, including tracking open issues, reporting, and representing ADC in relevant meetings. The CRC also facilitates ...

  16. Energy-aware memory management for embedded multimedia systems a computer-aided design approach

    CERN Document Server

    Balasa, Florin

    2011-01-01

    Energy-Aware Memory Management for Embedded Multimedia Systems: A Computer-Aided Design Approach presents recent computer-aided design (CAD) ideas that address memory management tasks, particularly the optimization of energy consumption in the memory subsystem. It explains how to efficiently implement CAD solutions, including theoretical methods and novel algorithms. The book covers various energy-aware design techniques, including data-dependence analysis techniques, memory size estimation methods, extensions of mapping approaches, and memory banking approaches. It shows how these techniques

  17. GAiN: Distributed Array Computation with Python

    Energy Technology Data Exchange (ETDEWEB)

    Daily, Jeffrey A. [Washington State Univ., Pullman, WA (United States)

    2009-05-01

    Scientific computing makes use of very large, multidimensional numerical arrays - typically, gigabytes to terabytes in size - much larger than can fit on even the largest single compute node. Such arrays must be distributed across a "cluster" of nodes. Global Arrays is a cluster-based software system from Battelle Pacific Northwest National Laboratory that enables an efficient, portable, and parallel shared-memory programming interface to manipulate these arrays. Written in and for the C and FORTRAN programming languages, it takes advantage of high-performance cluster interconnections to allow any node in the cluster to access data on any other node very rapidly. The "numpy" module is the de facto standard for numerical calculation in the Python programming language, a language whose use is growing rapidly in the scientific and engineering communities. numpy provides a powerful N-dimensional array class as well as other scientific computing capabilities. However, like the majority of the core Python modules, numpy is inherently serial. Our system, GAiN (Global Arrays in NumPy), is a parallel extension to Python that accesses Global Arrays through numpy. This allows parallel processing and/or larger problem sizes to be harnessed almost transparently within new or existing numpy programs.
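
    GAiN's own interface is not reproduced here; as a rough illustration of the underlying idea — each process owning one block of a large array while global operations combine local numpy results — the following mpi4py sketch distributes a 1-D array and computes a global sum. Sizes and names are illustrative assumptions only.

```python
# Rough illustration of a distributed array: each MPI rank owns a block of a
# large 1-D array, and a global reduction combines the per-rank partial sums.
# This sketch uses mpi4py directly and does NOT reproduce the GAiN API.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

N = 10_000_000                       # global array length (hypothetical)
lo = rank * N // size                # this rank's block covers [lo, hi)
hi = (rank + 1) * N // size

# Each rank materialises only its own block.
local = np.arange(lo, hi, dtype=np.float64)
local_sum = local.sum()

# A global operation (here, a sum) is a local numpy call plus a reduction.
global_sum = comm.allreduce(local_sum, op=MPI.SUM)
if rank == 0:
    print("global sum:", global_sum)   # equals N*(N-1)/2
```

    Run with, e.g., `mpiexec -n 4 python dist_sum.py` (the script name is hypothetical).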

  18. Fast distributed large-pixel-count hologram computation using a GPU cluster.

    Science.gov (United States)

    Pan, Yuechao; Xu, Xuewu; Liang, Xinan

    2013-09-10

    Large-pixel-count holograms are an essential part of large-size holographic three-dimensional (3D) displays, but the generation of such holograms is computationally demanding. In order to address this issue, we have built a graphics processing unit (GPU) cluster with 32.5 Tflop/s computing power and implemented distributed hologram computation on it with speed improvement techniques such as shared memory on the GPU, GPU-level adaptive load balancing, and node-level load distribution. Using these speed improvement techniques on the GPU cluster, we have achieved a 71.4 times computation speed increase for 186M-pixel holograms. Furthermore, we have used the approaches of diffraction limits and subdivision of holograms to overcome the GPU memory limit in computing large-pixel-count holograms. 745M-pixel and 1.80G-pixel holograms were computed in 343 and 3326 s, respectively, for more than 2 million object points with RGB colors. Color 3D objects with 1.02M points were successfully reconstructed from a 186M-pixel hologram computed in 8.82 s with all the above three speed improvement techniques. It is shown that distributed hologram computation using a GPU cluster is a promising approach to increasing the computation speed of large-pixel-count holograms for large-size holographic displays.
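
    The sketch below illustrates only the subdivision and load-distribution aspect: the hologram plane is split into row strips that are computed by a pool of workers and then merged. The per-pixel kernel is a toy stand-in for a real computer-generated-hologram algorithm, and in the paper's setting each tile would be handled by a GPU node rather than a CPU process; all constants are assumptions.

```python
# Toy sketch of "subdivide the hologram and distribute the tiles".
import numpy as np
from multiprocessing import Pool

WIDTH, HEIGHT = 512, 512
PITCH = 8e-6                                   # pixel pitch (assumed)
WAVENUMBER = 2 * np.pi / 532e-9                # green laser wavelength (assumed)
rng = np.random.default_rng(0)
points = rng.random((200, 3)) * [1e-2, 1e-2, 0.1] + [0.0, 0.0, 0.05]  # toy object points

def compute_rows(row_range):
    """Compute one horizontal strip (tile) of the hologram plane."""
    y0, y1 = row_range
    ys = np.arange(y0, y1)[:, None] * PITCH
    xs = np.arange(WIDTH)[None, :] * PITCH
    tile = np.zeros((y1 - y0, WIDTH))
    for px, py, pz in points:                  # accumulate the fringe of each point
        r = np.sqrt((xs - px) ** 2 + (ys - py) ** 2 + pz ** 2)
        tile += np.cos(WAVENUMBER * r)
    return y0, tile

if __name__ == "__main__":
    n_workers = 8
    chunks = [(i * HEIGHT // n_workers, (i + 1) * HEIGHT // n_workers)
              for i in range(n_workers)]
    hologram = np.empty((HEIGHT, WIDTH))
    with Pool(n_workers) as pool:              # stand-in for node-level distribution
        for y0, tile in pool.imap_unordered(compute_rows, chunks):
            hologram[y0:y0 + tile.shape[0]] = tile   # merge tiles into the full plane
    print("hologram computed:", hologram.shape)
```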

  19. Irrelevant sensory stimuli interfere with working memory storage: evidence from a computational model of prefrontal neurons.

    Science.gov (United States)

    Bancroft, Tyler D; Hockley, William E; Servos, Philip

    2013-03-01

    The encoding of irrelevant stimuli into the memory store has previously been suggested as a mechanism of interference in working memory (e.g., Lange & Oberauer, Memory, 13, 333-339, 2005; Nairne, Memory & Cognition, 18, 251-269, 1990). Recently, Bancroft and Servos (Experimental Brain Research, 208, 529-532, 2011) used a tactile working memory task to provide experimental evidence that irrelevant stimuli were, in fact, encoded into working memory. In the present study, we replicated Bancroft and Servos's experimental findings using a biologically based computational model of prefrontal neurons, providing a neurocomputational model of overwriting in working memory. Furthermore, our modeling results show that inhibition acts to protect the contents of working memory, and they suggest a need for further experimental research into the capacity of vibrotactile working memory.

  20. Sensorimotor memory of object weight distribution during multidigit grasp.

    Science.gov (United States)

    Albert, Frederic; Santello, Marco; Gordon, Andrew M

    2009-10-09

    We studied the ability to transfer three-digit force sharing patterns learned through consecutive lifts of an object with an asymmetric center of mass (CM). After several object lifts, we asked subjects to rotate and translate the object to the contralateral hand and perform one additional lift. This task was performed under two weight conditions (550 and 950 g) to determine the extent to which subjects would be able to transfer weight and CM information. Learning transfer was quantified by measuring the extent to which force sharing patterns and peak object roll on the first post-translation trial resembled those measured on the pre-translation trial with the same CM. We found that the overall gain of fingertip forces was transferred following object rotation, but that the scaling of individual digit forces was specific to the learned digit-object configuration, and thus was not transferred following rotation. As a result, on the first post-translation trial there was a significantly larger object roll following object lift-off than on the pre-translation trial. This suggests that sensorimotor memories for weight, requiring scaling of fingertip force gain, may differ from memories for mass distribution.

  1. An Applet-based Anonymous Distributed Computing System.

    Science.gov (United States)

    Finkel, David; Wills, Craig E.; Ciaraldi, Michael J.; Amorin, Kevin; Covati, Adam; Lee, Michael

    2001-01-01

    Defines anonymous distributed computing systems and focuses on the specifics of a Java, applet-based approach for large-scale, anonymous, distributed computing on the Internet. Explains the possibility of a large number of computers participating in a single computation and describes a test of the functionality of the system. (Author/LRW)

  2. 10th International Symposium on Intelligent Distributed Computing

    CERN Document Server

    Seghrouchni, Amal; Beynier, Aurélie; Camacho, David; Herpson, Cédric; Hindriks, Koen; Novais, Paulo

    2017-01-01

    This book presents the combined peer-reviewed proceedings of the tenth International Symposium on Intelligent Distributed Computing (IDC’2016), which was held in Paris, France from October 10th to 12th, 2016. The 23 contributions address a range of topics related to theory and application of intelligent distributed computing, including: Intelligent Distributed Agent-Based Systems, Ambient Intelligence and Social Networks, Computational Sustainability, Intelligent Distributed Knowledge Representation and Processing, Smart Networks, Networked Intelligence and Intelligent Distributed Applications, amongst others.

  3. Spin-transfer torque magnetoresistive random-access memory technologies for normally off computing (invited)

    Energy Technology Data Exchange (ETDEWEB)

    Ando, K., E-mail: ando-koji@aist.go.jp; Yuasa, S. [National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba 305-8568 (Japan); Fujita, S.; Ito, J.; Yoda, H. [Toshiba Corporation, Kawasaki 212-8582 (Japan); Suzuki, Y. [Graduate School of Engineering Science, Osaka University, Toyonaka 560-8531 (Japan); Nakatani, Y. [Department of Communication Engineering and Informatics, University of Electro-Communication, Chofu 182-8585 (Japan); Miyazaki, T. [WPI-AIMR, Tohoku University, Sendai 980-8577 (Japan)

    2014-05-07

    Most parts of present computer systems are made of volatile devices, and the power supplied to them to avoid information loss causes huge energy losses. We can eliminate this meaningless energy loss by utilizing the non-volatile function of advanced spin-transfer torque magnetoresistive random-access memory (STT-MRAM) technology and create a new type of computer, i.e., normally off computers. Critical tasks to achieve normally off computers are implementations of STT-MRAM technologies in the main memory and low-level cache memories. STT-MRAM technology for applications to the main memory has been successfully developed by using perpendicular STT-MRAMs, and faster STT-MRAM technologies for applications to the cache memory are now being developed. The present status of STT-MRAMs and the challenges that remain for normally off computers are discussed.

  4. Equilibrium distribution from distributed computing (simulations of protein folding).

    Science.gov (United States)

    Scalco, Riccardo; Caflisch, Amedeo

    2011-05-19

    Multiple independent molecular dynamics (MD) simulations are often carried out starting from a single protein structure or a set of conformations that do not correspond to a thermodynamic ensemble. Therefore, a significant statistical bias is usually present in the Markov state model generated by simply combining the whole MD sampling into a network whose nodes and links are clusters of snapshots and transitions between them, respectively. Here, we introduce a depth-first search algorithm to extract from the whole conformation space network the largest ergodic component, i.e., the subset of nodes of the network whose transition matrix corresponds to an ergodic Markov chain. For multiple short MD simulations of a globular protein (as in distributed computing), the steady state, i.e., stationary distribution determined using the largest ergodic component, yields more accurate free energy profiles and mean first passage times than the original network or the ergodic network obtained by imposing detailed balance by means of symmetrization of the transition counts.
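
    A simplified version of this post-processing can be sketched as follows: restrict the transition-count matrix to its largest strongly connected component and take the stationary distribution of the renormalized chain. The paper uses its own depth-first search procedure; the scipy-based version below is only an illustration of the idea, with a made-up example count matrix.

```python
# Sketch of the "largest ergodic component" idea: keep the largest strongly
# connected component of the transition network, row-normalise the restricted
# count matrix, and take the stationary distribution of the resulting chain.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def stationary_from_counts(counts):
    """counts[i, j] = number of observed transitions from cluster i to cluster j."""
    graph = csr_matrix(counts > 0)
    n_comp, labels = connected_components(graph, directed=True, connection="strong")
    largest = np.argmax(np.bincount(labels))            # label of the biggest SCC
    keep = np.flatnonzero(labels == largest)
    sub = counts[np.ix_(keep, keep)].astype(float)
    T = sub / sub.sum(axis=1, keepdims=True)             # row-stochastic transition matrix
    # Stationary distribution = left eigenvector of T with eigenvalue 1.
    w, v = np.linalg.eig(T.T)
    pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    pi = np.abs(pi) / np.abs(pi).sum()
    return keep, pi

# Tiny example: clusters 0-2 form a closed ergodic set; cluster 3 is only left, never entered.
counts = np.array([[0, 5, 1, 0],
                   [4, 0, 2, 0],
                   [1, 3, 0, 0],
                   [1, 0, 1, 0]])
keep, pi = stationary_from_counts(counts)
print("largest ergodic component:", keep, "stationary distribution:", pi)
```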

  5. DISTRIBUTED COMPUTING SUPPORT SERVICE USER SURVEY

    CERN Multimedia

    2001-01-01

    IT Division operates a Distributed Computing Support Service, which offers support to owners and users of all variety of desktops throughout CERN as well as more dedicated services for certain groups, divisions and experiments. It also provides the staff who operate the central and satellite Computing Helpdesks, it supports printers throughout the site and it provides the installation activities of the IT Division PC Service. We have published a questionnaire, which seeks to gather your feedback on how the services are seen, how they are progressing and how they can be improved. Please take a few minutes to fill in this questionnaire. Replies will be treated in confidence if desired although you may also request an opportunity to be contacted by CERN's service management directly. Please tell us if you met problems but also if you had a successful conclusion to your request for assistance. You will find the questionnaire at the web site http://wwwinfo/support/survey/desktop-contract There will also be a link...

  6. DISTRIBUTED COMPUTING SUPPORT CONTRACT USER SURVEY

    CERN Multimedia

    2001-01-01

    IT Division operates a Distributed Computing Support Service, which offers support to owners and users of all variety of desktops throughout CERN as well as more dedicated services for certain groups, divisions and experiments. It also provides the staff who operate the central and satellite Computing Helpdesks, it supports printers throughout the site and it provides the installation activities of the IT Division PC Service. We have published a questionnaire which seeks to gather your feedback on how the services are seen, how they are progressing and how they can be improved. Please take a few minutes to fill in this questionnaire. Replies will be treated in confidence if desired although you may also request an opportunity to be contacted by CERN's service management directly. Please tell us if you met problems but also if you had a successful conclusion to your request for assistance. You will find the questionnaire at the web site http://wwwinfo/support/survey/desktop-contract There will also be a link ...

  7. Distributed computing grid experiences in CMS

    CERN Document Server

    Andreeva, Julia; Barrass, T; Bonacorsi, D; Bunn, Julian; Capiluppi, P; Corvo, M; Darmenov, N; De Filippis, N; Donno, F; Donvito, G; Eulisse, G; Fanfani, A; Fanzago, F; Filine, A; Grandi, C; Hernández, J M; Innocente, V; Jan, A; Lacaprara, S; Legrand, I; Metson, S; Newbold, D; Newman, H; Pierro, A; Silvestris, L; Steenberg, C; Stockinger, H; Taylor, Lucas; Thomas, M; Tuura, L; Van Lingen, F; Wildish, Tony

    2005-01-01

    The CMS experiment is currently developing a computing system capable of serving, processing and archiving the large number of events that will be generated when the CMS detector starts taking data. During 2004 CMS undertook a large scale data challenge to demonstrate the ability of the CMS computing system to cope with a sustained data-taking rate equivalent to 25% of startup rate. Its goals were: to run CMS event reconstruction at CERN for a sustained period at 25 Hz input rate; to distribute the data to several regional centers; and enable data access at those centers for analysis. Grid middleware was utilized to help complete all aspects of the challenge. To continue to provide scalable access from anywhere in the world to the data, CMS is developing a layer of software that uses Grid tools to gain access to data and resources, and that aims to provide physicists with a user friendly interface for submitting their analysis jobs. This paper describes the data challenge experience with Grid infrastructure ...

  8. Automating usability of ATLAS Distributed Computing resources

    Science.gov (United States)

    Tupputi, S. A.; Di Girolamo, A.; Kouba, T.; Schovancová, J.; Atlas Collaboration

    2014-06-01

    The automation of ATLAS Distributed Computing (ADC) operations is essential to reduce manpower costs and allow performance-enhancing actions, which improve the reliability of the system. In this perspective a crucial case is the automatic handling of outages of ATLAS computing sites storage resources, which are continuously exploited at the edge of their capabilities. It is challenging to adopt unambiguous decision criteria for storage resources of non-homogeneous types, sizes and roles. The recently developed Storage Area Automatic Blacklisting (SAAB) tool has provided a suitable solution, by employing an inference algorithm which processes history of storage monitoring tests outcome. SAAB accomplishes both the tasks of providing global monitoring as well as automatic operations on single sites. The implementation of the SAAB tool has been the first step in a comprehensive review of the storage areas monitoring and central management at all levels. Such review has involved the reordering and optimization of SAM tests deployment and the inclusion of SAAB results in the ATLAS Site Status Board with both dedicated metrics and views. The resulting structure allows monitoring the storage resources status with fine time-granularity and automatic actions to be taken in foreseen cases, like automatic outage handling and notifications to sites. Hence, the human actions are restricted to reporting and following up problems, where and when needed. In this work we show SAAB working principles and features. We present also the decrease of human interactions achieved within the ATLAS Computing Operation team. The automation results in a prompt reaction to failures, which leads to the optimization of resource exploitation.

  9. On distributed memory MPI-based parallelization of SPH codes in massive HPC context

    Science.gov (United States)

    Oger, G.; Le Touzé, D.; Guibert, D.; de Leffe, M.; Biddiscombe, J.; Soumagne, J.; Piccinali, J.-G.

    2016-03-01

    Most particle methods share the problem of high computational cost, and in order to satisfy the demands of solvers, currently available hardware technologies must be fully exploited. Two complementary technologies are now accessible. On the one hand, CPUs can be structured into a multi-node framework, allowing massive data exchanges through a high-speed network; in this case, each node usually comprises several cores available to perform multithreaded computations. On the other hand, GPUs, which are derived from graphics computing technologies, are able to perform highly multi-threaded calculations with hundreds of independent threads connected together through a common shared memory. This paper is primarily dedicated to the distributed-memory parallelization of particle methods, targeting several thousands of CPU cores. The experience gained clearly shows that parallelizing a particle-based code on moderate numbers of cores can easily lead to an acceptable scalability, whilst a scalable speedup on thousands of cores is much more difficult to obtain. The discussion revolves around speeding up particle methods as a whole, in a massive HPC context, by making use of the MPI library. We focus on one particular particle method, Smoothed Particle Hydrodynamics (SPH), one of the most widespread today in the literature as well as in engineering.
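
    A minimal flavour of distributed-memory SPH with MPI is sketched below: a 1-D slab decomposition of the domain, a halo exchange of particles lying within one smoothing length of the slab boundaries, and a local density summation. The kernel, decomposition and parameters are simplified assumptions, not the authors' code.

```python
# Minimal distributed-memory SPH sketch: slab decomposition, halo exchange,
# and a local density summation with a toy Gaussian kernel (illustrative only).
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

h, mass = 0.05, 1.0                            # smoothing length and particle mass (assumed)
xmin, xmax = rank / size, (rank + 1) / size    # this rank's slab of the unit interval

rng = np.random.default_rng(rank)
local = np.sort(rng.uniform(xmin, xmax, 1000))   # particles owned by this rank

def halo_exchange(local):
    """Send boundary particles to neighbouring ranks and receive theirs."""
    left, right = rank - 1, rank + 1
    ghosts = []
    if left >= 0:
        ghosts.append(comm.sendrecv(local[local < xmin + h], dest=left, source=left))
    if right < size:
        ghosts.append(comm.sendrecv(local[local > xmax - h], dest=right, source=right))
    return np.concatenate([local] + ghosts) if ghosts else local

def density(x_i, neighbours):
    """Toy 1-D Gaussian kernel summation over nearby particles."""
    r = np.abs(neighbours - x_i)
    w = np.exp(-(r / h) ** 2) / (h * np.sqrt(np.pi))
    return mass * w[r < 3 * h].sum()

all_particles = halo_exchange(local)           # owned particles plus ghost halo
rho = np.array([density(x, all_particles) for x in local])
print(f"rank {rank}: mean density {rho.mean():.1f}")
```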

  10. Hierarchical Temporal Memory Based on Spin-Neurons and Resistive Memory for Energy-Efficient Brain-Inspired Computing

    OpenAIRE

    Fan, Deliang; Sharad, Mrigank; Sengupta, Abhronil; Roy, Kaushik

    2014-01-01

    Hierarchical temporal memory (HTM) tries to mimic the computing in the cerebral neocortex. It identifies spatial and temporal patterns in the input for making inferences. This may require a large number of computationally expensive operations, such as dot-product evaluations. Nano-devices that can provide direct mapping for such primitives are of great interest. In this work we show that the computing blocks for HTM can be mapped using low-voltage, fast-switching, magneto-metallic spin-neurons combined wit...

  11. Simulation of radiation effects on three-dimensional computer optical memories

    Science.gov (United States)

    Moscovitch, M.; Emfietzoglou, D.

    1997-01-01

    A model was developed to simulate the effects of heavy charged-particle (HCP) radiation on the information stored in three-dimensional computer optical memories. The model is based on (i) the HCP track radial dose distribution, (ii) the spatial and temporal distribution of temperature in the track, (iii) the matrix-specific radiation-induced changes that will affect the response, and (iv) the kinetics of transition of photochromic molecules from the colored to the colorless isomeric form (bit flip). It is shown that information stored in a volume of several nanometers radius around the particle's track axis may be lost. The magnitude of the effect is dependent on the particle's track structure.

  12. Differentiation and Response Bias in Episodic Memory: Evidence from Reaction Time Distributions

    Science.gov (United States)

    Criss, Amy H.

    2010-01-01

    In differentiation models, the processes of encoding and retrieval produce an increase in the distribution of memory strength for targets and a decrease in the distribution of memory strength for foils as the amount of encoding increases. This produces an increase in the hit rate and decrease in the false-alarm rate for a strongly encoded compared…

  13. Continuous-variable quantum computing in optical time-frequency modes using quantum memories.

    Science.gov (United States)

    Humphreys, Peter C; Kolthammer, W Steven; Nunn, Joshua; Barbieri, Marco; Datta, Animesh; Walmsley, Ian A

    2014-09-26

    We develop a scheme for time-frequency encoded continuous-variable cluster-state quantum computing using quantum memories. In particular, we propose a method to produce, manipulate, and measure two-dimensional cluster states in a single spatial mode by exploiting the intrinsic time-frequency selectivity of Raman quantum memories. Time-frequency encoding enables the scheme to be extremely compact, requiring a number of memories that are a linear function of only the number of different frequencies in which the computational state is encoded, independent of its temporal duration. We therefore show that quantum memories can be a powerful component for scalable photonic quantum information processing architectures.

  14. The reminiscence bump without memories: The distribution of imagined word-cued and important autobiographical memories in a hypothetical 70-year-old

    DEFF Research Database (Denmark)

    Koppel, Jonathan Mark; Berntsen, Dorthe

    2016-01-01

    of autobiographical memories per se, most notably factors that aid in their encoding or retention, by asking students to generate imagined word-cued and imagined ‘most important’ autobiographical memories of a hypothetical, prototypical 70-year-old of their own culture and gender. We compared the distribution...... of these fictional memories with the distributions of actual word-cued and most important autobiographical memories in a sample of 61–70-year-olds. We found a striking similarity between the temporal distributions of the imagined memories and the actual memories. These results suggest that the reminiscence bump...

  15. Compact Differential Evolution Light: High Performance Despite Limited Memory Requirement and Modest Computational Overhead

    Institute of Scientific and Technical Information of China (English)

    Giovanni Iacca; Fabio Caraffini; Ferrante Neri

    2012-01-01

    Compact algorithms are Estimation of Distribution Algorithms which mimic the behavior of population-based algorithms by means of a probabilistic representation of the population of candidate solutions. These algorithms have a similar behaviour with respect to population-based algorithms but require a much smaller memory. This feature is crucially important in some engineering applications, especially in robotics. A high performance compact algorithm is the compact Differential Evolution (cDE) algorithm. This paper proposes a novel implementation of cDE, namely compact Differential Evolution light (cDElight), to address not only the memory saving necessities but also real-time requirements. cDElight employs two novel algorithmic modifications for employing a smaller computational overhead without a performance loss, with respect to cDE. Numerical results, carried out on a broad set of test problems, show that cDElight, despite its minimal hardware requirements, does not deteriorate the performance of cDE and thus is competitive with other memory saving and population-based algorithms. An application in the field of mobile robotics highlights the usability and advantages of the proposed approach.
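
    The compact idea is easiest to see in code: instead of storing a population, the algorithm keeps a per-dimension Gaussian model (mu, sigma), samples virtual individuals from it, applies the usual DE mutation and crossover against a persistent elite, and nudges the model toward the winner. The sketch below follows the general compact-DE recipe on a toy sphere function; it is not the exact cDElight algorithm, and the update rule and parameter values are stated assumptions.

        # Schematic compact Differential Evolution loop (illustrative only).
        import numpy as np

        def sphere(x):
            return float(np.sum(x * x))

        def compact_de(dim=10, iters=5000, F=0.5, CR=0.9, virtual_pop=50, seed=0):
            rng = np.random.default_rng(seed)
            mu = np.zeros(dim)
            sigma = np.full(dim, 10.0)            # wide initial model
            elite = rng.normal(mu, sigma)
            for _ in range(iters):
                # sample three virtual individuals from the probabilistic model
                xr1, xr2, xr3 = (rng.normal(mu, sigma) for _ in range(3))
                mutant = xr1 + F * (xr2 - xr3)    # DE/rand/1 mutation
                cross = rng.random(dim) < CR      # binomial crossover with the elite
                trial = np.where(cross, mutant, elite)
                winner, loser = (trial, elite) if sphere(trial) < sphere(elite) else (elite, trial)
                # shift the model toward the winner and away from the loser
                mu_new = mu + (winner - loser) / virtual_pop
                sigma = np.sqrt(np.abs(sigma**2 + mu**2 - mu_new**2 + (winner**2 - loser**2) / virtual_pop))
                mu, elite = mu_new, winner        # persistent elitism
            return elite, sphere(elite)

        best, value = compact_de()
        print(value)   # should be close to zero on the sphere function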

  16. The ICAAP Project, Part Three: OSF Distributed Computing Environment.

    Science.gov (United States)

    Cantor, Scott

    1997-01-01

    DCE (Distributed Computing Environment) is a collection of services, tools, and libraries for building the infrastructure necessary for distributed computing within an enterprise. This article discusses the Open Software Foundation (OSF); the components of DCE, including the Directory and Security Services, the Distributed Time Service, and the…

  17. Evaluating computer-assisted memory retraining programmes for people with post-head injury amnesia.

    Science.gov (United States)

    Tam, Sing-Fai; Man, Wai-Kwong

    2004-05-01

    The present study was designed to perform theory-driven empirical work that might contribute to a better understanding of computer-assisted training effects, adopting theoretically different memory retraining strategies for people who had amnesia as a result of a brain injury. A pre-test and post-test control group quasi-experimental design was adopted to test the differences in effectiveness of four different computer-assisted memory training strategies, which were hypothesized to improve different memory skills of persons with brain injury. Twenty-six persons with brain injury were randomly assigned to four age- and gender-matched memory training groups (self-paced, feedback, personalized, visual presentation) and they were trained using the related computer software, evaluated by the Rivermead Behavioural Memory Test (RBMT), a self-efficacy scale and built-up computer performance records. All four memory training methods showed positive effects among the persons with brain injury as compared with a control group, although there was no statistically significant difference among the four training methods. However, clinical improvement was found in all four methods and the Feedback group showed significant improvement in self-efficacy in comparison with the other groups. Different computer applications for memory retraining were thus developed and evaluated, and the effectiveness of applying customized computer technology in memory rehabilitation was critically assessed. Results of the present study showed that the unique customized therapeutic characteristics of computer-assisted memory retraining (e.g. self-paced practice, performance feedback, salient visual presentation and personalized training contents) are positive attributes of memory skill retraining outcomes.

  18. A survey of checkpointing algorithms for parallel and distributed computers

    Indian Academy of Sciences (India)

    S Kalaiselvi; V Rajaraman

    2000-10-01

    Checkpoint is defined as a designated place in a program at which normal processing is interrupted specifically to preserve the status information necessary to allow resumption of processing at a later time. Checkpointing is the process of saving the status information. This paper surveys the algorithms which have been reported in the literature for checkpointing parallel/distributed systems. It has been observed that most of the algorithms published for checkpointing in message passing systems are based on the seminal article by Chandy and Lamport. A large number of articles have been published in this area by relaxing the assumptions made in that article and by extending it to minimise the overheads of coordination and context saving. Checkpointing for shared-memory systems primarily extends cache coherence protocols to maintain a consistent memory. All of them assume that the main memory is safe for storing the context. Recently, algorithms have been published for distributed shared-memory systems, which extend the cache coherence protocols used in shared-memory systems. They however also include methods for storing the status of distributed memory in stable storage. Most of the algorithms assume that there is no knowledge about the programs being executed. It is however felt that in the development of parallel programs the user has to do a fair amount of work in distributing tasks, and this information can be effectively used to simplify checkpointing and rollback recovery.
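
    Since most of the surveyed message-passing algorithms descend from the Chandy-Lamport snapshot protocol, a stripped-down statement of its marker rule may help fix ideas. The process class below is an illustrative assumption, not a complete checkpointing system, and the send callback is assumed to forward the marker on every outgoing channel.

        # Schematic Chandy-Lamport marker rule (sketch, not a full implementation).
        class Process:
            def __init__(self, pid, channels_in):
                self.pid = pid
                self.channels_in = channels_in        # ids of incoming channels
                self.recorded_state = None
                self.channel_logs = {}                # channel id -> in-transit messages
                self.marker_seen = set()

            def initiate_checkpoint(self, local_state, send):
                self.recorded_state = local_state     # record own state first
                for ch in self.channels_in:
                    self.channel_logs[ch] = []        # start recording incoming channels
                send("MARKER")                        # then send a marker on every outgoing channel

            def on_message(self, channel, msg, local_state, send):
                if msg == "MARKER":
                    if self.recorded_state is None:
                        self.initiate_checkpoint(local_state, send)
                    self.marker_seen.add(channel)     # stop recording this channel
                elif self.recorded_state is not None and channel not in self.marker_seen:
                    self.channel_logs[channel].append(msg)   # message was in transit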

  19. Distributed Quantum Computation over Noisy Channels

    CERN Document Server

    Ekert, A K; Macchiavello, C; Cirac, J I

    1999-01-01

    We analyse the use of entangled states to perform quantum computations non-locally among distant nodes in a quantum network. We show that for a sufficiently large number of nodes maximally entangled states are always advantageous over independent computations in each node, even in the presence of noise during the computation process.

  20. A computational approach to memory deficits in schizophrenia

    NARCIS (Netherlands)

    J.M.J. Murre; M. Meeter; L.M. Talamini

    2002-01-01

    Episodic memory impairment is one of the most reliable neuropsychological findings in schizophrenia. It has been suggested that medial temporal lobe abnormalities in schizophrenia underlie this impairment. This article suggests that the specific memory deficits in schizophrenia may be caused by abno

  1. A Logistics Distribution Plan Based on Cloud Computing

    OpenAIRE

    2013-01-01

    Aiming at the problems of low informatization level, low degree of specialization, high consumption and low efficiency in the logistics and distribution industry, this paper analyzes the characteristics of cloud computing and the actual needs of enterprise logistics. On this basis, it studies in depth a cloud computing architecture for logistics and distribution needs, and then proposes a cloud-based modern ...

  2. A high-throughput bioinformatics distributed computing platform

    OpenAIRE

    Keane, Thomas M; Page, Andrew J.; McInerney, James O; Naughton, Thomas J.

    2005-01-01

    In recent years the demand for high performance computing has greatly increased in the area of bioinformatics. The huge increase in the size of many genomic databases has meant that many common tasks in bioinformatics cannot be completed in a reasonable amount of time on a single processor. Recently, distributed computing has emerged as an inexpensive alternative to dedicated parallel computing. We have developed a general-purpose distributed computing platform ...

  3. The Sensitivity of Memory Consolidation and Reconsolidation to Inhibitors of Protein Synthesis and Kinases: Computational Analysis

    Science.gov (United States)

    Zhang, Yili; Smolen, Paul; Baxter, Douglas A.; Byrne, John H.

    2010-01-01

    Memory consolidation and reconsolidation require kinase activation and protein synthesis. Blocking either process during or shortly after training or recall disrupts memory stabilization, which suggests the existence of a critical time window during which these processes are necessary. Using a computational model of kinase synthesis and…

  4. Power-aware applications for scientific cluster and distributed computing

    CERN Document Server

    Abdurachmanov, David; Eulisse, Giulio; Grosso, Paola; Hillegas, Curtis; Holzman, Burt; Klous, Sander; Knight, Robert; Muzaffar, Shahzad

    2014-01-01

    The aggregate power use of computing hardware is an important cost factor in scientific cluster and distributed computing systems. The Worldwide LHC Computing Grid (WLCG) is a major example of such a distributed computing system, used primarily for high throughput computing (HTC) applications. It has a computing capacity and power consumption rivaling that of the largest supercomputers. The computing capacity required from this system is also expected to grow over the next decade. Optimizing the power utilization and cost of such systems is thus of great interest. A number of trends currently underway will provide new opportunities for power-aware optimizations. We discuss how power-aware software applications and scheduling might be used to reduce power consumption, both as autonomous entities and as part of a (globally) distributed system. As concrete examples of computing centers we provide information on the large HEP-focused Tier-1 at FNAL, and the Tigress High Performance Computing Center at Princeton U...

  5. A Distributed Computational Infrastructure for Science and Education

    Directory of Open Access Journals (Sweden)

    Rustam K. Bazarov

    2014-06-01

    Researchers have lately been paying increasing attention to parallel and distributed algorithms for solving high-dimensionality problems. In this regard, the issue of acquiring or renting computational resources becomes a topical one for employees of scientific and educational institutions. This article examines technology and methods for organizing a distributed computational infrastructure. The author addresses the experience of creating a high-performance system powered by existing clusterization and grid computing technology. The approach examined in the article helps minimize financial costs, aggregate territorially distributed computational resources and ensure a more rational use of available computer equipment, eliminating its downtimes.

  6. Distributing an executable job load file to compute nodes in a parallel computer

    Science.gov (United States)

    Gooding, Thomas M.

    2016-08-09

    Distributing an executable job load file to compute nodes in a parallel computer, the parallel computer comprising a plurality of compute nodes, including: determining, by a compute node in the parallel computer, whether the compute node is participating in a job; determining, by the compute node in the parallel computer, whether a descendant compute node is participating in the job; responsive to determining that the compute node is participating in the job or that the descendant compute node is participating in the job, communicating, by the compute node to a parent compute node, an identification of a data communications link over which the compute node receives data from the parent compute node; constructing a class route for the job, wherein the class route identifies all compute nodes participating in the job; and broadcasting the executable load file for the job along the class route for the job.

  7. Distributing an executable job load file to compute nodes in a parallel computer

    Energy Technology Data Exchange (ETDEWEB)

    Gooding, Thomas M.

    2016-09-13

    Distributing an executable job load file to compute nodes in a parallel computer, the parallel computer comprising a plurality of compute nodes, including: determining, by a compute node in the parallel computer, whether the compute node is participating in a job; determining, by the compute node in the parallel computer, whether a descendant compute node is participating in the job; responsive to determining that the compute node is participating in the job or that the descendant compute node is participating in the job, communicating, by the compute node to a parent compute node, an identification of a data communications link over which the compute node receives data from the parent compute node; constructing a class route for the job, wherein the class route identifies all compute nodes participating in the job; and broadcasting the executable load file for the job along the class route for the job.
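
    The participation logic in these two records amounts to an upward aggregation over the compute-node tree: a node is on the class route if it participates itself or if any descendant does. The sketch below reproduces that rule on a made-up tree; the node numbering and tree layout are invented for illustration.

        # Illustrative class-route construction over a toy compute-node tree.
        def build_class_route(tree, participating, node=0):
            """Return the set of nodes on the class route rooted at `node`."""
            route = set()
            for child in tree.get(node, []):
                route |= build_class_route(tree, participating, child)
            if node in participating or route:
                route.add(node)          # node must relay toward participating descendants
            return route

        tree = {0: [1, 2], 1: [3, 4], 2: [5, 6]}     # parent -> children
        participating = {4, 5}
        print(sorted(build_class_route(tree, participating)))   # -> [0, 1, 2, 4, 5]
        # the executable load file would be broadcast along exactly these links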

  8. Computational dissection of human episodic memory reveals mental process-specific genetic profiles.

    Science.gov (United States)

    Luksys, Gediminas; Fastenrath, Matthias; Coynel, David; Freytag, Virginie; Gschwind, Leo; Heck, Angela; Jessen, Frank; Maier, Wolfgang; Milnik, Annette; Riedel-Heller, Steffi G; Scherer, Martin; Spalek, Klara; Vogler, Christian; Wagner, Michael; Wolfsgruber, Steffen; Papassotiropoulos, Andreas; de Quervain, Dominique J-F

    2015-09-01

    Episodic memory performance is the result of distinct mental processes, such as learning, memory maintenance, and emotional modulation of memory strength. Such processes can be effectively dissociated using computational models. Here we performed gene set enrichment analyses of model parameters estimated from the episodic memory performance of 1,765 healthy young adults. We report robust and replicated associations of the amine compound SLC (solute-carrier) transporters gene set with the learning rate, of the collagen formation and transmembrane receptor protein tyrosine kinase activity gene sets with the modulation of memory strength by negative emotional arousal, and of the L1 cell adhesion molecule (L1CAM) interactions gene set with the repetition-based memory improvement. Furthermore, in a large functional MRI sample of 795 subjects we found that the association between L1CAM interactions and memory maintenance revealed large clusters of differences in brain activity in frontal cortical areas. Our findings provide converging evidence that distinct genetic profiles underlie specific mental processes of human episodic memory. They also provide empirical support to previous theoretical and neurobiological studies linking specific neuromodulators to the learning rate and linking neural cell adhesion molecules to memory maintenance. Furthermore, our study suggests additional memory-related genetic pathways, which may contribute to a better understanding of the neurobiology of human memory.

  9. Distributed computing environments for future space control systems

    Science.gov (United States)

    Viallefont, Pierre

    1993-01-01

    The aim of this paper is to present the results of a CNES research project on distributed computing systems. The purpose of this research was to study the impact of the use of new computer technologies in the design and development of future space applications. The first part of this study was a state-of-the-art review of distributed computing systems. One of the interesting ideas arising from this review is the concept of a 'virtual computer' allowing the distributed hardware architecture to be hidden from a software application. The 'virtual computer' can improve system performance by adapting the best architecture (addition of computers) to the software application without having to modify its source code. This concept can also decrease the cost and obsolescence of the hardware architecture. In order to verify the feasibility of the 'virtual computer' concept, a prototype representative of a distributed space application is being developed independently of the hardware architecture.

  10. The ATLAS Distributed Computing: the challenges of the future

    CERN Document Server

    Sakamoto, H; The ATLAS collaboration

    2013-01-01

    The ATLAS experiment has collected more than 25 fb-1 of data since the LHC started its operation in 2010. Tens of petabytes of collision events and Monte-Carlo simulations are stored over more than 150 computing centers all over the world. The data processing is performed on grid sites providing more than 100,000 computing cores and orchestrated by the ATLAS in-house developed job and data management services. The discovery of the Higgs-like boson in 2012 would not have been possible without the excellent performance of the ATLAS Distributed Computing. The future ATLAS experiment operation with increased LHC beam energy and luminosity foreseen for 2014 imposes a significant increase in the computing demands that ATLAS Distributed Computing needs to satisfy. Therefore, development of new data-processing, storage and data-distribution systems has been started to efficiently use the computing resources, exploiting current and future technologies of distributed computing.

  11. A Working Memory System With Distributed Executive Control.

    Science.gov (United States)

    Vandierendonck, André

    2016-01-01

    Working memory consists of domain-specific storage facilities and domain-general executive control processes. In some working memory theories, these control processes are accounted for via a homunculus, the central executive. In the present article, the author defends a mechanistic view of executive control by adopting the position that executive control is situated in the context of goal-directed behavior to maintain and protect the goal and to select an action to attain the goal. On the basis of findings in task switching and dual tasking, he proposes an adapted multicomponent working memory model in which the central executive is replaced by three interacting components: an executive memory that maintains the task set, a collection of acquired procedural rules, and an engine that executes the procedural rules that match the ensemble of working memory contents. The strongest among the rules that match the ensemble of working memory contents is applied, resulting in changes of the working memory contents or in motor actions. According to this model, goals are attained when the route to the goals is known or can be searched when the route is unknown (problem solving). Empirical evidence for this proposal and new predictions are discussed.

  12. Research on Dynamic Distributed Computing System for Small and Medium-Sized Computer Clusters

    Institute of Scientific and Technical Information of China (English)

    Le Kang; Jianliang Xu; Feng Liu

    2012-01-01

    Distributed computing is an approach by which a complex task that needs a large amount of computation can be divided into small pieces and calculated by more than one computer, with the final result obtained by combining the results from each computer. This paper considers a distributed computing system running on small and medium-sized computer clusters to address the low efficiency of a single computer and to improve the efficiency of large-scale computing. The experiments show that the system can effectively improve the efficiency and that it is a viable approach.
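
    The divide-compute-combine pattern the record describes can be shown in a few lines; here Python's multiprocessing pool stands in for the worker machines of a small cluster, and the workload is a toy sum chosen only for illustration.

        # Minimal divide-compute-combine illustration (toy workload).
        from multiprocessing import Pool

        def partial_sum(chunk):
            return sum(i * i for i in chunk)          # work done by one "node"

        if __name__ == "__main__":
            n = 1_000_000
            chunks = [range(start, n, 4) for start in range(4)]   # split into 4 pieces
            with Pool(processes=4) as pool:
                partials = pool.map(partial_sum, chunks)          # compute pieces in parallel
            print(sum(partials))                                  # combine the final result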

  13. An empirical investigation of sparse distributed memory using discrete speech recognition

    Science.gov (United States)

    Danforth, Douglas G.

    1990-01-01

    Presented here is a step by step analysis of how the basic Sparse Distributed Memory (SDM) model can be modified to enhance its generalization capabilities for classification tasks. Data is taken from speech generated by a single talker. Experiments are used to investigate the theory of associative memories and the question of generalization from specific instances.

  14. The Distribution of Subjective Memory Strength: List Strength and Response Bias

    Science.gov (United States)

    Criss, Amy H.

    2009-01-01

    Models of recognition memory assume that memory decisions are based partially on the subjective strength of the test item. Models agree that the subjective strength of targets increases with additional time for encoding however the origin of the subjective strength of foils remains disputed. Under the fixed strength assumption the distribution of…

  15. A nonclassical symbolic theory of working memory, mental computations, and mental set

    CERN Document Server

    Eliashberg, Victor

    2009-01-01

    The paper tackles four basic questions associated with human brain as a learning system. How can the brain learn to (1) mentally simulate different external memory aids, (2) perform, in principle, any mental computations using imaginary memory aids, (3) recall the real sensory and motor events and synthesize a combinatorial number of imaginary events, (4) dynamically change its mental set to match a combinatorial number of contexts? We propose a uniform answer to (1)-(4) based on the general postulate that the human neocortex processes symbolic information in a "nonclassical" way. Instead of manipulating symbols in a read/write memory, as the classical symbolic systems do, it manipulates the states of dynamical memory representing different temporary attributes of immovable symbolic structures stored in a long-term memory. The approach is formalized as the concept of E-machine. Intuitively, an E-machine is a system that deals mainly with characteristic functions representing subsets of memory pointers rather ...

  16. Encoding of fear learning and memory in distributed neuronal circuits.

    Science.gov (United States)

    Herry, Cyril; Johansen, Joshua P

    2014-12-01

    How sensory information is transformed by learning into adaptive behaviors is a fundamental question in neuroscience. Studies of auditory fear conditioning have revealed much about the formation and expression of emotional memories and have provided important insights into this question. Classical work focused on the amygdala as a central structure for fear conditioning. Recent advances, however, have identified new circuits and neural coding strategies mediating fear learning and the expression of fear behaviors. One area of research has identified key brain regions and neuronal coding mechanisms that regulate the formation, specificity and strength of fear memories. Other work has discovered critical circuits and neuronal dynamics by which fear memories are expressed through a medial prefrontal cortex pathway and coordinated activity across interconnected brain regions. Here we review these recent advances alongside prior work to provide a working model of the extended circuits and neuronal coding mechanisms mediating fear learning and memory.

  17. Projection multiplex recording of computer-synthesised one-dimensional Fourier holograms for holographic memory systems: mathematical and experimental modelling

    Energy Technology Data Exchange (ETDEWEB)

    Betin, A Yu; Bobrinev, V I; Verenikina, N M; Donchenko, S S; Odinokov, S B [Research Institute ' Radiotronics and Laser Engineering' , Bauman Moscow State Technical University, Moscow (Russian Federation); Evtikhiev, N N; Zlokazov, E Yu; Starikov, S N; Starikov, R S [National Reseach Nuclear University MEPhI (Moscow Engineering Physics Institute), Moscow (Russian Federation)

    2015-08-31

    A multiplex method of recording computer-synthesised one-dimensional Fourier holograms intended for holographic memory devices is proposed. The method potentially allows increasing the recording density in the previously proposed holographic memory system based on the computer synthesis and projection recording of data page holograms. (holographic memory)

  18. Distributed Computing: Considerations for Its Use within Educational Environments.

    Science.gov (United States)

    Pratt, S. J.

    1985-01-01

    Emphasizing more effective use of existing equipment, this article highlights distributed computing design considerations applicable to educational environments; identifies potential roles of networking in the provision of adequate teaching aids; presents a networking model; and describes the development of a distributed computing configuration at…

  19. The Principles and Practice of Distributed High Throughput Computing

    CERN Document Server

    CERN. Geneva

    2016-01-01

    The potential of Distributed Processing Systems to deliver computing capabilities with qualities ranging from high availability and reliability to easy expansion in functionality and capacity was recognized and formalized in the 1970’s. For more than three decades these principles of Distributed Computing have guided the development of the HTCondor resource and job management system. The widely adopted suite of software tools offered by HTCondor is based on novel distributed computing technologies and is driven by the evolving needs of High Throughput scientific applications. We will review the principles that underpin our work, the distributed computing frameworks and technologies we developed and the lessons we learned from delivering effective and dependable software tools in an ever-changing landscape of computing technologies and needs that range today from a desktop computer to tens of thousands of cores offered by commercial clouds. About the speaker: Miron Livny received a B.Sc. degree in Physics and Mat...

  20. Building Mail Server on Distributed Computing System

    Institute of Scientific and Technical Information of China (English)

    Akihiro Shibata; Osamu Hamada; et al.

    2001-01-01

    Electronic mail has become an indispensable function in daily work, and server stability and performance are required. Using DCE and DFS we have built a distributed electronic mail server; that is, servers such as SMTP and IMAP are distributed symmetrically and provide seamless access.

  1. Distributed computing of Seismic Imaging Algorithms

    CERN Document Server

    Emami, Masnida; Jaberi, Nasrin

    2012-01-01

    The primary use of technical computing in the oil and gas industries is for seismic imaging of the earth's subsurface, driven by the business need for making well-informed drilling decisions during petroleum exploration and production. Since each oil/gas well in exploration areas costs several tens of millions of dollars, producing high-quality seismic images in a reasonable time can significantly reduce the risk of drilling a "dry hole". Similarly, these images are important as they can improve the position of wells in a billion-dollar producing oil field. However, seismic imaging is very data- and compute-intensive, as it needs to process terabytes of data and requires Gflop-years of computation (using "flop" to mean floating point operation). Due to the data- and compute-intensive nature of seismic imaging, parallel computing is used to process data and reduce the computation time. With the introduction of cloud computing, the MapReduce programming model has attracted a lot of attention in parallel and di...

  2. A Weibull distribution accrual failure detector for cloud computing.

    Science.gov (United States)

    Liu, Jiaxi; Wu, Zhibo; Wu, Jin; Dong, Jian; Zhao, Yao; Wen, Dongxin

    2017-01-01

    Failure detectors are used to build high availability distributed systems as the fundamental component. To meet the requirement of a complicated large-scale distributed system, accrual failure detectors that can adapt to multiple applications have been studied extensively. However, several implementations of accrual failure detectors do not adapt well to the cloud service environment. To solve this problem, a new accrual failure detector based on Weibull Distribution, called the Weibull Distribution Failure Detector, has been proposed specifically for cloud computing. It can adapt to the dynamic and unexpected network conditions in cloud computing. The performance of the Weibull Distribution Failure Detector is evaluated and compared based on public classical experiment data and cloud computing experiment data. The results show that the Weibull Distribution Failure Detector has better performance in terms of speed and accuracy in unstable scenarios, especially in cloud computing.
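
    The accrual principle can be sketched compactly: model heartbeat inter-arrival times with a Weibull distribution and report a suspicion level that grows with the time elapsed since the last heartbeat. In the sketch below the shape parameter is fixed and the scale is estimated from the sample mean; both are simplifying assumptions rather than the estimation procedure of the paper.

        # Hedged sketch of a Weibull-based accrual failure detector.
        import math
        from collections import deque

        class WeibullAccrualDetector:
            def __init__(self, shape=1.5, window=100):
                self.k = shape                        # assumed Weibull shape parameter
                self.arrivals = deque(maxlen=window)  # recent inter-arrival times
                self.last = None

            def heartbeat(self, now):
                if self.last is not None:
                    self.arrivals.append(now - self.last)
                self.last = now

            def suspicion(self, now):
                """phi = -log10(P(next heartbeat arrives even later than `now`))."""
                if not self.arrivals or self.last is None:
                    return 0.0
                mean = sum(self.arrivals) / len(self.arrivals)
                lam = mean / math.gamma(1.0 + 1.0 / self.k)   # scale from the sample mean
                t = max(now - self.last, 1e-9)
                survival = math.exp(-((t / lam) ** self.k))   # 1 - Weibull CDF
                return -math.log10(max(survival, 1e-300))

        # Usage: call heartbeat() on every received heartbeat and suspect the
        # monitored node once suspicion() exceeds an application-chosen threshold.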

  3. Bringing the CMS distributed computing system into scalable operations

    CERN Document Server

    Belforte, S; Fisk, I; Flix, J; Hernández, J M; Kress, T; Letts, J; Magini, N; Miccio, V; Sciabà, A

    2010-01-01

    Establishing efficient and scalable operations of the CMS distributed computing system critically relies on the proper integration, commissioning and scale testing of the data and workload management tools, the various computing workflows and the underlying computing infrastructure, located at more than 50 computing centres worldwide and interconnected by the Worldwide LHC Computing Grid. Computing challenges periodically undertaken by CMS in the past years with increasing scale and complexity have revealed the need for a sustained effort on computing integration and commissioning activities. The Processing and Data Access (PADA) Task Force was established at the beginning of 2008 within the CMS Computing Program with the mandate of validating the infrastructure for organized processing and user analysis including the sites and the workload and data management tools, validating the distributed production system by performing functionality, reliability and scale tests, helping sites to commission, configure an...

  4. Distributed Function Computation Under Privacy Constraints

    CERN Document Server

    Tyagi, Himanshu

    2012-01-01

    A set of terminals observe correlated data and seek to compute functions of the data using interactive public communication. At the same time, it is required that the value of a private function of the data remains concealed from an eavesdropper observing this communication. In general, the private function and the functions computed by the nodes can be all different. We show that a class of functions are securely computable if and only if the conditional entropy of data given the value of private function is greater than the least rate of interactive communication required for a related multiterminal source-coding task. A single-letter formula is provided for this rate in special cases.
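
    The stated characterization can be rendered schematically as follows; the notation is chosen here only for illustration, with G the private function value, X_1,...,X_m the terminals' observations, and R* the minimum rate of interactive communication for the associated multiterminal source-coding task:

        \[
          g \ \text{is securely computable} \iff H(X_1,\ldots,X_m \mid G) > R^{*} .
        \]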

  5. Triangular Dynamic Architecture for Distributed Computing in a LAN Environment

    CERN Document Server

    Hossain, M Shahriar; Fuad, M Muztaba; Deb, Debzani

    2011-01-01

    A computationally intensive large job, granularized into concurrent pieces and operating in a dynamic environment, should reduce the total processing time. However, distributing jobs across a networked environment is a tedious and difficult task. Job distribution in a Local Area Network based on Triangular Dynamic Architecture (TDA) is a mechanism that establishes a dynamic environment for job distribution, load balancing and distributed processing with minimum interaction from the user. This paper introduces TDA and discusses its architecture and shows the benefits gained by utilizing such an architecture in a distributed computing environment.

  6. A lightweight communication library for distributed computing

    NARCIS (Netherlands)

    Groen, D.; Rieder, S.; Grosso, P.; de Laat, C.; Portegies Zwart, S.

    2010-01-01

    We present MPWide, a platform-independent communication library for performing message passing between computers. Our library allows coupling of several local message passing interface (MPI) applications through a long-distance network and is specifically optimized for such communications. The imple

  7. A comparison of queueing, cluster and distributed computing systems

    Science.gov (United States)

    Kaplan, Joseph A.; Nelson, Michael L.

    1993-01-01

    Using workstation clusters for distributed computing has become popular with the proliferation of inexpensive, powerful workstations. Workstation clusters offer both a cost effective alternative to batch processing and an easy entry into parallel computing. However, a number of workstations on a network does not constitute a cluster. Cluster management software is necessary to harness the collective computing power. A variety of cluster management and queuing systems are compared: Distributed Queueing Systems (DQS), Condor, Load Leveler, Load Balancer, Load Sharing Facility (LSF - formerly Utopia), Distributed Job Manager (DJM), Computing in Distributed Networked Environments (CODINE), and NQS/Exec. The systems differ in their design philosophy and implementation. Based on published reports on the different systems and conversations with the systems' developers and vendors, a comparison of the systems is made on the integral issues of clustered computing.

  8. Computer-Presented Organizational/Memory Aids as Instruction for Solving Pico-Fomi Problems.

    Science.gov (United States)

    Steinberg, Esther R.; And Others

    1985-01-01

    Describes investigation of effectiveness of computer-presented organizational/memory aids (matrix and verbal charts controlled by computer or learner) as instructional technique for solving Pico-Fomi problems, and the acquisition of deductive inference rules when such aids are present. Results indicate chart use control should be adapted to…

  9. Distributed metadata in a high performance computing environment

    Energy Technology Data Exchange (ETDEWEB)

    Bent, John M.; Faibish, Sorin; Zhang, Zhenhua; Liu, Xuezhao; Tang, Haiying

    2017-07-11

    A computer-executable method, system, and computer program product for managing metadata in a distributed storage system, wherein the distributed storage system includes one or more burst buffers enabled to operate with a distributed key-value store, the computer-executable method, system, and computer program product comprising receiving a request for metadata associated with a block of data stored in a first burst buffer of the one or more burst buffers in the distributed storage system, wherein the metadata is associated with a key-value, determining which of the one or more burst buffers stores the requested metadata, and upon determination that a first burst buffer of the one or more burst buffers stores the requested metadata, locating the key-value in a portion of the distributed key-value store accessible from the first burst buffer.
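
    A toy version of the lookup step reads as follows: the burst buffer responsible for a block's metadata is located by hashing the metadata key over the set of burst buffers. The hashing scheme, buffer names and key format are assumptions made for illustration, not the patented mechanism.

        # Illustrative metadata placement and lookup over burst buffers.
        import hashlib

        class DistributedMetadataStore:
            def __init__(self, burst_buffers):
                self.burst_buffers = burst_buffers              # e.g. ["bb0", "bb1", "bb2"]
                self.stores = {bb: {} for bb in burst_buffers}  # per-buffer key-value store

            def _locate(self, key):
                digest = hashlib.sha1(key.encode()).hexdigest()
                return self.burst_buffers[int(digest, 16) % len(self.burst_buffers)]

            def put(self, key, metadata):
                self.stores[self._locate(key)][key] = metadata

            def get(self, key):
                return self.stores[self._locate(key)].get(key)

        store = DistributedMetadataStore(["bb0", "bb1", "bb2"])
        store.put("/scratch/run42/block-0007", {"size": 4096, "offset": 28672})
        print(store.get("/scratch/run42/block-0007"))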

  10. Distributed memory compiler methods for irregular problems: Data copy reuse and runtime partitioning

    Science.gov (United States)

    Das, Raja; Ponnusamy, Ravi; Saltz, Joel; Mavriplis, Dimitri

    1991-01-01

    Outlined here are two methods which we believe will play an important role in any distributed memory compiler able to handle sparse and unstructured problems. We describe how to link runtime partitioners to distributed memory compilers. In our scheme, programmers can implicitly specify how data and loop iterations are to be distributed between processors. This insulates users from having to deal explicitly with potentially complex algorithms that carry out work and data partitioning. We also describe a viable mechanism for tracking and reusing copies of off-processor data. In many programs, several loops access the same off-processor memory locations. As long as it can be verified that the values assigned to off-processor memory locations remain unmodified, we show that we can effectively reuse stored off-processor data. We present experimental data from a 3-D unstructured Euler solver run on iPSC/860 to demonstrate the usefulness of our methods.
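
    The two methods lend themselves to a small inspector/executor sketch: an inspector pass records which off-processor indices a loop will reference, and the gathered copies are cached and reused by later loops as long as the remote values are known to be unmodified. All names and the caching policy below are illustrative assumptions, not the compiler's actual runtime library.

        # Sketch of runtime partitioning bookkeeping and off-processor data reuse.
        def inspect(loop_indices, owned_range):
            lo, hi = owned_range
            off_proc = sorted({i for i in loop_indices if not lo <= i < hi})
            return tuple(off_proc)                    # the communication schedule

        _ghost_cache = {}                             # schedule -> gathered off-processor values

        def gather(schedule, fetch_remote, remote_modified=False):
            if remote_modified:
                _ghost_cache.pop(schedule, None)      # stored copies are stale, refetch
            if schedule not in _ghost_cache:
                _ghost_cache[schedule] = {i: fetch_remote(i) for i in schedule}
            return _ghost_cache[schedule]

        # Two loops touching the same off-processor locations share one gather:
        schedule = inspect([2, 5, 120, 121, 7], owned_range=(0, 100))
        ghosts = gather(schedule, fetch_remote=lambda i: i * 10.0)        # fetches 120, 121
        ghosts_again = gather(schedule, fetch_remote=lambda i: i * 10.0)  # served from the cache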

  11. High speed vision processor with reconfigurable processing element array based on full-custom distributed memory

    Science.gov (United States)

    Chen, Zhe; Yang, Jie; Shi, Cong; Qin, Qi; Liu, Liyuan; Wu, Nanjian

    2016-04-01

    In this paper, a hybrid vision processor based on a compact full-custom distributed memory for near-sensor high-speed image processing is proposed. The proposed processor consists of a reconfigurable processing element (PE) array, a row processor (RP) array, and a dual-core microprocessor. The PE array includes two-dimensional processing elements with a compact full-custom distributed memory. It supports real-time reconfiguration between the PE array and the self-organized map (SOM) neural network. The vision processor is fabricated using a 0.18 µm CMOS technology. The circuit area of the distributed memory is reduced markedly to 1/3 of that of a conventional memory, so that the circuit area of the vision processor is reduced by 44.2%. Experimental results demonstrate that the proposed design achieves correct functions.

  12. Memory

    Science.gov (United States)

    ... it has to decide what is worth remembering. Memory is the process of storing and then remembering this information. There are different types of memory. Short-term memory stores information for a few ...

  13. Integrating network awareness in ATLAS distributed computing

    CERN Document Server

    De, K; The ATLAS collaboration; Klimentov, A; Maeno, T; Mckee, S; Nilsson, P; Petrosyan, A; Vukotic, I; Wenaus, T

    2014-01-01

    A crucial contributor to the success of the massively scaled global computing system that delivers the analysis needs of the LHC experiments is the networking infrastructure upon which the system is built. The experiments have been able to exploit excellent high-bandwidth networking in adapting their computing models for the most efficient utilization of resources. New advanced networking technologies now becoming available such as software defined networks hold the potential of further leveraging the network to optimize workflows and dataflows, through proactive control of the network fabric on the part of high level applications such as experiment workload management and data management systems. End to end monitoring of networking and data flow performance further allows applications to adapt based on real time conditions. We will describe efforts underway in ATLAS on integrating network awareness at the application level, particularly in workload management.

  14. A distributed computing approach to mission operations support. [for spacecraft

    Science.gov (United States)

    Larsen, R. L.

    1975-01-01

    Computing mission operation support includes orbit determination, attitude processing, maneuver computation, resource scheduling, etc. The large-scale third-generation distributed computer network discussed is capable of fulfilling these dynamic requirements. It is shown that distribution of resources and control leads to increased reliability, and exhibits potential for incremental growth. Through functional specialization, a distributed system may be tuned to very specific operational requirements. Fundamental to the approach is the notion of process-to-process communication, which is effected through a high-bandwidth communications network. Both resource-sharing and load-sharing may be realized in the system.

  15. A Logistics Distribution Plan Based on Cloud Computing

    Directory of Open Access Journals (Sweden)

    Zhou Feng

    2013-12-01

    Aiming at the problems of low informatization level, low degree of specialization, high consumption and low efficiency in the logistics and distribution industry, this paper analyzes the characteristics of cloud computing and the actual needs of enterprise logistics. On this basis, it studies in depth a cloud computing architecture for logistics and distribution needs, and then proposes a cloud-based modern logistics solution, which provides a new operating mode for the development of modern logistics.

  16. AGIS: Integration of new technologies used in ATLAS Distributed Computing

    CERN Document Server

    Anisenkov, Alexey; The ATLAS collaboration; Alandes Pradillo, Maria

    2016-01-01

    AGIS is the information system designed to integrate configuration and status information about resources, services and topology of the computing infrastructure used by ATLAS Distributed Computing (ADC) applications and services. In this note, we describe the evolution and the recent developments of AGIS functionalities related to the integration of new technologies that have recently become widely used in ATLAS Computing, such as flexible utilization of opportunistic Cloud and HPC resources, integration of ObjectStore services for the Distributed Data Management (Rucio) and ATLAS workload management (PanDA) systems, declaration of the unified storage protocols required for PanDA Pilot site movers, and others.

  17. Computer Generation of Fourier Transform Libraries for Distributed Memory Architectures

    Science.gov (United States)

    2010-12-01

    Cooley-Tukey FFT. The first "fast" algorithm for the DFT was discovered by Cooley and Tukey [Cooley and Tukey, 1965] and is often referred to as "the ..."; the underlying idea goes back to Carl Friedrich Gauss, but his work was not widely recognized [Heideman et al., 1985].

  18. CUDA optimization strategies for compute- and memory-bound neuroimaging algorithms.

    Science.gov (United States)

    Lee, Daren; Dinov, Ivo; Dong, Bin; Gutman, Boris; Yanovsky, Igor; Toga, Arthur W

    2012-06-01

    As neuroimaging algorithms and technology continue to grow faster than CPU performance in complexity and image resolution, data-parallel computing methods will be increasingly important. The high performance, data-parallel architecture of modern graphical processing units (GPUs) can reduce computational times by orders of magnitude. However, its massively threaded architecture introduces challenges when GPU resources are exceeded. This paper presents optimization strategies for compute- and memory-bound algorithms for the CUDA architecture. For compute-bound algorithms, the registers are reduced through variable reuse via shared memory and the data throughput is increased through heavier thread workloads and maximizing the thread configuration for a single thread block per multiprocessor. For memory-bound algorithms, fitting the data into the fast but limited GPU resources is achieved through reorganizing the data into self-contained structures and employing a multi-pass approach. Memory latencies are reduced by selecting memory resources whose cache performance are optimized for the algorithm's access patterns. We demonstrate the strategies on two computationally expensive algorithms and achieve optimized GPU implementations that perform up to 6× faster than unoptimized ones. Compared to CPU implementations, we achieve peak GPU speedups of 129× for the 3D unbiased nonlinear image registration technique and 93× for the non-local means surface denoising algorithm. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.

  19. A computational predictor of human episodic memory based on a theta phase precession network.

    Directory of Open Access Journals (Sweden)

    Naoyuki Sato

    In the rodent hippocampus, a phase precession phenomenon of place cell firing with respect to the local field potential (LFP) theta is called "theta phase precession" and is considered to contribute to memory formation with spike time dependent plasticity (STDP). On the other hand, in the primate hippocampus, the existence of theta phase precession is unclear. Our computational studies have demonstrated that theta phase precession dynamics could contribute to primate-hippocampal dependent memory formation, such as object-place association memory. In this paper, we evaluate human theta phase precession by using a theory-experiment combined analysis. Human memory recall of object-place associations was analyzed by an individual hippocampal network simulated by theta phase precession dynamics of human eye movement and EEG data during memory encoding. It was found that the computational recall of the resultant network is significantly correlated with human memory recall performance, while other computational predictors without theta phase precession are not significantly correlated with subsequent memory recall. Moreover, the correlation is larger than the correlation between human recall and traditional experimental predictors. These results indicate that theta phase precession dynamics are necessary for the better prediction of human recall performance with eye movement and EEG data. In this analysis, theta phase precession dynamics appear useful for the extraction of memory-dependent components from the spatio-temporal pattern of eye movement and EEG data as an associative network. Theta phase precession may be a common neural dynamic between rodents and humans for the formation of environmental memories.

  20. Tactical Airborne Distributed Computing and Networks

    Science.gov (United States)

    1981-10-01

    ... overall system functions that require a large amount of logic. The management of the digibus, with its redundant architecture, is a particular case ... overall system management, and more particularly two aspects: the management of commands, displays and modes, and the monitoring ... Benefits credited to distributed data processing are not capable of being realized within the current state-of-the-art. Preparation of military...

  1. Predictive access control for distributed computation

    DEFF Research Database (Denmark)

    Yang, Fan; Hankin, Chris; Nielson, Flemming

    2013-01-01

    We show how to use aspect-oriented programming to separate security and trust issues from the logical design of mobile, distributed systems. The main challenge is how to enforce various types of security policies, in particular predictive access control policies — policies based on the future...... behavior of a program. A novel feature of our approach is that we can define policies concerning secondary use of data....

  2. LaRC local area networks to support distributed computing

    Science.gov (United States)

    Riddle, E. P.

    1984-01-01

    The Langley Research Center's (LaRC) Local Area Network (LAN) effort is discussed. LaRC initiated the development of a LAN to support a growing distributed computing environment at the Center. The purpose of the network is to provide an improved capability (over interactive and RJE terminal access) for sharing multivendor computer resources. Specifically, the network will provide a data highway for the transfer of files between mainframe computers, minicomputers, work stations, and personal computers. An important influence on the overall network design was the vital need of LaRC researchers to efficiently utilize the large CDC mainframe computers in the central scientific computing facility. Although there was a steady migration from a centralized to a distributed computing environment at LaRC in recent years, the work load on the central resources increased. Major emphasis in the network design was on communication with the central resources within the distributed environment. The network to be implemented will allow researchers to utilize the central resources, distributed minicomputers, work stations, and personal computers to obtain the proper level of computing power to efficiently perform their jobs.

  3. Simulation model of load balancing in distributed computing systems

    Science.gov (United States)

    Botygin, I. A.; Popov, V. N.; Frolov, S. G.

    2017-02-01

    The availability of high-performance computing, high-speed data transfer over the network and the widespread use of software for design and pre-production in mechanical engineering have led to a situation where large industrial enterprises and small engineering companies alike implement complex computer systems for efficient solution of production and management tasks. Such computer systems are generally built on the basis of distributed heterogeneous computer systems. The analytical problems solved by such systems are the key models of research, but the system-wide problems of efficient distribution (balancing) of the computational load and of the accommodation of input, intermediate and output databases are no less important. The main tasks of this balancing system are load and condition monitoring of compute nodes, and the selection of a node to which a user's request is forwarded in accordance with a predetermined algorithm. Load balancing is one of the most widely used methods of increasing the productivity of distributed computing systems through the optimal allocation of tasks between the computer system nodes. Therefore, the development of methods and algorithms for computing optimal scheduling in a distributed system, dynamically changing its infrastructure, is an important task.
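
    The node-selection step such a balancer performs can be reduced to a few lines: forward each request to the least-loaded node among those currently reported healthy. The monitoring data and the selection rule below are a made-up example of one such "predetermined algorithm".

        # Toy node-selection rule for a load balancer.
        def select_node(nodes):
            healthy = [n for n in nodes if n["healthy"]]
            if not healthy:
                raise RuntimeError("no healthy compute node available")
            return min(healthy, key=lambda n: n["load"])

        nodes = [
            {"name": "node-1", "load": 0.72, "healthy": True},
            {"name": "node-2", "load": 0.31, "healthy": True},
            {"name": "node-3", "load": 0.10, "healthy": False},
        ]
        print(select_node(nodes)["name"])    # -> node-2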

  4. Attention and visual memory in visualization and computer graphics.

    Science.gov (United States)

    Healey, Christopher G; Enns, James T

    2012-07-01

    A fundamental goal of visualization is to produce images of data that support visual analysis, exploration, and discovery of novel insights. An important consideration during visualization design is the role of human visual perception. How we "see" details in an image can directly impact a viewer's efficiency and effectiveness. This paper surveys research on attention and visual perception, with a specific focus on results that have direct relevance to visualization and visual analytics. We discuss theories of low-level visual perception, then show how these findings form a foundation for more recent work on visual memory and visual attention. We conclude with a brief overview of how knowledge of visual attention and visual memory is being applied in visualization and graphics. We also discuss how challenges in visualization are motivating research in psychophysics.

  5. A distributed computing model for telemetry data processing

    Science.gov (United States)

    Barry, Matthew R.; Scott, Kevin L.; Weismuller, Steven P.

    1994-01-01

    We present a new approach to distributing processed telemetry data among spacecraft flight controllers within the control centers at NASA's Johnson Space Center. This approach facilitates the development of application programs which integrate spacecraft-telemetered data and ground-based synthesized data, then distributes this information to flight controllers for analysis and decision-making. The new approach combines various distributed computing models into one hybrid distributed computing model. The model employs both client-server and peer-to-peer distributed computing models cooperating to provide users with information throughout a diverse operations environment. Specifically, it provides an attractive foundation upon which we are building critical real-time monitoring and control applications, while simultaneously lending itself to peripheral applications in playback operations, mission preparations, flight controller training, and program development and verification. We have realized the hybrid distributed computing model through an information sharing protocol. We shall describe the motivations that inspired us to create this protocol, along with a brief conceptual description of the distributed computing models it employs. We describe the protocol design in more detail, discussing many of the program design considerations and techniques we have adopted. Finally, we describe how this model is especially suitable for supporting the implementation of distributed expert system applications.

  6. A distributed computing model for telemetry data processing

    Science.gov (United States)

    Barry, Matthew R.; Scott, Kevin L.; Weismuller, Steven P.

    1994-05-01

    We present a new approach to distributing processed telemetry data among spacecraft flight controllers within the control centers at NASA's Johnson Space Center. This approach facilitates the development of application programs which integrate spacecraft-telemetered data and ground-based synthesized data, then distributes this information to flight controllers for analysis and decision-making. The new approach combines various distributed computing models into one hybrid distributed computing model. The model employs both client-server and peer-to-peer distributed computing models cooperating to provide users with information throughout a diverse operations environment. Specifically, it provides an attractive foundation upon which we are building critical real-time monitoring and control applications, while simultaneously lending itself to peripheral applications in playback operations, mission preparations, flight controller training, and program development and verification. We have realized the hybrid distributed computing model through an information sharing protocol. We shall describe the motivations that inspired us to create this protocol, along with a brief conceptual description of the distributed computing models it employs. We describe the protocol design in more detail, discussing many of the program design considerations and techniques we have adopted. Finally, we describe how this model is especially suitable for supporting the implementation of distributed expert system applications.
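
    At its core, the information-sharing protocol in these two records combines a server distributing processed telemetry with peer applications subscribing to the parameters they monitor; a minimal publish/subscribe sketch conveys the flavour. The class and parameter names are invented for illustration and are not the actual protocol.

        # Minimal publish/subscribe sketch of a telemetry information-sharing bus.
        from collections import defaultdict

        class TelemetryBus:
            def __init__(self):
                self.subscribers = defaultdict(list)      # parameter -> callbacks

            def subscribe(self, parameter, callback):
                self.subscribers[parameter].append(callback)

            def publish(self, parameter, value):
                for callback in self.subscribers[parameter]:
                    callback(parameter, value)

        bus = TelemetryBus()
        bus.subscribe("cabin_pressure", lambda p, v: print(f"{p} = {v}"))
        bus.publish("cabin_pressure", 101.3)      # -> cabin_pressure = 101.3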

  7. A randomized controlled trial of the effectiveness of handheld computers for improving everyday memory functioning in patients with memory impairments after acquired brain injury.

    Science.gov (United States)

    Lannin, Natasha; Carr, Belinda; Allaous, Jeanine; Mackenzie, Bronwyn; Falcon, Alex; Tate, Robyn

    2014-05-01

To determine the effectiveness of personal digital assistant devices on achievement of memory and organization goals in patients with poor memory after acquired brain injury. Assessor-blinded randomized controlled trial. Specialist brain injury rehabilitation hospital (inpatients and outpatients). Adults with acquired brain impairments (85% traumatic brain injury; aged ≥17 years) who were assessed as having functional memory impairment on the Rivermead Behavioural Memory Test (General Memory Index). The intervention was eight weeks of training and support from an occupational therapist in using a personal digital assistant to compensate for memory failures; the control intervention was standard rehabilitation, including use of non-electronic memory aids. Outcome measures were the Goal Attainment Scale, which assessed achievement of participants' daily memory functioning goals and caregiver perception of memory functioning, and the General Frequency of Forgetting subscale of the Memory Functioning Questionnaire, administered at baseline (pre-randomization) and post-intervention (eight weeks later). Forty-two participants with memory impairment were recruited. Use of a personal digital assistant led to greater achievement of functional memory goals (mean difference 1.6 (95% confidence interval (CI) 1.0 to 2.2), P = 0.0001) and improvement on the General Frequency of Forgetting subscale (mean difference 12.5 (95% CI 2.0 to 22.9), P = 0.021). Occupational therapy training in the use of a handheld computer improved patients' daily memory function more than standard rehabilitation.

  8. Optical interconnection network for parallel access to multi-rank memory in future computing systems.

    Science.gov (United States)

    Wang, Kang; Gu, Huaxi; Yang, Yintang; Wang, Kun

    2015-08-10

    With the number of cores increasing, there is an emerging need for a high-bandwidth low-latency interconnection network, serving core-to-memory communication. In this paper, aiming at the goal of simultaneous access to multi-rank memory, we propose an optical interconnection network for core-to-memory communication. In the proposed network, the wavelength usage is delicately arranged so that cores can communicate with different ranks at the same time and broadcast for flow control can be achieved. A distributed memory controller architecture that works in a pipeline mode is also designed for efficient optical communication and transaction address processes. The scaling method and wavelength assignment for the proposed network are investigated. Compared with traditional electronic bus-based core-to-memory communication, the simulation results based on the PARSEC benchmark show that the bandwidth enhancement and latency reduction are apparent.

  9. Foundations of distributed multiscale computing: formalization, specification, and analysis

    NARCIS (Netherlands)

    Borgdorff, J.; Falcone, J.-L.; Lorenz, E.; Bona-Casas, C.; Chopard, B.; Hoekstra, A.G.

    2013-01-01

    Inherently complex problems from many scientific disciplines require a multiscale modeling approach. Yet its practical contents remain unclear and inconsistent. Moreover, multiscale models can be very computationally expensive, and may have potential to be executed on distributed infrastructure. In

  10. Modeling Workflow Management in a Distributed Computing System ...

    African Journals Online (AJOL)

    Modeling Workflow Management in a Distributed Computing System Using Petri Nets. ... who use it to share information more rapidly and increases their productivity. ... Petri nets are an established tool for modelling and analyzing processes.

  11. 7th International Symposium on Intelligent Distributed Computing

    CERN Document Server

    Jung, Jason; Badica, Costin

    2014-01-01

This book represents the combined peer-reviewed proceedings of the Seventh International Symposium on Intelligent Distributed Computing - IDC-2013, of the Second Workshop on Agents for Clouds - A4C-2013, of the Fifth International Workshop on Multi-Agent Systems Technology and Semantics - MASTS-2013, and of the International Workshop on Intelligent Robots - iR-2013. All the events were held in Prague, Czech Republic during September 4-6, 2013. The 41 contributions published in this book address many topics related to theory and applications of intelligent distributed computing and multi-agent systems, including: agent-based data processing, ambient intelligence, bio-informatics, collaborative systems, cryptography and security, distributed algorithms, grid and cloud computing, information extraction, intelligent robotics, knowledge management, linked data, mobile agents, ontologies, pervasive computing, self-organizing systems, peer-to-peer computing, social networks and trust, and swarm intelligence.

  12. Using Sloane Rulers for Optimal Recovery Schemes in Distributed Computing

    Directory of Open Access Journals (Sweden)

    R. Delhi Babu

    2009-12-01

Clusters and distributed systems offer fault tolerance and high performance through load sharing, and are thus attractive in real-time applications. When all computers are up and running, we would like the load to be evenly distributed among the computers. When one or more computers fail, the load must be redistributed. The redistribution is determined by the recovery scheme. The recovery scheme should keep the load as evenly distributed as possible even when the most unfavorable combinations of computers break down, i.e., we want to optimize the worst-case behavior. In this paper we compare the worst-case behavior of schemes such as the Modulo ruler, Golomb ruler, Greedy sequence and Log sequence with the worst-case behavior of the Sloane sequence. Finally, we observe that the Sloane scheme performs better than all the other schemes. Keywords: fault tolerance; high performance computing; cluster technique; recovery schemes; Sloane sequence.
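    As a rough illustration of what optimizing the worst-case behavior of a recovery scheme means, the sketch below enumerates failure combinations and reports the largest load any surviving node ends up with, under the simplifying assumption that a scheme is a list of offsets tried in order until a surviving node is found. The offsets and node counts are arbitrary examples, not the schemes evaluated in the paper.

```python
from itertools import combinations


def redistribute(num_nodes, failed, offsets):
    """Load per surviving node after handing each failed node's unit load onward."""
    load = {n: 1 for n in range(num_nodes) if n not in failed}
    for f in failed:
        for off in offsets:
            target = (f + off) % num_nodes
            if target not in failed:
                load[target] += 1
                break
    return load


def worst_case(num_nodes, num_failures, offsets):
    """Maximum load on any surviving node over all failure combinations."""
    return max(
        max(redistribute(num_nodes, set(c), offsets).values())
        for c in combinations(range(num_nodes), num_failures)
    )


consecutive = [1, 2, 3, 4, 5]          # naive scheme: try the next few neighbours
spread_out = [1, 3, 7, 12, 20]         # ruler-like offsets (illustrative, not optimal)
for name, scheme in [("consecutive", consecutive), ("spread-out", spread_out)]:
    print(name, worst_case(num_nodes=16, num_failures=3, offsets=scheme))
```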

  13. The process group approach to reliable distributed computing

    Science.gov (United States)

    Birman, Kenneth P.

    1992-01-01

The difficulty of developing reliable distributed software is an impediment to applying distributed computing technology in many settings. Experience with the ISIS system suggests that a structured approach based on virtually synchronous process groups yields systems that are substantially easier to develop, exploit sophisticated forms of cooperative computation, and achieve high reliability. Six years of research on ISIS are reviewed, describing the model, its implementation challenges, and the types of applications to which ISIS has been applied.

  14. Asynchronous Distributed Execution of Fixpoint-Based Computational Fields

    DEFF Research Database (Denmark)

    Lluch Lafuente, Alberto; Loreti, Michele; Montanari, Ugo

    2016-01-01

    Coordination is essential for dynamic distributed systems whose components exhibit interactive and autonomous behaviors. Spatially distributed, locally interacting, propagating computational fields are particularly appealing for allowing components to join and leave with little or no overhead....... Computational fields are a key ingredient of aggregate programming, a promising software engineering methodology particularly relevant for the Internet of Things. In our approach, space topology is represented by a fixed graph-shaped field, namely a network with attributes on both nodes and arcs, where arcs...

  15. Synthetic analog and digital circuits for cellular computation and memory

    OpenAIRE

    Purcell, Oliver; Lu, Timothy K.

    2014-01-01

    Biological computation is a major area of focus in synthetic biology because it has the potential to enable a wide range of applications. Synthetic biologists have applied engineering concepts to biological systems in order to construct progressively more complex gene circuits capable of processing information in living cells. Here, we review the current state of computational genetic circuits and describe artificial gene circuits that perform digital and analog computation. We then discuss r...

  16. 9th International Symposium on Intelligent Distributed Computing

    CERN Document Server

    Camacho, David; Analide, Cesar; Seghrouchni, Amal; Badica, Costin

    2016-01-01

    This book represents the combined peer-reviewed proceedings of the ninth International Symposium on Intelligent Distributed Computing – IDC’2015, of the Workshop on Cyber Security and Resilience of Large-Scale Systems – WSRL’2015, and of the International Workshop on Future Internet and Smart Networks – FI&SN’2015. All the events were held in Guimarães, Portugal during October 7th-9th, 2015. The 46 contributions published in this book address many topics related to theory and applications of intelligent distributed computing, including: Intelligent Distributed Agent-Based Systems, Ambient Intelligence and Social Networks, Computational Sustainability, Intelligent Distributed Knowledge Representation and Processing, Smart Networks, Networked Intelligence and Intelligent Distributed Applications, amongst others.

  17. Integration of Cloud resources in the LHCb Distributed Computing

    CERN Document Server

    Ubeda Garcia, Mario; Stagni, Federico; Cabarrou, Baptiste; Rauschmayr, Nathalie; Charpentier, Philippe; Closier, Joel

    2014-01-01

    This contribution describes how Cloud resources have been integrated in the LHCb Distributed Computing. LHCb is using its specific Dirac extension (LHCbDirac) as an interware for its Distributed Computing. So far, it was seamlessly integrating Grid resources and Computer clusters. The cloud extension of DIRAC (VMDIRAC) allows the integration of Cloud computing infrastructures. It is able to interact with multiple types of infrastructures in commercial and institutional clouds, supported by multiple interfaces (Amazon EC2, OpenNebula, OpenStack and CloudStack) – instantiates, monitors and manages Virtual Machines running on this aggregation of Cloud resources. Moreover, specifications for institutional Cloud resources proposed by Worldwide LHC Computing Grid (WLCG), mainly by the High Energy Physics Unix Information Exchange (HEPiX) group, have been taken into account. Several initiatives and computing resource providers in the eScience environment have already deployed IaaS in production during 2013. Keepin...

  18. The computational implementation of the landscape model: modeling inferential processes and memory representations of text comprehension.

    Science.gov (United States)

    Tzeng, Yuhtsuen; van den Broek, Paul; Kendeou, Panayiota; Lee, Chengyuan

    2005-05-01

    The complexity of text comprehension demands a computational approach to describe the cognitive processes involved. In this article, we present the computational implementation of the landscape model of reading. This model captures both on-line comprehension processes during reading and the off-line memory representation after reading is completed, incorporating both memory-based and coherence-based mechanisms of comprehension. The overall architecture and specific parameters of the program are described, and a running example is provided. Several studies comparing computational and behavioral data indicate that the implemented model is able to account for cycle-by-cycle comprehension processes and memory for a variety of text types and reading situations.
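    The following is a highly simplified sketch of the cycle-by-cycle activation and connection-strength updating that the landscape model describes: concepts activated in each reading cycle decay over subsequent cycles, and connections grow with co-activation. The decay rate, learning rate, and example cycles are illustrative assumptions, not the published implementation or its parameters.

```python
from collections import defaultdict

DECAY = 0.5        # carry-over of activation between cycles (assumed value)
LEARN_RATE = 0.1   # connection growth per unit co-activation (assumed value)


def run_landscape(cycles):
    activation = defaultdict(float)     # on-line activation per concept
    connections = defaultdict(float)    # off-line memory representation
    for cycle in cycles:
        for concept in list(activation):            # decay previous activations
            activation[concept] *= DECAY
        for concept, value in cycle.items():        # add this cycle's input
            activation[concept] = max(activation[concept], value)
        active = [c for c, a in activation.items() if a > 0]
        for i, ci in enumerate(active):             # strengthen co-activated pairs
            for cj in active[i + 1:]:
                connections[ci, cj] += LEARN_RATE * activation[ci] * activation[cj]
    return activation, connections


cycles = [{"knight": 5, "dragon": 3},
          {"dragon": 5, "fire": 4},
          {"knight": 4, "sword": 5}]
final_activation, memory = run_landscape(cycles)
print(sorted(memory.items(), key=lambda kv: -kv[1])[:3])
```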

  19. A Web-based Distributed Voluntary Computing Platform for Large Scale Hydrological Computations

    Science.gov (United States)

    Demir, I.; Agliamzanov, R.

    2014-12-01

    Distributed volunteer computing can enable researchers and scientist to form large parallel computing environments to utilize the computing power of the millions of computers on the Internet, and use them towards running large scale environmental simulations and models to serve the common good of local communities and the world. Recent developments in web technologies and standards allow client-side scripting languages to run at speeds close to native application, and utilize the power of Graphics Processing Units (GPU). Using a client-side scripting language like JavaScript, we have developed an open distributed computing framework that makes it easy for researchers to write their own hydrologic models, and run them on volunteer computers. Users will easily enable their websites for visitors to volunteer sharing their computer resources to contribute running advanced hydrological models and simulations. Using a web-based system allows users to start volunteering their computational resources within seconds without installing any software. The framework distributes the model simulation to thousands of nodes in small spatial and computational sizes. A relational database system is utilized for managing data connections and queue management for the distributed computing nodes. In this paper, we present a web-based distributed volunteer computing platform to enable large scale hydrological simulations and model runs in an open and integrated environment.
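    The work-queue pattern described above can be illustrated in a few lines: the simulation domain is cut into small chunks, each chunk becomes a task, and results are collected as volunteers return them. This single-process Python stand-in only mirrors the control flow; the actual framework is browser-based JavaScript backed by a relational database, and the chunk sizes and model below are made up.

```python
import queue

# cut a catchment into small spatial chunks; chunk ids and rainfall are illustrative
tasks = queue.Queue()
for chunk_id in range(12):
    tasks.put({"chunk": chunk_id, "rainfall_mm": 5.0 + chunk_id})


def volunteer_worker(task):
    """Stand-in for a small model run executed in a volunteer's browser."""
    return {"chunk": task["chunk"], "runoff_mm": 0.4 * task["rainfall_mm"]}


results = []
while not tasks.empty():
    results.append(volunteer_worker(tasks.get()))

print(f"collected {len(results)} chunk results, "
      f"total runoff {sum(r['runoff_mm'] for r in results):.1f} mm")
```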

  20. Developing a Distributed Computing Architecture at Arizona State University.

    Science.gov (United States)

    Armann, Neil; And Others

    1994-01-01

    Development of Arizona State University's computing architecture, designed to ensure that all new distributed computing pieces will work together, is described. Aspects discussed include the business rationale, the general architectural approach, characteristics and objectives of the architecture, specific services, and impact on the university…

  1. Pladipus Enables Universal Distributed Computing in Proteomics Bioinformatics.

    Science.gov (United States)

    Verheggen, Kenneth; Maddelein, Davy; Hulstaert, Niels; Martens, Lennart; Barsnes, Harald; Vaudel, Marc

    2016-03-04

    The use of proteomics bioinformatics substantially contributes to an improved understanding of proteomes, but this novel and in-depth knowledge comes at the cost of increased computational complexity. Parallelization across multiple computers, a strategy termed distributed computing, can be used to handle this increased complexity; however, setting up and maintaining a distributed computing infrastructure requires resources and skills that are not readily available to most research groups. Here we propose a free and open-source framework named Pladipus that greatly facilitates the establishment of distributed computing networks for proteomics bioinformatics tools. Pladipus is straightforward to install and operate thanks to its user-friendly graphical interface, allowing complex bioinformatics tasks to be run easily on a network instead of a single computer. As a result, any researcher can benefit from the increased computational efficiency provided by distributed computing, hence empowering them to tackle more complex bioinformatics challenges. Notably, it enables any research group to perform large-scale reprocessing of publicly available proteomics data, thus supporting the scientific community in mining these data for novel discoveries.

  2. AGIS: Integration of new technologies used in ATLAS Distributed Computing

    CERN Document Server

    Anisenkov, Alexey; The ATLAS collaboration

    2017-01-01

    AGIS is the information system designed to integrate configuration and status information about resources, services and topology of the computing infrastructure used by ATLAS Distributed Computing (ADC) applications and services. In this note, we describe the evolution and the recent developments of AGIS functionalities.

  3. Toward Distributed Service Discovery in Pervasive Computing Environments

    Science.gov (United States)

    2006-02-01

  4. Exploiting Inter-node Pipelining Parallelism in Distributed Memory Systems

    Institute of Scientific and Technical Information of China (English)

    陈莉; 张兆庆; 冯晓兵

    2002-01-01

Maximizing parallelism and minimizing communication overhead are important issues for distributed memory systems. Communication and data redistribution cannot be avoided even when global optimization of data distribution and computation decomposition is considered. A new approach based on loop fusion is presented that exploits pipelining parallelism, so that communication overhead can be hidden and data redistribution can be avoided. The technique extracts pipelining from complex loop structures, which distinguishes it from traditional pipelining techniques. Experiments show that the technique is superior to other optimizations.
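    The general idea of hiding communication behind computation through pipelining can be sketched with non-blocking MPI calls, for example with mpi4py as below (assuming an MPI installation and a launch such as `mpiexec -n 4 python pipeline.py`). This illustrates the overlap itself, not the paper's compiler transformation based on loop fusion; the block sizes and the "computation" are placeholders.

```python
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

num_blocks = 8
block = np.full(1024, float(rank))

for b in range(num_blocks):
    reqs = []
    # start a non-blocking send of the current block to the downstream node ...
    if rank + 1 < size:
        reqs.append(comm.Isend(block, dest=rank + 1, tag=b))
    # ... and post the receive from the upstream node
    if rank > 0:
        recv_buf = np.empty_like(block)
        reqs.append(comm.Irecv(recv_buf, source=rank - 1, tag=b))
    # local computation proceeds while the messages are in flight
    next_block = np.sqrt(np.abs(block) + b)
    MPI.Request.Waitall(reqs)
    if rank > 0:
        next_block += recv_buf            # consume upstream data once it has arrived
    block = next_block

if rank == size - 1:
    print("pipeline finished, checksum:", float(block.sum()))
```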

  5. A database for on-line event analysis on a distributed memory machine

    CERN Document Server

    Argante, E; Van der Stok, P D V; Willers, Ian Malcolm

    1995-01-01

Parallel in-memory databases can enhance the structuring and parallelization of programs used in High Energy Physics (HEP). Efficient database access routines are used as communication primitives which hide the communication topology, in contrast to more explicit communication libraries such as PVM or MPI. A parallel in-memory database, called SPIDER, has been implemented on a 32-node Meiko CS-2 distributed memory machine. The SPIDER primitives generate a lower overhead than that generated by PVM or MPI. The event reconstruction program CPREAD of the CPLEAR experiment has been used as a test case. Performance measurements were made at the event rate generated by CPLEAR.

  6. Synchronization and associative memory of FitzHugh-Nagumo neuronal networks with randomly distributed time delays

    Energy Technology Data Exchange (ETDEWEB)

    Peng, J H; Wu, Y J [School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237 (China); Yu, H J [Department of Mechanics, Shanghai Jiao Tong University, Shanghai 200240 (China)], E-mail: jhpeng@ecust.edu.cn

    2008-02-15

Synchronization and associative memory in a neural network composed of the widely discussed FitzHugh-Nagumo neurons are investigated in this paper. Reflecting the microscopic biological structure of the neural system, the couplings among the neurons are accompanied by randomly distributed time delays, which model the times needed for pulses to propagate along the axons from the presynaptic neurons to the postsynaptic neurons. The memory is represented in the spatiotemporal firing pattern of the neurons, and memory retrieval is accomplished with the fluctuations of the noise in the system.
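    For readers unfamiliar with the model, the sketch below integrates two FitzHugh-Nagumo neurons coupled through randomly chosen transmission delays, using a simple Euler scheme and a history buffer for the delayed terms. The parameter values, coupling form, and network size are illustrative assumptions rather than the configuration studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
dt, steps = 0.01, 20000
a, b, eps, I_ext, k = 0.7, 0.8, 0.08, 0.5, 0.3   # assumed FHN and coupling parameters

delay_steps = rng.integers(100, 500, size=2)     # randomly chosen delays (in steps)
max_delay = int(delay_steps.max())
hist = np.zeros((2, steps + max_delay))          # membrane-potential history buffer
v = np.array([0.1, -0.1])                        # fast (membrane) variables
w = np.zeros(2)                                  # slow recovery variables

for t in range(steps):
    # delayed membrane potential of the *other* neuron, read from the history
    delayed = np.array([hist[1, t + max_delay - delay_steps[0]],
                        hist[0, t + max_delay - delay_steps[1]]])
    dv = v - v**3 / 3 - w + I_ext + k * (delayed - v)
    dw = eps * (v + a - b * w)
    v, w = v + dt * dv, w + dt * dw
    hist[:, t + max_delay] = v

print("final membrane potentials:", v)
```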

  7. Distriblets: Java-Based Distributed Computing on the Web.

    Science.gov (United States)

    Finkel, David; Wills, Craig E.; Brennan, Brian; Brennan, Chris

    1999-01-01

Describes a system, written in the Java programming language, for using the World Wide Web to distribute computational tasks to multiple hosts on the Web. Describes the programs written to carry out the load distribution, the structure of a "distriblet" class, and experiences in using this system. (Author/LRW)

  8. Load flow computations in hybrid transmission - distributed power systems

    NARCIS (Netherlands)

    Wobbes, E.D.; Lahaye, D.J.P.

    2013-01-01

    We interconnect transmission and distribution power systems and perform load flow computations in the hybrid network. In the largest example we managed to build, fifty copies of a distribution network consisting of fifteen nodes is connected to the UCTE study model, resulting in a system consisting

  9. Understanding and Improving the Performance Consistency of Distributed Computing Systems

    NARCIS (Netherlands)

    Yigitbasi, M.N.

    2012-01-01

    With the increasing adoption of distributed systems in both academia and industry, and with the increasing computational and storage requirements of distributed applications, users inevitably demand more from these systems. Moreover, users also depend on these systems for latency and throughput sens

  11. Perspectives on distributed computing : thirty people, four user types, and the distributed computing user experience.

    Energy Technology Data Exchange (ETDEWEB)

    Childers, L.; Liming, L.; Foster, I.; Mathematics and Computer Science; Univ. of Chicago

    2008-10-15

This report summarizes the methodology and results of a user perspectives study conducted by the Community Driven Improvement of Globus Software (CDIGS) project. The purpose of the study was to document the work-related goals and challenges facing today's scientific technology users, to record their perspectives on Globus software and the distributed-computing ecosystem, and to provide recommendations to the Globus community based on the observations. Globus is a set of open source software components intended to provide a framework for collaborative computational science activities. Rather than attempting to characterize all users or potential users of Globus software, our strategy has been to speak in detail with a small group of individuals in the scientific community whose work appears to be the kind that could benefit from Globus software, learn as much as possible about their work goals and the challenges they face, and describe what we found. The result is a set of statements about specific individuals' experiences. We do not claim that these are representative of a potential user community, but we do claim to have found commonalities and differences among the interviewees that may be reflected in the user community as a whole. We present these as a series of hypotheses that can be tested by subsequent studies, and we offer recommendations to Globus developers based on the assumption that these hypotheses are representative. Specifically, we conducted interviews with thirty technology users in the scientific community. We included both people who have used Globus software and those who have not. We made a point of including individuals who represent a variety of roles in scientific projects, for example, scientists, software developers, engineers, and infrastructure providers. The following material is included in this report: (1) A summary of the reported work-related goals, significant issues, and points of satisfaction with the use of Globus software

  12. Hierarchical Temporal Memory Based on Spin-Neurons and Resistive Memory for Energy-Efficient Brain-Inspired Computing.

    Science.gov (United States)

    Fan, Deliang; Sharad, Mrigank; Sengupta, Abhronil; Roy, Kaushik

    2016-09-01

    Hierarchical temporal memory (HTM) tries to mimic the computing in cerebral neocortex. It identifies spatial and temporal patterns in the input for making inferences. This may require a large number of computationally expensive tasks, such as dot product evaluations. Nanodevices that can provide direct mapping for such primitives are of great interest. In this paper, we propose that the computing blocks for HTM can be mapped using low-voltage, magnetometallic spin-neurons combined with an emerging resistive crossbar network, which involves a comprehensive design at algorithm, architecture, circuit, and device levels. Simulation results show the possibility of more than 200× lower energy as compared with a 45-nm CMOS ASIC design.
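    The dot-product primitive mentioned above is exactly what a resistive crossbar computes in the analog domain: applying a voltage vector to the rows yields column currents I = G^T V by Ohm's and Kirchhoff's laws, i.e. one multiply-accumulate per column. The tiny numerical illustration below uses arbitrary conductances and reduces the spin-neuron to a simple threshold, so it is only a conceptual stand-in for the proposed circuits.

```python
import numpy as np

G = np.array([[0.2, 0.5, 0.1],      # conductances programmed into the crossbar
              [0.4, 0.1, 0.3],
              [0.3, 0.2, 0.6]])
V = np.array([1.0, -0.5, 0.25])     # input voltages applied to the rows

I = G.T @ V                          # column currents: one dot product per column
fired = I > 0.1                      # toy threshold standing in for the spin-neuron
print("column currents:", I)
print("neurons fired:  ", fired)
```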

  13. Arcade: A Web-Java Based Framework for Distributed Computing

    Science.gov (United States)

    Chen, Zhikai; Maly, Kurt; Mehrotra, Piyush; Zubair, Mohammad; Bushnell, Dennis M. (Technical Monitor)

    2000-01-01

    Distributed heterogeneous environments are being increasingly used to execute a variety of large size simulations and computational problems. We are developing Arcade, a web-based environment to design, execute, monitor, and control distributed applications. These targeted applications consist of independent heterogeneous modules which can be executed on a distributed heterogeneous environment. In this paper we describe the overall design of the system and discuss the prototype implementation of the core functionalities required to support such a framework.

  14. Distributed computing for membrane-based modeling of action potential propagation.

    Science.gov (United States)

    Porras, D; Rogers, J M; Smith, W M; Pollard, A E

    2000-08-01

    Action potential propagation simulations with physiologic membrane currents and macroscopic tissue dimensions are computationally expensive. We, therefore, analyzed distributed computing schemes to reduce execution time in workstation clusters by parallelizing solutions with message passing. Four schemes were considered in two-dimensional monodomain simulations with the Beeler-Reuter membrane equations. Parallel speedups measured with each scheme were compared to theoretical speedups, recognizing the relationship between speedup and code portions that executed serially. A data decomposition scheme based on total ionic current provided the best performance. Analysis of communication latencies in that scheme led to a load-balancing algorithm in which measured speedups at 89 +/- 2% and 75 +/- 8% of theoretical speedups were achieved in homogeneous and heterogeneous clusters of workstations. Speedups in this scheme with the Luo-Rudy dynamic membrane equations exceeded 3.0 with eight distributed workstations. Cluster speedups were comparable to those measured during parallel execution on a shared memory machine.
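    The "relationship between speedup and code portions that executed serially" is Amdahl's law; a short calculation like the one below makes it easy to compare measured speedups against the theoretical ceiling. The serial fraction and measured values here are made-up illustrations, not the paper's data.

```python
def amdahl_speedup(serial_fraction, workers):
    """Theoretical speedup when only the parallel portion scales with worker count."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


serial_fraction = 0.05                      # assume 5% of the run time is serial
for workers, measured in [(2, 1.8), (4, 3.2), (8, 5.6)]:   # hypothetical measurements
    theoretical = amdahl_speedup(serial_fraction, workers)
    print(f"{workers} workers: theoretical {theoretical:.2f}x, measured {measured:.1f}x "
          f"({measured / theoretical:.0%} of theoretical)")
```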

  15. One approach for evaluating the Distributed Computing Design System (DCDS)

    Science.gov (United States)

    Ellis, J. T.

    1985-01-01

The Distributed Computer Design System (DCDS) provides an integrated environment to support the life cycle of developing real-time distributed computing systems. The primary focus of DCDS is to significantly increase system reliability and software development productivity, and to minimize schedule and cost risk. DCDS consists of integrated methodologies, languages, and tools to support the life cycle of developing distributed software and systems. Smooth and well-defined transitions from phase to phase, language to language, and tool to tool provide a unique and unified environment. An approach to evaluating DCDS highlights its benefits.

  16. Distributed computing and artificial intelligence : 10th International Conference

    CERN Document Server

    Neves, José; Rodriguez, Juan; Santana, Juan; Gonzalez, Sara

    2013-01-01

    The International Symposium on Distributed Computing and Artificial Intelligence 2013 (DCAI 2013) is a forum in which applications of innovative techniques for solving complex problems are presented. Artificial intelligence is changing our society. Its application in distributed environments, such as the internet, electronic commerce, environment monitoring, mobile communications, wireless devices, distributed computing, to mention only a few, is continuously increasing, becoming an element of high added value with social and economic potential, in industry, quality of life, and research. This conference is a stimulating and productive forum where the scientific community can work towards future cooperation in Distributed Computing and Artificial Intelligence areas. These technologies are changing constantly as a result of the large research and technical effort being undertaken in both universities and businesses. The exchange of ideas between scientists and technicians from both the academic and industry se...

  17. 9th International conference on distributed computing and artificial intelligence

    CERN Document Server

    Santana, Juan; González, Sara; Molina, Jose; Bernardos, Ana; Rodríguez, Juan; DCAI 2012; International Symposium on Distributed Computing and Artificial Intelligence 2012

    2012-01-01

    The International Symposium on Distributed Computing and Artificial Intelligence 2012 (DCAI 2012) is a stimulating and productive forum where the scientific community can work towards future cooperation in Distributed Computing and Artificial Intelligence areas. This conference is a forum in which  applications of innovative techniques for solving complex problems will be presented. Artificial intelligence is changing our society. Its application in distributed environments, such as the internet, electronic commerce, environment monitoring, mobile communications, wireless devices, distributed computing, to mention only a few, is continuously increasing, becoming an element of high added value with social and economic potential, in industry, quality of life, and research. These technologies are changing constantly as a result of the large research and technical effort being undertaken in both universities and businesses. The exchange of ideas between scientists and technicians from both the academic and indus...

  18. Reliable, Memory Speed Storage for Cluster Computing Frameworks

    Science.gov (United States)

    2014-06-16

specification API that can capture computations in many of today’s popular data-parallel computing models, e.g., MapReduce and SQL. We also ported the Hadoop ... runs on. We present solutions for priority and weighted fair sharing, the most common policies in systems like Hadoop and Dryad [45, 27]. Priority Based ... understand its own configuration. For example, in Hadoop, configurations are kept in HadoopConf, while Spark stores these in SparkEnv. Therefore, their wrap

  19. SuperLU_DIST: A scalable distributed-memory sparse direct solver for unsymmetric linear systems

    Energy Technology Data Exchange (ETDEWEB)

    Li, Xiaoye S.; Demmel, James W.

    2002-03-27

In this paper, we present the main algorithmic features in the software package SuperLU_DIST, a distributed-memory sparse direct solver for large sets of linear equations. We give in detail our parallelization strategies, with focus on scalability issues, and demonstrate the parallel performance and scalability on current machines. The solver is based on sparse Gaussian elimination, with an innovative static pivoting strategy proposed earlier by the authors. The main advantage of static pivoting over classical partial pivoting is that it permits a priori determination of data structures and communication pattern for sparse Gaussian elimination, which makes it more scalable on distributed memory machines. Based on this a priori knowledge, we designed highly parallel and scalable algorithms for both LU decomposition and triangular solve and we show that they are suitable for large-scale distributed memory machines.
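    The factor-then-solve workflow that SuperLU_DIST parallelizes can be tried on a single process through SciPy, whose splu routine wraps the sequential SuperLU library; the snippet below is therefore only an interface-level illustration, not the distributed-memory solver itself. The test matrix is an arbitrary diagonally dominant tridiagonal system.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

n = 1000
main = 4.0 * np.ones(n)              # a sparse, diagonally dominant, unsymmetric matrix
lower = -1.0 * np.ones(n - 1)
upper = -2.0 * np.ones(n - 1)
A = sp.diags([lower, main, upper], offsets=[-1, 0, 1], format="csc")

lu = splu(A)                         # sparse LU factorization phase
b = np.ones(n)
x = lu.solve(b)                      # forward/backward triangular solves
print("residual norm:", np.linalg.norm(A @ x - b))
```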

  20. Computational skills, working memory, and conceptual knowledge in older children with mathematics learning disabilities.

    Science.gov (United States)

    Mabbott, Donald J; Bisanz, Jeffrey

    2008-01-01

    Knowledge and skill in multiplication were investigated for late elementary-grade students with mathematics learning disabilities (MLD), typically achieving age-matched peers, low-achieving age-matched peers, and ability-matched peers by examining multiple measures of computational skill, working memory, and conceptual knowledge. Poor multiplication fact mastery and calculation fluency and general working memory discriminated children with MLD from typically achieving age-matched peers. Furthermore, children with MLD were slower in executing backup procedures than typically achieving age-matched peers. The performance of children with MLD on multiple measures of multiplication skill and knowledge was most similar to that of ability-matched younger children. MLD may be due to difficulties in computational skills and working memory. Implications for the diagnosis and remediation of MLD are discussed.

  1. Optimized distributed computing environment for mask data preparation

    Science.gov (United States)

    Ahn, Byoung-Sup; Bang, Ju-Mi; Ji, Min-Kyu; Kang, Sun; Jang, Sung-Hoon; Choi, Yo-Han; Ki, Won-Tai; Choi, Seong-Woon; Han, Woo-Sung

    2005-11-01

As the critical dimension (CD) becomes smaller, various resolution enhancement techniques (RET) are widely adopted. In developing sub-100nm devices, the complexity of optical proximity correction (OPC) is severely increased and OPC is applied to non-critical layers as well. The transformation of designed pattern data by the OPC operation adds complexity, which causes runtime overheads in subsequent steps such as mask data preparation (MDP) and collapses the existing design hierarchy. Therefore, many mask shops exploit distributed computing to reduce the runtime of mask data preparation rather than exploit the design hierarchy. Distributed computing uses a cluster of computers connected to a local network. However, two factors limit the benefit of distributed computing in MDP. First, running every MDP job sequentially with the maximum number of available CPUs is not efficient compared to parallel MDP job execution, due to the input data characteristics. Second, the runtime improvement relative to the added cost is not sufficient, since the scalability of fracturing tools is limited. In this paper, we discuss an optimal load-balancing environment that increases the uptime of the distributed computing system by assigning an appropriate number of CPUs to each input design dataset. We also describe distributed processing (DP) parameter optimization to obtain maximum throughput in MDP job processing.
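    One plausible reading of "assigning an appropriate number of CPUs to each input design dataset" is a size-proportional allocation capped where fracturing stops scaling; the heuristic below sketches that idea with made-up layer names, job sizes, and caps, purely for illustration.

```python
def assign_cpus(jobs, total_cpus, scaling_cap=8):
    """jobs: mapping of layer name -> estimated fractured-data size (arbitrary units)."""
    total_size = sum(jobs.values())
    plan = {}
    for name, size in sorted(jobs.items(), key=lambda kv: -kv[1]):
        share = round(total_cpus * size / total_size)     # size-proportional share
        plan[name] = max(1, min(share, scaling_cap))      # respect the scaling limit
    return plan


jobs = {"metal1": 120, "via1": 40, "implant": 10, "poly": 90}   # hypothetical layers
print(assign_cpus(jobs, total_cpus=32))
```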

  2. Toward a high performance distributed memory climate model

    Energy Technology Data Exchange (ETDEWEB)

    Wehner, M.F.; Ambrosiano, J.J.; Brown, J.C.; Dannevik, W.P.; Eltgroth, P.G.; Mirin, A.A. [Lawrence Livermore National Lab., CA (United States); Farrara, J.D.; Ma, C.C.; Mechoso, C.R.; Spahr, J.A. [Univ. of California, Los Angeles, CA (US). Dept. of Atmospheric Sciences

    1993-02-15

    As part of a long range plan to develop a comprehensive climate systems modeling capability, the authors have taken the Atmospheric General Circulation Model originally developed by Arakawa and collaborators at UCLA and have recast it in a portable, parallel form. The code uses an explicit time-advance procedure on a staggered three-dimensional Eulerian mesh. The authors have implemented a two-dimensional latitude/longitude domain decomposition message passing strategy. Both dynamic memory management and interprocessor communication are handled with macro constructs that are preprocessed prior to compilation. The code can be moved about a variety of platforms, including massively parallel processors, workstation clusters, and vector processors, with a mere change of three parameters. Performance on the various platforms as well as issues associated with coupling different models for major components of the climate system are discussed.
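    A two-dimensional latitude/longitude decomposition of the kind described can be sketched in a few lines: the global grid is cut into rectangular tiles and each rank owns one tile (the halo exchange of boundary rows and columns is omitted here). Grid sizes and the process layout below are arbitrary illustrative choices, not the model's.

```python
import numpy as np

n_lat, n_lon = 64, 128               # global grid (assumed sizes)
p_lat, p_lon = 2, 4                  # 2 x 4 process grid -> 8 ranks


def local_tile(rank):
    """(lat, lon) index ranges owned by a given rank."""
    i, j = divmod(rank, p_lon)
    lat_edges = np.linspace(0, n_lat, p_lat + 1, dtype=int)
    lon_edges = np.linspace(0, n_lon, p_lon + 1, dtype=int)
    return (lat_edges[i], lat_edges[i + 1]), (lon_edges[j], lon_edges[j + 1])


for rank in range(p_lat * p_lon):
    (la0, la1), (lo0, lo1) = local_tile(rank)
    print(f"rank {rank}: latitudes {la0}:{la1}, longitudes {lo0}:{lo1}")
```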

  3. GISpark: A Geospatial Distributed Computing Platform for Spatiotemporal Big Data

    Science.gov (United States)

    Wang, S.; Zhong, E.; Wang, E.; Zhong, Y.; Cai, W.; Li, S.; Gao, S.

    2016-12-01

Geospatial data are growing exponentially because of the proliferation of cost-effective and ubiquitous positioning technologies such as global remote-sensing satellites and location-based devices. Analyzing large amounts of geospatial data can provide great value for both industrial and scientific applications. Data- and compute-intensive characteristics inherent in geospatial big data increasingly pose great challenges to technologies of data storing, computing and analyzing. Such challenges require a scalable and efficient architecture that can store, query, analyze, and visualize large-scale spatiotemporal data. Therefore, we developed GISpark - a geospatial distributed computing platform for processing large-scale vector, raster and stream data. GISpark is constructed based on the latest virtualized computing infrastructures and distributed computing architecture. OpenStack and Docker are used to build a multi-user hosting cloud computing infrastructure for GISpark. Virtual storage systems such as HDFS, Ceph, and MongoDB are combined and adopted for spatiotemporal data storage management. A Spark-based algorithm framework is developed for efficient parallel computing. Within this framework, SuperMap GIScript and various open-source GIS libraries can be integrated into GISpark. GISpark can also be integrated with scientific computing environments (e.g., Anaconda), interactive computing web applications (e.g., Jupyter notebook), and machine learning tools (e.g., TensorFlow/Orange). The associated geospatial facilities of GISpark, in conjunction with the scientific computing environment, exploratory spatial data analysis tools, and temporal data management and analysis systems, make up a powerful geospatial computing tool. GISpark not only provides spatiotemporal big data processing capacity in the geospatial field, but also provides a spatiotemporal computational model and advanced geospatial visualization tools that deal with other domains related to spatial properties. We

  4. Cloud manufacturing distributed computing technologies for global and sustainable manufacturing

    CERN Document Server

    Mehnen, Jörn

    2013-01-01

Global networks, which are the primary pillars of the modern manufacturing industry and supply chains, can only cope with the new challenges, requirements and demands when supported by new computing and Internet-based technologies. Cloud Manufacturing: Distributed Computing Technologies for Global and Sustainable Manufacturing introduces a new paradigm for scalable, service-oriented, sustainable and globally distributed manufacturing systems. The eleven chapters in this book provide an updated overview of the latest technological developments and applications in relevant research areas. Following an introduction to the essential features of Cloud Computing, chapters cover a range of methods and applications, such as the factors that actually affect adoption of the Cloud Computing technology in manufacturing companies and a new geometrical simplification method to stream 3-dimensional design and manufacturing data via the Internet. This is further supported by case studies and real-life data for Waste Electrical ...

  5. Exact score distribution computation for ontological similarity searches

    Directory of Open Access Journals (Sweden)

    Schulz Marcel H

    2011-11-01

Background: Semantic similarity searches in ontologies are an important component of many bioinformatic algorithms, e.g., finding functionally related proteins with the Gene Ontology or phenotypically similar diseases with the Human Phenotype Ontology (HPO). We have recently shown that the performance of semantic similarity searches can be improved by ranking results according to the probability of obtaining a given score at random rather than by the scores themselves. However, to date, there are no algorithms for computing the exact distribution of semantic similarity scores, which is necessary for computing the exact P-value of a given score. Results: In this paper we consider the exact computation of score distributions for similarity searches in ontologies, and introduce a simple null hypothesis which can be used to compute a P-value for the statistical significance of similarity scores. We concentrate on measures based on Resnik's definition of ontological similarity. A new algorithm is proposed that collapses subgraphs of the ontology graph and thereby allows fast score distribution computation. The new algorithm is several orders of magnitude faster than the naive approach, as we demonstrate by computing score distributions for similarity searches in the HPO. It is shown that exact P-value calculation improves clinical diagnosis using the HPO compared to approaches based on sampling. Conclusions: The new algorithm enables for the first time exact P-value calculation via exact score distribution computation for ontology similarity searches. The approach is applicable to any ontology for which the annotation-propagation rule holds and can improve any bioinformatic method that makes only use of the raw similarity scores. The algorithm was implemented in Java, supports any ontology in OBO format, and is available for non-commercial and academic usage under: https://compbio.charite.de/svn/hpo/trunk/src/tools/significance/

  6. Fast Incremental and Personalized PageRank over Distributed Main Memory Databases

    CERN Document Server

    Bahmani, Bahman; Goel, Ashish

    2010-01-01

In this paper, we analyze the efficiency of Monte Carlo methods for incremental computation of PageRank, personalized PageRank, and similar random walk based methods (with focus on SALSA), on large-scale dynamically evolving social networks. We assume that the graph of friendships is stored in distributed shared memory, as is the case for large social networks such as Twitter. For global PageRank, we assume that the social network has $n$ nodes, and $m$ adversarially chosen edges arrive in a random order. We show that with a reset probability of $\epsilon$, the total work needed to maintain an accurate estimate (using the Monte Carlo method) of the PageRank of every node at all times is $O(\frac{n\log m}{\epsilon^{2}})$. This is significantly better than all known bounds for incremental PageRank. For instance, if we naively recompute the PageRanks as each edge arrives, the simple power iteration method needs $\Omega(\frac{m^2}{\log(1/(1-\epsilon))})$ total time and the Monte Carlo method needs $O(mn/\epsilon)...
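    A minimal, single-machine version of the Monte Carlo approach described above: run several random walks from every node, terminate each step with the reset probability, and estimate PageRank from visit frequencies. The graph and parameters are toy values, and the distributed bookkeeping that makes the incremental updates cheap is omitted.

```python
import random
from collections import Counter


def monte_carlo_pagerank(graph, eps=0.15, walks_per_node=1000):
    """graph: dict mapping node -> list of out-neighbours."""
    visits = Counter()
    for start in graph:
        for _ in range(walks_per_node):
            node = start
            while True:
                visits[node] += 1
                if random.random() < eps or not graph[node]:
                    break                        # walk ends (reset happens here)
                node = random.choice(graph[node])
    total = sum(visits.values())
    return {n: visits[n] / total for n in graph}


graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
print(monte_carlo_pagerank(graph))
```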

  7. Distributed Computing and Artificial Intelligence, 12th International Conference

    CERN Document Server

    Malluhi, Qutaibah; Gonzalez, Sara; Bocewicz, Grzegorz; Bucciarelli, Edgardo; Giulioni, Gianfranco; Iqba, Farkhund

    2015-01-01

    The 12th International Symposium on Distributed Computing and Artificial Intelligence 2015 (DCAI 2015) is a forum to present applications of innovative techniques for studying and solving complex problems. The exchange of ideas between scientists and technicians from both the academic and industrial sector is essential to facilitate the development of systems that can meet the ever-increasing demands of today’s society. The present edition brings together past experience, current work and promising future trends associated with distributed computing, artificial intelligence and their application in order to provide efficient solutions to real problems. This symposium is organized by the Osaka Institute of Technology, Qatar University and the University of Salamanca.

  8. Computation of glint, glare, and solar irradiance distribution

    Energy Technology Data Exchange (ETDEWEB)

    Ho, Clifford Kuofei; Khalsa, Siri Sahib Singh

    2017-08-01

    Described herein are technologies pertaining to computing the solar irradiance distribution on a surface of a receiver in a concentrating solar power system or glint/glare emitted from a reflective entity. At least one camera captures images of the Sun and the entity of interest, wherein the images have pluralities of pixels having respective pluralities of intensity values. Based upon the intensity values of the pixels in the respective images, the solar irradiance distribution on the surface of the entity or glint/glare corresponding to the entity is computed.

  9. First Experiences with LHC Grid Computing and Distributed Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Fisk, Ian

    2010-12-01

    In this presentation the experiences of the LHC experiments using grid computing were presented with a focus on experience with distributed analysis. After many years of development, preparation, exercises, and validation the LHC (Large Hadron Collider) experiments are in operations. The computing infrastructure has been heavily utilized in the first 6 months of data collection. The general experience of exploiting the grid infrastructure for organized processing and preparation is described, as well as the successes employing the infrastructure for distributed analysis. At the end the expected evolution and future plans are outlined.

  10. A distributed spatial computing prototype system in grid environment

    Institute of Scientific and Technical Information of China (English)

    2010-01-01

Digital Earth has been a hot topic and research trend since it was proposed, and Digital China has drawn much attention in China. As a key technique to implement Digital China, grid is an excellent and promising concept to construct a dynamic, inter-domain and distributed computing environment. It is appropriate to process geographic information across dispersed computing resources in networks effectively and cooperatively. A distributed spatial computing prototype system is designed and implemented with the Globus Toolkit. Several important aspects are discussed in detail. The architecture is proposed according to the characteristics of grid firstly, and then the spatial resource query and access interfaces are designed for heterogeneous data sources. An open-up hierarchical architecture for resource discovery and management is represented to detect spatial and computing resources in grid. A standard spatial job management mechanism is implemented by grid service for convenient use. In addition, the control mechanism of spatial datasets access is developed based on GSI. The prototype system utilizes the Globus Toolkit to implement a common distributed spatial computing framework, and it reveals the spatial computing ability of grid to support Digital China.

  11. Distributed MRI reconstruction using Gadgetron-based cloud computing.

    Science.gov (United States)

    Xue, Hui; Inati, Souheil; Sørensen, Thomas Sangild; Kellman, Peter; Hansen, Michael S

    2015-03-01

To expand the open source Gadgetron reconstruction framework to support distributed computing and to demonstrate that a multinode version of the Gadgetron can be used to provide nonlinear reconstruction with clinically acceptable latency. The Gadgetron framework was extended with new software components that enable an arbitrary number of Gadgetron instances to collaborate on a reconstruction task. This cloud-enabled version of the Gadgetron was deployed on three different distributed computing platforms ranging from a heterogeneous collection of commodity computers to the commercial Amazon Elastic Compute Cloud. The Gadgetron cloud was used to provide nonlinear, compressed sensing reconstruction on a clinical scanner with low reconstruction latency (e.g., cardiac and neuroimaging applications). The proposed setup was able to handle acquisition and l1-SPIRiT reconstruction of nine high temporal resolution real-time, cardiac short axis cine acquisitions, covering the ventricles for functional evaluation, in under 1 min. A three-dimensional high-resolution brain acquisition with 1 mm³ isotropic pixel size was acquired and reconstructed with nonlinear reconstruction in less than 5 min. A distributed computing enabled Gadgetron provides a scalable way to improve reconstruction performance using commodity cluster computing. Nonlinear, compressed sensing reconstruction can be deployed clinically with low image reconstruction latency. © 2014 Wiley Periodicals, Inc.

  12. Spin-wave interference patterns created by spin-torque nano-oscillators for memory and computation

    Energy Technology Data Exchange (ETDEWEB)

    Macia, Ferran; Kent, Andrew D [Department of Physics, New York University, 4 Washington Place, New York, NY 10003 (United States); Hoppensteadt, Frank C, E-mail: fmb2@nyu.edu [Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012 (United States)

    2011-03-04

    Magnetization dynamics in nanomagnets has attracted broad interest since it was predicted that a dc current flowing through a thin magnetic layer can create spin-wave excitations. These excitations are due to spin momentum transfer, a transfer of spin angular momentum between conduction electrons and the background magnetization, that enables new types of information processing. Here we show how arrays of spin-torque nano-oscillators can create propagating spin-wave interference patterns of use for memory and computation. Memristic transponders distributed on the thin film respond to threshold tunnel magnetoresistance values, thereby allowing spin-wave detection and creating new excitation patterns. We show how groups of transponders create resonant (reverberating) spin-wave interference patterns that may be used for polychronous wave computation and information storage.

  13. Power profiling of Cholesky and QR factorizations on distributed memory systems

    KAUST Repository

    Bosilca, George

    2012-08-30

    This paper presents the power profile of two high performance dense linear algebra libraries on distributed memory systems, ScaLAPACK and DPLASMA. From the algorithmic perspective, their methodologies are opposite. The former is based on block algorithms and relies on multithreaded BLAS and a two-dimensional block cyclic data distribution to achieve high parallel performance. The latter is based on tile algorithms running on top of a tile data layout and uses fine-grained task parallelism combined with a dynamic distributed scheduler (DAGuE) to leverage distributed memory systems. We present performance results (Gflop/s) as well as the power profile (Watts) of two common dense factorizations needed to solve linear systems of equations, namely Cholesky and QR. The reported numbers show that DPLASMA surpasses ScaLAPACK not only in terms of performance (up to 2X speedup) but also in terms of energy efficiency (up to 62 %). © 2012 Springer-Verlag (outside the USA).
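    For reference, the sketch below is a plain single-process tile Cholesky of the kind these libraries parallelize: the matrix is swept as a grid of tiles and each step is a small dense kernel (POTRF on the diagonal tile, TRSM below it, and a SYRK/GEMM-style trailing update) that a runtime such as DAGuE can schedule as an independent task. The tile size and test matrix are arbitrary choices for illustration.

```python
import numpy as np
from scipy.linalg import solve_triangular


def tile_cholesky(A, nb):
    """Return lower-triangular L with L @ L.T == A, processed as nb x nb tiles."""
    n = A.shape[0]
    assert n % nb == 0, "toy version assumes the tile size divides the matrix size"
    t = n // nb
    L = np.tril(A.astype(float))
    for k in range(t):
        kk = slice(k * nb, (k + 1) * nb)
        L[kk, kk] = np.linalg.cholesky(L[kk, kk])               # POTRF (diagonal tile)
        for i in range(k + 1, t):
            ii = slice(i * nb, (i + 1) * nb)
            # TRSM: panel tile times the inverse transpose of the diagonal tile
            L[ii, kk] = solve_triangular(L[kk, kk], L[ii, kk].T, lower=True).T
        for i in range(k + 1, t):
            ii = slice(i * nb, (i + 1) * nb)
            for j in range(k + 1, i + 1):
                jj = slice(j * nb, (j + 1) * nb)
                L[ii, jj] -= L[ii, kk] @ L[jj, kk].T            # SYRK/GEMM trailing update
    return L


rng = np.random.default_rng(1)
M = rng.standard_normal((8, 8))
A = M @ M.T + 8 * np.eye(8)                                     # small SPD test matrix
L = tile_cholesky(A, nb=2)
print("factorization correct:", np.allclose(L @ L.T, A))
```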

  14. Performance evaluation of explicit finite difference algorithms with varying amounts of computational and memory intensity

    CERN Document Server

    Jammy, Satya P; Sandham, Neil D

    2016-01-01

    Future architectures designed to deliver exascale performance motivate the need for novel algorithmic changes in order to fully exploit their capabilities. In this paper, the performance of several numerical algorithms, characterised by varying degrees of memory and computational intensity, are evaluated in the context of finite difference methods for fluid dynamics problems. It is shown that, by storing some of the evaluated derivatives as single thread- or process-local variables in memory, or recomputing the derivatives on-the-fly, a speed-up of ~2 can be obtained compared to traditional algorithms that store all derivatives in global arrays.
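    The storage trade-off discussed above can be mimicked at a much smaller scale by comparing an update that first writes a derivative into a temporary array with one that folds the derivative directly into the update. The problem size and stencil below are arbitrary, and NumPy timings only loosely indicate what a compiled finite-difference code would show.

```python
import time
import numpy as np

n, steps, dx, dt = 2_000_000, 50, 1.0, 0.1
u0 = np.random.default_rng(0).standard_normal(n)


def step_stored(u):
    """Strategy 1: evaluate the derivative into a temporary array, then update."""
    dudx = np.zeros_like(u)
    dudx[1:-1] = (u[2:] - u[:-2]) / (2 * dx)
    return u - dt * dudx


def step_fused(u):
    """Strategy 2: fold the derivative directly into the update (no stored derivative)."""
    out = u.copy()
    out[1:-1] -= dt * (u[2:] - u[:-2]) / (2 * dx)
    return out


for name, fn in [("stored", step_stored), ("fused", step_fused)]:
    u = u0.copy()
    t0 = time.perf_counter()
    for _ in range(steps):
        u = fn(u)
    print(f"{name:>6}: {time.perf_counter() - t0:.3f} s")
```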

  15. Distributed Memory Breadth-First Search Revisited: Enabling Bottom-Up Search

    Science.gov (United States)

    2013-01-03

  16. Parallelization of the molecular dynamics code GROMOS87 for distributed memory parallel architectures

    NARCIS (Netherlands)

    Green, DG; Meacham, KE; vanHoesel, F; Hertzberger, B; Serazzi, G

    1995-01-01

    This paper describes the techniques and methodologies employed during parallelization of the Molecular Dynamics (MD) code GROMOS87, with the specific requirement that the program run efficiently on a range of distributed-memory parallel platforms. We discuss the preliminary results of our parallel

  17. Extending and implementing the Self-adaptive Virtual Processor for distributed memory architectures

    NARCIS (Netherlands)

    van Tol, M.W.; Koivisto, J.

    2011-01-01

    Many-core architectures of the future are likely to have distributed memory organizations and need fine grained concurrency management to be used effectively. The Self-adaptive Virtual Processor (SVP) is an abstract concurrent programming model which can provide this, but the model and its current i

  19. The future of PanDA in ATLAS distributed computing

    Science.gov (United States)

    De, K.; Klimentov, A.; Maeno, T.; Nilsson, P.; Oleynik, D.; Panitkin, S.; Petrosyan, A.; Schovancova, J.; Vaniachine, A.; Wenaus, T.

    2015-12-01

    Experiments at the Large Hadron Collider (LHC) face unprecedented computing challenges. Heterogeneous resources are distributed worldwide at hundreds of sites, thousands of physicists analyse the data remotely, the volume of processed data is beyond the exabyte scale, while data processing requires more than a few billion hours of computing usage per year. The PanDA (Production and Distributed Analysis) system was developed to meet the scale and complexity of LHC distributed computing for the ATLAS experiment. In the process, the old batch job paradigm of locally managed computing in HEP was discarded in favour of a far more automated, flexible and scalable model. The success of PanDA in ATLAS is leading to widespread adoption and testing by other experiments. PanDA is the first exascale workload management system in HEP, already operating at more than a million computing jobs per day, and processing over an exabyte of data in 2013. There are many new challenges that PanDA will face in the near future, in addition to new challenges of scale, heterogeneity and increasing user base. PanDA will need to handle rapidly changing computing infrastructure, will require factorization of code for easier deployment, will need to incorporate additional information sources including network metrics in decision making, be able to control network circuits, handle dynamically sized workload processing, provide improved visualization, and face many other challenges. In this talk we will focus on the new features, planned or recently implemented, that are relevant to the next decade of distributed computing workload management using PanDA.

  20. Integration of Cloud resources in the LHCb Distributed Computing

    Science.gov (United States)

    Úbeda García, Mario; Méndez Muñoz, Víctor; Stagni, Federico; Cabarrou, Baptiste; Rauschmayr, Nathalie; Charpentier, Philippe; Closier, Joel

    2014-06-01

This contribution describes how Cloud resources have been integrated in the LHCb Distributed Computing. LHCb is using its specific Dirac extension (LHCbDirac) as an interware for its Distributed Computing. So far, it was seamlessly integrating Grid resources and Computer clusters. The cloud extension of DIRAC (VMDIRAC) allows the integration of Cloud computing infrastructures. It is able to interact with multiple types of infrastructures in commercial and institutional clouds, supported by multiple interfaces (Amazon EC2, OpenNebula, OpenStack and CloudStack) - instantiates, monitors and manages Virtual Machines running on this aggregation of Cloud resources. Moreover, specifications for institutional Cloud resources proposed by Worldwide LHC Computing Grid (WLCG), mainly by the High Energy Physics Unix Information Exchange (HEPiX) group, have been taken into account. Several initiatives and computing resource providers in the eScience environment have already deployed IaaS in production during 2013. Keeping this in mind, pros and cons of a cloud-based infrastructure have been studied in contrast with the current setup. As a result, this work addresses four different use cases which represent a major improvement on several levels of our infrastructure. We describe the solution implemented by LHCb for the contextualisation of the VMs based on the idea of Cloud Site. We report on operational experience of using in production several institutional Cloud resources that are thus becoming an integral part of the LHCb Distributed Computing resources. Furthermore, we describe the gradual migration of our Service Infrastructure towards a fully distributed architecture following the Service as a Service (SaaS) model.

  1. Immigration, language proficiency, and autobiographical memories: Lifespan distribution and second-language access.

    Science.gov (United States)

    Esposito, Alena G; Baker-Ward, Lynne

    2016-08-01

    This investigation examined two controversies in the autobiographical literature: how cross-language immigration affects the distribution of autobiographical memories across the lifespan and under what circumstances language-dependent recall is observed. Both Spanish/English bilingual immigrants and English monolingual non-immigrants participated in a cue word study, with the bilingual sample taking part in a within-subject language manipulation. The expected bump in the number of memories from early life was observed for non-immigrants but not immigrants, who reported more memories for events surrounding immigration. Aspects of the methodology addressed possible reasons for past discrepant findings. Language-dependent recall was influenced by second-language proficiency. Results were interpreted as evidence that bilinguals with high second-language proficiency, in contrast to those with lower second-language proficiency, access a single conceptual store through either language. The final multi-level model predicting language-dependent recall, including second-language proficiency, age of immigration, internal language, and cue word language, explained three-quarters of the between-person variance and one-fifth of the within-person variance. We arrive at two conclusions. First, major life transitions influence the distribution of memories. Second, concept representation across multiple languages follows a developmental model. In addition, the results underscore the importance of considering language experience in research involving memory reports.

  2. Analysis of a distributed neural system involved in spatial information, novelty, and memory processing.

    Science.gov (United States)

    Menon, V; White, C D; Eliez, S; Glover, G H; Reiss, A L

    2000-10-01

    Perceiving a complex visual scene and encoding it into memory involves a hierarchical distributed network of brain regions, most notably the hippocampus (HIPP), parahippocampal gyrus (PHG), lingual gyrus (LNG), and inferior frontal gyrus (IFG). Lesion and imaging studies in humans have suggested that these regions are involved in spatial information processing as well as novelty and memory encoding; however, the relative contributions of these regions of interest (ROIs) are poorly understood. This study investigated regional dissociations in spatial information and novelty processing in the context of memory encoding using a 2 x 2 factorial design with factors Novelty (novel vs. repeated) and Stimulus (viewing scenes with rich vs. poor spatial information). Greater activation was observed in the right than left hemisphere; however, hemispheric effects did not differ across regions, novelty, or stimulus type. Significant novelty effects were observed in all four regions. A significant ROI x Stimulus interaction was observed: spatial information processing effects were largest in the LNG, significant in the PHG and HIPP, and nonsignificant in the IFG. Novelty processing was stimulus dependent in the LNG and stimulus independent in the PHG, HIPP, and IFG. Analysis of the profile of the Novelty x Stimulus interaction across ROIs provided evidence for a hierarchical independence in novelty processing characterized by increased dissociation from spatial information processing. Despite these differences in spatial information processing, memory performance for novel scenes with rich and poor spatial information was not significantly different. Memory performance was inversely correlated with right IFG activation, suggesting the involvement of this region in strategically flawed encoding effort. Stepwise regression analysis revealed that memory encoding accounted for only a small fraction of the variance in temporal lobe activation. The implications of these results for ...

  3. Distributed Computing with Centralized Support Works at Brigham Young.

    Science.gov (United States)

    McDonald, Kelly; Stone, Brad

    1992-01-01

    Brigham Young University (Utah) has addressed the need for maintenance and support of distributed computing systems on campus by implementing a program patterned after a national business franchise, providing the support and training of a centralized administration but allowing each unit to operate much as an independent small business.…

  4. Chandrasekhar equations and computational algorithms for distributed parameter systems

    Science.gov (United States)

    Burns, J. A.; Ito, K.; Powers, R. K.

    1984-01-01

    The Chandrasekhar equations arising in optimal control problems for linear distributed parameter systems are considered. The equations are derived via approximation theory. This approach is used to obtain existence, uniqueness, and strong differentiability of the solutions and provides the basis for a convergent computation scheme for approximating feedback gain operators. A numerical example is presented to illustrate these ideas.
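
    For concreteness, the finite-dimensional analogue of the Chandrasekhar equations reads as follows (a sketch only; the paper works in the operator-theoretic setting of distributed parameter systems and obtains the equations via approximation theory):

```latex
% Finite-dimensional sketch (not the paper's operator-theoretic setting):
% LQR problem  \dot{x} = Ax + Bu,  J = \int_t^{t_f} (x^\top C^\top C\, x + u^\top u)\,d\tau.
% Instead of integrating the Riccati equation for P(t), the Chandrasekhar system
% propagates the feedback gain K(t) = B^\top P(t) directly:
\begin{align*}
  \dot{K}(t) &= -B^\top L(t)^\top L(t), & K(t_f) &= 0,\\
  \dot{L}(t) &= -L(t)\bigl(A - B\,K(t)\bigr), & L(t_f) &= C,
\end{align*}
% so that the optimal control is u(t) = -K(t)\,x(t).  When C has few rows,
% L(t) is a small matrix and this system is cheaper than the full Riccati equation.
```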

  5. CMS Monte Carlo production operations in a distributed computing environment

    CERN Document Server

    Mohapatra, A; Khomich, A; Lazaridis, C; Hernández, J M; Caballero, J; Hof, C; Kalinin, S; Flossdorf, A; Abbrescia, M; De Filippis, N; Donvito, G; Maggi, G; My, S; Pompili, A; Sarkar, S; Maes, J; Van Mulders, P; Villella, I; De Weirdt, S; Hammad, G; Wakefield, S; Guan, W; Lajas, J A S; Elmer, P; Evans, D; Fanfani, A; Bacchi, W; Codispoti, G; Van Lingen, F; Kavka, C; Eulisse, G

    2008-01-01

    Monte Carlo production for the CMS experiment is carried out in a distributed computing environment; the goal of producing 30M simulated events per month in the first half of 2007 has been reached. A brief overview of the production operations and statistics is presented.

  6. Nested Transactions: An Approach to Reliable Distributed Computing.

    Science.gov (United States)

    1981-04-01

    Undoubtedly such universal use of computers and rapid exchange of information will have a dramatic impact: social, economic, and political. Distributed ... level transaction, these committed inferiors are successful inferiors of the top-level transaction, too. Therefore q will indeed get a commit ...

  7. The Future of PanDA in ATLAS Distributed Computing

    CERN Document Server

    De, Kaushik; The ATLAS collaboration; Maeno, Tadashi; Nilsson, Paul; Oleynik, Danila; Panitkin, Sergey; Petrosyan, Artem; Schovancova, Jaroslava; Vaniachine, Alexandre; Wenaus, Torre

    2015-01-01

    Experiments at the Large Hadron Collider (LHC) face unprecedented computing challenges. Heterogeneous resources are distributed worldwide at hundreds of sites, thousands of physicists analyze the data remotely, the volume of processed data is beyond the exabyte scale, while data processing requires more than a few billion hours of computing usage per year. The PanDA (Production and Distributed Analysis) system was developed to meet the scale and complexity of LHC distributed computing for the ATLAS experiment. In the process, the old batch job paradigm of locally managed computing in HEP was discarded in favor of a far more automated, flexible and scalable model. The success of PanDA in ATLAS is leading to widespread adoption and testing by other experiments. PanDA is the first exascale workload management system in HEP, already operating at more than a million computing jobs per day, and processing over an exabyte of data in 2013. There are many new challenges that PanDA will face in the near future, in addi...

  9. A distributed deadlock detection algorithm for mobile computing system

    Institute of Scientific and Technical Information of China (English)

    CHENG Xin; LIU Hong-wei; ZUO De-cheng; JIN Feng; YANG Xiao-zong

    2005-01-01

    The mode of mobile computing originated from distributed computing, and it has the un-idempotent operation property; therefore deadlock detection algorithms designed for mobile computing systems face challenges with regard to correctness and high efficiency. This paper attempts a fundamental study of deadlock detection for the AND model of mobile computing systems. First, the existing deadlock detection algorithms for distributed systems are classified into the resource-node-dependent (RD) and the resource-node-independent (RI) categories, and their corresponding weaknesses are discussed. Afterwards a new RI algorithm based on the AND model of mobile computing systems is presented. The novelties of our algorithm are that: 1) the blocked nodes inform their predecessors and successors simultaneously; 2) the detection messages (agents) hold the predecessor information of their originator; 3) no agent is stored midway. Additionally, the quit-inform scheme is introduced to treat the excessive victim-quitting problem raised by overlapped cycles. By these methods the proposed algorithm can detect a cycle of size n within n - 2 steps and with (n^2 - n - 2)/2 agents. The performance of our algorithm is compared with the most competitive RD and RI algorithms for distributed systems on a mobile agent simulation platform. Experimental results show that our algorithm outperforms the two algorithms under the vast majority of resource configurations and concurrent workloads. The correctness of the proposed algorithm is formally proven by the invariant verification technique.

  10. ATLAS Distributed Computing Monitoring tools during the LHC Run I

    CERN Document Server

    Schovancova, J; The ATLAS collaboration; Di Girolamo, A; Jezequel, S; Ueda, I; Wenaus, T

    2014-01-01

    This contribution summarizes the evolution of the ATLAS Distributed Computing (ADC) Monitoring project during LHC Run I. The ADC Monitoring targets three groups of customers: the ADC Operations team, to identify malfunctions early and escalate issues to an activity or service expert; ATLAS national contacts and sites, for real-time monitoring and long-term measurement of the performance of the provided computing resources; and the ATLAS Management, for long-term trends and accounting information about the ATLAS Distributed Computing resources. During LHC Run I a significant development effort was invested in the standardization of the monitoring and accounting applications in order to provide an extensive monitoring and accounting suite. ADC Monitoring applications separate the data layer and the visualization layer. The data layer exposes data in a predefined format. The visualization layer is designed bearing in mind the visual identity of the provided graphical elements, and the re-usability of the visua...

  11. Dynamic Distribution Model with Prime Granularity for Parallel Computing

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    The dynamic distribution model is one of the best schemes for parallel volume rendering. However, in a homogeneous cluster system, since the granularities are traditionally identical, all processors communicate almost simultaneously and the computation load may lose balance. To address these problems, a dynamic distribution model with prime granularity for parallel computing is presented. The granularities assigned to the processors are pairwise relatively prime, and the related theory is introduced. High parallel performance can be achieved by minimizing network competition and using a load balancing strategy that ensures all processors finish almost simultaneously. Based on the Master-Slave-Gleaner (MSG) scheme, the parallel splatting algorithm for volume rendering is used to test the model on an IBM Cluster 1350 system. The experimental results show that the model brings a considerable improvement in performance, including computation efficiency, total execution time, speed, and load balancing.
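
    The sketch below illustrates the core idea only, with names of my own choosing: each worker pulls work from a shared counter in chunks whose sizes are distinct primes (hence pairwise relatively prime), so that workers tend not to contact the master at the same instants. The paper's Master-Slave-Gleaner scheme and rendering kernel are not reproduced.

```python
# Sketch of the "prime granularity" idea: give each worker a chunk size that is a
# distinct prime, so workers tend not to request new work at the same instants.
# This is an illustration of the concept, not the paper's Master-Slave-Gleaner code.
import threading

def first_primes(k, start=17):
    """Return k distinct primes >= start (small helper, trial division)."""
    primes, n = [], start
    while len(primes) < k:
        if all(n % p for p in range(2, int(n ** 0.5) + 1)):
            primes.append(n)
        n += 1
    return primes

class Master:
    def __init__(self, total_tasks):
        self.next_task = 0
        self.total = total_tasks
        self.lock = threading.Lock()

    def take(self, chunk):
        """Hand out up to `chunk` task indices, or an empty range when done."""
        with self.lock:
            lo = self.next_task
            hi = min(lo + chunk, self.total)
            self.next_task = hi
            return range(lo, hi)

def worker(master, chunk, results):
    while True:
        tasks = master.take(chunk)
        if not tasks:
            return
        results.extend(t * t for t in tasks)   # stand-in for real rendering work

if __name__ == "__main__":
    master, results = Master(total_tasks=10_000), []
    chunks = first_primes(k=4)                 # e.g. [17, 19, 23, 29]
    threads = [threading.Thread(target=worker, args=(master, c, results)) for c in chunks]
    for t in threads: t.start()
    for t in threads: t.join()
    print(len(results))                        # all 10000 tasks processed
```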

  12. Accelerating Computation of DNA Sequence Alignment in Distributed Environment

    Science.gov (United States)

    Guo, Tao; Li, Guiyang; Deaton, Russel

    Sequence similarity and alignment are among the most important operations in computational biology. However, analyzing large sets of DNA sequences is impractical on a regular PC. Using multiple threads with the JavaParty mechanism, this project successfully extended the capabilities of regular Java to a distributed environment for the simulation of DNA computation. With the aid of JavaParty and a multi-threaded design, the results of this study demonstrate that the modified Java program can perform parallel computing without using RMI or socket communication. In this paper, an efficient method for modeling and comparing DNA sequences with dynamic programming and JavaParty is first proposed. Additionally, results of this method in a distributed environment are discussed.
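
    As a concrete stand-in for the dynamic-programming kernel such a pipeline distributes, the sketch below computes Needleman-Wunsch global alignment scores for all sequence pairs using a process pool; it is written in Python for illustration, and the JavaParty/threading layer of the paper is deliberately omitted.

```python
# Needleman-Wunsch global alignment score: the dynamic-programming kernel that a
# distributed pipeline would run on many sequence pairs in parallel.
# (Python stand-in for illustration; the paper's JavaParty layer is omitted.)
from concurrent.futures import ProcessPoolExecutor
from itertools import combinations

def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b."""
    prev = [j * gap for j in range(len(b) + 1)]          # DP row for the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]                                  # aligning a[:i] with the empty prefix of b
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]

if __name__ == "__main__":
    seqs = ["GATTACA", "GCATGCU", "GATTTACA", "ACGT"]     # toy inputs
    pairs = list(combinations(range(len(seqs)), 2))
    with ProcessPoolExecutor() as pool:                   # all-vs-all scores in parallel
        scores = pool.map(nw_score, (seqs[i] for i, _ in pairs),
                                    (seqs[j] for _, j in pairs))
    for (i, j), s in zip(pairs, scores):
        print(f"score({i},{j}) = {s}")
```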

  13. ATLAS Distributed Computing Monitoring tools during the LHC Run I

    CERN Document Server

    Schovancova, J; The ATLAS collaboration; Di Girolamo, A; Jezequel, S; Ueda, I; Wenaus, T

    2013-01-01

    This contribution summarizes the evolution of the ATLAS Distributed Computing (ADC) Monitoring project during LHC Run I. The ADC Monitoring targets three groups of customers: the ADC Operations team, to identify malfunctions early and escalate issues to an activity or service expert; ATLAS national contacts and sites, for real-time monitoring and long-term measurement of the performance of the provided computing resources; and the ATLAS Management, for long-term trends and accounting information about the ATLAS Distributed Computing resources. During LHC Run I a significant development effort was invested in the standardization of the monitoring and accounting applications in order to provide an extensive monitoring and accounting suite. ADC Monitoring applications separate the data layer and the visualization layer. The data layer exposes data in a predefined format. The visualization layer is designed bearing in mind the visual identity of the provided graphical elements, and the re-usability of the visua...

  14. A fault detection service for wide area distributed computations.

    Energy Technology Data Exchange (ETDEWEB)

    Stelling, P.

    1998-06-09

    The potential for faults in distributed computing systems is a significant complicating factor for application developers. While a variety of techniques exist for detecting and correcting faults, the implementation of these techniques in a particular context can be difficult. Hence, we propose a fault detection service designed to be incorporated, in a modular fashion, into distributed computing systems, tools, or applications. This service uses well-known techniques based on unreliable fault detectors to detect and report component failure, while allowing the user to trade off timeliness of reporting against false positive rates. We describe the architecture of this service, report on experimental results that quantify its cost and accuracy, and describe its use in two applications: monitoring the status of system components of the GUSTO computational grid testbed, and as part of the NetSolve network-enabled numerical solver.
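
    A minimal sketch of the unreliable-failure-detector idea is given below (not the actual service; the class and method names are invented): components send heartbeats, and a component is suspected once no heartbeat has arrived within a configurable timeout, which is exactly the knob that trades timeliness of reporting against the false-positive rate.

```python
# Minimal heartbeat-style unreliable failure detector (a sketch, not the actual
# service described in the paper).  A larger timeout lowers the false-positive
# rate but delays the reporting of real failures -- the trade-off discussed above.
import time

class FailureDetector:
    def __init__(self, timeout_s=5.0):
        self.timeout_s = timeout_s
        self.last_seen = {}                      # component id -> last heartbeat time

    def heartbeat(self, component_id, now=None):
        """Record a heartbeat from a monitored component."""
        self.last_seen[component_id] = now if now is not None else time.monotonic()

    def suspected(self, now=None):
        """Return the components currently suspected to have failed."""
        now = now if now is not None else time.monotonic()
        return [c for c, t in self.last_seen.items() if now - t > self.timeout_s]

if __name__ == "__main__":
    fd = FailureDetector(timeout_s=2.0)
    fd.heartbeat("worker-1", now=0.0)
    fd.heartbeat("worker-2", now=0.0)
    fd.heartbeat("worker-1", now=1.5)            # worker-2 stops sending heartbeats
    print(fd.suspected(now=3.0))                 # ['worker-2']
```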

  15. The Study On Load Balancing Strategies In Distributed Computing System

    Directory of Open Access Journals (Sweden)

    Md. Firoj Ali

    2012-05-01

    A number of load balancing algorithms have been developed in order to improve the execution of a distributed application on any kind of distributed architecture. Load balancing involves assigning tasks to processors so as to minimize the execution time of the program. In practice, it would even be possible to execute an application on any machine of a worldwide distributed system. The 'distributed system' became popular and attractive with the introduction of the web, and effective load balancing results in significant performance improvements for users. This paper describes the necessary, newly developed, principal concepts for several load balancing techniques in a distributed computing environment. It also covers various types of load balancing strategies, their merits and demerits, and a comparison of them depending on certain parameters.
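
    As one concrete example of the static strategies such surveys cover, the sketch below implements the classic longest-processing-time (LPT) greedy assignment: tasks are sorted by cost and each is given to the currently least-loaded processor. It is an illustration, not an algorithm taken from the paper.

```python
# One of the simplest static load-balancing strategies: sort tasks by cost and
# greedily assign each to the currently least-loaded processor (LPT heuristic).
# Illustrative only; the paper surveys many strategies beyond this one.
import heapq

def greedy_assign(task_costs, n_procs):
    """Return per-processor task lists and the resulting makespan."""
    heap = [(0.0, p) for p in range(n_procs)]          # (current load, processor id)
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_procs)]
    for cost in sorted(task_costs, reverse=True):      # largest tasks first
        load, p = heapq.heappop(heap)
        assignment[p].append(cost)
        heapq.heappush(heap, (load + cost, p))
    makespan = max(sum(tasks) for tasks in assignment)
    return assignment, makespan

if __name__ == "__main__":
    tasks = [7, 5, 4, 4, 3, 2, 2, 1]
    plan, makespan = greedy_assign(tasks, n_procs=3)
    print(plan, makespan)                              # makespan close to the ideal 28/3
```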

  16. Fencing direct memory access data transfers in a parallel active messaging interface of a parallel computer

    Science.gov (United States)

    Blocksome, Michael A.; Mamidala, Amith R.

    2013-09-03

    Fencing direct memory access (`DMA`) data transfers in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI including data communications endpoints, each endpoint including specifications of a client, a context, and a task, the endpoints coupled for data communications through the PAMI and through DMA controllers operatively coupled to segments of shared random access memory through which the DMA controllers deliver data communications deterministically, including initiating execution through the PAMI of an ordered sequence of active DMA instructions for DMA data transfers between two endpoints, effecting deterministic DMA data transfers through a DMA controller and a segment of shared memory; and executing through the PAMI, with no FENCE accounting for DMA data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all DMA instructions initiated prior to execution of the FENCE instruction for DMA data transfers between the two endpoints.

  17. Guide to cloud computing for business and technology managers from distributed computing to cloudware applications

    CERN Document Server

    Kale, Vivek

    2014-01-01

    Guide to Cloud Computing for Business and Technology Managers: From Distributed Computing to Cloudware Applications unravels the mystery of cloud computing and explains how it can transform the operating contexts of business enterprises. It provides a clear understanding of what cloud computing really means, what it can do, and when it is practical to use. Addressing the primary management and operation concerns of cloudware, including performance, measurement, monitoring, and security, this pragmatic book: Introduces the enterprise applications integration (EAI) solutions that were a first ste

  18. Parallelization of Finite Element Analysis Codes Using Heterogeneous Distributed Computing

    Science.gov (United States)

    Ozguner, Fusun

    1996-01-01

    Performance gains in computer design are quickly consumed as users seek to analyze larger problems to a higher degree of accuracy. Innovative computational methods, such as parallel and distributed computing, seek to multiply the power of existing hardware technology to satisfy the computational demands of large applications. In the early stages of this project, experiments were performed using two large, coarse-grained applications, CSTEM and METCAN. These applications were parallelized on an Intel iPSC/860 hypercube. It was found that the overall speedup was very low, due to large, inherently sequential code segments present in the applications. The overall parallel execution time T_par of the application depends on these sequential segments. If these segments make up a significant fraction of the overall code, the application will have a poor speedup measure.
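
    The observation about sequential segments is Amdahl's law; a brief statement of it is sketched below, with the 20% sequential fraction in the closing remark chosen purely for illustration rather than taken from the project's measurements.

```latex
% Amdahl's law, which formalizes the observation above.  If a fraction f of the
% execution is inherently sequential and the rest is spread over p processors,
% the achievable speedup is bounded by
\[
  S(p) \;=\; \frac{T_{\mathrm{seq}}}{T_{\mathrm{par}}(p)}
        \;=\; \frac{1}{f + (1 - f)/p} \;\le\; \frac{1}{f}.
\]
% For example, with f = 0.2 even p \to \infty gives at most a 5x speedup, which
% is consistent with the poor speedups reported for the coarse-grained codes.
```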

  19. Computational cost estimates for parallel shared memory isogeometric multi-frontal solvers

    KAUST Repository

    Woźniak, Maciej

    2014-06-01

    In this paper we present computational cost estimates for parallel shared-memory isogeometric multi-frontal solvers. The estimates show that the ideal isogeometric shared-memory parallel direct solver scales as O(p^2 log(N/p)) for one-dimensional problems, O(N p^2) for two-dimensional problems, and O(N^{4/3} p^2) for three-dimensional problems, where N is the number of degrees of freedom and p is the polynomial order of approximation. The computational costs of the shared-memory parallel isogeometric direct solver are compared with those of the sequential isogeometric direct solver, the latter being equal to O(N p^2) for the one-dimensional case, O(N^{1.5} p^3) for the two-dimensional case, and O(N^2 p^3) for the three-dimensional case. The shared-memory version thus significantly reduces the cost in terms of both N and p. Theoretical estimates are compared with numerical experiments performed with linear, quadratic, cubic, quartic, and quintic B-splines, in one and two spatial dimensions. © 2014 Elsevier Ltd. All rights reserved.

  20. Computational strategies for three-dimensional flow simulations on distributed computer systems

    Science.gov (United States)

    Sankar, Lakshmi N.; Weed, Richard A.

    1995-08-01

    This research effort is directed towards an examination of issues involved in porting large computational fluid dynamics codes in use within the industry to a distributed computing environment. This effort addresses strategies for implementing the distributed computing in a device independent fashion and load balancing. A flow solver called TEAM presently in use at Lockheed Aeronautical Systems Company was acquired to start this effort. The following tasks were completed: (1) The TEAM code was ported to a number of distributed computing platforms including a cluster of HP workstations located in the School of Aerospace Engineering at Georgia Tech; a cluster of DEC Alpha Workstations in the Graphics visualization lab located at Georgia Tech; a cluster of SGI workstations located at NASA Ames Research Center; and an IBM SP-2 system located at NASA ARC. (2) A number of communication strategies were implemented. Specifically, the manager-worker strategy and the worker-worker strategy were tested. (3) A variety of load balancing strategies were investigated. Specifically, the static load balancing, task queue balancing and the Crutchfield algorithm were coded and evaluated. (4) The classical explicit Runge-Kutta scheme in the TEAM solver was replaced with an LU implicit scheme. And (5) the implicit TEAM-PVM solver was extensively validated through studies of unsteady transonic flow over an F-5 wing, undergoing combined bending and torsional motion. These investigations are documented in extensive detail in the dissertation, 'Computational Strategies for Three-Dimensional Flow Simulations on Distributed Computing Systems', enclosed as an appendix.

  1. Computational strategies for three-dimensional flow simulations on distributed computer systems

    Science.gov (United States)

    Sankar, Lakshmi N.; Weed, Richard A.

    1995-01-01

    This research effort is directed towards an examination of issues involved in porting large computational fluid dynamics codes in use within the industry to a distributed computing environment. This effort addresses strategies for implementing the distributed computing in a device independent fashion and load balancing. A flow solver called TEAM presently in use at Lockheed Aeronautical Systems Company was acquired to start this effort. The following tasks were completed: (1) The TEAM code was ported to a number of distributed computing platforms including a cluster of HP workstations located in the School of Aerospace Engineering at Georgia Tech; a cluster of DEC Alpha Workstations in the Graphics visualization lab located at Georgia Tech; a cluster of SGI workstations located at NASA Ames Research Center; and an IBM SP-2 system located at NASA ARC. (2) A number of communication strategies were implemented. Specifically, the manager-worker strategy and the worker-worker strategy were tested. (3) A variety of load balancing strategies were investigated. Specifically, the static load balancing, task queue balancing and the Crutchfield algorithm were coded and evaluated. (4) The classical explicit Runge-Kutta scheme in the TEAM solver was replaced with an LU implicit scheme. And (5) the implicit TEAM-PVM solver was extensively validated through studies of unsteady transonic flow over an F-5 wing, undergoing combined bending and torsional motion. These investigations are documented in extensive detail in the dissertation, 'Computational Strategies for Three-Dimensional Flow Simulations on Distributed Computing Systems', enclosed as an appendix.

  2. Computation of distribution of minimum resolution for log-normal distribution of chromatographic peak heights.

    Science.gov (United States)

    Davis, Joe M

    2011-10-28

    General equations are derived for the distribution of minimum resolution between two chromatographic peaks, when peak heights in a multi-component chromatogram follow a continuous statistical distribution. The derivation draws on published theory by relating the area under the distribution of minimum resolution to the area under the distribution of the ratio of peak heights, which in turn is derived from the peak-height distribution. Two procedures are proposed for the equations' numerical solution. The procedures are applied to the log-normal distribution, which recently was reported to describe the distribution of component concentrations in three complex natural mixtures. For published statistical parameters of these mixtures, the distribution of minimum resolution is similar to that for the commonly assumed exponential distribution of peak heights used in statistical-overlap theory. However, these two distributions of minimum resolution can differ markedly, depending on the scale parameter of the log-normal distribution. Theory for the computation of the distribution of minimum resolution is extended to other cases of interest. With the log-normal distribution of peak heights as an example, the distribution of minimum resolution is computed when small peaks are lost due to noise or detection limits, and when the height of at least one peak is less than an upper limit. The distribution of minimum resolution shifts slightly to lower resolution values in the first case and to markedly larger resolution values in the second one. The theory and numerical procedure are confirmed by Monte Carlo simulation. Copyright © 2011 Elsevier B.V. All rights reserved.

  3. Software for Distributed Computation on Medical Databases: A Demonstration Project

    Directory of Open Access Journals (Sweden)

    Balasubramanian Narasimhan

    2017-05-01

    Bringing together the information latent in distributed medical databases promises to personalize medical care by enabling reliable, stable modeling of outcomes with rich feature sets (including patient characteristics and treatments received). However, there are barriers to aggregation of medical data, due to lack of standardization of ontologies, privacy concerns, proprietary attitudes toward data, and a reluctance to give up control over end use. Aggregation of data is not always necessary for model fitting. In models based on maximizing a likelihood, the computations can be distributed, with aggregation limited to the intermediate results of calculations on local data, rather than raw data. Distributed fitting is also possible for singular value decomposition. There has been work on the technical aspects of shared computation for particular applications, but little has been published on the software needed to support the "social networking" aspect of shared computing, to reduce the barriers to collaboration. We describe a set of software tools that allow the rapid assembly of a collaborative computational project, based on the flexible and extensible R statistical software and other open source packages, that can work across a heterogeneous collection of database environments, with full transparency to allow local officials concerned with privacy protections to validate the safety of the method. We describe the principles, architecture, and successful test results for the site-stratified Cox model and rank-k singular value decomposition.
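
    The sketch below illustrates the aggregation idea for a plain logistic likelihood (not the paper's R tools or its stratified Cox model): each site returns only its local gradient, and the coordinator sums these intermediate results, so raw records never leave a site. The synthetic site data, step size and iteration count are assumptions for the example.

```python
# Sketch of distributed likelihood fitting: each site returns only the local
# log-likelihood gradient for a logistic model, and the coordinator aggregates
# those intermediate results -- no raw patient records leave a site.
# (Illustration only; the paper describes R-based tools and a stratified Cox model.)
import numpy as np

def local_gradient(X, y, beta):
    """Gradient of the logistic log-likelihood on one site's data."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    return X.T @ (y - p)

def fit_distributed(sites, n_features, lr=0.5, n_iter=500):
    beta = np.zeros(n_features)
    n_total = sum(len(y) for _, y in sites)
    for _ in range(n_iter):
        grad = sum(local_gradient(X, y, beta) for X, y in sites)  # aggregation step
        beta += lr * grad / n_total                               # averaged ascent step
    return beta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_beta = np.array([1.0, -2.0, 0.5])
    def make_site(n):                       # synthetic stand-in for a hospital database
        X = rng.normal(size=(n, 3))
        y = (rng.random(n) < 1 / (1 + np.exp(-X @ true_beta))).astype(float)
        return X, y
    sites = [make_site(400), make_site(300), make_site(500)]
    print(np.round(fit_distributed(sites, n_features=3), 2))   # should approach true_beta
```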

  4. Distributed parallel computing in stochastic modeling of groundwater systems.

    Science.gov (United States)

    Dong, Yanhui; Li, Guomin; Xu, Haizhen

    2013-03-01

    Stochastic modeling is a rapidly evolving, popular approach to the study of the uncertainty and heterogeneity of groundwater systems. However, the use of Monte Carlo-type simulations to solve practical groundwater problems often encounters computational bottlenecks that hinder the acquisition of meaningful results. To improve the computational efficiency, a system that combines stochastic model generation with MODFLOW-related programs and distributed parallel processing is investigated. The distributed computing framework, called the Java Parallel Processing Framework, is integrated into the system to allow the batch processing of stochastic models in distributed and parallel systems. As an example, the system is applied to the stochastic delineation of well capture zones in the Pinggu Basin in Beijing. Through the use of 50 processing threads on a cluster with 10 multicore nodes, the execution times of 500 realizations are reduced to 3% compared with those of a serial execution. Through this application, the system demonstrates its potential in solving difficult computational problems in practical stochastic modeling. © 2012, The Author(s). Groundwater © 2012, National Ground Water Association.

  5. Distributed computing operations in the German ATLAS cloud

    Energy Technology Data Exchange (ETDEWEB)

    Boehler, Michael; Gamel, Anton; Sundermann, Jan Erik [Universitaet Freiburg, Freiburg im Breisgau (Germany); Petzold, Andreas [KIT, Karlsruhe (Germany); Kawamura, Gen [Universitaet Mainz (Germany); Leffhalm, Kai [DESY (Germany); Sandhoff, Marisa; Harenberg, Torsten [Bergische Universitaet Wuppertal (Germany); Walker, Rod; Duckeck, Guenter [LMU Muenchen (Germany)

    2013-07-01

    Before the discovery of a Higgs-like boson could be announced on the 4th of July 2012, a huge amount of data had to be distributed around the world and analysed. Moreover, to have well-optimised analyses with solid background estimates, Monte Carlo simulated event samples needed to be generated. All of this, data distribution, Monte Carlo production, and also data reprocessing, is performed by the Worldwide LHC Computing Grid. The ATLAS grid computing resources in Austria, the Czech Republic, Germany, Poland, and Switzerland are organized in the GridKa cloud, which is one out of 10 ATLAS computing clouds. It consists of the Tier-1 centre at KIT in Karlsruhe, which serves as a hub for data management and stores raw ATLAS data, and the Tier-2 centres that provide the resources for user analysis and Monte Carlo sample production. This talk gives an overview of the ATLAS grid computing operations in 2012, focusing on the performance and experiences at both the Tier-1 and Tier-2 centres, and it summarises the prospects and requirements for grid computing during and after the long shutdown of the LHC in 2013/2014.

  6. Contributions to Desktop Grid Computing : From High Throughput Computing to Data-Intensive Sciences on Hybrid Distributed Computing Infrastructures

    OpenAIRE

    Fedak, Gilles

    2015-01-01

    Since the mid-1990s, Desktop Grid Computing - i.e. the idea of using a large number of remote PCs distributed over the Internet to execute large parallel applications - has proved to be an efficient paradigm for providing large computational power at a fraction of the cost of a dedicated computing infrastructure. This document presents my contributions over the last decade to broaden the scope of Desktop Grid Computing. My research has followed three different directions. The first direction has ...

  7. Common accounting system for monitoring the ATLAS Distributed Computing resources

    CERN Document Server

    Karavakis, E; The ATLAS collaboration; Campana, S; Gayazov, S; Jezequel, S; Saiz, P; Sargsyan, L; Schovancova, J; Ueda, I

    2014-01-01

    This paper covers in detail a variety of accounting tools used to monitor the utilisation of the available computational and storage resources within ATLAS Distributed Computing during the first three years of Large Hadron Collider data taking. The Experiment Dashboard provides a set of common accounting tools that combine monitoring information originating from many different information sources, either generic or ATLAS specific. This set of tools provides high-quality, scalable solutions that are flexible enough to support the constantly evolving requirements of the ATLAS user community.

  8. EST analysis pipeline: use of distributed computing resources.

    Science.gov (United States)

    González, Francisco Javier; Vizcaíno, Juan Antonio

    2011-01-01

    This chapter describes how a pipeline for the analysis of expressed sequence tag (EST) data can be implemented, based on our previous experience generating ESTs from Trichoderma spp. We focus on key steps in the workflow, such as the processing of raw data from the sequencers, the clustering of ESTs, and the functional annotation of the sequences using BLAST, InterProScan, and BLAST2GO. Some of the steps require the use of intensive computing power. Since these resources are not available for small research groups or institutes without bioinformatics support, an alternative will be described: the use of distributed computing resources (local grids and Amazon EC2).

  9. Cloud Scheduler: a resource manager for distributed compute clouds

    CERN Document Server

    Armstrong, P; Bishop, A; Charbonneau, A; Desmarais, R; Fransham, K; Hill, N; Gable, I; Gaudet, S; Goliath, S; Impey, R; Leavett-Brown, C; Ouellete, J; Paterson, M; Pritchet, C; Penfold-Brown, D; Podaima, W; Schade, D; Sobie, R J

    2010-01-01

    The availability of Infrastructure-as-a-Service (IaaS) computing clouds gives researchers access to a large set of new resources for running complex scientific applications. However, exploiting cloud resources for large numbers of jobs requires significant effort and expertise. In order to make it simple and transparent for researchers to deploy their applications, we have developed a virtual machine resource manager (Cloud Scheduler) for distributed compute clouds. Cloud Scheduler boots and manages the user-customized virtual machines in response to a user's job submission. We describe the motivation and design of the Cloud Scheduler and present results on its use on both science and commercial clouds.

  10. Novel spintronics devices for memory and logic: prospects and challenges for room temperature all spin computing

    Science.gov (United States)

    Wang, Jian-Ping

    An energy-efficient memory and logic device for the post-CMOS era has been the goal of a variety of research fields. The limits of scaling, which we expect to reach by the year 2025, demand that future advances in computational power will not be realized from ever-shrinking device sizes, but rather by innovative designs and new materials and physics. Magnetoresistive devices have been a promising candidate for future integrated magnetic computation because of their unique non-volatility and functionalities. The application of perpendicular magnetic anisotropy for potential STT-RAM applications was demonstrated and has since been intensively investigated by both academia and industry groups, but there is no clear pathway for how scaling will eventually work for both memory and logic applications. One of the main reasons is that no material stack candidate has been demonstrated that could lead to a scaling scheme down to sub-10 nm. Another challenge for the use of magnetoresistive devices in logic applications is their available switching speed and writing energy. Although good progress has been made in demonstrating the fast switching of a thermally stable magnetic tunnel junction (MTJ) down to 165 ps, it is still several times slower than its CMOS counterpart. In this talk, I will review the recent progress by my research group and my C-SPIN colleagues, then discuss the opportunities, challenges and some potential pathways for magnetoresistive devices for memory and logic applications and their integration into a room-temperature all-spin computing system.

  11. Extending distributed shared memory for the cell broadband engine to a channel model

    DEFF Research Database (Denmark)

    Skovhede, Kenneth; Larsen, Morten Nørgaard; Vinter, Brian

    2010-01-01

    at the price of a quite complex programming model. In this paper we present an easy-to-use, CSP-like, communication method, which enables transfers of shared memory objects. The channel based communication method can significantly reduce the complexity of massively parallel programs. By implementing a few...... scientific computational cores we show that performance and scalability of the system is acceptable for most problems. © 2012 Springer-Verlag....

  12. PREFACE: Special section on Computational Fluid Dynamics—in memory of Professor Kunio Kuwahara Special section on Computational Fluid Dynamics—in memory of Professor Kunio Kuwahara

    Science.gov (United States)

    Ishii, Katsuya

    2011-08-01

    This issue includes a special section on computational fluid dynamics (CFD) in memory of the late Professor Kunio Kuwahara, who passed away on 15 September 2008, at the age of 66. In this special section, five articles are included that are based on the lectures and discussions at `The 7th International Nobeyama Workshop on CFD: To the Memory of Professor Kuwahara' held in Tokyo on 23 and 24 September 2009. Professor Kuwahara started his research in fluid dynamics under Professor Imai at the University of Tokyo. His first paper was published in 1969 with the title 'Steady Viscous Flow within Circular Boundary', with Professor Imai. In this paper, he combined theoretical and numerical methods in fluid dynamics. Since that time, he made significant and seminal contributions to computational fluid dynamics. He undertook pioneering numerical studies on the vortex method in 1970s. From then to the early nineties, he developed numerical analyses on a variety of three-dimensional unsteady phenomena of incompressible and compressible fluid flows and/or complex fluid flows using his own supercomputers with academic and industrial co-workers and members of his private research institute, ICFD in Tokyo. In addition, a number of senior and young researchers of fluid mechanics around the world were invited to ICFD and the Nobeyama workshops, which were held near his villa, and they intensively discussed new frontier problems of fluid physics and fluid engineering at Professor Kuwahara's kind hospitality. At the memorial Nobeyama workshop held in 2009, 24 overseas speakers presented their papers, including the talks of Dr J P Boris (Naval Research Laboratory), Dr E S Oran (Naval Research Laboratory), Professor Z J Wang (Iowa State University), Dr M Meinke (RWTH Aachen), Professor K Ghia (University of Cincinnati), Professor U Ghia (University of Cincinnati), Professor F Hussain (University of Houston), Professor M Farge (École Normale Superieure), Professor J Y Yong (National

  13. High-fidelity quantum memory using nitrogen-vacancy center ensemble for hybrid quantum computation

    CERN Document Server

    Yang, W L; Hu, Y; Feng, M; Du, J F

    2011-01-01

    We study a hybrid quantum computing system using a nitrogen-vacancy center ensemble (NVE) as quantum memory, a current-biased Josephson junction (CBJJ) superconducting qubit fabricated in a transmission line resonator (TLR) as the quantum computing processor, and the microwave photons in the TLR as the quantum data bus. The storage process is treated rigorously by considering all kinds of decoherence mechanisms. Such a hybrid quantum device can also be used to create multi-qubit W states of NVEs through a common CBJJ. The experimental feasibility and challenges are assessed using currently available technology.

  14. A biological solution to a fundamental distributed computing problem.

    Science.gov (United States)

    Afek, Yehuda; Alon, Noga; Barad, Omer; Hornstein, Eran; Barkai, Naama; Bar-Joseph, Ziv

    2011-01-14

    Computational and biological systems are often distributed so that processors (cells) jointly solve a task, without any of them receiving all inputs or observing all outputs. Maximal independent set (MIS) selection is a fundamental distributed computing procedure that seeks to elect a set of local leaders in a network. A variant of this problem is solved during the development of the fly's nervous system, when sensory organ precursor (SOP) cells are chosen. By studying SOP selection, we derived a fast algorithm for MIS selection that combines two attractive features. First, processors do not need to know their degree; second, it has an optimal message complexity while only using one-bit messages. Our findings suggest that simple and efficient algorithms can be developed on the basis of biologically derived insights.
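
    For readers unfamiliar with MIS selection, the sketch below shows a Luby-style randomized round structure that elects local leaders; it conveys the flavour of the problem but is not the SOP-inspired algorithm of the paper, which additionally avoids knowledge of node degrees and uses only one-bit messages.

```python
# Luby-style randomized MIS sketch on an undirected graph, to illustrate the kind
# of local-leader election the paper studies.  The actual SOP-inspired algorithm
# differs (it adapts probabilities over rounds and uses only one-bit messages).
import random

def randomized_mis(adj, seed=0):
    """adj: dict node -> set of neighbours.  Returns a maximal independent set."""
    rng = random.Random(seed)
    active = set(adj)
    mis = set()
    while active:
        # each active node "broadcasts" with probability 1/(active degree + 1)
        broadcasting = {v for v in active
                        if rng.random() < 1.0 / (len(adj[v] & active) + 1)}
        for v in broadcasting:
            # v joins the MIS if no broadcasting neighbour contends with it
            if not (adj[v] & broadcasting):
                mis.add(v)
        # winners and their neighbours leave the protocol
        settled = mis | {u for v in mis for u in adj[v]}
        active -= settled
    return mis

if __name__ == "__main__":
    # 6-cycle: 0-1-2-3-4-5-0
    adj = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
    print(randomized_mis(adj))   # e.g. {0, 2, 4} -- independent and maximal
```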

  15. 11th International Conference on Distributed Computing and Artificial Intelligence

    CERN Document Server

    Bersini, Hugues; Corchado, Juan; Rodríguez, Sara; Pawlewski, Paweł; Bucciarelli, Edgardo

    2014-01-01

    The 11th International Symposium on Distributed Computing and Artificial Intelligence 2014 (DCAI 2014) is a forum to present applications of innovative techniques for studying and solving complex problems. The exchange of ideas between scientists and technicians from both the academic and industrial sector is essential to facilitate the development of systems that can meet the ever-increasing demands of today’s society. The present edition brings together past experience, current work and promising future trends associated with distributed computing, artificial intelligence and their application in order to provide efficient solutions to real problems. This year’s technical program presents both high quality and diversity, with contributions in well-established and evolving areas of research (Algeria, Brazil, China, Croatia, Czech Republic, Denmark, France, Germany, Ireland, Italy, Japan, Malaysia, Mexico, Poland, Portugal, Republic of Korea, Spain, Taiwan, Tunisia, Ukraine, United Kingdom), representing ...

  16. Efficient Model for Distributed Computing based on Smart Embedded Agent

    Directory of Open Access Journals (Sweden)

    Hassna Bensag

    2017-02-01

    Technological advances in embedded computing have exposed humans to an increasing intrusion of computing in their day-to-day life (e.g. smart devices). Cooperation, autonomy, and mobility have made the agent a promising mechanism for embedded devices. This work aims to present a new model of an embedded agent designed to be implemented in smart devices in order to perform parallel tasks in a distributed environment. To validate the proposed model, a case study was developed for medical image segmentation using Cardiac Magnetic Resonance Images (MRI). In the first part of this paper, we focus on implementing the parallel classification algorithm using the C-means method in embedded systems. We then propose a new concept of distributed classification using multi-agent systems based on JADE and Raspberry Pi 2 devices.

  17. The BaBar Experiment's Distributed Computing Model

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    In order to face the expected increase in statistics between now and 2005, the BaBar experiment at SLAC is evolving its computing model toward a distributed multi-tier system. It is foreseen that data will be spread among Tier-A centers and deleted from the SLAC center. A uniform computing environment is being deployed in the centers, the network bandwidth is continuously increased, and data distribution tools have been designed in order to reach a transfer rate of ~100 TB of data per year. In parallel, smaller Tier-B and C sites receive subsets of data, presently in Kanga-Root[1] format and later in Objectivity[2] format. GRID tools will be used for remote job submission.

  18. Software Quality Measurement for Distributed Systems. Volume 3. Distributed Computing Systems: Impact on Software Quality.

    Science.gov (United States)

    1983-07-01

    methodology precludes two computers from being casually linked together (as with SEAC and DYSEAC), but also lessens the likelihood of major innovations ... provide a more natural computer architecture for applications such as artificial intelligence, real-time command and control, and data base management. ... state-of-the-art in how we can separate C3I nodes. A truly distributed C3I system may not be achievable for another 5 to 10 years. However, unless we ...

  19. A learnable parallel processing architecture towards unity of memory and computing.

    Science.gov (United States)

    Li, H; Gao, B; Chen, Z; Zhao, Y; Huang, P; Ye, H; Liu, L; Liu, X; Kang, J

    2015-08-14

    Developing energy-efficient parallel information processing systems beyond the von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need for efficient information processing in data-driven applications such as big data and the Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named "iMemComp", where memory and logic are unified with single-type devices. Leveraging the nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped "iMemComp" with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such an architecture eliminates the energy-hungry data movement of von Neumann computers. Compared with contemporary silicon technology, adder circuits based on "iMemComp" can improve speed by 76.8% and reduce power dissipation by 60.3%, together with an aggressive 700-fold reduction in circuit area.

  20. Computationally intensive econometrics using a distributed matrix-programming language.

    Science.gov (United States)

    Doornik, Jurgen A; Hendry, David F; Shephard, Neil

    2002-06-15

    This paper reviews the need for powerful computing facilities in econometrics, focusing on concrete problems which arise in financial economics and in macroeconomics. We argue that the profession is being held back by the lack of easy-to-use generic software which is able to exploit the availability of cheap clusters of distributed computers. Our response is to extend, in a number of directions, the well-known matrix-programming interpreted language Ox developed by the first author. We note three possible levels of extensions: (i) Ox with parallelization explicit in the Ox code; (ii) Ox with a parallelized run-time library; and (iii) Ox with a parallelized interpreter. This paper studies and implements the first case, emphasizing the need for deterministic computing in science. We give examples in the context of financial economics and time-series modelling.

  1. A Parallel and Distributed Surrogate Model Implementation for Computational Steering

    KAUST Repository

    Butnaru, Daniel

    2012-06-01

    Understanding the influence of multiple parameters in a complex simulation setting is a difficult task. In the ideal case, the scientist can freely steer such a simulation and is immediately presented with the results for a certain configuration of the input parameters. Such an exploration process is however not possible if the simulation is computationally too expensive. For these cases we present in this paper a scalable computational steering approach utilizing a fast surrogate model as substitute for the time-consuming simulation. The surrogate model we propose is based on the sparse grid technique, and we identify the main computational tasks associated with its evaluation and its extension. We further show how distributed data management combined with the specific use of accelerators allows us to approximate and deliver simulation results to a high-resolution visualization system in real-time. This significantly enhances the steering workflow and facilitates the interactive exploration of large datasets. © 2012 IEEE.

  2. StarBus+: Distributed Object Middleware Practice for Internet Computing

    Institute of Scientific and Technical Information of China (English)

    Huai-Min Wang; Yu-Feng Wang; Yang-Bin Tang

    2005-01-01

    In the past decade, the booming of the Internet has challenged middleware in three aspects: quality of service, balance between changes and stabilization, and across-Internet integration. This paper presents our work on distributed object computing middleware technology for these challenges, as well as the research and development of StarBus+, a CORBA standard-compliant middleware suite with features such as an object request broker supporting multi-* quality of service, a component model, and integration with Web Services. This paper comprehensively presents the design and characteristics of StarBus+, and demonstrates how StarBus+ is enhanced to address the challenges of Internet computing through three case studies: inter-enterprise integration over the Internet, application evolution through dynamic reconfiguration, and the building of a massive information system. The paper also suggests some research directions which are important for Internet computing.

  3. Algorithm-dependent fault tolerance for distributed computing

    Energy Technology Data Exchange (ETDEWEB)

    P. D. Hough; M. E. Goldsby; E. J. Walsh

    2000-02-01

    Large-scale distributed systems assembled from commodity parts, like CPlant, have become common tools in the distributed computing world. Because of their size and diversity of parts, these systems are prone to failures. Applications that are being run on these systems have not been equipped to efficiently deal with failures, nor is there vendor support for fault tolerance. Thus, when a failure occurs, the application crashes. While most programmers make use of checkpoints to allow for restarting of their applications, this is cumbersome and incurs substantial overhead. In many cases, there are more efficient and more elegant ways in which to address failures. The goal of this project is to develop a software architecture for the detection of and recovery from faults in a cluster computing environment. The detection phase relies on the latest techniques developed in the fault tolerance community. Recovery is being addressed in an application-dependent manner, thus allowing the programmer to take advantage of algorithmic characteristics to reduce the overhead of fault tolerance. This architecture will allow large-scale applications to be more robust in high-performance computing environments that are comprised of clusters of commodity computers such as CPlant and SMP clusters.

  4. Lightweight distributed computing for intraoperative real-time image guidance

    Science.gov (United States)

    Suwelack, Stefan; Katic, Darko; Wagner, Simon; Spengler, Patrick; Bodenstedt, Sebastian; Röhl, Sebastian; Dillmann, Rüdiger; Speidel, Stefanie

    2012-02-01

    In order to provide real-time intraoperative guidance, computer assisted surgery (CAS) systems often rely on computationally expensive algorithms. The real-time constraint is especially challenging if several components such as intraoperative image processing, soft tissue registration or context aware visualization are combined in a single system. In this paper, we present a lightweight approach to distribute the workload over several workstations based on the OpenIGTLink protocol. We use XML-based message passing for remote procedure calls and native types for transferring data such as images, meshes or point coordinates. Two different, but typical scenarios are considered in order to evaluate the performance of the new system. First, we analyze a real-time soft tissue registration algorithm based on a finite element (FE) model. Here, we use the proposed approach to distribute the computational workload between a primary workstation that handles sensor data processing and visualization and a dedicated workstation that runs the real-time FE algorithm. We show that the additional overhead that is introduced by the technique is small compared to the total execution time. Furthermore, the approach is used to speed up a context aware augmented reality based navigation system for dental implant surgery. In this scenario, the additional delay for running the computationally expensive reasoning server on a separate workstation is less than a millisecond. The results show that the presented approach is a promising strategy to speed up real-time CAS systems.

  5. Distributed and Big Data Storage Management in Grid Computing

    Directory of Open Access Journals (Sweden)

    Ajay Kumar

    2012-07-01

    Big data storage management is one of the most challenging issues for Grid computing environments, since data-intensive applications frequently involve a high degree of data access locality. Grid applications typically deal with large amounts of data. In traditional approaches, high-performance computing consists of dedicated servers that are used for data storage and data replication. In this paper we present a new mechanism for distributed big-data storage and resource discovery services. We propose an architecture named Dynamic and Scalable Storage Management (DSSM) for grid environments, which allows grid computing to share not only computational cycles but also storage space. The storage can be transparently accessed from any grid machine, allowing easy data sharing among grid users and applications. The concept of virtual ids, which allows the creation of virtual spaces, has been introduced and used. The DSSM divides all Grid Oriented Storage devices (nodes) into multiple geographically distributed domains to facilitate locality and simplify intra-domain storage management. Grid-service-based storage resources are adopted so that simple modular services can be stacked piece by piece as demand grows. To this end, we cover four aspects: a description of the DSSM architecture and algorithms; the mapping of storage resources and resource discovery onto Grid services; the evaluation of a prototype system with respect to dynamics, scalability, and bandwidth; and a discussion of the results. Algorithms at the lower and upper levels for standardized, dynamic and scalable storage management, along with higher bandwidths, have been designed.

  6. Simulation of emission tomography using grid middleware for distributed computing.

    Science.gov (United States)

    Thomason, M G; Longton, R F; Gregor, J; Smith, G T; Hutson, R K

    2004-09-01

    SimSET is Monte Carlo simulation software for emission tomography. This paper describes a simple but effective scheme for parallel execution of SimSET using NetSolve, a client-server system for distributed computation. NetSolve (version 1.4.1) is "grid middleware" which enables a user (the client) to run specific computations remotely and simultaneously on a grid of networked computers (the servers). Since the servers do not have to be identical machines, computation may take place in a heterogeneous environment. To take advantage of diversity in machines and their workloads, a client-side scheduler was implemented for the Monte Carlo simulation. The scheduler partitions the total decay events by taking into account the inherent compute-speeds and recent average workloads, i.e., the scheduler assigns more decay events to processors expected to give faster service and fewer decay events to those expected to give slower service. When compute-speeds and sustained workloads are taken into account, the speed-up is essentially linear in the number of equivalent "maximum-service" processors. One modification in the SimSET code (version 2.6.2.3) was made to ensure that the total number of decay events specified by the user is maintained in the distributed simulation. No other modifications in the standard SimSET code were made. Each processor runs complete SimSET code for its assignment of decay events, independently of others running simultaneously. Empirical results are reported for simulation of a clinical-quality lung perfusion study.
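
    A small sketch of the client-side scheduling idea follows: the total number of decay events is split in proportion to each server's effective speed, here modelled as raw compute speed discounted by recent load, with the rounding remainder handed to the fastest server so the user-specified total is preserved. The field names and the discount formula are illustrative, not taken from SimSET or NetSolve.

```python
# Sketch of the client-side scheduling idea: split the total number of decay
# events across servers in proportion to their effective speed (raw compute
# speed discounted by recent average load), keeping the exact total.
# Field names and the discount formula are illustrative, not SimSET/NetSolve code.

def partition_events(total_events, servers):
    """servers: list of (name, compute_speed, recent_load) with load in [0, 1)."""
    effective = [(name, speed * (1.0 - load)) for name, speed, load in servers]
    total_speed = sum(s for _, s in effective)
    shares = [(name, int(total_events * s / total_speed)) for name, s in effective]
    # hand any rounding remainder to the fastest server so the total is preserved
    remainder = total_events - sum(n for _, n in shares)
    fastest = max(range(len(shares)), key=lambda i: effective[i][1])
    shares[fastest] = (shares[fastest][0], shares[fastest][1] + remainder)
    return dict(shares)

if __name__ == "__main__":
    servers = [("node-a", 2.0, 0.10), ("node-b", 1.0, 0.50), ("node-c", 1.5, 0.00)]
    plan = partition_events(1_000_000, servers)
    print(plan, sum(plan.values()))   # per-server event counts summing to 1000000
```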

  7. Global stability of bidirectional associative memory neural networks with continuously distributed delays

    Institute of Scientific and Technical Information of China (English)

    张强; 马润年; 许进

    2003-01-01

Global asymptotic stability of the equilibrium point of bidirectional associative memory (BAM) neural networks with continuously distributed delays is studied. Under two mild assumptions on the activation functions, two sufficient conditions ensuring global stability of such networks are derived by utilizing a Lyapunov functional and inequality analysis techniques. The results here extend some previous results. A numerical example is given showing the validity of our method.
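    The abstract does not reproduce the model itself. For orientation only, a standard formulation of a BAM network with continuously distributed delays (an assumption about the form studied, not a quotation from the paper) is

$$
\dot{x}_i(t) = -a_i x_i(t) + \sum_{j=1}^{m} w_{ji}\, f_j\!\left(\int_0^{\infty} k_{ji}(s)\, y_j(t-s)\, ds\right) + I_i, \qquad
\dot{y}_j(t) = -b_j y_j(t) + \sum_{i=1}^{n} v_{ij}\, g_i\!\left(\int_0^{\infty} h_{ij}(s)\, x_i(t-s)\, ds\right) + J_j,
$$

    where the delay kernels $k_{ji}, h_{ij} \ge 0$ are normalized ($\int_0^{\infty} k_{ji}(s)\,ds = 1$), and the "mild assumptions" on the activation functions $f_j, g_i$ typically amount to boundedness and Lipschitz continuity.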

  8. Properties of Sparse Distributed Representations and their Application to Hierarchical Temporal Memory

    OpenAIRE

    Ahmad, Subutai; Hawkins, Jeff

    2015-01-01

    Empirical evidence demonstrates that every region of the neocortex represents information using sparse activity patterns. This paper examines Sparse Distributed Representations (SDRs), the primary information representation strategy in Hierarchical Temporal Memory (HTM) systems and the neocortex. We derive a number of properties that are core to scaling, robustness, and generalization. We use the theory to provide practical guidelines and illustrate the power of SDRs as the basis of HTM. Our ...
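    A sparse distributed representation is simply a long binary vector with few active bits, and matching is done by overlap. The toy sketch below illustrates only this basic property; the vector sizes and function names (`random_sdr`, `overlap`) are illustrative assumptions, not the HTM reference implementation.

```python
# Illustrative SDR sketch (not the HTM reference code): random SDRs rarely
# overlap by chance, while a noisy copy of an SDR still overlaps strongly
# with its original, which is what makes overlap-based matching robust.
import numpy as np

rng = np.random.default_rng(0)

def random_sdr(n=2048, w=40):
    """Random SDR with n bits, of which w are active (HTM-like sizes)."""
    sdr = np.zeros(n, dtype=np.uint8)
    sdr[rng.choice(n, size=w, replace=False)] = 1
    return sdr

def overlap(a, b):
    return int(np.dot(a, b))  # number of shared active bits

a, b = random_sdr(), random_sdr()
noisy_a = a.copy()
drop = rng.choice(np.flatnonzero(a), size=8, replace=False)
noisy_a[drop] = 0                 # drop 8 of a's active bits to simulate noise
print(overlap(a, b))              # small: chance overlap between random SDRs
print(overlap(a, noisy_a))        # large: the noisy copy still matches a
```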

  9. Multi-threaded, discrete event simulation of distributed computing systems

    Science.gov (United States)

    Legrand, Iosif; MONARC Collaboration

    2001-10-01

The LHC experiments have envisaged computing systems of unprecedented complexity, for which it is necessary to provide a realistic description and modeling of data access patterns and of many jobs running concurrently on large-scale distributed systems and exchanging very large amounts of data. A process-oriented approach to discrete event simulation is well suited to describing the various activities running concurrently, as well as the stochastic arrival patterns specific to this type of simulation. Threaded objects or "Active Objects" provide a natural way to map the specific behaviour of distributed data processing onto the simulation program. The simulation tool developed within MONARC is based on Java (TM) technology, which provides adequate tools for developing a flexible and distributed process-oriented simulation. Proper graphics tools, and ways to analyze data interactively, are essential in any simulation project. The design elements, status and features of the MONARC simulation tool are presented. The program allows realistic modeling of complex data access patterns by multiple concurrent users in large-scale computing systems over a wide range of possible architectures, from centralized to highly distributed. A comparison between queuing theory and realistic client-server measurements is also presented.
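    MONARC itself is a Java toolkit. As a language-agnostic illustration of the process-oriented simulation style the abstract describes (concurrent jobs with stochastic arrivals competing for resources), here is a minimal sketch using the SimPy discrete event library; the use of SimPy and all names in it are assumptions of convenience, not part of MONARC.

```python
# Process-oriented discrete event simulation sketch (SimPy, not MONARC):
# concurrent "jobs" arrive stochastically and compete for processing slots.
import random
import simpy

def job(env, name, farm, mean_service=5.0):
    arrive = env.now
    with farm.request() as slot:      # queue for a free processing slot
        yield slot
        yield env.timeout(random.expovariate(1.0 / mean_service))
    print(f"{name}: {env.now - arrive:.1f} time units in the system")

def source(env, farm, mean_interarrival=2.0):
    for i in range(10):
        env.process(job(env, f"job-{i}", farm))
        yield env.timeout(random.expovariate(1.0 / mean_interarrival))

random.seed(1)
env = simpy.Environment()
farm = simpy.Resource(env, capacity=2)   # two concurrent processing slots
env.process(source(env, farm))
env.run()
```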

  10. Computational Model for Internal Relative Humidity Distributions in Concrete

    Directory of Open Access Journals (Sweden)

    Wondwosen Ali

    2014-01-01

Full Text Available A computational model is developed for predicting nonuniform internal relative humidity distributions in concrete. The internal relative humidity distribution is known to have a direct effect on nonuniform drying shrinkage strains. These nonuniform drying shrinkage strains result in the buildup of internal stresses, which may lead to cracking of the concrete. This may be particularly true at early ages, since the concrete is still relatively weak while the difference in internal relative humidity is probably high. The results obtained from this model can be used by structural and construction engineers to predict the critical drying shrinkage stresses induced by differential internal humidity distributions. The model uses combined finite element-finite difference numerical methods: the finite element method is used for spatial discretization, while the finite difference method is used to obtain transient solutions of the model. The numerical formulations are programmed in Matlab. The numerical results were compared with experimental results found in the literature and showed very good agreement.

  11. Distributed Computation Resources for Earth System Grid Federation (ESGF)

    Science.gov (United States)

    Duffy, D.; Doutriaux, C.; Williams, D. N.

    2014-12-01

    The Intergovernmental Panel on Climate Change (IPCC), prompted by the United Nations General Assembly, has published a series of papers in their Fifth Assessment Report (AR5) on processes, impacts, and mitigations of climate change in 2013. The science used in these reports was generated by an international group of domain experts. They studied various scenarios of climate change through the use of highly complex computer models to simulate the Earth's climate over long periods of time. The resulting total data of approximately five petabytes are stored in a distributed data grid known as the Earth System Grid Federation (ESGF). Through the ESGF, consumers of the data can find and download data with limited capabilities for server-side processing. The Sixth Assessment Report (AR6) is already in the planning stages and is estimated to create as much as two orders of magnitude more data than the AR5 distributed archive. It is clear that data analysis capabilities currently in use will be inadequate to allow for the necessary science to be done with AR6 data—the data will just be too big. A major paradigm shift from downloading data to local systems to perform data analytics must evolve to moving the analysis routines to the data and performing these computations on distributed platforms. In preparation for this need, the ESGF has started a Compute Working Team (CWT) to create solutions that allow users to perform distributed, high-performance data analytics on the AR6 data. The team will be designing and developing a general Application Programming Interface (API) to enable highly parallel, server-side processing throughout the ESGF data grid. This API will be integrated with multiple analysis and visualization tools, such as the Ultrascale Visualization Climate Data Analysis Tools (UV-CDAT), netCDF Operator (NCO), and others. This presentation will provide an update on the ESGF CWT's overall approach toward enabling the necessary storage proximal computational

  12. Computational implementation of a tunable multicellular memory circuit for engineered eukaryotic consortia

    Directory of Open Access Journals (Sweden)

    Josep eSardanyés

    2015-10-01

Full Text Available Cells are complex machines capable of processing information by means of an entangled network of molecular interactions. A crucial component of these decision-making systems is the presence of memory, and this is also an especially relevant target for engineered synthetic systems. A classic example of a memory device is a 1-bit memory element known as the flip-flop. Such a system can in principle be designed using a single-cell implementation, but a direct mapping between standard circuit design and a living circuit can be cumbersome. Here we present a novel computational implementation of a 1-bit memory device using a reliable multicellular design able to behave as a set-reset flip-flop that could be implemented in yeast cells. The dynamics of the proposed synthetic circuit is investigated with a mathematical model using biologically meaningful parameters. The circuit is shown to behave as a flip-flop over a wide range of parameter values. The repression strength of the NOT logic is shown to be crucial for obtaining a good flip-flop signal. Our model also shows that the circuit can be externally tuned to achieve different memory states and dynamics, such as persistent and transient memory. We have characterised the parameter domains for robust memory storage and retrieval, as well as the corresponding time response dynamics.
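    As a rough illustration of set-reset memory built from mutually repressing NOT-logic nodes, here is a minimal two-variable toggle sketch; it is not the authors' multicellular yeast model, and all parameter values and names (`alpha`, `n`, `delta`, `S`, `R`) are assumptions chosen only to make the toy system bistable.

```python
# Toy sketch (not the paper's model): a set-reset flip-flop from two mutually
# repressing nodes with Hill-type NOT logic; S and R are external set/reset inputs.
import numpy as np
from scipy.integrate import solve_ivp

alpha, n, delta = 4.0, 2.0, 1.0    # production, Hill coefficient, degradation

def flipflop(t, z, S, R):
    q, qbar = z
    dq    = alpha / (1.0 + qbar**n) + S - delta * q
    dqbar = alpha / (1.0 + q**n)    + R - delta * qbar
    return [dq, dqbar]

state = [0.1, 3.0]                 # start in the "reset" state (Q low)
for label, (S, R) in [("set pulse", (5.0, 0.0)),
                      ("hold",      (0.0, 0.0)),
                      ("reset",     (0.0, 5.0))]:
    sol = solve_ivp(flipflop, (0, 20), state, args=(S, R))
    state = sol.y[:, -1]
    print(f"{label:10s} -> Q={state[0]:.2f}, Qbar={state[1]:.2f}")
```

    The "hold" step is the memory test: with both inputs off, the circuit stays in whichever state the last pulse left it in.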

  13. An associative capacitive network based on nanoscale complementary resistive switches for memory-intensive computing.

    Science.gov (United States)

    Kavehei, Omid; Linn, Eike; Nielen, Lutz; Tappertzhofen, Stefan; Skafidas, Efstratios; Valov, Ilia; Waser, Rainer

    2013-06-07

We report on the implementation of an Associative Capacitive Network (ACN) based on the nondestructive capacitive readout of two Complementary Resistive Switches (2-CRSs). ACNs are capable of performing a fully parallel search for Hamming distances (i.e. similarity) between input and stored templates. Unlike conventional associative memories, where charge retention is a key function and frequent refresh cycles are therefore required, in ACNs information is retained in a nonvolatile resistive state and normal tasks are carried out through capacitive coupling between input and output nodes. Each device consists of two CRS cells and no selective element is needed; therefore, CMOS circuitry is required only in the periphery, for addressing and read-out. Highly parallel processing, nonvolatility, wide interconnectivity and low-energy consumption are significant advantages of ACNs over conventional and emerging associative memories. These characteristics make ACNs one of the promising candidates for applications in memory-intensive and cognitive computing, switches and routers as binary and ternary Content Addressable Memories (CAMs) and intelligent data processing.
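    Functionally, the parallel Hamming-distance search the abstract describes amounts to the following vectorized operation (a software sketch only, not a model of the capacitive circuit; array sizes are illustrative).

```python
# Functional sketch of what an associative Hamming search computes: given a
# set of stored binary templates, find the one closest to the input, with all
# distances evaluated in one vectorized ("parallel") operation.
import numpy as np

rng = np.random.default_rng(7)
templates = rng.integers(0, 2, size=(8, 16))   # 8 stored 16-bit templates
query = templates[3].copy()
query[[2, 9]] ^= 1                             # flip two bits of template 3

hamming = np.count_nonzero(templates != query, axis=1)
print(hamming)                  # distance of the query to every stored template
print(int(np.argmin(hamming)))  # -> 3, the closest match
```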

  14. An ATLAS distributed computing architecture for HL-LHC

    CERN Document Server

    Campana, Simone; The ATLAS collaboration

    2017-01-01

The ATLAS collaboration started a process to understand the computing needs for the High Luminosity LHC era. Based on our best understanding of the computing model input parameters for the HL-LHC data-taking conditions, results indicate the need for a larger amount of computational and storage resources with respect to the projection of a constant yearly budget for computing in 2026. Filling the gap between the projection and the needs will be one of the challenges in preparation for LHC Run-4. While the gains from improvements in offline software will play a crucial role in this process, a different model for data processing, management, access and bookkeeping should also be envisaged to optimise resource usage. In this contribution we will describe a straw man of this model, founded on basic principles such as single event level granularity for data processing and virtual data. We will explain how the current architecture will evolve adiabatically into the future distributed computing system, through the prot...

  15. Energy Scaling Advantages of Resistive Memory Crossbar Based Computation and Its Application to Sparse Coding

    Science.gov (United States)

    Agarwal, Sapan; Quach, Tu-Thach; Parekh, Ojas; Hsia, Alexander H.; DeBenedictis, Erik P.; James, Conrad D.; Marinella, Matthew J.; Aimone, James B.

    2016-01-01

    The exponential increase in data over the last decade presents a significant challenge to analytics efforts that seek to process and interpret such data for various applications. Neural-inspired computing approaches are being developed in order to leverage the computational properties of the analog, low-power data processing observed in biological systems. Analog resistive memory crossbars can perform a parallel read or a vector-matrix multiplication as well as a parallel write or a rank-1 update with high computational efficiency. For an N × N crossbar, these two kernels can be O(N) more energy efficient than a conventional digital memory-based architecture. If the read operation is noise limited, the energy to read a column can be independent of the crossbar size (O(1)). These two kernels form the basis of many neuromorphic algorithms such as image, text, and speech recognition. For instance, these kernels can be applied to a neural sparse coding algorithm to give an O(N) reduction in energy for the entire algorithm when run with finite precision. Sparse coding is a rich problem with a host of applications including computer vision, object tracking, and more generally unsupervised learning. PMID:26778946
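    In software terms, the two crossbar kernels the abstract refers to are a vector-matrix multiplication (the "parallel read") and a rank-1 outer-product update (the "parallel write"). The NumPy sketch below shows only this functional view; variable names and sizes are illustrative and no device physics is modelled.

```python
# Sketch of the two crossbar kernels in plain NumPy: a parallel read is a
# vector-matrix multiply through the conductance matrix G, and a parallel
# write is a rank-1 (outer-product) update of all N*N cells at once.
import numpy as np

N = 4
G = np.random.default_rng(0).uniform(0.1, 1.0, size=(N, N))  # conductances
v = np.array([0.5, 0.0, 1.0, 0.25])                          # row voltages

currents = v @ G               # parallel read: one vector-matrix multiplication
eta, x, y = 0.01, v, currents  # learning rate and the two update vectors
G += eta * np.outer(x, y)      # parallel write: rank-1 update

print(currents)
print(G.shape)
```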

  16. Energy Scaling Advantages of Resistive Memory Crossbar Based Computation and its Application to Sparse Coding

    Directory of Open Access Journals (Sweden)

    Sapan eAgarwal

    2016-01-01

Full Text Available The exponential increase in data over the last decade presents a significant challenge to analytics efforts that seek to process and interpret such data for various applications. Neural-inspired computing approaches are being developed in order to leverage the computational advantages of the analog, low-power data processing observed in biological systems. Analog resistive memory crossbars can perform a parallel read or a vector-matrix multiplication as well as a parallel write or a rank-1 update with high computational efficiency. For an NxN crossbar, these two kernels are at a minimum O(N) more energy efficient than a digital memory-based architecture. If the read operation is noise limited, the energy to read a column can be independent of the crossbar size (O(1)). These two kernels form the basis of many neuromorphic algorithms such as image, text, and speech recognition. For instance, these kernels can be applied to a neural sparse coding algorithm to give an O(N) reduction in energy for the entire algorithm. Sparse coding is a rich problem with a host of applications including computer vision, object tracking, and more generally unsupervised learning.

  17. Metal oxide resistive random access memory based synaptic devices for brain-inspired computing

    Science.gov (United States)

    Gao, Bin; Kang, Jinfeng; Zhou, Zheng; Chen, Zhe; Huang, Peng; Liu, Lifeng; Liu, Xiaoyan

    2016-04-01

The traditional Boolean computing paradigm based on the von Neumann architecture is facing great challenges for future information technology applications such as big data, the Internet of Things (IoT), and wearable devices, owing to limitations such as binary data storage and computing, non-parallel data processing, and the data buses required between memory units and logic units. The brain-inspired neuromorphic computing paradigm is believed to be one of the promising solutions for realizing more complex functions with a lower cost. To perform such brain-inspired computing with a low cost and low power consumption, novel devices for use as electronic synapses are needed. Metal oxide resistive random access memory (ReRAM) devices have emerged as the leading candidate for electronic synapses. This paper comprehensively addresses the recent work on the design and optimization of metal oxide ReRAM-based synaptic devices. A performance enhancement methodology and optimized operation scheme to achieve analog resistive switching and low-energy training behavior are provided. A three-dimensional vertical synapse network architecture is proposed for high-density integration and low-cost fabrication. The impacts of the ReRAM synaptic device features on the performances of neuromorphic systems are also discussed on the basis of a constructed neuromorphic visual system with a pattern recognition function. Possible solutions to achieve the high recognition accuracy and efficiency of neuromorphic systems are presented.

  18. A cross-cultural study of the lifespan distributions of life script events and autobiographical memories of life story events

    DEFF Research Database (Denmark)

    Zaragoza Scherman, Alejandra; Salgado, Sinué; Shao, Zhifang

    of major transitional life events in an idealized life course. By comparing the lifespan distribution of life scripts events and memories of life story events, we can determine the degree to which the cultural life script serves as a recall template for autobiographical memories, especially of positive...

  19. Cost Optimization Technique of Task Allocation in Heterogeneous Distributed Computing System

    Directory of Open Access Journals (Sweden)

    Faizul Navi Khan

    2013-11-01

Full Text Available A Distributed Computing System (DCS) is a network of workstations, personal computers, and/or other computing systems. Such a system may be heterogeneous in the sense that the computing nodes may have different speeds and memory capacities. A DCS accepts tasks from users and executes different modules of these tasks on various nodes of the system. Task allocation in a DCS is a common problem, and a good number of task allocation algorithms have been proposed in the literature. In such an environment, an application running in a DCS can be accessed from every node within the DCS. If the number of tasks is less than or equal to the number of available processors, the tasks can be assigned without any trouble; the allocation becomes complicated when the number of tasks exceeds the number of processors. The problem of allocating 'm' tasks to 'n' processors (m>n) in a DCS is addressed here through a new, modified task allocation technique. The model presented in this paper allocates the tasks to processors of different processing capacities to increase the performance of the DCS. The technique is based on the processing cost of each task on each processor: tasks are assigned according to the availability of processors and their processing capacity so that all the tasks of the application get executed in the DCS.
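    The abstract does not spell out the allocation rule. A hedged sketch of one simple cost-driven strategy for the m > n case (place each task on the processor whose accumulated cost stays lowest) is shown below; it illustrates the flavour of the problem, not necessarily the paper's exact technique, and all names are illustrative.

```python
# Hedged sketch of cost-driven allocation of m tasks to n processors (m > n):
# each task goes to the processor whose accumulated processing cost remains
# lowest after accepting it.
def allocate(cost):
    """cost[i][j] = processing cost of task i on processor j."""
    n = len(cost[0])
    load = [0.0] * n                 # accumulated cost per processor
    assignment = []
    for row in cost:
        j = min(range(n), key=lambda p: load[p] + row[p])
        load[j] += row[j]
        assignment.append(j)
    return assignment, load

costs = [[4, 7, 3], [2, 9, 6], [8, 3, 5], [5, 4, 7], [6, 2, 9]]  # 5 tasks, 3 CPUs
assignment, load = allocate(costs)
print(assignment)   # processor index chosen for each task
print(load)         # resulting per-processor cost totals
```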

  20. Spatial Memory Activity Distributions Indicate the Hippocampus Operates in a Continuous Manner.

    Science.gov (United States)

    Jeye, Brittany M; Karanian, Jessica M; Slotnick, Scott D

    2016-08-26

    There is a long-standing debate as to whether recollection is a continuous/graded process or a threshold/all-or-none process. In the current spatial memory functional magnetic resonance imaging (fMRI) study, we examined the hippocampal activity distributions-the magnitude of activity as a function of memory strength-to determine the nature of processing in this region. During encoding, participants viewed abstract shapes in the left or right visual field. During retrieval, old shapes were presented at fixation and participants classified each shape as previously in the "left" or "right" visual field followed by an "unsure"-"sure"-"very sure" confidence rating. The contrast of left-hits and left-misses produced two activations in the hippocampus. The hippocampal activity distributions for left shapes and right shapes were completely overlapping. Critically, the magnitude of activity associated with right-miss-very sure responses was significantly greater than zero. These results support the continuous model of recollection, which predicts overlapping activity distributions, and contradict the threshold model of recollection, which predicts a threshold above which only one distribution exists. Receiver operating characteristic analysis did not distinguish between models. The present results demonstrate that the hippocampus operates in a continuous manner during recollection and highlight the utility of analyzing activity distributions to determine the nature of neural processing.

  1. ATLAS Distributed Computing Monitoring tools during the LHC Run I

    Science.gov (United States)

    Schovancová, J.; Campana, S.; Di Girolamo, A.; Jézéquel, S.; Ueda, I.; Wenaus, T.; Atlas Collaboration

    2014-06-01

This contribution summarizes the evolution of the ATLAS Distributed Computing (ADC) Monitoring project during LHC Run I. ADC Monitoring targets three groups of customers: the ADC Operations team, to identify malfunctions early and escalate issues to an activity or service expert; ATLAS national contacts and sites, for real-time monitoring and long-term measurement of the performance of the provided computing resources; and ATLAS Management, for long-term trends and accounting information about ATLAS Distributed Computing resources. During LHC Run I a significant development effort was invested in standardization of the monitoring and accounting applications in order to provide an extensive monitoring and accounting suite. ADC Monitoring applications separate the data layer and the visualization layer. The data layer exposes data in a predefined format. The visualization layer is designed bearing in mind the visual identity of the provided graphical elements and the re-usability of the visualization bits across the different tools. A rich family of filtering and searching options enhancing the available user interfaces comes naturally with the separation of the data and visualization layers. With a variety of reliable monitoring data accessible through standardized interfaces, automating actions under well-defined conditions that correlate multiple data sources has become feasible. In this contribution we also discuss the automated exclusion of degraded resources and their automated recovery in various activities.

  2. Improving CMS data transfers among its distributed computing facilities

    CERN Document Server

    Flix, J; Sartirana, A

    2001-01-01

    CMS computing needs reliable, stable and fast connections among multi-tiered computing infrastructures. For data distribution, the CMS experiment relies on a data placement and transfer system, PhEDEx, managing replication operations at each site in the distribution network. PhEDEx uses the File Transfer Service (FTS), a low level data movement service responsible for moving sets of files from one site to another, while allowing participating sites to control the network resource usage. FTS servers are provided by Tier-0 and Tier-1 centres and are used by all computing sites in CMS, according to the established policy. FTS needs to be set up according to the Grid site's policies, and properly configured to satisfy the requirements of all Virtual Organizations making use of the Grid resources at the site. Managing the service efficiently requires good knowledge of the CMS needs for all kinds of transfer workflows. This contribution deals with a revision of FTS servers used by CMS, collecting statistics on thei...

  3. Improving CMS data transfers among its distributed computing facilities

    CERN Document Server

    Flix, Jose

    2010-01-01

    CMS computing needs reliable, stable and fast connections among multi-tiered computing infrastructures. For data distribution, the CMS experiment relies on a data placement and transfer system, PhEDEx, managing replication operations at each site in the distribution network. PhEDEx uses the File Transfer Service (FTS), a low level data movement service responsible for moving sets of files from one site to another, while allowing participating sites to control the network resource usage. FTS servers are provided by Tier-0 and Tier-1 centres and are used by all computing sites in CMS, according to the established policy. FTS needs to be set up according to the Grid site's policies, and properly configured to satisfy the requirements of all Virtual Organizations making use of the Grid resources at the site. Managing the service efficiently requires good knowledge of the CMS needs for all kinds of transfer workflows. This contribution deals with a revision of FTS servers used by CMS, collecting statistics on the...

  4. Logic computation in phase change materials by threshold and memory switching.

    Science.gov (United States)

    Cassinerio, M; Ciocchini, N; Ielmini, D

    2013-11-06

    Memristors, namely hysteretic devices capable of changing their resistance in response to applied electrical stimuli, may provide new opportunities for future memory and computation, thanks to their scalable size, low switching energy and nonvolatile nature. We have developed a functionally complete set of logic functions including NOR, NAND and NOT gates, each utilizing a single phase-change memristor (PCM) where resistance switching is due to the phase transformation of an active chalcogenide material. The logic operations are enabled by the high functionality of nanoscale phase change, featuring voltage comparison, additive crystallization and pulse-induced amorphization. The nonvolatile nature of memristive states provides the basis for developing reconfigurable hybrid logic/memory circuits featuring low-power and high-speed switching. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  5. Quantum error correcting codes and one-way quantum computing: Towards a quantum memory

    CERN Document Server

    Schlingemann, D

    2003-01-01

For realizing a quantum memory we suggest to first encode quantum information via a quantum error correcting code and then concatenate combined decoding and re-encoding operations. This requires that the encoding and the decoding operation can be performed faster than the typical decoherence time of the underlying system. The computational model underlying the one-way quantum computer, which was introduced by Hans Briegel and Robert Raussendorf, provides a suitable concept for a fast implementation of quantum error correcting codes. It is shown explicitly in this article how encoding and decoding operations for stabilizer codes can be realized on a one-way quantum computer. This is based on the graph code representation of stabilizer codes, on the one hand, and the relation between cluster states and graph codes, on the other.

  6. Cyber-EDA: Estimation of Distribution Algorithms with Adaptive Memory Programming

    Directory of Open Access Journals (Sweden)

    Peng-Yeng Yin

    2013-01-01

Full Text Available The estimation of distribution algorithm (EDA) aims to explicitly model the probability distribution of quality solutions to the underlying problem. By iteratively filtering quality solutions from competing ones, the probability model eventually approximates the distribution of globally optimal solutions. In contrast to classic evolutionary algorithms (EAs), the EDA framework is flexible and able to handle inter-variable dependence, which usually imposes difficulties on classic EAs. The success of EDA relies on effective and efficient building of the probability model. This paper enriches EDA with ideas from the adaptive memory programming (AMP) domain, which has developed several improved forms of EAs under the Cyber-EA framework. The experimental results on benchmark TSP instances support our expectation that AMP strategies can enhance the performance of classic EDA by deriving a better approximation of the true distribution of the target solutions.
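    For readers unfamiliar with the EDA loop (sample from a probability model, select the best individuals, re-estimate the model), here is a minimal univariate sketch on a toy binary problem; the AMP/Cyber-EDA enhancements of the paper are not reproduced, and the parameter values are illustrative.

```python
# Minimal univariate EDA (UMDA-style) sketch on OneMax: fitness is the number
# of ones, and the probability model is one independent Bernoulli p per bit.
import random

def umda(n_bits=30, pop=100, elite=25, generations=40):
    p = [0.5] * n_bits                          # initial probability model
    for _ in range(generations):
        popn = [[int(random.random() < p[i]) for i in range(n_bits)]
                for _ in range(pop)]
        popn.sort(key=sum, reverse=True)        # rank by fitness
        best = popn[:elite]                     # keep the quality solutions
        # re-estimate the model from the selected individuals
        p = [sum(ind[i] for ind in best) / elite for i in range(n_bits)]
    return popn[0], sum(popn[0])

random.seed(0)
solution, fitness = umda()
print(fitness)   # close to n_bits: the model converges toward all-ones strings
```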

  7. Exploring the Future of Out-of-Core Computing with Compute-Local Non-Volatile Memory

    Directory of Open Access Journals (Sweden)

    Myoungsoo Jung

    2014-01-01

    Full Text Available Drawing parallels to the rise of general purpose graphical processing units (GPGPUs as accelerators for specific high-performance computing (HPC workloads, there is a rise in the use of non-volatile memory (NVM as accelerators for I/O-intensive scientific applications. However, existing works have explored use of NVM within dedicated I/O nodes, which are distant from the compute nodes that actually need such acceleration. As NVM bandwidth begins to out-pace point-to-point network capacity, we argue for the need to break from the archetype of completely separated storage. Therefore, in this work we investigate co-location of NVM and compute by varying I/O interfaces, file systems, types of NVM, and both current and future SSD architectures, uncovering numerous bottlenecks implicit in these various levels in the I/O stack. We present novel hardware and software solutions, including the new Unified File System (UFS, to enable fuller utilization of the new compute-local NVM storage. Our experimental evaluation, which employs a real-world Out-of-Core (OoC HPC application, demonstrates throughput increases in excess of an order of magnitude over current approaches.

  8. Soft-error tolerance and energy consumption evaluation of embedded computer with magnetic random access memory in practical systems using computer simulations

    Science.gov (United States)

    Nebashi, Ryusuke; Sakimura, Noboru; Sugibayashi, Tadahiko

    2017-08-01

    We evaluated the soft-error tolerance and energy consumption of an embedded computer with magnetic random access memory (MRAM) using two computer simulators. One is a central processing unit (CPU) simulator of a typical embedded computer system. We simulated the radiation-induced single-event-upset (SEU) probability in a spin-transfer-torque MRAM cell and also the failure rate of a typical embedded computer due to its main memory SEU error. The other is a delay tolerant network (DTN) system simulator. It simulates the power dissipation of wireless sensor network nodes of the system using a revised CPU simulator and a network simulator. We demonstrated that the SEU effect on the embedded computer with 1 Gbit MRAM-based working memory is less than 1 failure in time (FIT). We also demonstrated that the energy consumption of the DTN sensor node with MRAM-based working memory can be reduced to 1/11. These results indicate that MRAM-based working memory enhances the disaster tolerance of embedded computers.

  9. From Thread to Transcontinental Computer: Disturbing Lessons in Distributed Supercomputing

    CERN Document Server

    Groen, Derek

    2015-01-01

We describe the political and technical complications encountered during the astronomical CosmoGrid project. CosmoGrid is a numerical study on the formation of large scale structure in the universe. The simulations are challenging due to the enormous dynamic range in spatial and temporal coordinates, as well as the enormous computer resources required. In CosmoGrid we dealt with the computational requirements by connecting up to four supercomputers via an optical network and making them operate as a single machine. This was challenging, if only because the supercomputers of our choice are separated by half the planet: three of them are scattered across Europe and the fourth is in Tokyo. The co-scheduling of multiple computers and the 'gridification' of the code enabled us to achieve an efficiency of up to $93\%$ for this distributed intercontinental supercomputer. In this work, we find that high-performance computing on a grid can be done much more effectively if the sites involved are will...

  10. A Token Based Algorithm to Distributed Computation in Sensor Networks

    CERN Document Server

    Saligrama, Venkatesh

    2011-01-01

We consider distributed algorithms for data aggregation and function computation in sensor networks. The algorithms perform pairwise computations along edges of an underlying communication graph. A token is associated with each sensor node, which acts as a transmission permit. Nodes with active tokens have transmission permits; they generate messages at a constant rate and send each message to a randomly selected neighbor. By using different strategies to control the transmission permits we can obtain tradeoffs between message and time complexity. Gossip corresponds to the case when all nodes have permits all the time. We study algorithms where permits are revoked after transmission and restored upon reception. Examples of such algorithms include Simple Random Walk (SRW), Coalescent Random Walk (CRW) and Controlled Flooding (CFLD) and their hybrid variants. SRW has a single node permit, which is passed on in the network. CRW initially has a permit for each node, but these permits are revoked gradually....
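    To make the single-token idea concrete, here is a small sketch of SRW-style aggregation (the token holder does a pairwise computation with a random neighbour, then passes the permit on); it illustrates the mechanism only and is not the paper's exact protocol, and the graph, seed, and variable names are assumptions.

```python
# Sketch of a single-token random walk computing a global average: only the
# token holder may transmit; each step is a pairwise average along one edge,
# which preserves the sum, so all values drift toward the global average.
import random

random.seed(3)
n = 8                                        # a small ring network
neighbours = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
value = {i: float(i) for i in range(n)}      # initial sensor readings 0..7

token = 0                                    # the single transmission permit
for _ in range(2000):
    peer = random.choice(neighbours[token])
    avg = (value[token] + value[peer]) / 2.0 # pairwise computation on an edge
    value[token] = value[peer] = avg
    token = peer                             # pass the permit on

print(round(sum(value.values()) / n, 3))           # global average preserved (3.5)
print({i: round(v, 2) for i, v in value.items()})  # every node close to 3.5
```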

  11. A Prototype Embedded Microprocessor Interconnect for Distributed and Parallel Computing

    Directory of Open Access Journals (Sweden)

    Bryan Hughes

    2008-08-01

    Full Text Available Parallel computing is currently undergoing a transition from a niche use to widespread acceptance due to new, computationally intensive applications and multi-core processors. While parallel processing is an invaluable tool for increasing performance, more time and expertise are required to develop a parallel system than are required for sequential systems. This paper discusses a toolkit currently in development that will simplify both the hardware and software development of embedded distributed and parallel systems. The hardware interconnection mechanism uses the Serial Peripheral Interface as a physical medium and provides routing and management services for the system. The topics in this paper are primarily limited to the interconnection aspect of the toolkit.

  12. Storm blueprints patterns for distributed real-time computation

    CERN Document Server

    Goetz, P Taylor

    2014-01-01

A blueprints book with 10 different projects built in 10 different chapters, demonstrating the various use cases of Storm for both beginner and intermediate users, grounded in real-world example applications. Although the book focuses primarily on Java development with Storm, the patterns are more broadly applicable, and the tips, techniques, and approaches described in the book apply to architects, developers, and operations. Additionally, the book should provoke and inspire applications of distributed computing to other industries and domains. Hadoop enthusiasts will also find this book a go

  13. Job monitoring on DIRAC for Belle II distributed computing

    Science.gov (United States)

    Kato, Yuji; Hayasaka, Kiyoshi; Hara, Takanori; Miyake, Hideki; Ueda, Ikuo

    2015-12-01

    We developed a monitoring system for Belle II distributed computing, which consists of active and passive methods. In this paper we describe the passive monitoring system, where information stored in the DIRAC database is processed and visualized. We divide the DIRAC workload management flow into steps and store characteristic variables which indicate issues. These variables are chosen carefully based on our experiences, then visualized. As a result, we are able to effectively detect issues. Finally, we discuss the future development for automating log analysis, notification of issues, and disabling problematic sites.

  14. Repeat-until-success linear optics distributed quantum computing.

    Science.gov (United States)

    Lim, Yuan Liang; Beige, Almut; Kwek, Leong Chuan

    2005-07-15

    We demonstrate the possibility to perform distributed quantum computing using only single-photon sources (atom-cavity-like systems), linear optics, and photon detectors. The qubits are encoded in stable ground states of the sources. To implement a universal two-qubit gate, two photons should be generated simultaneously and pass through a linear optics network, where a measurement is performed on them. Gate operations can be repeated until a success is heralded without destroying the qubits at any stage of the operation. In contrast with other schemes, this does not require explicit qubit-qubit interactions, a priori entangled ancillas, nor the feeding of photons into photon sources.

  15. Computation of Universal Objects for Distributions Over Co-Trees

    DEFF Research Database (Denmark)

    Petersen, Henrik Densing; Topsøe, Flemming

    2012-01-01

For an ordered set, consider the model of distributions P for which an element that precedes another element is considered the more significant one in the sense that the implication a ≤ b ⇒ P(a) ≥ P(b) holds. It will be shown that if the ordered set is a finite co-tree, then the universal predictor...... for the model or, equivalently, the corresponding universal code, can be determined exactly via an algorithm of low complexity. Natural relations to problems on the computation of capacity and on the determination of information projections are established. More surprisingly, a direct connection to a problem...

  16. KNET - DISTRIBUTED COMPUTING AND/OR DATA TRANSFER PROGRAM

    Science.gov (United States)

    Hui, J.

    1994-01-01

    KNET facilitates distributed computing between a UNIX compatible local host and a remote host which may or may not be UNIX compatible. It is capable of automatic remote login. That is, it performs on the user's behalf the chore of handling host selection, user name, and password to the designated host. Once the login has been successfully completed, the user may interactively communicate with the remote host. Data output from the remote host may be directed to the local screen, to a local file, and/or to a local process. Conversely, data input from the keyboard, a local file, or a local process may be directed to the remote host. KNET takes advantage of the multitasking and terminal mode control features of the UNIX operating system. A parent process is used as the upper layer for interfacing with the local user. A child process is used for a lower layer for interfacing with the remote host computer, and optionally one or more child processes can be used for the remote data output. Output may be directed to the screen and/or to the local processes under the control of a data pipe switch. In order for KNET to operate, the local and remote hosts must observe a common communications protocol. KNET is written in ANSI standard C-language for computers running UNIX. It has been successfully implemented on several Sun series computers and a DECstation 3100 and used to run programs remotely on VAX VMS and UNIX based computers. It requires 100K of RAM under SunOS and 120K of RAM under DEC RISC ULTRIX. An electronic copy of the documentation is provided on the distribution medium. The standard distribution medium for KNET is a .25 inch streaming magnetic tape cartridge in UNIX tar format. It is also available on a 3.5 inch diskette in UNIX tar format. KNET was developed in 1991 and is a copyrighted work with all copyright vested in NASA. UNIX is a registered trademark of AT&T Bell Laboratories. Sun and SunOS are trademarks of Sun Microsystems, Inc. DECstation, VAX, VMS, and

  17. Performance Evaluation of Three Distributed Computing Environments for Scientific Applications

    Science.gov (United States)

    Fatoohi, Rod; Weeratunga, Sisira; Lasinski, T. A. (Technical Monitor)

    1994-01-01

    We present performance results for three distributed computing environments using the three simulated CFD applications in the NAS Parallel Benchmark suite. These environments are the DCF cluster, the LACE cluster, and an Intel iPSC/860 machine. The DCF is a prototypic cluster of loosely coupled SGI R3000 machines connected by Ethernet. The LACE cluster is a tightly coupled cluster of 32 IBM RS6000/560 machines connected by Ethernet as well as by either FDDI or an IBM Allnode switch. Results of several parallel algorithms for the three simulated applications are presented and analyzed based on the interplay between the communication requirements of an algorithm and the characteristics of the communication network of a distributed system.

  18. Spatial Memory Activity Distributions Indicate the Hippocampus Operates in a Continuous Manner

    Directory of Open Access Journals (Sweden)

    Brittany M. Jeye

    2016-08-01

    Full Text Available There is a long-standing debate as to whether recollection is a continuous/graded process or a threshold/all-or-none process. In the current spatial memory functional magnetic resonance imaging (fMRI study, we examined the hippocampal activity distributions—the magnitude of activity as a function of memory strength—to determine the nature of processing in this region. During encoding, participants viewed abstract shapes in the left or right visual field. During retrieval, old shapes were presented at fixation and participants classified each shape as previously in the “left” or “right” visual field followed by an “unsure”–“sure”–“very sure” confidence rating. The contrast of left-hits and left-misses produced two activations in the hippocampus. The hippocampal activity distributions for left shapes and right shapes were completely overlapping. Critically, the magnitude of activity associated with right-miss-very sure responses was significantly greater than zero. These results support the continuous model of recollection, which predicts overlapping activity distributions, and contradict the threshold model of recollection, which predicts a threshold above which only one distribution exists. Receiver operating characteristic analysis did not distinguish between models. The present results demonstrate that the hippocampus operates in a continuous manner during recollection and highlight the utility of analyzing activity distributions to determine the nature of neural processing.

  19. Distribution of glutamine synthetase in the chick forebrain: implications for passive avoidance memory formation.

    Science.gov (United States)

    O'Dowd, B S; Ng, K T; Robinson, S R

    1997-01-01

    The glial enzyme glutamine synthetase (GS) converts glutamate to glutamine; the latter is used by neurons for the resynthesis of glutamate and GABA. We have used a monoclonal antibody to GS to examine the regional distribution of this enzyme in the forebrains of day-old chicks. GS was detected in glia throughout the rostral and caudal regions of the forebrain and was particularly intense in the hippocampus, area parahippocampus and parts of the hyperstriatal and paleostriatal complex, regions widely considered to be involved in memory formation. Thus, our data provide an anatomical framework for the conclusion that neurons require the support of glia in order to restock their glutamate and/or GABA transmitter supplies during memory processing.

  20. ALGEBRAIC TURBULENCE MODEL WITH MEMORY FOR COMPUTATION OF 3-D TURBULENT BOUNDARY LAYERS WITH VALIDATION

    Institute of Scientific and Technical Information of China (English)

    2000-01-01

Additional equations, based on experiments, were added to an algebraic turbulence model to improve the prediction of three-dimensional turbulent boundary layers by taking into account the effects of pressure gradient and the history of the eddy viscosity, so the model has memory. Numerical calculations solving the boundary layer equations were carried out for five pressure-driven three-dimensional turbulent boundary layers developed on flat plates, a swept wing, and the symmetry plane of a prolate spheroid. Comparison of the computational results with the experimental data shows that the prediction is more accurate when the proposed closure equations are used, especially for the turbulent shear stresses.

  1. Chaining direct memory access data transfer operations for compute nodes in a parallel computer

    Science.gov (United States)

    Archer, Charles J.; Blocksome, Michael A.

    2010-09-28

    Methods, systems, and products are disclosed for chaining DMA data transfer operations for compute nodes in a parallel computer that include: receiving, by an origin DMA engine on an origin node in an origin injection FIFO buffer for the origin DMA engine, a RGET data descriptor specifying a DMA transfer operation data descriptor on the origin node and a second RGET data descriptor on the origin node, the second RGET data descriptor specifying a target RGET data descriptor on the target node, the target RGET data descriptor specifying an additional DMA transfer operation data descriptor on the origin node; creating, by the origin DMA engine, an RGET packet in dependence upon the RGET data descriptor, the RGET packet containing the DMA transfer operation data descriptor and the second RGET data descriptor; and transferring, by the origin DMA engine to a target DMA engine on the target node, the RGET packet.

  2. A Distributed Memory Hierarchy and Data Management for Interactive Scene Navigation and Modification on Tiled Display Walls.

    Science.gov (United States)

    Duy-Quoc Lai; Sajadi, Behzad; Jiang, Shan; Meenakshisundaram, Gopi; Majumder, Aditi

    2015-06-01

    Simultaneous modification and navigation of massive 3D models are difficult because repeated data edits affect the data layout and coherency on a secondary storage, which in turn affect the interactive out-of-core rendering performance. In this paper, we propose a novel approach for distributed data management for simultaneous interactive navigation and modification of massive 3D data using the readily available infrastructure of a tiled display. Tiled multi-displays, projection or LCD panel based, driven by a PC cluster, can be viewed as a cluster of storage-compute-display (SCD) nodes. Given a cluster of SCD node infrastructure, we first propose a distributed memory hierarchy for interactive rendering applications. Second, in order to further reduce the latency in such applications, we propose a new data partitioning approach for distributed storage among the SCD nodes that reduces the variance in the data load across the SCD nodes. Our data distribution method takes in a data set of any size, and reorganizes it into smaller partitions, and stores it across the multiple SCD nodes. These nodes store, manage, and coordinate data with other SCD nodes to simultaneously achieve interactive navigation and modification. Specifically, the data is not duplicated across these distributed secondary storage devices. In addition, coherency in data access, due to screen-space adjacency of adjacent displays in the tile, as well as object space adjacency of the data sets, is well leveraged in the design of the data management technique. Empirical evaluation on two large data sets, with different data density distribution, demonstrates that the proposed data management approach achieves superior performance over alternative state-of-the-art methods.

  3. A uniform approach for programming distributed heterogeneous computing systems.

    Science.gov (United States)

    Grasso, Ivan; Pellegrini, Simone; Cosenza, Biagio; Fahringer, Thomas

    2014-12-01

    Large-scale compute clusters of heterogeneous nodes equipped with multi-core CPUs and GPUs are getting increasingly popular in the scientific community. However, such systems require a combination of different programming paradigms making application development very challenging. In this article we introduce libWater, a library-based extension of the OpenCL programming model that simplifies the development of heterogeneous distributed applications. libWater consists of a simple interface, which is a transparent abstraction of the underlying distributed architecture, offering advanced features such as inter-context and inter-node device synchronization. It provides a runtime system which tracks dependency information enforced by event synchronization to dynamically build a DAG of commands, on which we automatically apply two optimizations: collective communication pattern detection and device-host-device copy removal. We assess libWater's performance in three compute clusters available from the Vienna Scientific Cluster, the Barcelona Supercomputing Center and the University of Innsbruck, demonstrating improved performance and scaling with different test applications and configurations.

  4. DISTRIBUTED GENERATION OF COMPUTER MUSIC IN THE INTERNET OF THINGS

    Directory of Open Access Journals (Sweden)

    G. G. Rogozinsky

    2015-07-01

Full Text Available Problem Statement. The paper deals with a distributed intelligent multi-agent system for computer music generation. A mathematical model for extracting data from the environment and applying them in the music generation process is proposed. Methods. We use the Resource Description Framework to represent timbre data. The music programming language Csound is used for the synthesis and sound-processing subsystem. Sound generation proceeds according to the parameters of the compositional model, which receives data from the outside world. Results. We propose an architecture for a potential distributed system for computer music generation. An example of core sound synthesis is presented. We also propose a method for mapping real-world parameters onto the plane of the compositional model, in an attempt to imitate elements and aspects of creative inspiration. The music generation system was exhibited as an artifact in the Central Museum of Communication n.a. A.S. Popov during the «Night of Museums» event. In the course of this public experiment it was observed that, on the whole, the system tends to settle quickly into a neutral state in which no musical events are generated. This demonstrates the need for algorithms that keep the agent network as a whole in an active state. Practical Relevance. Realization of the proposed system would provide a technological platform for a whole new class of applications, including augmented acoustic reality and algorithmic composition.

  5. Performance Evaluation of Communication Software Systems for Distributed Computing

    Science.gov (United States)

    Fatoohi, Rod

    1996-01-01

In recent years there has been an increasing interest in object-oriented distributed computing since it is better equipped to deal with complex systems while providing extensibility, maintainability, and reusability. At the same time, several new high-speed network technologies have emerged for local and wide area networks. However, the performance of networking software is not improving as fast as the networking hardware and the workstation microprocessors. This paper gives an overview and evaluates the performance of the Common Object Request Broker Architecture (CORBA) standard in a distributed computing environment at NASA Ames Research Center. The environment consists of two testbeds of SGI workstations connected by four networks: Ethernet, FDDI, HiPPI, and ATM. The performance results for three communication software systems are presented, analyzed and compared. These systems are: BSD socket programming interface, IONA's Orbix, an implementation of the CORBA specification, and the PVM message passing library. The results show that high-level communication interfaces, such as CORBA and PVM, can achieve reasonable performance under certain conditions.

  6. Model for teaching distributed computing in a distance-based educational environment

    CSIR Research Space (South Africa)

    le Roux, P

    2010-10-01

    Full Text Available for teaching distributed computing in a DEE. This research examines how distributed computing should be taught in a DEE in order to ensure effective and quality learning for students, specifically by investigating both the specific characteristics...

  7. Individual differences in components of reaction time distributions and their relations to working memory and intelligence.

    Science.gov (United States)

    Schmiedek, Florian; Oberauer, Klaus; Wilhelm, Oliver; Süss, Heinz-Martin; Wittmann, Werner W

    2007-08-01

    The authors bring together approaches from cognitive and individual differences psychology to model characteristics of reaction time distributions beyond measures of central tendency. Ex-Gaussian distributions and a diffusion model approach are used to describe individuals' reaction time data. The authors identified common latent factors for each of the 3 ex-Gaussian parameters and for 3 parameters central to the diffusion model using structural equation modeling for a battery of choice reaction tasks. These factors had differential relations to criterion constructs. Parameters reflecting the tail of the distribution (i.e., tau in the ex-Gaussian and drift rate in the diffusion model) were the strongest unique predictors of working memory, reasoning, and psychometric speed. Theories of controlled attention and binding are discussed as potential theoretical explanations.
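    For readers unfamiliar with the ex-Gaussian parameterization (a Gaussian component mu/sigma plus an exponential tail tau, the tail being the abstract's key predictor), here is a small sketch of simulating ex-Gaussian reaction times and recovering the parameters by the method of moments; the parameter values are illustrative and this is not the authors' estimation procedure.

```python
# Sketch of the ex-Gaussian RT model: RT = Normal(mu, sigma) + Exponential(tau).
# Method-of-moments recovery uses mean = mu + tau, var = sigma^2 + tau^2, and
# skewness = 2*tau^3 / var^(3/2).
import numpy as np

rng = np.random.default_rng(42)
mu, sigma, tau = 0.45, 0.05, 0.15            # seconds (illustrative values)
rt = rng.normal(mu, sigma, 10_000) + rng.exponential(tau, 10_000)

m, v = rt.mean(), rt.var()
skew = ((rt - m) ** 3).mean() / v ** 1.5
tau_hat = np.cbrt(skew / 2.0) * np.sqrt(v)
mu_hat = m - tau_hat
sigma_hat = np.sqrt(max(v - tau_hat ** 2, 0.0))
print(f"mu≈{mu_hat:.3f}  sigma≈{sigma_hat:.3f}  tau≈{tau_hat:.3f}")
```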

  8. GPU computing with OpenCL to model 2D elastic wave propagation: exploring memory usage

    Science.gov (United States)

    Iturrarán-Viveros, Ursula; Molero-Armenta, Miguel

    2015-01-01

    Graphics processing units (GPUs) have become increasingly powerful in recent years. Programs exploring the advantages of this architecture could achieve large performance gains and this is the aim of new initiatives in high performance computing. The objective of this work is to develop an efficient tool to model 2D elastic wave propagation on parallel computing devices. To this end, we implement the elastodynamic finite integration technique, using the industry open standard open computing language (OpenCL) for cross-platform, parallel programming of modern processors, and an open-source toolkit called [Py]OpenCL. The code written with [Py]OpenCL can run on a wide variety of platforms; it can be used on AMD or NVIDIA GPUs as well as classical multicore CPUs, adapting to the underlying architecture. Our main contribution is its implementation with local and global memory and the performance analysis using five different computing devices (including Kepler, one of the fastest and most efficient high performance computing technologies) with various operating systems.
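    The host-device pattern the abstract describes can be illustrated with a much simpler problem than the 2D elastodynamic finite integration scheme of the paper: below is a minimal [Py]OpenCL sketch of a 1D scalar wave update using only global memory. The kernel, buffer handling, and parameter names are illustrative assumptions, not the authors' code, and the local-memory variants discussed in the paper are not shown.

```python
# Minimal PyOpenCL sketch: explicit time stepping of a 1-D wave equation on
# whatever OpenCL device is available (GPU or CPU), global memory only.
import numpy as np
import pyopencl as cl

kernel_src = """
__kernel void step(__global const float *u_prev, __global const float *u_curr,
                   __global float *u_next, const float c2, const int n)
{
    int i = get_global_id(0);
    if (i > 0 && i < n - 1)
        u_next[i] = 2.0f*u_curr[i] - u_prev[i]
                  + c2*(u_curr[i+1] - 2.0f*u_curr[i] + u_curr[i-1]);
}
"""

n, c2 = 1024, np.float32(0.25)              # c2 = (c*dt/dx)^2, stable if <= 1
u_prev = np.zeros(n, dtype=np.float32)
u_curr = np.zeros(n, dtype=np.float32)
u_curr[n // 2] = 1.0                        # initial displacement pulse
u_next = np.empty_like(u_curr)

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags
bufs = [cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=a)
        for a in (u_prev, u_curr, u_next)]
prog = cl.Program(ctx, kernel_src).build()

for _ in range(200):                        # time stepping; rotate the buffers
    prog.step(queue, (n,), None, bufs[0], bufs[1], bufs[2], c2, np.int32(n))
    bufs = [bufs[1], bufs[2], bufs[0]]

cl.enqueue_copy(queue, u_curr, bufs[1])     # bufs[1] holds the latest field
print(float(np.abs(u_curr).max()))
```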

9. Computational investigations into the origins of 'short term' biochemical memory in T cell activation

    CERN Document Server

    Locasale, Jason W

    2007-01-01

    Recent studies have reported that T cells can integrate signals between interrupted encounters with Antigen Presenting Cells (APCs) in such a way that the process of signal integration exhibits a form of memory. Here, we carry out a computational study using a simple mathematical model of T cell activation to investigate the ramifications of interrupted T cell-APC contacts on signal integration. We consider several mechanisms of how signal integration at these time scales may be achieved and conclude that feedback control of immediate early gene products (IEGs) appears to be a highly plausible mechanism that allows for effective signal integration and cytokine production from multiple exposures to APCs. Analysis of these computer simulations provides an experimental roadmap involving several testable predictions.

  10. Distributed computations in a dynamic, heterogeneous Grid environment

    Science.gov (United States)

    Dramlitsch, Thomas

    2003-06-01

In order to face the rapidly increasing need for computational resources of various scientific and engineering applications, one has to think of new ways to make more efficient use of the world's current computational resources. In this respect, the growing speed of wide area networks has made a new kind of distributed computing possible: metacomputing or (distributed) Grid computing. This is a rather new and uncharted field in computational science. The rapidly increasing speed of networks even outpaces the average increase of processor speed: processor speeds double on average every 18 months, whereas network bandwidths double every 9 months. Due to this development of local and wide area networks, Grid computing will certainly play a key role in the future of parallel computing. This type of distributed computing, however, differs from traditional parallel computing in many ways, since it has to deal with many problems not occurring in classical parallel computing, for example heterogeneity, authentication and slow networks, to mention only a few. Some of those problems, e.g. the allocation of distributed resources along with providing information about these resources to the application, have already been addressed by the Globus software. Unfortunately, as far as we know, hardly any application or middleware software takes advantage of this information, since most parallelizing algorithms for finite differencing codes are implicitly designed for single supercomputer or cluster execution. We show that although it is possible to apply classical parallelizing algorithms in a Grid environment, in most cases the observed efficiency of the executed code is very poor. In this work we are closing this gap. In our thesis, we will - show that an execution of classical parallel codes in Grid environments is possible but very slow - analyze this situation of bad performance, nail down bottlenecks in communication, remove unnecessary overhead and

  11. Pervasive Computing, Privacy and Distribution of the Self

    Directory of Open Access Journals (Sweden)

    Soraj Hongladarom

    2011-05-01

    Full Text Available The emergence of what is commonly known as “ambient intelligence” or “ubiquitous computing” means that our conception of privacy and trust needs to be reconsidered. Many have voiced their concerns about the threat to privacy and the more prominent role of trust that have been brought about by emerging technologies. In this paper, I will present an investigation of what this means for the self and identity in our ambient intelligence environment. Since information about oneself can be actively distributed and processed, it is proposed that in a significant sense it is the self itself that is distributed throughout a pervasive or ubiquitous computing network when information pertaining to the self of the individual travels through the network. Hence privacy protection needs to be extended to all types of information distributed. It is also recommended that appropriately strong legislation on privacy and data protection regarding this pervasive network is necessary, but at present not sufficient, to ensure public trust. What is needed is a campaign on public awareness and positive perception of the technology.

  12. Classification of bacterial contamination using image processing and distributed computing.

    Science.gov (United States)

    Ahmed, W M; Bayraktar, B; Bhunia, A; Hirleman, E D; Robinson, J P; Rajwa, B

    2013-01-01

    Disease outbreaks due to contaminated food are a major concern not only for the food-processing industry but also for the public at large. Techniques for automated detection and classification of microorganisms can be a great help in preventing outbreaks and maintaining the safety of the nation's food supply. Identification and classification of foodborne pathogens using colony scatter patterns is a promising new label-free technique that utilizes image-analysis and machine-learning tools. However, the feature-extraction tools employed for this approach are computationally complex, and choosing the right combination of scatter-related features requires extensive testing with different feature combinations. In the presented work we used computer clusters to speed up the feature-extraction process, which enabled us to analyze the contribution of different scatter-based features to the overall classification accuracy. A set of 1000 scatter patterns representing ten different bacterial strains was used. Zernike and Chebyshev moments as well as Haralick texture features were computed from the available light-scatter patterns. The most promising features were first selected using Fisher's discriminant analysis, and subsequently a support-vector-machine (SVM) classifier with a linear kernel was used. With extensive testing we were able to identify a small subset of features that produced the desired results in terms of classification accuracy and execution speed. The use of distributed computing for scatter-pattern analysis, feature extraction, and selection provides a feasible mechanism for large-scale deployment of a light scatter-based approach to bacterial classification.
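
    A hypothetical sketch of the pipeline described above (feature scoring followed by a linear-kernel SVM), written with scikit-learn. The random arrays stand in for the 1000 scatter patterns and their moment/texture features, and the ANOVA F-score is used only as a simple stand-in for the Fisher discriminant criterion.

```python
# Hedged sketch: score candidate scatter features, keep the best few, and
# train a linear-kernel SVM; data and feature counts are placeholders.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 120))    # 1000 scatter patterns x 120 moment/texture features
y = rng.integers(0, 10, size=1000)  # ten bacterial strains

clf = make_pipeline(StandardScaler(),
                    SelectKBest(f_classif, k=20),   # keep the 20 highest-scoring features
                    LinearSVC(max_iter=10000))
print(cross_val_score(clf, X, y, cv=5).mean())      # chance level (~0.1) on random data
```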

  13. Efficient and Scalable Algorithms for Smoothed Particle Hydrodynamics on Hybrid Shared/Distributed-Memory Architectures

    CERN Document Server

    Gonnet, Pedro

    2014-01-01

    This paper describes a new fast and implicitly parallel approach to neighbour-finding in multi-resolution Smoothed Particle Hydrodynamics (SPH) simulations. This new approach is based on hierarchical cell decompositions and sorted interactions, within a task-based formulation. It is shown to be faster than traditional tree-based codes, and to scale better than domain decomposition-based approaches on hybrid shared/distributed-memory parallel architectures, e.g. clusters of multi-cores, achieving a $40\\times$ speedup over the Gadget-2 simulation code.
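
    A minimal serial sketch of cell-list neighbour finding, the building block that the task-based scheme above parallelises; the particle count, smoothing length, and uniform cell size are illustrative assumptions and do not reproduce the paper's hierarchical, sorted-interaction formulation.

```python
# Bin particles into cubic cells of side h, so neighbour candidates for each
# particle are confined to the 27 surrounding cells instead of all particles.
import numpy as np
from collections import defaultdict
from itertools import product

def neighbour_pairs(pos, h):
    cells = defaultdict(list)
    for i, p in enumerate(pos):
        cells[tuple((p // h).astype(int))].append(i)
    pairs = []
    for i, p in enumerate(pos):
        c = (p // h).astype(int)
        for off in product((-1, 0, 1), repeat=3):
            for j in cells.get(tuple(c + off), []):
                if j > i and np.sum((pos[i] - pos[j]) ** 2) < h * h:
                    pairs.append((i, j))
    return pairs

pos = np.random.rand(2000, 3)
print(len(neighbour_pairs(pos, h=0.1)))
```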

  14. Self-pacing direct memory access data transfer operations for compute nodes in a parallel computer

    Energy Technology Data Exchange (ETDEWEB)

    Blocksome, Michael A

    2015-02-17

    Methods, apparatus, and products are disclosed for self-pacing DMA data transfer operations for nodes in a parallel computer that include: transferring, by an origin DMA on an origin node, an RTS message to a target node, the RTS message specifying a message on the origin node for transfer to the target node; receiving, in an origin injection FIFO for the origin DMA from a target DMA on the target node in response to transferring the RTS message, a target RGET descriptor followed by a DMA transfer operation descriptor, the DMA transfer operation descriptor for transmitting a message portion to the target node, the target RGET descriptor specifying an origin RGET descriptor on the origin node that specifies an additional DMA descriptor for transmitting an additional message portion to the target node; processing, by the origin DMA, the target RGET descriptor; and processing, by the origin DMA, the DMA transfer operation descriptor.

  15. High Performance Distributed Computing in a Supercomputer Environment: Computational Services and Applications Issues

    Science.gov (United States)

    Kramer, Williams T. C.; Simon, Horst D.

    1994-01-01

    This tutorial proposes to be a practical guide for the uninitiated to the main topics and themes of high-performance computing (HPC), with particular emphasis on distributed computing. The intent is first to provide some guidance and direction in the rapidly growing field of scientific computing using both massively parallel and traditional supercomputers. Because of their considerable potential computational power, loosely or tightly coupled clusters of workstations are increasingly considered as a third alternative to both the more conventional supercomputers based on a small number of powerful vector processors and to massively parallel processors. Even though many research issues concerning the effective use of workstation clusters and their integration into a large-scale production facility are still unresolved, such clusters are already used for production computing. In this tutorial we will draw on the unique experience gained at the NAS facility at NASA Ames Research Center. Over the last five years at NAS, massively parallel supercomputers such as the Connection Machines CM-2 and CM-5 from Thinking Machines Corporation and the iPSC/860 (Touchstone Gamma Machine) and Paragon machines from Intel were used in a production supercomputer center alongside traditional vector supercomputers such as the Cray Y-MP and C90.

  16. Research into display sharing techniques for distributed computing environments

    Science.gov (United States)

    Hugg, Steven B.; Fitzgerald, Paul F., Jr.; Rosson, Nina Y.; Johns, Stephen R.

    1990-01-01

    The X-based Display Sharing solution for distributed computing environments is described. The Display Sharing prototype includes the base functionality for telecast and display copy requirements. Since the prototype implementation is modular and the system design provided flexibility for the Mission Control Center Upgrade (MCCU) operational considerations, the prototype implementation can be the baseline for a production Display Sharing implementation. To facilitate the process the following discussions are presented: Theory of operation; System architecture; Using the prototype; Software description; Research tools; Prototype evaluation; and Outstanding issues. The prototype is based on the concept of a dedicated central host performing the majority of the Display Sharing processing, allowing minimal impact on each individual workstation. Each workstation participating in Display Sharing hosts programs that provide its user's access to the Display Sharing host machine.

  17. 13th International Conference on Distributed Computing and Artificial Intelligence

    CERN Document Server

    Silvestri, Marcello; González, Sara

    2016-01-01

    The special session on Decision Economics (DECON) 2016 is a scientific forum for sharing ideas, projects, research results, models and experiences associated with the complexity of behavioral decision processes, aiming at explaining socio-economic phenomena. DECON 2016 was held at the University of Seville, Spain, as part of the 13th International Conference on Distributed Computing and Artificial Intelligence (DCAI) 2016. In the tradition of Herbert A. Simon’s interdisciplinary legacy, this book dedicates itself to the interdisciplinary study of decision-making, in the recognition that relevant decision-making takes place in a range of critical subject areas and research fields, including economics, finance, information systems, small and international business, management, operations, and production. Decision-making issues are of crucial importance in economics. Not surprisingly, the study of decision-making has received growing empirical research effort in the applied economics literature over the last ...

  18. Distributed Monitoring Infrastructure for Worldwide LHC Computing Grid

    CERN Document Server

    Andrade, Pedro; Bhatt, Kislay; Chand, Phool; Collados, David; Duggal, Vibhuti; Fuente, Paloma; Hayashi, Soichi; Imamagic, Emir; Joshi, Pradyumna; Kalmady, Rajesh; Karnani, Urvashi; Kumar, Vaibhav; Lapka, Wojciech; Quick, Robert; Tarragon, Jacobo; Teige, Scott; Triantafyllidis, Christos

    2012-01-01

    The journey of a monitoring probe from its development phase to the moment its execution result is presented in an availability report is a complex process. It goes through multiple phases such as development, testing, integration, release, deployment, execution, data aggregation, computation, and reporting. Further, it involves people with different roles (developers, site managers, VO managers, service managers, management), from different middleware providers (ARC, dCache, gLite, UNICORE and VDT), consortiums (WLCG, EMI, EGI, OSG), and operational teams (GOC, OMB, OTAG, CSIRT). The seamless harmonization of these distributed actors is in daily use for monitoring of the WLCG infrastructure. In this paper we describe the monitoring of the WLCG infrastructure from the operational perspective. We explain the complexity of the journey of a monitoring probe from its execution on a grid node to the visualization on the MyWLCG portal where it is exposed to other clients. This monitoring workflow profits from the i...

  19. Collaborative and distributed computing applied to biomedicine with the FightAIDS@Home

    Directory of Open Access Journals (Sweden)

    BARBOSA, G. C.

    2011-12-01

    Full Text Available Distributed computing systems are used for high performance computing tasks, taking advantage of the joint processing power of multiple independent computers interconnected by a network. These high performance systems can be divided into two classes of distributed computing systems: computer clusters and grid computing. FightAIDS@Home is a distributed system devoted to the search for solutions for AIDS treatment, and this project is led by the Olson Laboratory, California. Nowadays, this project uses the computing resources of the World Community Grid, which consists mostly of ordinary users' computers, using open and non-specific standard protocols and interfaces to ensure interoperability between different systems. This paper presents information from a case study of FightAIDS@Home and its implementation, using aspects of distributed systems such as grid computing and cloud computing to support the implementation of collaborative computing.

  20. Comparison of TCP automatic tuning techniques for distributed computing

    Energy Technology Data Exchange (ETDEWEB)

    Weigle, E. H. (Eric H.); Feng, W. C. (Wu-Chun)

    2002-01-01

    Rather than painful, manual, static, per-connection optimization of TCP buffer sizes simply to achieve acceptable performance for distributed applications, many researchers have proposed techniques to perform this tuning automatically. This paper first discusses the relative merits of the various approaches in theory, and then provides substantial experimental data concerning two competing implementations - the buffer autotuning already present in Linux 2.4.x and 'Dynamic Right-Sizing.' This paper reveals heretofore unknown aspects of the problem and current solutions, provides insight into the proper approach for various circumstances, and points toward ways to further improve performance. TCP, for good or ill, is the only protocol widely available for reliable end-to-end congestion-controlled network communication, and thus it is the one used for almost all distributed computing. Unfortunately, TCP was not designed with high-performance computing in mind - its original design decisions focused on long-term fairness first, with performance a distant second. Thus users must often perform tortuous manual optimizations simply to achieve acceptable behavior. The most important and often most difficult task is determining and setting appropriate buffer sizes. Because of this, at least six ways of automatically setting these sizes have been proposed. In this paper, we compare and contrast these tuning methods. First we explain each method, followed by an in-depth discussion of their features. Next we discuss the experiments to fully characterize two particularly interesting methods (Linux 2.4 autotuning and Dynamic Right-Sizing). We conclude with results and possible improvements.
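
    The manual tuning that the autotuning techniques compared above try to eliminate boils down to sizing socket buffers to the path's bandwidth-delay product. A small hedged sketch follows; the link bandwidth and round-trip time are illustrative assumptions, and the kernel may clamp or adjust the requested values.

```python
# Manually size TCP buffers to the bandwidth-delay product of an assumed path.
import socket

bandwidth_bps = 1_000_000_000              # assumed 1 Gb/s wide-area path
rtt_s = 0.05                               # assumed 50 ms round-trip time
bdp_bytes = int(bandwidth_bps / 8 * rtt_s) # ~6.25 MB can be in flight at once

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, bdp_bytes)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, bdp_bytes)
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))  # kernel may report a different value
```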

  1. Dynamic resource allocation scheme for distributed heterogeneous computer systems

    Science.gov (United States)

    Liu, Howard T. (Inventor); Silvester, John A. (Inventor)

    1991-01-01

    This invention relates to resource allocation in computer systems, and more particularly, to a method and associated apparatus for shortening response time and improving the efficiency of a heterogeneous distributed networked computer system by reallocating the jobs queued up for busy nodes to idle, or less-busy, nodes. In accordance with the algorithm (SIDA for short), the load-sharing is initiated by the server device in a manner such that extra overhead is not imposed on the system during heavily-loaded conditions. The algorithm employed in the present invention uses a dual-mode, server-initiated approach. Jobs are transferred from heavily burdened nodes (i.e., over a high threshold limit) to lightly burdened nodes at the initiation of the receiving node when: (1) a job finishes at a node which is burdened below a pre-established threshold level, or (2) a node is idle for a period of time as established by a wakeup timer at the node. The invention uses a combination of the local queue length and the local service rate ratio at each node as the workload indicator.
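
    The transfer rule can be pictured with a toy receiver-initiated sketch: a node that finishes a job (or whose wakeup timer fires) while below a low-water mark pulls work from a node above a high-water mark. The thresholds, queue lengths, and workload indicator below are simplified assumptions, not the patented SIDA formulation.

```python
# Toy receiver-initiated load sharing with high/low thresholds.
import random

HIGH, LOW = 8, 2
queues = {n: random.randint(0, 12) for n in range(6)}   # jobs queued per node

def maybe_pull(receiver):
    """Called when `receiver` finishes a job or its wakeup timer fires."""
    if queues[receiver] > LOW:
        return                                # receiver is busy enough; do nothing
    donor = max(queues, key=queues.get)       # most heavily burdened node
    if queues[donor] > HIGH:
        queues[donor] -= 1
        queues[receiver] += 1
        print(f"node {receiver} pulled a job from node {donor}")

print(queues)
maybe_pull(min(queues, key=queues.get))       # the least-busy node asks for work
print(queues)
```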

  2. Distributed Monitoring Infrastructure for Worldwide LHC Computing Grid

    Science.gov (United States)

    Andrade, P.; Babik, M.; Bhatt, K.; Chand, P.; Collados, D.; Duggal, V.; Fuente, P.; Hayashi, S.; Imamagic, E.; Joshi, P.; Kalmady, R.; Karnani, U.; Kumar, V.; Lapka, W.; Quick, R.; Tarragon, J.; Teige, S.; Triantafyllidis, C.

    2012-12-01

    The journey of a monitoring probe from its development phase to the moment its execution result is presented in an availability report is a complex process. It goes through multiple phases such as development, testing, integration, release, deployment, execution, data aggregation, computation, and reporting. Further, it involves people with different roles (developers, site managers, VO[1] managers, service managers, management), from different middleware providers (ARC[2], dCache[3], gLite[4], UNICORE[5] and VDT[6]), consortiums (WLCG[7], EMI[11], EGI[15], OSG[13]), and operational teams (GOC[16], OMB[8], OTAG[9], CSIRT[10]). The seamless harmonization of these distributed actors is in daily use for monitoring of the WLCG infrastructure. In this paper we describe the monitoring of the WLCG infrastructure from the operational perspective. We explain the complexity of the journey of a monitoring probe from its execution on a grid node to the visualization on the MyWLCG[27] portal where it is exposed to other clients. This monitoring workflow profits from the interoperability established between the SAM[19] and RSV[20] frameworks. We show how these two distributed structures are capable of uniting technologies and hiding the complexity around them, making them easy to be used by the community. Finally, the different supported deployment strategies, tailored not only for monitoring the entire infrastructure but also for monitoring sites and virtual organizations, are presented and the associated operational benefits highlighted.

  3. Distributed Algorithm for Computing the Vehicle Launch Dynamics under Interaction with the Medium

    Directory of Open Access Journals (Sweden)

    G. A. Shcheglov

    2015-01-01

    Full Text Available The paper describes a distributed algorithm and the structure of the software package for its implementation, in which a program for computing the vehicle launch dynamics under interaction with the medium flow is complemented with a program to determine the unsteady hydrodynamic loads by the vortex element method. A distinctive feature of the developed system is that its local (running on a single computing core) LEAVING program to calculate the launch dynamics runs together with the concurrent (running on multiple computing cores) MDVDD program to compute the unsteady vortex flow and hydrodynamic loads. The LEAVING program is the main one: it is launched as an application and then runs the MDVDD program in concurrent mode on the specified number of cores. Using MPI technology allows a multiprocessor PC or a local network of multiple PCs to be used to perform the calculations. The equations of the launcher spring-mass model dynamics and the equations of vortex element parameter evolution are integrated with the same time step. The inter-program communication within each step is performed asynchronously using the Windows Event mechanism (Events). Interfacing between the LEAVING and MDVDD programs is built using the Windows FileMapping technology, which allows a specified data structure to be mapped to and read from a fixed memory area. The paper provides an analysis of the speedup achieved with parallel processing on different numbers of cores, and determines the degree of parallelization of various operations. It shows that the parallelization efficiency of the developed algorithm is lower than in the case of computing the flow around a rigid body. The causes of the reduced efficiency are discussed. It is shown that the developed algorithm can be effectively used to solve problems on a small number of cores, e.g. on a PC based on one or two quad-core processors.

  4. Distributed computing feasibility in a non-dedicated homogeneous distributed system

    Science.gov (United States)

    Leutenegger, Scott T.; Sun, Xian-He

    1993-01-01

    The low cost and availability of clusters of workstations have led researchers to re-explore distributed computing using independent workstations. This approach may provide better cost/performance than tightly coupled multiprocessors. In practice, this approach often utilizes wasted cycles to run parallel jobs. The feasibility of such a non-dedicated parallel processing environment, assuming workstation processes have preemptive priority over parallel tasks, is addressed. An analytical model is developed to predict parallel job response times. Our model provides insight into how significantly workstation owner interference degrades parallel program performance. A new term, task ratio, which relates the parallel task demand to the mean service demand of nonparallel workstation processes, is introduced. It is proposed that the task ratio is a useful metric for determining how large the demand of a parallel application must be in order to make efficient use of a non-dedicated distributed system.

  5. Distributed and cloud computing from parallel processing to the Internet of Things

    CERN Document Server

    Hwang, Kai; Fox, Geoffrey C

    2012-01-01

    Distributed and Cloud Computing, named a 2012 Outstanding Academic Title by the American Library Association's Choice publication, explains how to create high-performance, scalable, reliable systems, exposing the design principles, architecture, and innovative applications of parallel, distributed, and cloud computing systems. Starting with an overview of modern distributed models, the book provides comprehensive coverage of distributed and cloud computing, including: Facilitating management, debugging, migration, and disaster recovery through virtualization Clustered systems for resear

  6. Hyperincursive McCulloch and Pitts neurons for designing a computing flip-flop memory

    Science.gov (United States)

    Dubois, Daniel M.

    1999-03-01

    This paper will first review a new theoretical basis for modelling neural Boolean networks by non-linear digital equations. With integer numbers, these digital equations are Heaviside Fixed Functions in the framework of Threshold Logic. These can represent non-linear neurons which can be split very easily into a set of McCulloch and Pitts formal neurons with hidden neurons. It is demonstrated that any Boolean table can be represented very easily by such neural networks, where the weights are always either an activation weight +1 or an inhibition weight -1, with integer thresholds. A fundamental problem in neural systems is the design of memory. This paper will present new memory neural systems based on hyperincursive neurons, that is, neurons with multiple output states for the same input, instead of synaptic weights. Finally, a differential equation of the neural membrane potential is used as a model of the brain; its incursive, that is implicit recursive, computation gives rise to non-locality effects.
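
    A minimal sketch of the threshold-logic building block described above: Heaviside units whose weights are restricted to +1 (activation) or -1 (inhibition) with integer thresholds, with XOR realised through two hidden neurons. The particular decomposition is an illustrative choice, not taken from the paper.

```python
# Threshold-logic neurons with weights limited to +1/-1 and integer thresholds.
def neuron(inputs, weights, threshold):
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

def xor(a, b):
    h1 = neuron((a, b), (+1, -1), 1)       # fires for a AND NOT b
    h2 = neuron((a, b), (-1, +1), 1)       # fires for b AND NOT a
    return neuron((h1, h2), (+1, +1), 1)   # OR of the two hidden units

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor(a, b))             # reproduces the XOR truth table
```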

  7. The Case for Higher Computational Density in the Memory-Bound FDTD Method within Multicore Environments

    Directory of Open Access Journals (Sweden)

    Mohammed F. Hadi

    2012-01-01

    Full Text Available It is argued here that more accurate, though more compute-intensive, alternatives to certain computational methods, which are deemed too inefficient and wasteful when implemented within serial codes, can be more efficient and cost-effective when implemented in parallel codes designed to run on today's multicore and many-core environments. This argument is most germane to methods that involve large data sets with relatively limited computational density—in other words, algorithms with small ratios of floating point operations to memory accesses. The examples chosen here to support this argument represent a variety of high-order finite-difference time-domain algorithms. It will be demonstrated that a three- to eightfold increase in floating-point operations due to higher-order finite differences will translate into only two- to threefold increases in actual run times using either graphical or central processing units of today. It is hoped that this argument will convince researchers to revisit certain numerical techniques that have long been shelved and reevaluate them for multicore usability.
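
    A back-of-the-envelope sketch of the arithmetic-intensity argument above, comparing standard second- and fourth-order central-difference stencils; the flop and byte counts are rough assumptions (8-byte doubles, no cache reuse), meant only to show the direction of the trade-off.

```python
# Higher-order stencils do several times more floating-point work per point
# while touching only a few more values, raising flops per byte moved.
import numpy as np

u, dx = np.random.rand(1_000_000), 1.0

d2_2nd = (u[2:] - 2 * u[1:-1] + u[:-2]) / dx**2                 # 3-point stencil
d2_4th = (-u[4:] + 16 * u[3:-1] - 30 * u[2:-2]
          + 16 * u[1:-3] - u[:-4]) / (12 * dx**2)               # 5-point stencil

for name, flops, points in (("2nd order", 4, 3), ("4th order", 9, 5)):
    print(f"{name}: ~{flops} flops / {8 * points} bytes "
          f"= {flops / (8 * points):.2f} flops per byte")
```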

  8. Computational Thermodynamics and Kinetics-Based ICME Framework for High-Temperature Shape Memory Alloys

    Science.gov (United States)

    Arróyave, Raymundo; Talapatra, Anjana; Johnson, Luke; Singh, Navdeep; Ma, Ji; Karaman, Ibrahim

    2015-11-01

    Over the last decade, considerable interest in the development of High-Temperature Shape Memory Alloys (HTSMAs) for solid-state actuation has increased dramatically as key applications in the aerospace and automotive industry demand actuation temperatures well above those of conventional SMAs. Most of the research to date has focused on establishing the (forward) connections between chemistry, processing, (micro)structure, properties, and performance. Much less work has been dedicated to the development of frameworks capable of addressing the inverse problem of establishing necessary chemistry and processing schedules to achieve specific performance goals. Integrated Computational Materials Engineering (ICME) has emerged as a powerful framework to address this problem, although it has yet to be applied to the development of HTSMAs. In this paper, the contributions of computational thermodynamics and kinetics to ICME of HTSMAs are described. Some representative examples of the use of computational thermodynamics and kinetics to understand the phase stability and microstructural evolution in HTSMAs are discussed. Some very recent efforts at combining both to assist in the design of HTSMAs and limitations to the full implementation of ICME frameworks for HTSMA development are presented.

  9. Achieving High Performance Distributed System: Using Grid, Cluster and Cloud Computing

    Directory of Open Access Journals (Sweden)

    Sunil Kr Singh

    2015-02-01

    Full Text Available To increase the efficiency of any task, we require a system that provides high performance along with flexibility and cost efficiency for users. Distributed computing, as we are all aware, has become very popular over the past decade. Distributed computing has three major types, namely cluster, grid, and cloud. In order to develop a high performance distributed system, we need to utilize all three of the above-mentioned types of computing. In this paper, we shall first give an introduction to all three types of distributed computing. Subsequently, examining them, we shall explore trends in computing and in green, sustainable computing to enhance the performance of a distributed system. Finally, presenting the future scope, we conclude the paper by suggesting a path to achieving a green high-performance distributed system using cluster, grid and cloud computing.

  10. Maintaining Traceability in an Evolving Distributed Computing Environment

    Science.gov (United States)

    Collier, I.; Wartel, R.

    2015-12-01

    The management of risk is fundamental to the operation of any distributed computing infrastructure. Identifying the cause of incidents is essential to prevent them from re-occurring. In addition, it is a goal to contain the impact of an incident while keeping services operational. For the response to incidents to be acceptable, it needs to be commensurate with the scale of the problem. The minimum level of traceability for distributed computing infrastructure usage is to be able to identify the source of all actions (executables, file transfers, pilot jobs, portal jobs, etc.) and the individual who initiated them. In addition, sufficiently fine-grained controls, such as blocking the originating user and monitoring to detect abnormal behaviour, are necessary for keeping services operational. It is essential to be able to understand the cause and to fix any problems before re-enabling access for the user. The aim is to be able to answer the basic questions who, what, where, and when concerning any incident. This requires retaining all relevant information, including timestamps and the digital identity of the user, sufficient to identify, for each service instance, every security event including at least the following: connect, authenticate, authorize (including identity changes) and disconnect. In traditional grid infrastructures (WLCG, EGI, OSG etc.) best practices and procedures for gathering and maintaining the information required to maintain traceability are well established. In particular, sites collect and store information required to ensure traceability of events at their sites. With the increased use of virtualisation and of private and public clouds for HEP workloads, established procedures, which are unable to see 'inside' running virtual machines, no longer capture all the information required. Maintaining traceability will at least involve a shift of responsibility from sites to Virtual Organisations (VOs) bringing with it new requirements for their

  11. WRF4G project: Adaptation of WRF Model to Distributed Computing Infrastructures

    Science.gov (United States)

    Cofino, Antonio S.; Fernández Quiruelas, Valvanuz; García Díez, Markel; Blanco Real, Jose C.; Fernández, Jesús

    2013-04-01

    Nowadays Grid computing is a powerful computational tool which is ready to be used by the scientific community in different areas (such as biomedicine, astrophysics, climate, etc.). However, the use of these distributed computing infrastructures (DCI) is not yet common practice in climate research, and only a few teams and applications in this area take advantage of them. Thus, the first objective of this project is to popularize the use of this technology in the atmospheric sciences area. In order to achieve this objective, one of the most widely used applications has been taken (WRF; a limited-area model, successor of the MM5 model), which has a user community of more than 8000 researchers worldwide. This community develops its research activity in different areas and could benefit from the advantages of Grid resources (case-study simulations, regional hind-cast/forecast, sensitivity studies, etc.). The WRF model is also used as input by many in the energy and natural hazards communities, so those communities will benefit as well. However, Grid infrastructures have some drawbacks for the execution of applications that make an intensive use of CPU and memory for a long period of time. This makes it necessary to develop a specific framework (middleware). This middleware encapsulates the application and provides appropriate services for the monitoring and management of the jobs and the data. Thus, the second objective of the project consists of the development of a generic adaptation of WRF for the Grid (WRF4G), to be distributed as open source and to be integrated in the official WRF development cycle. The use of this WRF adaptation should be transparent and useful for any of the previously described studies, and avoid any of the problems of the Grid infrastructure. Moreover, it should simplify the access to Grid infrastructures for the research teams, and also free them from the technical and computational aspects of the use of the Grid. Finally, in order to

  12. Protein folding by distributed computing and the denatured state ensemble.

    Science.gov (United States)

    Marianayagam, Neelan J; Fawzi, Nicolas L; Head-Gordon, Teresa

    2005-11-15

    The distributed computing (DC) paradigm in conjunction with the folding@home (FH) client server has been used to study the folding kinetics of small peptides and proteins, giving excellent agreement with experimentally measured folding rates, although pathways sampled in these simulations are not always consistent with the folding mechanism. In this study, we use a coarse-grain model of protein L, whose two-state kinetics have been characterized in detail by using long-time equilibrium simulations, to rigorously test a FH protocol using approximately 10,000 short-time, uncoupled folding simulations starting from an extended state of the protein. We show that the FH results give non-Poisson distributions and early folding events that are unphysical, whereas longer folding events experience a correct barrier to folding but are not representative of the equilibrium folding ensemble. Using short-time, uncoupled folding simulations started from an equilibrated denatured state ensemble (DSE), we also do not get agreement with the equilibrium two-state kinetics because of overrepresented folding events arising from higher energy subpopulations in the DSE. The DC approach using uncoupled short trajectories can make contact with traditionally measured experimental rates and folding mechanism when starting from an equilibrated DSE, when the simulation time is long enough to sample the lowest energy states of the unfolded basin and the simulated free-energy surface is correct. However, the DC paradigm, together with faster time-resolved and single-molecule experiments, can also reveal the breakdown in the two-state approximation due to observation of folding events from higher energy subpopulations in the DSE.

  13. High-performance computing, high-speed networks, and configurable computing environments: progress toward fully distributed computing.

    Science.gov (United States)

    Johnston, W E; Jacobson, V L; Loken, S C; Robertson, D W; Tierney, B L

    1992-01-01

    The next several years will see the maturing of a collection of technologies that will enable fully and transparently distributed computing environments. Networks will be used to configure independent computing, storage, and I/O elements into "virtual systems" that are optimal for solving a particular problem. This environment will make the most powerful computing systems those that are logically assembled from network-based components and will also make those systems available to a widespread audience. Anticipating that the necessary technology and communications infrastructure will be available in the next 3 to 5 years, we are developing and demonstrating prototype applications that test and exercise the currently available elements of this configurable environment. The Lawrence Berkeley Laboratory (LBL) Information and Computing Sciences and Research Medicine Divisions have collaborated with the Pittsburgh Supercomputer Center to demonstrate one distributed application that illuminates the issues and potential of using networks to configure virtual systems. This application allows the interactive visualization of large three-dimensional (3D) scalar fields (voxel data sets) by using a network-based configuration of heterogeneous supercomputers and workstations. The specific test case is visualization of 3D magnetic resonance imaging (MRI) data. The virtual system architecture consists of a Connection Machine-2 (CM-2) that performs surface reconstruction from the voxel data, a Cray Y-MP that renders the resulting geometric data into an image, and a workstation that provides the display of the image and the user interface for specifying the parameters for the geometry generation and 3D viewing. These three elements are configured into a virtual system by using several different network technologies. This paper reviews the current status of the software, hardware, and communications technologies that are needed to enable this configurable environment. These

  14. Memory architecture for efficient utilization of SDRAM: a case study of the computation/memory access trade-off

    DEFF Research Database (Denmark)

    Gleerup, Thomas Møller; Holten-Lund, Hans Erik; Madsen, Jan

    2000-01-01

    This paper discusses the trade-off between calculations and memory accesses in a 3D graphics tile renderer for visualization of data from medical scanners. The performance requirement of this application is a frame rate of 25 frames per second when rendering 3D models with 2 million triangles, i....... In software, forward differencing is usually better, but in this hardware implementation, the trade-off has made it possible to develop a very regular memory architecture with a buffering system, which can reach 95% bandwidth utilization using off-the-shelf SDRAM. This is achieved by changing the algorithm... to use a memory access strategy with write-only and read-only phases, and a buffering system, which uses round-robin bank write-access combined with burst read-access....

  15. Analytical Study of Object Components for Distributed and Ubiquitous Computing Environment

    CERN Document Server

    Batra, Usha; Bhardwaj, Sachin

    2011-01-01

    Distributed object computing is a paradigm that allows objects to be distributed across a heterogeneous network, and allows each of the components to interoperate as a unified whole. A new generation of distributed applications, such as telemedicine and e-commerce applications, are being deployed in heterogeneous and ubiquitous computing environments. The objective of this paper is to explore the applicability of component-based services in a ubiquitous computing environment. While the fundamental structure of various distributed object components is similar, there are differences that can profoundly impact an application developer or the administrator of a distributed simulation exercise and their implementation in a ubiquitous computing environment.

  16. Scaling Techniques for Massive Scale-Free Graphs in Distributed (External) Memory

    KAUST Repository

    Pearce, Roger

    2013-05-01

    We present techniques to process large scale-free graphs in distributed memory. Our aim is to scale to trillions of edges, and our research is targeted at leadership class supercomputers and clusters with local non-volatile memory, e.g., NAND Flash. We apply an edge list partitioning technique, designed to accommodate high-degree vertices (hubs) that create scaling challenges when processing scale-free graphs. In addition to partitioning hubs, we use ghost vertices to represent the hubs to reduce communication hotspots. We present a scaling study with three important graph algorithms: Breadth-First Search (BFS), K-Core decomposition, and Triangle Counting. We also demonstrate scalability on BG/P Intrepid by comparing to best known Graph500 results. We show results on two clusters with local NVRAM storage that are capable of traversing trillion-edge scale-free graphs. By leveraging node-local NAND Flash, our approach can process thirty-two times larger datasets with only a 39% performance degradation in Traversed Edges Per Second (TEPS). © 2013 IEEE.
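
    A conceptual sketch of the hub-aware edge partitioning described above: edges incident to high-degree vertices are spread across ranks, and each of those ranks keeps a lightweight ghost copy of the hub so traffic aimed at it fans out rather than concentrating on one owner. The tiny graph, rank count, and hub-degree cutoff are illustrative assumptions, not the paper's implementation.

```python
# Spread hub edges across ranks and record ghost copies of the hubs.
from collections import Counter, defaultdict

edges = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 2), (3, 4)]
NRANKS, HUB_DEGREE = 3, 4

degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1
hubs = {v for v, d in degree.items() if d >= HUB_DEGREE}

partition = defaultdict(list)   # rank -> local edge list
ghosts = defaultdict(set)       # rank -> ghost (hub) vertices held locally
for i, (u, v) in enumerate(edges):
    # Hub edges round-robin across ranks; ordinary edges hash to one owner.
    rank = i % NRANKS if (u in hubs or v in hubs) else hash((u, v)) % NRANKS
    partition[rank].append((u, v))
    for x in (u, v):
        if x in hubs:
            ghosts[rank].add(x)

print(dict(partition))
print({r: sorted(g) for r, g in ghosts.items()})
```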

  17. Aho-Corasick String Matching on Shared and Distributed Memory Parallel Architectures

    Energy Technology Data Exchange (ETDEWEB)

    Tumeo, Antonino; Villa, Oreste; Chavarría-Miranda, Daniel

    2012-03-01

    String matching is at the core of many critical applications, including network intrusion detection systems, search engines, virus scanners, spam filters, DNA and protein sequencing, and data mining. For all of these applications string matching requires a combination of (sometimes all) the following characteristics: high and/or predictable performance, support for large data sets and flexibility of integration and customization. Many software based implementations targeting conventional cache-based microprocessors fail to achieve high and predictable performance requirements, while Field-Programmable Gate Array (FPGA) implementations and dedicated hardware solutions fail to support large data sets (dictionary sizes) and are difficult to integrate and customize. The advent of multicore, multithreaded, and GPU-based systems is opening the possibility for software based solutions to reach very high performance at a sustained rate. This paper compares several software-based implementations of the Aho-Corasick string searching algorithm for high performance systems. We discuss the implementation of the algorithm on several types of shared-memory high-performance architectures (Niagara 2, large x86 SMPs and Cray XMT), distributed memory with homogeneous processing elements (InfiniBand cluster of x86 multicores) and heterogeneous processing elements (InfiniBand cluster of x86 multicores with NVIDIA Tesla C10 GPUs). We describe in detail how each solution achieves the objectives of supporting large dictionaries, sustaining high performance, and enabling customization and flexibility using various data sets.
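
    For reference, a compact serial sketch of the Aho-Corasick automaton (trie plus failure links) whose parallel implementations the paper compares; the dictionary and text below are the usual toy example, not the paper's data sets.

```python
# Plain serial Aho-Corasick: build a trie with failure links, then scan the
# text once, reporting (start_index, pattern) for every dictionary hit.
from collections import deque

def build(patterns):
    goto, fail, out = [{}], [0], [set()]
    for p in patterns:                                # trie construction
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({}); fail.append(0); out.append(set())
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].add(p)
    q = deque(goto[0].values())                       # BFS to set failure links
    while q:
        s = q.popleft()
        for ch, t in goto[s].items():
            q.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] |= out[fail[t]]
    return goto, fail, out

def search(text, goto, fail, out):
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        hits += [(i - len(p) + 1, p) for p in out[s]]
    return hits

automaton = build(["he", "she", "his", "hers"])
print(sorted(search("ushers", *automaton)))           # [(1, 'she'), (2, 'he'), (2, 'hers')]
```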

  18. An evaluation of biosurveillance grid--dynamic algorithm distribution across multiple computer nodes.

    Science.gov (United States)

    Tsai, Ming-Chi; Tsui, Fu-Chiang; Wagner, Michael M

    2007-10-11

    Performing fast data analysis to detect disease outbreaks plays a critical role in real-time biosurveillance. In this paper, we described and evaluated an Algorithm Distribution Manager Service (ADMS) based on grid technologies, which dynamically partitions and distributes detection algorithms across multiple computers. We compared the execution time to perform the analysis on a single computer and on a grid network (3 computing nodes) with and without using dynamic algorithm distribution. We found that algorithms with long runtimes completed approximately three times earlier in the distributed environment than on a single computer, while short-runtime algorithms performed worse in the distributed environment. A dynamic algorithm distribution approach also performed better than a static algorithm distribution approach. This pilot study shows a great potential to reduce lengthy analysis time through dynamic algorithm partitioning and parallel processing, and provides the opportunity of distributing algorithms from a client to remote computers in a grid network.
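
    The gain from dynamic distribution can be pictured with a toy longest-runtime-first assignment of detection algorithms to nodes, so one slow algorithm no longer serialises the whole analysis; the algorithm names, runtime estimates, and node count below are illustrative assumptions, not the ADMS implementation.

```python
# Toy dynamic distribution: assign the longest-running detection algorithms
# first, always to the currently least-loaded node.
import heapq

est_runtime = {"CUSUM": 120, "EWMA": 30, "SaTScan": 600, "EARS-C3": 45, "Regression": 300}
NODES = 3

heap = [(0, n) for n in range(NODES)]               # (current load in seconds, node id)
assignment = {n: [] for n in range(NODES)}
for alg, t in sorted(est_runtime.items(), key=lambda kv: -kv[1]):
    load, node = heapq.heappop(heap)                # least-loaded node so far
    assignment[node].append(alg)
    heapq.heappush(heap, (load + t, node))

print(assignment)                                   # node id -> algorithms assigned to it
```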

  19. Evaluating Emulation-based Models of Distributed Computing Systems.

    Energy Technology Data Exchange (ETDEWEB)

    Jones, Stephen T.

    2017-10-01

    Emulation-based models of distributed computing systems are collections of virtual machines, virtual networks, and other emulation components configured to stand in for operational systems when performing experimental science, training, analysis of design alternatives, test and evaluation, or idea generation. As with any tool, we should carefully evaluate whether our uses of emulation-based models are appropriate and justified. Otherwise, we run the risk of using a model incorrectly and creating meaningless results. The variety of uses of emulation-based models each have their own goals and deserve thoughtful evaluation. In this paper, we enumerate some of these uses and describe approaches that one can take to build an evidence-based case that a use of an emulation-based model is credible. Predictive uses of emulation-based models, where we expect a model to tell us something true about the real world, set the bar especially high, and the principal evaluation method, called validation, is commensurately rigorous. We spend the majority of our time describing and demonstrating the validation of a simple predictive model using a well-established methodology inherited from decades of development in the computational science and engineering community.

  20. 26 CFR 1.1247-2 - Computation and distribution of taxable income.

    Science.gov (United States)

    2010-04-01

    ... 26 Internal Revenue 11 2010-04-01 2010-04-01 true Computation and distribution of taxable income....1247-2 Computation and distribution of taxable income. (a) In general. Taxable income of a foreign... such taxable year as a distribution made during such taxable year of such taxable income. The...

  1. Design & implementation of distributed spatial computing node based on WPS

    Science.gov (United States)

    Liu, Liping; Li, Guoqing; Xie, Jibo

    2014-03-01

    Currently, research on SIG (Spatial Information Grid) technology mostly emphasizes spatial data sharing in the grid environment, while the importance of spatial computing resources is overlooked. In order to implement the sharing and cooperation of spatial computing resources in a grid environment, this paper systematically investigates the key technologies for constructing a Spatial Computing Node based on the WPS (Web Processing Service) specification by the OGC (Open Geospatial Consortium). A framework for the Spatial Computing Node is designed according to the features of spatial computing resources. Finally, a prototype of the Spatial Computing Node is implemented and the relevant verification work in this environment is completed.

  2. Comparison between sparsely distributed memory and Hopfield-type neural network models

    Science.gov (United States)

    Keeler, James D.

    1986-01-01

    The Sparsely Distributed Memory (SDM) model (Kanerva, 1984) is compared to Hopfield-type neural-network models. A mathematical framework for comparing the two is developed, and the capacity of each model is investigated. The capacity of the SDM can be increased independently of the dimension of the stored vectors, whereas the Hopfield capacity is limited to a fraction of this dimension. However, the total number of stored bits per matrix element is the same in the two models, as well as for extended models with higher order interactions. The models are also compared in their ability to store sequences of patterns. The SDM is extended to include time delays so that contextual information can be used to cover sequences. Finally, it is shown how a generalization of the SDM allows storage of correlated input pattern vectors.
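
    A minimal numpy sketch of a Kanerva-style sparse distributed memory, the model being compared above: random hard locations, activation by Hamming radius, counter updates on write, and a majority vote on read. The word length, number of locations, and radius are toy values chosen for illustration and do not follow the paper's capacity analysis.

```python
# Toy SDM: store a pattern autoassociatively, then read it back from a noisy cue.
import numpy as np

rng = np.random.default_rng(1)
N, M, RADIUS = 256, 2000, 112                   # word length, hard locations, Hamming radius

hard_addr = rng.integers(0, 2, size=(M, N))     # fixed random location addresses
counters = np.zeros((M, N), dtype=int)          # N counters per hard location

def activate(addr):
    return np.count_nonzero(hard_addr != addr, axis=1) <= RADIUS

def write(addr, data):
    counters[activate(addr)] += 2 * data - 1    # +1 for a 1-bit, -1 for a 0-bit

def read(addr):
    return (counters[activate(addr)].sum(axis=0) >= 0).astype(int)

pattern = rng.integers(0, 2, size=N)
write(pattern, pattern)                          # autoassociative store
noisy = pattern.copy(); noisy[:20] ^= 1          # flip 20 bits of the cue
print(np.count_nonzero(read(noisy) != pattern))  # usually recovers most bits
```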

  3. Advanced compilation techniques in the PARADIGM compiler for distributed-memory multicomputers

    Science.gov (United States)

    Su, Ernesto; Lain, Antonio; Ramaswamy, Shankar; Palermo, Daniel J.; Hodges, Eugene W., IV; Banerjee, Prithviraj

    1995-01-01

    The PARADIGM compiler project provides an automated means to parallelize programs, written in a serial programming model, for efficient execution on distributed-memory multicomputers. A previous implementation of the compiler based on the PTD representation allowed symbolic array sizes, affine loop bounds and array subscripts, and a variable number of processors, provided that arrays were single- or multi-dimensionally block distributed. The techniques presented here extend the compiler to also accept multidimensional cyclic and block-cyclic distributions within a uniform symbolic framework. These extensions demand more sophisticated symbolic manipulation capabilities. A novel aspect of our approach is to meet this demand by interfacing PARADIGM with a powerful off-the-shelf symbolic package, Mathematica. This paper describes some of the Mathematica routines that perform various transformations, shows how they are invoked and used by the compiler to overcome the new challenges, and presents experimental results for code involving cyclic and block-cyclic arrays as evidence of the feasibility of the approach.
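
    The data layout these extensions must reason about can be shown numerically: under the standard HPF/ScaLAPACK-style block-cyclic distribution with block size b over p processors, the owner and local position of a global index follow a simple closed form. The sketch below is a numeric illustration only; PARADIGM performs this reasoning symbolically via Mathematica.

```python
# Standard block-cyclic mapping: global index i with block size b on p procs.
def block_cyclic(i, b, p):
    owner = (i // b) % p                      # which processor holds element i
    local = (i // (b * p)) * b + (i % b)      # position within that processor's storage
    return owner, local

for i in range(12):
    print(i, block_cyclic(i, b=2, p=3))
# blocks of 2 are dealt round-robin: i=0,1 -> proc 0; i=2,3 -> proc 1; i=6,7 -> proc 0 again
```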

  4. Managing to Change: The Wharton School's Distributed Staff Model for Computing Support.

    Science.gov (United States)

    Eleey, Michael

    1993-01-01

    The University of Pennsylvania's Wharton School introduced a "distributed" organization for managing computing support services. The hybrid structure combined elements of centralized computing and departmental computing by placing computing personnel in the departments, under central management. The program covers a wide range of support services…

  5. Autonomous management of distributed information systems using evolutionary computation techniques

    Science.gov (United States)

    Oates, Martin J.

    1999-03-01

    As the size of typical industrial strength information systems continues to rise, particularly in the arena of Internet based management information systems and multimedia servers, the issue of managing data distribution over clusters or `farms' to overcome performance and scalability issues is becoming of paramount importance. Further, where access is global, this can cause points of geographically localized load contention to `follow the sun' during the day. Traditional site mirroring is not overly effective in addressing this contention and so a more dynamic approach is being investigated to tackle load balancing. The general objective is to manage a self-adapting, distributed database so as to reliably and consistently provide near optimal performance as perceived by client applications. Such a management system must be ultimately capable of operating over a range of time varying usage profiles and fault scenarios, incorporate considerations for communications network delays, multiple updates and maintenance operations. It must also be shown to be capable of being scaled in a practical fashion to ever larger sized networks and databases. Two key components of such an automated system are an optimiser capable of efficiently finding new configuration options, and a suitable model of the system capable of accurately reflecting the performance (or any other required quality of service metric) of the real world system. As conditions change in the real world system, these are fed into the model. The optimiser is then run to find new configurations which are tested in the model prior to implementation in the real world. The model therefore forms an evaluation function which the optimiser utilises to direct its search. Whilst it has already been shown that Genetic Algorithms can provide good solutions to this problem, there are a number of issues associated with this approach. In particular, for industrial strength applications, it must be shown that the GA employed

  6. Playing the computer game Tetris prior to viewing traumatic film material and subsequent intrusive memories: Examining proactive interference.

    Science.gov (United States)

    James, Ella L; Lau-Zhu, Alex; Tickle, Hannah; Horsch, Antje; Holmes, Emily A

    2016-12-01

    Visuospatial working memory (WM) tasks performed concurrently or after an experimental trauma (traumatic film viewing) have been shown to reduce subsequent intrusive memories (concurrent or retroactive interference, respectively). This effect is thought to arise because, during the time window of memory consolidation, the film memory is labile and vulnerable to interference by the WM task. However, it is not known whether tasks before an experimental trauma (i.e. proactive interference) would also be effective. Therefore, we tested if a visuospatial WM task given before a traumatic film reduced intrusions. Findings are relevant to the development of preventative strategies to reduce intrusive memories of trauma for groups who are routinely exposed to trauma (e.g. emergency services personnel) and for whom tasks prior to trauma exposure might be beneficial. Participants were randomly assigned to 1 of 2 conditions. In the Tetris condition (n = 28), participants engaged in the computer game for 11 min immediately before viewing a 12-min traumatic film, whereas those in the Control condition (n = 28) had no task during this period. Intrusive memory frequency was assessed using an intrusion diary over 1 week and an Intrusion Provocation Task at 1-week follow-up. Recognition memory for the film was also assessed at 1 week. Compared to the Control condition, participants in the Tetris condition did not show a statistically significant difference in intrusive memories of the trauma film on either measure. There was also no statistically significant difference in recognition memory scores between conditions. The study used an experimental trauma paradigm and findings may not be generalizable to a clinical population. Compared to control, playing Tetris before viewing a trauma film did not lead to a statistically significant reduction in the frequency of later intrusive memories of the film. It is unlikely that proactive interference, at least with this task

  7. Asteroids@home - A BOINC distributed computing project for asteroid shape reconstruction

    CERN Document Server

    Durech, Josef; Vanco, Radim

    2015-01-01

    We present the project Asteroids@home that uses distributed computing to solve the time-consuming inverse problem of shape reconstruction of asteroids. The project uses the Berkeley Open Infrastructure for Network Computing (BOINC) framework to distribute, collect, and validate small computational units that are solved independently at individual computers of volunteers connected to the project. Shapes, rotational periods, and orientations of the spin axes of asteroids are reconstructed from their disk-integrated photometry by the lightcurve inversion method.

  8. Memory conformity affects inaccurate memories more than accurate memories.

    Science.gov (United States)

    Wright, Daniel B; Villalba, Daniella K

    2012-01-01

    After controlling for initial confidence, inaccurate memories were shown to be more easily distorted than accurate memories. In two experiments groups of participants viewed 50 stimuli and were then presented with these stimuli plus 50 fillers. During this test phase participants reported their confidence that each stimulus was originally shown. This was followed by computer-generated responses from a bogus participant. After being exposed to this response participants again rated the confidence of their memory. The computer-generated responses systematically distorted participants' responses. Memory distortion depended on initial memory confidence, with uncertain memories being more malleable than confident memories. This effect was moderated by whether the participant's memory was initially accurate or inaccurate. Inaccurate memories were more malleable than accurate memories. The data were consistent with a model describing two types of memory (i.e., recollective and non-recollective memories), which differ in how susceptible these memories are to memory distortion.

  9. Serotonergic modulation of spatial working memory: predictions from a computational network model

    Directory of Open Access Journals (Sweden)

    Maria eCano-Colino

    2013-09-01

    Full Text Available Serotonin (5-HT receptors of types 1A and 2A are massively expressed in prefrontal cortex (PFC neurons, an area associated with cognitive function. Hence, 5-HT could be effective in modulating prefrontal-dependent cognitive functions, such as spatial working memory (SWM. However, a direct association between 5-HT and SWM has proved elusive in psycho-pharmacological studies. Recently, a computational network model of the PFC microcircuit was used to explore the relationship between 5‑HT and SWM (Cano-Colino et al. 2013. This study found that both excessive and insufficient 5-HT levels lead to impaired SWM performance in the network, and it concluded that analyzing behavioral responses based on confidence reports could facilitate the experimental identification of SWM behavioral effects of 5‑HT neuromodulation. Such analyses may have confounds based on our limited understanding of metacognitive processes. Here, we extend these results by deriving three additional predictions from the model that do not rely on confidence reports. Firstly, only excessive levels of 5-HT should result in SWM deficits that increase with delay duration. Secondly, excessive 5-HT baseline concentration makes the network vulnerable to distractors at distances that were robust to distraction in control conditions, while the network still ignores distractors efficiently for low 5‑HT levels that impair SWM. Finally, 5-HT modulates neuronal memory fields in neurophysiological experiments: Neurons should be better tuned to the cued stimulus than to the behavioral report for excessive 5-HT levels, while the reverse should happen for low 5-HT concentrations. In all our simulations agonists of 5-HT1A receptors and antagonists of 5-HT2A receptors produced behavioral and physiological effects in line with global 5-HT level increases. Our model makes specific predictions to be tested experimentally and advance our understanding of the neural basis of SWM and its neuromodulation

  10. A wireless computational platform for distributed computing based traffic monitoring involving mixed Eulerian-Lagrangian sensing

    KAUST Repository

    Jiang, Jiming

    2013-06-01

    This paper presents a new wireless platform designed for an integrated traffic monitoring system based on combined Lagrangian (mobile) and Eulerian (fixed) sensing. The sensor platform is built around a 32-bit ARM Cortex M4 micro-controller and a 2.4GHz 802.15.4 ISM compliant radio module, and can be interfaced with fixed traffic sensors, or receive data from vehicle transponders. The platform is specially designed and optimized to be integrated in a solar-powered wireless sensor network in which traffic flow maps are computed by the nodes directly using distributed computing. An MPPT circuit is proposed to increase the power output of the attached solar panel. A self-recovering unit is designed to increase reliability and allow periodic hard resets, an essential requirement for sensor networks. A radio monitoring circuit is proposed to monitor incoming and outgoing transmissions, simplifying software debugging. An ongoing implementation is briefly discussed, and compared with existing platforms used in wireless sensor networks. © 2013 IEEE.

  11. Evaluation of a connectionless NoC for a real-time distributed shared memory many-core system

    NARCIS (Netherlands)

    Rutgers, Jochem H.; Bekooij, Marco J.G.; Smit, Gerard J.M.

    2012-01-01

    Real-time embedded systems like smartphones tend to comprise an ever increasing number of processing cores. For scalability and the need for guaranteed performance, the use of a connection-oriented network-on-chip (NoC) is advocated. Furthermore, a distributed shared memory architecture is preferred

  12. Access Control for Agent-based Computing: A Distributed Approach.

    Science.gov (United States)

    Antonopoulos, Nick; Koukoumpetsos, Kyriakos; Shafarenko, Alex

    2001-01-01

    Discusses the mobile software agent paradigm that provides a foundation for the development of high performance distributed applications and presents a simple, distributed access control architecture based on the concept of distributed, active authorization entities (lock cells), any combination of which can be referenced by an agent to provide…

  13. Matrix-dependent Strain Distributions of Au and Ag Nanoparticles in a Metal-oxide-semiconductor-based Nonvolatile Memory Device

    OpenAIRE

    Honghua Huang; Ying Zhang; Wenyan Wei; Ting Yu; Xingfang Luo; Cailei Yuan

    2015-01-01

    The matrix-dependent strain distributions of Au and Ag nanoparticles in a metal-oxide-semiconductor based nonvolatile memory device are investigated by finite element calculations. The simulation results clearly indicate that both Au and Ag nanoparticles incur compressive strain by high-k Al2O3 and conventional SiO2 dielectrics. The strain distribution of nanoparticles is closely related to the surrounding matrix. Nanoparticles embedded in different matrices experience different compressive s...

  14. Distributed cerebellar plasticity implements generalized multiple-scale memory components in real-robot sensorimotor tasks

    Directory of Open Access Journals (Sweden)

    Claudia eCasellato

    2015-02-01

    Full Text Available The cerebellum plays a crucial role in motor learning and it acts as a predictive controller. Modeling it and embedding it into sensorimotor tasks allows us to create functional links between plasticity mechanisms, neural circuits and behavioral learning. Moreover, if applied to real-time control of a neurorobot, the cerebellar model has to deal with a real, noisy and changing environment, thus showing its robustness and effectiveness in learning. A biologically inspired cerebellar model with distributed plasticity, both at cortical and nuclear sites, has been used. Two cerebellum-mediated paradigms have been designed: an associative Pavlovian task and a vestibulo-ocular reflex, with multiple sessions of acquisition and extinction and with different stimuli and perturbation patterns. The cerebellar controller succeeded in generating conditioned responses and finely tuned eye movement compensation, thus reproducing human-like behaviors. Through a productive plasticity transfer from cortical to nuclear sites, the distributed cerebellar controller showed, in both tasks, the capability to optimize learning on multiple time-scales, to store motor memory and to effectively adapt to dynamic ranges of stimuli.

  15. Playable Serious Games for Studying and Programming Computational STEM and Informatics Applications of Distributed and Parallel Computer Architectures

    Science.gov (United States)

    Amenyo, John-Thones

    2012-01-01

    Carefully engineered playable games can serve as vehicles for students and practitioners to learn and explore the programming of advanced computer architectures to execute applications, such as high performance computing (HPC) and complex, inter-networked, distributed systems. The article presents families of playable games that are grounded in…

  16. Strategic Priming with Multiple Antigens can Yield Memory Cell Phenotypes Optimized for Infection with Mycobacterium tuberculosis: A Computational Study

    Science.gov (United States)

    Ziraldo, Cordelia; Gong, Chang; Kirschner, Denise E.; Linderman, Jennifer J.

    2016-01-01

    Lack of an effective vaccine results in 9 million new cases of tuberculosis (TB) every year and 1.8 million deaths worldwide. Although many infants are vaccinated at birth with BCG (an attenuated M. bovis), this does not prevent infection or development of TB after childhood. Immune responses necessary for prevention of infection or disease are still unknown, making development of effective vaccines against TB challenging. Several new vaccines are ready for human clinical trials, but these trials are difficult and expensive; especially challenging is determining the appropriate cellular response necessary for protection. The magnitude of an immune response is likely key to generating a successful vaccine. Characteristics such as numbers of central memory (CM) and effector memory (EM) T cells responsive to a diverse set of epitopes are also correlated with protection. Promising vaccines against TB contain mycobacterial subunit antigens (Ag) present during both active and latent infection. We hypothesize that protection against different key immunodominant antigens could require a vaccine that produces different levels of EM and CM for each Ag-specific memory population. We created a computational model to explore EM and CM values, and their ratio, within what we term Memory Design Space. Our model captures events involved in T cell priming within lymph nodes and tracks their circulation through blood to peripheral tissues. We used the model to test whether multiple Ag-specific memory cell populations could be generated with distinct locations within Memory Design Space at a specific time point post vaccination. Boosting can further shift memory populations to memory cell ratios unreachable by initial priming events. By strategically varying antigen load, properties of cellular interactions within the LN, and delivery parameters (e.g., number of boosts) of multi-subunit vaccines, we can generate multiple Ag-specific memory populations that cover a wide range of

  17. Altered distribution of peripheral blood memory B cells in humans chronically infected with Trypanosoma cruzi.

    Science.gov (United States)

    Fernández, Esteban R; Olivera, Gabriela C; Quebrada Palacio, Luz P; González, Mariela N; Hernandez-Vasquez, Yolanda; Sirena, Natalia María; Morán, María L; Ledesma Patiño, Oscar S; Postan, Miriam

    2014-01-01

    Numerous abnormalities of the peripheral blood T cell compartment have been reported in human chronic Trypanosoma cruzi infection and related to prolonged antigenic stimulation by persisting parasites. Herein, we measured circulating lymphocytes of various phenotypes based on the differential expression of CD19, CD4, CD27, CD10, IgD, IgM, IgG and CD138 in a total of 48 T. cruzi-infected individuals and 24 healthy controls. Infected individuals had decreased frequencies of CD19+CD27+ cells, which positively correlated with the frequencies of CD4+CD27+ cells. The contraction of CD19+CD27+ cells was comprised of IgG+IgD-, IgM+IgD- and isotype switched IgM-IgD- memory B cells, CD19+CD10+CD27+ B cell precursors and terminally differentiated CD19+CD27+CD138+ plasma cells. Conversely, infected individuals had increased proportions of CD19+IgG+CD27-IgD- memory and CD19+IgM+CD27-IgD+ transitional/naïve B cells. These observations prompted us to assess soluble CD27, a molecule generated by the cleavage of membrane-bound CD27 and used to monitor systemic immune activation. Elevated levels of serum soluble CD27 were observed in infected individuals with Chagas cardiomyopathy, indicating its potentiality as an immunological marker for disease progression in endemic areas. In conclusion, our results demonstrate that chronic T. cruzi infection alters the distribution of various peripheral blood B cell subsets, probably related to the CD4+ T cell deregulation process provoked by the parasite in humans.

  18. Altered distribution of peripheral blood memory B cells in humans chronically infected with Trypanosoma cruzi.

    Directory of Open Access Journals (Sweden)

    Esteban R Fernández

    Full Text Available Numerous abnormalities of the peripheral blood T cell compartment have been reported in human chronic Trypanosoma cruzi infection and related to prolonged antigenic stimulation by persisting parasites. Herein, we measured circulating lymphocytes of various phenotypes based on the differential expression of CD19, CD4, CD27, CD10, IgD, IgM, IgG and CD138 in a total of 48 T. cruzi-infected individuals and 24 healthy controls. Infected individuals had decreased frequencies of CD19+CD27+ cells, which positively correlated with the frequencies of CD4+CD27+ cells. The contraction of CD19+CD27+ cells was comprised of IgG+IgD-, IgM+IgD- and isotype switched IgM-IgD- memory B cells, CD19+CD10+CD27+ B cell precursors and terminally differentiated CD19+CD27+CD138+ plasma cells. Conversely, infected individuals had increased proportions of CD19+IgG+CD27-IgD- memory and CD19+IgM+CD27-IgD+ transitional/naïve B cells. These observations prompted us to assess soluble CD27, a molecule generated by the cleavage of membrane-bound CD27 and used to monitor systemic immune activation. Elevated levels of serum soluble CD27 were observed in infected individuals with Chagas cardiomyopathy, indicating its potentiality as an immunological marker for disease progression in endemic areas. In conclusion, our results demonstrate that chronic T. cruzi infection alters the distribution of various peripheral blood B cell subsets, probably related to the CD4+ T cell deregulation process provoked by the parasite in humans.

  19. Distributed Computing on Gadgetron: A new paradigm for MRI reconstruction

    DEFF Research Database (Denmark)

    Xue, Hui; Kelmann, Peter; Inati, Souheil

    cloud computing. With this extension (named GT-Plus), any number of Gadgetron processes can run cooperatively across multiple computers. GT-Plus framework was deployed on Amazon EC2 cloud and NIH’s Biowulf system. We demonstrate that with the GT-Plus cloud, a multi-slice free-breathing myocardial cine...

  1. Integration of distributed computing into the drug discovery process.

    Science.gov (United States)

    von Korff, Modest; Rufener, Christian; Stritt, Manuel; Freyss, Joel; Bär, Roman; Sander, Thomas

    2011-02-01

    Grid computing offers an opportunity to gain massive computing power at low costs. We give a short introduction into the drug discovery process and exemplify the use of grid computing for image processing, docking and 3D pharmacophore descriptor calculations. The principle of a grid and its architecture are briefly explained. More emphasis is laid on the issues related to a company-wide grid installation and embedding the grid into the research process. The future of grid computing in drug discovery is discussed in the expert opinion section. Most needed, besides reliable algorithms to predict compound properties, is embedding the grid seamlessly into the discovery process. User friendly access to powerful algorithms without any restrictions, that is, by a limited number of licenses, has to be the goal of grid computing in drug discovery.

  2. ATLAS distributed computing operations in the GridKa cloud

    Energy Technology Data Exchange (ETDEWEB)

    Duckeck, Guenter; Serfon, Cedric; Walker, Rodney [Ludwig-Maximilians-Universitaet, Garching (Germany); Harenberg, Torsten; Kalinin, Sergey; Schultes, Joachim [Bergische Universitaet, Wuppertal (Germany); Kawamura, Gen [Johannes-Gutenberg-Universitaet, Mainz (Germany); Leffhalm, Kai [DESY, Zeuthen (Germany); Meyer, Joerg [Georg-August-Universitaet, Goettingen (Germany); Petzold, Andreas [Karlsruher Institut fuer Technologie (Germany); Sundermann, Jan Erik [Albert-Ludwigs-Universitaet, Freiburg (Germany)

    2011-07-01

    The ATLAS Grid Computing resources in Germany, Poland, the Czech Republic, Austria, and Switzerland consist of a cloud of 12 Tier-2 computing centers grouped around the Tier-1 center GridKa at the Steinbuch Centre for Computing at KIT. While the Tier-1 center serves as a hub for data management in the cloud and is the principal resource for reprocessing and custodial storage of raw ATLAS data, the Tier-2 centers provide the resources for user analysis and production of simulated events. During the first full year of data taking at the LHC, the GridKa cloud has successfully contributed to the overall ATLAS computing effort, enabling physicists to quickly analyze the large volume of new incoming data and the corresponding simulated events. This talk covers the computing operations in the GridKa cloud with focus on performance and experiences at both the Tier-1 and Tier-2 centers.

  3. Functional topography of a distributed neural system for spatial and nonspatial information maintenance in working memory.

    Science.gov (United States)

    Sala, Joseph B; Rämä, Pia; Courtney, Susan M

    2003-01-01

    We investigated the degree to which the distributed and overlapping patterns of activity for working memory (WM) maintenance of objects and spatial locations are functionally dissociable. Previous studies of the neural system responsible for maintenance of different types of information in WM have reported seemingly contradictory results concerning the degree to which spatial and nonspatial information maintenance leads to distinct patterns of activation in prefrontal cortex. These inconsistent results may be partly attributable to the fact that different types of objects were used for the "object WM task" across studies. In the current study, we directly compared the patterns of response during WM tasks for face identity, house identity, and spatial location using functional magnetic resonance imaging (fMRI). Furthermore, independence of the neural resources available for spatial and object WM was tested behaviorally using a dual-task paradigm. Together, these results suggest that the mechanisms for the maintenance of house identity information are distributed and overlapping with those that maintain spatial location information, while the mechanisms for maintenance of face identity information are relatively more independent. There is, however, a consistent functional topography that results in superior prefrontal cortex producing the greatest response during spatial WM tasks, and middle and inferior prefrontal cortices producing their greatest responses during object WM tasks, independent of the object type. These results argue for a dorsal-ventral functional organization for spatial and nonspatial information. However, objects may contain both spatial and nonspatial information and, thus, have a distributed but not equipotent representation across both dorsal and ventral prefrontal cortex.

  4. Empirical evidence and stability analysis of the linear car-following model with gamma-distributed memory effect

    Science.gov (United States)

    Pei, Xin; Pan, Yan; Wang, Haixin; Wong, S. C.; Choi, Keechoo

    2016-05-01

    Car-following models, which describe the reactions of the driver of a following car to the changes of the leading car, are essential for the development of traffic flow theory. A car-following model with a stochastic memory effect is considered to be more realistic in modeling drivers' behavior. Because a gamma-distributed memory function has been shown to outperform other forms according to empirical data, in this study, we thus focus on a car-following model with a gamma-distributed memory effect; analytical and numerical studies are then conducted for stability analysis. Accordingly, the general expression of undamped and stability points is achieved by analytical study. The numerical results show great agreement with the analytical results: introducing the effect of the driver's memory causes the stable regions to weaken slightly, but the metastable region is obviously enlarged. In addition, a numerical study is performed to further analyze the variation of the stable and unstable regions with respect to the different profiles of gamma distribution.
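
    For concreteness, a linear car-following model with a distributed (gamma-weighted) memory of the speed difference can be written as below; this is a representative formulation consistent with the abstract, and the notation is not necessarily that of the paper.

```latex
\frac{\mathrm{d}v_{n}(t)}{\mathrm{d}t}
  = \lambda \int_{0}^{\infty} m(s)\,\bigl[v_{n-1}(t-s) - v_{n}(t-s)\bigr]\,\mathrm{d}s,
\qquad
m(s) = \frac{s^{k-1}e^{-s/\theta}}{\Gamma(k)\,\theta^{k}},
```

    where $v_{n-1}$ and $v_{n}$ are the speeds of the leading and following vehicles, $\lambda$ is the driver sensitivity, and $m(s)$ is the gamma-distributed memory kernel with shape $k$ and scale $\theta$ (normalized so that $\int_0^\infty m(s)\,\mathrm{d}s = 1$); the classical fixed-delay model is recovered when $m(s)\to\delta(s-\tau)$.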

  5. Some well-posedness and general stability results in Timoshenko systems with infinite memory and distributed time delay

    Science.gov (United States)

    Guesmia, Aissa

    2014-08-01

    In this paper, we consider a Timoshenko system in one-dimensional bounded domain with infinite memory and distributed time delay both acting on the equation of the rotation angle. Without any restriction on the speeds of wave propagation and under appropriate assumptions on the infinite memory and distributed time delay convolution kernels, we prove, first, the well-posedness and, second, the stability of the system, where we present some decay estimates depending on the equal-speed propagation case and the opposite one. The obtained decay rates depend on the growths of the memory and delay kernels at infinity. In the nonequal-speed case, the decay rate depends also on the regularity of initial data. Our stability results show that the only dissipation resulting from the infinite memory guarantees the asymptotic stability of the system regardless to the speeds of wave propagation and in spite of the presence of a distributed time delay. Applications of our approach to specific coupled Timoshenko-heat and Timoshenko-wave systems as well as the discrete time delay case are also presented.
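
    A representative form of the system studied (a Timoshenko beam with viscoelastic infinite memory and a distributed time delay, both acting on the rotation-angle equation) is sketched below; the exact notation and assumptions of the paper may differ.

```latex
\begin{aligned}
\rho_1\,\varphi_{tt} - k\,(\varphi_x + \psi)_x &= 0,\\
\rho_2\,\psi_{tt} - b\,\psi_{xx} + k\,(\varphi_x + \psi)
  + \int_0^{\infty} g(s)\,\psi_{xx}(x,\,t-s)\,\mathrm{d}s
  + \int_{\tau_1}^{\tau_2} \mu(s)\,\psi_t(x,\,t-s)\,\mathrm{d}s &= 0,
\end{aligned}
\qquad (x,t)\in(0,L)\times(0,\infty),
```

    where $\varphi$ is the transverse displacement, $\psi$ the rotation angle, $g$ the infinite-memory kernel, $\mu$ the distributed-delay kernel supported on $[\tau_1,\tau_2]$, and the equal-speed case corresponds to $k/\rho_1 = b/\rho_2$.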

  6. Cluster-Enabled OpenMP: An OpenMP Compiler for the SCASH Software Distributed Shared Memory System

    Directory of Open Access Journals (Sweden)

    Mitsuhisa Sato

    2001-01-01

    Full Text Available OpenMP is attracting wide-spread interest because of its easy-to-use parallel programming model for shared memory multiprocessors. We have implemented a "cluster-enabled" OpenMP compiler for a page-based software distributed shared memory system, SCASH, which works on a cluster of PCs. It allows OpenMP programs to run transparently in a distributed memory environment. The compiler transforms OpenMP programs into parallel programs using SCASH so that shared global variables are allocated at run time in the shared address space of SCASH. A set of directives is added to specify data mapping and loop scheduling method which schedules iterations onto threads associated with the data mapping. Our experimental results show that the data mapping may greatly impact on the performance of OpenMP programs in the software distributed shared memory system. The performance of some NAS parallel benchmark programs in OpenMP is improved by using our extended directives.

  7. Determination of strain fields in porous shape memory alloys using micro-computed tomography

    Science.gov (United States)

    Bormann, Therese; Friess, Sebastian; de Wild, Michael; Schumacher, Ralf; Schulz, Georg; Müller, Bert

    2010-09-01

    Shape memory alloys (SMAs) belong to 'intelligent' materials since the metal alloy can change its macroscopic shape as the result of the temperature-induced, reversible martensite-austenite phase transition. SMAs are often applied for medical applications such as stents, hinge-less instruments, artificial muscles, and dental braces. Rapid prototyping techniques, including selective laser melting (SLM), allow fabricating complex porous SMA microstructures. In the present study, the macroscopic shape changes of the SMA test structures fabricated by SLM have been investigated by means of micro computed tomography (μCT). For this purpose, the SMA structures are placed into the heating stage of the μCT system SkyScan 1172™ (SkyScan, Kontich, Belgium) to acquire three-dimensional datasets above and below the transition temperature, i.e. at room temperature and at about 80°C, respectively. The two datasets were registered on the basis of an affine registration algorithm with nine independent parameters - three for the translation, three for the rotation and three for the scaling in orthogonal directions. Essentially, the scaling parameters characterize the macroscopic deformation of the SMA structure of interest. Furthermore, applying the non-rigid registration algorithm, the three-dimensional strain field of the SMA structure on the micrometer scale comes to light. The strain fields obtained will serve for the optimization of the SLM-process and, more important, of the design of the complex shaped SMA structures for tissue engineering and medical implants.
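
    The nine-parameter affine registration described above can be parameterized as follows (a standard choice, not necessarily the paper's exact parameterization), with the three scaling factors carrying the macroscopic deformation of the SMA structure:

```latex
T(\mathbf{x}) = R_z(\gamma)\,R_y(\beta)\,R_x(\alpha)
\begin{pmatrix} s_x & 0 & 0\\ 0 & s_y & 0\\ 0 & 0 & s_z \end{pmatrix}\mathbf{x}
+ \begin{pmatrix} t_x\\ t_y\\ t_z \end{pmatrix},
```

    i.e., three rotation angles $(\alpha,\beta,\gamma)$, three anisotropic scalings $(s_x,s_y,s_z)$ and three translations $(t_x,t_y,t_z)$.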

  8. Efficient Data-parallel Computations on Distributed Systems

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    Task scheduling largely determines the performance of NOW (network of workstations) computing. However, the computer system architecture, computing capability and system load are rarely considered together. In this paper, a biggest-heterogeneous scheduling algorithm is presented. It fully takes into account the system characteristics (from the application's point of view), structure and state, so it can always utilize all processing resources under reasonable assumptions. Experimental results show that the algorithm can significantly shorten the response time of jobs.

  9. Survey on In-Memory Computing Technology

    Institute of Scientific and Technical Information of China (English)

    罗乐; 刘轶; 钱德沛

    2016-01-01

    In the era of big data, how to process massive data efficiently to meet performance requirements is an important problem to be solved. In-memory computing makes full use of large-capacity memory for data processing, reducing or even avoiding I/O operations, and thus greatly improves the performance of massive data processing, while still facing a series of open problems. This paper first analyzes the characteristics of in-memory computing technology and classifies it, introducing the principles, state of research and hot topics of each category of techniques and systems; it then analyzes typical applications of in-memory computing; finally, it discusses the challenges facing in-memory computing at both the overall and the application level, and gives an outlook on its prospects.

  10. Stochastic fluctuations and distributed control of gene expression impact cellular memory.

    Directory of Open Access Journals (Sweden)

    Guillaume Corre

    Full Text Available Despite the stochastic noise that characterizes all cellular processes, cells are able to maintain and transmit a stable level of gene expression to their daughter cells. In order to better understand this phenomenon, we investigated the temporal dynamics of gene expression variation using a double reporter gene model. We compared cell clones with transgenes coding for highly stable mRNA and fluorescent proteins with clones expressing destabilized mRNAs and proteins. Both types of clones displayed strong heterogeneity of reporter gene expression levels. However, cells expressing stable gene products produced daughter cells with similar levels of reporter proteins, while in cell clones with short mRNA and protein half-lives the epigenetic memory of the gene expression level was completely suppressed. Computer simulations also confirmed the role of mRNA and protein stability in the conservation of constant gene expression levels over several cell generations. These data indicate that the conservation of a stable phenotype in a cellular lineage may largely depend on the slow turnover of mRNAs and proteins.
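
    As an illustration of the mechanism discussed above (not the authors' model), the standard two-stage gene expression model can be simulated with a Gillespie algorithm; longer mRNA and protein half-lives (smaller degradation rates) make the protein level relax slowly, which is the sense in which product stability carries a memory of the expression level across time.

```python
import random

def gillespie_two_stage(k_m, d_m, k_p, d_p, t_end, seed=0):
    """Gillespie SSA for DNA -> mRNA -> protein with first-order degradation.

    k_m: transcription rate, d_m: mRNA decay rate,
    k_p: translation rate per mRNA, d_p: protein decay rate.
    Returns a list of (time, protein_count) samples.
    """
    random.seed(seed)
    t, m, p = 0.0, 0, 0
    trace = [(t, p)]
    while t < t_end:
        rates = [k_m, d_m * m, k_p * m, d_p * p]
        total = sum(rates)                      # always > 0 because k_m > 0
        t += random.expovariate(total)          # time to the next reaction
        r = random.uniform(0.0, total)          # pick which reaction fires
        if r < rates[0]:
            m += 1                              # transcription
        elif r < rates[0] + rates[1]:
            m -= 1                              # mRNA decay
        elif r < rates[0] + rates[1] + rates[2]:
            p += 1                              # translation
        else:
            p -= 1                              # protein decay
        trace.append((t, p))
    return trace

# Same mean protein level (~100) but very different half-lives:
stable   = gillespie_two_stage(k_m=1.0,  d_m=0.2, k_p=2.0,  d_p=0.1, t_end=200.0)
unstable = gillespie_two_stage(k_m=10.0, d_m=2.0, k_p=20.0, d_p=1.0, t_end=200.0)
```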

  11. Building Trust and Confidentiality in Cloud computing Distributed ...

    African Journals Online (AJOL)

    2013-03-01

    Mar 1, 2013 ... Department of Computer Science, University of Port Harcourt, Rivers State ... supporting encryption, maintaining integrity of data in the cloud will be a success. Keyword: ... can create, communicate and collaborate faster ...

  12. Mobile device-to-device distributed computing using data sets

    OpenAIRE

    Remédios, Diogo; Teófilo, António; Paulino, Hervé; Lourenço, João

    2015-01-01

    The rapidly increasing computing power, available storage and communication capabilities of mobile devices make it possible to start processing and storing data locally, rather than offloading it to remote servers, allowing scenarios of mobile clouds without infrastructure dependency. We can now aim at connecting neighboring mobile devices, creating a local mobile cloud that provides storage and computing services on locally generated data. In this paper, we describe an early overview of a dis...

  13. Rapid distributed fronto-parieto-occipital processing stages during working memory in humans.

    Science.gov (United States)

    Halgren, E; Boujon, C; Clarke, J; Wang, C; Chauvel, P

    2002-07-01

    Cortical potentials were recorded from implanted electrodes during a difficult working memory task requiring rapid storage, modification and retrieval of multiple memoranda. Synchronous event-related potentials were generated in distributed occipital, parietal, Rolandic and prefrontal sites beginning approximately 130 ms after stimulus onset and continuing for >500 ms. Coherent phase-locked, event-related oscillations supported interaction between these dorsal stream structures throughout the task period. The Rolandic structures generated early as well as sustained potentials to sensory stimuli in the absence of movement. Activation peaks and phase lags between synaptic populations suggested that perceptual processing occurred exclusively in the visual association cortex from approximately 90 to 130 ms, with its results projected to fronto-parietal areas for interpretation from approximately 130 to 280 ms. The direction of interaction then appeared to reverse from approximately 300 to 400 ms, consistent with mental arithmetic being performed by fronto-parietal areas operating upon a visual scratch pad in the dorsolateral occipital cortex. A second reversal, from approximately 420 to 600 ms, may have represented an updating of memoranda stored in fronto-parietal sites. Lateralized perisylvian oscillations suggested an articulatory loop. Anterior cingulate activity was evoked by feedback signals indicating errors. These results indicate how a fronto-centro-parietal 'central executive' might interact with an occipital visual scratch pad, perisylvian articulatory loop and limbic monitor to implement the sequential stages of a complex mental operation.

  14. Extracting the temperature distribution on a phase-change memory cell during crystallization

    Science.gov (United States)

    Bakan, Gokhan; Gerislioglu, Burak; Dirisaglik, Faruk; Jurado, Zoila; Sullivan, Lindsay; Dana, Aykutlu; Lam, Chung; Gokirmak, Ali; Silva, Helena

    2016-10-01

    Phase-change memory (PCM) devices are enabled by amorphization- and crystallization-induced changes in the devices' electrical resistances. Amorphization is achieved by melting and quenching the active volume using short duration electrical pulses (˜ns). The crystallization (set) pulse duration, however, is much longer and depends on the cell temperature reached during the pulse. Hence, the temperature-dependent crystallization process of the phase-change materials at the device level has to be well characterized to achieve fast PCM operations. A main challenge is determining the cell temperature during crystallization. Here, we report extraction of the temperature distribution on a lateral PCM cell during a set pulse using measured voltage-current characteristics and thermal modelling. The effect of the thermal properties of materials on the extracted cell temperature is also studied, and a better cell design is proposed for more accurate temperature extraction. The demonstrated study provides promising results for characterization of the temperature-dependent crystallization process within a cell.

  15. Computational complexity and memory usage for multi-frontal direct solvers used in p finite element analysis

    KAUST Repository

    Calo, Victor M.

    2011-05-14

    The multi-frontal direct solver is the state of the art for the direct solution of linear systems. This paper provides computational complexity and memory usage estimates for the application of the multi-frontal direct solver algorithm on linear systems resulting from p finite elements. Specifically we provide the estimates for systems resulting from C0 polynomial spaces spanned by B-splines. The structured grid and uniform polynomial order used in isogeometric meshes simplifies the analysis.

  16. Simulating the Immune Response on a Distributed Parallel Computer

    Science.gov (United States)

    Castiglione, F.; Bernaschi, M.; Succi, S.

    The application of ideas and methods of statistical mechanics to problems of biological relevance is one of the most promising frontiers of theoretical and computational mathematical physics.1,2 Among others, the computer simulation of the immune system dynamics stands out as one of the prominent candidates for this type of investigations. In the recent years immunological research has been drawing increasing benefits from the resort to advanced mathematical modeling on modern computers.3,4 Among others, Cellular Automata (CA), i.e., fully discrete dynamical systems evolving according to boolean laws, appear to be extremely well suited to computer simulation of biological systems.5 A prominent example of immunological CA is represented by the Celada-Seiden automaton, that has proven capable of providing several new insights into the dynamics of the immune system response. To date, the Celada-Seiden automaton was not in a position to exploit the impressive advances of computer technology, and notably parallel processing, simply because no parallel version of this automaton had been developed yet. In this paper we fill this gap and describe a parallel version of the Celada-Seiden cellular automaton aimed at simulating the dynamic response of the immune system. Details on the parallel implementation as well as performance data on the IBM SP2 parallel platform are presented and commented on.

  17. Earth observation scientific workflows in a distributed computing environment

    CSIR Research Space (South Africa)

    Van Zyl, TL

    2011-09-01

    Full Text Available Geospatially Enabled Scientific Workflows offer a promising paradigm to facilitate researchers, in the earth observation domain, with many aspects of the scientific process. One such aspect is that of access to distributed earth observation data...

  18. Computer routines for probability distributions, random numbers, and related functions

    Science.gov (United States)

    Kirby, W.H.

    1980-01-01

    Use of previously coded and tested subroutines simplifies and speeds up program development and testing. This report presents routines that can be used to calculate various probability distributions and other functions of importance in statistical hydrology. The routines are designed as general-purpose Fortran subroutines and functions to be called from user-written main programs. The probability distributions provided include the beta, chi-square, gamma, Gaussian (normal), Pearson Type III (tables and approximation), and Weibull. Also provided are the distributions of the Grubbs-Beck outlier test, Kolmogorov's and Smirnov's D, Student's t, noncentral t (approximate), and Snedecor F tests. Other mathematical functions include the Bessel function I (subzero), gamma and log-gamma functions, error functions, and the exponential integral. Auxiliary services include sorting and printer plotting. Random number generators for uniform and normal numbers are provided and may be used with some of the above routines to generate numbers from other distributions. (USGS)
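
    A modern Python/SciPy analogue of such general-purpose routines (not the USGS Fortran code itself) is sketched below for a few of the distributions listed; parameter names follow SciPy's conventions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Random number generators: uniform and normal, as in the report's auxiliary routines.
u = rng.uniform(size=1000)
z = rng.standard_normal(size=1000)

# Probability distributions: CDF / quantile / tail evaluations.
print(stats.gamma(a=2.0, scale=3.0).cdf(5.0))        # gamma CDF
print(stats.weibull_min(c=1.5, scale=2.0).ppf(0.9))  # Weibull quantile
print(stats.pearson3(skew=0.5).ppf(0.99))            # Pearson Type III (often applied to log flows in hydrology)
print(stats.t(df=10).sf(2.0))                        # Student's t upper tail
print(stats.kstwobign.sf(1.36))                      # asymptotic Kolmogorov D statistic tail
```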

  19. Information processing in bacteria: memory, computation, and statistical physics: a key issues review.

    Science.gov (United States)

    Lan, Ganhui; Tu, Yuhai

    2016-05-01

    preserving information, it does not reveal the underlying mechanism that leads to the observed input-output relationship, nor does it tell us much about which information is important for the organism and how biological systems use information to carry out specific functions. To do that, we need to develop models of the biological machineries, e.g. biochemical networks and neural networks, to understand the dynamics of biological information processes. This is a much more difficult task. It requires deep knowledge of the underlying biological network-the main players (nodes) and their interactions (links)-in sufficient detail to build a model with predictive power, as well as quantitative input-output measurements of the system under different perturbations (both genetic variations and different external conditions) to test the model predictions to guide further development of the model. Due to the recent growth of biological knowledge thanks in part to high throughput methods (sequencing, gene expression microarray, etc) and development of quantitative in vivo techniques such as various florescence technology, these requirements are starting to be realized in different biological systems. The possible close interaction between quantitative experimentation and theoretical modeling has made systems biology an attractive field for physicists interested in quantitative biology. In this review, we describe some of the recent work in developing a quantitative predictive model of bacterial chemotaxis, which can be considered as the hydrogen atom of systems biology. Using statistical physics approaches, such as the Ising model and Langevin equation, we study how bacteria, such as E. coli, sense and amplify external signals, how they keep a working memory of the stimuli, and how they use these data to compute the chemical gradient. In particular, we will describe how E. coli cells avoid cross-talk in a heterogeneous receptor cluster to keep a ligand-specific memory. We will also

  20. Information processing in bacteria: memory, computation, and statistical physics: a key issues review

    Science.gov (United States)

    Lan, Ganhui; Tu, Yuhai

    2016-05-01

    preserving information, it does not reveal the underlying mechanism that leads to the observed input-output relationship, nor does it tell us much about which information is important for the organism and how biological systems use information to carry out specific functions. To do that, we need to develop models of the biological machineries, e.g. biochemical networks and neural networks, to understand the dynamics of biological information processes. This is a much more difficult task. It requires deep knowledge of the underlying biological network—the main players (nodes) and their interactions (links)—in sufficient detail to build a model with predictive power, as well as quantitative input-output measurements of the system under different perturbations (both genetic variations and different external conditions) to test the model predictions to guide further development of the model. Due to the recent growth of biological knowledge thanks in part to high throughput methods (sequencing, gene expression microarray, etc) and development of quantitative in vivo techniques such as various florescence technology, these requirements are starting to be realized in different biological systems. The possible close interaction between quantitative experimentation and theoretical modeling has made systems biology an attractive field for physicists interested in quantitative biology. In this review, we describe some of the recent work in developing a quantitative predictive model of bacterial chemotaxis, which can be considered as the hydrogen atom of systems biology. Using statistical physics approaches, such as the Ising model and Langevin equation, we study how bacteria, such as E. coli, sense and amplify external signals, how they keep a working memory of the stimuli, and how they use these data to compute the chemical gradient. In particular, we will describe how E. coli cells avoid cross-talk in a heterogeneous receptor cluster to keep a ligand-specific memory. We will also

  1. Computing a non-Maxwellian velocity distribution from first principles.

    Science.gov (United States)

    Cáceres, Manuel O

    2003-01-01

    We investigate a family of single-particle anomalous velocity distributions by solving a particular class of stochastic Liouville equations. The stationary state is obtained analytically and the Maxwell-Boltzmann distribution is reobtained in a particular limit. We discuss the comparison with other methods of obtaining the stationary state. Extensions to cases in which the models cannot be solved exactly are also pointed out in connection with the one-ficton approximation.

  2. Human learning and memory.

    Science.gov (United States)

    Johnson, M K; Hasher, L

    1987-01-01

    There have been several notable recent trends in the area of learning and memory. Problems with the episodic/semantic distinction have become more apparent, and new efforts have been made (exemplar models, distributed-memory models) to represent general knowledge without assuming a separate semantic system. Less emphasis is being placed on stable, prestored prototypes and more emphasis on a flexible memory system that provides the basis for a multitude of categories or frames of reference, derived on the spot as tasks demand. There is increasing acceptance of the idea that mental models are constructed and stored in memory in addition to, rather than instead of, memorial representations that are more closely tied to perceptions. This gives rise to questions concerning the conditions that permit inferences to be drawn and mental models to be constructed, and to questions concerning the similarities and differences in the nature of the representations in memory of perceived and generated information and in their functions. There has also been a swing from interest in deliberate strategies to interest in automatic, unconscious (even mechanistic!) processes, reflecting an appreciation that certain situations (e.g. recognition, frequency judgements, savings in indirect tasks, aspects of skill acquisition, etc) seem not to depend much on the products of strategic, effortful or reflective processes. There is a lively interest in relations among memory measures and attempts to characterize memory representations and/or processes that could give rise to dissociations among measures. Whether the pattern of results reflects the operation of functional subsystems of memory and, if so, what the "modules" are is far from clear. This issue has been fueled by work with amnesics and has contributed to a revival of interaction between researchers studying learning and memory in humans and those studying learning and memory in animals. Thus, neuroscience rivals computer science as a

  3. An Alternative Method for Computing Mean and Covariance Matrix of Some Multivariate Distributions

    Science.gov (United States)

    Radhakrishnan, R.; Choudhury, Askar

    2009-01-01

    Computing the mean and covariance matrix of some multivariate distributions, in particular, multivariate normal distribution and Wishart distribution are considered in this article. It involves a matrix transformation of the normal random vector into a random vector whose components are independent normal random variables, and then integrating…
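
    The transformation argument alluded to above rests on standard identities, stated here for reference (not quoted from the article):

```latex
X = \mu + A Z,\quad Z \sim \mathcal{N}(0, I_p)
\;\Longrightarrow\;
\mathbb{E}[X] = \mu,\qquad
\operatorname{Cov}(X) = A\,\mathbb{E}[ZZ^{\mathsf T}]\,A^{\mathsf T} = AA^{\mathsf T} = \Sigma,
```

    and, for the Wishart case, if the rows of the $n\times p$ matrix $Y$ are i.i.d. $\mathcal{N}(0,\Sigma)$ then $W = Y^{\mathsf T}Y \sim \mathcal{W}_p(n,\Sigma)$ with $\mathbb{E}[W] = n\Sigma$ and $\operatorname{Cov}(W_{ij},W_{kl}) = n(\Sigma_{ik}\Sigma_{jl} + \Sigma_{il}\Sigma_{jk})$.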

  4. A Data Mining Algorithm Based on Distributed Decision-Tree in Grid Computing Environments

    Institute of Scientific and Technical Information of China (English)

    Zhongda Lin; Yanfeng Hong; Kun Deng

    2006-01-01

    Recently, research on distributed data mining that makes use of grid infrastructure has become a trend. This paper introduces a data mining algorithm based on distributed decision trees, which takes advantage of the conveniences and services supplied by the grid computing platform and can perform distributed classification data mining on the grid.

  5. Chemokine receptor co-expression reveals aberrantly distributed TH effector memory cells in GPA patients.

    Science.gov (United States)

    Lintermans, Lucas L; Rutgers, Abraham; Stegeman, Coen A; Heeringa, Peter; Abdulahad, Wayel H

    2017-06-14

    Persistent expansion of circulating CD4(+) effector memory T cells (TEM) in patients with granulomatosis with polyangiitis (GPA) suggests their fundamental role in disease pathogenesis. Recent studies have shown that distinct functional CD4(+) TEM cell subsets can be identified based on expression patterns of chemokine receptors. The current study aimed to determine different CD4(+) TEM cell subsets based on chemokine receptor expression in peripheral blood of GPA patients. Identification of particular circulating CD4(+) TEM cells subsets may reveal distinct contributions of specific CD4(+) TEM subsets to the disease pathogenesis in GPA. Peripheral blood of 63 GPA patients in remission and 42 age- and sex-matched healthy controls was stained immediately after blood withdrawal with fluorochrome-conjugated antibodies for cell surface markers (CD3, CD4, CD45RO) and chemokine receptors (CCR4, CCR6, CCR7, CRTh2, CXCR3) followed by flow cytometry analysis. CD4(+) TEM memory cells (CD3(+)CD4(+)CD45RO(+)CCR7(-)) were gated, and the expression patterns of chemokine receptors CXCR3(+)CCR4(-)CCR6(-)CRTh2(-), CXCR3(-)CCR4(+)CCR6(-)CRTh2(+), CXCR3(-)CCR4(+)CCR6(+)CRTh2(-), and CXCR3(+)CCR4(-)CCR6(+)CRTh2(-) were used to distinguish TEM1, TEM2, TEM17, and TEM17.1 cells, respectively. The percentage of CD4(+) TEM cells was significantly increased in GPA patients in remission compared to HCs. Chemokine receptor co-expression analysis within the CD4(+) TEM cell population demonstrated a significant increase in the proportion of TEM17 cells with a concomitant significant decrease in the TEM1 cells in GPA patients compared to HC. The percentage of TEM17 cells correlated negatively with TEM1 cells in GPA patients. Moreover, the circulating proportion of TEM17 cells showed a positive correlation with the number of organs involved and an association with the tendency to relapse in GPA patients. Interestingly, the aberrant distribution of TEM1 and TEM17 cells is modulated in CMV

  6. Remarks on neurocybernetics and its links to computing science. To the memory of Prof. Luigi M. Ricciardi.

    Science.gov (United States)

    Moreno-Díaz, Roberto; Moreno-Díaz, Arminda

    2013-06-01

    This paper explores the origins and content of neurocybernetics and its links to artificial intelligence, computer science and knowledge engineering. Starting with three remarkable pieces of work, we center attention on a number of events that initiated and developed basic topics that are still a matter of research and inquiry today, from goal-directed activity theories to circular causality and to reverberations and learning. Within this context, we pay tribute to the memory of Prof. Ricciardi, documenting the importance of his contributions to the mathematics of the brain, neural nets and neurophysiological models, computational simulations and techniques.

  7. Efficient Computation of Distance Sketches in Distributed Networks

    CERN Document Server

    Sarma, Atish Das; Pandurangan, Gopal

    2011-01-01

    Distance computation is one of the most fundamental primitives used in communication networks. The cost of effectively and accurately computing pairwise network distances can become prohibitive in large-scale networks such as the Internet and Peer-to-Peer (P2P) networks. To negotiate the rising need for very efficient distance computation, approximation techniques for numerous variants of this question have recently received significant attention in the literature. The goal is to preprocess the graph and store a small amount of information such that whenever a query for any pairwise distance is issued, the distance can be well approximated (i.e., with small stretch) very quickly in an online fashion. Specifically, the pre-processing (usually) involves storing a small sketch with each node, such that at query time only the sketches of the concerned nodes need to be looked up to compute the approximate distance. In this paper, we present the first theoretical study of distance sketches derived from distance ora...
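
    A minimal landmark-based sketch, much simpler than the constructions analyzed in the paper but illustrating the query pattern (store a small per-node sketch, answer a pairwise-distance query from two sketches only), is given below; the graph format and function names are our own.

```python
import heapq
import random

def dijkstra(adj, src):
    """Single-source shortest paths on a weighted graph {node: [(nbr, w), ...]}."""
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

def build_sketches(adj, num_landmarks=4, seed=1):
    """Each node stores its distance to a small random set of landmarks."""
    random.seed(seed)
    landmarks = random.sample(list(adj), min(num_landmarks, len(adj)))
    to_lm = {l: dijkstra(adj, l) for l in landmarks}     # distances from each landmark
    return {v: {l: to_lm[l][v] for l in landmarks if v in to_lm[l]} for v in adj}

def query(sketches, u, v):
    """Upper-bound estimate of d(u, v) from the two sketches only:
    min over common landmarks l of d(u, l) + d(l, v) (triangle inequality)."""
    common = sketches[u].keys() & sketches[v].keys()
    return min((sketches[u][l] + sketches[v][l] for l in common), default=float("inf"))
```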

  8. Distributed computing for autonomous on board planning and sequence validations

    Science.gov (United States)

    Ko, A. Y.; Alkalai, L.; Chau, S.; Cheung, K.; Tong, D.; Maldague, P. F.

    2002-01-01

    We propose a new conceptual approach to system-level autonomy that exploits in a synergistic way recent breakthroughs in three specific areas: automatic generation of embeddable planning and validation software, integration of telecommunications forecaster and planning tools, and fault-tolerant assignment of computing tasks to multiple processors.

  9. Secure Computation, I/O-Efficient Algorithms and Distributed Signatures

    DEFF Research Database (Denmark)

    Damgård, Ivan Bjerre; Kölker, Jonas; Toft, Tomas

    2012-01-01

    adversary corrupting a constant fraction of the players and servers. Using packed secret sharing, the data can be stored in a compact way but will only be accessible in a block-wise fashion. We explore the possibility of using I/O-efficient algorithms to nevertheless compute on the data as efficiently...
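
    The packed secret sharing mentioned above (one polynomial hiding a whole block of secrets) can be sketched as follows; this is a generic textbook-style construction over a prime field, not the protocol of the paper, and the field size and evaluation points are arbitrary choices.

```python
import random

P = 2**61 - 1  # a prime field large enough for the chosen evaluation points

def _lagrange_at(xs, ys, x):
    """Evaluate the interpolating polynomial through (xs, ys) at x (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if i != j:
                num = num * ((x - xj) % P) % P
                den = den * ((xi - xj) % P) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def packed_share(secrets, n, t):
    """Packed Shamir sharing: embed k secrets in one degree-(t+k-1) polynomial.

    Secrets sit at evaluation points -1..-k; shares are evaluations at 1..n;
    t extra uniformly random points make the sharing private against t holders.
    """
    k = len(secrets)
    xs = [(-i) % P for i in range(1, k + 1)] + [n + 1 + i for i in range(t)]
    ys = list(secrets) + [random.randrange(P) for _ in range(t)]
    return [_lagrange_at(xs, ys, x) for x in range(1, n + 1)]

def packed_reconstruct(share_points, shares, k):
    """Recover the k packed secrets from at least t + k shares."""
    return [_lagrange_at(share_points, shares, (-i) % P) for i in range(1, k + 1)]

# Example: pack k = 2 secrets for n = 7 holders with privacy threshold t = 3.
shares = packed_share([11, 22], n=7, t=3)
points = list(range(1, 6))                           # any t + k = 5 share points suffice
print(packed_reconstruct(points, shares[:5], k=2))   # -> [11, 22]
```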

  10. Deployment strategies for distributed applications on cloud computing infrastructures

    NARCIS (Netherlands)

    Veen, J.S. van der; Lazovik, E.; Makkes, M.X.; Meijer, R.J.

    2013-01-01

    Cloud computing enables on-demand access to a shared pool of IT resources. In the case of Infrastructure as a Service (IaaS), the cloud user typically acquires Virtual Machines (VMs) from the provider. It is up to the user to decide at what time and for how long they want to use these VMs. Because o

  11. Distributed storage and cloud computing: a test case

    Science.gov (United States)

    Piano, S.; Della Ricca, G.

    2014-06-01

    Since 2003 the computing farm hosted by the INFN Tier3 facility in Trieste supports the activities of many scientific communities. Hundreds of jobs from 45 different VOs, including those of the LHC experiments, are processed simultaneously. Given that normally the requirements of the different computational communities are not synchronized, the probability that at any given time the resources owned by one of the participants are not fully utilized is quite high. A balanced compensation should in principle allocate the free resources to other users, but there are limits to this mechanism. In fact, the Trieste site may not hold the amount of data needed to attract enough analysis jobs, and even in that case there could be a lack of bandwidth for their access. The Trieste ALICE and CMS computing groups, in collaboration with other Italian groups, aim to overcome the limitations of existing solutions using two approaches: sharing the data among all the participants taking full advantage of GARR-X wide area networks (10 GB/s) and integrating the resources dedicated to batch analysis with the ones reserved for dynamic interactive analysis, through modern solutions as cloud computing.

  12. GridFactory - Distributed computing on ephemeral resources

    DEFF Research Database (Denmark)

    Orellana, Frederik; Niinimaki, Marko

    2011-01-01

    A novel batch system for high throughput computing is presented. The system is specifically designed to leverage virtualization and web technology to facilitate deployment on cloud and other ephemeral resources. In particular, it implements a security model suited for forming collaborations...

  13. Polytopol computing for multi-core and distributed systems

    Science.gov (United States)

    Spaanenburg, Henk; Spaanenburg, Lambert; Ranefors, Johan

    2009-05-01

    Multi-core computing provides new challenges to software engineering. The paper addresses such issues in the general setting of polytopol computing, that takes multi-core problems in such widely differing areas as ambient intelligence sensor networks and cloud computing into account. It argues that the essence lies in a suitable allocation of free moving tasks. Where hardware is ubiquitous and pervasive, the network is virtualized into a connection of software snippets judiciously injected to such hardware that a system function looks as one again. The concept of polytopol computing provides a further formalization in terms of the partitioning of labor between collector and sensor nodes. Collectors provide functions such as a knowledge integrator, awareness collector, situation displayer/reporter, communicator of clues and an inquiry-interface provider. Sensors provide functions such as anomaly detection (only communicating singularities, not continuous observation), they are generally powered or self-powered, amorphous (not on a grid) with generation-and-attrition, field re-programmable, and sensor plug-and-play-able. Together the collector and the sensor are part of the skeleton injector mechanism, added to every node, and give the network the ability to organize itself into some of many topologies. Finally we will discuss a number of applications and indicate how a multi-core architecture supports the security aspects of the skeleton injector.

  14. Deployment Strategies for Distributed Applications on Cloud Computing Infrastructures

    NARCIS (Netherlands)

    Veen, J.S. van der; Elena Lazovik, E.; Makkes, M.X.; Meijer, R.J.

    2013-01-01

    Cloud computing enables on-demand access to a shared pool of IT resources. In the case of Infrastructure as a Service (IaaS), the cloud user typically acquires Virtual Machines (VMs) from the provider. It is up to the user to decide at what time and for how long they want to use these VMs. Because o

  15. Deployment strategies for distributed applications on cloud computing infrastructures

    NARCIS (Netherlands)

    Veen, J.S. van der; Lazovik, E.; Makkes, M.X.; Meijer, R.J.

    2013-01-01

    Cloud computing enables on-demand access to a shared pool of IT resources. In the case of Infrastructure as a Service (IaaS), the cloud user typically acquires Virtual Machines (VMs) from the provider. It is up to the user to decide at what time and for how long they want to use these VMs. Because

  16. Fractional Stefan problems exhibiting lumped and distributed latent-heat memory effects

    Science.gov (United States)

    Voller, Vaughan R.; Falcini, Federico; Garra, Roberto

    2013-04-01

    We consider fractional Stefan melting problems which involve a memory of the latent-heat accumulation. We show that the manner in which the memory of the latent-heat accumulation is recorded depends on the assumed nature of the transition between the liquid and the solid phases. When a sharp interface between the liquid and the solid phases is assumed, the memory of the accumulation of the latent heat is “lumped” in the history of the speed of the interface. In contrast, when a diffuse interface is assumed, the memory of the accumulation is “distributed” throughout the liquid phase. By use of an example problem, we demonstrate that the equivalence of the sharp- and diffuse-interface models can only occur when there is no memory in the system.
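
    One schematic, non-dimensional way to write a sharp-interface ("lumped-memory") fractional Stefan condition of the kind discussed above is the following; it is meant only to indicate where the history of the front speed enters the latent-heat balance, not to reproduce the paper's exact formulation.

```latex
\frac{\partial T}{\partial t} = \frac{\partial^2 T}{\partial x^2}, \quad 0<x<s(t),
\qquad
-\left.\frac{\partial T}{\partial x}\right|_{x=s(t)}
  = {}^{C}\!D_t^{\alpha} s(t)
  = \frac{1}{\Gamma(1-\alpha)}\int_0^t \frac{s'(\tau)}{(t-\tau)^{\alpha}}\,\mathrm{d}\tau,
\qquad 0<\alpha\le 1,
```

    so the latent-heat balance at the front depends on the whole history of the interface speed, and the classical Stefan condition is recovered as $\alpha \to 1$.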

  17. ATLAS Distributed Computing Monitoring tools after full 2 years of LHC data taking

    Science.gov (United States)

    Schovancová, Jaroslava

    2012-12-01

    This paper details a variety of Monitoring tools used within ATLAS Distributed Computing during the first 2 years of LHC data taking. We discuss tools used to monitor data processing from the very first steps performed at the CERN Analysis Facility after data is read out of the ATLAS detector, through data transfers to the ATLAS computing centres distributed worldwide. We present an overview of monitoring tools used daily to track ATLAS Distributed Computing activities ranging from network performance and data transfer throughput, through data processing and readiness of the computing services at the ATLAS computing centres, to the reliability and usability of the ATLAS computing centres. The described tools provide monitoring for issues of varying levels of criticality: from identifying issues with the instant online monitoring to long-term accounting information.

  18. ATLAS Distributed Computing Monitoring tools after full 2 years of LHC data taking

    CERN Document Server

    Schovancová, J; The ATLAS collaboration

    2012-01-01

    This paper details a variety of monitoring tools used within ATLAS Distributed Computing during the first 2 years of LHC data taking. We discuss tools used to monitor data processing from the very first steps performed at the Tier-0 facility at CERN after data is read out of the ATLAS detector, through data transfers to the ATLAS computing centers distributed worldwide. We present an overview of monitoring tools used daily to track ATLAS Distributed Computing activities ranging from network performance and data transfer throughput, through data processing and readiness of the computing services at the ATLAS computing centers, to the reliability and usability of the ATLAS computing centers. The described tools provide monitoring for issues of varying levels of criticality, from spotting problems through instant online monitoring to long-term accounting information.

  19. An Algorithm for Optimized Time, Cost, and Reliability in a Distributed Computing System

    Directory of Open Access Journals (Sweden)

    Pankaj Saxena

    2013-03-01

    Full Text Available A Distributed Computing System (DCS) refers to multiple computer systems working on a single problem. A distributed system consists of a collection of autonomous computers connected through a network, which enables the computers to coordinate their activities and to share the resources of the system. In distributed computing, a single problem is divided into many parts, and each part is solved by a different computer. As long as the computers are networked, they can communicate with each other to solve the problem. A DCS consists of multiple software components that reside on multiple computers but run as a single system. The computers in a distributed system can be physically close together and connected by a local network, or they can be geographically distant and connected by a wide area network. The ultimate goal of distributed computing is to maximize performance in a time-effective, cost-effective, and reliability-effective manner. In a DCS the whole workload is divided into small, independent units, called tasks, which are allocated to the available processors. The system also ensures fault tolerance and enables resource accessibility in the event that one of the components fails. We address the problem of assigning tasks to a distributed computing system, where the assignment of task modules is done statically: given a set of communicating tasks to be executed on a set of processors, decide to which processor each task should be assigned so as to obtain more reliable results in less time and at lower cost. In this paper an efficient algorithm for task allocation in terms of optimum time, optimum cost, or optimum reliability is presented for the case where the number of tasks exceeds the number of processors.
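
    The abstract does not give the algorithm itself; as a point of reference, a naive greedy static assignment that trades off time, cost and reliability with user-chosen weights could look like the sketch below (illustrative only, not the paper's algorithm; all names and parameters are our own).

```python
import math

def greedy_assign(exec_time, cost, reliability, w_time=1.0, w_cost=0.0, w_rel=0.0):
    """Greedy static task assignment for a DCS.

    exec_time[t][p] -- execution time of task t on processor p
    cost[t][p]      -- cost of running task t on processor p
    reliability[p]  -- probability that processor p completes a task
    Each task goes to the processor with the lowest weighted score combining
    finish time, cost and unreliability.
    """
    n_proc = len(reliability)
    load = [0.0] * n_proc                      # accumulated execution time per processor
    assignment = []
    for t, _ in enumerate(exec_time):
        def score(p):
            finish = load[p] + exec_time[t][p]
            return (w_time * finish
                    + w_cost * cost[t][p]
                    - w_rel * math.log(reliability[p]))
        best = min(range(n_proc), key=score)
        load[best] += exec_time[t][best]
        assignment.append(best)
    return assignment, max(load)

# e.g. 4 tasks on 2 processors, optimizing time only:
times = [[3, 5], [2, 4], [6, 3], [4, 4]]
costs = [[1, 2], [1, 2], [2, 1], [1, 1]]
rel = [0.99, 0.95]
print(greedy_assign(times, costs, rel))        # -> ([0, 1, 1, 0], 7.0)
```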

  20. Modeling of long-range memory processes with inverse cubic distributions by the nonlinear stochastic differential equations

    Science.gov (United States)

    Kaulakys, B.; Alaburda, M.; Ruseckas, J.

    2016-05-01

    A well-known fact in the financial markets is the so-called ‘inverse cubic law’ of the cumulative distributions of the long-range memory fluctuations of market indicators such as a number of events of trades, trading volume and the logarithmic price change. We propose the nonlinear stochastic differential equation (SDE) giving both the power-law behavior of the power spectral density and the long-range dependent inverse cubic law of the cumulative distribution. This is achieved using the suggestion that when the market evolves from calm to violent behavior there is a decrease of the delay time of multiplicative feedback of the system in comparison to the driving noise correlation time. This results in a transition from the Itô to the Stratonovich sense of the SDE and yields a long-range memory process.
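
    For reference, the base class of nonlinear SDEs used in this line of work (introduced in earlier papers by the same authors) can be written, in the Itô convention, as below; the cited paper modifies the multiplicative feedback with a delay, which is not shown here.

```latex
\mathrm{d}x = \sigma^{2}\!\left(\eta - \tfrac{\lambda}{2}\right) x^{\,2\eta-1}\,\mathrm{d}t
            + \sigma\, x^{\eta}\,\mathrm{d}W_t ,
```

    which, with suitable restrictions of the diffusion region, has a power-law stationary density $P(x)\propto x^{-\lambda}$ and a $1/f^{\beta}$-type power spectral density; a tail exponent $\lambda = 4$ for the density corresponds to the inverse cubic law for the cumulative distribution, $P(X>x)\propto x^{-3}$.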