WorldWideScience

Sample records for distributed memory multiprocessors

  1. Multigrid solution of diffusion equations on distributed memory multiprocessor systems

    International Nuclear Information System (INIS)

    Finnemann, H.

    1988-01-01

    The subject is the solution of partial differential equations for simulation of the reactor core on high-performance computers. The parallelization and implementation of nodal multigrid diffusion algorithms on array and ring configurations of the DIRMU multiprocessor system is outlined. The particular iteration scheme employed in the nodal expansion method appears similarly efficient in serial and parallel environments. The combination of modern multi-level techniques with innovative hardware (vector-multiprocessor systems) provides powerful tools needed for real-time simulation of physical systems. The parallel efficiencies range from 70 to 90%. The same performance is estimated for large problems on large multiprocessor systems being designed at present. (orig.)

  2. Multiprocessor shared-memory information exchange

    International Nuclear Information System (INIS)

    Santoline, L.L.; Bowers, M.D.; Crew, A.W.; Roslund, C.J.; Ghrist, W.D. III

    1989-01-01

    In distributed microprocessor-based instrumentation and control systems, the inter- and intra-subsystem communication requirements ultimately form the basis for the overall system architecture. This paper describes a software protocol which addresses the intra-subsystem communications problem. Specifically, the protocol allows multiple processors to exchange information via a shared-memory interface. The authors' primary goal is to provide a reliable means for information to be exchanged between central application processor boards (masters) and dedicated function processor boards (slaves) in a single computer chassis. The resultant Multiprocessor Shared-Memory Information Exchange (MSMIE) protocol, a standard master-slave shared-memory interface suitable for use in nuclear safety systems, is designed to pass unidirectional buffers of information between the processors while providing a minimum, deterministic cycle time for this data exchange.

  3. Distributed parallel messaging for multiprocessor systems

    Science.gov (United States)

    Chen, Dong; Heidelberger, Philip; Salapura, Valentina; Senger, Robert M; Steinmacher-Burrow, Burhard; Sugawara, Yutaka

    2013-06-04

    A method and apparatus for distributed parallel messaging in a parallel computing system. The apparatus includes, at each node of a multiprocessor network, multiple injection messaging engine units and reception messaging engine units, each implementing a DMA engine and each supporting both multiple packet injection into and multiple reception from a network, in parallel. The reception side of the messaging unit (MU) includes a switch interface enabling writing of data of a packet received from the network to the memory system. The transmission side of the messaging unit includes a switch interface for reading from the memory system when injecting packets into the network.

  4. GOTHIC memory management : a multiprocessor shared single level store

    OpenAIRE

    Michel, Béatrice

    1990-01-01

    The purpose of Gothic is to build an object-oriented fault-tolerant distributed operating system for a local area network of multiprocessor workstations. This paper describes the Gothic memory manager. It realizes the sharing of the secondary memory space between any processes running on the Gothic system. Processes on different processors can communicate by sharing permanent information. The manager implements a shared single level storage with an invalidation protocol working on disk-pages to maintain s...

  5. Solution of the Euler and Navier-Stokes equations on MIMD distributed memory multiprocessors using cyclic reduction

    International Nuclear Information System (INIS)

    Curchitser, E.N.; Pelz, R.B.; Marconi, F.

    1992-01-01

    The Euler and Navier-Stokes equations are solved for the steady, two-dimensional flow over a NACA 0012 airfoil using a 1024 node nCUBE/2 multiprocessor. Second-order, upwind-discretized difference equations are solved implicitly using ADI factorization. Parallel cyclic reduction is employed to solve the block tridiagonal systems. For realistic problems, communication times are negligible compared to calculation times. The processors are tightly synchronized, and their loads are well balanced. When the flux Jacobians are frozen, the wall-clock time for one implicit timestep is about equal to that of a multistage explicit scheme. 10 refs
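
    To illustrate the parallel cyclic reduction step named in this abstract, here is a minimal Python sketch for a scalar tridiagonal system. It is not the authors' code: the paper solves block tridiagonal systems on an nCUBE/2, whereas this sketch uses scalar coefficients and a serial loop, and the function name `solve_tridiagonal_pcr` is an assumption made for illustration. It also assumes no zero pivots arise (for example, a diagonally dominant matrix).

```python
def solve_tridiagonal_pcr(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] by parallel cyclic reduction.

    a[0] and c[-1] must be 0. Assumes every pivot b[i] stays non-zero,
    which holds for diagonally dominant systems.
    """
    n = len(b)
    a, b, c, d = list(a), list(b), list(c), list(d)
    stride = 1
    while stride < n:
        na, nb, nc, nd = [0.0] * n, [0.0] * n, [0.0] * n, [0.0] * n
        # Each i is independent of the others within one sweep; this is the
        # loop that a distributed-memory code would split across processors.
        for i in range(n):
            lo, hi = i - stride, i + stride
            nb[i] = b[i]
            nd[i] = d[i]
            if lo >= 0:
                # Eliminate the coupling to x[lo] using equation lo.
                alpha = -a[i] / b[lo]
                na[i] = alpha * a[lo]
                nb[i] += alpha * c[lo]
                nd[i] += alpha * d[lo]
            if hi < n:
                # Eliminate the coupling to x[hi] using equation hi.
                gamma = -c[i] / b[hi]
                nc[i] = gamma * c[hi]
                nb[i] += gamma * a[hi]
                nd[i] += gamma * d[hi]
        a, b, c, d = na, nb, nc, nd
        stride *= 2
    # All off-diagonal couplings are now zero, so each unknown is decoupled.
    return [d[i] / b[i] for i in range(n)]
```

    Each sweep doubles the distance between coupled unknowns, so the system decouples after about log2(n) sweeps; in a parallel setting each sweep needs data only from the two equations a distance `stride` away.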

  6. Parallel implementation and evaluation of motion estimation system algorithms on a distributed memory multiprocessor using knowledge based mappings

    Science.gov (United States)

    Choudhary, Alok Nidhi; Leung, Mun K.; Huang, Thomas S.; Patel, Janak H.

    1989-01-01

    Several techniques to perform static and dynamic load balancing for vision systems are presented. These techniques are novel in the sense that they capture the computational requirements of a task by examining the data when it is produced. Furthermore, they can be applied to many vision systems because many algorithms in different systems are either the same, or have similar computational characteristics. These techniques are evaluated by applying them on a parallel implementation of the algorithms in a motion estimation system on a hypercube multiprocessor system. The motion estimation system consists of the following steps: (1) extraction of features; (2) stereo match of images in one time instant; (3) time match of images from different time instants; (4) stereo match to compute final unambiguous points; and (5) computation of motion parameters. It is shown that the performance gains when these data decomposition and load balancing techniques are used are significant and the overhead of using these techniques is minimal.

  7. Communication and Memory Architecture Design of Application-Specific High-End Multiprocessors

    Directory of Open Access Journals (Sweden)

    Yahya Jan

    2012-01-01

    This paper is devoted to the design of communication and memory architectures of massively parallel hardware multiprocessors necessary for the implementation of highly demanding applications. We demonstrated that for massively parallel hardware multiprocessors the traditionally used flat communication architectures and multi-port memories do not scale well, and that the influence of the memory and communication network on both throughput and circuit area dominates that of the processors. To resolve these problems and ensure scalability, we proposed to design highly optimized application-specific hierarchical and/or partitioned communication and memory architectures by exploring and exploiting the regularity and hierarchy of the actual data flows of a given application. Furthermore, we proposed some data distribution and related data mapping schemes in the shared (global) partitioned memories with the aim of eliminating memory access conflicts and ensuring that our communication design strategies are applicable. We incorporated these architecture synthesis strategies into our quality-driven model-based multi-processor design method and related automated architecture exploration framework. Using this framework, we performed a large series of experiments that demonstrate many important features of the synthesized memory and communication architectures. They also demonstrate that our method and related framework are able to efficiently synthesize well scalable memory and communication architectures even for high-end multiprocessors. Gains as high as 12 times in performance and 25 times in area can be obtained when using hierarchical communication networks instead of flat networks. However, for high parallelism levels only the partitioned approach ensures scalability in performance.

  8. A general model for memory interference in a multiprocessor system with memory hierarchy

    Science.gov (United States)

    Taha, Badie A.; Standley, Hilda M.

    1989-01-01

    The problem of memory interference in a multiprocessor system with a hierarchy of shared buses and memories is addressed. The behavior of the processors is represented by a sequence of memory requests with each followed by a determined amount of processing time. A statistical queuing network model for determining the extent of memory interference in multiprocessor systems with clusters of memory hierarchies is presented. The performance of the system is measured by the expected number of busy memory clusters. The results of the analytic model are compared with simulation results, and the correlation between them is found to be very high.
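
    For orientation on the performance measure used here (expected number of busy memory units), the following Python sketch computes the classic single-level baseline. It is not the hierarchical queuing network model of this paper. It assumes every processor issues exactly one request per cycle, to a memory module chosen uniformly at random and independently of the others, and that rejected requests are not queued. The function name `expected_busy_modules` is hypothetical.

```python
def expected_busy_modules(n_processors, n_modules):
    """Expected number of modules that receive at least one request in a cycle.

    Baseline assumptions (not taken from the paper): one request per processor
    per cycle, uniform and independent module choice, no queued retries.
    """
    # A given module is idle only if all processors pick some other module.
    p_idle = (1.0 - 1.0 / n_modules) ** n_processors
    return n_modules * (1.0 - p_idle)
```

    Under these assumptions, two processors sharing two modules keep 1.5 modules busy on average, so interference costs a quarter of the peak bandwidth even in this smallest case.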

  9. Elastic pointer directory organization for scalable shared memory multiprocessors

    Institute of Scientific and Technical Information of China (English)

    Yuhang Liu; Mingfa Zhu; Limin Xiao

    2014-01-01

    In the field of supercomputing, one key issue for scalable shared-memory multiprocessors is the design of the directory which denotes the sharing state for a cache block. A good directory design intends to achieve three key attributes: reasonable memory overhead, sharer position precision and implementation complexity. However, researchers often face the problem that gaining one attribute may result in losing another. The paper proposes an elastic pointer directory (EPD) structure based on the analysis of shared-memory applications, taking the fact that the number of sharers for each directory entry is typically small. Analysis results show that for 4096 nodes, the ratio of memory overhead to the full-map directory is 2.7%. Theoretical analysis and cycle-accurate execution-driven simulations on a 16- and 64-node cache coherence non-uniform memory access (CC-NUMA) multiprocessor show that the corresponding pointer overflow probability is reduced significantly. The performance is observed to be better than that of a limited pointers directory and almost identical to the full-map directory, except for the slight implementation complexity. Using the directory cache to explore directory access locality is also studied. The experimental result shows that this is a promising approach to be used in the state-of-the-art high performance computing domain.

  10. Assessing Programming Costs of Explicit Memory Localization on a Large Scale Shared Memory Multiprocessor

    Directory of Open Access Journals (Sweden)

    Silvio Picano

    1992-01-01

    We present detailed experimental work involving a commercially available large scale shared memory multiple instruction stream-multiple data stream (MIMD) parallel computer having a software controlled cache coherence mechanism. To make effective use of such an architecture, the programmer is responsible for designing the program's structure to match the underlying multiprocessor's capabilities. We describe the techniques used to exploit our multiprocessor (the BBN TC2000) on a network simulation program, showing the resulting performance gains and the associated programming costs. We show that an efficient implementation relies heavily on the user's ability to explicitly manage the memory system.

  11. Generation-based memory synchronization in a multiprocessor system with weakly consistent memory accesses

    Energy Technology Data Exchange (ETDEWEB)

    Ohmacht, Martin

    2017-08-15

    In a multiprocessor system, a central memory synchronization module coordinates memory synchronization requests responsive to memory access requests in flight, a generation counter, and a reclaim pointer. The central module communicates via point-to-point communication. The module includes a global OR reduce tree for each memory access requesting device, for detecting memory access requests in flight. An interface unit is implemented associated with each processor requesting synchronization. The interface unit includes multiple generation completion detectors. The generation count and reclaim pointer do not pass one another.

  12. Generation-based memory synchronization in a multiprocessor system with weakly consistent memory accesses

    Science.gov (United States)

    Ohmacht, Martin

    2014-09-09

    In a multiprocessor system, a central memory synchronization module coordinates memory synchronization requests responsive to memory access requests in flight, a generation counter, and a reclaim pointer. The central module communicates via point-to-point communication. The module includes a global OR reduce tree for each memory access requesting device, for detecting memory access requests in flight. An interface unit is implemented associated with each processor requesting synchronization. The interface unit includes multiple generation completion detectors. The generation count and reclaim pointer do not pass one another.

  13. Optical RAM-enabled cache memory and optical routing for chip multiprocessors: technologies and architectures

    Science.gov (United States)

    Pleros, Nikos; Maniotis, Pavlos; Alexoudi, Theonitsa; Fitsios, Dimitris; Vagionas, Christos; Papaioannou, Sotiris; Vyrsokinos, K.; Kanellos, George T.

    2014-03-01

    The processor-memory performance gap, commonly referred to as the "Memory Wall" problem, is due to the speed mismatch between processor and electronic RAM clock frequencies, forcing current Chip Multiprocessor (CMP) configurations to consume more than 50% of the chip real estate for caching purposes. In this article, we present our recent work spanning from Si-based integrated optical RAM cell architectures up to complete optical cache memory architectures for Chip Multiprocessor configurations. Moreover, we discuss e/o router subsystems with up to Tb/s routing capacity for cache interconnection purposes within CMP configurations, currently pursued within the FP7 PhoxTrot project.

  14. Shared random access memory resource for multiprocessor real-time systems

    International Nuclear Information System (INIS)

    Dimmler, D.G.; Hardy, W.H. II

    1977-01-01

    A shared random-access memory resource is described which is used within real-time data acquisition and control systems with multiprocessor and multibus organizations. Hardware and software aspects are discussed in a specific example where interconnections are done via a UNIBUS. The general applicability of the approach is also discussed

  15. Parallel-vector algorithms for particle simulations on shared-memory multiprocessors

    International Nuclear Information System (INIS)

    Nishiura, Daisuke; Sakaguchi, Hide

    2011-01-01

    Over the last few decades, the computational demands of massive particle-based simulations for both scientific and industrial purposes have been continuously increasing. Hence, considerable efforts are being made to develop parallel computing techniques on various platforms. In such simulations, particles freely move within a given space, and so on a distributed-memory system, load balancing, i.e., assigning an equal number of particles to each processor, is not guaranteed. However, shared-memory systems achieve better load balancing for particle models, but suffer from the intrinsic drawback of memory access competition, particularly during (1) pairing of contact candidates from among neighboring particles and (2) force summation for each particle. Here, novel algorithms are proposed to overcome these two problems. For the first problem, the key is a pre-conditioning process during which particle labels are sorted by a cell label in the domain to which the particles belong. Then, a list of contact candidates is constructed by pairing the sorted particle labels. For the latter problem, a table comprising the list indexes of the contact candidate pairs is created and used to sum the contact forces acting on each particle for all contacts according to Newton's third law. With just these methods, memory access competition is avoided without additional redundant procedures. The parallel efficiency and compatibility of these two algorithms were evaluated in discrete element method (DEM) simulations on four types of shared-memory parallel computers: a multicore multiprocessor computer, scalar supercomputer, vector supercomputer, and graphics processing unit. The computational efficiency of a DEM code was found to be drastically improved with our algorithms on all but the scalar supercomputer. Thus, the developed parallel algorithms are useful on shared-memory parallel computers with sufficient memory bandwidth.
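
    The two steps described in this abstract (sorting particle labels by cell, then summing forces through a per-particle table of pair indexes) can be sketched in serial Python as follows. This is an illustrative reading of the abstract, not the authors' implementation: the 2D grid, the function names `build_pairs` and `sum_forces`, and the requirement that `cell_size` be at least `cutoff` are all assumptions.

```python
from collections import defaultdict

def build_pairs(pos, cell_size, cutoff):
    """Return contact candidate pairs (i, j) with i < j and distance < cutoff.

    Assumes cell_size >= cutoff so that only adjacent cells need checking.
    """
    # Cell coordinates of every particle on a uniform 2D grid.
    cells = [(int(x // cell_size), int(y // cell_size)) for x, y in pos]
    # Pre-conditioning: sort particle labels by their cell label.
    order = sorted(range(len(pos)), key=lambda i: cells[i])
    # Group the sorted labels so each cell owns a contiguous run of particles.
    members = defaultdict(list)
    for i in order:
        members[cells[i]].append(i)
    pairs = []
    for i in order:
        cx, cy = cells[i]
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in members.get((cx + dx, cy + dy), ()):
                    if i < j:
                        rx = pos[i][0] - pos[j][0]
                        ry = pos[i][1] - pos[j][1]
                        if rx * rx + ry * ry < cutoff * cutoff:
                            pairs.append((i, j))
    return pairs

def sum_forces(n, pairs, pair_force):
    """Gather per-pair forces onto particles without conflicting writes."""
    # Table: for each particle, the indexes of the pairs it belongs to and the
    # sign with which that pair's force acts on it (Newton's third law).
    table = [[] for _ in range(n)]
    for k, (i, j) in enumerate(pairs):
        table[i].append((k, 1.0))
        table[j].append((k, -1.0))
    # Each particle reads shared pair data but writes only its own result,
    # so iterations of this loop never compete for the same memory location.
    forces = []
    for entries in table:
        fx = sum(sign * pair_force[k][0] for k, sign in entries)
        fy = sum(sign * pair_force[k][1] for k, sign in entries)
        forces.append((fx, fy))
    return forces
```

    Here `pair_force[k]` is taken to be the force that pair `k` exerts on its first particle; the second particle receives the opposite force through the sign stored in the table.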

  16. Ring interconnection for distributed memory automation and computing system

    Energy Technology Data Exchange (ETDEWEB)

    Vinogradov, V I [Inst. for Nuclear Research of the Russian Academy of Sciences, Moscow (Russian Federation)

    1996-12-31

    Problems of development of measurement, acquisition and central systems based on a distributed memory and a ring interface are discussed. It has been found that the RAM LINK-type protocol can be used for ringlet links in non-symmetrical distributed memory architecture multiprocessor system interaction. 5 refs.

  17. A Heterogeneous Multiprocessor Graphics System Using Processor-Enhanced Memories

    Science.gov (United States)

    1989-02-01

    frames per second, font generation directly from conic spline descriptions, and rapid calculation of radiosity form factors. The hardware consists of ... generality for rendering curved surfaces, volume data, objects described with Constructive Solid Geometry, for rendering scenes using the radiosity ... and for computing a spherical radiosity lighting model (see Section 7.6). [Figure labels: Custom Memory Chips, 208 bits x 128 pixels; Renderer Board]

  18. Multiprocessor systems and their concurrency

    Energy Technology Data Exchange (ETDEWEB)

    Starke, P H

    1984-01-01

    A multiprocessor system can be considered as a collection of finite automata which communicate over channels or shared memory units. The behaviour of such a system can be described by a semilanguage. This approach makes it possible to define a numerical measure for the concurrency of multiprocessor systems and of distributed systems. This measure is characterized algebraically, and the reconfiguration problem, which asks for an algorithm to construct an l-processor system equivalent to a given n-processor system, is solved in the paper. 6 references.

  19. A multiprocessor computer simulation model employing a feedback scheduler/allocator for memory space and bandwidth matching and TMR processing

    Science.gov (United States)

    Bradley, D. B.; Irwin, J. D.

    1974-01-01

    A computer simulation model for a multiprocessor computer is developed that is useful for studying the problem of matching a multiprocessor's memory space, memory bandwidth, and numbers and speeds of processors with aggregate job set characteristics. The model assumes an input work load of a set of recurrent jobs. The model includes a feedback scheduler/allocator which attempts to improve system performance through higher memory bandwidth utilization by matching individual job requirements for space and bandwidth with space availability and estimates of bandwidth availability at the times of memory allocation. The simulation model includes provisions for specifying precedence relations among the jobs in a job set, and provisions for specifying precedence execution of TMR (Triple Modular Redundant) and SIMPLEX (non-redundant) jobs.

  20. Debugging in a multi-processor environment

    International Nuclear Information System (INIS)

    Spann, J.M.

    1981-01-01

    The Supervisory Control and Diagnostic System (SCDS) for the Mirror Fusion Test Facility (MFTF) consists of nine 32-bit minicomputers arranged in a tightly coupled distributed computer system utilizing a shared memory as the data exchange medium. Debugging of more than one program in the multi-processor environment is a difficult process. This paper describes what new tools were developed and how the testing of software is performed in the SCDS for the MFTF project.

  1. Sparse distributed memory overview

    Science.gov (United States)

    Raugh, Mike

    1990-01-01

    The Sparse Distributed Memory (SDM) project is investigating the theory and applications of a massively parallel computing architecture, called sparse distributed memory, that will support the storage and retrieval of sensory and motor patterns characteristic of autonomous systems. The immediate objectives of the project are centered in studies of the memory itself and in the use of the memory to solve problems in speech, vision, and robotics. Investigation of methods for encoding sensory data is an important part of the research. Examples of NASA missions that may benefit from this work are Space Station, planetary rovers, and solar exploration. Sparse distributed memory offers promising technology for systems that must learn through experience and be capable of adapting to new circumstances, and for operating any large complex system requiring automatic monitoring and control. Sparse distributed memory is a massively parallel architecture motivated by efforts to understand how the human brain works. Sparse distributed memory is an associative memory, able to retrieve information from cues that only partially match patterns stored in the memory. It is able to store long temporal sequences derived from the behavior of a complex system, such as progressive records of the system's sensory data and correlated records of the system's motor controls.

  2. Distributed power management of real-time applications on a GALS multiprocessor SOC

    NARCIS (Netherlands)

    Nelson, Andrew; Goossens, Kees

    2015-01-01

    It is generally desirable to reduce the power consumption of embedded systems. Dynamic Voltage and Frequency Scaling (DVFS) is a commonly applied technique to achieve power reduction at the cost of computational performance. Multiprocessor System on Chips (MPSoCs) can have multiple voltage and

  3. Multiprocessor development for robot control

    International Nuclear Information System (INIS)

    Lee, Jong Min; Kim, Byung Soo; Kim, Chang Hoi; Hwang, Suk Yong; Sohn, Surg Won; Yoon, Tae Seob; Lee, Yong Bum; Kim, Woong Ki

    1988-02-01

    A multiprocessor system that is essential to A.I. (Artificial Intelligence) robot control was developed. A.I. robot control needs very complex real-time control. A multiprocessor system interconnecting many SBCs (single board computers) is much faster and more accurate than a system using only one SBC. Various multiprocessor systems and their applications were compared and discussed. The multiprocessor architecture is specially designed to be used in nuclear environments. The main functions are job distribution, multitasking, and intelligent remote control by SDLC protocol using optical fiber. The system can be applied to position control for locomotion and manipulation, data fusion systems, and image processing. (Author)

  4. Matrix factorization on a hypercube multiprocessor

    International Nuclear Information System (INIS)

    Geist, G.A.; Heath, M.T.

    1985-08-01

    This paper is concerned with parallel algorithms for matrix factorization on distributed-memory, message-passing multiprocessors, with special emphasis on the hypercube. Both Cholesky factorization of symmetric positive definite matrices and LU factorization of nonsymmetric matrices using partial pivoting are considered. The use of the resulting triangular factors to solve systems of linear equations by forward and back substitutions is also considered. Efficiencies of various parallel computational approaches are compared in terms of empirical results obtained on an Intel iPSC hypercube. 19 refs., 6 figs., 2 tabs
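
    As a serial point of reference for the factor-then-substitute workflow this abstract describes, here is a minimal Python sketch of Cholesky factorization followed by forward and back substitution. It is not the parallel hypercube algorithm studied in the paper, LU factorization with partial pivoting is omitted, and the function names `cholesky` and `solve_cholesky` are assumptions made for illustration.

```python
import math

def cholesky(a):
    """Return lower-triangular L with L * L^T == a.

    `a` must be a symmetric positive definite matrix given as a list of rows.
    """
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: remove the contribution of columns already computed.
        diag = a[j][j] - sum(low[j][k] ** 2 for k in range(j))
        if diag <= 0.0:
            raise ValueError("matrix is not positive definite")
        low[j][j] = math.sqrt(diag)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = s / low[j][j]
    return low

def solve_cholesky(low, b):
    """Solve (L * L^T) x = b given the Cholesky factor L."""
    n = len(b)
    # Forward substitution: L y = b.
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    # Back substitution: L^T x = y.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x
```

    In a distributed-memory setting the interesting questions are how the rows or columns of the matrix are assigned to processors and how the column updates are communicated; this sketch shows only the arithmetic those schemes must reproduce.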

  5. Sparse distributed memory

    Science.gov (United States)

    Denning, Peter J.

    1989-01-01

    Sparse distributed memory was proposed by Pentti Kanerva as a realizable architecture that could store large patterns and retrieve them based on partial matches with patterns representing current sensory inputs. This memory exhibits behaviors, both in theory and in experiment, that resemble those previously unapproached by machines - e.g., rapid recognition of faces or odors, discovery of new connections between seemingly unrelated ideas, continuation of a sequence of events when given a cue from the middle, knowing that one doesn't know, or getting stuck with an answer on the tip of one's tongue. These behaviors are now within reach of machines that can be incorporated into the computing systems of robots capable of seeing, talking, and manipulating. Kanerva's theory is a break with the Western rationalistic tradition, allowing a new interpretation of learning and cognition that respects biology and the mysteries of individual human beings.
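
    Retrieval from a partially matching cue can be illustrated with a small Python sketch of a Kanerva-style memory. This is a toy model under stated assumptions, not code from the report: the word length (256 bits), the number of hard locations (2000), the Hamming activation radius (112), and the names `SparseDistributedMemory` and `flip_bits` are all chosen here for illustration.

```python
import random

class SparseDistributedMemory:
    """Toy sparse distributed memory over fixed-length binary words."""

    def __init__(self, n_bits=256, n_locations=2000, radius=112, seed=0):
        rng = random.Random(seed)
        self.n_bits = n_bits
        self.radius = radius
        # Fixed random addresses of the "hard locations".
        self.addresses = [rng.getrandbits(n_bits) for _ in range(n_locations)]
        # One signed counter per bit at every hard location.
        self.counters = [[0] * n_bits for _ in range(n_locations)]

    def _active(self, address):
        # Hard locations within the Hamming radius of the given address.
        return [k for k, a in enumerate(self.addresses)
                if bin(a ^ address).count("1") <= self.radius]

    def write(self, address, data):
        # Add the word to every active location: +1 for a 1 bit, -1 for a 0 bit.
        for k in self._active(address):
            row = self.counters[k]
            for bit in range(self.n_bits):
                row[bit] += 1 if (data >> bit) & 1 else -1

    def read(self, address):
        # Pool the counters of all active locations, then threshold each bit at 0.
        sums = [0] * self.n_bits
        for k in self._active(address):
            row = self.counters[k]
            for bit in range(self.n_bits):
                sums[bit] += row[bit]
        return sum(1 << bit for bit in range(self.n_bits) if sums[bit] > 0)

def flip_bits(word, n_bits, n_flips, rng):
    """Return `word` with `n_flips` distinct, randomly chosen bits inverted."""
    for bit in rng.sample(range(n_bits), n_flips):
        word ^= 1 << bit
    return word
```

    A typical use is autoassociative: call `write(p, p)` for each pattern `p`, then call `read` with a corrupted copy of one pattern. Because the corrupted cue still activates many of the locations that the original pattern was written to, the pooled counters tend to reproduce the stored word.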

  6. Event parallelism: Distributed memory parallel computing for high energy physics experiments

    International Nuclear Information System (INIS)

    Nash, T.

    1989-05-01

    This paper describes the present and expected future development of distributed memory parallel computers for high energy physics experiments. It covers the use of event parallel microprocessor farms, particularly at Fermilab, including both ACP multiprocessors and farms of MicroVAXES. These systems have proven very cost effective in the past. A case is made for moving to the more open environment of UNIX and RISC processors. The 2nd Generation ACP Multiprocessor System, which is based on powerful RISC systems, is described. Given the promise of still more extraordinary increases in processor performance, a new emphasis on point to point, rather than bussed, communication will be required. Developments in this direction are described. 6 figs

  7. Event parallelism: Distributed memory parallel computing for high energy physics experiments

    International Nuclear Information System (INIS)

    Nash, T.

    1989-01-01

    This paper describes the present and expected future development of distributed memory parallel computers for high energy physics experiments. It covers the use of event parallel microprocessor farms, particularly at Fermilab, including both ACP multiprocessors and farms of MicroVAXES. These systems have proven very cost effective in the past. A case is made for moving to the more open environment of UNIX and RISC processors. The 2nd Generation ACP Multiprocessor System, which is based on powerful RISC systems, is described. Given the promise of still more extraordinary increases in processor performance, a new emphasis on point to point, rather than bussed, communication will be required. Developments in this direction are described. (orig.)

  8. Event parallelism: Distributed memory parallel computing for high energy physics experiments

    Science.gov (United States)

    Nash, Thomas

    1989-12-01

    This paper describes the present and expected future development of distributed memory parallel computers for high energy physics experiments. It covers the use of event parallel microprocessor farms, particularly at Fermilab, including both ACP multiprocessors and farms of MicroVAXES. These systems have proven very cost effective in the past. A case is made for moving to the more open environment of UNIX and RISC processors. The 2nd Generation ACP Multiprocessor System, which is based on powerful RISC systems, is described. Given the promise of still more extraordinary increases in processor performance, a new emphasis on point to point, rather than bussed, communication will be required. Developments in this direction are described.

  9. Multiprocessor development for robot control

    International Nuclear Information System (INIS)

    Lee, Jong Min; Kim, Seung Ho; Hwang, Suk Yeoung; Sohn, Surg Won; Kim, Byung Soo; Kim, Chang Hoi; Lee, Yong Bum; Kim, Woong Ki

    1988-12-01

    The objective of this project is to develop a multiprocessor system, which is essential to robot technology. A multiprocessor system interconnecting many single board computers is much faster and more flexible than a single processor. The developed multiprocessor will be used to control a nuclear mobile robot, so a loosely coupled system is adopted as the robot controller. The overall configuration of the controller is divided into three main parts according to function: a supervisory control part, a functional control part, and a remote control part. The control system is designed with a modular architecture so that it can be expanded easily for further use and so that functional independence between sub-systems is maintained throughout the system structure. Electromagnetic interference affecting the control system is minimized by using optical fiber as the communication medium between the robot and the control system. System performance is enhanced not only by using a distributed architecture in hardware, but also by adopting a real-time, multi-tasking operating system in software. The iRMX86 OS is used and reconfigured for real-time, multi-tasking operation. The RS-485 serial communication protocol is used between the functional control part and the remote control part. Since the developed multiprocessor control system is an essential and fundamental technology for artificially intelligent robots, the result of this project can be applied directly to nuclear mobile robots. (Author)

  10. Utilizing a multiprocessor architecture - The performance of MIDAS

    International Nuclear Information System (INIS)

    Maples, C.; Logan, D.; Meng, J.; Rathbun, W.; Weaver, D.

    1983-01-01

    The MIDAS architecture organizes multiple CPUs into clusters called distributed subsystems. Each subsystem consists of an array of processors controlled by a supervisory CPU. The multiprocessor array is composed of commercial CPUs (with floating point hardware) and specialized processing elements. Interprocessor communication within the array may occur either through switched memory modules or common shared memory. The architecture permits multiple processors to be focused on single problems. A distributed subsystem has been constructed and tested. It currently consists of a supervisor CPU; 16 blocks of independently switchable memory; 9 general purpose, VAX-class CPUs; and 2 specialized pipelined processors to handle I/O. Results on a variety of problems indicate that the subsystem performs 8 to 15 times faster than a standard computer with an identical CPU. The difference in performance represents the effect of differing CPU and I/O requirements

  11. A Multiprocessor Operating System Simulator

    Science.gov (United States)

    Johnston, Gary M.; Campbell, Roy H.

    1988-01-01

    This paper describes a multiprocessor operating system simulator that was developed by the authors in the Fall semester of 1987. The simulator was built in response to the need to provide students with an environment in which to build and test operating system concepts as part of the coursework of a third-year undergraduate operating systems course. Written in C++, the simulator uses the co-routine style task package that is distributed with the AT&T C++ Translator to provide a hierarchy of classes that represents a broad range of operating system software and hardware components. The class hierarchy closely follows that of the 'Choices' family of operating systems for loosely- and tightly-coupled multiprocessors. During an operating system course, these classes are refined and specialized by students in homework assignments to facilitate experimentation with different aspects of operating system design and policy decisions. The current implementation runs on the IBM RT PC under 4.3bsd UNIX.

  12. On the Parallel Elliptic Single/Multigrid Solutions about Aligned and Nonaligned Bodies Using the Virtual Machine for Multiprocessors

    Directory of Open Access Journals (Sweden)

    A. Averbuch

    1994-01-01

    Parallel elliptic single/multigrid solutions around an aligned and nonaligned body are presented and implemented on two multi-user and single-user shared memory multiprocessors (Sequent Symmetry and MOS) and on a distributed memory multiprocessor (a Transputer network). Our parallel implementation uses the Virtual Machine for Multi-Processors (VMMP), a software package that provides a coherent set of services for explicitly parallel application programs running on diverse multiple instruction multiple data (MIMD) multiprocessors, both shared memory and message passing. VMMP is intended to simplify parallel program writing and to promote portable and efficient programming. Furthermore, it ensures high portability of application programs by implementing the same services on all target multiprocessors. The performance of our algorithm is investigated in detail. It is seen to fit the above architectures well when the number of processors is less than the maximal number of grid points along the axes. In general, the efficiency in the nonaligned case is higher than in the aligned case. Alignment overhead is observed to be up to 200% in the shared-memory case and up to 65% in the message-passing case. We have demonstrated that when using VMMP, the portability of the algorithms is straightforward and efficient.

  13. Distributed-memory matrix computations

    DEFF Research Database (Denmark)

    Balle, Susanne Mølleskov

    1995-01-01

    The main goal of this project is to investigate, develop, and implement algorithms for numerical linear algebra on parallel computers in order to acquire expertise in methods for parallel computations. An important motivation for analyzing and investigating the potential for parallelism in these algorithms is that many scientific applications rely heavily on the performance of the involved dense linear algebra building blocks. Even though we consider the distributed-memory as well as the shared-memory programming paradigm, the major part of the thesis is dedicated to distributed-memory architectures. We emphasize distributed-memory massively parallel computers - such as the Connection Machines model CM-200 and model CM-5/CM-5E - available to us at UNI-C and at Thinking Machines Corporation. The CM-200 was at the time this project started one of the few existing massively parallel computers...

  14. Multiprocessor data acquisition system

    International Nuclear Information System (INIS)

    Haumann, J.R.; Crawford, R.K.

    1987-01-01

    A multiprocessor data acquisition system has been built to replace the single processor systems at the Intense Pulsed Neutron Source (IPNS) at Argonne National Laboratory. The multiprocessor system was needed to accommodate the higher data rates at IPNS brought about by improvements in the source and changes in instrument configurations. This paper describes the hardware configuration of the system and the method of task sharing and compares results to the single processor system

  15. Multiprocessor architecture: Synthesis and evaluation

    Science.gov (United States)

    Standley, Hilda M.

    1990-01-01

    Multiprocessor computer architecture evaluation for structural computations is the focus of the research effort described. Results obtained are expected to lead to more efficient use of existing architectures and to suggest designs for new, application-specific architectures. The brief descriptions given outline a number of related efforts directed toward this purpose. The difficulty in analyzing an existing architecture or in designing a new computer architecture lies in the fact that the performance of a particular architecture, within the context of a given application, is determined by a number of factors. These include, but are not limited to, the efficiency of the computation algorithm, the programming language and support environment, the quality of the program written in the programming language, the multiplicity of the processing elements, the characteristics of the individual processing elements, the interconnection network connecting processors and non-local memories, and the shared memory organization covering the spectrum from no shared memory (all local memory) to one global access memory. These performance determiners may be loosely classified as being software or hardware related. This distinction is not clear or even appropriate in many cases. The effect of the choice of algorithm is ignored by assuming that the algorithm is specified as given. Effort directed toward the removal of the effect of the programming language and program resulted in the design of a high-level parallel programming language. Two characteristics of the fundamental structure of the architecture (memory organization and interconnection network) are examined.

  16. Simulation of Particulate Flows on Multi-Processor Machines with Distributed Memory

    International Nuclear Information System (INIS)

    Uhlmann, M.

    2004-01-01

    We present a method for the parallelization of an immersed boundary algorithm for particulate flows using the MPI standard of communication. The treatment of the fluid phase uses the domain decomposition technique over a Cartesian processor grid. The solution of the Helmholtz problem is approximately factorized and relies upon a parallel tri-diagonal solver; the Poisson problem is solved by means of a parallel multi-grid technique similar to MUDPACK. For the solid phase we employ a master-slaves technique where one processor handles all the particles contained in its Eulerian fluid sub-domain and zero or more neighbor processors collaborate in the computation of particle-related quantities whenever a particle position overlaps the boundary of a sub-domain. The parallel efficiency for some preliminary computations is presented. (Author) 9 refs
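
    As an illustration of the kind of tri-diagonal solve mentioned above, here is a minimal serial sketch using the Thomas algorithm. It is not the parallel solver used by the author; the function name `thomas` and the argument layout are assumptions chosen for this example.

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system with the (serial) Thomas algorithm.

    a: sub-diagonal (a[0] is unused), b: main diagonal,
    c: super-diagonal (c[-1] is unused), d: right-hand side.
    All four lists have the same length n.
    """
    n = len(d)
    cp = [0.0] * n  # modified super-diagonal
    dp = [0.0] * n  # modified right-hand side
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    # Forward elimination.
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    # Back substitution.
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```

    The sketch assumes a diagonally dominant matrix, so no pivoting is performed. A parallel variant would additionally have to exchange boundary rows between neighbouring sub-domains.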

  17. Simulation of Particulate Flows on Multi-Processor Machines with Distributed Memory

    Energy Technology Data Exchange (ETDEWEB)

    Uhlmann, M.

    2004-07-01

    We present a method for the parallelization of an immersed boundary algorithm for particulate flows using the MPI standard of communication. The treatment of the fluid phase uses the domain decomposition technique over a Cartesian processor grid. The solution of the Helmholtz problem is approximately factorized and relies upon a parallel tri-diagonal solver; the Poisson problem is solved by means of a parallel multi-grid technique similar to MUDPACK. For the solid phase we employ a master-slaves technique where one processor handles all the particles contained in its Eulerian fluid sub-domain and zero or more neighbor processors collaborate in the computation of particle-related quantities whenever a particle position overlaps the boundary of a sub-domain. The parallel efficiency for some preliminary computations is presented. (Author) 9 refs.

  18. Over-Distribution in Source Memory

    Science.gov (United States)

    Brainerd, C. J.; Reyna, V. F.; Holliday, R. E.; Nakamura, K.

    2012-01-01

    Semantic false memories are confounded with a second type of error, over-distribution, in which items are attributed to contradictory episodic states. Over-distribution errors have proved to be more common than false memories when the two are disentangled. We investigated whether over-distribution is prevalent in another classic false memory paradigm: source monitoring. It is. Conventional false memory responses (source misattributions) were predominantly over-distribution errors, but unlike semantic false memory, over-distribution also accounted for more than half of true memory responses (correct source attributions). Experimental control of over-distribution was achieved via a series of manipulations that affected either recollection of contextual details or item memory (concreteness, frequency, list-order, number of presentation contexts, and individual differences in verbatim memory). A theoretical model (conjoint process dissociation) was used to analyze the data; it predicts that (a) over-distribution is directly proportional to item memory but inversely proportional to recollection and (b) item memory is not a necessary precondition for recollection of contextual details. The results were consistent with both predictions. PMID:21942494

  19. Multiprocessor programming environment

    Energy Technology Data Exchange (ETDEWEB)

    Smith, M.B.; Fornaro, R.

    1988-12-01

    Programming tools and techniques have been well developed for traditional uniprocessor computer systems. The focus of this research project is on the development of a programming environment for a high speed real time heterogeneous multiprocessor system, with special emphasis on languages and compilers. The new tools and techniques will allow a smooth transition for programmers with experience only on single processor systems.

  20. Total recall in distributive associative memories

    Science.gov (United States)

    Danforth, Douglas G.

    1991-01-01

    Iterative error correction of asymptotically large associative memories is equivalent to a one-step learning rule. This rule is the inverse of the activation function of the memory. Spectral representations of nonlinear activation functions are used to obtain the inverse in closed form for Sparse Distributed Memory, Selected-Coordinate Design, and Radial Basis Functions.

  1. Parallel simulated annealing algorithms for cell placement on hypercube multiprocessors

    Science.gov (United States)

    Banerjee, Prithviraj; Jones, Mark Howard; Sargent, Jeff S.

    1990-01-01

    Two parallel algorithms for standard cell placement using simulated annealing are developed to run on distributed-memory message-passing hypercube multiprocessors. The cells can be mapped in a two-dimensional area of a chip onto processors in an n-dimensional hypercube in two ways, such that both small and large cell exchange and displacement moves can be applied. The computation of the cost function in parallel among all the processors in the hypercube is described, along with a distributed data structure that needs to be stored in the hypercube to support the parallel cost evaluation. A novel tree broadcasting strategy is used extensively for updating cell locations in the parallel environment. A dynamic parallel annealing schedule estimates the errors due to interacting parallel moves and adapts the rate of synchronization automatically. Two novel approaches in controlling error in parallel algorithms are described: heuristic cell coloring and adaptive sequence control.
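
    For readers unfamiliar with the underlying technique, the following is a minimal serial sketch of simulated annealing with cell exchange moves. It does not reproduce the parallel hypercube algorithms of the paper; the cost function (Manhattan length of two-pin nets), the geometric cooling schedule, and all names and parameter values are assumptions made for illustration.

```python
import math
import random


def wirelength(pos, nets):
    """Total Manhattan length over two-pin nets; pos maps cell -> (x, y)."""
    return sum(
        abs(pos[a][0] - pos[b][0]) + abs(pos[a][1] - pos[b][1]) for a, b in nets
    )


def anneal(pos, nets, t0=5.0, alpha=0.95, moves_per_t=200, t_min=0.01, seed=0):
    """Return (best placement, best cost) found by serial simulated annealing."""
    rng = random.Random(seed)
    pos = dict(pos)  # work on a copy
    cells = list(pos)
    cost = wirelength(pos, nets)
    best, best_cost = dict(pos), cost
    t = t0
    while t > t_min:
        for _ in range(moves_per_t):
            a, b = rng.sample(cells, 2)
            pos[a], pos[b] = pos[b], pos[a]  # exchange move
            new_cost = wirelength(pos, nets)
            delta = new_cost - cost
            # Metropolis criterion: always accept improvements,
            # accept uphill moves with probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best, best_cost = dict(pos), cost
            else:
                pos[a], pos[b] = pos[b], pos[a]  # reject: undo the swap
        t *= alpha  # geometric cooling (assumed schedule)
    return best, best_cost
```

    In a parallel setting, several such moves are evaluated concurrently against slightly stale cell locations, which is the source of the error that the abstract's synchronization and coloring schemes aim to control.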

  2. A survey of Tumult, a real-time multi-processor system

    International Nuclear Information System (INIS)

    Jansen, P.G.

    1986-01-01

    Tumult (Twente University MULTi processor system) is the name of an ongoing project aiming at the design and implementation of a modular extendible multiprocessor system. All memory is distributed and processors communicate in parallel via a fast and reliable local switching network instead of a shared bus. A distributed real-time operating system is being designed and implemented, consisting of a multi-tasking subsystem per processor. Processes can communicate via a message passing mechanism. Communication links and processes are dynamically created and disposed by the application. In this article a brief description of the system is given; communication aspects are emphasized. (Auth.)

  3. The art of multiprocessor programming

    CERN Document Server

    Herlihy, Maurice

    2012-01-01

    Revised and updated with improvements conceived in parallel programming courses, The Art of Multiprocessor Programming is an authoritative guide to multicore programming. It introduces a higher level set of software development skills than that needed for efficient single-core programming. This book provides comprehensive coverage of the new principles, algorithms, and tools necessary for effective multiprocessor programming. Students and professionals alike will benefit from thorough coverage of key multiprocessor programming issues. This revised edition incorporates much-demanded updates t

  4. Distributed learning enhances relational memory consolidation.

    Science.gov (United States)

    Litman, Leib; Davachi, Lila

    2008-09-01

    It has long been known that distributed learning (DL) provides a mnemonic advantage over massed learning (ML). However, the underlying mechanisms that drive this robust mnemonic effect remain largely unknown. In two experiments, we show that DL across a 24 hr interval does not enhance immediate memory performance but instead slows the rate of forgetting relative to ML. Furthermore, we demonstrate that this savings in forgetting is specific to relational, but not item, memory. In the context of extant theories and knowledge of memory consolidation, these results suggest that an important mechanism underlying the mnemonic benefit of DL is enhanced memory consolidation. We speculate that synaptic strengthening mechanisms supporting long-term memory consolidation may be differentially mediated by the spacing of memory reactivation. These findings have broad implications for the scientific study of episodic memory consolidation and, more generally, for educational curriculum development and policy.

  5. Distributed-Memory Fast Maximal Independent Set

    Energy Technology Data Exchange (ETDEWEB)

    Kanewala Appuhamilage, Thejaka Amila J.; Zalewski, Marcin J.; Lumsdaine, Andrew

    2017-09-13

    The Maximal Independent Set (MIS) graph problem arises in many applications such as computer vision, information theory, molecular biology, and process scheduling. The growing scale of MIS problems suggests the use of distributed-memory hardware as a cost-effective approach to providing necessary compute and memory resources. Luby proposed four randomized algorithms to solve the MIS problem. All those algorithms are designed focusing on shared-memory machines and are analyzed using the PRAM model. These algorithms do not have direct efficient distributed-memory implementations. In this paper, we extend two of Luby’s seminal MIS algorithms, “Luby(A)” and “Luby(B),” to distributed-memory execution, and we evaluate their performance. We compare our results with the “Filtered MIS” implementation in the Combinatorial BLAS library for two types of synthetic graph inputs.
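
    The general idea behind Luby-style randomized MIS can be sketched as a serial simulation of the synchronous rounds. This is a generic random-priority variant written for illustration; it is not claimed to be the paper's "Luby(A)" or "Luby(B)" implementation, and the function name and graph representation are assumptions.

```python
import random


def luby_mis(adj, seed=0):
    """Return a maximal independent set of a simple undirected graph.

    adj maps each vertex to the set of its neighbours (no self-loops).
    """
    rng = random.Random(seed)
    live = set(adj)  # vertices that are still undecided
    mis = set()
    while live:
        # Each undecided vertex draws a random priority for this round.
        prio = {v: rng.random() for v in live}
        # A vertex joins the set if it beats every undecided neighbour;
        # the vertex id breaks ties between equal priorities.
        winners = {
            v
            for v in live
            if all((prio[v], v) < (prio[u], u) for u in adj[v] if u in live)
        }
        mis |= winners
        # Winners and their neighbours are now decided.
        removed = set(winners)
        for v in winners:
            removed |= adj[v]
        live -= removed
    return mis
```

    In a distributed-memory version, each round would require vertices to exchange priorities and membership decisions with neighbours owned by other processes, which is where the communication cost arises.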

  6. TUMULT, the Twente University multiprocessor

    NARCIS (Netherlands)

    Scholten, Johan; Jansen, P.G.

    1988-01-01

    TUMULT, (Twente University multiprocessor) is described. Its aim is the design and implementation of a modular extendable multiprocessor system. Up to 15 processing elements are connected through an interprocessor communication network, using message-passing for the exchange of data. The hardware is

  7. File-System Workload on a Scientific Multiprocessor

    Science.gov (United States)

    Kotz, David; Nieuwejaar, Nils

    1995-01-01

    Many scientific applications have intense computational and I/O requirements. Although multiprocessors have permitted astounding increases in computational performance, the formidable I/O needs of these applications cannot be met by current multiprocessors and their I/O subsystems. To prevent I/O subsystems from forever bottlenecking multiprocessors and limiting the range of feasible applications, new I/O subsystems must be designed. The successful design of computer systems (both hardware and software) depends on a thorough understanding of their intended use. A system designer optimizes the policies and mechanisms for the cases expected to be most common in the user's workload. In the case of multiprocessor file systems, however, designers have been forced to build file systems based only on speculation about how they would be used, extrapolating from file-system characterizations of general-purpose workloads on uniprocessor and distributed systems or scientific workloads on vector supercomputers (see sidebar on related work). To help these system designers, in June 1993 we began the Charisma Project, so named because the project sought to characterize I/O in scientific multiprocessor applications from a variety of production parallel computing platforms and sites. The Charisma project is unique in recording individual read and write requests in live, multiprogramming, parallel workloads (rather than from selected or nonparallel applications). In this article, we present the first results from the project: a characterization of the file-system workload on an iPSC/860 multiprocessor running production, parallel scientific applications at NASA's Ames Research Center.

  8. Contention Modeling for Multithreaded Distributed Shared Memory Machines: The Cray XMT

    Energy Technology Data Exchange (ETDEWEB)

    Secchi, Simone; Tumeo, Antonino; Villa, Oreste

    2011-07-27

    Distributed Shared Memory (DSM) machines are a wide class of multi-processor computing systems where a large virtually-shared address space is mapped on a network of physically distributed memories. High memory latency and network contention are two of the main factors that limit performance scaling of such architectures. Modern high-performance computing DSM systems have evolved toward exploitation of massive hardware multi-threading and fine-grained memory hashing to tolerate irregular latencies, avoid network hot-spots and enable high scaling. In order to model the performance of such large-scale machines, parallel simulation has proved to be a promising approach to achieve good accuracy in reasonable times. One of the most critical factors in solving the simulation speed-accuracy trade-off is network modeling. The Cray XMT is a massively multi-threaded supercomputing architecture that belongs to the DSM class, since it implements a globally-shared address space abstraction on top of a physically distributed memory substrate. In this paper, we discuss the development of a contention-aware network model intended to be integrated in a full-system XMT simulator. We start by measuring the effects of network contention in a 128-processor XMT machine and then investigate the trade-off that exists between simulation accuracy and speed, by comparing three network models which operate at different levels of accuracy. The comparison and model validation are performed by executing a string-matching algorithm on the full-system simulator and on the XMT, using three datasets that generate noticeably different contention patterns.

  9. Multiprocessors for high energy physics

    International Nuclear Information System (INIS)

    Pohl, M.

    1987-01-01

    I review the role, status and progress of multiprocessor projects relevant to high energy physics. A short overview of the large variety of multiprocessor architectures is given, with special emphasis on machines suitable for experimental data reconstruction. A lot of progress has been made in the attempt to make the use of multiprocessors less painful by creating a "Parallel Programming Environment" supporting the non-expert user. A high degree of usability has been reached for coarse grain (event level) parallelism. The program development tools available on various systems (subroutine packages, preprocessors and parallelizing compilers) are discussed in some detail. Tools for execution control and debugging are also developing, thus opening the path from dedicated systems for large scale, stable production towards support of a more general job mix. In the medium term, multiprocessors will thus cover a growing fraction of the typical high energy physics computing task. (orig.)

  10. Embedded multiprocessors scheduling and synchronization

    CERN Document Server

    Sriram, Sundararajan

    2009-01-01

    Techniques for Optimizing Multiprocessor Implementations of Signal Processing Applications. An indispensable component of the information age, signal processing is embedded in a variety of consumer devices, including cell phones and digital television, as well as in communication infrastructure, such as media servers and cellular base stations. Multiple programmable processors, along with custom hardware running in parallel, are needed to achieve the computation throughput required of such applications. Reviews important research in key areas related to the multiprocessor implementation of multi

  11. The Distributed Nature of Working Memory

    NARCIS (Netherlands)

    Christophel, Thomas B.; Klink, P. Christiaan; Spitzer, Bernhard; Roelfsema, Pieter R.; Haynes, John-Dylan

    2017-01-01

    Studies in humans and non-human primates have provided evidence for storage of working memory contents in multiple regions ranging from sensory to parietal and prefrontal cortex. We discuss potential explanations for these distributed representations: (i) features in sensory regions versus

  12. Performance of Multithreaded Chip Multiprocessors And Implications for Operating System Design

    OpenAIRE

    Fedorova, Alexandra; Seltzer, Margo I.; Small, Christopher A.; Nussbaum, Daniel

    2005-01-01

    An operating system’s design is often influenced by the architecture of the target hardware. While uniprocessor and multiprocessor architectures are well understood, such is not the case for multithreaded chip multiprocessors (CMT) – a new generation of processors designed to improve performance of memory-intensive applications. The first systems equipped with CMT processors are just becoming available, so it is critical that we now understand how to obtain the best performance from such syst...

  13. Distributed terascale volume visualization using distributed shared virtual memory

    KAUST Repository

    Beyer, Johanna; Hadwiger, Markus; Schneider, Jens; Jeong, Wonki; Pfister, Hanspeter

    2011-01-01

    Table 1 illustrates the impact of different distribution unit sizes, different screen resolutions, and numbers of GPU nodes. We use two and four GPUs (NVIDIA Quadro 5000 with 2.5 GB memory) and a mouse cortex EM dataset (see Figure 2) of resolution

  14. Real-Time Multiprocessor Programming Language (RTMPL) user's manual

    Science.gov (United States)

    Arpasi, D. J.

    1985-01-01

    A real-time multiprocessor programming language (RTMPL) has been developed to provide for high-order programming of real-time simulations on systems of distributed computers. RTMPL is a structured, engineering-oriented language. The RTMPL utility supports a variety of multiprocessor configurations and types by generating assembly language programs according to user-specified targeting information. Many programming functions are assumed by the utility (e.g., data transfer and scaling) to reduce the programming chore. This manual describes RTMPL from a user's viewpoint. Source generation, applications, utility operation, and utility output are detailed. An example simulation is generated to illustrate many RTMPL features.

  15. Distributed terascale volume visualization using distributed shared virtual memory

    KAUST Repository

    Beyer, Johanna

    2011-10-01

    Table 1 illustrates the impact of different distribution unit sizes, different screen resolutions, and numbers of GPU nodes. We use two and four GPUs (NVIDIA Quadro 5000 with 2.5 GB memory) and a mouse cortex EM dataset (see Figure 2) of resolution 21,494 x 25,790 x 1,850 = 955GB. The size of the virtual distribution units significantly influences the data distribution between nodes. Small distribution units result in a high depth complexity for compositing. Large distribution units lead to a low utilization of GPUs, because in the worst case only a single distribution unit will be in view, which is rendered by only a single node. The choice of an optimal distribution unit size depends on three major factors: the output screen resolution, the block cache size on each node, and the number of nodes. Currently, we are working on optimizing the compositing step and network communication between nodes. © 2011 IEEE.
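
    The quoted dataset size can be checked with a few lines of arithmetic. The sketch below assumes one byte per voxel (8-bit samples), which the abstract does not state; under that assumption the product of the three dimensions matches the quoted figure when expressed in binary gigabytes.

```python
# Volume dimensions in voxels, as quoted in the abstract.
nx, ny, nz = 21_494, 25_790, 1_850

# Assumption: 8-bit samples, i.e. one byte per voxel (not stated in the abstract).
bytes_per_voxel = 1

voxels = nx * ny * nz
size_bytes = voxels * bytes_per_voxel
size_gib = size_bytes / 2**30  # binary gigabytes (GiB)
size_gb = size_bytes / 10**9   # decimal gigabytes (GB)

print(f"{voxels:,} voxels = {size_gib:.1f} GiB = {size_gb:.1f} GB")
```

    The volume therefore far exceeds the 2.5 GB of memory on each GPU, which is why it has to be split into distribution units and spread across nodes.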

  16. Cache aware mapping of streaming applications on a multiprocessor system-on-chip

    NARCIS (Netherlands)

    Moonen, A.J.M.; Bekooij, M.J.G.; Berg, van den R.M.J.; Meerbergen, van J.; Sciuto, D.; Peng, Z.

    2008-01-01

    Efficient use of the memory hierarchy is critical for achieving high performance in a multiprocessor system-on-chip. An external memory that is shared between processors is a bottleneck in current and future systems. Cache misses and a large cache miss penalty contribute to a low processor

  17. A view of Kanerva's sparse distributed memory

    Science.gov (United States)

    Denning, P. J.

    1986-01-01

    Pentti Kanerva is working on a new class of computers, which are called pattern computers. Pattern computers may close the gap between capabilities of biological organisms to recognize and act on patterns (visual, auditory, tactile, or olfactory) and capabilities of modern computers. Combinations of numeric, symbolic, and pattern computers may one day be capable of sustaining robots. The overview of the requirements for a pattern computer, a summary of Kanerva's Sparse Distributed Memory (SDM), and examples of tasks this computer can be expected to perform well are given.

  18. Distributed Memory Parallel Computing with SEAWAT

    Science.gov (United States)

    Verkaik, J.; Huizer, S.; van Engelen, J.; Oude Essink, G.; Ram, R.; Vuik, K.

    2017-12-01

    Fresh groundwater reserves in coastal aquifers are threatened by sea-level rise, extreme weather conditions, increasing urbanization and associated groundwater extraction rates. To counteract these threats, accurate high-resolution numerical models are required to optimize the management of these precious reserves. The major model drawbacks are long run times and large memory requirements, limiting the predictive power of these models. Distributed memory parallel computing is an efficient technique for reducing run times and memory requirements, where the problem is divided over multiple processor cores. A new Parallel Krylov Solver (PKS) for SEAWAT is presented. PKS has recently been applied to MODFLOW and includes Conjugate Gradient (CG) and Biconjugate Gradient Stabilized (BiCGSTAB) linear accelerators. Both accelerators are preconditioned by an overlapping additive Schwarz preconditioner in a way that: a) subdomains are partitioned using Recursive Coordinate Bisection (RCB) load balancing, b) each subdomain uses local memory only and communicates with other subdomains by Message Passing Interface (MPI) within the linear accelerator, c) it is fully integrated in SEAWAT. Within SEAWAT, the PKS-CG solver replaces the Preconditioned Conjugate Gradient (PCG) solver for solving the variable-density groundwater flow equation and the PKS-BiCGSTAB solver replaces the Generalized Conjugate Gradient (GCG) solver for solving the advection-diffusion equation. PKS supports the third-order Total Variation Diminishing (TVD) scheme for computing advection. Benchmarks were performed on the Dutch national supercomputer (https://userinfo.surfsara.nl/systems/cartesius) using up to 128 cores, for a synthetic 3D Henry model (100 million cells) and the real-life Sand Engine model (10 million cells). The Sand Engine model was used to investigate the potential effect of the long-term morphological evolution of a large sand replenishment and climate change on fresh groundwater resources
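
    The partitioning step named in (a) can be illustrated with a minimal, unweighted sketch of Recursive Coordinate Bisection. It is not the PKS implementation, which may weight cells by computational load; the function name `rcb` and the point-list representation are assumptions for this example.

```python
def rcb(points, parts):
    """Split points (a list of coordinate tuples) into `parts` balanced groups.

    Assumes len(points) >= parts so that no group ends up empty.
    """
    if parts == 1:
        return [list(points)]
    dims = range(len(points[0]))
    # Cut along the axis with the largest coordinate extent.
    axis = max(
        dims,
        key=lambda d: max(p[d] for p in points) - min(p[d] for p in points),
    )
    ordered = sorted(points, key=lambda p: p[axis])
    left_parts = parts // 2
    # Place the cut so each side receives points in proportion
    # to the number of groups it still has to produce.
    cut = len(ordered) * left_parts // parts
    return rcb(ordered[:cut], left_parts) + rcb(ordered[cut:], parts - left_parts)
```

    Each resulting group would correspond to one subdomain that is held in the local memory of a single MPI process.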

  19. Reproducibility in a multiprocessor system

    Science.gov (United States)

    Bellofatto, Ralph A; Chen, Dong; Coteus, Paul W; Eisley, Noel A; Gara, Alan; Gooding, Thomas M; Haring, Rudolf A; Heidelberger, Philip; Kopcsay, Gerard V; Liebsch, Thomas A; Ohmacht, Martin; Reed, Don D; Senger, Robert M; Steinmacher-Burow, Burkhard; Sugawara, Yutaka

    2013-11-26

    Fixing a problem is usually greatly aided if the problem is reproducible. To ensure reproducibility of a multiprocessor system, the following aspects are proposed: a deterministic system start state, a single system clock, phase alignment of clocks in the system, system-wide synchronization events, reproducible execution of system components, deterministic chip interfaces, zero-impact communication with the system, precise stop of the system and a scan of the system state.

  20. Hardware support for CSP on a Java chip multiprocessor

    DEFF Research Database (Denmark)

    Gruian, Flavius; Schoeberl, Martin

    2013-01-01

    Due to memory bandwidth limitations, chip multiprocessors (CMPs) adopting the convenient shared memory model for their main memory architecture scale poorly. On-chip core-to-core communication is a solution to this problem, that can lead to further performance increase for a number of multithreaded applications. Programmatically, the Communicating Sequential Processes (CSPs) paradigm provides a sound computational model for such an architecture with message based communication. In this paper we explore hardware support for CSP in the context of an embedded Java CMP. The hardware support for CSP are on-chip communication channels, implemented by a ring-based network-on-chip (NoC), to reduce the memory bandwidth pressure on the shared memory. The presented solution is scalable and also specific for our limited resources and real-time predictability requirements. CMP architectures of three to eight processors were...

  1. DiFX: A software correlator for very long baseline interferometry using multi-processor computing environments

    OpenAIRE

    Deller, A. T.; Tingay, S. J.; Bailes, M.; West, C.

    2007-01-01

    We describe the development of an FX style correlator for Very Long Baseline Interferometry (VLBI), implemented in software and intended to run in multi-processor computing environments, such as large clusters of commodity machines (Beowulf clusters) or computers specifically designed for high performance computing, such as multi-processor shared-memory machines. We outline the scientific and practical benefits for VLBI correlation, these chiefly being due to the inherent flexibility of softw...

  2. Distributed Shared Memory for the Cell Broadband Engine (DSMCBE)

    DEFF Research Database (Denmark)

    Larsen, Morten Nørgaard; Skovhede, Kenneth; Vinter, Brian

    2009-01-01

    in and out of non-coherent local storage blocks for each special processor element. In this paper we present a software library, namely the Distributed Shared Memory for the Cell Broadband Engine (DSMCBE). By using techniques known from distributed shared memory DSMCBE allows programmers to program the CELL...

  3. A simple multiprocessor management system for event-parallel computing

    International Nuclear Information System (INIS)

    Bracker, S.; Gounder, K.; Hendrix, K.; Summers, D.

    1996-01-01

    Offline software using Transmission Control Protocol/Internet Protocol (TCP/IP) sockets to distribute particle physics events to multiple UNIX/RISC workstations is described. A modular, building block approach was taken that allowed tailoring to solve specific tasks efficiently and simply as they arose. The modest, initial cost was having to learn about sockets for interprocess communication. This multiprocessor management software has been used to control the reconstruction of eight billion raw data events from Fermilab Experiment E791

  4. Single-chip serial channel enhances multi-processor systems

    Energy Technology Data Exchange (ETDEWEB)

    Millar, J.

    1982-01-01

    In this paper multiprocessor systems are described and explained. The impact that VLSI advancements are having on multiprocessor design is pointed out. The TMS 7041 single-chip microcomputer is described briefly, highlighting its multiprocessor communication capability. Finally, a typical multiprocessor system implementing the TMS 7041 is shown.

  5. Memory-assisted measurement-device-independent quantum key distribution

    Science.gov (United States)

    Panayi, Christiana; Razavi, Mohsen; Ma, Xiongfeng; Lütkenhaus, Norbert

    2014-04-01

    A protocol with the potential of beating the existing distance records for conventional quantum key distribution (QKD) systems is proposed. It borrows ideas from quantum repeaters by using memories in the middle of the link, and that of measurement-device-independent QKD, which only requires optical source equipment at the user's end. For certain memories with short access times, our scheme allows a higher repetition rate than that of quantum repeaters with single-mode memories, thereby requiring lower coherence times. By accounting for various sources of nonideality, such as memory decoherence, dark counts, misalignment errors, and background noise, as well as timing issues with memories, we develop a mathematical framework within which we can compare QKD systems with and without memories. In particular, we show that with the state-of-the-art technology for quantum memories, it is potentially possible to devise memory-assisted QKD systems that, at certain distances of practical interest, outperform current QKD implementations.

  6. Task-FIFO co-scheduling of streaming applications on MPSoCs with predictable memory hierarchy

    NARCIS (Netherlands)

    Tang, Q.; Basten, A.A.; Geilen, M.C.W.; Stuijk, S.; Wei, Ji-Bo

    This article studies the scheduling of real-time streaming applications on multiprocessor systems-on-chips with predictable memory hierarchy. An iteration-based task-FIFO co-scheduling framework is proposed for this problem. We obtain FIFO size distributions using Pareto space searching, based on

  7. Task-FIFO co-scheduling of streaming applications on MPSoCs with predictable memory hierarchy

    NARCIS (Netherlands)

    Tang, Q.; Basten, T.; Geilen, M.; Stuijk, S.; Wei, J.B.

    2017-01-01

    This article studies the scheduling of real-time streaming applications on multiprocessor systems-on-chips with predictable memory hierarchy. An iteration-based task-FIFO co-scheduling framework is proposed for this problem. We obtain FIFO size distributions using Pareto space searching, based on

  8. A scalable single-chip multi-processor architecture with on-chip RTOS kernel

    NARCIS (Netherlands)

    Theelen, B.D.; Verschueren, A.C.; Reyes Suarez, V.V.; Stevens, M.P.J.; Nunez, A.

    2003-01-01

    Now that system-on-chip technology is emerging, single-chip multi-processors are becoming feasible. A key problem of designing such systems is the complexity of their on-chip interconnects and memory architecture. It is furthermore unclear at what level software should be integrated. An example of a

  9. 3D-TV Rendering on a Multiprocessor System on a Chip

    NARCIS (Netherlands)

    Van Eijndhoven, J.T.J.; Li, X.

    2006-01-01

    This thesis focuses on the issue of mapping 3D-TV rendering applications to a multiprocessor platform. The target platform aims to address tomorrow's multi-media consumer market. The prototype chip, called Wasabi, contains a set of TriMedia processors that communicate via a shared memory, fast

  10. Multiprocessor system with multiple concurrent modes of execution

    Science.gov (United States)

    Ahn, Daniel; Ceze, Luis H; Chen, Dong; Gara, Alan; Heidelberger, Philip; Ohmacht, Martin

    2013-12-31

    A multiprocessor system supports multiple concurrent modes of speculative execution. Speculation identification numbers (IDs) are allocated to speculative threads from a pool of available numbers. The pool is divided into domains, with each domain being assigned to a mode of speculation. Modes of speculation include TM, TLS, and rollback. Allocation of the IDs is carried out with respect to a central state table and using hardware pointers. The IDs are used for writing different versions of speculative results in different ways of a set in a cache memory.

  11. Operating System for Runtime Reconfigurable Multiprocessor Systems

    Directory of Open Access Journals (Sweden)

    Diana Göhringer

    2011-01-01

    Operating systems traditionally handle the task scheduling of one or more application instances on processor-like hardware architectures. RAMPSoC, a novel runtime adaptive multiprocessor System-on-Chip, exploits the dynamic reconfiguration on FPGAs to generate, start and terminate hardware and software tasks. The hardware tasks have to be transferred to the reconfigurable hardware via a configuration access port. The software tasks can be loaded into the local memory of the respective IP core either via the configuration access port or via the on-chip communication infrastructure (e.g. a Network-on-Chip). Recent series of Xilinx FPGAs, such as Virtex-5, provide two Internal Configuration Access Ports, which cannot be accessed simultaneously. To prevent conflicts, the access to these ports as well as the hardware resource management needs to be controlled, e.g. by a special-purpose operating system running on an embedded processor. For that purpose and to handle the relations between temporally and spatially scheduled operations, the novel approach of an operating system is of high importance. This special-purpose operating system, called CAP-OS (Configuration Access Port-Operating System), which will be presented in this paper, supports the clients using the configuration port with the services of priority-based access scheduling, hardware task mapping and resource management.

  12. A Comparison of Two Paradigms for Distributed Shared Memory

    NARCIS (Netherlands)

    Levelt, W.G.; Kaashoek, M.F.; Bal, H.E.; Tanenbaum, A.S.

    1992-01-01

    Two paradigms for distributed shared memory on loosely‐coupled computing systems are compared: the shared data‐object model as used in Orca, a programming language specially designed for loosely‐coupled computing systems, and the shared virtual memory model. For both paradigms two systems are

  13. Distributed trace using central performance counter memory

    Science.gov (United States)

    Satterfield, David L.; Sexton, James C.

    2013-01-22

    A plurality of processing cores and a central storage unit having at least memory are connected in a daisy chain manner, forming a daisy chain ring layout on an integrated chip. At least one of the plurality of processing cores places trace data on the daisy chain connection for transmitting the trace data to the central storage unit, and the central storage unit detects the trace data and stores the trace data in the memory co-located with the central storage unit.

  14. Massively Parallel Polar Decomposition on Distributed-Memory Systems

    KAUST Repository

    Ltaief, Hatem; Sukkari, Dalal E.; Esposito, Aniello; Nakatsukasa, Yuji; Keyes, David E.

    2018-01-01

    We present a high-performance implementation of the Polar Decomposition (PD) on distributed-memory systems. Building upon the QR-based Dynamically Weighted Halley (QDWH) algorithm, the key idea lies in finding the best rational approximation

  15. Multiprocessor scheduling for real-time systems

    CERN Document Server

    Baruah, Sanjoy; Buttazzo, Giorgio

    2015-01-01

    This book provides a comprehensive overview of both theoretical and pragmatic aspects of resource-allocation and scheduling in multiprocessor and multicore hard-real-time systems.  The authors derive new, abstract models of real-time tasks that capture accurately the salient features of real application systems that are to be implemented on multiprocessor platforms, and identify rules for mapping application systems onto the most appropriate models.  New run-time multiprocessor scheduling algorithms are presented, which are demonstrably better than those currently used, both in terms of run-time efficiency and tractability of off-line analysis.  Readers will benefit from a new design and analysis framework for multiprocessor real-time systems, which will translate into a significantly enhanced ability to provide formally verified, safety-critical real-time systems at a significantly lower cost.

  16. The structural robustness of multiprocessor computing system

    Directory of Open Access Journals (Sweden)

    N. Andronaty

    1996-03-01

    A model of a transputer-based multiprocessor computing system is described that makes it possible to evaluate the system's structural robustness (viability, survivability).

  17. Multiprocessor development for robot control

    International Nuclear Information System (INIS)

    Lee, John Min; Kim, Seung Ho; Kim, Chang Hoi; Kim, Byung Soo; Hwang, Suk Yeong; Lee, Young Bum; Sohn, Suk Won; Kim, Woon Gi

    1990-01-01

    The aim of this study is to develop a real-time controller for autonomous robotic systems operated in hostile environments. The control system is designed around a multiprocessor to obtain independence and reliability and to make the system easy to extend. It is organized in three distinct subsystems (supervisory control part, functional control part, and remote control part). To review the functional performance of the developed controller, a prototype mobile robot fitted with a 4-DOF manipulator was designed and manufactured. Initial tests showed that the robot could turn with a radius of 38 cm, reach a maximum speed of 1.26 km/hr, and go over an obstacle 18 cm in height. (author)

  18. Multiprocessor performance modeling with ADAS

    Science.gov (United States)

    Hayes, Paul J.; Andrews, Asa M.

    1989-01-01

    A graph managing strategy referred to as the Algorithm to Architecture Mapping Model (ATAMM) appears useful for the time-optimized execution of application algorithm graphs in embedded multiprocessors and for the performance prediction of graph designs. This paper reports the modeling of ATAMM in the Architecture Design and Assessment System (ADAS) to make an independent verification of ATAMM's performance prediction capability and to provide a user framework for the evaluation of arbitrary algorithm graphs. Following an overview of ATAMM and its major functional rules are descriptions of the ADAS model of ATAMM, methods to enter an arbitrary graph into the model, and techniques to analyze the simulation results. The performance of a 7-node graph example is evaluated using the ADAS model and verifies the ATAMM concept by substantiating previously published performance results.

  19. Working Memory and Distributed Vocabulary Learning.

    Science.gov (United States)

    Atkins, Paul W. B.; Baddeley, Alan D.

    1998-01-01

    Tested the hypothesis that individual differences in immediate-verbal-memory span predict success in second-language vocabulary acquisition. In the two-session study, adult subjects learned 56 English-Finnish translations. Tested one week later, subjects were less likely to remember those words they had difficulty learning, even though they had…

  20. The ACP (Advanced Computer Program) multiprocessor system at Fermilab

    Energy Technology Data Exchange (ETDEWEB)

    Nash, T.; Areti, H.; Atac, R.; Biel, J.; Case, G.; Cook, A.; Fischler, M.; Gaines, I.; Hance, R.; Husby, D.

    1986-09-01

    The Advanced Computer Program at Fermilab has developed a multiprocessor system which is easy to use and uniquely cost effective for many high energy physics problems. The system is based on single board computers which cost under $2000 each to build including 2 Mbytes of on board memory. These standard VME modules each run experiment reconstruction code in Fortran at speeds approaching that of a VAX 11/780. Two versions have been developed: one uses Motorola's 68020 32 bit microprocessor, the other runs with AT&T's 32100. Both include the corresponding floating point coprocessor chip. The first system, when fully configured, uses 70 each of the two types of processors. A 53 processor system has been operated for several months with essentially no down time by computer operators in the Fermilab Computer Center, performing at nearly the capacity of 6 CDC Cyber 175 mainframe computers. The VME crates in which the processing "nodes" sit are connected via a high speed "Branch Bus" to one or more MicroVAX computers which act as hosts handling system resource management and all I/O in offline applications. An interface from Fastbus to the Branch Bus has been developed for online use which has been tested error free at 20 Mbytes/sec for 48 hours. ACP hardware modules are now available commercially. A major package of software, including a simulator that runs on any VAX, has been developed. It allows easy migration of existing programs to this multiprocessor environment. This paper describes the ACP Multiprocessor System and early experience with it at Fermilab and elsewhere.

  1. The ACP [Advanced Computer Program] multiprocessor system at Fermilab

    International Nuclear Information System (INIS)

    Nash, T.; Areti, H.; Atac, R.

    1986-09-01

    The Advanced Computer Program at Fermilab has developed a multiprocessor system which is easy to use and uniquely cost effective for many high energy physics problems. The system is based on single board computers which cost under $2000 each to build including 2 Mbytes of on board memory. These standard VME modules each run experiment reconstruction code in Fortran at speeds approaching that of a VAX 11/780. Two versions have been developed: one uses Motorola's 68020 32 bit microprocessor, the other runs with AT&T's 32100. Both include the corresponding floating point coprocessor chip. The first system, when fully configured, uses 70 each of the two types of processors. A 53 processor system has been operated for several months with essentially no down time by computer operators in the Fermilab Computer Center, performing at nearly the capacity of 6 CDC Cyber 175 mainframe computers. The VME crates in which the processing "nodes" sit are connected via a high speed "Branch Bus" to one or more MicroVAX computers which act as hosts handling system resource management and all I/O in offline applications. An interface from Fastbus to the Branch Bus has been developed for online use which has been tested error free at 20 Mbytes/sec for 48 hours. ACP hardware modules are now available commercially. A major package of software, including a simulator that runs on any VAX, has been developed. It allows easy migration of existing programs to this multiprocessor environment. This paper describes the ACP Multiprocessor System and early experience with it at Fermilab and elsewhere

  2. Memory-assisted measurement-device-independent quantum key distribution

    International Nuclear Information System (INIS)

    Panayi, Christiana; Razavi, Mohsen; Ma, Xiongfeng; Lütkenhaus, Norbert

    2014-01-01

    A protocol with the potential of beating the existing distance records for conventional quantum key distribution (QKD) systems is proposed. It borrows ideas from quantum repeaters by using memories in the middle of the link, and that of measurement-device-independent QKD, which only requires optical source equipment at the user's end. For certain memories with short access times, our scheme allows a higher repetition rate than that of quantum repeaters with single-mode memories, thereby requiring lower coherence times. By accounting for various sources of nonideality, such as memory decoherence, dark counts, misalignment errors, and background noise, as well as timing issues with memories, we develop a mathematical framework within which we can compare QKD systems with and without memories. In particular, we show that with the state-of-the-art technology for quantum memories, it is potentially possible to devise memory-assisted QKD systems that, at certain distances of practical interest, outperform current QKD implementations. (paper)

  3. Language Constructs for Data Partitioning and Distribution

    Directory of Open Access Journals (Sweden)

    P. Crooks

    1995-01-01

    This article presents a survey of language features for distributed memory multiprocessor systems (DMMs), in particular, systems that provide features for data partitioning and distribution. In these systems the programmer is freed from consideration of the low-level details of the target architecture in that there is no need to program explicit processes or specify interprocess communication. Programs are written according to the shared memory programming paradigm but the programmer is required to specify, by means of directives, additional syntax or interactive methods, how the data of the program are decomposed and distributed.
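
    As a rough illustration of the kind of data decomposition such directives describe, the Python sketch below computes which processor owns each array index under a block and a cyclic distribution. The function names (block_owner, cyclic_owner, distribute) are hypothetical and are not taken from any of the surveyed languages; this is a sketch under those assumptions, not a rendering of a particular system.

```python
def block_owner(i: int, n: int, p: int) -> int:
    """Owner of global index i when n elements are split into p contiguous blocks."""
    block = -(-n // p)  # ceil(n / p): size of every block except possibly the last
    return i // block


def cyclic_owner(i: int, p: int) -> int:
    """Owner of global index i when elements are dealt round-robin to p processors."""
    return i % p


def distribute(n: int, p: int, scheme: str) -> list[list[int]]:
    """Return, for each processor, the list of global indices it owns."""
    parts: list[list[int]] = [[] for _ in range(p)]
    for i in range(n):
        # Any scheme name other than "block" is treated as cyclic (illustrative choice).
        owner = block_owner(i, n, p) if scheme == "block" else cyclic_owner(i, p)
        parts[owner].append(i)
    return parts
```

    A block distribution keeps neighbouring elements on the same processor, which tends to suit stencil-like access patterns, while a cyclic distribution spreads work more evenly when the cost per element varies.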

  4. Performances of multiprocessor multidisk architectures for continuous media storage

    Science.gov (United States)

    Gennart, Benoit A.; Messerli, Vincent; Hersch, Roger D.

    1996-03-01

    Multimedia interfaces increase the need for large image databases, capable of storing and reading streams of data with strict synchronicity and isochronicity requirements. In order to fulfill these requirements, we consider a parallel image server architecture which relies on arrays of intelligent disk nodes, each disk node being composed of one processor and one or more disks. This contribution analyzes through bottleneck performance evaluation and simulation the behavior of two multi-processor multi-disk architectures: a point-to-point architecture and a shared-bus architecture similar to current multiprocessor workstation architectures. We compare the two architectures on the basis of two multimedia algorithms: the compute-bound frame resizing by resampling and the data-bound disk-to-client stream transfer. The results suggest that the shared bus is a potential bottleneck despite its very high hardware throughput (400 Mbytes/s) and that an architecture with addressable local memories located close to their respective processors could partially remove this bottleneck. The point-to-point architecture is scalable and able to sustain high throughputs for simultaneous compute-bound and data-bound operations.

  5. DOLIB: Distributed Object Library

    Energy Technology Data Exchange (ETDEWEB)

    D'Azevedo, E.F.

    1994-01-01

    This report describes the use and implementation of DOLIB (Distributed Object Library), a library of routines that emulates global or virtual shared memory on Intel multiprocessor systems. Access to a distributed global array is through explicit calls to gather and scatter. Advantages of using DOLIB include: dynamic allocation and freeing of huge (gigabyte) distributed arrays, both C and FORTRAN callable interfaces, and the ability to mix shared-memory and message-passing programming models for ease of use and optimal performance. DOLIB is independent of language and compiler extensions and requires no special operating system support. DOLIB also supports automatic caching of read-only data for high performance. The virtual shared memory support provided in DOLIB is well suited for implementing Lagrangian particle tracking techniques. We have also used DOLIB to create DONIO (Distributed Object Network I/O Library), which obtains over a 10-fold improvement in disk I/O performance on the Intel Paragon.

  6. DOLIB: Distributed Object Library

    Energy Technology Data Exchange (ETDEWEB)

    D'Azevedo, E.F.; Romine, C.H.

    1994-10-01

    This report describes the use and implementation of DOLIB (Distributed Object Library), a library of routines that emulates global or virtual shared memory on Intel multiprocessor systems. Access to a distributed global array is through explicit calls to gather and scatter. Advantages of using DOLIB include: dynamic allocation and freeing of huge (gigabyte) distributed arrays, both C and FORTRAN callable interfaces, and the ability to mix shared-memory and message-passing programming models for ease of use and optimal performance. DOLIB is independent of language and compiler extensions and requires no special operating system support. DOLIB also supports automatic caching of read-only data for high performance. The virtual shared memory support provided in DOLIB is well suited for implementing Lagrangian particle tracking techniques. We have also used DOLIB to create DONIO (Distributed Object Network I/O Library), which obtains over a 10-fold improvement in disk I/O performance on the Intel Paragon.
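
    To make the gather/scatter access pattern concrete, here is a minimal single-process Python sketch of a block-distributed global array. The class GlobalArray and its methods are hypothetical names chosen for illustration; they do not reproduce DOLIB's actual C or FORTRAN interface, and the per-processor buffers are simply emulated as lists in one process.

```python
class GlobalArray:
    """Single-process emulation of a block-distributed one-dimensional array."""

    def __init__(self, n: int, nprocs: int) -> None:
        self.n = n
        self.block = -(-n // nprocs)  # ceil(n / nprocs) elements per owner
        # One local buffer per emulated processor; the last ones may be shorter.
        self.local = [
            [0.0] * max(0, min(self.block, n - r * self.block))
            for r in range(nprocs)
        ]

    def _locate(self, i: int) -> tuple[int, int]:
        """Map a global index to (owner rank, offset in the owner's buffer)."""
        if not 0 <= i < self.n:
            raise IndexError(i)
        return divmod(i, self.block)

    def gather(self, indices: list[int]) -> list[float]:
        """Read the listed global elements, wherever they are stored."""
        out = []
        for i in indices:
            rank, off = self._locate(i)
            out.append(self.local[rank][off])
        return out

    def scatter(self, indices: list[int], values: list[float]) -> None:
        """Write values to the listed global elements."""
        for i, v in zip(indices, values):
            rank, off = self._locate(i)
            self.local[rank][off] = v
```

    In a real distributed-memory setting the lookups in gather and scatter would turn into messages to the owning node; the index arithmetic in _locate is the part this sketch is meant to show.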

  7. Parallel discrete ordinates algorithms on distributed and common memory systems

    International Nuclear Information System (INIS)

    Wienke, B.R.; Hiromoto, R.E.; Brickner, R.G.

    1987-01-01

    The S_n algorithm employs iterative techniques in solving the linear Boltzmann equation. These methods, both ordered and chaotic, were compared on both the Denelcor HEP and the Intel hypercube. Strategies are linked to the organization and accessibility of memory (common memory versus distributed memory architectures), with common concern for acquisition of global information. Apart from this, the inherent parallelism of the algorithm maps directly onto the two architectures. Results comparing execution times, speedup, and efficiency are based on a representative 16-group (full upscatter and downscatter) sample problem. Calculations were performed on both the Los Alamos National Laboratory (LANL) Denelcor HEP and the LANL Intel hypercube. The Denelcor HEP is a 64-bit multiple-instruction, multiple-data (MIMD) machine consisting of up to 16 process execution modules (PEMs), each capable of executing 64 processes concurrently. The PEMs can cooperate on a single job or run several unrelated jobs, and they share a common global memory through a crossbar switch. The Intel hypercube, on the other hand, is a distributed memory system composed of 128 processing elements, each with its own local memory. Processing elements are connected in a nearest-neighbor hypercube configuration and sharing of data among processors requires execution of explicit message-passing constructs

  8. Monte Carlo photon transport on shared memory and distributed memory parallel processors

    International Nuclear Information System (INIS)

    Martin, W.R.; Wan, T.C.; Abdel-Rahman, T.S.; Mudge, T.N.; Miura, K.

    1987-01-01

    Parallelized Monte Carlo algorithms for analyzing photon transport in an inertially confined fusion (ICF) plasma are considered. Algorithms were developed for shared memory (vector and scalar) and distributed memory (scalar) parallel processors. The shared memory algorithm was implemented on the IBM 3090/400, and timing results are presented for dedicated runs with two, three, and four processors. Two alternative distributed memory algorithms (replication and dispatching) were implemented on a hypercube parallel processor (1 through 64 nodes). The replication algorithm yields essentially full efficiency for all cube sizes; with the 64-node configuration, the absolute performance is nearly the same as with the CRAY X-MP. The dispatching algorithm also yields efficiencies above 80% in a large simulation for the 64-processor configuration

  9. A real-time multichannel memory controller and optimal mapping of memory clients to memory channels

    NARCIS (Netherlands)

    Gomony, M.D.; Akesson, K.B.; Goossens, K.G.W.

    2015-01-01

    Ever-increasing demands for main memory bandwidth and memory speed/power tradeoff led to the introduction of memories with multiple memory channels, such as Wide IO DRAM. Efficient utilization of a multichannel memory as a shared resource in multiprocessor real-time systems depends on mapping of the

  10. Realtime multiprocessor for mobile ad hoc networks

    Directory of Open Access Journals (Sweden)

    T. Jungeblut

    2008-05-01

    This paper introduces a real-time Multiprocessor System-On-Chip (MPSoC) for low power wireless applications. The multiprocessor is based on eight 32-bit RISC processors that are connected via a Network-On-Chip (NoC). The NoC follows a novel approach with guaranteed bandwidth to the application that meets hard real-time requirements. At a clock frequency of 100 MHz the total power consumption of the MPSoC, which has been fabricated in 180 nm UMC standard cell technology, is 772 mW.

  11. Parallelising a molecular dynamics algorithm on a multi-processor workstation

    Science.gov (United States)

    Müller-Plathe, Florian

    1990-12-01

    The Verlet neighbour-list algorithm is parallelised for a multi-processor Hewlett-Packard/Apollo DN10000 workstation. The implementation makes use of memory shared between the processors. It is a genuine master-slave approach by which most of the computational tasks are kept in the master process and the slaves are only called to do part of the nonbonded forces calculation. The implementation features elements of both fine-grain and coarse-grain parallelism. Apart from three calls to library routines, two of which are standard UNIX calls, and two machine-specific language extensions, the whole code is written in standard Fortran 77. Hence, it may be expected that this parallelisation concept can be transferred in parts or as a whole to other multi-processor shared-memory computers. The parallel code is routinely used in production work.

  12. PRISMA database machine: A distributed, main-memory approach

    NARCIS (Netherlands)

    Schmidt, J.W.; Apers, Peter M.G.; Ceri, S.; Kersten, Martin L.; Oerlemans, Hans C.M.; Missikoff, M.

    1988-01-01

    The PRISMA project is a large-scale research effort in the design and implementation of a highly parallel machine for data and knowledge processing. The PRISMA database machine is a distributed, main-memory database management system implemented in an object-oriented language that runs on top of a

  13. Dynamic overset grid communication on distributed memory parallel processors

    Science.gov (United States)

    Barszcz, Eric; Weeratunga, Sisira K.; Meakin, Robert L.

    1993-01-01

    A parallel distributed memory implementation of intergrid communication for dynamic overset grids is presented. Included are discussions of various options considered during development. Results are presented comparing an Intel iPSC/860 to a single processor Cray Y-MP. Results for grids in relative motion show the iPSC/860 implementation to be faster than the Cray implementation.

  14. Using a commercial symmetric multiprocessor for lattice QCD

    International Nuclear Information System (INIS)

    Brower, R.C.; Chen, D.; Negele, J.W.

    1998-01-01

    In its evolution, the computer industry has reached the point when considerable computing power can be packaged on a single microprocessor chip. At the same time, costs of designing a computer system around such a CPU are growing. For these reasons we decided to explore a possibility of using commercially available symmetric multiprocessors (SMP) as building blocks for the LQCD computer. Careful analysis of the architecture allowed us to build a QCD primitive library running close to the peak performance on the UltraSPARC processor. As a result, multithreaded QCD code (both the heatbath and the Wilson fermion inverter) runs at about 50% efficiency on a single SMP. The communication between different CPUs is handled by a coherent memory system. Currently we are planning to connect several SMPs with a high bandwidth network into a single system. (orig.)

  15. One-Step Programmable Arbiters for Multiprocessors

    DEFF Research Database (Denmark)

    Højberg, Kristian Søe

    1978-01-01

    When processors in a multiprocessor system demand service from a shared bus in an asynchronous mode, a synchronous state arbiter resolves conflicts and allocates resources. Independent of the combination of requests, only one state transition is required from a free to allocated resource...

  16. Shared performance monitor in a multiprocessor system

    Science.gov (United States)

    Chiu, George; Gara, Alan G.; Salapura, Valentina

    2012-07-24

    A performance monitoring unit (PMU) and method for monitoring performance of events occurring in a multiprocessor system. The multiprocessor system comprises a plurality of processor units, each processor unit generating signals representing occurrences of events in that unit, and a single shared counter resource for performance monitoring. The performance monitor unit is shared by all processor cores in the multiprocessor system. The PMU comprises: a plurality of performance counters each for counting signals representing occurrences of events from one or more of the plurality of processor units in the multiprocessor system; and a plurality of input devices for receiving the event signals from one or more of the plurality of processor units, the plurality of input devices programmable to select event signals for receipt by one or more of the plurality of performance counters for counting, wherein the PMU is shared between multiple processing units, or within a group of processors in the multiprocessing system. The PMU is further programmed to monitor event signals issued from non-processor devices.

  17. The fast Amsterdam multiprocessor (FAMP) system hardware

    International Nuclear Information System (INIS)

    Hertzberger, L.O.; Kieft, G.; Kisielewski, B.; Wiggers, L.W.; Engster, C.; Koningsveld, L. van

    1981-01-01

    The architecture of a multiprocessor system is described that will be used for on-line filter and second stage trigger applications. The system is based on the MC 68000 microprocessor from Motorola. Emphasis is placed on hardware aspects, in particular the modularity, processor communication and interfacing, whereas the system software and the applications will be described in separate articles. (orig.)

  18. Sparse Distributed Memory: understanding the speed and robustness of expert memory

    Directory of Open Access Journals (Sweden)

    Marcelo Salhab Brogliato

    2014-04-01

    How can experts, sometimes in exacting detail, almost immediately and very precisely recall memory items from a vast repertoire? The problem in which we will be interested concerns models of theoretical neuroscience that could explain the speed and robustness of an expert's recollection. The approach is based on Sparse Distributed Memory, which has been shown to be plausible, both in a neuroscientific and in a psychological manner, in a number of ways. A crucial characteristic concerns the limits of human recollection, the 'tip-of-tongue' memory event, which is found at a non-linearity in the model. We expand the theoretical framework, deriving an optimization formula to solve this non-linearity. Numerical results demonstrate how the higher frequency of rehearsal, through work or study, immediately increases the robustness and speed associated with expert memory.

  19. Lifetime-Based Memory Management for Distributed Data Processing Systems

    DEFF Research Database (Denmark)

    Lu, Lu; Shi, Xuanhua; Zhou, Yongluan

    2016-01-01

    In-memory caching of intermediate data and eager combining of data in shuffle buffers have been shown to be very effective in minimizing the re-computation and I/O cost in distributed data processing systems like Spark and Flink. However, it has also been widely reported that these techniques would create a large amount of long-living data objects in the heap, which may quickly saturate the garbage collector, especially when handling a large dataset, and hence would limit the scalability of the system. To eliminate this problem, we propose a lifetime-based memory management framework, which... the garbage collection time by up to 99.9%, 2) to achieve up to 22.7x speedup in terms of execution time in cases without data spilling and 41.6x speedup in cases with data spilling, and 3) to consume up to 46.6% less memory.

  20. Distributed-Memory Breadth-First Search on Massive Graphs

    Energy Technology Data Exchange (ETDEWEB)

    Buluc, Aydin [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division]; Beamer, Scott [Univ. of California, Berkeley, CA (United States). Dept. of Electrical Engineering and Computer Sciences]; Madduri, Kamesh [Pennsylvania State Univ., University Park, PA (United States). Computer Science & Engineering Dept.]; Asanovic, Krste [Univ. of California, Berkeley, CA (United States). Dept. of Electrical Engineering and Computer Sciences]; Patterson, David [Univ. of California, Berkeley, CA (United States). Dept. of Electrical Engineering and Computer Sciences]

    2017-09-26

    This chapter studies the problem of traversing large graphs using the breadth-first search order on distributed-memory supercomputers. We consider both the traditional level-synchronous top-down algorithm as well as the recently discovered direction optimizing algorithm. We analyze the performance and scalability trade-offs in using different local data structures such as CSR and DCSC, enabling in-node multithreading, and graph decompositions such as 1D and 2D decomposition.
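
    To make the level-synchronous top-down strategy concrete, the following is a minimal single-process sketch over a CSR adjacency structure. It is not the authors' distributed implementation: the helper names `build_csr` and `bfs_top_down` are hypothetical, and partitioning, multithreading, and the direction-optimizing variant are all left out.

```python
def build_csr(num_vertices, edges):
    """Build CSR arrays (row_ptr, col_idx) for an undirected graph."""
    degree = [0] * num_vertices
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    row_ptr = [0] * (num_vertices + 1)
    for v in range(num_vertices):
        row_ptr[v + 1] = row_ptr[v] + degree[v]
    col_idx = [0] * row_ptr[-1]
    cursor = row_ptr[:-1].copy()  # next free slot in each row
    for u, v in edges:
        col_idx[cursor[u]] = v
        cursor[u] += 1
        col_idx[cursor[v]] = u
        cursor[v] += 1
    return row_ptr, col_idx


def bfs_top_down(row_ptr, col_idx, source):
    """Level-synchronous top-down BFS.

    Returns the BFS level of every vertex, or -1 if it is unreachable.
    """
    level = [-1] * (len(row_ptr) - 1)
    level[source] = 0
    frontier = [source]
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        # Every frontier vertex inspects its outgoing edges (top-down).
        for u in frontier:
            for k in range(row_ptr[u], row_ptr[u + 1]):
                v = col_idx[k]
                if level[v] == -1:
                    level[v] = depth
                    next_frontier.append(v)
        # All of one level is finished before the next level starts.
        frontier = next_frontier
    return level
```

    In a distributed-memory setting the swap of `frontier` for `next_frontier` is the point where processes would synchronize and exchange newly discovered vertices.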

  1. A portable implementation of ARPACK for distributed memory parallel architectures

    Energy Technology Data Exchange (ETDEWEB)

    Maschhoff, K.J.; Sorensen, D.C.

    1996-12-31

    ARPACK is a package of Fortran 77 subroutines which implement the Implicitly Restarted Arnoldi Method used for solving large sparse eigenvalue problems. A parallel implementation of ARPACK is presented which is portable across a wide range of distributed memory platforms and requires minimal changes to the serial code. The communication layers used for message passing are the Basic Linear Algebra Communication Subprograms (BLACS) developed for the ScaLAPACK project and Message Passing Interface (MPI).

  2. A compositional reservoir simulator on distributed memory parallel computers

    International Nuclear Information System (INIS)

    Rame, M.; Delshad, M.

    1995-01-01

    This paper presents the application of distributed memory parallel computers to field scale reservoir simulations using a parallel version of UTCHEM, The University of Texas Chemical Flooding Simulator. The model is a general purpose, highly vectorized chemical compositional simulator that can simulate a wide range of displacement processes at both field and laboratory scales. The original simulator was modified to run on both distributed memory parallel machines (Intel iPSC/860 and Delta, Connection Machine 5, Kendall Square 1 and 2, and CRAY T3D) and a cluster of workstations. A domain decomposition approach has been taken towards parallelization of the code. A portion of the discrete reservoir model is assigned to each processor by a set-up routine that attempts a data layout as even as possible from the load-balance standpoint. Each of these subdomains is extended so that data can be shared between adjacent processors for stencil computation. The added routines that make parallel execution possible are written in a modular fashion that makes porting to new parallel platforms straightforward. Results of the distributed memory computing performance of the parallel simulator are presented for field scale applications such as tracer flood and polymer flood. A comparison of the wall-clock times for the same problems on a vector supercomputer is also presented.
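
    The even data layout and the extended subdomains described in this abstract can be illustrated with a one-dimensional sketch. Everything here is an assumption made for illustration: the function names, the ghost width of one cell, and the 3-point averaging stencil are not taken from UTCHEM, and the "processors" are simulated by a loop in a single process.

```python
def split_even(n_cells, n_procs):
    """Return (start, stop) ranges whose sizes differ by at most one cell."""
    base, extra = divmod(n_cells, n_procs)
    ranges, start = [], 0
    for p in range(n_procs):
        size = base + (1 if p < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def with_ghosts(ranges, n_cells, width=1):
    """Extend each owned range by `width` ghost cells on both sides."""
    return [(max(0, a - width), min(n_cells, b + width)) for a, b in ranges]


def stencil_step(field, ranges):
    """One 3-point averaging sweep, computed subdomain by subdomain.

    Boundary cells (first and last) are left unchanged.
    """
    n = len(field)
    new = field[:]
    for (a, b), (ga, gb) in zip(ranges, with_ghosts(ranges, n)):
        # Owned cells plus ghost copies of the neighbors' edge cells.
        local = field[ga:gb]
        for i in range(max(a, 1), min(b, n - 1)):
            j = i - ga  # position of global cell i inside the local array
            new[i] = (local[j - 1] + local[j] + local[j + 1]) / 3.0
    return new
```

    Because each subdomain only reads its own `local` array, the slice `field[ga:gb]` stands in for the data that neighboring processors would have to send before every sweep.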

  3. A possible approach to estimating the operational efficiency of multiprocessor systems

    International Nuclear Information System (INIS)

    Kuznetsov, N.Y.; Gorlach, S.P.; Sumskaya, A.A.

    1984-01-01

    This article presents a mathematical model that constructs the upper and lower estimates evaluating the efficiency of solution of a large class of problems using a multiprocessor system with a specific architecture. Efficiency depends on a system's architecture (e.g., the number of processors, memory volume, the number of communication links, commutation speed) and the types of problems it is intended to solve. The behavior of the model is considered in a stationary mode. The model is used to evaluate the efficiency of a particular algorithm implemented in a multiprocessor system. It is concluded that the model is flexible and enables the investigation of a broad class of problems in computational mathematics, including linear algebra and boundary-value problems of mathematical physics

  4. Academic training: Advanced lectures on multiprocessor programming

    CERN Multimedia

    PH Department

    2011-01-01

    Academic Training Lecture - Regular Programme 31 October, 1 and 2 November 2011 from 11:00 to 12:00 - IT Auditorium, Bldg. 31. Three classes (60 mins) on Multiprocessor Programming. Prof. Dr. Christoph von Praun, Georg-Simon-Ohm University of Applied Sciences, Nuremberg, Germany. This is an advanced class on multiprocessor programming. The class gives an introduction to principles of concurrent objects and the notion of different progress guarantees that concurrent computations can have. The focus of this class is on non-blocking computations, i.e. concurrent programs that do not make use of locks. We discuss the implementation of practical non-blocking data structures in detail. 1st class: Introduction to concurrent objects. 2nd class: Principles of non-blocking synchronization. 3rd class: Concurrent queues. Brief Bio of Christoph von Praun: Christoph worked on a variety of analysis techniques and runtime platforms for parallel programs. His most recent research studies programming models an...

  5. [Distribution of neural memory, loading factor, its regulation and optimization].

    Science.gov (United States)

    Radchenko, A N

    1999-01-01

    Recording and retrieving functions of the neural memory are simulated as a control of local conformational processes in neural synaptic fields. The localization of conformational changes is related to the afferent temporal-spatial pulse pattern flow, the microstructure of connections and a plurality of temporal delays in synaptic fields and afferent pathways. The loci of conformations are described by sets of afferent addresses named address domains. Being superimposed on each other, address domains form a multilayer covering of the address space of the neuron or the ensemble. The superposition factor determines the dissemination of the conformational process, and the fuzzing of memory, and its accuracy and reliability. The engram is formed as defects in the packing of the address space and hence can be retrieved in inverse form. The accuracy of the retrieved information depends on the threshold level of conformational transitions, the distribution of conformational changes in synaptic fields of the neuronal population, and the memory loading factor. The latter is represented in the model by a slow potential. It reflects total conformational changes and displaces the membrane potential to monostable conformational regimes, by governing the exit from the recording regime, the potentiation of the neurone, and the readiness to reproduction. A relative amplitude of the slow potential and the coefficient of postconformational modification of ionic conductivity, which provides maximum reliability, accuracy, and capacity of memory, are calculated.

  6. Techniques for Reducing Consistency-Related Communication in Distributed Shared Memory System

    OpenAIRE

    Zwaenepoel, W; Bennett, J.K.; Carter, J.B.

    1995-01-01

    Distributed shared memory (DSM) is an abstraction of shared memory on a distributed memory machine. Hardware DSM systems support this abstraction at the architecture level; software DSM systems support the abstraction within the runtime system. One of the key problems in building an efficient software DSM system is to reduce the amount of communication needed to keep the distributed memories consistent. In this paper we present four techniques for doing so: 1) software release consistency; 2)...

  7. Parallel SN algorithms in shared- and distributed-memory environments

    International Nuclear Information System (INIS)

    Haghighat, Alireza; Hunter, Melissa A.; Mattis, Ronald E.

    1995-01-01

    Different 2-D spatial domain partitioning Sn transport theory algorithms have been developed on the basis of the Block-Jacobi iterative scheme. These algorithms have been incorporated into TWOTRAN-II, and tested on a shared-memory CRAY Y-MP C90 and a distributed-memory IBM SP1. For a series of fixed source r-z geometry homogeneous problems, parallel efficiencies in a range of 50-90% are achieved on the C90 with 6 processors, and lower values (20-60%) are obtained on the SP1. It is demonstrated that better performance is attainable if one addresses issues such as convergence rate, load-balancing, and granularity for both architectures, as well as message passing (network bandwidth and latency) for SP1. (author). 17 refs, 4 figs

  8. An Adaptive Hybrid Multiprocessor technique for bioinformatics sequence alignment

    KAUST Repository

    Bonny, Talal

    2012-07-28

    Sequence alignment algorithms such as the Smith-Waterman algorithm are among the most important applications in the development of bioinformatics. Sequence alignment algorithms must process large amounts of data, which may take a long time. Here, we introduce our Adaptive Hybrid Multiprocessor technique to accelerate the implementation of the Smith-Waterman algorithm. Our technique utilizes both the graphics processing unit (GPU) and the central processing unit (CPU). It adapts to the implementation according to the number of CPUs given as input by efficiently distributing the workload between the processing units. Using existing resources (GPU and CPU) in an efficient way is a novel approach. The peak performance achieved for the platforms GPU + CPU, GPU + 2CPUs, and GPU + 3CPUs is 10.4 GCUPS, 13.7 GCUPS, and 18.6 GCUPS, respectively (with a query length of 511 amino acids). © 2010 IEEE.
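
    For readers unfamiliar with the kernel being accelerated, here is a small serial sketch of Smith-Waterman local alignment scoring. The scoring parameters (`match=2`, `mismatch=-1`, linear `gap=-1`) and the function name are assumptions chosen for illustration; the paper's substitution matrix, gap model, and CPU/GPU work distribution are not reproduced.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of sequences a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # One cell update: the unit counted by the GCUPS metric.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

    The table has `len(a) * len(b)` cells, so throughput in cell updates per second is a natural way to compare platforms on this workload.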

  9. Multi-processor network implementations in Multibus II and VME

    International Nuclear Information System (INIS)

    Briegel, C.

    1992-01-01

    ACNET (Fermilab Accelerator Controls Network), a proprietary network protocol, is implemented in a multi-processor configuration for both Multibus II and VME. The implementations are contrasted by the bus protocol and software design goals. The Multibus II implementation provides for multiple processors running a duplicate set of tasks on each processor. For a network connected task, messages are distributed by a network round-robin scheduler. Further, messages can be stopped, continued, or re-routed for each task by user-callable commands. The VME implementation provides for multiple processors running one task across all processors. The process can either be fixed to a particular processor or dynamically allocated to an available processor depending on the scheduling algorithm of the multi-processing operating system. (author)

  10. Geometric Algorithms for Private-Cache Chip Multiprocessors

    DEFF Research Database (Denmark)

    Ajwani, Deepak; Sitchinava, Nodari; Zeh, Norbert

    2010-01-01

    We study techniques for obtaining efficient algorithms for geometric problems on private-cache chip multiprocessors. We show how to obtain optimal algorithms for interval stabbing counting, 1-D range counting, weighted 2-D dominance counting, and for computing 3-D maxima, 2-D lower envelopes, and 2-D convex hulls. These results are obtained by analyzing adaptations of either the PEM merge sort algorithm or PRAM algorithms. For the second group of problems (orthogonal line segment intersection reporting, batched range reporting, and related problems), more effort is required. What distinguishes these problems from the ones in the previous group is the variable output size, which requires I/O-efficient load balancing strategies based on the contribution of the individual input elements to the output size. To obtain nearly optimal algorithms for these problems, we introduce a parallel distribution...

  11. Parallel Breadth-First Search on Distributed Memory Systems

    Energy Technology Data Exchange (ETDEWEB)

    Computational Research Division; Buluc, Aydin; Madduri, Kamesh

    2011-04-15

    Data-intensive, graph-based computations are pervasive in several scientific applications, and are known to be quite challenging to implement on distributed memory systems. In this work, we explore the design space of parallel algorithms for Breadth-First Search (BFS), a key subroutine in several graph algorithms. We present two highly-tuned parallel approaches for BFS on large parallel systems: a level-synchronous strategy that relies on a simple vertex-based partitioning of the graph, and a two-dimensional sparse matrix-partitioning-based approach that mitigates parallel communication overhead. For both approaches, we also present hybrid versions with intra-node multithreading. Our novel hybrid two-dimensional algorithm reduces communication times by up to a factor of 3.5, relative to a common vertex based approach. Our experimental study identifies execution regimes in which these approaches will be competitive, and we demonstrate extremely high performance on leading distributed-memory parallel systems. For instance, for a 40,000-core parallel execution on Hopper, an AMD Magny-Cours based system, we achieve a BFS performance rate of 17.8 billion edge visits per second on an undirected graph of 4.3 billion vertices and 68.7 billion edges with skewed degree distribution.

  12. Periodic bidirectional associative memory neural networks with distributed delays

    Science.gov (United States)

    Chen, Anping; Huang, Lihong; Liu, Zhigang; Cao, Jinde

    2006-05-01

    Some sufficient conditions are obtained for the existence and global exponential stability of a periodic solution to the general bidirectional associative memory (BAM) neural networks with distributed delays by using the continuation theorem of Mawhin's coincidence degree theory and the Lyapunov functional method and the Young's inequality technique. These results are helpful for designing a globally exponentially stable and periodic oscillatory BAM neural network, and the conditions can be easily verified and be applied in practice. An example is also given to illustrate our results.

  13. Modeling and Analyzing Real-Time Multiprocessor Systems

    NARCIS (Netherlands)

    Wiggers, M.H.; Thiele, Lothar; Lee, Edward A.; Schlieker, Simon; Bekooij, Marco Jan Gerrit

    2010-01-01

    Researchers have proposed approaches to verify that real-time multiprocessor systems meet their timeliness constraints. These approaches make assumptions on the model of computation, the load placed on the multiprocessor system, and the faults that can arise. This heterogeneous set of assumptions

  14. Investigation of implementing a synchronization protocol under multiprocessors hierarchical scheduling

    NARCIS (Netherlands)

    Nemati, F.; Behnam, M.; Bril, R.J.

    2009-01-01

    In the multi-core and multiprocessor domain, there has been considerable work done on scheduling techniques assuming that real-time tasks are independent. In practice a typical real-time system usually shares logical resources among tasks. However, synchronization in the multiprocessor area has not

  15. Optimizing NEURON Simulation Environment Using Remote Memory Access with Recursive Doubling on Distributed Memory Systems.

    Science.gov (United States)

    Shehzad, Danish; Bozkuş, Zeki

    2016-01-01

    The increasing complexity of neuronal network models has escalated efforts to make the NEURON simulation environment more efficient. Computational neuroscientists divide the equations into subnets distributed among multiple processors to achieve better hardware performance. On parallel machines, interprocessor spike exchange consumes a large share of the overall simulation time for neuronal networks. NEURON uses the Message Passing Interface (MPI) for communication between processors, and the MPI_Allgather collective is used for spike exchange after each interval across distributed memory systems. Increasing the number of processors yields more concurrency and better performance, but it also increases the communication time of MPI_Allgather. This necessitates improving the communication methodology to decrease the spike exchange time over distributed memory systems. This work improves the MPI_Allgather method using Remote Memory Access (RMA) by moving from two-sided to one-sided communication, and the use of a recursive doubling mechanism achieves efficient communication between the processors in precise steps. This approach enhanced communication concurrency and improved overall runtime, making NEURON more efficient for simulation of large neuronal network models.
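
    The recursive doubling pattern mentioned in this abstract can be sketched as a sequential simulation. This is only an illustration of the exchange schedule: it makes no MPI or RMA calls, the function name is hypothetical, and it assumes the number of ranks is a power of two.

```python
def allgather_recursive_doubling(contributions):
    """Simulate an allgather among len(contributions) ranks.

    Returns (gathered, steps): gathered[r] is the full list as seen by
    rank r, and steps is the number of exchange rounds that were needed.
    """
    p = len(contributions)
    if p == 0 or p & (p - 1):
        raise ValueError("this sketch assumes a power-of-two number of ranks")
    # buffers[r] maps a source rank to the data rank r currently holds.
    buffers = [{r: contributions[r]} for r in range(p)]
    steps = 0
    distance = 1
    while distance < p:
        # Snapshot so every rank reads what its partner held before the round.
        snapshot = [dict(buf) for buf in buffers]
        for rank in range(p):
            partner = rank ^ distance  # partner differs in exactly one bit
            buffers[rank].update(snapshot[partner])
        distance *= 2  # the amount of data held doubles every round
        steps += 1
    gathered = [[buf[r] for r in range(p)] for buf in buffers]
    return gathered, steps
```

    With 8 ranks the loop finishes in 3 rounds, that is log2 of the rank count, which is the property that makes the schedule attractive as the processor count grows.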

  16. Optimizing NEURON Simulation Environment Using Remote Memory Access with Recursive Doubling on Distributed Memory Systems

    Directory of Open Access Journals (Sweden)

    Danish Shehzad

    2016-01-01

    The increasing complexity of neuronal network models has escalated efforts to make the NEURON simulation environment more efficient. Computational neuroscientists divide the equations into subnets distributed among multiple processors to achieve better hardware performance. On parallel machines, interprocessor spike exchange consumes a large share of the overall simulation time for neuronal networks. NEURON uses the Message Passing Interface (MPI) for communication between processors, and the MPI_Allgather collective is used for spike exchange after each interval across distributed memory systems. Increasing the number of processors yields more concurrency and better performance, but it also increases the communication time of MPI_Allgather. This necessitates improving the communication methodology to decrease the spike exchange time over distributed memory systems. This work improves the MPI_Allgather method using Remote Memory Access (RMA) by moving from two-sided to one-sided communication, and the use of a recursive doubling mechanism achieves efficient communication between the processors in precise steps. This approach enhanced communication concurrency and improved overall runtime, making NEURON more efficient for simulation of large neuronal network models.

  17. Implementation of Parallel Dynamic Simulation on Shared-Memory vs. Distributed-Memory Environments

    Energy Technology Data Exchange (ETDEWEB)

    Jin, Shuangshuang; Chen, Yousu; Wu, Di; Diao, Ruisheng; Huang, Zhenyu

    2015-12-09

    Power system dynamic simulation computes the system response to a sequence of large disturbances, such as sudden changes in generation or load, or a network short circuit followed by protective branch switching operation. It consists of a large set of differential and algebraic equations, which is computationally intensive and challenging to solve using a single-processor based dynamic simulation solution. High-performance computing (HPC) based parallel computing is a very promising technology to speed up the computation and facilitate the simulation process. This paper presents two different parallel implementations of power grid dynamic simulation using Open Multi-processing (OpenMP) on a shared-memory platform, and Message Passing Interface (MPI) on distributed-memory clusters, respectively. The differences between the parallel simulation algorithms and architectures of the two HPC technologies are illustrated, and their performances for running parallel dynamic simulation are compared and demonstrated.

  18. Migration of vectorized iterative solvers to distributed memory architectures

    Energy Technology Data Exchange (ETDEWEB)

    Pommerell, C. [AT&T Bell Labs., Murray Hill, NJ (United States)]; Ruehl, R. [CSCS-ETH, Manno (Switzerland)]

    1994-12-31

    Both necessity and opportunity motivate the use of high-performance computers for iterative linear solvers. Necessity results from the size of the problems being solved; smaller problems are often better handled by direct methods. Opportunity arises from the formulation of the iterative methods in terms of simple linear algebra operations, even if this "natural" parallelism is not easy to exploit in irregularly structured sparse matrices and with good preconditioners. As a result, high-performance implementations of iterative solvers have attracted a lot of interest in recent years. Most efforts are geared to vectorize or parallelize the dominating operation (structured or unstructured sparse matrix-vector multiplication), or to increase locality and parallelism by reformulating the algorithm (reducing global synchronization in inner products or local data exchange in preconditioners). Target architectures for iterative solvers currently include mostly vector supercomputers and architectures with one or few optimized (e.g., super-scalar and/or super-pipelined RISC) processors and hierarchical memory systems. More recently, parallel computers with physically distributed memory and a better price/performance ratio have been offered by vendors as a very interesting alternative to vector supercomputers. However, programming comfort on such distributed memory parallel processors (DMPPs) still lags behind. Here the authors are concerned with iterative solvers and their changing computing environment. In particular, they are considering migration from traditional vector supercomputers to DMPPs. Application requirements force one to use flexible and portable libraries. They want to extend the portability of iterative solvers rather than reimplementing everything for each new machine, or even for each new architecture.
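
    The dominating operation named in this abstract, sparse matrix-vector multiplication, can be sketched in a few lines. The compressed sparse row (CSR) layout and the function name `csr_matvec` are assumptions made for illustration; the abstract does not say which storage format the authors use.

```python
def csr_matvec(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]:row_ptr[i + 1] delimits the nonzeros of row i inside
    col_idx (column numbers) and values (matrix entries).
    """
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Indirect access x[col_idx[k]] is what makes unstructured
            # matrices hard to vectorize and to distribute.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y
```

    Rows are independent of each other, which is the "natural" parallelism; on distributed memory the entries of `x` referenced through `col_idx` may live on other processors and have to be exchanged first.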

  19. Scalable Distributed Architectures for Information Retrieval

    National Research Council Canada - National Science Library

    Lu, Zhihong

    1999-01-01

    .... Our distributed architectures exploit parallelism in information retrieval on a cluster of parallel IR servers using symmetric multiprocessors, and use partial collection replication and selection...

  20. Neuronal model with distributed delay: analysis and simulation study for gamma distribution memory kernel.

    Science.gov (United States)

    Karmeshu; Gupta, Varun; Kadambari, K V

    2011-06-01

    A single neuronal model incorporating distributed delay (memory) is proposed. The stochastic model has been formulated as a Stochastic Integro-Differential Equation (SIDE) which results in the underlying process being non-Markovian. A detailed analysis of the model when the distributed delay kernel has exponential form (weak delay) has been carried out. The selection of exponential kernel has enabled the transformation of the non-Markovian model to a Markovian model in an extended state space. For the study of First Passage Time (FPT) with exponential delay kernel, the model has been transformed to a system of coupled Stochastic Differential Equations (SDEs) in two-dimensional state space. Simulation studies of the SDEs provide insight into the effect of weak delay kernel on the Inter-Spike Interval (ISI) distribution. A measure based on Jensen-Shannon divergence is proposed which can be used to make a choice between two competing models, viz. the distributed delay model vis-à-vis the LIF model. An interesting feature of the model is that the behavior of the coefficient of variation (CV) of the ISI distribution with respect to the memory kernel time constant parameter η reveals that the neuron can switch from a bursting state to a non-bursting state as the noise intensity parameter changes. The membrane potential exhibits decaying auto-correlation structure with or without damped oscillatory behavior depending on the choice of parameters. This behavior is in agreement with the empirically observed pattern of spike count in a fixed time window. The power spectral density derived from the auto-correlation function is found to exhibit single and double peaks. The model is also examined for the case of strong delay with memory kernel having the form of Gamma distribution. In contrast to fast decay of damped oscillations of the ISI distribution for the model with weak delay kernel, the decay of damped oscillations is found to be slower for the model with strong delay kernel.

  1. Parallel algorithms for geometric connected component labeling on a hypercube multiprocessor

    Science.gov (United States)

    Belkhale, K. P.; Banerjee, P.

    1992-01-01

    Different algorithms for the geometric connected component labeling (GCCL) problem are defined each of which involves d stages of message passing, for a d-dimensional hypercube. The major idea is that in each stage a hypercube multiprocessor increases its knowledge of domain. The algorithms under consideration include the QUAD algorithm for small number of processors and the Overlap Quad algorithm for large number of processors, subject to the locality of the connected sets. These algorithms differ in their run time, memory requirements, and message complexity. They were implemented on an Intel iPSC2/D4/MX hypercube.

  2. Massively Parallel Polar Decomposition on Distributed-Memory Systems

    KAUST Repository

    Ltaief, Hatem

    2018-01-01

    We present a high-performance implementation of the Polar Decomposition (PD) on distributed-memory systems. Building upon the QR-based Dynamically Weighted Halley (QDWH) algorithm, the key idea lies in finding the best rational approximation for the scalar sign function, which also corresponds to the polar factor for symmetric matrices, to further accelerate the QDWH convergence. Based on the Zolotarev rational functions, introduced by Zolotarev (ZOLO) in 1877, this new PD algorithm, ZOLO-PD, converges within two iterations even for ill-conditioned matrices, instead of the original six iterations needed for QDWH. ZOLO-PD uses the property of Zolotarev functions that optimality is maintained when two functions are composed in an appropriate manner. The resulting ZOLO-PD has a convergence rate up to seventeen, in contrast to the cubic convergence rate for QDWH. This comes at the price of higher arithmetic costs and memory footprint. These extra floating-point operations can, however, be processed in an embarrassingly parallel fashion. We demonstrate performance using up to 102,400 cores on two supercomputers. We demonstrate that, in the presence of a large number of processing units, ZOLO-PD is able to outperform QDWH by up to 2.3X speedup, especially in situations where QDWH runs out of work, for instance, in the strong scaling mode of operation.
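
    The role of a rational approximation to the scalar sign function can be illustrated with the classical Halley iteration with fixed coefficients. This is deliberately not QDWH or ZOLO-PD: the dynamic weights and the Zolotarev functions are not given in the abstract, so the sketch below (with the hypothetical name `halley_sign`) only shows the cubically convergent scalar recurrence on which the weighted variants build.

```python
def halley_sign(x, tol=1e-14, max_iter=50):
    """Approximate sign(x) with the unweighted Halley iteration.

    Iterates x <- x * (3 + x^2) / (1 + 3 * x^2), whose fixed points are
    +1 and -1. Returns (value, iterations).
    """
    if x == 0:
        raise ValueError("sign is undefined at zero")
    iterations = 0
    for _ in range(max_iter):
        nxt = x * (3.0 + x * x) / (1.0 + 3.0 * x * x)
        iterations += 1
        converged = abs(nxt - x) <= tol * abs(nxt)
        x = nxt
        if converged:
            break
    return x, iterations
```

    Starting values far from ±1 need more rounds before the cubic phase sets in, which is the behavior that dynamically chosen weights or higher-degree rational functions aim to shorten.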

  3. Sensorimotor memory of object weight distribution during multidigit grasp.

    Science.gov (United States)

    Albert, Frederic; Santello, Marco; Gordon, Andrew M

    2009-10-09

    We studied the ability to transfer three-digit force sharing patterns learned through consecutive lifts of an object with an asymmetric center of mass (CM). After several object lifts, we asked subjects to rotate and translate the object to the contralateral hand and perform one additional lift. This task was performed under two weight conditions (550 and 950 g) to determine the extent to which subjects would be able to transfer weight and CM information. Learning transfer was quantified by measuring the extent to which force sharing patterns and peak object roll on the first post-translation trial resembled those measured on the pre-translation trial with the same CM. We found that the overall gain of fingertip forces was transferred following object rotation, but that the scaling of individual digit forces was specific to the learned digit-object configuration, and thus was not transferred following rotation. As a result, on the first post-translation trial there was a significantly larger object roll following object lift-off than on the pre-translation trial. This suggests that sensorimotor memories for weight, requiring scaling of fingertip force gain, may differ from memories for mass distribution.

  4. High Performance Polar Decomposition on Distributed Memory Systems

    KAUST Repository

    Sukkari, Dalal E.

    2016-08-08

    The polar decomposition of a dense matrix is an important operation in linear algebra. It can be directly calculated through the singular value decomposition (SVD) or iteratively using the QR dynamically-weighted Halley algorithm (QDWH). The former is difficult to parallelize due to the preponderant number of memory-bound operations during the bidiagonal reduction. We investigate the latter scenario, which performs more floating-point operations but exposes at the same time more parallelism, and therefore runs closer to the theoretical peak performance of the system, thanks to more compute-bound matrix operations. Profiling results show the performance scalability of QDWH for calculating the polar decomposition using around 9200 MPI processes on well- and ill-conditioned matrices of 100K×100K problem size. We then study the performance impact of the QDWH-based polar decomposition as a pre-processing step toward calculating the SVD itself. The new distributed-memory implementation of the QDWH-SVD solver achieves up to five-fold speedup against current state-of-the-art vendor SVD implementations. © Springer International Publishing Switzerland 2016.

  5. Memory intensive functional architecture for distributed computer control systems

    International Nuclear Information System (INIS)

    Dimmler, D.G.

    1983-10-01

    A memory-intensive functional architecture for distributed data-acquisition, monitoring, and control systems with large numbers of nodes has been conceptually developed and applied in several large-scale and some smaller systems. This discussion concentrates on: (1) the basic architecture; (2) recent expansions of the architecture which now become feasible in view of the rapidly developing component technologies in microprocessors and functional large-scale integration circuits; and (3) implementation of some key hardware and software structures and one system implementation which is a system for performing control and data acquisition of a neutron spectrometer at the Brookhaven High Flux Beam Reactor. The spectrometer is equipped with a large-area position-sensitive neutron detector.

  6. Particle simulation on a distributed memory highly parallel processor

    International Nuclear Information System (INIS)

    Sato, Hiroyuki; Ikesaka, Morio

    1990-01-01

    This paper describes parallel molecular dynamics simulation of atoms governed by local force interaction. The space in the model is divided into cubic subspaces and mapped to the processor array of the CAP-256, a distributed memory, highly parallel processor developed at Fujitsu Labs. We developed a new technique to avoid redundant calculation of forces between atoms in different processors. Experiments showed the communication overhead was less than 5%, and the idle time due to load imbalance was less than 11% for two model problems which contain 11,532 and 46,128 argon atoms. From the software simulation, the CAP-II which is under development is estimated to be about 45 times faster than CAP-256 and will be able to run the same problem about 40 times faster than Fujitsu's M-380 mainframe when 256 processors are used. (author)

  7. Translation techniques for distributed-shared memory programming models

    Energy Technology Data Exchange (ETDEWEB)

    Fuller, Douglas James [Iowa State Univ., Ames, IA (United States)

    2005-01-01

    The high performance computing community has experienced an explosive improvement in distributed-shared memory hardware. Driven by increasing real-world problem complexity, this explosion has ushered in vast numbers of new systems. Each new system presents new challenges to programmers and application developers. Part of the challenge is adapting to new architectures with new performance characteristics. Different vendors release systems with widely varying architectures that perform differently in different situations. Furthermore, since vendors need only provide a single performance number (total MFLOPS, typically for a single benchmark), they only have strong incentive initially to optimize the API of their choice. Consequently, only a fraction of the available APIs are well optimized on most systems. This causes issues porting and writing maintainable software, let alone issues for programmers burdened with mastering each new API as it is released. Also, programmers wishing to use a certain machine must choose their API based on the underlying hardware instead of the application. This thesis argues that a flexible, extensible translator for distributed-shared memory APIs can help address some of these issues. For example, a translator might take as input code in one API and output an equivalent program in another. Such a translator could provide instant porting for applications to new systems that do not support the application's library or language natively. While open-source APIs are abundant, they do not perform optimally everywhere. A translator would also allow performance testing using a single base code translated to a number of different APIs. Most significantly, this type of translator frees programmers to select the most appropriate API for a given application based on the application (and developer) itself instead of the underlying hardware.

  8. The distribution and the functions of autobiographical memories: Why do older adults remember autobiographical memories from their youth?

    Science.gov (United States)

    Wolf, Tabea; Zimprich, Daniel

    2016-09-01

    In the present study, the distribution of autobiographical memories was examined from a functional perspective: we examined whether the extent to which long-term autobiographical memories were rated as having a self-, a directive, or a social function affects the location (mean age) and scale (standard deviation) of the memory distribution. Analyses were based on a total of 5598 autobiographical memories generated by 149 adults aged between 50 and 81 years in response to 51 cue-words. Participants provided their age at the time when the recalled events had happened and rated how frequently they recall these events for self-, directive, and social purposes. While more frequently using autobiographical memories for self-functions was associated with an earlier mean age, memories frequently shared with others showed a narrower distribution around a later mean age. The directive function, by contrast, did not affect the memory distribution. The results strengthen the assumption that experiences from an individual's late adolescence serve to maintain a sense of self-continuity throughout the lifespan. Experiences that are frequently shared with others, in contrast, stem from a narrow age range located in young adulthood.

  9. USC orthogonal multiprocessor for image processing with neural networks

    Science.gov (United States)

    Hwang, Kai; Panda, Dhabaleswar K.; Haddadi, Navid

    1990-07-01

    This paper presents the architectural features and imaging applications of the Orthogonal MultiProcessor (OMP) system, which is under construction at the University of Southern California with research funding from NSF and assistance from several industrial partners. The prototype OMP is being built with 16 Intel i860 RISC microprocessors and 256 parallel memory modules using custom-designed spanning buses, which are 2-D interleaved and orthogonally accessed without conflicts. The 16-processor OMP prototype is targeted to achieve 430 MIPS and 600 Mflops, which have been verified by simulation experiments based on the design parameters used. The prototype OMP machine will be initially applied for image processing, computer vision, and neural network simulation applications. We summarize important vision and imaging algorithms that can be restructured with neural network models. These algorithms can efficiently run on the OMP hardware with linear speedup. The ultimate goal is to develop a high-performance Visual Computer (Viscom) for integrated low- and high-level image processing and vision tasks.

  10. Power profiling of Cholesky and QR factorizations on distributed memory systems

    KAUST Repository

    Bosilca, George; Ltaief, Hatem; Dongarra, Jack

    2012-01-01

    with a dynamic distributed scheduler (DAGuE) to leverage distributed memory systems. We present performance results (Gflop/s) as well as the power profile (Watts) of two common dense factorizations needed to solve linear systems of equations, namely

  11. A general purpose subroutine for fast fourier transform on a distributed memory parallel machine

    Science.gov (United States)

    Dubey, A.; Zubair, M.; Grosch, C. E.

    1992-01-01

    One issue which is central in developing a general purpose Fast Fourier Transform (FFT) subroutine on a distributed memory parallel machine is the data distribution. It is possible that different users would like to use the FFT routine with different data distributions. Thus, there is a need to design FFT schemes on distributed memory parallel machines which can support a variety of data distributions. An FFT implementation on a distributed memory parallel machine which works for a number of data distributions commonly encountered in scientific applications is presented. The problem of rearranging the data after computing the FFT is also addressed. The performance of the implementation on a distributed memory parallel machine Intel iPSC/860 is evaluated.
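
    The abstract does not say which data distributions the routine supports. As a minimal sketch, assuming the two layouts most often seen in scientific codes (block and cyclic), the owner of each global index could be computed as follows. The function names are hypothetical and are not taken from the paper.

```python
def block_owner(i, n, p):
    """Processor owning global index i when n elements are split into p contiguous blocks."""
    block = -(-n // p)  # ceil(n / p): elements per processor
    return i // block


def cyclic_owner(i, p):
    """Processor owning global index i when elements are dealt out round-robin."""
    return i % p


if __name__ == "__main__":
    n, p = 8, 4
    print([block_owner(i, n, p) for i in range(n)])
    print([cyclic_owner(i, p) for i in range(n)])
```

    An FFT routine that accepts several layouts needs a mapping of this kind to decide which elements must be exchanged between processors at each stage.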

  12. Adaptive Dynamic Process Scheduling on Distributed Memory Parallel Computers

    Directory of Open Access Journals (Sweden)

    Wei Shu

    1994-01-01

    One of the challenges in programming distributed memory parallel machines is deciding how to allocate work to processors. This problem is particularly important for computations with unpredictable dynamic behaviors or irregular structures. We present a scheme for dynamic scheduling of medium-grained processes that is useful in this context. The adaptive contracting within neighborhood (ACWN) is a dynamic, distributed, load-dependent, and scalable scheme. It deals with dynamic and unpredictable creation of processes and adapts to different systems. The scheme is described and contrasted with two other schemes that have been proposed in this context, namely the randomized allocation and the gradient model. The performance of the three schemes on an Intel iPSC/2 hypercube is presented and analyzed. The experimental results show that even though the ACWN algorithm incurs somewhat larger overhead than the randomized allocation, it achieves better performance in most cases due to its adaptiveness. Its feature of quickly spreading the work helps it outperform the gradient model in performance and scalability.

  13. Differentiation and Response Bias in Episodic Memory: Evidence from Reaction Time Distributions

    Science.gov (United States)

    Criss, Amy H.

    2010-01-01

    In differentiation models, the processes of encoding and retrieval produce an increase in the distribution of memory strength for targets and a decrease in the distribution of memory strength for foils as the amount of encoding increases. This produces an increase in the hit rate and decrease in the false-alarm rate for a strongly encoded compared…

  14. Efficient process migration in the EMPS multiprocessor system

    NARCIS (Netherlands)

    van Dijk, G.J.W.; Gils, van M.J.

    1992-01-01

    The process migration facility in the Eindhoven multiprocessor system (EMPS) is presented. In the EMPS system, mailboxes are used for interprocess communication. These mailboxes provide transparency of location for communicating processes. The major advantages of mailbox communication in the EMPS

  15. Mapping of H.264 decoding on a multiprocessor architecture

    Science.gov (United States)

    van der Tol, Erik B.; Jaspers, Egbert G.; Gelderblom, Rob H.

    2003-05-01

    Due to the increasing significance of development costs in the competitive domain of high-volume consumer electronics, generic solutions are required to enable reuse of the design effort and to increase the potential market volume. As a result, Systems-on-Chip (SoCs) contain a growing number of fully programmable media processing devices as opposed to application-specific systems, which offered the most attractive solutions due to a high performance density. The following motivates this trend. First, SoCs are increasingly dominated by their communication infrastructure and embedded memory, thereby making the cost of the functional units less significant. Moreover, the continuously growing design costs require generic solutions that can be applied over a broad product range. Hence, powerful programmable SoCs are becoming increasingly attractive. However, to enable power-efficient designs that are also scalable over the advancing VLSI technology, parallelism should be fully exploited. Both task-level and instruction-level parallelism can be provided by means of e.g. a VLIW multiprocessor architecture. To provide the above-mentioned scalability, we propose to partition the data over the processors, instead of traditional functional partitioning. An advantage of this approach is the inherent locality of data, which is extremely important for communication-efficient software implementations. Consequently, a software implementation is discussed, enabling e.g. SD resolution H.264 decoding with a two-processor architecture, whereas High-Definition (HD) decoding can be achieved with an eight-processor system, executing the same software. Experimental results show that the data communication is reduced considerably, by up to 65%, directly improving the overall performance. Apart from considerable improvement in memory bandwidth, this novel concept of partitioning offers a natural approach for optimally balancing the load of all processors, thereby further improving the

  16. Multiprocessor Priority Ceiling Emulation for Safety-Critical Java

    DEFF Research Database (Denmark)

    Strøm, Torur Biskopstø; Schoeberl, Martin

    2015-01-01

    Priority ceiling emulation has preferable properties on uniprocessor systems, such as avoiding priority inversion and being deadlock free. This has made it a popular locking protocol. According to the safety-critical Java specification, priority ceiling emulation is a requirement for implementations....... However, implementing the protocol for multiprocessor systems is more complex so implementations might perform worse than non-preemptive implementations. In this paper we compare two multiprocessor lock implementations with hardware support for the Java optimized processor: non-preemptive locking...

  17. Hardware locks for a real-time Java chip multiprocessor

    DEFF Research Database (Denmark)

    Strøm, Torur Biskopstø; Puffitsch, Wolfgang; Schoeberl, Martin

    2016-01-01

    A software locking mechanism commonly protects shared resources for multithreaded applications. This mechanism can, especially in chip-multiprocessor systems, result in a large synchronization overhead. For real-time systems in particular, this overhead increases the worst-case execution time....... This improvement can allow a larger number of real-time tasks to be reliably scheduled on a multiprocessor real-time platform....

  18. Multiprocessor Global Scheduling on Frame-Based DVFS Systems

    OpenAIRE

    Berten, Vandy; Goossens, Joël

    2008-01-01

    In this work, we are interested in multiprocessor energy efficient systems where task durations are not known in advance but are known stochastically. More precisely we consider global scheduling algorithms for frame-based multiprocessor stochastic DVFS (Dynamic Voltage and Frequency Scaling) systems. Moreover we consider processors with a discrete set of available frequencies. We provide a global scheduling algorithm, and formally show that no deadline will ever be mi...

  19. An optimal multi-channel memory controller for real-time systems

    NARCIS (Netherlands)

    Gomony, M.D.; Akesson, K.B.; Goossens, K.G.W.

    2013-01-01

    Optimal utilization of a multi-channel memory, such as Wide IO DRAM, as shared memory in multi-processor platforms depends on the mapping of memory clients to the memory channels, the granularity at which the memory requests are interleaved in each channel, and the bandwidth and memory capacity

  20. Intelligent discrete particle swarm optimization for multiprocessor task scheduling problem

    Directory of Open Access Journals (Sweden)

    S Sarathambekai

    2017-03-01

    Discrete particle swarm optimization is one of the most recently developed population-based meta-heuristic optimization algorithms in swarm intelligence and can be used for any discrete optimization problem. This article presents a discrete particle swarm optimization algorithm to efficiently schedule tasks in heterogeneous multiprocessor systems. All the optimization algorithms share a common algorithmic step, namely population initialization. It plays a significant role because it can affect the convergence speed and also the quality of the final solution. Random initialization is the most commonly used method in the majority of evolutionary algorithms to generate solutions in the initial population. Good-quality initial solutions can help the algorithm locate the optimal solution, whereas poor ones may prevent the algorithm from finding it. Intelligence should be incorporated into the generation of the initial population in order to avoid premature convergence. This article presents a discrete particle swarm optimization algorithm that incorporates an opposition-based technique to generate the initial population and a greedy algorithm to balance the load of the processors. Makespan, flow time, and reliability cost are three different measures used to evaluate the efficiency of the proposed discrete particle swarm optimization algorithm for scheduling independent tasks in distributed systems. Computational simulations are done based on a set of benchmark instances to assess the performance of the proposed algorithm.
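
    The opposition-based initialization mentioned in the abstract can be illustrated with a minimal sketch. It assumes a common form of opposition for bounded integers, in which processor p maps to (num_procs - 1 - p), and an expected-time-to-compute matrix etc[task][proc] for the heterogeneous processors. The function names and the selection rule (keep the candidates with the smallest makespan) are assumptions for illustration, not the article's exact procedure.

```python
import random


def makespan(assignment, etc, num_procs):
    """Largest total load on any processor for a task -> processor assignment."""
    loads = [0.0] * num_procs
    for task, proc in enumerate(assignment):
        loads[proc] += etc[task][proc]
    return max(loads)


def opposite(assignment, num_procs):
    """Opposition-based counterpart: processor p maps to num_procs - 1 - p."""
    return [num_procs - 1 - p for p in assignment]


def opposition_based_population(etc, num_procs, pop_size, rng):
    """Draw pop_size random assignments, add their opposites, keep the best pop_size."""
    randoms = [[rng.randrange(num_procs) for _ in etc] for _ in range(pop_size)]
    candidates = randoms + [opposite(a, num_procs) for a in randoms]
    candidates.sort(key=lambda a: makespan(a, etc, num_procs))
    return candidates[:pop_size]


if __name__ == "__main__":
    # Hypothetical 4-task, 3-processor instance.
    etc = [[2.0, 4.0, 8.0], [3.0, 1.0, 5.0], [6.0, 2.0, 1.0], [4.0, 4.0, 2.0]]
    for individual in opposition_based_population(etc, 3, 5, random.Random(0)):
        print(individual, makespan(individual, etc, 3))
```

    Evaluating each random individual together with its opposite doubles the candidate pool at initialization, which is the usual argument for why such schemes may start the search from better solutions.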

  1. Languages, compilers and run-time environments for distributed memory machines

    CERN Document Server

    Saltz, J

    1992-01-01

    Papers presented within this volume cover a wide range of topics related to programming distributed memory machines. Distributed memory architectures, although having the potential to supply the very high levels of performance required to support future computing needs, present awkward programming problems. The major issue is to design methods which enable compilers to generate efficient distributed memory programs from relatively machine independent program specifications. This book is the compilation of papers describing a wide range of research efforts aimed at easing the task of programmin

  2. Scaling Techniques for Massive Scale-Free Graphs in Distributed (External) Memory

    KAUST Repository

    Pearce, Roger; Gokhale, Maya; Amato, Nancy M.

    2013-01-01

    We present techniques to process large scale-free graphs in distributed memory. Our aim is to scale to trillions of edges, and our research is targeted at leadership class supercomputers and clusters with local non-volatile memory, e.g., NAND Flash

  3. Prefetching in file systems for MIMD multiprocessors

    Science.gov (United States)

    Kotz, David F.; Ellis, Carla Schlatter

    1990-01-01

    The question of whether prefetching blocks of the file into the block cache can effectively reduce overall execution time of a parallel computation, even under favorable assumptions, is considered. Experiments have been conducted with an interleaved file system testbed on the Butterfly Plus multiprocessor. Results of these experiments suggest that (1) the hit ratio, the accepted measure in traditional caching studies, may not be an adequate measure of performance when the workload consists of parallel computations and parallel file access patterns, (2) caching with prefetching can significantly improve the hit ratio and the average time to perform an I/O (input/output) operation, and (3) an improvement in overall execution time has been observed in most cases. In spite of these gains, prefetching sometimes results in increased execution times (a negative result, given the optimistic nature of the study). The authors explore why it is not trivial to translate savings on individual I/O requests into consistently better overall performance and identify the key problems that need to be addressed in order to improve the potential of prefetching techniques in the environment.

  4. FTMP (Fault Tolerant Multiprocessor) programmer's manual

    Science.gov (United States)

    Feather, F. E.; Liceaga, C. A.; Padilla, P. A.

    1986-01-01

    The Fault Tolerant Multiprocessor (FTMP) computer system was constructed using the Rockwell/Collins CAPS-6 processor. It is installed in the Avionics Integration Research Laboratory (AIRLAB) of NASA Langley Research Center. It is hosted by AIRLAB's System 10, a VAX 11/750, for the loading of programs and experimentation. The FTMP support software includes a cross compiler for a high level language called Automated Engineering Design (AED) System, an assembler for the CAPS-6 processor assembly language, and a linker. Access to this support software is through an automated remote access facility on the VAX which relieves the user of the burden of learning how to use the IBM 4381. This manual is a compilation of information about the FTMP support environment. It explains the FTMP software and support environment along with many of the finer points of running programs on FTMP. This will be helpful to the researcher trying to run an experiment on FTMP and even to the person probing FTMP with fault injections. Much of the information in this manual can be found in other sources; we are only attempting to bring together the basic points in a single source. If the reader should need points clarified, there is a list of support documentation in the back of this manual.

  5. Administrator of 9/11 victim compensation fund to administer Hokie Spirit Memorial Fund distributions

    OpenAIRE

    Hincker, Lawrence

    2007-01-01

    Virginia Tech President Charles Steger has asked Kenneth R. Feinberg, who served as "Special Master of the federal September 11th Victim Compensation Fund of 2001," to administer distributions of the university Hokie Spirit Memorial Fund (HSMF).

  6. Computational cost of isogeometric multi-frontal solvers on parallel distributed memory machines

    KAUST Repository

    Woźniak, Maciej; Paszyński, Maciej R.; Pardo, D.; Dalcin, Lisandro; Calo, Victor M.

    2015-01-01

    This paper derives theoretical estimates of the computational cost for isogeometric multi-frontal direct solver executed on parallel distributed memory machines. We show theoretically that for the C^(p-1) global continuity of the isogeometric solution

  7. Construction and Application of an AMR Algorithm for Distributed Memory Computers

    OpenAIRE

    Deiterding, Ralf

    2003-01-01

    While the parallelization of blockstructured adaptive mesh refinement techniques is relatively straight-forward on shared memory architectures, appropriate distribution strategies for the emerging generation of distributed memory machines are a topic of on-going research. In this paper, a locality-preserving domain decomposition is proposed that partitions the entire AMR hierarchy from the base level on. It is shown that the approach reduces the communication costs and simplifies the im...

  8. Multiprocessor based data acquisition system for radiation monitoring in nuclear reactors

    International Nuclear Information System (INIS)

    Pansare, M.G.; Narsaiah, A.; Anantha Krishnan, T.S.

    1989-01-01

    Expensive minicomputers are required for building powerful Data Acquisition Systems (DAS) capable of scanning and processing a large number of signals in a real-time environment. However, by using inexpensive microprocessors in a multiprocessor configuration it is possible to build DASs that are as powerful as minicomputer-based systems at much lower cost. This paper describes such a multiprocessor-based DAS designed for acquiring data from various radiation monitoring instruments of a nuclear reactor. The system is built using MULTIBUS standard boards based on the Intel 8086 16-bit microprocessor, with local and shared memory. The system monitors up to 128 analog input channels and 64 digital input channels, and actuates up to 128 digital output contacts. The system continuously checks for the alarm condition of the input channels and displays the alarm status on an ALARM CRT. A facility has been provided for the transfer of data to a central computer. At any instant of time, the information regarding different channels being monitored is available from the local console as well as through five remote terminals located at various places in the reactor building. (author)

  9. Development of a VME multi-processor system for plasma control at the JT-60 Upgrade

    International Nuclear Information System (INIS)

    Takahashi, M.; Kurihara, K.; Kawamata, Y.; Akasaka, H.; Kimura, T.

    1992-01-01

    Design and initial operation results are reported of a VME multi-processor system [1] for plasma control at a large fusion device named 'the JT-60 Upgrade' utilizing three 32-bit MC88100 based RISC computers and VME components. Development of the system was stimulated by faster and more accurate computation requirements for the plasma position and current control. The RISC computers operate at 25 MHz along with two cache memories named MC88200. We newly developed VME bus modules of up/down counter, analog-to-digital converter and clock pulse generator for measuring magnetic field and coil current and for synchronizing the processing in the three RISCs and direct digital controllers (DDCs) of magnet power supplies. We also evaluated that the speed of the data transfer between the VME bus system and the DDCs through CAMAC highways satisfies the above requirements. In the initial operation of the JT-60 Upgrade, it has been proved that the VME multi-processor system well controls the plasma position and current with a sampling period of 250 μsec and a delay of 500 μsec. (author)

  10. Childhood amnesia in the making: different distributions of autobiographical memories in children and adults.

    Science.gov (United States)

    Bauer, Patricia J; Larkina, Marina

    2014-04-01

    Within the memory literature, a robust finding is of childhood amnesia: a relative paucity among adults for autobiographical or personal memories from the first 3 to 4 years of life, and from the first 7 years, a smaller number of memories than would be expected based on normal forgetting. Childhood amnesia is observed in spite of strong evidence that during the period eventually obscured by the amnesia, children construct and preserve autobiographical memories. Why early memories seemingly are lost to recollection is an unanswered question. In the present research, we examined the issue by using the cue word technique to chart the distributions of autobiographical memories in samples of children ages 7 to 11 years and samples of young and middle-aged adults. Among adults, the distributions were best fit by the power function, whereas among children, the exponential function provided a better fit to the distributions of memories. The findings suggest that a major source of childhood amnesia is a constant rate of forgetting in childhood, seemingly resulting from failed consolidation, the outcome of which is a smaller pool of memories available for later retrieval.

  11. Best Speed Fit EDF Scheduling for Performance Asymmetric Multiprocessors

    Directory of Open Access Journals (Sweden)

    Peng Wu

    2017-01-01

    In order to improve the performance of a real-time system, asymmetric multiprocessors have been proposed. The benefits of improved system performance and reduced power consumption from such architectures cannot be fully exploited unless suitable task scheduling and task allocation approaches are implemented at the operating system level. Unfortunately, most of the previous research on scheduling algorithms for performance asymmetric multiprocessors is focused on task priority assignment: it simply assigns the highest priority task to the fastest processor. In this paper, we propose BSF-EDF (best speed fit for earliest deadline first) for performance asymmetric multiprocessor scheduling. This approach chooses a suitable processor rather than the fastest one when allocating tasks. With the proposed BSF-EDF scheduling, we also derive an effective schedulability test.
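
    The abstract does not define what makes a processor "suitable". One plausible reading, sketched below purely as an assumption, is to pick the slowest idle processor on which the job still meets its deadline and to fall back to the fastest one otherwise. The function name, the speed model, and the fallback rule are hypothetical and are not the paper's algorithm.

```python
def best_speed_fit(work, time_to_deadline, idle_speeds):
    """Return the index of the slowest idle processor that still meets the deadline.

    work: remaining execution demand, measured on a unit-speed processor
    time_to_deadline: time left until the job's absolute deadline
    idle_speeds: speeds of the currently idle processors (hypothetical model)

    Falls back to the fastest idle processor when none can meet the deadline,
    and returns None when no processor is idle.
    """
    if not idle_speeds:
        return None
    feasible = [i for i, s in enumerate(idle_speeds) if work / s <= time_to_deadline]
    if feasible:
        return min(feasible, key=lambda i: idle_speeds[i])
    return max(range(len(idle_speeds)), key=lambda i: idle_speeds[i])


if __name__ == "__main__":
    speeds = [1.0, 2.0, 4.0]
    print(best_speed_fit(4.0, 4.0, speeds))  # slowest processor is fast enough
    print(best_speed_fit(4.0, 3.0, speeds))  # needs at least the speed-2 processor
```

    Under an EDF policy the function would be called for the pending job with the earliest deadline, leaving faster processors free for jobs that actually need them.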

  12. Estimating Performance of Single Bus, Shared Memory Multiprocessors

    Science.gov (United States)

    1987-05-01

    [Chandy78] K.M. Chandy, C.M. Sauer, "Approximate methods for analyzing queuing network models of computing systems," Computing Surveys, vol. 10, no. 3 ... [Denning78] P. Denning, J. Buzen, "The operational analysis of queueing network models", Computing Surveys, vol. 10, no. 3, September 1978, pp. 225-261

  13. High speed vision processor with reconfigurable processing element array based on full-custom distributed memory

    Science.gov (United States)

    Chen, Zhe; Yang, Jie; Shi, Cong; Qin, Qi; Liu, Liyuan; Wu, Nanjian

    2016-04-01

    In this paper, a hybrid vision processor based on a compact full-custom distributed memory for near-sensor high-speed image processing is proposed. The proposed processor consists of a reconfigurable processing element (PE) array, a row processor (RP) array, and a dual-core microprocessor. The PE array includes two-dimensional processing elements with a compact full-custom distributed memory. It supports real-time reconfiguration between the PE array and the self-organized map (SOM) neural network. The vision processor is fabricated using a 0.18 µm CMOS technology. The circuit area of the distributed memory is reduced markedly to 1/3 of that of the conventional memory so that the circuit area of the vision processor is reduced by 44.2%. Experimental results demonstrate that the proposed design functions correctly.

  14. Memory-assisted quantum key distribution resilient against multiple-excitation effects

    Science.gov (United States)

    Lo Piparo, Nicolò; Sinclair, Neil; Razavi, Mohsen

    2018-01-01

    Memory-assisted measurement-device-independent quantum key distribution (MA-MDI-QKD) has recently been proposed as a technique to improve the rate-versus-distance behavior of QKD systems by using existing, or nearly-achievable, quantum technologies. The promise is that MA-MDI-QKD would require less demanding quantum memories than the ones needed for probabilistic quantum repeaters. Nevertheless, early investigations suggest that, in order to beat the conventional memory-less QKD schemes, the quantum memories used in the MA-MDI-QKD protocols must have high bandwidth-storage products and short interaction times. Among different types of quantum memories, ensemble-based memories offer some of the required specifications, but they typically suffer from multiple excitation effects. To avoid the latter issue, in this paper, we propose two new variants of MA-MDI-QKD both relying on single-photon sources for entangling purposes. One is based on known techniques for entanglement distribution in quantum repeaters. This scheme turns out to offer no advantage even if one uses ideal single-photon sources. By finding the root cause of the problem, we then propose another setup, which can outperform single memory-less setups even if we allow for some imperfections in our single-photon sources. For such a scheme, we compare the key rate for different types of ensemble-based memories and show that certain classes of atomic ensembles can improve the rate-versus-distance behavior.

  15. Performance of the coupled thermalhydraulics/neutron kinetics code R/P/C on workstation clusters and multiprocessor systems

    International Nuclear Information System (INIS)

    Hammer, C.; Paffrath, M.; Boeer, R.; Finnemann, H.; Jackson, C.J.

    1996-01-01

    The light water reactor core simulation code PANBOX has been coupled with the transient analysis code RELAP5 for the purpose of performing plant safety analyses with a three-dimensional (3-D) neutron kinetics model. The system has been parallelized to improve the computational efficiency. The paper describes the features of this system with emphasis on performance aspects. Performance results are given for different types of parallelization, i.e. for using an automatic parallelizing compiler, using the portable PVM platform on a workstation cluster, using PVM on a shared memory multiprocessor, and for using machine dependent interfaces. (author)

  16. Embedded software design and programming of multiprocessor system-on-chip simulink and system C case studies

    CERN Document Server

    Popovici, Katalin; Jerraya, Ahmed A; Wolf, Marilyn

    2010-01-01

    Current multimedia and telecom applications require complex, heterogeneous multiprocessor system on chip (MPSoC) architectures with specific communication infrastructure in order to achieve the required performance. Heterogeneous MPSoC includes different types of processing units (DSP, microcontroller, ASIP) and different communication schemes (fast links, non standard memory organization and access). Programming an MPSoC requires the generation of efficient software running on MPSoC from a high level environment, by using the characteristics of the architecture. This task is known to be tediou

  17. Job-mix modeling and system analysis of an aerospace multiprocessor.

    Science.gov (United States)

    Mallach, E. G.

    1972-01-01

    An aerospace guidance computer organization, consisting of multiple processors and memory units attached to a central time-multiplexed data bus, is described. A job mix for this type of computer is obtained by analysis of Apollo mission programs. Multiprocessor performance is then analyzed using: 1) queuing theory, under certain 'limiting case' assumptions; 2) Markov process methods; and 3) system simulation. Results of the analyses indicate: 1) Markov process analysis is a useful and efficient predictor of simulation results; 2) efficient job execution is not seriously impaired even when the system is so overloaded that new jobs are inordinately delayed in starting; 3) job scheduling is significant in determining system performance; and 4) a system having many slow processors may or may not perform better than a system of equal power having few fast processors, but will not perform significantly worse.

  18. Operating experience with a VMEbus multiprocessor system for data acquisition and reduction in nuclear physics

    International Nuclear Information System (INIS)

    Kutt, P.H.; Balamuth, D.P.

    1989-01-01

    A multiprocessor system based on commercially available VMEbus components has been developed for the acquisition and reduction of event-mode data in nuclear physics experiments. The system contains seven 68000 CPU's and 14 MB of memory. A minimal operating system handles data transfer and task allocation, and a compiler for a specially designed event analysis language produces code for the processors. The system has been in operation for four years at the University of Pennsylvania Tandem Accelerator Laboratory. Computation rates over 3 times that of a MicroVAX II have been achieved at a fraction of the cost. The use of WORM optical disks for event recording allows the processing for gigabyte data sets without operator intervention. A more powerful system is being planned which will make use of recently developed RISC processors to obtain an order of magnitude increase in computing power per node

  19. Chip-Multiprocessor Hardware Locks for Safety-Critical Java

    DEFF Research Database (Denmark)

    Strøm, Torur Biskopstø; Puffitsch, Wolfgang; Schoeberl, Martin

    2013-01-01

    and may void a task set's schedulability. In this paper we present a hardware locking mechanism to reduce the synchronization overhead. The solution is implemented for the chip-multiprocessor version of the Java Optimized Processor in the context of safety-critical Java. The implementation is compared...

  20. Stream-processing pipelines: processing of streams on multiprocessor architecture

    NARCIS (Netherlands)

    Kavaldjiev, N.K.; Smit, Gerardus Johannes Maria; Jansen, P.G.

    In this paper we study the timing aspects of the operation of stream-processing applications that run on a multiprocessor architecture. Dependencies are derived for the processing and communication times of the processors in such a system. Three cases of real-time constrained operation and four

  1. Software for event oriented processing on multiprocessor systems

    International Nuclear Information System (INIS)

    Fischler, M.; Areti, H.; Biel, J.; Bracker, S.; Case, G.; Gaines, I.; Husby, D.; Nash, T.

    1984-08-01

    Computing intensive problems that require the processing of numerous essentially independent events are natural customers for large scale multi-microprocessor systems. This paper describes the software required to support users with such problems in a multiprocessor environment. It is based on experience with and development work aimed at processing very large amounts of high energy physics data

  2. Two alternate proofs of Wang's lune formula for sparse distributed memory and an integral approximation

    Science.gov (United States)

    Jaeckel, Louis A.

    1988-01-01

    In Kanerva's Sparse Distributed Memory, writing to and reading from the memory are done in relation to spheres in an n-dimensional binary vector space. Thus it is important to know how many points are in the intersection of two spheres in this space. Two proofs are given of Wang's formula for spheres of unequal radii, and an integral approximation for the intersection in this case.
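
    The quantity discussed above, the number of points lying in the intersection of two spheres in an n-dimensional binary space, can be counted exactly for small cases. The sketch below is a plain combinatorial count checked against brute-force enumeration; it is not Wang's formula, nor either of the proofs or the integral approximation from the record. The function names and the choice of sphere centres are illustrative assumptions.

```python
from itertools import product
from math import comb


def hamming(x, y):
    """Number of positions in which two equal-length bit tuples differ."""
    return sum(a != b for a, b in zip(x, y))


def intersection_size(n, d, r1, r2):
    """Count points within r1 of x and within r2 of y, where hamming(x, y) == d.

    A point z is described by a, the number of the d disagreeing positions
    in which z differs from x, and b, the number of the n - d agreeing
    positions in which z differs from both centres. Then
    dist(z, x) = a + b and dist(z, y) = (d - a) + b.
    """
    total = 0
    for a in range(d + 1):
        for b in range(n - d + 1):
            if a + b <= r1 and (d - a) + b <= r2:
                total += comb(d, a) * comb(n - d, b)
    return total


def brute_force(n, d, r1, r2):
    """Enumerate all 2**n points; only feasible for small n."""
    x = (0,) * n
    y = (1,) * d + (0,) * (n - d)  # centre at Hamming distance d from x
    return sum(
        1
        for z in product((0, 1), repeat=n)
        if hamming(z, x) <= r1 and hamming(z, y) <= r2
    )
```

    For example, `intersection_size(10, 4, 3, 5)` and `brute_force(10, 4, 3, 5)` agree, while only the combinatorial version remains practical at the dimensions typical of a sparse distributed memory.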

  3. How are rescaled range analyses affected by different memory and distributional properties? A Monte Carlo study

    Czech Academy of Sciences Publication Activity Database

    Krištoufek, Ladislav

    2012-01-01

    Vol. 391, No. 17 (2012), pp. 4252-4260 ISSN 0378-4371 R&D Projects: GA ČR GA402/09/0965 Grant - others: GA UK(CZ) 118310; SVV(CZ) 261 501 Institutional support: RVO:67985556 Keywords: Rescaled range analysis * Modified rescaled range analysis * Hurst exponent * Long-term memory * Short-term memory Subject RIV: AH - Economics Impact factor: 1.676, year: 2012 http://library.utia.cas.cz/separaty/2012/E/kristoufek-how are rescaled range analyses affected by different memory and distributional properties.pdf

  4. Efficient packing of patterns in sparse distributed memory by selective weighting of input bits

    Science.gov (United States)

    Kanerva, Pentti

    1991-01-01

    When a set of patterns is stored in a distributed memory, any given storage location participates in the storage of many patterns. From the perspective of any one stored pattern, the other patterns act as noise, and such noise limits the memory's storage capacity. The more similar the retrieval cues for two patterns are, the more the patterns interfere with each other in memory, and the harder it is to separate them on retrieval. A method is described of weighting the retrieval cues to reduce such interference and thus to improve the separability of patterns that have similar cues.

  5. A database for on-line event analysis on a distributed memory machine

    CERN Document Server

    Argante, E; Van der Stok, P D V; Willers, Ian Malcolm

    1995-01-01

    Parallel in-memory databases can enhance the structuring and parallelization of programs used in High Energy Physics (HEP). Efficient database access routines are used as communication primitives which hide the communication topology in contrast to the more explicit communications like PVM or MPI. A parallel in-memory database, called SPIDER, has been implemented on a 32 node Meiko CS-2 distributed memory machine. The SPIDER primitives generate a lower overhead than the one generated by PVM or MPI. The event reconstruction program, CPREAD of the CPLEAR experiment, has been used as a test case. Performance measurements showed that CPREAD interfaced to SPIDER can easily cope with the event rate generated by CPLEAR.

  6. Portable memory consistency for software managed distributed memory in many-core SoC

    NARCIS (Netherlands)

    Rutgers, J.H.; Bekooij, Marco Jan Gerrit; Smit, Gerardus Johannes Maria

    2013-01-01

    Porting software to different platforms can require modifications of the application. One of the issues is that the targeted hardware supports another memory consistency model. As a consequence, the completion order of reads and writes in a multi-threaded application can change, which may result in

  7. Recognition of simple visual images using a sparse distributed memory: Some implementations and experiments

    Science.gov (United States)

    Jaeckel, Louis A.

    1990-01-01

    Previously, a method was described of representing a class of simple visual images so that they could be used with a Sparse Distributed Memory (SDM). Herein, two possible implementations are described of a SDM, for which these images, suitably encoded, will serve both as addresses to the memory and as data to be stored in the memory. A key feature of both implementations is that a pattern that is represented as an unordered set with a variable number of members can be used as an address to the memory. In the 1st model, an image is encoded as a 9072 bit string to be used as a read or write address; the bit string may also be used as data to be stored in the memory. Another representation, in which an image is encoded as a 256 bit string, may be used with either model as data to be stored in the memory, but not as an address. In the 2nd model, an image is not represented as a vector of fixed length to be used as an address. Instead, a rule is given for determining which memory locations are to be activated in response to an encoded image. This activation rule treats the pieces of an image as an unordered set. With this model, the memory can be simulated, based on a method of computing the approximate result of a read operation.

  8. Scalable shared-memory multiprocessing

    CERN Document Server

    Lenoski, Daniel E

    1995-01-01

    Dr. Lenoski and Dr. Weber have experience with leading-edge research and practical issues involved in implementing large-scale parallel systems. They were key contributors to the architecture and design of the DASH multiprocessor. Currently, they are involved with commercializing scalable shared-memory technology.

  9. PGHPF – An Optimizing High Performance Fortran Compiler for Distributed Memory Machines

    Directory of Open Access Journals (Sweden)

    Zeki Bozkus

    1997-01-01

    High Performance Fortran (HPF) is the first widely supported, efficient, and portable parallel programming language for shared and distributed memory systems. HPF is realized through a set of directive-based extensions to Fortran 90. It enables application developers and Fortran end-users to write compact, portable, and efficient software that will compile and execute on workstations, shared memory servers, clusters, traditional supercomputers, or massively parallel processors. This article describes a production-quality HPF compiler for a set of parallel machines. Compilation techniques such as data and computation distribution, communication generation, run-time support, and optimization issues are elaborated as the basis for an HPF compiler implementation on distributed memory machines. The performance of this compiler on benchmark programs demonstrates that high efficiency can be achieved executing HPF code on parallel architectures.

  10. SuperLU_DIST: A scalable distributed-memory sparse direct solver for unsymmetric linear systems

    Energy Technology Data Exchange (ETDEWEB)

    Li, Xiaoye S.; Demmel, James W.

    2002-03-27

    In this paper, we present the main algorithmic features in the software package SuperLU_DIST, a distributed-memory sparse direct solver for large sets of linear equations. We give in detail our parallelization strategies, with focus on scalability issues, and demonstrate the parallel performance and scalability on current machines. The solver is based on sparse Gaussian elimination, with an innovative static pivoting strategy proposed earlier by the authors. The main advantage of static pivoting over classical partial pivoting is that it permits a priori determination of data structures and communication pattern for sparse Gaussian elimination, which makes it more scalable on distributed memory machines. Based on this a priori knowledge, we designed highly parallel and scalable algorithms for both LU decomposition and triangular solve and we show that they are suitable for large-scale distributed memory machines.

  11. Memory

    Science.gov (United States)

    ... it has to decide what is worth remembering. Memory is the process of storing and then remembering this information. There are different types of memory. Short-term memory stores information for a few ...

  12. External parallel sorting with multiprocessor computers

    International Nuclear Information System (INIS)

    Comanceau, S.I.

    1984-01-01

    This article describes methods of external sorting in which the entire main computer memory is used for the internal sorting of entries, forming out of them sorted segments of the greatest possible size, and outputting them to external memories. The obtained segments are merged into larger segments until all entries form one ordered segment. The described methods are suitable for sequential files stored on magnetic tape. The needs of the sorting algorithm can be met by using the relatively slow peripheral storage devices (e.g., tapes, disks, drums). The efficiency of the external sorting methods is determined by calculating the total sorting time as a function of the number of entries to be sorted and the number of parallel processors participating in the sorting process
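
    The two phases described above (forming sorted segments that fill main memory, then merging segments until a single ordered segment remains) can be sketched as follows. This is a single-process illustration, not the article's multiprocessor method: in-memory lists stand in for tape or disk files, and `memory_capacity` and `fan_in` are hypothetical parameters introduced here.

```python
import heapq


def make_runs(records, memory_capacity):
    """Sort memory-sized chunks of the input; each sorted chunk is one run."""
    runs = []
    for start in range(0, len(records), memory_capacity):
        runs.append(sorted(records[start:start + memory_capacity]))
    return runs


def merge_pass(runs, fan_in):
    """Merge groups of `fan_in` runs into longer runs (one pass over the data)."""
    merged = []
    for start in range(0, len(runs), fan_in):
        group = runs[start:start + fan_in]
        # heapq.merge reads its sorted inputs sequentially, as a tape merge would.
        merged.append(list(heapq.merge(*group)))
    return merged


def external_sort(records, memory_capacity, fan_in=2):
    """Return (sorted records, number of merge passes performed)."""
    if memory_capacity < 1 or fan_in < 2:
        raise ValueError("memory_capacity must be >= 1 and fan_in >= 2")
    runs = make_runs(records, memory_capacity)
    passes = 0
    while len(runs) > 1:
        runs = merge_pass(runs, fan_in)
        passes += 1
    return (runs[0] if runs else []), passes
```

    The pass count returned alongside the result gives a rough handle on how total sorting time grows with the number of entries: each pass reads and writes every record once, so larger initial segments or a higher merge order reduce the number of passes.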

  13. Data Provenance for Agent-Based Models in a Distributed Memory

    Directory of Open Access Journals (Sweden)

    Delmar B. Davis

    2018-04-01

    Agent-Based Models (ABMs) assist with studying emergent collective behavior of individual entities in social, biological, economic, network, and physical systems. Data provenance can support ABM by explaining individual agent behavior. However, there is no provenance support for ABMs in a distributed setting. The Multi-Agent Spatial Simulation (MASS) library provides a framework for simulating ABMs at fine granularity, where agents and spatial data are shared application resources in a distributed memory. We introduce a novel approach to capture ABM provenance in a distributed memory, called ProvMASS. We evaluate our technique with traditional data provenance queries and performance measures. Our results indicate that a configurable approach can capture provenance that explains coordination of distributed shared resources, simulation logic, and agent behavior while limiting performance overhead. We also show the ability to support practical analyses (e.g., agent tracking) and storage requirements for different capture configurations.

  14. Power profiling of Cholesky and QR factorizations on distributed memory systems

    KAUST Repository

    Bosilca, George

    2012-08-30

    This paper presents the power profile of two high performance dense linear algebra libraries on distributed memory systems, ScaLAPACK and DPLASMA. From the algorithmic perspective, their methodologies are opposite. The former is based on block algorithms and relies on multithreaded BLAS and a two-dimensional block cyclic data distribution to achieve high parallel performance. The latter is based on tile algorithms running on top of a tile data layout and uses fine-grained task parallelism combined with a dynamic distributed scheduler (DAGuE) to leverage distributed memory systems. We present performance results (Gflop/s) as well as the power profile (Watts) of two common dense factorizations needed to solve linear systems of equations, namely Cholesky and QR. The reported numbers show that DPLASMA surpasses ScaLAPACK not only in terms of performance (up to 2X speedup) but also in terms of energy efficiency (up to 62 %). © 2012 Springer-Verlag (outside the USA).
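
    The two-dimensional block cyclic data distribution mentioned above assigns matrix blocks to a grid of processes in round-robin fashion along both dimensions. The sketch below shows only that ownership rule, assuming the distribution starts at process (0, 0); it does not call ScaLAPACK or DPLASMA, and the function names are illustrative.

```python
def block_owner(block_row, block_col, grid_rows, grid_cols):
    """Process-grid coordinates that own a given matrix block."""
    return (block_row % grid_rows, block_col % grid_cols)


def layout(num_block_rows, num_block_cols, grid_rows, grid_cols):
    """Map each process coordinate to the list of blocks it owns."""
    owned = {}
    for i in range(num_block_rows):
        for j in range(num_block_cols):
            owner = block_owner(i, j, grid_rows, grid_cols)
            owned.setdefault(owner, []).append((i, j))
    return owned
```

    On a 2 x 3 process grid, a matrix of 4 x 6 blocks gives every process four blocks scattered across the matrix, which is what keeps the load balanced as a factorization shrinks the active part of the matrix.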

  15. Scientific applications and numerical algorithms on the midas multiprocessor system

    International Nuclear Information System (INIS)

    Logan, D.; Maples, C.

    1986-01-01

    The MIDAS multiprocessor system is a multi-level, hierarchical structure designed at the Advanced Computer Architecture Laboratory of the University of California's Lawrence Berkeley Laboratory. A two-stage, 11-processor system has been operational for over a year and is currently undergoing expansion. It has been employed to investigate the performance of different methods of decomposing various problems and algorithms into a multiprocessor environment. The results of such tests on a variety of applications such as scientific data analysis, Monte Carlo calculations, and image processing, are discussed. Often such decompositions involve investigating the parallel structure of fundamental algorithms. Several basic algorithms dealing with random number generation, matrix diagonalization, fast Fourier transforms, and finite element methods in solving partial differential equations are also discussed. The performance and projected extensibilities of these decompositions on the MIDAS system are reported

  16. Cache-aware network-on-chip for chip multiprocessors

    Science.gov (United States)

    Tatas, Konstantinos; Kyriacou, Costas; Dekoulis, George; Demetriou, Demetris; Avraam, Costas; Christou, Anastasia

    2009-05-01

    This paper presents the hardware prototype of a Network-on-Chip (NoC) for a chip multiprocessor that provides support for cache coherence, cache prefetching and cache-aware thread scheduling. A NoC with support for these cache-related mechanisms can assist in improving system performance by reducing the cache miss ratio. The presented multi-core system employs the Data-Driven Multithreading (DDM) model of execution. In DDM, thread scheduling is done according to data availability, thus the system is aware of the threads to be executed in the near future. This characteristic of the DDM model allows for cache-aware thread scheduling and cache prefetching. The NoC prototype is a crossbar switch with output buffering that can support a cache-aware 4-node chip multiprocessor. The prototype is built on the Xilinx ML506 board equipped with a Xilinx Virtex-5 FPGA.

  17. Advanced lectures on multiprocessor programming (1/3)

    CERN Multimedia

    CERN. Geneva

    2011-01-01

    Three classes (60 mins) on Multiprocessor Programming Prof. Dr. Christoph von Praun Georg-Simon-Ohm University of Applied Sciences Nuremberg, Germany This is an advanced class on multiprocessor programming. The class gives an introduction to principles of concurrent objects and the notion of different progress guarantees that concurrent computations can have. The focus of this class is on non-blocking computations, i.e. concurrent programs that do not make use of locks. We discuss the implementation of practical non-blocking data structures in detail. 1st class: Introduction to concurrent objects 2nd class: Principles of non-blocking synchronization 3rd class: Concurrent queues Brief Bio of Christoph von Praun Christoph worked on a variety of analysis techniques and runtime platforms for parallel programs. His most recent research studies programming models and tools that support transactional synchronization. In prior work, which he also did at the IBM T.J. Watson Research Center in Yorktown Height...

  18. Extending and implementing the Self-adaptive Virtual Processor for distributed memory architectures

    NARCIS (Netherlands)

    van Tol, M.W.; Koivisto, J.

    2011-01-01

    Many-core architectures of the future are likely to have distributed memory organizations and need fine grained concurrency management to be used effectively. The Self-adaptive Virtual Processor (SVP) is an abstract concurrent programming model which can provide this, but the model and its current

  19. Learning to read aloud: A neural network approach using sparse distributed memory

    Science.gov (United States)

    Joglekar, Umesh Dwarkanath

    1989-01-01

    An attempt to solve a problem of text-to-phoneme mapping is described which does not appear amenable to solution by use of standard algorithmic procedures. Experiments based on a model of distributed processing are also described. This model (sparse distributed memory (SDM)) can be used in an iterative supervised learning mode to solve the problem. Additional improvements aimed at obtaining better performance are suggested.

  20. Plasma physics modeling and the Cray-2 multiprocessor

    International Nuclear Information System (INIS)

    Killeen, J.

    1985-01-01

    The importance of computer modeling in the magnetic fusion energy research program is discussed. The need for the most advanced supercomputers is described. To meet the demand for more powerful scientific computers to solve larger and more complicated problems, the computer industry is developing multiprocessors. The role of the Cray-2 in plasma physics modeling is discussed with some examples. 28 refs., 2 figs., 1 tab

  1. Multiprocessor Real-Time Scheduling with Hierarchical Processor Affinities

    OpenAIRE

    Bonifaci, Vincenzo; Brandenburg, Björn; D'Angelo, Gianlorenzo; Marchetti-Spaccamela, Alberto

    2016-01-01

    Many multiprocessor real-time operating systems offer the possibility to restrict the migrations of any task to a specified subset of processors by setting affinity masks. A notion of "strong arbitrary processor affinity scheduling" (strong APA scheduling) has been proposed; this notion avoids schedulability losses due to overly simple implementations of processor affinities. Due to potential overheads, strong APA has not been implemented so far in a real-time operat...

  2. 2: Local area networks as a multiprocessor treatment planning system

    International Nuclear Information System (INIS)

    Neblett, D.L.; Hogan, S.E.

    1987-01-01

    The creation of a local area network (LAN) of interconnected computers provides an environment of multi computer processors that adds a new dimension to treatment planning. A LAN system provides the opportunity to have two or more computers working on the plan in parallel. With high speed interprocessor transfer, events such as the time consuming task of correcting several individual beams for contours and inhomogeneities can be performed simultaneously; thus, effectively creating a parallel multiprocessor treatment planning system

  3. Design considerations for a multiprocessor based data acquisition system

    International Nuclear Information System (INIS)

    Tippie, J.W.; Kulaga, J.E.

    1979-01-01

    The rapid advance of digital technology has provided the systems designer with many new design options. Hardware is no longer the controlling expense. Complex operating systems provide the flexibility and development tools needed by software designers, but restrict throughput. Multiprocessor-based systems can be used to "front-end" high-throughput applications while maintaining the many advantages offered by multitasking operating systems. The design of a high-throughput data acquisition system for application in low energy nuclear physics is considered

  4. Multiprocessor Real-Time Locking Protocols for Replicated Resources

    Science.gov (United States)

    2016-07-01

    assignment problem, the actual identities of the allocated replicas must be known. When locking protocols are used, tasks may experience delays due to both...Multiprocessor Real-Time Locking Protocols for Replicated Resources ∗ Catherine E. Jarrett1, Kecheng Yang1, Ming Yang1, Pontus Ekberg2, and James H...replicas to execute. In prior work on replicated resources, k-exclusion locks have been used, but this restricts tasks to lock only one replica at a time. To

  5. The use of fractal dimension calculation algorithm to determine the nature of autobiographical memories distribution across the life span

    Science.gov (United States)

    Mitina, Olga V.; Nourkova, Veronica V.

    In this research we propose a technique for calculating the density of events that people retrieve from autobiographical memory. We sought to demonstrate that memories are distributed non-uniformly over time and were interested in the law governing the distribution of these events across the life course.

  6. Feature-Based Visual Short-Term Memory Is Widely Distributed and Hierarchically Organized.

    Science.gov (United States)

    Dotson, Nicholas M; Hoffman, Steven J; Goodell, Baldwin; Gray, Charles M

    2018-06-15

    Feature-based visual short-term memory is known to engage both sensory and association cortices. However, the extent of the participating circuit and the neural mechanisms underlying memory maintenance is still a matter of vigorous debate. To address these questions, we recorded neuronal activity from 42 cortical areas in monkeys performing a feature-based visual short-term memory task and an interleaved fixation task. We find that task-dependent differences in firing rates are widely distributed throughout the cortex, while stimulus-specific changes in firing rates are more restricted and hierarchically organized. We also show that microsaccades during the memory delay encode the stimuli held in memory and that units modulated by microsaccades are more likely to exhibit stimulus specificity, suggesting that eye movements contribute to visual short-term memory processes. These results support a framework in which most cortical areas, within a modality, contribute to mnemonic representations at timescales that increase along the cortical hierarchy. Copyright © 2018 Elsevier Inc. All rights reserved.

  7. Immigration, language proficiency, and autobiographical memories: Lifespan distribution and second-language access.

    Science.gov (United States)

    Esposito, Alena G; Baker-Ward, Lynne

    2016-08-01

    This investigation examined two controversies in the autobiographical literature: how cross-language immigration affects the distribution of autobiographical memories across the lifespan and under what circumstances language-dependent recall is observed. Both Spanish/English bilingual immigrants and English monolingual non-immigrants participated in a cue word study, with the bilingual sample taking part in a within-subject language manipulation. The expected bump in the number of memories from early life was observed for non-immigrants but not immigrants, who reported more memories for events surrounding immigration. Aspects of the methodology addressed possible reasons for past discrepant findings. Language-dependent recall was influenced by second-language proficiency. Results were interpreted as evidence that bilinguals with high second-language proficiency, in contrast to those with lower second-language proficiency, access a single conceptual store through either language. The final multi-level model predicting language-dependent recall, including second-language proficiency, age of immigration, internal language, and cue word language, explained 3/4 of the between-person variance and 1/5 of the within-person variance. We arrive at two conclusions. First, major life transitions influence the distribution of memories. Second, concept representation across multiple languages follows a developmental model. In addition, the results underscore the importance of considering language experience in research involving memory reports.

  8. Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.

    Science.gov (United States)

    Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias

    2011-01-01

    The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E. coli, Shigella and S. pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.

  9. Parallel External Memory Graph Algorithms

    DEFF Research Database (Denmark)

    Arge, Lars Allan; Goodrich, Michael T.; Sitchinava, Nodari

    2010-01-01

    In this paper, we study parallel I/O efficient graph algorithms in the Parallel External Memory (PEM) model, one of the private-cache chip multiprocessor (CMP) models. We study the fundamental problem of list ranking which leads to efficient solutions to problems on trees, such as computing lowest...... an optimal speedup of Θ(P) in parallel I/O complexity and parallel computation time, compared to the single-processor external memory counterparts....

  10. The fast Amsterdam multiprocessor (FAMP) operation system

    International Nuclear Information System (INIS)

    Gosman, D.; Hertzberger, L.O.; Holthuizen, D.J.; Por, G.J.A.; Schoorel, M.

    1981-01-01

    The Fast Amsterdam Multi Processor system (FAMP system) is developed for on-line filtering and second stage triggering. The system is based on the MC 68000 microprocessor from MOTOROLA. In this report we will describe: The FAMP operating system software, the features of the slaves and supervisor in the FAMP operating system, the communication between supervisor and slaves using the dual port memories, the communication between user programs and the operating system. The hardware as well as the application of the system will be described elsewhere. (orig.)

  11. Mnemonic transmission, social contagion, and emergence of collective memory: Influence of emotional valence, group structure, and information distribution.

    Science.gov (United States)

    Choi, Hae-Yoon; Kensinger, Elizabeth A; Rajaram, Suparna

    2017-09-01

    Social transmission of memory and its consequence on collective memory have generated enduring interdisciplinary interest because of their widespread significance in interpersonal, sociocultural, and political arenas. We tested the influence of 3 key factors (emotional salience of information, group structure, and information distribution) on mnemonic transmission, social contagion, and collective memory. Participants individually studied emotionally salient (negative or positive) and nonemotional (neutral) picture-word pairs that were completely shared, partially shared, or unshared within participant triads, and then completed 3 consecutive recalls in 1 of 3 conditions: individual-individual-individual (control), collaborative-collaborative (identical group; insular structure)-individual, and collaborative-collaborative (reconfigured group; diverse structure)-individual. Collaboration enhanced negative memories especially in insular group structure and especially for shared information, and promoted collective forgetting of positive memories. Diverse group structure reduced this negativity effect. Unequally distributed information led to social contagion that creates false memories; diverse structure propagated a greater variety of false memories whereas insular structure promoted confidence in false recognition and false collective memory. A simultaneous assessment of network structure, information distribution, and emotional valence breaks new ground to specify how network structure shapes the spread of negative memories and false memories, and the emergence of collective memory. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  12. Fundamental Parallel Algorithms for Private-Cache Chip Multiprocessors

    DEFF Research Database (Denmark)

    Arge, Lars Allan; Goodrich, Michael T.; Nelson, Michael

    2008-01-01

    about the way cores are interconnected, for we assume that all inter-processor communication occurs through the memory hierarchy. We study several fundamental problems, including prefix sums, selection, and sorting, which often form the building blocks of other parallel algorithms. Indeed, we present...... two sorting algorithms, a distribution sort and a mergesort. Our algorithms are asymptotically optimal in terms of parallel cache accesses and space complexity under reasonable assumptions about the relationships between the number of processors, the size of memory, and the size of cache blocks....... In addition, we study sorting lower bounds in a computational model, which we call the parallel external-memory (PEM) model, that formalizes the essential properties of our algorithms for private-cache CMPs....

  13. Control and Reliability of Optical Networks in Multiprocessors

    Science.gov (United States)

    Olsen, James Jonathan

    1993-01-01

    Optical communication links have great potential to improve the performance of interconnection networks within large parallel multiprocessors, but the problems of semiconductor laser drive control and reliability inhibit their wide use. These problems have been solved in the telecommunications context, but the telecommunications solutions, based on a small number of links, are often too bulky, complex, power-hungry, and expensive to be feasible for use in a multiprocessor network with thousands of optical links. The main problems with the telecommunications approaches are that they are, by definition, designed for long-distance communication and therefore deal with communications links in isolation, instead of in an overall systems context. By taking a system-level approach to solving the laser reliability problem in a multiprocessor, and by exploiting the short-distance nature of the links, one can achieve small, simple, low-power, and inexpensive solutions, practical for implementation in the thousands of optical links that might be used in a multiprocessor. Through modeling and experimentation, I demonstrate that such system-level solutions exist, and are feasible for use in a multiprocessor network. I divide semiconductor laser reliability problems into two classes: transient errors and hard failures, and develop solutions to each type of problem in the context of a large multiprocessor. I find that for transient errors, the computer system would require a very low bit-error-rate (BER), such as 10^{-23}, if no provision were made for error control. Optical links cannot achieve such rates directly, but I find that a much more reasonable link-level BER (such as 10^{-7}) would be acceptable with simple error detection coding. I then propose a feedback system that will enable lasers to achieve these error levels even when laser threshold current varies. Instead of telecommunications techniques, which require laser output power monitors, I describe a software

  14. Multi-processor data acquisition and monitoring systems for particle physics

    International Nuclear Information System (INIS)

    White, V.; Burch, B.; Eng, K.; Heinicke, P.; Pyatetsky, M.; Ritchie, D.

    1983-01-01

    A high speed distributed processing system, using PDP-11 and VAX processors, is being developed at Fermilab. The acquisition of data is done using one or more PDP-11s. Additional processors are connected to provide either data logging or extra data analysis capabilities. Within this framework, functional interchangeability of PDP-11 and VAX processors and of the PDP-11 operating systems, RT-11 and RSX-11M, has been maintained. Inter-processor connections have been implemented in a general way using the 5 megabit DR11-W hardware currently selected for the purpose. Using this approach the authors have been able to make use of several existing data acquisition and analysis packages, such as RT/MULTI, in a multi-processor system

  15. A data base for on-line event analysis on a distributed memory machine

    International Nuclear Information System (INIS)

    Argante, E.; Meesters, M.R.J.; Willers, I.; Stok, P. van der

    1996-01-01

    Parallel in-memory databases can enhance the structuring and parallelization of programs used in High Energy Physics (HEP). Efficient database access routines are used as communication primitives which hide the communication topology, in contrast to more explicit communication libraries such as PVM or MPI. A parallel in-memory database, called SPIDER, has been implemented on a 32-node Meiko CS-2 distributed memory machine. The SPIDER primitives generate a lower overhead than that generated by PVM or MPI. The event reconstruction program, CPREAD, of the CPLEAR experiment has been used as a test case. Performance measurements showed that CPREAD interfaced to SPIDER can easily cope with the event rate generated by CPLEAR. (author)

  16. Virtual memory support for distributed computing environments using a shared data object model

    Science.gov (United States)

    Huang, F.; Bacon, J.; Mapp, G.

    1995-12-01

    Conventional storage management systems provide one interface for accessing memory segments and another for accessing secondary storage objects. This hinders application programming and affects overall system performance due to mandatory data copying and user/kernel boundary crossings, which in the microkernel case may involve context switches. Memory-mapping techniques may be used to provide programmers with a unified view of the storage system. This paper extends such techniques to support a shared data object model for distributed computing environments in which good support for coherence and synchronization is essential. The approach is based on a microkernel, typed memory objects, and integrated coherence control. A microkernel architecture is used to support multiple coherence protocols and the addition of new protocols. Memory objects are typed and applications can choose the most suitable protocols for different types of object to avoid protocol mismatch. Low-level coherence control is integrated with high-level concurrency control so that the number of messages required to maintain memory coherence is reduced and system-wide synchronization is realized without severely impacting the system performance. These features together contribute a novel approach to the support for flexible coherence under application control.

  17. Use of the CAMAC-MULTIBUS combined protocol for organizing multi-processor operation in a crate

    International Nuclear Information System (INIS)

    Glejbman, Eh.M.

    1985-01-01

    Problems of developing electronic units for large on-line systems that automate nuclear physics experiments and are built on the principles of distributed control and data processing are discussed. Crates in which CAMAC modules (EUR-4100) and modules implementing the MULTIBUS protocol on the dataway are housed and operated simultaneously are described. This is achieved by sharing the CAMAC and MULTIBUS protocols on the crate dataway. Using job scheduler and executor modules in the MULTIBUS interface makes it possible to organize multiprocessor operation, to separate data streams, and to increase the total computational capacity of the crate

  18. Cyclic executive for safety-critical Java on chip-multiprocessors

    DEFF Research Database (Denmark)

    Ravn, Anders P.; Schoeberl, Martin

    2010-01-01

    , that uses model checking to find a static schedule, if one exists at all, which gives an implementation of a table driven multiprocessor scheduler. To evaluate the proposed cyclic executive for multiprocessors we have implemented it in the context of safety-critical Java on a Java processor....

  19. The reminiscence bump without memories: The distribution of imagined word-cued and important autobiographical memories in a hypothetical 70-year-old.

    Science.gov (United States)

    Koppel, Jonathan; Berntsen, Dorthe

    2016-08-01

    The reminiscence bump is the disproportionate number of autobiographical memories dating from adolescence and early adulthood. It has often been ascribed to a consolidation of the mature self in the period covered by the bump. Here we stripped away factors relating to the characteristics of autobiographical memories per se, most notably factors that aid in their encoding or retention, by asking students to generate imagined word-cued and imagined 'most important' autobiographical memories of a hypothetical, prototypical 70-year-old of their own culture and gender. We compared the distribution of these fictional memories with the distributions of actual word-cued and most important autobiographical memories in a sample of 61-70-year-olds. We found a striking similarity between the temporal distributions of the imagined memories and the actual memories. These results suggest that the reminiscence bump is largely driven by constructive, schematic factors at retrieval, thereby challenging most existing theoretical accounts. Copyright © 2016 Elsevier Inc. All rights reserved.

  20. Distributed memory in a heterogeneous network, as used in the CERN-PS complex timing system

    CERN Document Server

    Kovaltsov, V I

    1995-01-01

    The Distributed Table Manager (DTM) is a fast and efficient utility for distributing named binary data structures called Tables, of arbitrary size and structure, around a heterogeneous network of computers to a set of registered clients. The Tables are transmitted over a UDP network between DTM servers in network format, and the servers perform the conversions to and from host format for local clients. The servers provide clients with synchronization mechanisms, a choice of network data flows, and table options such as keeping table disc copies, shared memory or heap memory table allocation, table read/write permissions, and table subnet broadcasting. DTM has been designed to be easily maintainable, and to automatically recover from the type of errors typically encountered in a large control system network. The DTM system is based on a three-level server daemon hierarchy, in which an inter-daemon protocol handles network failures, and incorporates recovery procedures which will guarantee table consistency w...
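
    The conversion between host format and network format described in this abstract can be illustrated with a minimal Python sketch. The wire layout (a 16-byte name, a 32-bit element count, then 32-bit integers) and the function names are hypothetical and are not taken from DTM; the only standard element is the use of big-endian network byte order.

```python
import struct

# Hypothetical wire layout (NOT DTM's actual format):
#   16-byte NUL-padded table name, unsigned 32-bit element count,
#   then that many signed 32-bit integers, all in network (big-endian) order.
HEADER = struct.Struct("!16sI")


def table_to_network(name, values):
    """Serialize a named table of host integers into network format."""
    header = HEADER.pack(name.encode("ascii"), len(values))
    body = struct.pack(f"!{len(values)}i", *values)
    return header + body


def table_from_network(data):
    """Parse network-format bytes back into (name, list of host integers)."""
    raw_name, count = HEADER.unpack_from(data, 0)
    values = list(struct.unpack_from(f"!{count}i", data, HEADER.size))
    return raw_name.rstrip(b"\0").decode("ascii"), values
```

    Because the byte order is fixed on the wire, a sender and a receiver with different native byte orders would decode the same values, which is the property a heterogeneous network needs.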

  1. Capacity for patterns and sequences in Kanerva's SDM as compared to other associative memory models. [Sparse, Distributed Memory

    Science.gov (United States)

    Keeler, James D.

    1988-01-01

    The information capacity of Kanerva's Sparse Distributed Memory (SDM) and Hopfield-type neural networks is investigated. Under the approximations used here, it is shown that the total information stored in these systems is proportional to the number of connections in the network. The proportionality constant is the same for the SDM and Hopfield-type models independent of the particular model, or the order of the model. The approximations are checked numerically. This same analysis can be used to show that the SDM can store sequences of spatiotemporal patterns, and the addition of time-delayed connections allows the retrieval of context dependent temporal patterns. A minor modification of the SDM can be used to store correlated patterns.

  2. Immigration, Language Proficiency, and Autobiographical Memories: Lifespan Distribution and Second-Language Access

    OpenAIRE

    Esposito, Alena G.; Baker-Ward, Lynne

    2015-01-01

    This investigation examined two controversies in the autobiographical literature: how cross-language immigration affects the distribution of autobiographical memories across the lifespan and under what circumstances language-dependent recall is observed. Both Spanish/English bilingual immigrants and English monolingual non-immigrants participated in a cue word study, with the bilingual sample taking part in a within-subject language manipulation. The expected bump in the num...

  3. More than a filter: Feature-based attention regulates the distribution of visual working memory resources.

    Science.gov (United States)

    Dube, Blaire; Emrich, Stephen M; Al-Aidroos, Naseem

    2017-10-01

    Across 2 experiments we revisited the filter account of how feature-based attention regulates visual working memory (VWM). Originally drawing from discrete-capacity ("slot") models, the filter account proposes that attention operates like the "bouncer in the brain," preventing distracting information from being encoded so that VWM resources are reserved for relevant information. Given recent challenges to the assumptions of discrete-capacity models, we investigated whether feature-based attention plays a broader role in regulating memory. Both experiments used partial report tasks in which participants memorized the colors of circle and square stimuli, and we provided a feature-based goal by manipulating the likelihood that 1 shape would be probed over the other across a range of probabilities. By decomposing participants' responses using mixture and variable-precision models, we estimated the contributions of guesses, nontarget responses, and imprecise memory representations to their errors. Consistent with the filter account, participants were less likely to guess when the probed memory item matched the feature-based goal. Interestingly, this effect varied with goal strength, even across high probabilities where goal-matching information should always be prioritized, demonstrating strategic control over filter strength. Beyond this effect of attention on which stimuli were encoded, we also observed effects on how they were encoded: Estimates of both memory precision and nontarget errors varied continuously with feature-based attention. The results offer support for an extension to the filter account, where feature-based attention dynamically regulates the distribution of resources within working memory so that the most relevant items are encoded with the greatest precision. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  4. Software for the ACP [Advanced Computer Program] multiprocessor system

    International Nuclear Information System (INIS)

    Biel, J.; Areti, H.; Atac, R.

    1987-01-01

    Software has been developed for use with the Fermilab Advanced Computer Program (ACP) multiprocessor system. The software was designed to make a system of a hundred independent node processors as easy to use as a single, powerful CPU. Subroutines have been developed by which a user's host program can send data to and get results from the program running in each of his ACP node processors. Utility programs make it easy to compile and link host and node programs, to debug a node program on an ACP development system, and to submit a debugged program to an ACP production system

  5. Global asymptotic stability analysis of bidirectional associative memory neural networks with distributed delays and impulse

    International Nuclear Information System (INIS)

    Huang Zaitang; Luo Xiaoshu; Yang Qigui

    2007-01-01

    Many systems existing in physics, chemistry, biology, engineering and information science can be characterized by impulsive dynamics caused by abrupt jumps at certain instants during the process. These complex dynamical behaviors can be modeled by impulsive differential systems or impulsive neural networks. This paper formulates and studies a new model of impulsive bidirectional associative memory (BAM) networks with finite distributed delays. Several fundamental issues, such as global asymptotic stability and existence and uniqueness of such BAM neural networks with impulse and distributed delays, are established

  6. Patterns of particle distribution in multiparticle systems by random walks with memory enhancement and decay

    Science.gov (United States)

    Tan, Zhi-Jie; Zou, Xian-Wu; Huang, Sheng-You; Zhang, Wei; Jin, Zhun-Zhi

    2002-07-01

    We investigate the pattern of particle distribution and its evolution with time in multiparticle systems using the model of random walks with memory enhancement and decay. This model describes some biological intelligent walks. With decrease in the memory decay exponent α, the distribution of particles changes from a random dispersive pattern to a locally dense one, and then returns to the random one. Correspondingly, the fractal dimension D_{f,p} characterizing the distribution of particle positions increases from a low value to a maximum and then decreases to the low one again. This is determined by the degree of overlap of regions consisting of sites with remanent information. The second moment of the density ρ^{(2)} was introduced to investigate the inhomogeneity of the particle distribution. The dependence of ρ^{(2)} on α is similar to that of D_{f,p} on α. ρ^{(2)} increases with time as a power law in the process of adjusting the particle distribution, and then ρ^{(2)} tends to a stable equilibrium value.

  7. Computational cost of isogeometric multi-frontal solvers on parallel distributed memory machines

    KAUST Repository

    Woźniak, Maciej

    2015-02-01

    This paper derives theoretical estimates of the computational cost for an isogeometric multi-frontal direct solver executed on parallel distributed memory machines. We show theoretically that for the C^{p-1} global continuity of the isogeometric solution, both the computational cost and the communication cost of a direct solver are of order O(log(N) p^2) for the one-dimensional (1D) case, O(N p^2) for the two-dimensional (2D) case, and O(N^{4/3} p^2) for the three-dimensional (3D) case, where N is the number of degrees of freedom and p is the polynomial order of the B-spline basis functions. The theoretical estimates are verified by numerical experiments performed with three parallel multi-frontal direct solvers: MUMPS, PaStiX and SuperLU, available through the PETIGA toolkit built on top of PETSc. Numerical results confirm these theoretical estimates both in terms of p and N. For a given problem size, the strong efficiency rapidly decreases as the number of processors increases, becoming about 20% for 256 processors for a 3D example with 128^3 unknowns and linear B-splines with C^0 global continuity, and 15% for a 3D example with 64^3 unknowns and quartic B-splines with C^3 global continuity. At the same time, one cannot arbitrarily increase the problem size, since the memory required by higher order continuity spaces is large, quickly consuming all the available memory resources even in the parallel distributed memory version. Numerical results also suggest that the use of distributed parallel machines is highly beneficial when solving higher order continuity spaces, although the number of processors that one can efficiently employ is somewhat limited.
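
    The leading-order cost terms quoted in this abstract can be evaluated with a small Python sketch to see how cost scales with problem size. The function name is hypothetical, constant factors are dropped, and the logarithm base (not stated in the abstract) is assumed to be 2, so only ratios between values are meaningful.

```python
import math


def solver_cost_estimate(n_dofs, p, dim):
    """Leading-order cost term from the abstract, constants dropped.

    dim=1: log(N) * p^2,  dim=2: N * p^2,  dim=3: N^(4/3) * p^2.
    Assumption: base-2 logarithm (the abstract does not state the base).
    """
    if dim == 1:
        return math.log2(n_dofs) * p**2
    if dim == 2:
        return n_dofs * p**2
    if dim == 3:
        return n_dofs ** (4.0 / 3.0) * p**2
    raise ValueError("dim must be 1, 2, or 3")


# Relative growth when a 3D problem goes from 64^3 to 128^3 unknowns at
# fixed p: the ratio is 8^(4/3) = 16, independent of p.
ratio_3d = solver_cost_estimate(128**3, 1, 3) / solver_cost_estimate(64**3, 1, 3)
```

    Under these estimates, doubling the resolution per axis in 3D multiplies the number of unknowns by 8 but the cost by about 16, which is consistent with the abstract's remark that the problem size cannot be increased arbitrarily.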

  8. Distribution of return point memory states for systems with stochastic inputs

    International Nuclear Information System (INIS)

    Amann, A; Brokate, M; Rachinskii, D; Temnov, G

    2011-01-01

    We consider the long term effect of stochastic inputs on the state of an open loop system which exhibits the so-called return point memory. An example of such a system is the Preisach model; more generally, systems with the Preisach type input-state relationship, such as in spin-interaction models, are considered. We focus on the characterisation of the expected memory configuration after the system has been affected by the input for a sufficiently long period of time. In the case where the input is given by a discrete time random walk process, or the Wiener process, simple closed form expressions for the probability density of the vector of the main input extrema recorded by the memory state, and scaling laws for the dimension of this vector, are derived. If the input is given by a general continuous Markov process, we show that the distribution of previous memory elements can be obtained from a Markov chain scheme which is derived from the solution of an associated one-dimensional escape type problem. Formulas for transition probabilities defining this Markov chain scheme are presented. Moreover, explicit formulas for the conditional probability densities of previous main extrema are obtained for the Ornstein-Uhlenbeck input process. The analytical results are confirmed by numerical experiments.

  9. Software/hardware distributed processing network supporting the Ada environment

    Science.gov (United States)

    Wood, Richard J.; Pryk, Zen

    1993-09-01

    A high-performance, fault-tolerant, distributed network has been developed, tested, and demonstrated. The network is based on the MIPS Computer Systems, Inc. R3000 RISC for processing, VHSIC ASICs for high speed, reliable, inter-node communications and compatible commercial memory and I/O boards. The network is an evolution of the Advanced Onboard Signal Processor (AOSP) architecture. It supports Ada application software with an Ada-implemented operating system. A six-node implementation (capable of expansion up to 256 nodes) of the RISC multiprocessor architecture provides 120 MIPS of scalar throughput, 96 Mbytes of RAM and 24 Mbytes of non-volatile memory. The network provides for all ground processing applications, has merit for space-qualified RISC-based networks, and interfaces to advanced Computer Aided Software Engineering (CASE) tools for application software development.

  10. A combined PLC and CPU approach to multiprocessor control

    International Nuclear Information System (INIS)

    Harris, J.J.; Broesch, J.D.; Coon, R.M.

    1995-10-01

    A sophisticated multiprocessor control system has been developed for use in the E-Power Supply System Integrated Control (EPSSIC) on the DIII-D tokamak. EPSSIC provides control and interlocks for the ohmic heating coil power supply and its associated systems. Of particular interest is the architecture of this system: both a Programmable Logic Controller (PLC) and a Central Processor Unit (CPU) have been combined on a standard VME bus. The PLC and CPU input and output signals are routed through signal conditioning modules, which provide the necessary voltage and ground isolation. Additionally, these modules adapt the signal levels to those of the VME I/O boards. One set of I/O signals is shared between the two processors. The resulting multiprocessor system provides a number of advantages: redundant operation for mission-critical situations, flexible communications using conventional TCP/IP protocols, the simplicity of ladder logic programming for the majority of the control code, and an easily maintained and expandable non-proprietary system

  11. Thermal-Aware Scheduling for Future Chip Multiprocessors

    Directory of Open Access Journals (Sweden)

    Pedro Trancoso

    2007-04-01

    The increased complexity and operating frequency of current single-chip microprocessors is yielding diminishing performance improvements. Consequently, major manufacturers offer chip multiprocessor (CMP) architectures in order to keep up with the expected performance gains. This architecture is successfully being introduced in many markets, including that of embedded systems. Nevertheless, the integration of several cores onto the same chip may lead to increased heat dissipation and consequently additional costs for cooling, higher power consumption, decreased reliability, and thermal-induced performance loss, among others. In this paper, we analyze the evolution of the thermal issues for future chip multiprocessor architectures and show that as the number of on-chip cores increases, the thermal-induced problems will worsen. In addition, we present several scenarios that result in excessive thermal stress to the CMP chip or significant performance loss. In order to minimize or even eliminate these problems, we propose thermal-aware scheduler (TAS) algorithms. When assigning processes to cores, TAS takes their temperature and cooling ability into account in order to avoid thermal stress and at the same time improve the performance. Experimental results have shown that a TAS algorithm that also considers the temperatures of neighboring cores is able to significantly reduce the temperature-induced performance loss while at the same time decreasing the chip's temperature across many different operation and configuration scenarios.
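
    The idea of choosing a core by its own temperature and that of its neighbors can be sketched in Python as follows. This is only an illustration under stated assumptions: the grid layout, the scoring rule, and the `neighbor_weight` parameter are hypothetical and are not the TAS algorithms evaluated in the paper.

```python
def pick_core(temps, busy, neighbor_weight=0.5):
    """Return (row, col) of the idle core with the lowest thermal score.

    temps: 2D list of core temperatures; busy: 2D list of booleans.
    Hypothetical score = own temperature + neighbor_weight * mean
    temperature of the 4-connected neighbors. Returns None if all
    cores are busy.
    """
    rows, cols = len(temps), len(temps[0])
    best, best_score = None, float("inf")
    for r in range(rows):
        for c in range(cols):
            if busy[r][c]:
                continue
            nbrs = [temps[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols]
            nbr_mean = sum(nbrs) / len(nbrs) if nbrs else 0.0
            score = temps[r][c] + neighbor_weight * nbr_mean
            if score < best_score:
                best, best_score = (r, c), score
    return best
```

    With `neighbor_weight=0.0` the rule degenerates to picking the coolest idle core; a positive weight can prefer a slightly warmer core that sits among cooler neighbors.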

  12. A Screen Space GPGPU Surface LIC Algorithm for Distributed Memory Data Parallel Sort Last Rendering Infrastructures

    Energy Technology Data Exchange (ETDEWEB)

    Loring, Burlen; Karimabadi, Homa; Roytershteyn, Vadim

    2014-07-01

    The surface line integral convolution (LIC) visualization technique produces dense visualization of vector fields on arbitrary surfaces. We present a screen space surface LIC algorithm for use in distributed memory data parallel sort-last rendering infrastructures. The motivations for our work are to support analysis of datasets that are too large to fit in the main memory of a single computer and to remain compatible with prevalent parallel scientific visualization tools such as ParaView and VisIt. By working in screen space using OpenGL we can leverage the computational power of GPUs when they are available and run without them when they are not. We address efficiency and performance issues that arise from the transformation of data from physical to screen space by selecting an alternate screen space domain decomposition. We analyze the algorithm's scaling behavior with and without GPUs on two high performance computing systems using data from turbulent plasma simulations.

  13. A distributed-memory hierarchical solver for general sparse linear systems

    Energy Technology Data Exchange (ETDEWEB)

    Chen, Chao [Stanford Univ., CA (United States). Inst. for Computational and Mathematical Engineering; Pouransari, Hadi [Stanford Univ., CA (United States). Dept. of Mechanical Engineering; Rajamanickam, Sivasankaran [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States). Center for Computing Research; Boman, Erik G. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States). Center for Computing Research; Darve, Eric [Stanford Univ., CA (United States). Inst. for Computational and Mathematical Engineering and Dept. of Mechanical Engineering

    2017-12-20

    We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits the low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct solver or as a preconditioner. The parallel algorithm is based on data decomposition and requires only local communication for updating boundary data on every processor. Moreover, the computation-to-communication ratio of the parallel algorithm is approximately the volume-to-surface-area ratio of the subdomain owned by every processor. We also provide various numerical results to demonstrate the versatility and scalability of the parallel algorithm.
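
    The volume-to-surface-area argument in this abstract can be made concrete with a short Python sketch. It assumes cubic subdomains of n x n x n unknowns, one unit of work per unknown, and one unit of communication per boundary face unknown; these assumptions and the function names are illustrative and are not taken from the paper.

```python
def compute_to_comm_ratio(n):
    """Volume-to-surface-area ratio of an n x n x n cubic subdomain."""
    volume = n**3        # proxy for local computation
    surface = 6 * n**2   # proxy for boundary data exchanged
    return volume / surface


def subdomain_edge(total_edge, procs_per_edge):
    """Edge length of each subdomain when a cube is split evenly per axis."""
    if total_edge % procs_per_edge:
        raise ValueError("grid does not divide evenly")
    return total_edge // procs_per_edge
```

    For a 120^3 grid, 8 processors (2 per axis) give subdomains of edge 60 and a ratio of 10, while 64 processors (4 per axis) give edge 30 and a ratio of 5, so under these assumptions the ratio falls linearly with the subdomain edge length as processors are added.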

  14. Multi-processor developments in the United States for future high energy physics experiments and accelerators

    International Nuclear Information System (INIS)

    Gaines, I.

    1988-03-01

    The use of multi-processors for analysis and high-level triggering in High Energy Physics experiments, pioneered by the early emulator systems, has reached maturity, in particular with the multiple microprocessor systems in use at Fermilab. It is widely acknowledged that such systems will fulfill the major portion of the computing needs of future large experiments. Recent developments at Fermilab's Advanced Computer Program will make such systems even more powerful, cost-effective, and easier to use than they are at present. The next generation of microprocessors, already available, will provide CPU power of about one VAX 780 equivalent/$300, while supporting most VMS FORTRAN extensions and large (>8MB) amounts of memory. Low cost high density mass storage devices (based on video tape cartridge technology) will allow parallel I/O to remove potential I/O bottlenecks in systems of over 1000 VAX-equivalent processors. New interconnection schemes and system software will allow more flexible topologies and extremely high data bandwidth, especially for on-line systems. This talk will summarize the work at the Advanced Computer Program and the rest of the US in this field. 3 refs., 4 figs

  15. The Cortex Transform as an image preprocessor for sparse distributed memory: An initial study

    Science.gov (United States)

    Olshausen, Bruno; Watson, Andrew

    1990-01-01

    An experiment is described which was designed to evaluate the use of the Cortex Transform as an image preprocessor for Sparse Distributed Memory (SDM). In the experiment, a set of images was injected with Gaussian noise, preprocessed with the Cortex Transform, and then encoded into bit patterns. The various spatial frequency bands of the Cortex Transform were encoded separately so that they could be evaluated based on their ability to properly cluster patterns belonging to the same class. The results of this study indicate that by simply encoding the low pass band of the Cortex Transform, a very suitable input representation for the SDM can be achieved.

  16. Convergence dynamics of hybrid bidirectional associative memory neural networks with distributed delays

    International Nuclear Information System (INIS)

    Liao Xiaofeng; Wong, K.-W.; Yang Shizhong

    2003-01-01

    In this Letter, the characteristics of the convergence dynamics of hybrid bidirectional associative memory neural networks with distributed transmission delays are studied. Without assuming the symmetry of synaptic connection weights and the monotonicity and differentiability of activation functions, the Lyapunov functionals are constructed and the generalized Halanay-type inequalities are employed to derive the delay-independent sufficient conditions under which the networks converge exponentially to the equilibria associated with temporally uniform external inputs. Some examples are given to illustrate the correctness of our results

  17. Memory

    OpenAIRE

    Wager, Nadia

    2017-01-01

    This chapter will explore a response to traumatic victimisation which has divided the opinions of psychologists at an exponential rate. We will be examining amnesia for memories of childhood sexual abuse and the potential to recover these memories in adulthood. Whilst this phenomenon is generally accepted in clinical circles, it is seen as highly contentious amongst research psychologists, particularly experimental cognitive psychologists. The chapter will begin with a real case study of a wo...

  18. System-Level Design Methodologies for Networked Multiprocessor Systems-on-Chip

    DEFF Research Database (Denmark)

    Virk, Kashif Munir

    2008-01-01

    is the first such attempt in the published literature. The second part of the thesis deals with the issues related to the development of system-level design methodologies for networked multiprocessor systems-on-chip at various levels of design abstraction with special focus on the modeling and design...... at the system-level. The multiprocessor modeling framework is then extended to include models of networked multiprocessor systems-on-chip which is then employed to model wireless sensor networks both at the sensor node level as well as the wireless network level. In the third and the final part, the thesis...... to the transaction-level model. The thesis, as a whole makes contributions by describing a design methodology for networked multiprocessor embedded systems at three layers of abstraction from system-level through transaction-level to the cycle accurate level as well as demonstrating it practically by implementing...

  19. Operating system for a real-time multiprocessor propulsion system simulator. User's manual

    Science.gov (United States)

    Cole, G. L.

    1985-01-01

    The NASA Lewis Research Center is developing and evaluating experimental hardware and software systems to help meet future needs for real-time, high-fidelity simulations of air-breathing propulsion systems. Specifically, the real-time multiprocessor simulator project focuses on the use of multiple microprocessors to achieve the required computing speed and accuracy at relatively low cost. Operating systems for such hardware configurations are generally not available. A real time multiprocessor operating system (RTMPOS) that supports a variety of multiprocessor configurations was developed at Lewis. With some modification, RTMPOS can also support various microprocessors. RTMPOS, by means of menus and prompts, provides the user with a versatile, user-friendly environment for interactively loading, running, and obtaining results from a multiprocessor-based simulator. The menu functions are described and an example simulation session is included to demonstrate the steps required to go from the simulation loading phase to the execution phase.

  20. Standard interfaces for program-modular multiprocessor systems

    International Nuclear Information System (INIS)

    Chernykh, E.V.

    1982-01-01

    The peculiarities of the structures of existing and newly developed standard interfaces used in automation systems for nuclear physics experiments are considered. General structural characteristics of multiprocessor system interfaces are identified. The existing CAMAC crate system and the COMPEX, E3S and FASTBUS interface standards under design are compared by capacity and relative cost. The analysis of the given data shows that operating any interface is most advantageous at rates close to its capacity, where the relative cost is at a minimum. In this case the advantage lies with interfaces of greater capacity, for which the relative cost grows more slowly under a moderate decrease in the exchange or request processing rate. A higher capacity for one-cycle exchange is provided by functional specialization of the dataway in the interface. The conclusion is drawn that the most promising trend in the development of automation systems for high energy physics experiments is the use of the FASTBUS standard

  1. A measurement-based performability model for a multiprocessor system

    Science.gov (United States)

    Hsueh, M. C.; Iyer, Ravi K.; Trivedi, K. S.

    1987-01-01

    A measurement-based performability model based on real error data collected on a multiprocessor system is described. Model development from the raw error data to the estimation of cumulative reward is described. Both normal and failure behavior of the system are characterized. The measured data show that the holding times in key operational and failure states are not simple exponential and that a semi-Markov process is necessary to model the system behavior. A reward function, based on the service rate and the error rate in each state, is then defined in order to estimate the performability of the system and to depict the cost of different failure types and recovery procedures.
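
    How a per-state reward combines with a semi-Markov model can be sketched in Python. The abstract does not give the reward formula, so `state_reward` below is an assumed form (useful service as a fraction of all events in a state), and all names and numbers are hypothetical. The time-fraction weighting itself is the standard result for semi-Markov processes: the long-run fraction of time in state i is proportional to the embedded-chain visit probability times the mean holding time.

```python
def state_reward(service_rate, error_rate):
    """Assumed reward form (not from the abstract): share of useful work."""
    return service_rate / (service_rate + error_rate)


def expected_reward_rate(visit_probs, mean_holding_times, rewards):
    """Long-run reward per unit time for a semi-Markov process.

    visit_probs: stationary probabilities of the embedded jump chain.
    mean_holding_times: mean time spent in each state per visit.
    The time fraction in state i is v_i * h_i / sum_j(v_j * h_j).
    """
    weights = [v * h for v, h in zip(visit_probs, mean_holding_times)]
    total = sum(weights)
    return sum(w * r for w, r in zip(weights, rewards)) / total
```

    For example, a two-state model that alternates between a normal state (mean holding time 9, reward 1.0) and an error state (mean holding time 1, reward 0.0) spends nine tenths of its time earning reward, so the long-run reward rate is 0.9; multiplying by an observation period gives an estimate of the cumulative reward.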

  2. Analysis and Optimisation of Hierarchically Scheduled Multiprocessor Embedded Systems

    DEFF Research Database (Denmark)

    Pop, Traian; Pop, Paul; Eles, Petru

    2008-01-01

    We present an approach to the analysis and optimisation of heterogeneous multiprocessor embedded systems. The systems are heterogeneous not only in terms of hardware components, but also in terms of communication protocols and scheduling policies. When several scheduling policies share a resource......, they are organised in a hierarchy. In this paper, we first develop a holistic scheduling and schedulability analysis that determines the timing properties of a hierarchically scheduled system. Second, we address design problems that are characteristic to such hierarchically scheduled systems: assignment...... of scheduling policies to tasks, mapping of tasks to hardware components, and the scheduling of the activities. We also present several algorithms for solving these problems. Our heuristics are able to find schedulable implementations under limited resources, achieving an efficient utilisation of the system...

  3. Recommending the heterogeneous cluster type multi-processor system computing

    International Nuclear Information System (INIS)

    Iijima, Nobukazu

    2010-01-01

    A real-time reactor simulator had been developed by reusing equipment from the Musashi reactor. Improving its performance became indispensable for its use as a research tool: the sampling rate had to be increased by introducing arithmetic units built from a multi-Digital Signal Processor (DSP) system (a cluster). To realize heterogeneous cluster-type multi-processor computing, a combination of two kinds of Control Processor (CP), the Cluster Control Processor (CCP) and the System Control Processor (SCP), was proposed, together with a Large System Control Processor (LSCP) for hierarchical clusters where needed. Simulation results for the simultaneous execution of multiple jobs and for pipeline processing between clusters confirmed the faster computing performance of this system and showed that it makes effective use of the existing system and improves cost performance. (T. Tanaka)

  4. Behavioral Simulation and Performance Evaluation of Multi-Processor Architectures

    Directory of Open Access Journals (Sweden)

    Ausif Mahmood

    1996-01-01

    The development of multi-processor architectures requires extensive behavioral simulations to verify the correctness of design and to evaluate its performance. A high level language can provide maximum flexibility in this respect if the constructs for handling concurrent processes and a time mapping mechanism are added. This paper describes a novel technique for emulating hardware processes involved in a parallel architecture such that an object-oriented description of the design is maintained. The communication and synchronization between hardware processes is handled by splitting the processes into their equivalent subprograms at the entry points. The proper scheduling of these subprograms is coordinated by a timing wheel which provides a time mapping mechanism. Finally, a high level language pre-processor is proposed so that the timing wheel and the process emulation details can be made transparent to the user.

  5. Discrete Ziggurat: A time-memory trade-off for sampling from a Gaussian distribution over the integers

    NARCIS (Netherlands)

    Buchmann, J.; Cabarcas, D.; Göpfert, F.; Hülsing, A.T.; Weiden, P.; Lange, T.; Lauter, K.; Lisonek, P.

    2014-01-01

    Several lattice-based cryptosystems require sampling from a discrete Gaussian distribution over the integers. Existing methods to sample from such a distribution either need large amounts of memory or are very slow. In this paper we explore a different method that allows for a flexible

  6. Runtime adaptive multi-processor system-on-chip: RAMPSoC

    OpenAIRE

    Göhringer, D.; Hübner, M.; Schatz, V.; Becker, J.

    2008-01-01

    Current trends in high performance computing show that the use of multiprocessor systems on chip is one approach to meeting the requirements of compute-intensive applications. Multiprocessor system on chip (MPSoC) approaches often provide a static and homogeneous infrastructure of networked microprocessors on the chip die. A novel idea in this research area is to introduce the dynamic adaptivity of reconfigurable hardware in order to provide a flexible heterogeneous set of processing elemen...

  7. Safety-critical Java with cyclic executives on chip-multiprocessors

    DEFF Research Database (Denmark)

    Ravn, Anders P.; Schoeberl, Martin

    2012-01-01

    Chip-multiprocessors offer increased processing power at a low cost. However, in order to use them for real-time systems, tasks have to be scheduled efficiently and predictably. It is well known that finding optimal schedules is a computationally hard problem. In this paper we present a solution ...... for multiprocessors, we have implemented it in the context of safety-critical Java on a Java processor....

  8. Accelerated Cyclic Reduction: A Distributed-Memory Fast Solver for Structured Linear Systems

    KAUST Repository

    Chávez, Gustavo

    2017-12-15

    We present Accelerated Cyclic Reduction (ACR), a distributed-memory fast solver for rank-compressible block tridiagonal linear systems arising from the discretization of elliptic operators, developed here for three dimensions. Algorithmic synergies between Cyclic Reduction and hierarchical matrix arithmetic operations result in a solver that has O(kN log N (log N + k^2)) arithmetic complexity and O(kN log N) memory footprint, where N is the number of degrees of freedom and k is the rank of a block in the hierarchical approximation, and which exhibits substantial concurrency. We provide a baseline for performance and applicability by comparing with the multifrontal method with and without hierarchical semi-separable matrices, with algebraic multigrid and with the classic cyclic reduction method. Over a set of large-scale elliptic systems with features of nonsymmetry and indefiniteness, the robustness of the direct solvers extends beyond that of the multigrid solver, and relative to the multifrontal approach ACR has lower or comparable execution time and size of the factors, with substantially lower numerical ranks. ACR exhibits good strong and weak scaling in a distributed context and, as with any direct solver, is advantageous for problems that require the solution of multiple right-hand sides. Numerical experiments show that the rank k patterns are of O(1) for the Poisson equation and of O(n) for the indefinite Helmholtz equation. The solver is ideal in situations where low-accuracy solutions are sufficient, or otherwise as a preconditioner within an iterative method.
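
    For readers unfamiliar with the classic method named above, the following is a minimal sketch of scalar cyclic reduction for an ordinary tridiagonal system. It is not ACR itself: ACR works on block tridiagonal systems with hierarchical matrix arithmetic, none of which is modelled here. The function name and the array convention (a = sub-diagonal, b = diagonal, c = super-diagonal, d = right-hand side, with a[0] = 0 and c[-1] = 0) are assumptions made for this illustration.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] by cyclic reduction.

    Requires a[0] == 0 and c[-1] == 0. No pivoting is done, so the matrix
    should be, for example, diagonally dominant.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Reduction: use the even-indexed equations to eliminate the even
    # unknowns from every odd-indexed equation.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        alpha = a[i] / b[i - 1]
        has_next = i + 1 < n
        gamma = c[i] / b[i + 1] if has_next else 0.0
        ra.append(-alpha * a[i - 1])
        rb.append(b[i] - alpha * c[i - 1] - (gamma * a[i + 1] if has_next else 0.0))
        rc.append(-gamma * c[i + 1] if has_next else 0.0)
        rd.append(d[i] - alpha * d[i - 1] - (gamma * d[i + 1] if has_next else 0.0))

    # The odd unknowns form a tridiagonal system of about half the size.
    x = [0.0] * n
    x[1::2] = cyclic_reduction(ra, rb, rc, rd)

    # Back-substitution: each even unknown depends only on its odd neighbours.
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x


# 1D Poisson-like matrix (2 on the diagonal, -1 off the diagonal).
a = [0.0, -1.0, -1.0, -1.0, -1.0]
b = [2.0, 2.0, 2.0, 2.0, 2.0]
c = [-1.0, -1.0, -1.0, -1.0, 0.0]
d = [0.0, 0.0, 0.0, 0.0, 6.0]
print(cyclic_reduction(a, b, c, d))
```

    Within one reduction level, every odd-indexed update is independent of the others, and the recursion depth grows only logarithmically with the system size; this is the source of the concurrency that the abstract refers to.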

  9. Accelerated Cyclic Reduction: A Distributed-Memory Fast Solver for Structured Linear Systems

    KAUST Repository

    Chávez, Gustavo; Turkiyyah, George; Zampini, Stefano; Ltaief, Hatem; Keyes, David E.

    2017-01-01

    We present Accelerated Cyclic Reduction (ACR), a distributed-memory fast solver for rank-compressible block tridiagonal linear systems arising from the discretization of elliptic operators, developed here for three dimensions. Algorithmic synergies between Cyclic Reduction and hierarchical matrix arithmetic operations result in a solver that has O(kN log N (log N + k^2)) arithmetic complexity and O(kN log N) memory footprint, where N is the number of degrees of freedom and k is the rank of a block in the hierarchical approximation, and which exhibits substantial concurrency. We provide a baseline for performance and applicability by comparing with the multifrontal method with and without hierarchical semi-separable matrices, with algebraic multigrid and with the classic cyclic reduction method. Over a set of large-scale elliptic systems with features of nonsymmetry and indefiniteness, the robustness of the direct solvers extends beyond that of the multigrid solver, and relative to the multifrontal approach ACR has lower or comparable execution time and size of the factors, with substantially lower numerical ranks. ACR exhibits good strong and weak scaling in a distributed context and, as with any direct solver, is advantageous for problems that require the solution of multiple right-hand sides. Numerical experiments show that the rank k patterns are of O(1) for the Poisson equation and of O(n) for the indefinite Helmholtz equation. The solver is ideal in situations where low-accuracy solutions are sufficient, or otherwise as a preconditioner within an iterative method.

  10. Memories.

    Science.gov (United States)

    Brand, Judith, Ed.

    1998-01-01

    This theme issue of the journal "Exploring" covers the topic of "memories" and describes an exhibition at San Francisco's Exploratorium that ran from May 22, 1998 through January 1999 and that contained over 40 hands-on exhibits, demonstrations, artworks, images, sounds, smells, and tastes that demonstrated and depicted the biological,…

  11. A QDWH-Based SVD Software Framework on Distributed-Memory Manycore Systems

    KAUST Repository

    Sukkari, Dalal

    2017-01-01

    This paper presents a high performance software framework for computing a dense SVD on distributed-memory manycore systems. Originally introduced by Nakatsukasa et al. (Nakatsukasa et al. 2010; Nakatsukasa and Higham 2013), the SVD solver relies on the polar decomposition using the QR Dynamically-Weighted Halley algorithm (QDWH). Although the QDWH-based SVD algorithm performs a significant amount of extra floating-point operations compared to the traditional SVD with the one-stage bidiagonal reduction, the inherent high level of concurrency associated with Level 3 BLAS compute-bound kernels ultimately compensates for the arithmetic complexity overhead. Using the ScaLAPACK two-dimensional block cyclic data distribution with a rectangular processor topology, the resulting QDWH-SVD further reduces excessive communications during the panel factorization, while increasing the degree of parallelism during the update of the trailing submatrix, as opposed to relying on the default square processor grid. After detailing the algorithmic complexity and the memory footprint of the algorithm, we conduct a thorough performance analysis and study the impact of the grid topology on the performance by looking at the communication and computation profiling trade-offs. We report performance results against state-of-the-art existing QDWH software implementations (e.g., Elemental) and their SVD extensions on large-scale distributed-memory manycore systems based on commodity Intel x86 Haswell processors and Knights Landing (KNL) architecture. The QDWH-SVD framework achieves up to 3/8-fold speedups on the Haswell/KNL-based platforms, respectively, against ScaLAPACK PDGESVD and turns out to be a competitive alternative for well and ill-conditioned matrices. Finally, we derive a performance model based on these empirical results. Our QDWH-based polar decomposition and its SVD extension are freely available at https://github.com/ecrc/qdwh.git and https

  12. Diffusion with space memory modelled with distributed order space fractional differential equations

    Directory of Open Access Journals (Sweden)

    M. Caputo

    2003-06-01

    Distributed order fractional differential equations (Caputo, 1995, 2001; Bagley and Torvik, 2000a,b) were first used in the time domain; they are here considered in the space domain and introduced in the constitutive equation of diffusion. The solutions of the classic problems are obtained, with closed-form formulae. In general, the Green functions act as low-pass filters in the frequency domain. The major difference with the case when a single space fractional derivative is present in the constitutive equations of diffusion (Caputo and Plastino, 2002) is that the solutions found here are potentially more flexible to represent more complex media (Caputo, 2001a). The difference between the space memory medium and that with the time memory is that the former is more flexible to represent local phenomena while the latter is more flexible to represent variations in space. Concerning the boundary value problem, the difference with the solution of the classic diffusion medium, in the case when a constant boundary pressure is assigned and in the medium the pressure is initially nil, is that one also needs to assign the first order space derivative at the boundary.

  13. MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems.

    Science.gov (United States)

    González-Domínguez, Jorge; Liu, Yongchao; Touriño, Juan; Schmidt, Bertil

    2016-12-15

    MSAProbs is a state-of-the-art protein multiple sequence alignment tool based on hidden Markov models. It can achieve high alignment accuracy at the expense of relatively long runtimes for large-scale input datasets. In this work we present MSAProbs-MPI, a distributed-memory parallel version of the multithreaded MSAProbs tool that is able to reduce runtimes by exploiting the compute capabilities of common multicore CPU clusters. Our performance evaluation on a cluster with 32 nodes (each containing two Intel Haswell processors) shows reductions in execution time of over one order of magnitude for typical input datasets. Furthermore, MSAProbs-MPI using eight nodes is faster than the GPU-accelerated QuickProbs running on a Tesla K20. Another strong point is that MSAProbs-MPI can deal with large datasets for which MSAProbs and QuickProbs might fail due to time and memory constraints, respectively. Source code in C++ and MPI running on Linux systems as well as a reference manual are available at http://msaprobs.sourceforge.net. Contact: jgonzalezd@udc.es. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved.

  14. Scaling Techniques for Massive Scale-Free Graphs in Distributed (External) Memory

    KAUST Repository

    Pearce, Roger

    2013-05-01

    We present techniques to process large scale-free graphs in distributed memory. Our aim is to scale to trillions of edges, and our research is targeted at leadership class supercomputers and clusters with local non-volatile memory, e.g., NAND Flash. We apply an edge list partitioning technique, designed to accommodate high-degree vertices (hubs) that create scaling challenges when processing scale-free graphs. In addition to partitioning hubs, we use ghost vertices to represent the hubs to reduce communication hotspots. We present a scaling study with three important graph algorithms: Breadth-First Search (BFS), K-Core decomposition, and Triangle Counting. We also demonstrate scalability on BG/P Intrepid by comparing to best known Graph500 results. We show results on two clusters with local NVRAM storage that are capable of traversing trillion-edge scale-free graphs. By leveraging node-local NAND Flash, our approach can process thirty-two times larger datasets with only a 39% performance degradation in Traversed Edges Per Second (TEPS). © 2013 IEEE.
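
    Of the three algorithms in the scaling study, K-Core decomposition is perhaps the least widely known. The sketch below shows the usual serial peeling formulation on a small in-memory graph, purely to make the definition concrete. It is not the distributed or external-memory algorithm of the paper; the function name and the edge-list input format are assumptions made for this illustration.

```python
from collections import defaultdict


def k_core(edges, k):
    """Return the vertices of the k-core of an undirected graph.

    The k-core is the largest subgraph in which every vertex has at least
    k neighbours; it is found by repeatedly peeling off low-degree vertices.
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:                      # ignore self-loops
            adjacency[u].add(v)
            adjacency[v].add(u)

    degree = {v: len(nbrs) for v, nbrs in adjacency.items()}
    removed = set()
    stack = [v for v, deg in degree.items() if deg < k]

    while stack:
        v = stack.pop()
        if v in removed:
            continue
        removed.add(v)
        # Removing v lowers the degree of each surviving neighbour,
        # which may push that neighbour below the threshold as well.
        for w in adjacency[v]:
            if w not in removed:
                degree[w] -= 1
                if degree[w] < k:
                    stack.append(w)

    return set(adjacency) - removed


# A triangle (1, 2, 3) with one pendant vertex (4) attached to vertex 3.
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(sorted(k_core(edges, 2)))
```

    In a scale-free graph a single hub can appear in a very large share of the edges, so a peeling step that touches a hub generates many neighbour updates at once; this is the kind of hotspot that the hub partitioning and ghost vertices described above are meant to relieve.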

  15. A Time-predictable Memory Network-on-Chip

    DEFF Research Database (Denmark)

    Schoeberl, Martin; Chong, David VH; Puffitsch, Wolfgang

    2014-01-01

    To derive safe bounds on worst-case execution times (WCETs), all components of a computer system need to be time-predictable: the processor pipeline, the caches, the memory controller, and memory arbitration on a multicore processor. This paper presents a solution for time-predictable memory...... arbitration and access for chip-multiprocessors. The memory network-on-chip is organized as a tree with time-division multiplexing (TDM) of accesses to the shared memory. The TDM based arbitration completely decouples processor cores and allows WCET analysis of the memory accesses on individual cores without...
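
    The reason TDM arbitration decouples the cores can be seen in a small model. The sketch below assumes equal-length slots assigned round-robin, an access that may start only at the beginning of its core's slot, and an access that completes within one slot. These rules, the function names and the numbers are assumptions made for illustration; they are not taken from the paper.

```python
def tdm_access_latency(arrival_cycle, core, n_cores, slot_cycles):
    """Cycles from request arrival until the access completes (hypothetical model)."""
    period = n_cores * slot_cycles
    slot_start = core * slot_cycles
    # Wait until the next start of this core's slot (zero if arriving exactly on it).
    wait = (slot_start - arrival_cycle) % period
    return wait + slot_cycles


def worst_case_latency(core, n_cores, slot_cycles):
    """Maximum latency over every possible arrival phase within one TDM period."""
    period = n_cores * slot_cycles
    return max(tdm_access_latency(t, core, n_cores, slot_cycles)
               for t in range(period))


n_cores, slot_cycles = 4, 6
for core in range(n_cores):
    print(core, worst_case_latency(core, n_cores, slot_cycles))
```

    Under these assumptions the bound works out to one full period minus one cycle, plus one slot, for every core. It depends only on the schedule parameters and not on what the other cores request, which is the property a per-core WCET analysis needs.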

  16. Studies of electron collisions with polyatomic molecules using distributed-memory parallel computers

    International Nuclear Information System (INIS)

    Winstead, C.; Hipes, P.G.; Lima, M.A.P.; McKoy, V.

    1991-01-01

    Elastic electron scattering cross sections from 5 to 30 eV are reported for the molecules C2H4, C2H6, C3H8, Si2H6, and GeH4, obtained using an implementation of the Schwinger multichannel method for distributed-memory parallel computer architectures. These results, obtained within the static-exchange approximation, are in generally good agreement with the available experimental data. These calculations demonstrate the potential of highly parallel computation in the study of collisions between low-energy electrons and polyatomic gases. The computational methodology discussed is also directly applicable to the calculation of elastic cross sections at higher levels of approximation (target polarization) and of electronic excitation cross sections.

  17. Comparison between sparsely distributed memory and Hopfield-type neural network models

    Science.gov (United States)

    Keeler, James D.

    1986-01-01

    The Sparsely Distributed Memory (SDM) model (Kanerva, 1984) is compared to Hopfield-type neural-network models. A mathematical framework for comparing the two is developed, and the capacity of each model is investigated. The capacity of the SDM can be increased independently of the dimension of the stored vectors, whereas the Hopfield capacity is limited to a fraction of this dimension. However, the total number of stored bits per matrix element is the same in the two models, as well as for extended models with higher order interactions. The models are also compared in their ability to store sequences of patterns. The SDM is extended to include time delays so that contextual information can be used to recover sequences. Finally, it is shown how a generalization of the SDM allows storage of correlated input pattern vectors.

  18. Interoperable mesh components for large-scale, distributed-memory simulations

    International Nuclear Information System (INIS)

    Devine, K; Leung, V; Diachin, L; Miller, M

    2009-01-01

    SciDAC applications have a demonstrated need for advanced software tools to manage the complexities associated with sophisticated geometry, mesh, and field manipulation tasks, particularly as computer architectures move toward the petascale. In this paper, we describe a software component - an abstract data model and programming interface - designed to provide support for parallel unstructured mesh operations. We describe key issues that must be addressed to successfully provide high-performance, distributed-memory unstructured mesh services and highlight some recent research accomplishments in developing new load balancing and MPI-based communication libraries appropriate for leadership class computing. Finally, we give examples of the use of parallel adaptive mesh modification in two SciDAC applications.

  19. Global exponential stability of bidirectional associative memory neural networks with distributed delays

    Science.gov (United States)

    Song, Qiankun; Cao, Jinde

    2007-05-01

    A bidirectional associative memory neural network model with distributed delays is considered. By constructing a new Lyapunov functional, employing the homeomorphism theory, M-matrix theory and the inequality (a ≥ 0, b_k ≥ 0, q_k > 0 with , and r > 1), a sufficient condition is obtained to ensure the existence, uniqueness and global exponential stability of the equilibrium point for the model. Moreover, the exponential converging velocity index is estimated, which depends on the delay kernel functions and the system parameters. The results generalize and improve the earlier publications, and remove the usual assumption that the activation functions are bounded. Two numerical examples are given to show the effectiveness of the obtained results.

  20. Evict on write, a management strategy for a prefetch unit and/or first level cache in a multiprocessor system with speculative execution

    Science.gov (United States)

    Gara, Alan; Ohmacht, Martin

    2014-09-16

    In a multiprocessor system with at least two levels of cache, a speculative thread may run on a core processor in parallel with other threads. When the thread seeks to do a write to main memory, this access is to be written through the first level cache to the second level cache. After the write-through, the corresponding line is deleted from the first level cache and/or prefetch unit, so that any further accesses to the same location in main memory have to be retrieved from the second level cache. The second level cache keeps track of multiple versions of data, where more than one speculative thread is running in parallel, while the first level cache does not have any of the versions during speculation. A switch allows choosing between modes of operation of a speculation blind first level cache.
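
    The evict-on-write behaviour described above can be made concrete with a toy model of one core running one speculative thread. This is only one reading of the abstract: the class, its method names, the dictionary-based caches and the commit/abort operations are assumptions made for illustration, and real hardware concerns such as cache lines, multiple cores and version identifiers are left out.

```python
class EvictOnWriteCache:
    """Toy model: a speculation-blind L1 in front of a version-aware L2."""

    def __init__(self, committed):
        self.l1 = {}                        # address -> value, never speculative
        self.l2_committed = dict(committed)
        self.l2_speculative = {}            # this thread's speculative versions
        self.l1_hits = 0
        self.l2_accesses = 0

    def read(self, address):
        if address in self.l1:
            self.l1_hits += 1
            return self.l1[address]
        self.l2_accesses += 1
        if address in self.l2_speculative:
            # Speculative data is served from L2 and is not copied into L1.
            return self.l2_speculative[address]
        value = self.l2_committed[address]
        self.l1[address] = value            # ordinary fill on a non-speculative miss
        return value

    def write(self, address, value):
        self.l2_accesses += 1
        self.l2_speculative[address] = value    # write through to L2
        self.l1.pop(address, None)              # evict the line from L1

    def commit(self):
        self.l2_committed.update(self.l2_speculative)
        self.l2_speculative.clear()

    def abort(self):
        # L1 holds no speculative versions, so only L2 state is discarded.
        self.l2_speculative.clear()


cache = EvictOnWriteCache({0x10: 1})
cache.read(0x10)          # miss, filled into L1 from L2
cache.write(0x10, 7)      # written through, then evicted from L1
print(cache.read(0x10), 0x10 in cache.l1)
```

    One consequence visible in this model is that aborting a speculative thread needs no invalidation pass over the first level cache, because that cache never held a speculative version in the first place.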

  1. ClimateSpark: An In-memory Distributed Computing Framework for Big Climate Data Analytics

    Science.gov (United States)

    Hu, F.; Yang, C. P.; Duffy, D.; Schnase, J. L.; Li, Z.

    2016-12-01

    Massive volumes of array-based climate data are being generated by global surveillance systems and model simulations. These data are widely used to analyze environmental problems such as climate change, natural hazards, and public health. However, extracting the underlying information from these big climate datasets is challenging because processing and analyzing them is both data- and computing-intensive. To tackle these challenges, this paper proposes ClimateSpark, an in-memory distributed computing framework to support big climate data processing. In ClimateSpark, a spatiotemporal index is developed to enable Apache Spark to treat array-based climate data (e.g. netCDF4, HDF4) as native formats, stored in the Hadoop Distributed File System (HDFS) without any preprocessing. Based on the index, spatiotemporal query services are provided to retrieve data according to a defined geospatial and temporal bounding box. The data subsets are read out, and a data partition strategy is applied to split the queried data equally across the computing nodes and store them in memory as climateRDDs for processing. By leveraging Spark SQL and User Defined Functions (UDFs), climate data analysis operations can be expressed in intuitive SQL. ClimateSpark is evaluated with two use cases using the NASA Modern-Era Retrospective Analysis for Research and Applications (MERRA) climate reanalysis dataset. One use case conducts a spatiotemporal query and visualizes the subset results as an animation; the other compares different climate model outputs using a Taylor-diagram service. Experimental results show that ClimateSpark can significantly accelerate data query and processing, and enables complex analysis services to be served in SQL style.

  2. Multiprocessor systems for real-time data acquisition on the Asdex upgrade and future plasma experiments

    International Nuclear Information System (INIS)

    Zilker, M.; Hallatschek, K.; Heimann, P.; Hertweck, F.

    1999-01-01

    In this paper we present our transputer-based multitop multiprocessor systems for data acquisition, which are currently used on the Asdex upgrade experiment. The bandwidth of these systems ranges from low-speed systems like the calorimetry diagnostic up to high-speed, large data volume systems like the soft-X-ray and Mirnov diagnostics, which collect several hundreds of megabytes of data during a plasma discharge of ∼8 s. Further, we present the multitop-MX, a newly developed system based on transputers and powerPCs, which provides real-time facilities for analysing the acquired data, to generate necessary information for the dynamic adaptation of sample rates, and to deliver triggers when certain events in the plasma are detected. The algorithm running on the powerPCs performs a wavelet-like time-frequency transform. In the last part we give an outlook on how to build the next generation of data acquisition systems to be used on the future plasma experiments W7-X and ITER, but also on Asdex upgrade. The hardware of these new distributed systems should be mainly based on established industry standards like the VME-bus, PCI-bus and FiberChannel, but also emerging technologies like SCI (scalable coherent interconnect) should be considered. The systems software should be well designed with object oriented methods to simplify the maintenance process and to enable further expansions and adaptations to new problems in an easy way. (orig.)

  3. Trade-Off Exploration for Target Tracking Application in a Customized Multiprocessor Architecture

    Directory of Open Access Journals (Sweden)

    Yassin El-Hillali

    2009-01-01

    This paper presents the design of an FPGA-based multiprocessor-system-on-chip (MPSoC) architecture optimized for Multiple Target Tracking (MTT) in automotive applications. An MTT system uses an automotive radar to track the speed and relative position of all the vehicles (targets) within its field of view. As the number of targets increases, the computational needs of the MTT system also increase, making it difficult for a single processor to handle it alone. Our implementation distributes the computational load among multiple soft processor cores optimized for executing specific computational tasks. The paper explains how we designed and profiled the MTT application to partition it among different processors. It also explains how we applied different optimizations to customize the individual processor cores to their assigned tasks and to assess their impact on performance and FPGA resource utilization. The result is a complete MTT application running on an optimized MPSoC architecture that fits in a contemporary medium-sized FPGA and that meets the application's real-time constraints.

  4. Blocking Optimality in Distributed Real-Time Locking Protocols

    Directory of Open Access Journals (Sweden)

    Björn Bernhard Brandenburg

    2014-09-01

    Lower and upper bounds on the maximum priority inversion blocking (pi-blocking) that is generally unavoidable in distributed multiprocessor real-time locking protocols (where resources may be accessed only from specific synchronization processors) are established. Prior work on suspension-based shared-memory multiprocessor locking protocols (which require resources to be accessible from all processors) has established asymptotically tight bounds of Ω(m) and Ω(n) maximum pi-blocking under suspension-oblivious and suspension-aware analysis, respectively, where m denotes the total number of processors and n denotes the number of tasks. In this paper, it is shown that, in the case of distributed semaphore protocols, there exist two different task allocation scenarios that give rise to distinct lower bounds. In the case of co-hosted task allocation, where application tasks may also be assigned to synchronization processors (i.e., processors hosting critical sections), Ω(Φ · n) maximum pi-blocking is unavoidable for some tasks under any locking protocol under both suspension-aware and suspension-oblivious schedulability analysis, where Φ denotes the ratio of the maximum response time to the shortest period. In contrast, in the case of disjoint task allocation (i.e., if application tasks may not be assigned to synchronization processors), only Ω(m) and Ω(n) maximum pi-blocking is fundamentally unavoidable under suspension-oblivious and suspension-aware analysis, respectively, as in the shared-memory case. These bounds are shown to be asymptotically tight with the construction of two new distributed real-time locking protocols that ensure O(m) and O(n) maximum pi-blocking under suspension-oblivious and suspension-aware analysis, respectively.

  5. On a Multiprocessor Computer Farm for Online Physics Data Processing

    CERN Document Server

    Sinanis, N J

    1999-01-01

    The topic of this thesis is the design-phase performance evaluation of a large multiprocessor (MP) computer farm intended for the on-line data processing of the Compact Muon Solenoid (CMS) experiment. CMS is a high energy physics experiment, planned to operate at CERN (Geneva, Switzerland) during the year 2005. The CMS computer farm consists of 1,000 MP computer systems and a 1,000 X 1,000 communications switch. The approach followed for the farm performance evaluation is through simulation studies and evaluation of small prototype systems that are building blocks of the farm. For the purposes of the simulation studies, we have developed a discrete-event, event-driven simulator that is capable of describing the high-level architecture of the farm and giving estimates of the farm's performance. The simulator is designed in a modular way to facilitate the development of various modules that model the behavior of the farm building blocks at the desired level of detail. With the aid of this simulator, we make a particular...

  6. Numerical fluid flow and heat transfer calculations on multiprocessor systems

    Energy Technology Data Exchange (ETDEWEB)

    Oehman, G.A.; Malen, T.E.; Kuusela, P.

    1989-01-01

    The first part of the report presents the basic principles of parallel processing, and factors influencing the efficiency of practical applications are discussed. In a multiprocessor computer, different parts of the program code are executed in parallel, i.e. simultaneously, on different processors, and thus it becomes possible to decrease the overall computation time by a factor which in the ideal case is equal to the number of processors. The application study starts from the numerical solution of the two-dimensional Laplace equation, which describes steady heat conduction in a solid plate, and advances through the solution of the three-dimensional Laplace equation to the case of steady laminar fluid flow in a two-dimensional box at Reynolds numbers up to 20. Here the stream function-vorticity method is applied first, followed by the SIMPLER method. The conventional (sequential) numerical algorithms for these fluid flow and heat transfer problems are found not to be ideally suited for conversion to parallel computation, but speed-up ratios considerably above 50 % of the theoretical maximum are regularly achieved in the runs. The numerical procedures were coded in the OCCAM-2 language and the test runs were performed at Åbo Akademi on the experimental HATHI computers containing 16 T414 and 100 INMOS T800 transputers respectively.
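
    The starting point of the application study, steady heat conduction on a plate, can be illustrated with a short relaxation sketch, together with the speed-up arithmetic used to judge a parallel run. Jacobi relaxation is chosen here only because every grid point can be updated independently; the abstract does not say which scheme the report used, and the function names, the grid and the timings are hypothetical. The report's own code was written in OCCAM-2, not Python.

```python
def solve_laplace_jacobi(grid, tol=1e-6, max_sweeps=10_000):
    """Relax the interior of a 2D grid toward the solution of Laplace's equation.

    Boundary values (the outer rows and columns) are held fixed. Returns the
    relaxed grid and the number of sweeps performed.
    """
    rows, cols = len(grid), len(grid[0])
    current = [row[:] for row in grid]
    for sweep in range(1, max_sweeps + 1):
        updated = [row[:] for row in current]
        change = 0.0
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                # Each interior point becomes the mean of its four neighbours,
                # read from the previous sweep only, so updates are independent.
                updated[i][j] = 0.25 * (current[i - 1][j] + current[i + 1][j]
                                        + current[i][j - 1] + current[i][j + 1])
                change = max(change, abs(updated[i][j] - current[i][j]))
        current = updated
        if change < tol:
            return current, sweep
    return current, max_sweeps


def parallel_efficiency(t_serial, t_parallel, n_processors):
    """Speed-up divided by processor count (1.0 would be the ideal case)."""
    return (t_serial / t_parallel) / n_processors


# Plate with the top edge held at 100 and the other edges at 0.
plate = [[100.0] * 5] + [[0.0] * 5 for _ in range(4)]
solution, sweeps = solve_laplace_jacobi(plate)
print(sweeps, round(solution[1][2], 3))

# Hypothetical timings: 100 s serially, 12.5 s on 16 processors.
print(parallel_efficiency(100.0, 12.5, 16))
```

    An efficiency of 0.5 in this sense corresponds to the "50 % of the theoretical maximum" mark mentioned in the abstract, since the theoretical maximum speed-up equals the number of processors.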

  7. Numerical fluid flow and heat transfer calculations on multiprocessor systems

    Energy Technology Data Exchange (ETDEWEB)

    Oehman, G.A.; Malen, T.E.; Kuusela, P.

    1989-12-31

    The first part of the report presents the basic principles of parallel processing, and factors influencing the efficiency of practical applications are discussed. In a multiprocessor computer, different parts of the program code are executed in parallel, i.e. simultaneously, on different processors, and thus it becomes possible to decrease the overall computation time by a factor which in the ideal case is equal to the number of processors. The application study starts from the numerical solution of the two-dimensional Laplace equation, which describes steady heat conduction in a solid plate, and advances through the solution of the three-dimensional Laplace equation to the case of steady laminar fluid flow in a two-dimensional box at Reynolds numbers up to 20. Here the stream function-vorticity method is applied first, followed by the SIMPLER method. The conventional (sequential) numerical algorithms for these fluid flow and heat transfer problems are found not to be ideally suited for conversion to parallel computation, but speed-up ratios considerably above 50 % of the theoretical maximum are regularly achieved in the runs. The numerical procedures were coded in the OCCAM-2 language and the test runs were performed at Åbo Akademi on the experimental HATHI computers containing 16 T414 and 100 INMOS T800 transputers respectively.

  8. Evaluation of a Connectionless NoC for a Real-Time Distributed Shared Memory Many-Core System

    NARCIS (Netherlands)

    Rutgers, J.H.; Bekooij, Marco Jan Gerrit; Smit, Gerardus Johannes Maria

    2012-01-01

    Real-time embedded systems like smartphones tend to comprise an ever increasing number of processing cores. For scalability and the need for guaranteed performance, the use of a connection-oriented network-on-chip (NoC) is advocated. Furthermore, a distributed shared memory architecture is preferred

  9. Distributed cerebellar plasticity implements generalized multiple-scale memory components in real-robot sensorimotor tasks

    Directory of Open Access Journals (Sweden)

    Claudia Casellato

    2015-02-01

    The cerebellum plays a crucial role in motor learning and it acts as a predictive controller. Modeling it and embedding it into sensorimotor tasks allows us to create functional links between plasticity mechanisms, neural circuits and behavioral learning. Moreover, if applied to real-time control of a neurorobot, the cerebellar model has to deal with a real noisy and changing environment, thus showing its robustness and effectiveness in learning. A biologically inspired cerebellar model with distributed plasticity, both at cortical and nuclear sites, has been used. Two cerebellum-mediated paradigms have been designed: an associative Pavlovian task and a vestibulo-ocular reflex, with multiple sessions of acquisition and extinction and with different stimuli and perturbation patterns. The cerebellar controller succeeded in generating conditioned responses and finely tuned eye movement compensation, thus reproducing human-like behaviors. Through a productive plasticity transfer from cortical to nuclear sites, the distributed cerebellar controller showed in both tasks the capability to optimize learning on multiple time-scales, to store motor memory and to effectively adapt to dynamic ranges of stimuli.

  10. Altered distribution of peripheral blood memory B cells in humans chronically infected with Trypanosoma cruzi.

    Science.gov (United States)

    Fernández, Esteban R; Olivera, Gabriela C; Quebrada Palacio, Luz P; González, Mariela N; Hernandez-Vasquez, Yolanda; Sirena, Natalia María; Morán, María L; Ledesma Patiño, Oscar S; Postan, Miriam

    2014-01-01

    Numerous abnormalities of the peripheral blood T cell compartment have been reported in human chronic Trypanosoma cruzi infection and related to prolonged antigenic stimulation by persisting parasites. Herein, we measured circulating lymphocytes of various phenotypes based on the differential expression of CD19, CD4, CD27, CD10, IgD, IgM, IgG and CD138 in a total of 48 T. cruzi-infected individuals and 24 healthy controls. Infected individuals had decreased frequencies of CD19+CD27+ cells, which positively correlated with the frequencies of CD4+CD27+ cells. The contraction of CD19+CD27+ cells was comprised of IgG+IgD-, IgM+IgD- and isotype switched IgM-IgD- memory B cells, CD19+CD10+CD27+ B cell precursors and terminally differentiated CD19+CD27+CD138+ plasma cells. Conversely, infected individuals had increased proportions of CD19+IgG+CD27-IgD- memory and CD19+IgM+CD27-IgD+ transitional/naïve B cells. These observations prompted us to assess soluble CD27, a molecule generated by the cleavage of membrane-bound CD27 and used to monitor systemic immune activation. Elevated levels of serum soluble CD27 were observed in infected individuals with Chagas cardiomyopathy, indicating its potentiality as an immunological marker for disease progression in endemic areas. In conclusion, our results demonstrate that chronic T. cruzi infection alters the distribution of various peripheral blood B cell subsets, probably related to the CD4+ T cell deregulation process provoked by the parasite in humans.

  12. Supporting Multiprocessors in the Icecap Safety-Critical Java Run-Time Environment

    DEFF Research Database (Denmark)

    Zhao, Shuai; Wellings, Andy; Korsholm, Stephan Erbs

    The current version of the Safety Critical Java (SCJ) specification defines three compliance levels. Level 0 targets single processor programs while Level 1 and 2 can support multiprocessor platforms. Level 1 programs must be fully partitioned but Level 2 programs can also be more globally...... scheduled. As of yet, there is no official Reference Implementation for SCJ. However, the icecap project has produced a Safety-Critical Java Run-time Environment based on the Hardware-near Virtual Machine (HVM). This supports SCJ at all compliance levels and provides an implementation of the safety......-critical Java (javax.safetycritical) package. This is still work-in-progress and lacks certain key features. Among these is the ability to support multiprocessor platforms. In this paper, we explore two possible options to adding multiprocessor support to this environment: the “green thread” and the “native...

  13. A cross-cultural study of the lifespan distributions of life script events and autobiographical memories of life story events

    DEFF Research Database (Denmark)

    Zaragoza Scherman, Alejandra; Salgado, Sinué; Shao, Zhifang

    Cultural Life Script Theory provides a cultural explanation of the reminiscence bump: adults older than 40 years remember more life events happening between 15 - 30 years of age. The cultural life script represents semantic knowledge about commonly shared expectations regarding the order and timing...... of major transitional life events in an idealized life course. By comparing the lifespan distribution of life scripts events and memories of life story events, we can determine the degree to which the cultural life script serves as a recall template for autobiographical memories, especially of positive...

  14. Kmerind: A Flexible Parallel Library for K-mer Indexing of Biological Sequences on Distributed Memory Systems.

    Science.gov (United States)

    Pan, Tony; Flick, Patrick; Jain, Chirag; Liu, Yongchao; Aluru, Srinivas

    2017-10-09

    Counting and indexing fixed length substrings, or k-mers, in biological sequences is a key step in many bioinformatics tasks including genome alignment and mapping, genome assembly, and error correction. While advances in next generation sequencing technologies have dramatically reduced the cost and improved latency and throughput, few bioinformatics tools can efficiently process the datasets at the current generation rate of 1.8 terabases every 3 days. We present Kmerind, a high performance parallel k-mer indexing library for distributed memory environments. The Kmerind library provides a set of simple and consistent APIs with sequential semantics and parallel implementations that are designed to be flexible and extensible. Kmerind's k-mer counter performs similarly or better than the best existing k-mer counting tools even on shared memory systems. In a distributed memory environment, Kmerind counts k-mers in a 120 GB sequence read dataset in less than 13 seconds on 1024 Xeon CPU cores, and fully indexes their positions in approximately 17 seconds. Querying for 1% of the k-mers in these indices can be completed in 0.23 seconds and 28 seconds, respectively. Kmerind is the first k-mer indexing library for distributed memory environments, and the first extensible library for general k-mer indexing and counting. Kmerind is available at https://github.com/ParBLiSS/kmerind.
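
    The two operations named in the abstract above, counting k-mers and indexing their positions, can be sketched in a few lines. This is an illustrative sketch only and not the Kmerind API: the function names are hypothetical, and `owner_rank` merely shows one common way (hash partitioning, an assumption here) to decide which process of a distributed-memory job would hold a given k-mer.

```python
import zlib
from collections import Counter, defaultdict


def count_kmers(reads, k):
    """Count every length-k substring across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def index_kmers(reads, k):
    """Map each k-mer to the (read id, offset) pairs where it occurs."""
    index = defaultdict(list)
    for read_id, read in enumerate(reads):
        for i in range(len(read) - k + 1):
            index[read[i:i + k]].append((read_id, i))
    return index


def owner_rank(kmer, num_ranks):
    """Hypothetical partitioning rule: a stable hash picks the owning process.

    zlib.crc32 is used instead of hash() because hash() of a str is
    randomised per interpreter run and would differ between processes.
    """
    return zlib.crc32(kmer.encode("ascii")) % num_ranks


reads = ["ACGTAC", "GTACG"]
counts = count_kmers(reads, 3)
index = index_kmers(reads, 3)
print(counts["GTA"], index["GTA"])
```

    With a partitioning rule of this kind, each process only needs to store the counts and positions of the k-mers it owns, and a query for a k-mer is routed to a single process.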

  15. Hard Real-Time Performances in Multiprocessor-Embedded Systems Using ASMP-Linux

    Directory of Open Access Journals (Sweden)

    Daniel Pierre Bovet

    2008-01-01

    Multiprocessor systems, especially those based on multicore or multithreaded processors, and new operating system architectures can satisfy the ever increasing computational requirements of embedded systems. ASMP-LINUX is a modified, high responsiveness, open-source hard real-time operating system for multiprocessor systems capable of providing high real-time performance while keeping the code simple and not impacting the performance of the rest of the system. Moreover, ASMP-LINUX does not require code changes or application recompiling/relinking. In order to assess the performance of ASMP-LINUX, benchmarks have been performed on several hardware platforms and configurations.

  16. Hard Real-Time Performances in Multiprocessor-Embedded Systems Using ASMP-Linux

    Directory of Open Access Journals (Sweden)

    Betti Emiliano

    2008-01-01

    Multiprocessor systems, especially those based on multicore or multithreaded processors, and new operating system architectures can satisfy the ever increasing computational requirements of embedded systems. ASMP-LINUX is a modified, high responsiveness, open-source hard real-time operating system for multiprocessor systems capable of providing high real-time performance while keeping the code simple and not impacting the performance of the rest of the system. Moreover, ASMP-LINUX does not require code changes or application recompiling/relinking. In order to assess the performance of ASMP-LINUX, benchmarks have been performed on several hardware platforms and configurations.

  17. A system-level multiprocessor system-on-chip modeling framework

    DEFF Research Database (Denmark)

    Virk, Kashif Munir; Madsen, Jan

    2004-01-01

    We present a system-level modeling framework to model system-on-chips (SoC) consisting of heterogeneous multiprocessors and network-on-chip communication structures in order to enable the developers of today's SoC designs to take advantage of the flexibility and scalability of network-on-chip and...... SoC design. We show how a hand-held multimedia terminal, consisting of JPEG, MP3 and GSM applications, can be modeled as a multiprocessor SoC in our framework....

  18. ParBiBit: Parallel tool for binary biclustering on modern distributed-memory systems.

    Science.gov (United States)

    González-Domínguez, Jorge; Expósito, Roberto R

    2018-01-01

    Biclustering techniques are gaining attention in the analysis of large-scale datasets as they identify two-dimensional submatrices where both rows and columns are correlated. In this work we present ParBiBit, a parallel tool to accelerate the search of interesting biclusters on binary datasets, which are very popular in different fields such as genetics, marketing or text mining. It is based on the state-of-the-art sequential Java tool BiBit, which has been proved accurate by several studies, especially in scenarios that result in many large biclusters. ParBiBit uses the same methodology as BiBit (grouping the binary information into patterns) and provides the same results. Nevertheless, our tool significantly improves performance thanks to an efficient implementation based on C++11 that includes support for threads and MPI processes in order to exploit the compute capabilities of modern distributed-memory systems, which provide several multicore CPU nodes interconnected through a network. Our performance evaluation with 18 representative input datasets on two different eight-node systems shows that our tool is significantly faster than the original BiBit. Source code in C++ and MPI running on Linux systems as well as a reference manual are available at https://sourceforge.net/projects/parbibit/.
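
    The idea of "grouping the binary information into patterns" mentioned above can be sketched as follows. This is a simplified illustration under stated assumptions, not the ParBiBit or BiBit source: rows are assumed to be encoded as integer bitmasks, a pattern is taken to be the bitwise AND of a pair of rows, and the function name and the `min_rows`/`min_cols` parameters are hypothetical.

```python
from itertools import combinations


def bibit_sketch(rows, min_rows=2, min_cols=2):
    """Find biclusters of ones in a binary matrix given as row bitmasks.

    Each pair of rows yields a candidate pattern (their common columns).
    Every row that contains the whole pattern joins the bicluster.
    Returns a list of (row indices, column bitmask) tuples.
    """
    seen = set()
    biclusters = []
    for i, j in combinations(range(len(rows)), 2):
        pattern = rows[i] & rows[j]
        # Skip patterns that are too narrow or were already expanded.
        if bin(pattern).count("1") < min_cols or pattern in seen:
            continue
        seen.add(pattern)
        members = [r for r, bits in enumerate(rows) if bits & pattern == pattern]
        if len(members) >= min_rows:
            biclusters.append((members, pattern))
    return biclusters


# Four rows over four columns; bit 3 is the leftmost column.
rows = [0b1110, 0b0111, 0b1111, 0b0001]
found = bibit_sketch(rows)
for members, pattern in found:
    print(members, format(pattern, "04b"))
```

    The outer loop over row pairs is independent for each pair, so in a parallel setting the pairs could be split among threads or processes, with only the set of already-seen patterns needing coordination.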

  19. ClimateSpark: An in-memory distributed computing framework for big climate data analytics

    Science.gov (United States)

    Hu, Fei; Yang, Chaowei; Schnase, John L.; Duffy, Daniel Q.; Xu, Mengchao; Bowen, Michael K.; Lee, Tsengdar; Song, Weiwei

    2018-06-01

    The unprecedented growth of climate data creates new opportunities for climate studies, and yet big climate data pose a grand challenge to climatologists to efficiently manage and analyze big data. The complexity of climate data content and analytical algorithms increases the difficulty of implementing algorithms on high performance computing systems. This paper proposes an in-memory, distributed computing framework, ClimateSpark, to facilitate complex big data analytics and time-consuming computational tasks. Chunking data structure improves parallel I/O efficiency, while a spatiotemporal index is built for the chunks to avoid unnecessary data reading and preprocessing. An integrated, multi-dimensional, array-based data model (ClimateRDD) and ETL operations are developed to address big climate data variety by integrating the processing components of the climate data lifecycle. ClimateSpark utilizes Spark SQL and Apache Zeppelin to develop a web portal to facilitate the interaction among climatologists, climate data, analytic operations and computing resources (e.g., using SQL query and Scala/Python notebook). Experimental results show that ClimateSpark conducts different spatiotemporal data queries/analytics with high efficiency and data locality. ClimateSpark is easily adaptable to other big multiple-dimensional, array-based datasets in various geoscience domains.

  20. Modeling of long-range memory processes with inverse cubic distributions by the nonlinear stochastic differential equations

    Science.gov (United States)

    Kaulakys, B.; Alaburda, M.; Ruseckas, J.

    2016-05-01

    A well-known fact in the financial markets is the so-called ‘inverse cubic law’ of the cumulative distributions of the long-range memory fluctuations of market indicators such as a number of events of trades, trading volume and the logarithmic price change. We propose the nonlinear stochastic differential equation (SDE) giving both the power-law behavior of the power spectral density and the long-range dependent inverse cubic law of the cumulative distribution. This is achieved using the suggestion that when the market evolves from calm to violent behavior there is a decrease of the delay time of multiplicative feedback of the system in comparison to the driving noise correlation time. This results in a transition from the Itô to the Stratonovich sense of the SDE and yields a long-range memory process.
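
    The transition between the Itô and the Stratonovich sense of an SDE mentioned above changes the drift term. As a reminder, stated for a generic one-dimensional SDE rather than the specific equation of the paper (which the abstract does not give), a Stratonovich equation with drift $a(x)$ and multiplicative noise amplitude $b(x)$ is equivalent to an Itô equation with an extra drift. The power-law choice $b(x) = \sigma x^{\eta}$ in the last line is an illustrative assumption.

```latex
% Stratonovich form
dx = a(x)\,dt + b(x)\circ dW_t
% Equivalent It\^o form: the noise-induced drift (1/2) b b' is added
dx = \left[ a(x) + \tfrac{1}{2}\, b(x)\, b'(x) \right] dt + b(x)\, dW_t
% Illustrative power-law noise amplitude (assumption): b(x) = \sigma x^{\eta}
\tfrac{1}{2}\, b(x)\, b'(x) = \tfrac{1}{2}\, \sigma^{2} \eta\, x^{2\eta - 1}
```

    For multiplicative noise the two interpretations therefore describe different processes unless the drift is adjusted accordingly.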

  1. A Performance-Prediction Model for PIC Applications on Clusters of Symmetric MultiProcessors: Validation with Hierarchical HPF+OpenMP Implementation

    Directory of Open Access Journals (Sweden)

    Sergio Briguglio

    2003-01-01

    A performance-prediction model is presented, which describes different hierarchical workload decomposition strategies for particle-in-cell (PIC) codes on Clusters of Symmetric MultiProcessors. The devised workload decomposition is hierarchically structured: a higher-level decomposition among the computational nodes, and a lower-level one among the processors of each computational node. Several decomposition strategies are evaluated by means of the prediction model, with respect to the memory occupancy, the parallelization efficiency and the required programming effort. Such strategies have been implemented by integrating the high-level languages High Performance Fortran (at the inter-node stage) and OpenMP (at the intra-node one). The details of these implementations are presented, and the experimental values of parallelization efficiency are compared with the predicted results.

  2. Temporal analysis and scheduling of hard real-time radios running on a multi-processor

    NARCIS (Netherlands)

    Moreira, O.

    2012-01-01

    On a multi-radio baseband system, multiple independent transceivers must share the resources of a multi-processor, while each meets its own hard real-time requirements. Not all possible combinations of transceivers are known at compile time, so a solution must be found that either allows for

  3. DAEDALUS: System-Level Design Methodology for Streaming Multiprocessor Embedded Systems on Chips

    NARCIS (Netherlands)

    Stefanov, T.; Pimentel, A.; Nikolov, H.; Ha, S.; Teich, J.

    2017-01-01

    The complexity of modern embedded systems, which are increasingly based on heterogeneous multiprocessor system-on-chip (MPSoC) architectures, has led to the emergence of system-level design. To cope with this design complexity, system-level design aims at raising the abstraction level of the design

  4. Design of massively parallel hardware multi-processors for highly-demanding embedded applications

    NARCIS (Netherlands)

    Jozwiak, L.; Jan, Y.

    2013-01-01

    Many new embedded applications require complex computations to be performed to tight schedules, while at the same time demanding low energy consumption and low cost. For implementation of these highly-demanding applications, highly-optimized application-specific multi-processor system-on-a-chip

  5. Abstractions for aperiodic multiprocessor scheduling of real-time stream processing applications

    NARCIS (Netherlands)

    Hausmans, J.P.H.M.

    2015-01-01

    Embedded multiprocessor systems are often used in the domain of real-time stream processing applications to keep up with increasing power and performance requirements. Examples of such real-time stream processing applications are digital radio baseband processing and WLAN transceivers. These stream

  6. An FPGA design flow for reconfigurable network-based multi-processor systems on chip

    NARCIS (Netherlands)

    Kumar, A.; Hansson, M.A; Huisken, J.; Corporaal, H.

    2007-01-01

    Multi-processor systems on chip (MPSoC) platforms are becoming increasingly more heterogeneous and are shifting towards a more communication-centric methodology. Networks on chip (NoC) have emerged as the design paradigm for scalable on-chip communication architectures. As the system complexity

  7. Multi-processor system-level synthesis for multiple applications on platform FPGA

    NARCIS (Netherlands)

    Kumar, A.; Fernando, S.D.; Ha, Y.; Mesman, B.; Corporaal, H.; Bertels, Koen

    2007-01-01

    Multiprocessor systems-on-chip (MPSoC) are being developed in increasing numbers to support the high number of applications running on modern embedded systems. Designing and programming such systems prove to be a major challenge. Most of the current design methodologies rely on creating the design

  8. VME multiprocessor system for plasma control at the JT-60 Upgrade

    International Nuclear Information System (INIS)

    Kimura, T.; Kurihara, K.; Takahashi, M.; Kawamata, Y.; Akasaka, H.; Matsukawa, M.

    1989-01-01

    This paper reports the design and preliminary tests of a VME multiprocessor system for JT-60 Upgrade plasma control, which uses three MC88100-based RISC computers and VME buses. The design of the VME system was motivated by the need for faster and more accurate computation for plasma position and shape control

  9. Hybrid Simulation of the Interaction of Europa's Atmosphere with the Jovian Plasma: Multiprocessor Simulations

    Science.gov (United States)

    Dols, V. J.; Delamere, P. A.; Bagenal, F.; Cassidy, T. A.; Crary, F. J.

    2014-12-01

    We model the interaction of Europa's tenuous atmosphere with the plasma of Jupiter's torus with an improved version of our hybrid plasma code. In a hybrid plasma code, the ions are treated as kinetic macro-particles moving under the Lorentz force and the electrons as a fluid, leading to a generalized formulation of Ohm's law. In this version, the spatial simulation domain is decomposed in 2 directions and is non-uniform in the plasma convection direction. The code is run on a multi-processor supercomputer that offers 16416 cores and 2 GB of RAM per core. This new version allows us to tap into the large memory of the supercomputer and simulate the full interaction volume (R_Europa = 1561 km) with a high spatial resolution (50 km). Compared to Io, Europa's atmosphere is about 100 times more tenuous, the ambient magnetic field is weaker and the density of incident plasma is lower. Consequently, the electrodynamic interaction is also weaker and substantial fluxes of thermal torus ions might reach and sputter the icy surface. Molecular O2 is the dominant atmospheric product of this surface sputtering. Observations of oxygen UV emissions (specifically the ratio of OI 1356A / 1304A emissions) are roughly consistent with an atmosphere that is composed predominantly of O2 with a small amount of atomic O. Galileo observations along flybys close to Europa have revealed the existence of induced currents in a conducting ocean under the icy crust. They also showed that, from flyby to flyby, the plasma interaction is very variable. Asymmetries of the plasma density and temperature in the wake of Europa were also observed and still elude a clear explanation. Galileo magnetometer data also revealed ion cyclotron waves, which are an indication of heavy ion pickup close to the moon. We prescribe an O2 atmosphere with a vertical density column consistent with UV observations and model the plasma properties along several Galileo flybys of the moon. We compare our results with the magnetometer

  10. Meeting the memory challenges of brain-scale network simulation

    Directory of Open Access Journals (Sweden)

    Susanne Kunkel

    2012-01-01

    The development of high-performance simulation software is crucial for studying the brain connectome. Using connectome data to generate neurocomputational models requires software capable of coping with models on a variety of scales: from the microscale, investigating plasticity and dynamics of circuits in local networks, to the macroscale, investigating the interactions between distinct brain regions. Prior to any serious dynamical investigation, the first task of network simulations is to check the consistency of data integrated in the connectome and constrain ranges for yet unknown parameters. Thanks to distributed computing techniques, it is possible today to routinely simulate local cortical networks of around 10^5 neurons with up to 10^9 synapses on clusters and multi-processor shared-memory machines. However, brain-scale networks are one or two orders of magnitude larger than such local networks, in terms of numbers of neurons and synapses as well as in terms of computational load. Such networks have been studied in individual studies, but the underlying simulation technologies have neither been described in sufficient detail to be reproducible nor made publicly available. Here, we discover that as the network model sizes approach the regime of meso- and macroscale simulations, memory consumption on individual compute nodes becomes a critical bottleneck. This is especially relevant on modern supercomputers such as the Blue Gene/P architecture, where the available working memory per CPU core is rather limited. We develop a simple linear model to analyze the memory consumption of the constituent components of a neuronal simulator as a function of network size and the number of cores used. This approach has multiple benefits. The model enables identification of key contributing components to memory saturation and prediction of the effects of potential improvements to code before any implementation takes place.
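
    A linear memory model of the kind described above can be sketched as a sum of per-component terms. The decomposition and every coefficient below are hypothetical placeholders chosen for illustration, not values or formulas from the paper; the sketch assumes neurons and their incoming synapses are spread evenly over the cores, plus a small per-neuron bookkeeping cost that every core pays for the whole network.

```python
def memory_per_core(n_neurons, synapses_per_neuron, n_cores,
                    base_bytes=50e6,             # assumed fixed footprint of the simulator
                    bytes_per_neuron=1000.0,     # assumed cost of one locally stored neuron
                    bytes_per_synapse=50.0,      # assumed cost of one locally stored synapse
                    overhead_per_neuron=8.0):    # assumed per-core bookkeeping for EVERY neuron
    """Estimated memory (bytes) used on one core, linear in the network size."""
    local_neurons = n_neurons / n_cores
    local_synapses = local_neurons * synapses_per_neuron
    return (base_bytes
            + overhead_per_neuron * n_neurons    # does not shrink when cores are added
            + bytes_per_neuron * local_neurons
            + bytes_per_synapse * local_synapses)


# Local cortical network from the abstract: 10^5 neurons, 10^9 synapses in total.
local_net = memory_per_core(1e5, 1e4, n_cores=1000)
# Hypothetical network 100 times larger, on 100 times as many cores.
brain_scale = memory_per_core(1e7, 1e4, n_cores=100_000)
print(f"{local_net / 1e6:.1f} MB vs {brain_scale / 1e6:.1f} MB per core")
```

    In this toy setting the per-core share of neurons and synapses stays constant when network and machine grow together, yet the bookkeeping term grows with the total network size, which is one way a per-core memory ceiling could be hit at large scale.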

  11. Efficient implementation of multidimensional fast fourier transform on a distributed-memory parallel multi-node computer

    Science.gov (United States)

    Bhanot, Gyan V [Princeton, NJ; Chen, Dong [Croton-On-Hudson, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY

    2012-01-10

    The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.
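
    The sequence of steps in the abstract above (one-dimensional transforms along the first dimension, an all-to-all redistribution, one-dimensional transforms along the second dimension) can be imitated in a single process. This is a toy sketch under stated assumptions, not the patented implementation: nodes are simulated with dictionaries, a naive O(n^2) DFT stands in for a real FFT, a block distribution is assumed, and shuffling the messages only demonstrates that the result does not depend on delivery order. All names are hypothetical.

```python
import cmath
import random


def dft(vec):
    """Naive one-dimensional discrete Fourier transform (stand-in for an FFT)."""
    n = len(vec)
    return [sum(vec[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]


def owner(index, extent, num_nodes):
    """Block distribution: which simulated node owns row/column `index`."""
    return index * num_nodes // extent


def distributed_fft2(array, num_nodes, seed=0):
    n_rows, n_cols = len(array), len(array[0])

    # Step 1: every node transforms the rows it owns.
    local_rows = {node: {} for node in range(num_nodes)}
    for r, row in enumerate(array):
        local_rows[owner(r, n_rows, num_nodes)][r] = dft(row)

    # Step 2: all-to-all. Each node sends every element to the node that
    # owns its column; messages are delivered in shuffled order.
    rng = random.Random(seed)
    local_cols = {node: {} for node in range(num_nodes)}
    for node in range(num_nodes):
        messages = [(r, c, value)
                    for r, row in local_rows[node].items()
                    for c, value in enumerate(row)]
        rng.shuffle(messages)
        for r, c, value in messages:
            dest = local_cols[owner(c, n_cols, num_nodes)]
            dest.setdefault(c, [0j] * n_rows)[r] = value

    # Step 3: every node transforms the columns it now owns.
    result = [[0j] * n_cols for _ in range(n_rows)]
    for node in range(num_nodes):
        for c, column in local_cols[node].items():
            for r, value in enumerate(dft(column)):
                result[r][c] = value
    return result


data = [[1, 2, 3, 4], [0, -1, 2, 5], [3, 3, 0, 1], [2, 0, 1, -2]]
spectrum = distributed_fft2(data, num_nodes=2)
print(round(spectrum[0][0].real, 6))  # the zero-frequency term equals the sum of all entries
```

    In a real distributed-memory program the second step would typically be a collective exchange, and it is the only stage in which data crosses the network; both transform stages operate purely on node-local data.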

  12. Efficient implementation of a multidimensional fast fourier transform on a distributed-memory parallel multi-node computer

    Science.gov (United States)

    Bhanot, Gyan V [Princeton, NJ; Chen, Dong [Croton-On-Hudson, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY

    2008-01-01

    The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.

  13. Apparatus for multiprocessor-based control of a multiagent robot

    Science.gov (United States)

    Peters, II, Richard Alan (Inventor)

    2009-01-01

    An architecture for robot intelligence enables a robot to learn new behaviors and create new behavior sequences autonomously and interact with a dynamically changing environment. Sensory information is mapped onto a Sensory Ego-Sphere (SES) that rapidly identifies important changes in the environment and functions much like short term memory. Behaviors are stored in a DBAM that creates an active map from the robot's current state to a goal state and functions much like long term memory. A dream state converts recent activities stored in the SES and creates or modifies behaviors in the DBAM.

  14. A Transparent Runtime Data Distribution Engine for OpenMP

    Directory of Open Access Journals (Sweden)

    Dimitrios S. Nikolopoulos

    2000-01-01

    This paper makes two important contributions. First, the paper investigates the performance implications of data placement in OpenMP programs running on modern NUMA multiprocessors. Data locality and minimization of the rate of remote memory accesses are critical for sustaining high performance on these systems. We show that due to the low remote-to-local memory access latency ratio of contemporary NUMA architectures, reasonably balanced page placement schemes, such as round-robin or random distribution, incur modest performance losses. Second, the paper presents a transparent, user-level page migration engine with an ability to gain back any performance loss that stems from suboptimal placement of pages in iterative OpenMP programs. The main body of the paper describes how our OpenMP runtime environment uses page migration for implementing implicit data distribution and redistribution schemes without programmer intervention. Our experimental results verify the effectiveness of the proposed framework and provide a proof of concept that it is not necessary to introduce data distribution directives in OpenMP and warrant the simplicity or the portability of the programming model.

  15. A trade-off between local and distributed information processing associated with remote episodic versus semantic memory.

    Science.gov (United States)

    Heisz, Jennifer J; Vakorin, Vasily; Ross, Bernhard; Levine, Brian; McIntosh, Anthony R

    2014-01-01

    Episodic memory and semantic memory produce very different subjective experiences yet rely on overlapping networks of brain regions for processing. Traditional approaches for characterizing functional brain networks emphasize static states of function and thus are blind to the dynamic information processing within and across brain regions. This study used information theoretic measures of entropy to quantify changes in the complexity of the brain's response as measured by magnetoencephalography while participants listened to audio recordings describing past personal episodic and general semantic events. Personal episodic recordings evoked richer subjective mnemonic experiences and more complex brain responses than general semantic recordings. Critically, we observed a trade-off between the relative contribution of local versus distributed entropy, such that personal episodic recordings produced relatively more local entropy whereas general semantic recordings produced relatively more distributed entropy. Changes in the relative contributions of local and distributed entropy to the total complexity of the system provides a potential mechanism that allows the same network of brain regions to represent cognitive information as either specific episodes or more general semantic knowledge.

  16. Distribution of Peripheral Memory T Follicular Helper Cells in Patients with Schistosomiasis Japonica.

    Directory of Open Access Journals (Sweden)

    Xiaojun Chen

    Schistosomiasis is a helminthic disease that affects more than 200 million people. An effective vaccine would be a major step towards eliminating the disease. Studies suggest that T follicular helper (Tfh) cells provide help to B cells to generate the long-term humoral immunity, which would be a crucial component of successful vaccines. Thus, understanding the biological characteristics of Tfh cells in patients with schistosomiasis, which has never been explored, is essential for vaccine design. In this study, we investigated the biological characteristics of peripheral memory Tfh cells in schistosomiasis patients by flow cytometry. Our data showed that the frequencies of total and activated peripheral memory Tfh cells in patients were significantly increased during Schistosoma japonicum infection. Moreover, Tfh2 cells, which were reported to be a specific subpopulation to facilitate the generation of protective antibodies, were increased more greatly than other subpopulations of total peripheral memory Tfh cells in patients with schistosomiasis japonica. More importantly, our result showed significant correlations of the percentage of Tfh2 cells with both the frequency of plasma cells and the level of IgG antibody. In addition, our results showed that the percentage of T follicular regulatory (Tfr) cells was also increased in patients with schistosomiasis. Our report is the first characterization of peripheral memory Tfh cells in schistosomiasis patients, which not only provides potential targets to improve immune response to vaccination, but also is important for the development of vaccination strategies to control schistosomiasis.

  17. A parallelization study of the general purpose Monte Carlo code MCNP4 on a distributed memory highly parallel computer

    International Nuclear Information System (INIS)

    Yamazaki, Takao; Fujisaki, Masahide; Okuda, Motoi; Takano, Makoto; Masukawa, Fumihiro; Naito, Yoshitaka

    1993-01-01

    The general purpose Monte Carlo code MCNP4 has been implemented on the Fujitsu AP1000 distributed memory highly parallel computer. Parallelization techniques developed and studied are reported. A shielding analysis function of the MCNP4 code is parallelized in this study. A technique that maps histories to processors dynamically and maps the control process to a particular processor was applied. The efficiency of the parallelized code is up to 80% for a typical practical problem with 512 processors. These results demonstrate the advantages of a highly parallel computer over conventional computers in the field of shielding analysis by the Monte Carlo method. (orig.)

  18. Some algorithms for the solution of the symmetric eigenvalue problem on a multiprocessor electronic computer

    International Nuclear Information System (INIS)

    Molchanov, I.N.; Khimich, A.N.

    1984-01-01

    This article shows how a reflection method can be used to find the eigenvalues of a matrix by transforming the matrix to tridiagonal form. The method of conjugate gradients is used to find the smallest eigenvalue and the corresponding eigenvector of symmetric positive-definite band matrices. Topics considered include the computational scheme of the reflection method, the organization of parallel calculations by the reflection method, the computational scheme of the conjugate gradient method, the organization of parallel calculations by the conjugate gradient method, and the effectiveness of parallel algorithms. It is concluded that it is possible to increase the overall effectiveness of the multiprocessor electronic computers by either letting the newly available processors of a new problem operate in the multiprocessor mode, or by improving the coefficient of uniform partition of the original information

  19. Energy-Aware Real-Time Task Scheduling for Heterogeneous Multiprocessors with Particle Swarm Optimization Algorithm

    Directory of Open Access Journals (Sweden)

    Weizhe Zhang

    2014-01-01

    Energy consumption in computer systems has become an increasingly important issue. High energy consumption has already damaged the environment to some extent, especially in heterogeneous multiprocessors. In this paper, we first formulate and describe the energy-aware real-time task scheduling problem in heterogeneous multiprocessors. Then we propose a particle swarm optimization (PSO)-based algorithm, which can successfully reduce the energy cost and the time for searching feasible solutions. Experimental results show that the PSO-based energy-aware metaheuristic uses 40%–50% less energy than the GA-based and SFLA-based algorithms and spends 10% less time than the SFLA-based algorithm in finding the solutions. Besides, it can also find 19% more feasible solutions than the SFLA-based algorithm.

  20. Safe and Efficient Support for Embedded Multi-Processors in ADA

    Science.gov (United States)

    Ruiz, Jose F.

    2010-08-01

    New software demands increasing processing power, and multi-processor platforms are spreading as the answer to achieve the required performance. Embedded real-time systems are also subject to this trend, but in the case of real-time mission-critical systems, the properties of reliability, predictability and analyzability are also paramount. The Ada 2005 language defined a subset of its tasking model, the Ravenscar profile, that provides the basis for the implementation of deterministic and time analyzable applications on top of a streamlined run-time system. This Ravenscar tasking profile, originally designed for single processors, has proven remarkably useful for modelling verifiable real-time single-processor systems. This paper proposes a simple extension to the Ravenscar profile to support multi-processor systems using a fully partitioned approach. The implementation of this scheme is simple, and it can be used to develop applications amenable to schedulability analysis.

  1. Efficient calculation of open quantum system dynamics and time-resolved spectroscopy with distributed memory HEOM (DM-HEOM).

    Science.gov (United States)

    Kramer, Tobias; Noack, Matthias; Reinefeld, Alexander; Rodríguez, Mirta; Zelinskyy, Yaroslav

    2018-06-11

    Time- and frequency-resolved optical signals provide insights into the properties of light-harvesting molecular complexes, including excitation energies, dipole strengths and orientations, as well as in the exciton energy flow through the complex. The hierarchical equations of motion (HEOM) provide a unifying theory, which allows one to study the combined effects of system-environment dissipation and non-Markovian memory without making restrictive assumptions about weak or strong couplings or separability of vibrational and electronic degrees of freedom. With increasing system size the exact solution of the open quantum system dynamics requires memory and compute resources beyond a single compute node. To overcome this barrier, we developed a scalable variant of HEOM. Our distributed memory HEOM, DM-HEOM, is a universal tool for open quantum system dynamics. It is used to accurately compute all experimentally accessible time- and frequency-resolved processes in light-harvesting molecular complexes with arbitrary system-environment couplings for a wide range of temperatures and complex sizes. © 2018 Wiley Periodicals, Inc.

  2. PCI bus content-addressable-memory (CAM) implementation on FPGA for pattern recognition/image retrieval in a distributed environment

    Science.gov (United States)

    Megherbi, Dalila B.; Yan, Yin; Tanmay, Parikh; Khoury, Jed; Woods, C. L.

    2004-11-01

    Surveillance and Automatic Target Recognition (ATR) applications have recently been increasing as the cost of the computing power needed to process the massive amount of information continues to fall. This computing power has been made possible partly by the latest advances in FPGAs and SOPCs. In particular, to design and implement state-of-the-art electro-optical imaging systems that provide advanced surveillance capabilities, there is a need to integrate several technologies (e.g. telescope, precise optics, cameras, image/computer vision algorithms, which can be geographically distributed or sharing distributed resources) into a programmable system and DSP systems. Additionally, pattern recognition techniques and fast information retrieval are often important components of intelligent systems. The aim of this work is to use an embedded FPGA as a fast, configurable and synthesizable search engine for fast image pattern recognition/retrieval in a distributed hardware/software co-design environment. In particular, we propose and show a low cost Content Addressable Memory (CAM)-based distributed embedded FPGA hardware architecture solution with real time recognition capabilities and computing for pattern look-up, pattern recognition, and image retrieval. We show how the distributed CAM-based architecture offers an order-of-magnitude performance advantage over a RAM-based (Random Access Memory) search architecture for implementing high speed pattern recognition for image retrieval. The methods of designing, implementing, and analyzing the proposed CAM-based embedded architecture are described here. Other SOPC solutions/design issues are covered. Finally, experimental results, hardware verification, and performance evaluations using both the Xilinx Virtex-II and the Altera Apex20k are provided to show the potential and power of the proposed method for low cost reconfigurable fast image pattern recognition/retrieval at the hardware/software co-design level.

  3. Resource Allocation Model for Modelling Abstract RTOS on Multiprocessor System-on-Chip

    DEFF Research Database (Denmark)

    Virk, Kashif Munir; Madsen, Jan

    2003-01-01

    Resource Allocation is an important problem in RTOS's, and has been an active area of research. Numerous approaches have been developed and many different techniques have been combined for a wide range of applications. In this paper, we address the problem of resource allocation in the context...... of modelling an abstract RTOS on multiprocessor SoC platforms. We discuss the implementation details of a simplified basic priority inheritance protocol for our abstract system model in SystemC....
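
    The basic priority inheritance idea mentioned in this record can be illustrated with a minimal Python sketch. It is not the authors' SystemC model: the `Task` and `Mutex` classes, the single-resource simplification, and the convention that a larger number means higher priority are all assumptions made for illustration.

```python
class Task:
    """A task with a fixed base priority and a mutable effective priority."""
    def __init__(self, name, priority):
        self.name = name
        self.base_priority = priority
        self.priority = priority  # assumption: larger value = more urgent


class Mutex:
    """A single shared resource guarded by basic priority inheritance."""
    def __init__(self):
        self.owner = None
        self.waiters = []

    def acquire(self, task):
        # Free resource: the caller becomes the owner immediately.
        if self.owner is None:
            self.owner = task
            return True
        # Busy resource: the caller blocks, and the owner inherits the
        # caller's priority if that priority is higher than its own.
        self.waiters.append(task)
        if task.priority > self.owner.priority:
            self.owner.priority = task.priority
        return False

    def release(self):
        # The owner drops back to its base priority (valid here because
        # this simplified sketch models only one resource).
        self.owner.priority = self.owner.base_priority
        self.owner = None
        # Hand the resource to the highest-priority waiter, if any.
        if self.waiters:
            nxt = max(self.waiters, key=lambda t: t.priority)
            self.waiters.remove(nxt)
            self.owner = nxt


low = Task("low", 1)
high = Task("high", 3)
m = Mutex()
got_low = m.acquire(low)     # low takes the resource
got_high = m.acquire(high)   # high blocks; low inherits priority 3
inherited = low.priority
m.release()                  # low returns to priority 1; high becomes owner
```

    Raising the owner's priority while a more urgent task waits is what bounds the blocking time of that task; a full protocol would also handle nested resources, which this sketch leaves out.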

  4. Method for wiring allocation and switch configuration in a multiprocessor environment

    Science.gov (United States)

    Aridor, Yariv [Zichron Ya'akov, IL; Domany, Tamar [Kiryat Tivon, IL; Frachtenberg, Eitan [Jerusalem, IL; Gal, Yoav [Haifa, IL; Shmueli, Edi [Haifa, IL; Stockmeyer, legal representative, Robert E.; Stockmeyer, Larry Joseph [San Jose, CA

    2008-07-15

    A method for wiring allocation and switch configuration in a multiprocessor computer, the method including employing depth-first tree traversal to determine a plurality of paths among a plurality of processing elements allocated to a job along a plurality of switches and wires in a plurality of D-lines, and selecting one of the paths in accordance with at least one selection criterion.
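
    The path search described in this record can be sketched in Python as a depth-first enumeration followed by a selection step. The topology, the node names, and the "fewest hops" criterion below are hypothetical; the record only states that depth-first traversal is used and that at least one selection criterion is applied.

```python
def find_paths(graph, src, dst):
    """Enumerate all simple paths from src to dst by depth-first traversal."""
    paths = []

    def dfs(node, path):
        if node == dst:
            paths.append(list(path))
            return
        for nxt in graph.get(node, []):
            if nxt not in path:  # never revisit a node on the current path
                path.append(nxt)
                dfs(nxt, path)
                path.pop()

    dfs(src, [src])
    return paths


def select_path(paths):
    """Hypothetical selection criterion: the path with the fewest hops."""
    return min(paths, key=len)


# Hypothetical wiring: processing elements (PE*) connected through switches (S*).
graph = {
    "PE0": ["S0"],
    "S0": ["S1", "S2"],
    "S1": ["PE1"],
    "S2": ["S1"],
}
paths = find_paths(graph, "PE0", "PE1")
best = select_path(paths)
```

    Any other criterion (for example, avoiding wires already allocated to another job) could replace `select_path` without changing the traversal.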

  5. Operating system for a real-time multiprocessor propulsion system simulator

    Science.gov (United States)

    Cole, G. L.

    1984-01-01

    The success of the Real Time Multiprocessor Operating System (RTMPOS) in the development and evaluation of experimental hardware and software systems for real time interactive simulation of air breathing propulsion systems was evaluated. The Real Time Multiprocessor Operating System (RTMPOS) provides the user with a versatile, interactive means for loading, running, debugging and obtaining results from a multiprocessor based simulator. A front end processor (FEP) serves as the simulator controller and interface between the user and the simulator. These functions are facilitated by the RTMPOS which resides on the FEP. The RTMPOS acts in conjunction with the FEP's manufacturer supplied disk operating system that provides typical utilities like an assembler, linkage editor, text editor, file handling services, etc. Once a simulation is formulated, the RTMPOS provides for engineering level, run time operations such as loading, modifying and specifying computation flow of programs, simulator mode control, data handling and run time monitoring. Run time monitoring is a powerful feature of RTMPOS that allows the user to record all actions taken during a simulation session and to receive advisories from the simulator via the FEP. The RTMPOS is programmed mainly in PASCAL along with some assembly language routines. The RTMPOS software is easily modified to be applicable to hardware from different manufacturers.

  6. Paying attention to working memory: Similarities in the spatial distribution of attention in mental and physical space.

    Science.gov (United States)

    Sahan, Muhammet Ikbal; Verguts, Tom; Boehler, Carsten Nicolas; Pourtois, Gilles; Fias, Wim

    2016-08-01

    Selective attention is not limited to information that is physically present in the external world, but can also operate on mental representations in the internal world. However, it is not known whether the mechanisms of attentional selection operate in similar fashions in physical and mental space. We studied the spatial distributions of attention for items in physical and mental space by comparing how successfully distractors were rejected at varying distances from the attended location. The results indicated very similar distribution characteristics of spatial attention in physical and mental space. Specifically, we found that performance monotonically improved with increasing distractor distance relative to the attended location, suggesting that distractor confusability is particularly pronounced for nearby distractors, relative to distractors farther away. The present findings suggest that mental representations preserve their spatial configuration in working memory, and that similar mechanistic principles underlie selective attention in physical and in mental space.

  7. Memory controllers for high-performance and real-time MPSoCs : requirements, architectures, and future trends

    NARCIS (Netherlands)

    Akesson, K.B.; Huang, Po-Chun; Clermidy, F.; Dutoit, D.; Goossens, K.G.W.; Chang, Yuan-Hao; Kuo, Tei-Wei; Vivet, P.; Wingard, D.

    2011-01-01

    Designing memory controllers for complex real-time and high-performance multi-processor systems-on-chip is challenging, since sufficient capacity and (real-time) performance must be provided in a reliable manner at low cost and with low power consumption. This special session contains four

  8. Memory Hierarchy Design for Next Generation Scalable Many-core Platforms

    OpenAIRE

    Azarkhish, Erfan

    2016-01-01

    Performance and energy consumption in modern computing platforms is largely dominated by the memory hierarchy. The increasing computational power in the multiprocessors and accelerators, and the emergence of the data-intensive workloads (e.g. large-scale graph traversal and scientific algorithms) requiring fast transfer of large volumes of data, are two main trends which intensify this problem by putting even higher pressure on the memory hierarchy. This increasing gap between computation spe...

  9. Individual Differences in Components of Reaction Time Distributions and Their Relations to Working Memory and Intelligence

    Science.gov (United States)

    Schmiedek, Florian; Oberauer, Klaus; Wilhelm, Oliver; Suss, Heinz-Martin; Wittmann, Werner W.

    2007-01-01

    The authors bring together approaches from cognitive and individual differences psychology to model characteristics of reaction time distributions beyond measures of central tendency. Ex-Gaussian distributions and a diffusion model approach are used to describe individuals' reaction time data. The authors identified common latent factors for each…

  10. Conditional load and store in a shared memory

    Science.gov (United States)

    Blumrich, Matthias A; Ohmacht, Martin

    2015-02-03

    A method, system and computer program product for implementing load-reserve and store-conditional instructions in a multi-processor computing system. The computing system includes a multitude of processor units and a shared memory cache, and each of the processor units has access to the memory cache. In one embodiment, the method comprises providing the memory cache with a series of reservation registers, and storing in these registers addresses reserved in the memory cache for the processor units as a result of issuing load-reserve requests. In this embodiment, when one of the processor units makes a request to store data in the memory cache using a store-conditional request, the reservation registers are checked to determine if an address in the memory cache is reserved for that processor unit. If an address in the memory cache is reserved for that processor, the data are stored at this address.
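
    The reservation-register scheme in this record can be modelled with a small Python sketch. The class and method names are hypothetical, one register per processor unit is assumed, and clearing every reservation on an address after a successful store is an assumption borrowed from conventional load-reserve/store-conditional semantics rather than something the record states.

```python
class SharedCache:
    """Toy model of a shared memory cache with per-processor reservation registers."""
    def __init__(self, num_procs):
        self.data = {}
        self.reservations = [None] * num_procs  # assumption: one register per processor

    def load_reserve(self, proc, addr):
        # Record the reserved address for this processor and return the value.
        self.reservations[proc] = addr
        return self.data.get(addr, 0)

    def store_conditional(self, proc, addr, value):
        # The store succeeds only if this processor still holds a reservation on addr.
        if self.reservations[proc] != addr:
            return False
        self.data[addr] = value
        # Assumption: a successful store clears all reservations on addr.
        for p, reserved in enumerate(self.reservations):
            if reserved == addr:
                self.reservations[p] = None
        return True


cache = SharedCache(num_procs=2)
v0 = cache.load_reserve(0, 0x10)
v1 = cache.load_reserve(1, 0x10)
ok0 = cache.store_conditional(0, 0x10, v0 + 1)  # succeeds
ok1 = cache.store_conditional(1, 0x10, v1 + 1)  # fails: reservation was cleared
```

    In this usage both processors try to increment the same location; the second store fails, so that processor would retry from `load_reserve`, which is how the pair of operations yields an atomic update.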

  11. Distributed patterns of occipito-parietal functional connectivity predict the precision of visual working memory.

    Science.gov (United States)

    Galeano Weber, Elena M; Hahn, Tim; Hilger, Kirsten; Fiebach, Christian J

    2017-02-01

    Limitations in visual working memory (WM) quality (i.e., WM precision) may depend on perceptual and attentional limitations during stimulus encoding, thereby affecting WM capacity. WM encoding relies on the interaction between sensory processing systems and fronto-parietal 'control' regions, and differences in the quality of this interaction are a plausible source of individual differences in WM capacity. Accordingly, we hypothesized that the coupling between perceptual and attentional systems affects the quality of WM encoding. We combined fMRI connectivity analysis with behavioral modeling by fitting a variable precision and fixed capacity model to the performance data obtained while participants performed a visual delayed continuous response WM task. We quantified functional connectivity during WM encoding between occipital and parietal brain regions activated during both perception and WM encoding, as determined using a conjunction of two independent experiments. The multivariate pattern of voxel-wise inter-areal functional connectivity significantly predicted WM performance, most specifically the mean of WM precision but not the individual number of items that could be stored in memory. In particular, higher occipito-parietal connectivity was associated with higher behavioral mean precision. These results are consistent with a network perspective of WM capacity, suggesting that the efficiency of information flow between perceptual and attentional neural systems is a critical determinant of limitations in WM quality. Copyright © 2016 Elsevier Inc. All rights reserved.

  12. Interference control by best-effort process duty-cycling in chip multi-processor systems for real-time medical image processing

    NARCIS (Netherlands)

    Westmijze, M.; Bekooij, Marco Jan Gerrit; Smit, Gerardus Johannes Maria

    2013-01-01

    Systems with chip multi-processors are currently used for several applications that have real-time requirements. In chip multi-processor architectures, many hardware resources such as parts of the cache hierarchy are shared between cores and by using such resources, applications can significantly

  13. Distributed patterns of activity in sensory cortex reflect the precision of multiple items maintained in visual short-term memory.

    Science.gov (United States)

    Emrich, Stephen M; Riggall, Adam C; Larocque, Joshua J; Postle, Bradley R

    2013-04-10

    Traditionally, load sensitivity of sustained, elevated activity has been taken as an index of storage for a limited number of items in visual short-term memory (VSTM). Recently, studies have demonstrated that the contents of a single item held in VSTM can be decoded from early visual cortex, despite the fact that these areas do not exhibit elevated, sustained activity. It is unknown, however, whether the patterns of neural activity decoded from sensory cortex change as a function of load, as one would expect from a region storing multiple representations. Here, we use multivoxel pattern analysis to examine the neural representations of VSTM in humans across multiple memory loads. In an important extension of previous findings, our results demonstrate that the contents of VSTM can be decoded from areas that exhibit a transient response to visual stimuli, but not from regions that exhibit elevated, sustained load-sensitive delay-period activity. Moreover, the neural information present in these transiently activated areas decreases significantly with increasing load, indicating load sensitivity of the patterns of activity that support VSTM maintenance. Importantly, the decrease in classification performance as a function of load is correlated with within-subject changes in mnemonic resolution. These findings indicate that distributed patterns of neural activity in putatively sensory visual cortex support the representation and precision of information in VSTM.

  14. Investigating Solution Convergence in a Global Ocean Model Using a 2048-Processor Cluster of Distributed Shared Memory Machines

    Directory of Open Access Journals (Sweden)

    Chris Hill

    2007-01-01

    Up to 1920 processors of a cluster of distributed shared memory machines at the NASA Ames Research Center are being used to simulate ocean circulation globally at horizontal resolutions of 1/4, 1/8, and 1/16-degree with the Massachusetts Institute of Technology General Circulation Model, a finite volume code that can scale to large numbers of processors. The study aims to understand physical processes responsible for skill improvements as resolution is increased and to gain insight into what resolution is sufficient for particular purposes. This paper focuses on the computational aspects of reaching the technical objective of efficiently performing these global eddy-resolving ocean simulations. At 1/16-degree resolution the model grid contains 1.2 billion cells. At this resolution it is possible to simulate approximately one month of ocean dynamics in about 17 hours of wallclock time with a model timestep of two minutes on a cluster of four 512-way NUMA Altix systems. The Altix systems' large main memory and I/O subsystems allow computation and disk storage of rich sets of diagnostics during each integration, supporting the scientific objective to develop a better understanding of global ocean circulation model solution convergence as model resolution is increased.
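
    The throughput implied by the figures in this record can be worked out with a few lines of Python. Taking "approximately one month" as exactly 30 days is an assumption, so the results are rough estimates rather than values reported by the authors.

```python
# Figures quoted in the abstract.
wallclock_hours = 17
timestep_minutes = 2
grid_cells = 1.2e9

# Assumption: "approximately one month" is taken as 30 days.
simulated_days = 30

steps = simulated_days * 24 * 60 // timestep_minutes      # number of model timesteps
seconds_per_step = wallclock_hours * 3600 / steps          # wallclock cost of one step
cell_updates_per_second = grid_cells / seconds_per_step    # aggregate update rate
speed_ratio = simulated_days * 24 / wallclock_hours        # simulated time / wallclock time

print(f"timesteps: {steps}")
print(f"wallclock seconds per timestep: {seconds_per_step:.2f}")
print(f"cell updates per second: {cell_updates_per_second:.3g}")
print(f"simulated-to-wallclock ratio: {speed_ratio:.1f}")
```

    Under that assumption the run covers 21,600 timesteps at roughly 2.8 seconds of wallclock time each, so the model advances about 42 times faster than real time.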

  15. A QDWH-Based SVD Software Framework on Distributed-Memory Manycore Systems

    KAUST Repository

    Sukkari, Dalal; Ltaief, Hatem; Esposito, Aniello; Keyes, David E.

    2017-01-01

    , the inherent high level of concurrency associated with Level 3 BLAS compute-bound kernels ultimately compensates for the arithmetic complexity overhead. Using the ScaLAPACK two-dimensional block cyclic data distribution with a rectangular processor topology
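
    The two-dimensional block cyclic distribution named in this record can be sketched in Python as an index mapping. The function names are hypothetical, indices are zero-based, and the first block is assumed to live on process (0, 0); this illustrates the general layout, not the ScaLAPACK API itself.

```python
def owner(i, j, mb, nb, p_rows, p_cols):
    """Process-grid coordinates that own global entry (i, j).

    mb x nb is the block size and p_rows x p_cols the process grid.
    """
    return ((i // mb) % p_rows, (j // nb) % p_cols)


def local_index(i, mb, p_rows):
    """Row index of global row i inside its owner's local storage."""
    local_block = (i // mb) // p_rows   # how many of the owner's blocks precede it
    return local_block * mb + i % mb


# Hypothetical setup: 2 x 2 blocks on a rectangular 2 x 3 process grid.
mb, nb, p_rows, p_cols = 2, 2, 2, 3
first = owner(0, 0, mb, nb, p_rows, p_cols)
wrapped = owner(4, 6, mb, nb, p_rows, p_cols)   # blocks wrap around the grid
inner = owner(3, 5, mb, nb, p_rows, p_cols)
row_local = local_index(5, mb, p_rows)
```

    Because consecutive blocks cycle over the grid in both dimensions, every process holds pieces from all regions of the matrix, which helps balance the work of blocked kernels.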

  16. Study Trapped Charge Distribution in P-Channel Silicon-Oxide-Nitride-Oxide-Silicon Memory Device Using Dynamic Programming Scheme

    Science.gov (United States)

    Li, Fu-Hai; Chiu, Yung-Yueh; Lee, Yen-Hui; Chang, Ru-Wei; Yang, Bo-Jun; Sun, Wein-Town; Lee, Eric; Kuo, Chao-Wei; Shirota, Riichiro

    2013-04-01

    In this study, we precisely investigate the charge distribution in the SiN layer by dynamic programming of channel hot hole induced hot electron injection (CHHIHE) in a p-channel silicon-oxide-nitride-oxide-silicon (SONOS) memory device. In the dynamic programming scheme, the gate voltage is increased as a staircase with fixed step amplitude, which can prevent the injection of holes into the SiN layer. A three-dimensional device simulation is calibrated and compared with the measured programming characteristics. It is found, for the first time, that the hot electron injection point quickly traverses from the drain to the source side in synchrony with the expansion of the charged area in the SiN layer. As a result, the injected charges quickly spread almost uniformly over nearly the whole channel area during a short programming period, which affords a large tolerance against lateral trapped charge diffusion during baking.

  17. Parallelization of MCNP 4, a Monte Carlo neutron and photon transport code system, in highly parallel distributed memory type computer

    International Nuclear Information System (INIS)

    Masukawa, Fumihiro; Takano, Makoto; Naito, Yoshitaka; Yamazaki, Takao; Fujisaki, Masahide; Suzuki, Koichiro; Okuda, Motoi.

    1993-11-01

    In order to improve the accuracy and calculating speed of shielding analyses, MCNP 4, a Monte Carlo neutron and photon transport code system, has been parallelized and its efficiency measured on the highly parallel distributed memory type computer AP1000. The code has been analyzed statically and dynamically, and a suitable algorithm for parallelization has then been determined for the shielding analysis functions of MCNP 4. This includes a strategy where a new history is assigned dynamically to an idle processor element during execution. Furthermore, to avoid congestion in communication processing, the batch concept, in which multiple histories are processed as a unit, has been introduced. By analyzing a sample cask problem with 2,000,000 histories on the AP1000 with 512 processor elements, a parallelization efficiency of 82% is achieved, and the calculation speed has been estimated to be around 50 times that of the FACOM M-780. (author)
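
    The dynamic assignment of batched histories described in this record can be illustrated with a toy Python simulation. The per-history costs, the batch size, and the function names are hypothetical, and communication with the control process is not modelled, so this is a sketch of the scheduling idea only, not of the MCNP 4 implementation.

```python
import heapq


def make_batches(history_costs, batch_size):
    """Group histories into batches; each batch cost is the sum of its histories."""
    return [sum(history_costs[i:i + batch_size])
            for i in range(0, len(history_costs), batch_size)]


def simulate(batch_costs, num_procs):
    """Give each batch to whichever processor becomes idle first.

    Returns (makespan, parallel efficiency).
    """
    ready = [(0.0, p) for p in range(num_procs)]  # (time the processor is free, id)
    heapq.heapify(ready)
    for cost in batch_costs:
        free_at, proc = heapq.heappop(ready)      # earliest idle processor
        heapq.heappush(ready, (free_at + cost, proc))
    makespan = max(t for t, _ in ready)
    efficiency = sum(batch_costs) / (num_procs * makespan)
    return makespan, efficiency


# Hypothetical per-history costs in arbitrary time units.
history_costs = [1, 2, 3, 4, 5, 6, 7, 8]
batches = make_batches(history_costs, batch_size=2)
makespan, efficiency = simulate(batches, num_procs=2)
```

    Larger batches mean fewer assignment requests but coarser load balancing, which is the trade-off the batch concept has to tune. For scale, an efficiency of 82% on 512 processor elements corresponds to a speedup of about 0.82 × 512 ≈ 420 over one element of the same machine.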

  18. Successful declarative memory formation is associated with ongoing activity during encoding in a distributed neocortical network related to working memory: a magnetoencephalography study.

    NARCIS (Netherlands)

    Takashima, A.; Jensen, O.; Oostenveld, R.; Maris, E.G.G.; Coevering, M. van de; Fernandez, G.S.E.

    2006-01-01

    The aim of the present study was to investigate the spatio-temporal characteristics of the neural correlates of declarative memory formation as assessed by the subsequent memory effect, i.e. the difference in encoding activity between subsequently remembered and subsequently forgotten items.

  19. Successful declarative memory formation is associated with ongoing activity during encoding in a distributed neocortical network related to working memory: A magnetoencephalography study

    NARCIS (Netherlands)

    Takashima, A.; Jensen, O.; Oostenveld, R.; Maris, E.G.G.; Coevering, M. van de; Fernandez, G.S.E.

    2006-01-01

    The aim of the present study was to investigate the spatio-temporal characteristics of the neural correlates of declarative memory formation as assessed by the subsequent memory effect, i.e. the difference in encoding activity between subsequently remembered and subsequently forgotten items.

  20. A Parallel Distributed-Memory Particle Method Enables Acquisition-Rate Segmentation of Large Fluorescence Microscopy Images.

    Directory of Open Access Journals (Sweden)

    Yaser Afshar

    Modern fluorescence microscopy modalities, such as light-sheet microscopy, are capable of acquiring large three-dimensional images at high data rate. This creates a bottleneck in computational processing and analysis of the acquired images, as the rate of acquisition outpaces the speed of processing. Moreover, images can be so large that they do not fit the main memory of a single computer. We address both issues by developing a distributed parallel algorithm for segmentation of large fluorescence microscopy images. The method is based on the versatile Discrete Region Competition algorithm, which has previously proven useful in microscopy image segmentation. The present distributed implementation decomposes the input image into smaller sub-images that are distributed across multiple computers. Using network communication, the computers orchestrate the collectively solving of the global segmentation problem. This not only enables segmentation of large images (we test images of up to 10^10 pixels), but also accelerates segmentation to match the time scale of image acquisition. Such acquisition-rate image segmentation is a prerequisite for the smart microscopes of the future and enables online data compression and interactive experiments.

  1. A Parallel Distributed-Memory Particle Method Enables Acquisition-Rate Segmentation of Large Fluorescence Microscopy Images.

    Science.gov (United States)

    Afshar, Yaser; Sbalzarini, Ivo F

    2016-01-01

    Modern fluorescence microscopy modalities, such as light-sheet microscopy, are capable of acquiring large three-dimensional images at high data rate. This creates a bottleneck in computational processing and analysis of the acquired images, as the rate of acquisition outpaces the speed of processing. Moreover, images can be so large that they do not fit the main memory of a single computer. We address both issues by developing a distributed parallel algorithm for segmentation of large fluorescence microscopy images. The method is based on the versatile Discrete Region Competition algorithm, which has previously proven useful in microscopy image segmentation. The present distributed implementation decomposes the input image into smaller sub-images that are distributed across multiple computers. Using network communication, the computers orchestrate the collectively solving of the global segmentation problem. This not only enables segmentation of large images (we test images of up to 10^10 pixels), but also accelerates segmentation to match the time scale of image acquisition. Such acquisition-rate image segmentation is a prerequisite for the smart microscopes of the future and enables online data compression and interactive experiments.
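
    The decomposition of an image into sub-images, as described in this record, can be sketched in Python for the two-dimensional case. The function names and the regular grid layout are assumptions; the overlap regions and network communication that a real distributed segmentation would need are deliberately left out.

```python
def split_extent(n, parts):
    """Split the range [0, n) into `parts` contiguous, near-equal intervals."""
    base, extra = divmod(n, parts)
    bounds = []
    start = 0
    for k in range(parts):
        size = base + (1 if k < extra else 0)  # the first `extra` parts get one more
        bounds.append((start, start + size))
        start += size
    return bounds


def decompose(height, width, grid_rows, grid_cols):
    """Map each grid position to the (row range, column range) of its sub-image."""
    rows = split_extent(height, grid_rows)
    cols = split_extent(width, grid_cols)
    return {(r, c): (rows[r], cols[c])
            for r in range(grid_rows)
            for c in range(grid_cols)}


# Hypothetical 10 x 7 pixel image spread over a 2 x 3 grid of computers.
tiles = decompose(10, 7, 2, 3)
last_tile = tiles[(1, 2)]
```

    Each computer would then hold only its own tile in memory, which is what lets the total image exceed the main memory of any single machine.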

  2. Theta-alpha EEG phase distributions in the frontal area for dissociation of visual and auditory working memory.

    Science.gov (United States)

    Akiyama, Masakazu; Tero, Atsushi; Kawasaki, Masahiro; Nishiura, Yasumasa; Yamaguchi, Yoko

    2017-03-07

    Working memory (WM) is known to be associated with synchronization of the theta and alpha bands observed in electroencephalograms (EEGs). Although frontal-posterior global theta synchronization appears in modality-specific WM, local theta synchronization in frontal regions has been found in modality-independent WM. How frontal theta oscillations separately synchronize with task-relevant sensory brain areas remains an open question. Here, we focused on theta-alpha phase relationships in frontal areas using EEG, and then verified their functional roles with mathematical models. EEG data showed that the relationship between theta (6 Hz) and alpha (12 Hz) phases in the frontal areas was about 1:2 during both auditory and visual WM, and that the phase distributions between auditory and visual WM were different. Next, we used the differences in phase distributions to construct FitzHugh-Nagumo type mathematical models. The results replicated the modality-specific branching by orthogonality of the trigonometric functions for theta and alpha oscillations. Furthermore, mathematical and experimental results were consistent with regard to the phase relationships and amplitudes observed in frontal and sensory areas. These results indicate the important role that different phase distributions of theta and alpha oscillations have in modality-specific dissociation in the brain.

  3. A Parallel Distributed-Memory Particle Method Enables Acquisition-Rate Segmentation of Large Fluorescence Microscopy Images

    Science.gov (United States)

    Afshar, Yaser; Sbalzarini, Ivo F.

    2016-01-01

    Modern fluorescence microscopy modalities, such as light-sheet microscopy, are capable of acquiring large three-dimensional images at high data rate. This creates a bottleneck in computational processing and analysis of the acquired images, as the rate of acquisition outpaces the speed of processing. Moreover, images can be so large that they do not fit the main memory of a single computer. We address both issues by developing a distributed parallel algorithm for segmentation of large fluorescence microscopy images. The method is based on the versatile Discrete Region Competition algorithm, which has previously proven useful in microscopy image segmentation. The present distributed implementation decomposes the input image into smaller sub-images that are distributed across multiple computers. Using network communication, the computers orchestrate the collectively solving of the global segmentation problem. This not only enables segmentation of large images (we test images of up to 10^10 pixels), but also accelerates segmentation to match the time scale of image acquisition. Such acquisition-rate image segmentation is a prerequisite for the smart microscopes of the future and enables online data compression and interactive experiments. PMID:27046144

  4. Shared Memory Parallelization of an Implicit ADI-type CFD Code

    Science.gov (United States)

    Hauser, Th.; Huang, P. G.

    1999-01-01

    A parallelization study designed for ADI-type algorithms is presented using the OpenMP specification for shared-memory multiprocessor programming. Details of optimizations specifically addressed to cache-based computer architectures are described and performance measurements for the single and multiprocessor implementation are summarized. The paper demonstrates that optimization of memory access on a cache-based computer architecture controls the performance of the computational algorithm. A hybrid MPI/OpenMP approach is proposed for clusters of shared memory machines to further enhance the parallel performance. The method is applied to develop a new LES/DNS code, named LESTool. A preliminary DNS calculation of a fully developed channel flow at a Reynolds number of 180, Re_tau = 180, has shown good agreement with existing data.

  5. Kemari: A Portable High Performance Fortran System for Distributed Memory Parallel Processors

    Directory of Open Access Journals (Sweden)

    T. Kamachi

    1997-01-01

    We have developed a compilation system which extends High Performance Fortran (HPF) in various aspects. We support the parallelization of well-structured problems with loop distribution and alignment directives similar to HPF's data distribution directives. Such directives both give additional control to the user and simplify the compilation process. For the support of unstructured problems, we provide directives for dynamic data distribution through user-defined mappings. The compiler also allows integration of message-passing interface (MPI) primitives. The system is part of a complete programming environment which also comprises a parallel debugger and a performance monitor and analyzer. After an overview of the compiler, we describe the language extensions and related compilation mechanisms in detail. Performance measurements demonstrate the compiler's applicability to a variety of application classes.

  6. Quality-Driven Model-Based Design of MultiProcessor Embedded Systems for Highly-demanding Applications

    DEFF Research Database (Denmark)

    Jozwiak, Lech; Madsen, Jan

    2013-01-01

    The recent spectacular progress in modern nano-dimension semiconductor technology enabled implementation of a complete complex multi-processor system on a single chip (MPSoC), global networking and mobile wireless communication, and facilitated fast progress in these areas. New important...... accessible or distant) objects, installations, machines or devices, or even implanted in human or animal body can serve as examples. However, many of the modern embedded applications impose very stringent functional and parametric demands. Moreover, the spectacular advances in microelectronics introduced...

  7. ARTiS, an Asymmetric Real-Time Scheduler for Linux on Multi-Processor Architectures

    OpenAIRE

    Piel, Éric; Marquet, Philippe; Soula, Julien; Osuna, Christophe; Dekeyser, Jean-Luc

    2005-01-01

    The ARTiS system is a real-time extension of the GNU/Linux scheduler dedicated to SMP (Symmetric Multi-Processor) systems. It allows High Performance Computing and real-time workloads to be mixed. ARTiS exploits the SMP architecture to guarantee the preemption of a processor when the system has to schedule a real-time task. The implementation is available as a modification of the Linux kernel, focusing especially on (but not restricted to) the IA-64 architecture. The basic idea of ARTiS is to assign a selected se...

  8. A high speed multi-tasking, multi-processor telemetry system

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Kung Chris [Univ. of Texas, El Paso, TX (United States)

    1996-12-31

    This paper describes a small, lightweight, multitasking, multiprocessor telemetry system capable of collecting 32 channels of differential signals at a sampling rate of 6.25 kHz per channel. The system is designed to collect data from remote wind turbine research sites and transfer the data via wireless communication. A description of operational theory, hardware components, and itemized cost is provided. Synchronization with other data acquisition systems and test data on data transmission rates are also given. 11 refs., 7 figs., 4 tabs.
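    The channel count and per-channel sampling rate quoted above fix the aggregate sample rate the system must sustain. The short calculation below works this out; the two-byte sample width is purely an assumption for illustration, since the abstract does not state the converter resolution.

```python
CHANNELS = 32            # from the abstract
SAMPLE_RATE_HZ = 6250    # 6.25 kHz per channel, from the abstract
BYTES_PER_SAMPLE = 2     # ASSUMPTION: 16-bit samples (not stated in the abstract)


def aggregate_rates(channels=CHANNELS, rate_hz=SAMPLE_RATE_HZ,
                    bytes_per_sample=BYTES_PER_SAMPLE):
    """Return (samples per second, bytes per second) summed over all channels."""
    samples_per_second = channels * rate_hz
    return samples_per_second, samples_per_second * bytes_per_sample


if __name__ == "__main__":
    samples, data = aggregate_rates()
    print(f"{samples} samples/s, {data} bytes/s under the assumed sample width")
```

    Only the sample rate (32 × 6250 = 200,000 samples per second) follows from the abstract; the byte rate scales with whatever sample width the hardware actually uses.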

  9. A parallel implementation of 3-d CT image reconstruction on a hypercube multiprocessor

    International Nuclear Information System (INIS)

    Chen, C.M.; Lee, S.Y.; Cho, Z.H.

    1990-01-01

    In this paper, the authors describe how image reconstruction in computerized tomography (CT) can be parallelized on a message-passing multiprocessor. In particular, the results obtained from parallel implementation of 3-D CT image reconstruction for parallel beam geometries on the Intel hypercube, iPSC/2, are presented. A two-stage pipelining approach is employed for filtering (convolution) and backprojection. The conventional sequential convolution algorithm is modified such that the symmetry of the filter kernel is fully utilized for parallelization. In the backprojection stage, the 3-D incremental algorithm, the authors' recently developed backprojection scheme which is shown to be faster than the conventional algorithm, is parallelized.

  10. Energy-efficient fault tolerance in multiprocessor real-time systems

    Science.gov (United States)

    Guo, Yifeng

    The recent progress in multiprocessor/multicore systems has important implications for real-time system design and operation. From vehicle navigation to space applications as well as industrial control systems, the trend is to deploy multiple processors in real-time systems: systems with 4 to 8 processors are common, and it is expected that many-core systems with dozens of processing cores will be available in the near future. For such systems, in addition to the general temporal requirements common to all real-time systems, two additional operational objectives are seen as critical: energy efficiency and fault tolerance. An intriguing dimension of the problem is that energy efficiency and fault tolerance are typically conflicting objectives, due to the fact that tolerating faults (e.g., permanent/transient) often requires extra resources with high energy consumption potential. In this dissertation, various techniques for energy-efficient fault tolerance in multiprocessor real-time systems have been investigated. First, the Reliability-Aware Power Management (RAPM) framework, which can preserve the system reliability with respect to transient faults when Dynamic Voltage Scaling (DVS) is applied for energy savings, is extended to support parallel real-time applications with precedence constraints. Next, the traditional Standby-Sparing (SS) technique for dual processor systems, which takes both transient and permanent faults into consideration while saving energy, is generalized to support multiprocessor systems with an arbitrary number of identical processors. Observing the inefficient usage of slack time in the SS technique, a Preference-Oriented Scheduling Framework is designed to address the problem where tasks are given preferences for being executed as soon as possible (ASAP) or as late as possible (ALAP). A preference-oriented earliest deadline (POED) scheduler is proposed and its application in multiprocessor systems for energy-efficient fault tolerance is

  11. Approach to Accelerating Dissolved Vector Buffer Generation in Distributed In-Memory Cluster Architecture

    Directory of Open Access Journals (Sweden)

    Jinxin Shen

    2018-01-01

    The buffer generation algorithm is a fundamental function in GIS, identifying areas of a given distance surrounding geographic features. Past research largely focused on buffer generation algorithms generated in a stand-alone environment. Moreover, dissolved buffer generation is data- and computing-intensive. In this scenario, the improvement in the stand-alone environment is limited when considering large-scale mass vector data. Nevertheless, recent parallel dissolved vector buffer algorithms suffer from scalability problems, leaving room for further optimization. At present, the prevailing in-memory cluster-computing framework, Spark, provides promising efficiency for computing-intensive analysis; however, it has seldom been researched for buffer analysis. On this basis, we propose a cluster-computing-oriented parallel dissolved vector buffer generating algorithm, called the HPBM, that contains a Hilbert-space-filling-curve-based data partition method, a data skew and cross-boundary objects processing strategy, and a depth-given tree-like merging method. Experiments are conducted in both stand-alone and cluster environments using real-world vector data that include points and roads. Compared with some existing parallel buffer algorithms, as well as various popular GIS software, the HPBM achieves a performance gain of more than 50%.
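    The idea behind a Hilbert-space-filling-curve-based data partition can be sketched as follows. This is not the HPBM implementation: it uses the standard textbook grid-to-Hilbert-index conversion, and the helper names `hilbert_index` and `partition_by_hilbert`, the integer grid coordinates, and the equal-count split are assumptions made for illustration.

```python
def hilbert_index(n, x, y):
    """Map grid cell (x, y) to its position along a Hilbert curve.

    `n` is the grid side length and must be a power of two; 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the curve stays continuous.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def partition_by_hilbert(points, n, parts):
    """Sort points by Hilbert index and cut into `parts` near-equal chunks.

    Hypothetical helper: points close on the curve tend to be close in
    space, so each chunk is a spatially compact group of features.
    """
    ordered = sorted(points, key=lambda p: hilbert_index(n, p[0], p[1]))
    base, extra = divmod(len(ordered), parts)
    chunks, start = [], 0
    for i in range(parts):
        stop = start + base + (1 if i < extra else 0)
        chunks.append(ordered[start:stop])
        start = stop
    return chunks


if __name__ == "__main__":
    cells = [(x, y) for x in range(4) for y in range(4)]
    for chunk in partition_by_hilbert(cells, 4, 4):
        print(chunk)
```

    Cutting the sorted sequence by count rather than by area is one simple way to limit data skew between partitions; objects that straddle a partition boundary would still need separate handling.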

  12. [Changes in cortical power distribution produced by memory consolidation as a function of a typewriting skill].

    Science.gov (United States)

    Cunha, Marlo; Bastos, Victor Hugo; Veiga, Heloisa; Cagy, Maurício; McDowell, Kaleb; Furtado, Vernon; Piedade, Roberto; Ribeiro, Pedro

    2004-09-01

    The present study aimed to investigate alterations in EEG patterns in normal, right-handed individuals, during the process of learning a specific motor skill (typewriting). Recent studies have shown that the cerebral cortex is susceptible to several changes during a learning process and that alterations in the brain's electrical patterns take place as a result of the acquisition of a motor skill and memory consolidation. In this context, subjects' brain electrical activity was analyzed before and after the motor task. EEG data were collected by a Braintech 3000 and analyzed by Neurometrics. For the statistical analysis, the behavioral variables "time" and "number of errors" were assessed by a one-way ANOVA. For the neurophysiological variable "Absolute Power", a paired t-Test was performed for each pair of electrodes CZ-C3/CZ-C4, in the theta and alpha frequency bands. The main results demonstrated a change in performance, through both behavioral variables ("time" and "number of errors"). At the same time, no changes were observed for the neurophysiological variable ("Absolute Power") in the theta band. On the other hand, a significant increase was observed in the alpha band in central areas (CZ-C3/CZ-C4). These results suggest an adaptation of the sensory-motor cortex, as a consequence of the typewriting training.

  13. Stochastic fluctuations and distributed control of gene expression impact cellular memory.

    Directory of Open Access Journals (Sweden)

    Guillaume Corre

    Despite the stochastic noise that characterizes all cellular processes, cells are able to maintain a stable level of gene expression and transmit it to their daughter cells. In order to better understand this phenomenon, we investigated the temporal dynamics of gene expression variation using a double reporter gene model. We compared cell clones with transgenes coding for highly stable mRNA and fluorescent proteins with clones expressing destabilized mRNAs and proteins. Both types of clones displayed strong heterogeneity of reporter gene expression levels. However, cells expressing stable gene products produced daughter cells with similar levels of reporter proteins, while in cell clones with short mRNA and protein half-lives the epigenetic memory of the gene expression level was completely suppressed. Computer simulations also confirmed the role of mRNA and protein stability in the conservation of constant gene expression levels over several cell generations. These data indicate that the conservation of a stable phenotype in a cellular lineage may largely depend on the slow turnover of mRNAs and proteins.

  14. Communication strategies for angular domain decomposition of transport calculations on message passing multiprocessors

    International Nuclear Information System (INIS)

    Azmy, Y.Y.

    1997-01-01

    The effect of three communication schemes for solving Arbitrarily High Order Transport (AHOT) methods of the Nodal type on parallel performance is examined via direct measurements and performance models. The target architecture in this study is Oak Ridge National Laboratory's 128-node Paragon XP/S 5 computer and the parallelization is based on the Parallel Virtual Machine (PVM) library. However, the conclusions reached can be easily generalized to a large class of message passing platforms and communication software. The three schemes considered here are: (1) PVM's global operations (broadcast and reduce), which utilize the Paragon's corresponding native operations based on a spanning tree routing; (2) the Bucket algorithm wherein the angular domain decomposition of the mesh sweep is complemented with a spatial domain decomposition of the accumulation process of the scalar flux from the angular flux and the convergence test; (3) a distributed memory version of the Bucket algorithm that pushes the spatial domain decomposition one step further by actually distributing the fixed source and flux iterates over the memories of the participating processes. The conclusion is that the Bucket algorithm is the most efficient of the three if all participating processes have sufficient memory to hold the entire problem arrays. Otherwise, the third scheme becomes necessary at an additional cost to speedup and parallel efficiency that is quantifiable via the parallel performance model.

  15. Three-dimensional magnetic field computation on a distributed memory parallel processor

    International Nuclear Information System (INIS)

    Barion, M.L.

    1990-01-01

    The analysis of three-dimensional magnetic fields by finite element methods frequently proves too onerous a task for the computing resource on which it is attempted. When non-linear and transient effects are included, it may become impossible to calculate the field distribution to sufficient resolution. One approach to this problem is to exploit the natural parallelism in the finite element method via parallel processing. This paper reports on an implementation of a finite element code for non-linear three-dimensional low-frequency magnetic field calculation on Intel's iPSC/2

  16. I/O-Optimal Distribution Sweeping on Private-Cache Chip Multiprocessors

    DEFF Research Database (Denmark)

    Ajwani, Deepak; Sitchinava, Nodar; Zeh, Norbert

    2011-01-01

    /PB) for a number of problems on axis aligned objects; P denotes the number of cores/processors, B denotes the number of elements that fit in a cache line, N and K denote the sizes of the input and output, respectively, and sortp(N) denotes the I/O complexity of sorting N items using P processors in the PEM model...... framework was introduced recently, and a number of algorithms for problems on axis-aligned objects were obtained using this framework. The obtained algorithms were efficient but not optimal. In this paper, we improve the framework to obtain algorithms with the optimal I/O complexity of O(sortp(N) + K...

  17. Automatic code generation for distributed robotic systems

    International Nuclear Information System (INIS)

    Jones, J.P.

    1993-01-01

    Hetero Helix is a software environment which supports relatively large robotic system development projects. The environment supports a heterogeneous set of message-passing LAN-connected common-bus multiprocessors, but the programming model seen by software developers is a simple shared memory. The conceptual simplicity of shared memory makes it an extremely attractive programming model, especially in large projects where coordinating a large number of people can itself become a significant source of complexity. We present results from three system development efforts conducted at Oak Ridge National Laboratory over the past several years. Each of these efforts used automatic software generation to create 10 to 20 percent of the system

  18. Quality-driven model-based design of multi-processor accelerators : an application to LDPC decoders

    NARCIS (Netherlands)

    Jan, Y.

    2012-01-01

    The recent spectacular progress in nano-electronic technology has enabled the implementation of very complex multi-processor systems on single chips (MPSoCs). However in parallel, new highly demanding complex embedded applications are emerging, in fields like communication and networking,

  19. Lower bounds for the head-body-tail problem on parallel machines: a computational study for the multiprocessor flow shop

    NARCIS (Netherlands)

    A. Vandevelde; J.A. Hoogeveen; C.A.J. Hurkens (Cor); J.K. Lenstra (Jan Karel)

    2005-01-01

    The multiprocessor flow-shop is the generalization of the flow-shop in which each machine is replaced by a set of identical machines. As finding a minimum-length schedule is NP-hard, we set out to find good lower and upper bounds. The lower bounds are based on relaxation of the

  20. Lower bounds for the head-body-tail problem on parallel machines : a computational study of the multiprocessor flow shop

    NARCIS (Netherlands)

    Vandevelde, A.; Hoogeveen, J.A.; Hurkens, C.A.J.; Lenstra, J.K.

    2005-01-01

    The multiprocessor flow-shop is the generalization of the flow-shop in which each machine is replaced by a set of identical machines. As finding a minimum-length schedule is NP-hard, we set out to find good lower and upper bounds. The lower bounds are based on relaxation of the capacities of all

  1. Centralized multiprocessor control system for the Frascati storage rings DAΦNE

    International Nuclear Information System (INIS)

    Di Pirro, G.; Milardi, C.; Serio, M.

    1992-01-01

    We describe the status of the DANTE (DAΦne New Tools Environment) control system for the new DAΦNE Φ-factory under construction at the Frascati National Laboratories. The system is based on a centralized communication architecture for simplicity and reliability. A central processor unit coordinates all communications between the consoles and the lower level distributed processing power, and continuously updates a central memory that contains the whole machine status. We have developed a system of VME Fiber Optic interfaces allowing very fast point to point communication between distant processors. Macintosh II personal computers are used as consoles. The lower levels are all built using the VME standard. (author)

  2. What happens when we compare the lifespan distributions of life script events and autobiographical memories of life story events? A cross-cultural study

    DEFF Research Database (Denmark)

    Zaragoza Scherman, Alejandra; Salgado, Sinué; Shao, Zhifang

    Cultural Life Script Theory (Berntsen and Rubin, 2004) provides a cultural explanation of the reminiscence bump: adults older than 40 years remember a significantly greater number of life events happening between 15 and 30 years of age (Rubin, Rahal, & Poon, 1998), compared to other lifetime periods....... Most of these memories are rated as emotionally positive (Rubin & Berntsen, 2003). The cultural life script represents culturally shared expectations about the order and timing of life events in a typical, idealised life course. By comparing the lifespan distribution of the life script events...... and memories of life story events, we can determine the degree to which the cultural life script serves as a recall template for autobiographical memories, especially of positive life events from adolescence and early adulthood, also known as the reminiscence bump period....

  3. Modeling Mental Speed: Decomposing Response Time Distributions in Elementary Cognitive Tasks and Correlations with Working Memory Capacity and Fluid Intelligence

    Directory of Open Access Journals (Sweden)

    Florian Schmitz

    2016-10-01

    Previous research has shown an inverse relation between response times in elementary cognitive tasks and intelligence, but findings are inconsistent as to which is the most informative score. We conducted a study (N = 200) using a battery of elementary cognitive tasks, working memory capacity (WMC) paradigms, and a test of fluid intelligence (gf). Frequently used candidate scores and model parameters derived from the response time (RT) distribution were tested. Results confirmed a clear correlation of mean RT with WMC and to a lesser degree with gf. Highly comparable correlations were obtained for alternative location measures with or without extreme value treatment. Moderate correlations were found as well for scores of RT variability, but they were not as strong as for mean RT. Additionally, there was a trend towards higher correlations for slow RT bands, as compared to faster RT bands. Clearer evidence was obtained in an ex-Gaussian decomposition of the response times: the exponential component was selectively related to WMC and gf in easy tasks, while mean response time was additionally predictive in the most complex tasks. The diffusion model parsimoniously accounted for these effects in terms of individual differences in drift rate. Finally, correlations of model parameters as trait-like dispositions were investigated across different tasks, by correlating parameters of the diffusion and the ex-Gaussian model with conventional RT and accuracy scores.
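    An ex-Gaussian decomposition treats each response time as the sum of a Gaussian component (mean mu, standard deviation sigma) and an exponential component (mean tau). The sketch below recovers the three parameters from the first three sample moments, using mean = mu + tau, variance = sigma^2 + tau^2, and third central moment = 2 tau^3. It is an assumption-laden illustration: the abstract does not say which estimation method the study used (maximum likelihood is common in practice), and the function names and simulated parameter values are hypothetical.

```python
import math
import random


def fit_ex_gaussian(samples):
    """Method-of-moments estimate of (mu, sigma, tau) for ex-Gaussian data."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    # Third central moment of an ex-Gaussian is 2 * tau**3.
    tau = (max(m3, 0.0) / 2.0) ** (1.0 / 3.0)
    # Variance is sigma**2 + tau**2; clip at zero for noisy samples.
    sigma = math.sqrt(max(var - tau ** 2, 0.0))
    mu = mean - tau
    return mu, sigma, tau


def simulate_rts(mu, sigma, tau, n, seed=0):
    """Draw n synthetic response times: Gaussian plus exponential."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau) for _ in range(n)]


if __name__ == "__main__":
    # Illustrative values in milliseconds, not taken from the study.
    rts = simulate_rts(mu=400.0, sigma=40.0, tau=100.0, n=200_000)
    print(fit_ex_gaussian(rts))
```

    In this parameterization tau captures the slow tail of the distribution, which is the component the abstract reports as selectively related to WMC and gf in easy tasks.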

  4. Some methods of encoding simple visual images for use with a sparse distributed memory, with applications to character recognition

    Science.gov (United States)

    Jaeckel, Louis A.

    1989-01-01

    To study the problems of encoding visual images for use with a Sparse Distributed Memory (SDM), I consider a specific class of images: those that consist of several pieces, each of which is a line segment or an arc of a circle. This class includes line drawings of characters such as letters of the alphabet. I give a method of representing a segment or an arc by five numbers in a continuous way; that is, similar arcs have similar representations. I also give methods for encoding these numbers as bit strings in an approximately continuous way. The set of possible segments and arcs may be viewed as a five-dimensional manifold M, whose structure is like a Möbius strip. An image, considered to be an unordered set of segments and arcs, is therefore represented by a set of points in M, one for each piece. I then discuss the problem of constructing a preprocessor to find the segments and arcs in these images, although a preprocessor has not been developed. I also describe a possible extension of the representation.
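    One simple way to encode a number as a bit string in an approximately continuous way is a thermometer (unary) code, in which the Hamming distance between two codes grows with the difference between the encoded values. The sketch below illustrates that property only; it is not the encoding described in the report, the function names are hypothetical, and it handles only bounded, non-periodic quantities (an angle that wraps around would need a different code).

```python
def thermometer(value, lo, hi, n_bits):
    """Encode `value` in [lo, hi] as n_bits bits: k ones followed by zeros."""
    if not lo <= value <= hi:
        raise ValueError("value outside [lo, hi]")
    k = round((value - lo) / (hi - lo) * n_bits)
    return [1] * k + [0] * (n_bits - k)


def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    near = hamming(thermometer(0.25, 0, 1, 8), thermometer(0.5, 0, 1, 8))
    far = hamming(thermometer(0.25, 0, 1, 8), thermometer(1.0, 0, 1, 8))
    print(near, far)  # nearby values differ in fewer bits than distant ones
```

    Because a sparse distributed memory retrieves by Hamming distance between addresses, a code with this property lets similar inputs activate overlapping sets of storage locations.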

  5. The performance of disk arrays in shared-memory database machines

    Science.gov (United States)

    Katz, Randy H.; Hong, Wei

    1993-01-01

    In this paper, we examine how disk arrays and shared memory multiprocessors lead to an effective method for constructing database machines for general-purpose complex query processing. We show that disk arrays can lead to cost-effective storage systems if they are configured from suitably small form-factor disk drives. We introduce the storage system metric data temperature as a way to evaluate how well a disk configuration can sustain its workload, and we show that disk arrays can sustain the same data temperature as a more expensive mirrored-disk configuration. We use the metric to evaluate the performance of disk arrays in XPRS, an operational shared-memory multiprocessor database system being developed at the University of California, Berkeley.

  6. Distributed, Embedded and Real-time Java Systems

    CERN Document Server

    Wellings, Andy

    2012-01-01

    Research on real-time Java technology has been prolific over the past decade, leading to a large number of corresponding hardware and software solutions, and frameworks for distributed and embedded real-time Java systems. This book is aimed primarily at researchers in real-time embedded systems, particularly those who wish to understand the current state of the art in using Java in this domain. Much of the work in distributed, embedded and real-time Java has focused on the Real-time Specification for Java (RTSJ) as the underlying base technology, and consequently many of the chapters in this book address issues with, or solve problems using, this framework. Describes innovative techniques in: scheduling, memory management, quality of service and communication systems supporting real-time Java applications; Includes coverage of multiprocessor embedded systems and parallel programming; Discusses state-of-the-art resource management for embedded systems, including Java’s real-time garbage collect...

  7. The reminiscence bump without memories: The distribution of imagined word-cued and important autobiographical memories in a hypothetical 70-year-old

    DEFF Research Database (Denmark)

    Koppel, Jonathan; Berntsen, Dorthe

    2016-01-01

    The reminiscence bump is the disproportionate number of autobiographical memories dating from adolescence and early adulthood. It has often been ascribed to a consolidation of the mature self in the period covered by the bump. Here we stripped away factors relating to the characteristics of autob...

  8. Neural markers of negative symptom outcomes in distributed working memory brain activity of antipsychotic-naive schizophrenia patients

    DEFF Research Database (Denmark)

    Nejad, Ayna B.; Madsen, Kristoffer H.; Ebdrup, Bjørn H.

    2013-01-01

    Since working memory deficits in schizophrenia have been linked to negative symptoms, we tested whether features of the one could predict the treatment outcome in the other. Specifically, we hypothesized that working memory-related functional connectivity at pre-treatment can predict improvement...

  9. Distributed Optimization System

    Science.gov (United States)

    Hurtado, John E.; Dohrmann, Clark R.; Robinett, III, Rush D.

    2004-11-30

    A search system and method for controlling multiple agents to optimize an objective using distributed sensing and cooperative control. The search agent can be one or more physical agents, such as a robot, and can be software agents for searching cyberspace. The objective can be: chemical sources, temperature sources, radiation sources, light sources, evaders, trespassers, explosive sources, time dependent sources, time independent sources, function surfaces, maximization points, minimization points, and optimal control of a system such as a communication system, an economy, a crane, and a multi-processor computer.

  10. Multi-processor system for real-time deconvolution and flow estimation in medical ultrasound

    DEFF Research Database (Denmark)

    Jensen, Jesper Lomborg; Jensen, Jørgen Arendt; Stetson, Paul F.

    1996-01-01

    of the algorithms. Many of the algorithms can only be properly evaluated in a clinical setting with real-time processing, which generally cannot be done with conventional equipment. This paper therefore presents a multi-processor system capable of performing 1.2 billion floating point operations per second on RF...... filter is used with a second time-reversed recursive estimation step. Here it is necessary to perform about 70 arithmetic operations per RF sample or about 1 billion operations per second for real-time deconvolution. Furthermore, these have to be floating point operations due to the adaptive nature...... interfaced to our previously-developed real-time sampling system that can acquire RF data at a rate of 20 MHz and simultaneously transmit the data at 20 MHz to the processing system via several parallel channels. These two systems can, thus, perform real-time processing of ultrasound data. The advantage...

  11. 2-D fluid dynamics models for laser driven fusion on IBM 3090 vector multiprocessors

    International Nuclear Information System (INIS)

    Atzeni, S.

    1988-01-01

    Fluid-dynamics codes for laser fusion are complex research codes, consisting of many distinct modules and embodying a variety of numerical methods. They are therefore good candidates for testing general purpose advanced computer architectures and the related software. In this paper, after a brief outline of the basic concepts of laser fusion, the implementation of the 2-D laser fusion fluid code DUED on the IBM 3090 VF vector multiprocessors is discussed. Emphasis is put on parallelization, performed by means of IBM Parallel FORTRAN (PF). It is shown how different modules have been optimized by using different features of PF: i) modules based on depth-2 nested loops exploit automatic parallelization; ii) laser light ray tracing is partitioned by scheduling parallel ICCG algorithm (executed in parallel by appropriately synchronized parallel subroutines). Performance results are given for separate modules of the code, as well as for typical complete runs.

  12. Commodity multi-processor systems in the ATLAS level-2 trigger

    International Nuclear Information System (INIS)

    Abolins, M.; Blair, R.; Bock, R.; Bogaerts, A.; Dawson, J.; Ermoline, Y.; Hauser, R.; Kugel, A.; Lay, R.; Muller, M.; Noffz, K.-H.; Pope, B.; Schlereth, J.; Werner, P.

    2000-01-01

    Low cost SMP (Symmetric Multi-Processor) systems provide substantial CPU and I/O capacity. These features together with the ease of system integration make them an attractive and cost effective solution for a number of real-time applications in event selection. In ATLAS the authors consider them as intelligent input buffers (active ROB complex), as event flow supervisors or as powerful processing nodes. Measurements of the performance of one off-the-shelf commercial 4-processor PC with two PCI buses, equipped with commercial FPGA based data source cards (microEnable) and running commercial software are presented and mapped on such applications together with a long-term program of work. The SMP systems may be considered as an important building block in future data acquisition systems

  13. Commodity multi-processor systems in the ATLAS level-2 trigger

    CERN Document Server

    Abolins, M; Bock, R; Bogaerts, J A C; Dawson, J; Ermoline, Y; Hauser, R; Kugel, A; Lay, R; Müller, M; Noffz, K H; Pope, B; Schlereth, J L; Werner, P

    2000-01-01

    Low cost SMP (symmetric multi-processor) systems provide substantial CPU and I/O capacity. These features together with the ease of system integration make them an attractive and cost effective solution for a number of real-time applications in event selection. In ATLAS we consider them as intelligent input buffers (an "active" ROB complex), as event flow supervisors or as powerful processing nodes. Measurements of the performance of one off-the-shelf commercial 4-processor PC with two PCI buses, equipped with commercial FPGA based data source cards (microEnable) and running commercial software are presented and mapped on such applications together with a long-term programme of work. The SMP systems may be considered as an important building block in future data acquisition systems. (9 refs).

  14. Parallel definition of tear film maps on distributed-memory clusters for the support of dry eye diagnosis.

    Science.gov (United States)

    González-Domínguez, Jorge; Remeseiro, Beatriz; Martín, María J

    2017-02-01

    The analysis of the interference patterns on the tear film lipid layer is a useful clinical test to diagnose dry eye syndrome. This task can be automated with a high degree of accuracy by means of the use of tear film maps. However, the time required by the existing applications to generate them prevents a wider acceptance of this method by medical experts. Multithreading has been previously successfully employed by the authors to accelerate the tear film map definition on multicore single-node machines. In this work, we propose a hybrid message-passing and multithreading parallel approach that further accelerates the generation of tear film maps by exploiting the computational capabilities of distributed-memory systems such as multicore clusters and supercomputers. The algorithm for drawing tear film maps is parallelized using Message Passing Interface (MPI) for inter-node communications and the multithreading support available in the C++11 standard for intra-node parallelization. The original algorithm is modified to reduce the communications and increase the scalability. The hybrid method has been tested on 32 nodes of an Intel cluster (with two 12-core Haswell 2680v3 processors per node) using 50 representative images. Results show that maximum runtime is reduced from almost two minutes using the previous only-multithreaded approach to less than ten seconds using the hybrid method. The hybrid MPI/multithreaded implementation can be used by medical experts to obtain tear film maps in only a few seconds, which will significantly accelerate and facilitate the diagnosis of the dry eye syndrome. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  15. Directions for memory hierarchies and their components: research and development

    International Nuclear Information System (INIS)

    Smith, A.J.

    1978-10-01

    The memory hierarchy is usually the largest identifiable part of a computer system and making effective use of it is critical to the operation and use of the system. The levels of such a memory hierarchy are considered and the state of the art and likely directions for both research and development are described. Algorithmic and logical features of the hierarchy not directly associated with specific components are also discussed. Among the problems believed to be the most significant are the following: (a) evaluate the effectiveness of gap filler technology as a level of storage between main memory and disk, and if it proves to be effective, determine how/where it should be used, (b) develop algorithms for the use of mass storage in a large computer system, and (c) determine how cache memories should be implemented in very large, fast multiprocessor systems

  16. Use of a genetic algorithm to solve two-fluid flow problems on an NCUBE multiprocessor computer

    International Nuclear Information System (INIS)

    Pryor, R.J.; Cline, D.D.

    1992-01-01

    A method of solving the two-phase fluid flow equations using a genetic algorithm on a NCUBE multiprocessor computer is presented. The topics discussed are the two-phase flow equations, the genetic representation of the unknowns, the fitness function, the genetic operators, and the implementation of the algorithm on the NCUBE computer. The efficiency of the implementation is investigated using a pipe blowdown problem. Effects of varying the genetic parameters and the number of processors are presented

  17. Use of a genetic algorithm to solve two-fluid flow problems on an NCUBE multiprocessor computer

    International Nuclear Information System (INIS)

    Pryor, R.J.; Cline, D.D.

    1993-01-01

    A method of solving the two-phase fluid flow equations using a genetic algorithm on a NCUBE multiprocessor computer is presented. The topics discussed are the two-phase flow equations, the genetic representation of the unknowns, the fitness function, the genetic operators, and the implementation of the algorithm on the NCUBE computer. The efficiency of the implementation is investigated using a pipe blowdown problem. Effects of varying the genetic parameters and the number of processors are presented. (orig.)

  18. Splenectomy alters distribution and turnover but not numbers or protective capacity of de novo generated memory CD8 T cells.

    Directory of Open Access Journals (Sweden)

    Marie Kim

    2014-11-01

    The spleen is a highly compartmentalized lymphoid organ that allows for efficient antigen presentation and activation of immune responses. Additionally, the spleen itself functions to remove senescent red blood cells, filter bacteria, and sequester platelets. Splenectomy, commonly performed after blunt force trauma or splenomegaly, has been shown to increase risk of certain bacterial and parasitic infections years after removal of the spleen. Although previous studies report defects in memory B cells and IgM titers in splenectomized patients, the effect of splenectomy on CD8 T cell responses and memory CD8 T cell function remains ill defined. Using TCR-transgenic P14 cells, we demonstrate that homeostatic proliferation and representation of pathogen-specific memory CD8 T cells in the blood are enhanced in splenectomized compared to sham surgery mice. Surprisingly, despite the enhanced turnover, splenectomized mice displayed no changes in total memory CD8 T cell numbers nor impaired protection against lethal dose challenge with Listeria monocytogenes. Thus, our data suggest that memory CD8 T cell maintenance and function remain intact in the absence of the spleen.

  19. 3-dimensional magnetotelluric inversion including topography using deformed hexahedral edge finite elements and direct solvers parallelized on symmetric multiprocessor computers - Part II: direct data-space inverse solution

    Science.gov (United States)

    Kordy, M.; Wannamaker, P.; Maris, V.; Cherkaev, E.; Hill, G.

    2016-01-01

    Following the creation described in Part I of a deformable edge finite-element simulator for 3-D magnetotelluric (MT) responses using direct solvers, in Part II we develop an algorithm named HexMT for 3-D regularized inversion of MT data including topography. Direct solvers parallelized on large-RAM, symmetric multiprocessor (SMP) workstations are used also for the Gauss-Newton model update. By exploiting the data-space approach, the computational cost of the model update becomes much less in both time and computer memory than the cost of the forward simulation. In order to regularize using the second norm of the gradient, we factor the matrix related to the regularization term and apply its inverse to the Jacobian, which is done using the MKL PARDISO library. For dense matrix multiplication and factorization related to the model update, we use the PLASMA library which shows very good scalability across processor cores. A synthetic test inversion using a simple hill model shows that including topography can be important; in this case depression of the electric field by the hill can cause false conductors at depth or mask the presence of resistive structure. With a simple model of two buried bricks, a uniform spatial weighting for the norm of model smoothing recovered more accurate locations for the tomographic images compared to weightings which were a function of parameter Jacobians. We implement joint inversion for static distortion matrices tested using the Dublin secret model 2, for which we are able to reduce nRMS to ~1.1 while avoiding oscillatory convergence. Finally we test the code on field data by inverting full impedance and tipper MT responses collected around Mount St Helens in the Cascade volcanic chain. Among several prominent structures, the north-south trending, eruption-controlling shear zone is clearly imaged in the inversion.

  20. A distributed real-time operating system

    International Nuclear Information System (INIS)

    Tuynman, F.; Hertzberger, L.O.

    1984-07-01

    A distributed real-time operating system, Fados, has been developed for an embedded multi-processor system. The operating system is based on a host target approach and provides for communication between arbitrary processes on host and target machine. The facilities offered are, apart from process communication, access to the file system on the host by programs on the target machine and monitoring and debugging of programs on the target machine from the host. The process communication has been designed in such a way that the possibilities are the same as those offered by the Ada programming language. The operating system is implemented on a MC 68000 based multiprocessor system in combination with a Unix host. (orig.)

  1. A model for removing the increased recall of recent events from the temporal distribution of autobiographical memory

    NARCIS (Netherlands)

    Janssen, S.M.J.; Gralak, A.; Murre, J.M.J.

    2011-01-01

    The reminiscence bump is the tendency to recall relatively many personal events from the period in which the individual was between 10 and 30 years old. This effect has only been found in autobiographical memory studies that used participants who were older than 40 years of age. The increased recall

  2. Compilation time analysis to minimize run-time overhead in preemptive scheduling on multiprocessors

    Science.gov (United States)

    Wauters, Piet; Lauwereins, Rudy; Peperstraete, J.

    1994-10-01

    This paper describes a scheduling method for hard real-time Digital Signal Processing (DSP) applications, implemented on a multi-processor. Due to the very high operating frequencies of DSP applications (typically hundreds of kHz), run-time overhead should be kept as small as possible. Because static scheduling introduces very little run-time overhead, it is used as much as possible. Dynamic pre-emption of tasks is allowed if and only if it leads to better performance in spite of the extra run-time overhead. We essentially combine static scheduling with dynamic pre-emption using static priorities. Since we are dealing with hard real-time applications, we must be able to guarantee at compile-time that all timing requirements will be satisfied at run-time. We will show that our method performs at least as well as any static scheduling method. It also reduces the total number of dynamic pre-emptions compared with run-time methods like deadline monotonic scheduling.
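
    For reference, a compile-time timing guarantee of the kind mentioned in this record can be illustrated with the textbook response-time test for fixed-priority pre-emptive scheduling on one processor, using deadline monotonic priorities. This is a minimal sketch of that standard test, not the authors' method; the `Task` type, the `response_times` function and the task parameters are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Task:
    name: str
    wcet: int      # worst-case execution time (assumed known at compile time)
    period: int    # minimum inter-arrival time
    deadline: int  # relative deadline, assumed <= period


def response_times(tasks: List[Task]) -> Dict[str, Optional[int]]:
    """Worst-case response time per task, or None if its deadline can be missed."""
    # Deadline monotonic: the shorter the deadline, the higher the priority.
    ordered = sorted(tasks, key=lambda t: t.deadline)
    result: Dict[str, Optional[int]] = {}
    for i, task in enumerate(ordered):
        higher = ordered[:i]
        r = task.wcet
        while True:
            # Each higher-priority task can pre-empt ceil(r / period) times.
            interference = sum(-(-r // h.period) * h.wcet for h in higher)
            nxt = task.wcet + interference
            if nxt > task.deadline:
                result[task.name] = None
                break
            if nxt == r:  # fixed point reached
                result[task.name] = r
                break
            r = nxt
    return result
```

    With the hypothetical tasks A (1, 4, 4), B (2, 6, 6) and C (3, 12, 12), given as (execution time, period, deadline), the iteration settles at response times 1, 3 and 10, so all deadlines are met.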

  3. An efficient communication scheme for solving Sn equations on message-passing multiprocessors

    International Nuclear Information System (INIS)

    Azmy, Y.Y.

    1993-01-01

    Early models of Intel's hypercube multiprocessors, e.g., the iPSC/1 and iPSC/2, were characterized by the high latency of message passing. This relatively weak dependence of the communication penalty on the size of messages, in contrast to its strong dependence on the number of messages, justified using the Fan-in Fan-out algorithm (which implements a minimum spanning tree path) to perform global operations, such as global sums, etc. Recent models of message-passing computers, such as the iPSC/860 and the Paragon, have been found to possess much smaller latency, thus forcing a reexamination of the issue of performance optimization with respect to communication schemes. Essentially, the Fan-in Fan-out scheme minimizes the number of nonsimultaneous messages sent but not the volume of data traffic across the network. Furthermore, if a global operation is performed in conjunction with the message passing, a large fraction of the attached nodes remains idle as the number of utilized processors is halved in each step of the process. On the other hand, the Recursive Halving scheme offers the smallest communication cost for global operations but has some drawbacks
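
    The idle-processor behaviour of the Fan-in Fan-out scheme described in this record can be seen in a small serial simulation. The sketch below assumes a power-of-two node count and a binary reduction tree rooted at node 0; the function name and the message bookkeeping are illustrative assumptions, not taken from the paper.

```python
from typing import List, Tuple


def fan_in_fan_out_sum(values: List[float]) -> Tuple[List[float], List[int]]:
    """Simulate a tree-based global sum over len(values) nodes.

    Returns the value held by every node afterwards and the number of
    messages sent in each communication step.
    """
    p = len(values)
    if p == 0 or p & (p - 1):
        raise ValueError("number of nodes must be a power of two")
    partial = list(values)
    messages_per_step: List[int] = []

    # Fan-in: in each step every other active node sends its partial sum
    # to a partner and then drops out, so the active set is halved.
    stride = 1
    while stride < p:
        senders = range(stride, p, 2 * stride)
        for src in senders:
            partial[src - stride] += partial[src]
        messages_per_step.append(len(senders))
        stride *= 2

    # Fan-out: node 0 now holds the total and it is passed back down
    # the same tree, doubling the set of informed nodes in each step.
    stride = p // 2
    while stride >= 1:
        receivers = range(stride, p, 2 * stride)
        for dst in receivers:
            partial[dst] = partial[dst - stride]
        messages_per_step.append(len(receivers))
        stride //= 2
    return partial, messages_per_step
```

    For eight nodes the per-step message counts are 4, 2, 1, 1, 2, 4: only 14 messages in six steps, but at the top of the tree a single pair of nodes is communicating while the rest wait.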

  4. Scheduling for energy and reliability management on multiprocessor real-time systems

    Science.gov (United States)

    Qi, Xuan

    Scheduling algorithms for multiprocessor real-time systems have been studied for years with many well-recognized algorithms proposed. However, it is still an evolving research area and many problems remain open due to their intrinsic complexities. With the emergence of multicore processors, it is necessary to re-investigate the scheduling problems and design/develop efficient algorithms for better system utilization, low scheduling overhead, high energy efficiency, and better system reliability. Focusing on cluster scheduling with optimal global schedulers, we study the utilization bound and scheduling overhead for a class of cluster-optimal schedulers. Then, taking energy/power consumption into consideration, we develop energy-efficient scheduling algorithms for real-time systems, especially for the proliferating embedded systems with limited energy budgets. As commonly deployed energy-saving techniques (e.g., dynamic voltage and frequency scaling (DVFS)) can significantly affect system reliability, we study schedulers that have intelligent mechanisms to recuperate system reliability to satisfy the quality assurance requirements. Extensive simulation is conducted to evaluate the performance of the proposed algorithms on reduction of scheduling overhead, energy saving, and reliability improvement. The simulation results show that the proposed reliability-aware power management schemes could preserve the system reliability while still achieving substantial energy savings.

  5. E-Token Energy-Aware Proportionate Sharing Scheduling Algorithm for Multiprocessor Systems

    Directory of Open Access Journals (Sweden)

    Pasupuleti Ramesh

    2017-01-01

    WSNs play a vital role in applications ranging from small-scale healthcare surveillance systems to large-scale environmental monitoring. Their design for energy-constrained applications is challenging. Sensors in WSNs are expected to run independently for long periods. Replacing exhausted batteries is costly and may not even be possible in hostile environments. Multiprocessors are used in WSNs for high-performance scientific computing, where each processor is assigned the same or a different workload. When the computational demands of the system increase, energy-efficient approaches play an important role in extending system lifetime. Energy efficiency is commonly pursued with a proportionate fair scheduler, which introduces an abnormal overloading effect. To overcome these problems, E-token Energy-Aware Proportionate Sharing (EEAPS) scheduling is proposed here. The power consumption of each thread/task is calculated and the tasks are allotted to the multiple processors through an auctioning mechanism. The algorithm is simulated using the real-time simulator (RTSIM) and the results are tested.

  6. A Taxonomy of Reconfigurable Single-/Multiprocessor Systems-on-Chip

    Directory of Open Access Journals (Sweden)

    Diana Göhringer

    2009-01-01

    Runtime adaptivity of hardware in processor architectures is a novel trend, which is under investigation in a variety of research labs all over the world. The runtime exchange of modules, implemented on a reconfigurable hardware, affects the instruction flow (e.g., in reconfigurable instruction set processors) or the data flow, which has a strong impact on the performance of an application. Furthermore, the choice of a certain processor architecture related to the class of target applications is a crucial point in application development. A simple example is the domain of high-performance computing applications found in meteorology or high-energy physics, where vector processors are the optimal choice. A classification scheme for computer systems was provided in 1966 by Flynn where single/multiple data and instruction streams were combined into four types of architectures. This classification is now used as a foundation for an extended classification scheme including runtime adaptivity as a further degree of freedom for processor architecture design. The developed scheme is validated by a multiprocessor system implemented on reconfigurable hardware as well as by a classification of existing static and reconfigurable processor systems.

  7. Hybrid shared/distributed parallelism for 3D characteristics transport solvers

    International Nuclear Information System (INIS)

    Dahmani, M.; Roy, R.

    2005-01-01

    In this paper, we will present a new hybrid parallel model for solving large-scale 3-dimensional neutron transport problems used in nuclear reactor simulations. Large heterogeneous reactor problems, like those that occur when simulating CANDU cores, have remained computationally intensive and impractical for routine applications on single-node or even vector computers. Based on the characteristics method, this new model is designed to solve the transport equation after distributing the calculation load on a network of shared memory multi-processors. The tracks are either generated on the fly at each characteristics sweep or stored in sequential files. The load balancing is taken into account by estimating the calculation load of tracks and by distributing batches of uniform load on each node of the network. Moreover, the communication overhead can be predicted after benchmarking the latency and bandwidth using an appropriate network test suite. These models are useful for predicting the performance of the parallel applications and for analyzing the scalability of the parallel systems. (authors)
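
    The two ingredients named in this record, batches of uniform estimated load and a communication cost predicted from measured latency and bandwidth, can be sketched as follows. The greedy heuristic and the linear cost model are common choices assumed here for illustration; the record does not state which ones the authors used, and all names and numbers are hypothetical.

```python
import heapq
from typing import List, Sequence


def balance_tracks(track_loads: Sequence[float], n_nodes: int) -> List[List[int]]:
    """Assign track indices to nodes so that the estimated loads are nearly uniform.

    Greedy heuristic: give the heaviest remaining track to the least-loaded node.
    """
    heap = [(0.0, node) for node in range(n_nodes)]  # (accumulated load, node id)
    heapq.heapify(heap)
    batches: List[List[int]] = [[] for _ in range(n_nodes)]
    for idx in sorted(range(len(track_loads)), key=lambda i: -track_loads[i]):
        load, node = heapq.heappop(heap)
        batches[node].append(idx)
        heapq.heappush(heap, (load + track_loads[idx], node))
    return batches


def message_time(n_bytes: float, latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """Linear cost model: a fixed latency plus message size divided by bandwidth."""
    return latency_s + n_bytes / bandwidth_bytes_per_s
```

    For example, eight tracks with estimated loads 8, 7, ..., 1 split over two nodes into batches of load 18 each, and a 1 MB message on a link with 100 microseconds latency and 100 MB/s bandwidth is predicted to take about 10.1 ms.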

  8. Simulating Pre-Asymptotic, Non-Fickian Transport Although Doing Simple Random Walks - Supported By Empirical Pore-Scale Velocity Distributions and Memory Effects

    Science.gov (United States)

    Most, S.; Jia, N.; Bijeljic, B.; Nowak, W.

    2016-12-01

    Pre-asymptotic characteristics are almost ubiquitous when analyzing solute transport processes in porous media. These pre-asymptotic aspects are caused by spatial coherence in the velocity field and by its heterogeneity. For the Lagrangian perspective of particle displacements, the causes of pre-asymptotic, non-Fickian transport are skewed velocity distribution, statistical dependencies between subsequent increments of particle positions (memory) and dependence between the x, y and z-components of particle increments. Valid simulation frameworks should account for these factors. We propose a particle tracking random walk (PTRW) simulation technique that can use empirical pore-space velocity distributions as input, enforces memory between subsequent random walk steps, and considers cross dependence. Thus, it is able to simulate pre-asymptotic non-Fickian transport phenomena. Our PTRW framework contains an advection/dispersion term plus a diffusion term. The advection/dispersion term produces time-series of particle increments from the velocity CDFs. These time series are equipped with memory by enforcing that the CDF values of subsequent velocities change only slightly. The latter is achieved through a random walk on the axis of CDF values between 0 and 1. The virtual diffusion coefficient for that random walk is our only fitting parameter. Cross-dependence can be enforced by constraining the random walk to certain combinations of CDF values between the three velocity components in x, y and z. We will show that this modelling framework is capable of simulating non-Fickian transport by comparison with a pore-scale transport simulation and we analyze the approach to asymptotic behavior.
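
    The memory mechanism described in this record, a random walk on the axis of CDF values whose position is mapped back to a velocity, can be sketched in one dimension. The Gaussian increments, the reflecting boundaries and the parameter `sigma` (standing in for the virtual diffusion coefficient) are assumptions made for this illustration; cross-dependence between velocity components is not modelled.

```python
import random
from typing import List, Sequence


def reflect(u: float) -> float:
    """Fold a value back into [0, 1] by reflecting it at the boundaries."""
    u = abs(u) % 2.0
    return 2.0 - u if u > 1.0 else u


def empirical_quantile(sorted_velocities: Sequence[float], u: float) -> float:
    """Inverse empirical CDF: map a CDF value u in [0, 1] to a sampled velocity."""
    idx = min(int(u * len(sorted_velocities)), len(sorted_velocities) - 1)
    return sorted_velocities[idx]


def correlated_velocities(
    velocities: Sequence[float], n_steps: int, sigma: float, rng: random.Random
) -> List[float]:
    """Velocity time series whose CDF values change only slightly per step."""
    ordered = sorted(velocities)
    u = rng.random()  # start at a random position on the CDF axis
    series = []
    for _ in range(n_steps):
        series.append(empirical_quantile(ordered, u))
        u = reflect(u + rng.gauss(0.0, sigma))  # small step keeps memory
    return series
```

    A small `sigma` keeps successive velocities close in rank, so a particle that is slow tends to stay slow for a while; with `sigma` of zero the series is constant, and a large `sigma` approaches independent sampling.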

  9. Distributed optimization system and method

    Science.gov (United States)

    Hurtado, John E.; Dohrmann, Clark R.; Robinett, III, Rush D.

    2003-06-10

    A search system and method for controlling multiple agents to optimize an objective using distributed sensing and cooperative control. The search agent can be one or more physical agents, such as a robot, and can be software agents for searching cyberspace. The objective can be: chemical sources, temperature sources, radiation sources, light sources, evaders, trespassers, explosive sources, time dependent sources, time independent sources, function surfaces, maximization points, minimization points, and optimal control of a system such as a communication system, an economy, a crane, and a multi-processor computer.

  10. Parallel algorithms for quantum chemistry. I. Integral transformations on a hypercube multiprocessor

    International Nuclear Information System (INIS)

    Whiteside, R.A.; Binkley, J.S.; Colvin, M.E.; Schaefer, H.F. III

    1987-01-01

    For many years it has been recognized that fundamental physical constraints such as the speed of light will limit the ultimate speed of single processor computers to less than about three billion floating point operations per second (3 GFLOPS). This limitation is becoming increasingly restrictive as commercially available machines are now within an order of magnitude of this asymptotic limit. A natural way to avoid this limit is to harness together many processors to work on a single computational problem. In principle, these parallel processing computers have speeds limited only by the number of processors one chooses to acquire. The usefulness of potentially unlimited processing speed to a computationally intensive field such as quantum chemistry is obvious. If these methods are to be applied to significantly larger chemical systems, parallel schemes will have to be employed. For this reason we have developed distributed-memory algorithms for a number of standard quantum chemical methods. We are currently implementing these on a 32 processor Intel hypercube. In this paper we present our algorithm and benchmark results for one of the bottleneck steps in quantum chemical calculations: the four index integral transformation
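
    For readers unfamiliar with the bottleneck step named in this record, the four-index transformation converts two-electron integrals from an atomic-orbital basis to a molecular-orbital basis, (pq|rs) = sum over a, b, c, d of C[a][p] C[b][q] C[c][r] C[d][s] (ab|cd). The serial sketch below contrasts the direct sum with the usual sequence of four one-index transformations. It is not the authors' distributed-memory algorithm; the dictionary layout and function names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple

Tensor = Dict[Tuple[int, int, int, int], float]


def transform_naive(ao: Tensor, c: List[List[float]]) -> Tensor:
    """Direct eight-fold loop: cost grows as n**8."""
    n = len(c)
    mo: Tensor = {}
    for p, q, r, s in product(range(n), repeat=4):
        mo[p, q, r, s] = sum(
            c[a][p] * c[b][q] * c[g][r] * c[d][s] * ao[a, b, g, d]
            for a, b, g, d in product(range(n), repeat=4)
        )
    return mo


def quarter(t: Tensor, c: List[List[float]], axis: int) -> Tensor:
    """Transform a single index (position `axis`) of the tensor: cost n**5."""
    n = len(c)
    out: Tensor = {}
    for idx in product(range(n), repeat=4):
        total = 0.0
        for a in range(n):
            src = idx[:axis] + (a,) + idx[axis + 1:]
            total += c[a][idx[axis]] * t[src]
        out[idx] = total
    return out


def transform_stepwise(ao: Tensor, c: List[List[float]]) -> Tensor:
    """Four successive quarter transformations give the same result."""
    t = ao
    for axis in range(4):
        t = quarter(t, c, axis)
    return t
```

    Both routines return the same tensor; the stepwise form is the one normally used because its cost scales as n**5 rather than n**8.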

  11. IEEE P1596, a scalable coherent interface for GigaByte/sec multiprocessor applications

    International Nuclear Information System (INIS)

    Gustavson, D.B.

    1988-11-01

    IEEE P1596, the Scalable Coherent Interface (formerly known as SuperBus) is based on experience gained during the development of Fastbus (IEEE 960), Futurebus (IEEE 896.1) and other modern 32-bit buses. SCI goals include a minimum bandwidth of 1 GByte/sec per processor; efficient support of a coherent distributed-cache image of shared memory; and support for segmentation, bus repeaters and general switched interconnections like Banyan, Omega, or full crossbar networks. To achieve these ambitious goals, SCI must sacrifice the immediate handshake characteristic of the present generation of buses in favor of a packet-like split-cycle protocol. Wire-ORs, broadcasts, and even ordinary passive bus structures are to be avoided. However, a lower performance (1 GByte/sec per backplane instead of per processor) implementation using a register insertion ring architecture on a passive "backplane" appears to be possible using the same interface as for the more costly switch networks. This paper presents a summary of current directions, and reports the status of the work in progress

  12. Distribution

    Science.gov (United States)

    John R. Jones

    1985-01-01

    Quaking aspen is the most widely distributed native North American tree species (Little 1971, Sargent 1890). It grows in a great diversity of regions, environments, and communities (Harshberger 1911). Only one deciduous tree species in the world, the closely related Eurasian aspen (Populus tremula), has a wider range (Weigle and Frothingham 1911)....

  13. Design concepts for a virtualizable embedded MPSoC architecture enabling virtualization in embedded multi-processor systems

    CERN Document Server

    Biedermann, Alexander

    2014-01-01

    Alexander Biedermann presents a generic hardware-based virtualization approach, which may transform an array of any off-the-shelf embedded processors into a multi-processor system with high execution dynamism. Based on this approach, he highlights concepts for the design of energy aware systems, self-healing systems as well as parallelized systems. For the latter, the novel so-called Agile Processing scheme is introduced by the author, which enables a seamless transition between sequential and parallel execution schemes. The design of such virtualizable systems is further aided by introduction

  14. Design of Networks-on-Chip for Real-Time Multi-Processor Systems-on-Chip

    DEFF Research Database (Denmark)

    Sparsø, Jens

    2012-01-01

    This paper addresses the design of networks-on-chips for use in multi-processor systems-on-chips - the hardware platforms used in embedded systems. These platforms typically have to guarantee real-time properties, and as the network is a shared resource, it has to provide service guarantees...... (bandwidth and/or latency) to different communication flows. The paper reviews some past work in this field and the lessons learned, and the paper discusses ongoing research conducted as part of the project "Time-predictable Multi-Core Architecture for Embedded Systems" (T-CREST), supported by the European...

  15. Computational design of RNA parts, devices, and transcripts with kinetic folding algorithms implemented on multiprocessor clusters.

    Science.gov (United States)

    Thimmaiah, Tim; Voje, William E; Carothers, James M

    2015-01-01

    With progress toward inexpensive, large-scale DNA assembly, the demand for simulation tools that allow the rapid construction of synthetic biological devices with predictable behaviors continues to increase. By combining engineered transcript components, such as ribosome binding sites, transcriptional terminators, ligand-binding aptamers, catalytic ribozymes, and aptamer-controlled ribozymes (aptazymes), gene expression in bacteria can be fine-tuned, with many corollaries and applications in yeast and mammalian cells. The successful design of genetic constructs that implement these kinds of RNA-based control mechanisms requires modeling and analyzing kinetically determined co-transcriptional folding pathways. Transcript design methods using stochastic kinetic folding simulations to search spacer sequence libraries for motifs enabling the assembly of RNA component parts into static ribozyme- and dynamic aptazyme-regulated expression devices with quantitatively predictable functions (rREDs and aREDs, respectively) have been described (Carothers et al., Science 334:1716-1719, 2011). Here, we provide a detailed practical procedure for computational transcript design by illustrating a high throughput, multiprocessor approach for evaluating spacer sequences and generating functional rREDs. This chapter is written as a tutorial, complete with pseudo-code and step-by-step instructions for setting up a computational cluster with an Amazon, Inc. web server and performing the large numbers of kinefold-based stochastic kinetic co-transcriptional folding simulations needed to design functional rREDs and aREDs. The method described here should be broadly applicable for designing and analyzing a variety of synthetic RNA parts, devices and transcripts.

  16. In memory of Alois Apfelbeck: An Interconnection between Cayley-Eisenstein-Pólya and Landau Probability Distributions

    Directory of Open Access Journals (Sweden)

    Vladimír Vojta

    2013-01-01

    The interconnection between the Cayley-Eisenstein-Pólya distribution and the Landau distribution is studied, and possibly new transform pairs for the Laplace and Mellin transform and integral expressions for the Lambert W function have been found.

  17. Processor tradeoffs in distributed real-time systems

    Science.gov (United States)

    Krishna, C. M.; Shin, Kang G.; Bhandari, Inderpal S.

    1987-01-01

    The problem of the optimization of the design of real-time distributed systems is examined with reference to a class of computer architectures similar to the continuously reconfigurable multiprocessor flight control system structure, CM2FCS. Particular attention is given to the impact of processor replacement and the burn-in time on the probability of dynamic failure and mean cost. The solution is obtained numerically and interpreted in the context of real-time applications.

  18. Process Management and Exception Handling in Multiprocessor Operating Systems Using Object-Oriented Design Techniques. Revised Sep. 1988

    Science.gov (United States)

    Russo, Vincent; Johnston, Gary; Campbell, Roy

    1988-01-01

    The programming of the interrupt handling mechanisms, process switching primitives, scheduling mechanism, and synchronization primitives of an operating system for a multiprocessor requires both efficient code in order to support the needs of high-performance or real-time applications and careful organization to facilitate maintenance. Although many advantages have been claimed for object-oriented class hierarchical languages and their corresponding design methodologies, the application of these techniques to the design of the primitives within an operating system has not been widely demonstrated. To investigate the role of class hierarchical design in systems programming, the authors have constructed the Choices multiprocessor operating system architecture using the C++ programming language. During the implementation, it was found that many operating system design concerns can be represented advantageously using a class hierarchical approach, including: the separation of mechanism and policy; the organization of an operating system into layers, each of which represents an abstract machine; and the notions of process and exception management. In this paper, we discuss an implementation of the low-level primitives of this system and outline the strategy by which we developed our solution.

  19. Distributed plant simulator with two-phase flow analysis code using drift-flux non-equilibrium model for pressurized water reactors

    International Nuclear Information System (INIS)

    Yamamoto, Takaya; Kitamura, Masashi; Ohi, Tadashi; Akagi, Katsumi

    1999-01-01

    As advanced monitoring and controlling systems, such as the advanced main control console and the operator support system have been developed, real-time simulators' simulation accuracy must be improved and simulation limits must be extended. Therefore the authors have developed a distributed simulation system to achieve high processing performance using low cost hardware. Moreover, the authors have developed a thermal-hydraulic computer code, using drift-flux non-equilibrium model, which can realize a high precision two-phase flow analysis, which is considered to have the same prediction capability as two-fluid models, while achieving high speed and stability for real-time simulators. The distributed plant simulator for PWR plants was realized as a result. The distributed simulator consists of multi-processors connected to each other by an optical fiber network. Controlling software for synchronized scheduling and memory transfer was also developed. The simulation results of the four loop PWR simulator are compared with experimental data and real plant data; the agreement is satisfactory for a plant simulator. The simulation speed is also satisfactory being twice as fast as real-time. (author)

  20. The Effect of SiC Polytypes on the Heat Distribution Efficiency of a Phase Change Memory.

    Science.gov (United States)

    Aziz, M. S.; Mohammed, Z.; Alip, R. I.

    2018-03-01

    The amorphous-to-crystalline transition of germanium-antimony-tellurium (GST) using three silicon carbide structures as the heating element was investigated. The simulation was done using COMSOL Multiphysics 5.0 software with a separate heater structure. Three silicon carbide (SiC) polytypes are considered: 3C-SiC, 4H-SiC and 6H-SiC. These structures have different thermal conductivities. The temperature and phase transition of GST can be obtained from the simulation. The temperatures of GST when using 3C-SiC, 4H-SiC and 6H-SiC are 467 K, 466 K and 460 K, respectively. The phase transition of GST from the amorphous to the crystalline state can be determined for each of the three SiC structures in this simulation. Based on the results, the thermal conductivity of SiC can affect the temperature of GST and hence the phase change in phase change memory (PCM).

  1. Memory architecture

    NARCIS (Netherlands)

    2012-01-01

    A memory architecture is presented. The memory architecture comprises a first memory and a second memory. The first memory has at least a bank with a first width addressable by a single address. The second memory has a plurality of banks of a second width, said banks being addressable by components

  2. A pipelined architecture for real time correction of non-uniformity in infrared focal plane arrays imaging system using multiprocessors

    Science.gov (United States)

    Zou, Liang; Fu, Zhuang; Zhao, YanZheng; Yang, JunYan

    2010-07-01

    This paper proposes a pipelined circuit architecture implemented in an FPGA, a very large scale integrated circuit (VLSI), which efficiently handles the real-time non-uniformity correction (NUC) algorithm for infrared focal plane arrays (IRFPA). Dual Nios II soft-core processors and a DSP with a 64+ core together constitute this imaging system. Each processor undertakes its own task and coordinates its work with the others. The system on programmable chip (SOPC) in the FPGA works steadily at a global clock frequency of 96 MHz. An adequate timing margin lets the FPGA perform the NUC image pre-processing algorithm with ease, which provides a sound basis for the image post-processing in the DSP. The paper also presents a hardware (HW) and software (SW) co-design in the FPGA. The resulting architecture yields a multiprocessor image processing system that meets the system's performance requirements.

  3. A class Hierarchical, object-oriented approach to virtual memory management

    Science.gov (United States)

    Russo, Vincent F.; Campbell, Roy H.; Johnston, Gary M.

    1989-01-01

    The Choices family of operating systems exploits class hierarchies and object-oriented programming to facilitate the construction of customized operating systems for shared memory and networked multiprocessors. The software is being used in the Tapestry laboratory to study the performance of algorithms, mechanisms, and policies for parallel systems. Described here are the architectural design and class hierarchy of the Choices virtual memory management system. The software and hardware mechanisms and policies of a virtual memory system implement a memory hierarchy that exploits the trade-off between response times and storage capacities. In Choices, the notion of a memory hierarchy is captured by abstract classes. Concrete subclasses of those abstractions implement a virtual address space, segmentation, paging, physical memory management, secondary storage, and remote (that is, networked) storage. Captured in the notion of a memory hierarchy are classes that represent memory objects. These classes provide a storage mechanism that contains encapsulated data and have methods to read or write the memory object. Each of these classes provides specializations to represent the memory hierarchy.

  4. Multi-processor system for real-time flow estimation in medical ultrasound imaging

    DEFF Research Database (Denmark)

    Stetson, Paul F.; Jensen, Jesper Lomborg; Antonius, Peter

    1997-01-01

    the processed data. The generous bandwidth of the links makes it easy to balance the computational load among the processors. In order to manage the shared system memory and to make use of the parallel processing capabilities of the system, a real-time multitasking kernel has been developed. The kernel uses...

  5. A heterogeneous multiprocessor architecture for low-power audio signal processing applications

    DEFF Research Database (Denmark)

    Paker, Ozgun; Sparsø, Jens; Haandbæk, Niels

    2001-01-01

    . The processors are tailored for different classes of filtering algorithms (FIR, IIR, N-LMS etc.), and in a typical system the communication among processors occurs at the sampling rate only. The processors are parameterized in word-size, memory-size, etc. and can be instantiated according to the needs...... of the application at hand using a normal synthesis based ASIC design flow. To give an impression of the size of a processor we mention that one of the FIR processors in a prototype design has 16 instructions, a 32 word×16 bit program memory, a 64 word×16 bit data memory and a 25 word×16 bit coefficient memory....... Early results obtained from the design of a prototype chip containing filter processors for a hearing aid application, indicate a power consumption that is an order of magnitude better than current state of the art low-power audio DSPs implemented using full-custom techniques. This is due to: (1...

  6. The parallel processing of EGS4 code on distributed memory scalar parallel computer: Intel Paragon XP/S15-256

    Energy Technology Data Exchange (ETDEWEB)

    Takemiya, Hiroshi; Ohta, Hirofumi; Honma, Ichirou

    1996-03-01

    The parallelization of the electromagnetic cascade Monte Carlo simulation code EGS4 on a distributed memory scalar parallel computer, the Intel Paragon XP/S15-256, is described. In EGS4, the calculation time varies widely from one incident particle to another because secondary particles are generated dynamically and each particle behaves differently. The granularity of parallel processing, the parallel programming model and the algorithm for parallel random number generation are discussed, and two methods, which allocate particles dynamically or statically, are used to achieve high-speed parallel processing of the code. For three of the four problems chosen for performance evaluation, speedup factors of nearly 100 were attained with 128 processors. When both the calculation time per incident particle and its dispersion are large, dynamic particle allocation is preferable because it evens out the load across processors. When they are small, static particle allocation is preferable because it reduces the communication overhead. Moreover, it is pointed out that double precision variables must be used in the EGS4 code to obtain accurate results. Finally, the workflow of program parallelization is analyzed, and tools for program parallelization are discussed in light of the experience gained with EGS4. (author).
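
    The trade-off between the two allocation methods can be illustrated with a toy load model. The sketch below is not the EGS4 implementation: the function names and the per-particle costs are hypothetical, static allocation is assumed to be round-robin, and communication overhead (the cost that favours static allocation) is ignored entirely.

```python
import heapq
import random


def static_makespan(costs, n_workers):
    """Assign particle i to worker i % n_workers up front; return the finish time."""
    loads = [0.0] * n_workers
    for i, cost in enumerate(costs):
        loads[i % n_workers] += cost
    return max(loads)


def dynamic_makespan(costs, n_workers):
    """Hand the next particle to whichever worker becomes idle first."""
    free_at = [0.0] * n_workers  # min-heap of the times at which workers go idle
    heapq.heapify(free_at)
    for cost in costs:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + cost)
    return max(free_at)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical heavy-tailed per-particle costs (large mean and dispersion).
    costs = [rng.expovariate(1.0) ** 2 for _ in range(2000)]
    print("static :", static_makespan(costs, 16))
    print("dynamic:", dynamic_makespan(costs, 16))
```

    With identical costs the two functions return the same finish time; the gap opens only when the per-particle cost varies, which is the regime in which the abstract recommends dynamic allocation.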

  7. Distance measurements across randomly distributed nitroxide probes from the temperature dependence of the electron spin phase memory time at 240 GHz

    Science.gov (United States)

    Edwards, Devin T.; Takahashi, Susumu; Sherwin, Mark S.; Han, Songi

    2012-10-01

    At 8.5 T, the polarization of an ensemble of electron spins is essentially 100% at 2 K, and decreases to 30% at 20 K. The strong temperature dependence of the electron spin polarization between 2 and 20 K leads to the phenomenon of spin bath quenching: temporal fluctuations of the dipolar magnetic fields associated with the energy-conserving spin "flip-flop" process are quenched as the temperature of the spin bath is lowered to the point of nearly complete spin polarization. This work uses pulsed electron paramagnetic resonance (EPR) at 240 GHz to investigate the effects of spin bath quenching on the phase memory times (TM) of randomly-distributed ensembles of nitroxide molecules below 20 K at 8.5 T. For a given electron spin concentration, a characteristic, dipolar flip-flop rate (W) is extracted by fitting the temperature dependence of TM to a simple model of decoherence driven by the spin flip-flop process. In frozen solutions of 4-Amino-TEMPO, a stable nitroxide radical in a deuterated water-glass, a calibration is used to quantify average spin-spin distances as large as r̄ = 6.6 nm from the dipolar flip-flop rate. For longer distances, nuclear spin fluctuations, which are not frozen out, begin to dominate over the electron spin flip-flop processes, placing an effective ceiling on this method for nitroxide molecules. For a bulk solution with a three-dimensional distribution of nitroxide molecules at concentration n, we find W ∝ n ∝ 1/r̄³, which is consistent with magnetic dipolar spin interactions. Alternatively, we observe W ∝ n^(3/2) for nitroxides tethered to a quasi two-dimensional surface of large (Ø ∼ 200 nm), unilamellar, lipid vesicles, demonstrating that the quantification of spin bath quenching can also be used to discern the geometry of molecular assembly or organization.

  8. MEMORY MODULATION

    Science.gov (United States)

    Roozendaal, Benno; McGaugh, James L.

    2011-01-01

    Our memories are not all created equally strong: Some experiences are well remembered while others are remembered poorly, if at all. Research on memory modulation investigates the neurobiological processes and systems that contribute to such differences in the strength of our memories. Extensive evidence from both animal and human research indicates that emotionally significant experiences activate hormonal and brain systems that regulate the consolidation of newly acquired memories. These effects are integrated through noradrenergic activation of the basolateral amygdala which regulates memory consolidation via interactions with many other brain regions involved in consolidating memories of recent experiences. Modulatory systems not only influence neurobiological processes underlying the consolidation of new information, but also affect other mnemonic processes, including memory extinction, memory recall and working memory. In contrast to their enhancing effects on consolidation, adrenal stress hormones impair memory retrieval and working memory. Such effects, as with memory consolidation, require noradrenergic activation of the basolateral amygdala and interactions with other brain regions. PMID:22122145

  9. Memory Matters

    Science.gov (United States)

    ... of your complex and multitalented brain. What Is Memory? When an event happens, when you learn something, ...

  10. Design and simulation of parallel and distributed architectures for images processing

    International Nuclear Information System (INIS)

    Pirson, Alain

    1990-01-01

    The exploitation of visual information requires special computers. The diversity of operations and the computing power involved lead to structures founded on the concepts of concurrency and distributed processing. This work identifies a vision computer with an association of dedicated intelligent entities, exchanging messages according to the model of parallelism introduced by the language Occam. It puts forward an architecture of the 'enriched processor network' type. It consists of a classical multiprocessor structure where each node is provided with specific devices. These devices perform processing tasks as well as inter-node dialogues. Such an architecture benefits from the homogeneity of multiprocessor networks and the power of dedicated resources. Its implementation corresponds to that of a distributed structure, tasks being allocated to each computing element. This approach culminates in an original architecture called ATILA. This modular structure is based on a transputer network supplied with vision dedicated co-processors and powerful communication devices. (author) [fr

  11. Emotional organization of autobiographical memory.

    Science.gov (United States)

    Schulkind, Matthew D; Woldorf, Gillian M

    2005-09-01

    The emotional organization of autobiographical memory was examined by determining whether emotional cues would influence autobiographical retrieval in younger and older adults. Unfamiliar musical cues that represented orthogonal combinations of positive and negative valence and high and low arousal were used. Whereas cue valence influenced the valence of the retrieved memories, cue arousal did not affect arousal ratings. However, high-arousal cues were associated with reduced response latencies. A significant bias to report positive memories was observed, especially for the older adults, but neither the distribution of memories across the life span nor response latencies varied across memories differing in valence or arousal. These data indicate that emotional information can serve as effective cues for autobiographical memories and that autobiographical memories are organized in terms of emotional valence but not emotional arousal. Thus, current theories of autobiographical memory must be expanded to include emotional valence as a primary dimension of organization.

  12. Development of a parallel DBMS on the basis of PostgreSQL

    OpenAIRE

    Pan, C.

    2011-01-01

    The paper describes the architecture and the design of PargreSQL parallel database management system (DBMS) for distributed memory multiprocessors. PargreSQL is based upon PostgreSQL open-source DBMS and exploits partitioned parallelism.

  13. A Performance Evaluation of the Hemingway DSM System on a Network of SMPs

    National Research Council Canada - National Science Library

    Aggarwal, Anshu; Grumwald, Dirk

    1997-01-01

    .... In this paper we investigate the performance of a software distributed shared memory system, Hemingway, which is built out of such multiprocessor workstations, utilizing off-the-shelf communication networks...

  14. Learning and memory.

    Science.gov (United States)

    Brem, Anna-Katharine; Ran, Kathy; Pascual-Leone, Alvaro

    2013-01-01

    Learning and memory functions are crucial in the interaction of an individual with the environment and involve the interplay of large, distributed brain networks. Recent advances in technologies to explore neurobiological correlates of neuropsychological paradigms have increased our knowledge about human learning and memory. In this chapter we first review and define memory and learning processes from a neuropsychological perspective. Then we provide some illustrations of how noninvasive brain stimulation can play a major role in the investigation of memory functions, as it can be used to identify cause-effect relationships and chronometric properties of neural processes underlying cognitive steps. In clinical medicine, transcranial magnetic stimulation may be used as a diagnostic tool to understand memory and learning deficits in various patient populations. Furthermore, noninvasive brain stimulation is also being applied to enhance cognitive functions, offering exciting translational therapeutic opportunities in neurology and psychiatry. © 2013 Elsevier B.V. All rights reserved.

  15. High-bandwidth memory interface

    CERN Document Server

    Kim, Chulwoo; Song, Junyoung

    2014-01-01

    This book provides an overview of recent advances in memory interface design at both the architecture and circuit levels. Coverage includes signal integrity and testing, TSV interface, high-speed serial interface including equalization, ODT, pre-emphasis, wide I/O interface including crosstalk, skew cancellation, and clock generation and distribution. Trends for further bandwidth enhancement are also covered.   • Enables readers with minimal background in memory design to understand the basics of high-bandwidth memory interface design; • Presents state-of-the-art techniques for memory interface design; • Covers memory interface design at both the circuit level and system architecture level.

  16. Behavior characterization of the shared last-level cache in a chip multiprocessor

    OpenAIRE

    Benedicte Illescas, Pedro

    2014-01-01

    This project analyzes different aspects of the memory hierarchy and seeks to understand their influence on overall system performance. The aspects analyzed are cache replacement algorithms, memory mapping schemes, and memory page policies.

  17. An innovative approach to achieve re-centering and ductility of cement mortar beams through randomly distributed pseudo-elastic shape memory alloy fibers

    Science.gov (United States)

    Shajil, N.; Srinivasan, S. M.; Santhanam, M.

    2012-04-01

    Fibers can play a major role in post cracking behavior of concrete members, because of their ability to bridge cracks and distribute the stress across the crack. Addition of steel fibers in mortar and concrete can improve toughness of the structural member and impart significant energy dissipation through slow pull out. However, steel fibers undergo plastic deformation at low strain levels, and cannot regain their shape upon unloading. This is a major disadvantage in strong cyclic loading conditions, such as those caused by earthquakes, where self-centering ability of the fibers is a desired characteristic in addition to ductility of the reinforced cement concrete. Fibers made from an alternative material such as shape memory alloy (SMA) could offer a scope for re-centering, thus improving performance especially after a severe loading has occurred. In this study, the load-deformation characteristics of SMA fiber reinforced cement mortar beams under cyclic loading conditions were investigated to assess the re-centering performance. This study involved experiments on prismatic members, and related analysis for the assessment and prediction of re-centering. The performances of NiTi fiber reinforced mortars are compared with mortars with same volume fraction of steel fibers. Since re-entrant corners and beam columns joints are prone to failure during a strong ground motion, a study was conducted to determine the behavior of these reinforced with NiTi fiber. Comparison is made with the results of steel fiber reinforced cases. NiTi fibers showed significantly improved re-centering and energy dissipation characteristics compared to the steel fibers.

  18. Ring-array processor distribution topology for optical interconnects

    Science.gov (United States)

    Li, Yao; Ha, Berlin; Wang, Ting; Wang, Sunyu; Katz, A.; Lu, X. J.; Kanterakis, E.

    1992-01-01

    The existing linear and rectangular processor distribution topologies for optical interconnects, although promising in many respects, cannot solve problems such as clock skews, the lack of supporting elements for efficient optical implementation, etc. The use of a ring-array processor distribution topology, however, can overcome these problems. Here, a study of the ring-array topology is conducted with an aim of implementing various fast clock rate, high-performance, compact optical networks for digital electronic multiprocessor computers. Practical design issues are addressed. Some proof-of-principle experimental results are included.

  19. The memory of volatility

    Directory of Open Access Journals (Sweden)

    Kai R. Wenger

    2018-03-01

    The focus of the volatility literature on forecasting and the predominance of the conceptually simpler HAR model over long memory stochastic volatility models has led to the fact that the actual degree of memory estimates has rarely been considered. Estimates in the literature range roughly between 0.4 and 0.6 - that is from the higher stationary to the lower non-stationary region. This difference, however, has important practical implications - such as the existence or nonexistence of the fourth moment of the return distribution. Inference on the memory order is complicated by the presence of measurement error in realized volatility and the potential of spurious long memory. In this paper we provide a comprehensive analysis of the memory in variances of international stock indices and exchange rates. On the one hand, we find that the variance of exchange rates is subject to spurious long memory and the true memory parameter is in the higher stationary range. Stock index variances, on the other hand, are free of low frequency contaminations and the memory is in the lower non-stationary range. These results are obtained using state of the art local Whittle methods that allow consistent estimation in presence of perturbations or low frequency contaminations.

  20. Formation of the distributed NiSiGe nanocrystals nonvolatile memory formed by rapidly annealing in N2 and O2 ambient

    International Nuclear Information System (INIS)

    Hu, Chih-Wei; Chang, Ting-Chang; Tu, Chun-Hao; Chiang, Cheng-Neng; Lin, Chao-Cheng; Chen, Min-Chen; Chang, Chun-Yen; Sze, Simon M.; Tseng, Tseung-Yuen

    2010-01-01

    In this work, the electrical characteristics of Ge-incorporated nickel silicide (NiSiGe) nanocrystal memory devices formed by rapid thermal annealing in N2 and O2 ambients are studied. The trapping layer was deposited by co-sputtering NiSi2 and Ge simultaneously. Transmission electron microscopy results indicate that NiSiGe nanocrystals clearly formed in both samples. The memory devices show clear charge-storage ability under capacitance-voltage measurement. However, the NiSiGe nanocrystal device annealed in N2 ambient has a smaller memory window and better retention characteristics than the one annealed in O2 ambient. Related material analyses were then used to confirm that the oxidized Ge elements affect the charge-storage sites and the electrical performance of the nanocrystal (NC) memory.

  1. Distribution and levels of [125I]IGF-I, [125I]IGF-II and [125I]insulin receptor binding sites in the hippocampus of aged memory-unimpaired and -impaired rats

    International Nuclear Information System (INIS)

    Quirion, R.; Rowe, W.; Kar, S.; Dore, S.

    1997-01-01

    The insulin-like growth factors (IGF-I and IGF-II) and insulin are localized within distinct brain regions and their respective functions are mediated by specific membrane receptors. High densities of binding sites for these growth factors are discretely and differentially distributed throughout the brain, with prominent levels localized to the hippocampal formation. IGFs and insulin, in addition to their growth promoting actions, are considered to play important roles in the development and maintenance of normal cell functions throughout life. We compared the anatomical distribution and levels of IGF and insulin receptors in young (five month) and aged (25 month) memory-impaired and memory-unimpaired male Long-Evans rats as determined in the Morris water maze task in order to determine if alterations in IGF and insulin activity may be related to the emergence of cognitive deficits in the aged memory-impaired rat. In the hippocampus, [125I]IGF-I receptors are concentrated primarily in the dentate gyrus (DG) and the CA3 sub-field while high amounts of [125I]IGF-II binding sites are localized to the pyramidal cell layer, and the granular cell layer of the DG. [125I]insulin binding sites are mostly found in the molecular layer of the DG and the CA1 sub-field. No significant differences were found in [125I]IGF-I, [125I]IGF-II or [125I]insulin binding levels in any regions or laminae of the hippocampus of young vs aged rats, and deficits in cognitive performance did not relate to altered levels of these receptors in aged memory-impaired vs aged memory-unimpaired rats. Other regions, including various cortical areas, were also examined and failed to reveal any significant differences between the three groups studied. It thus appears that IGF-I, IGF-II and insulin receptor sites are not markedly altered during the normal ageing process in the Long-Evans rat, in spite of significant learning deficits in a sub-group (memory-impaired) of aged animals. Hence

  2. A parallel row-based algorithm with error control for standard-cell replacement on a hypercube multiprocessor

    Science.gov (United States)

    Sargent, Jeff Scott

    1988-01-01

    A new row-based parallel algorithm for standard-cell placement targeted for execution on a hypercube multiprocessor is presented. Key features of this implementation include a dynamic simulated-annealing schedule, row-partitioning of the VLSI chip image, and two novel approaches to controlling error in parallel cell-placement algorithms: Heuristic Cell-Coloring and Adaptive (Parallel Move) Sequence Control. Heuristic Cell-Coloring identifies sets of noninteracting cells that can be moved repeatedly, and in parallel, with no buildup of error in the placement cost. Adaptive Sequence Control allows multiple parallel cell moves to take place between global cell-position updates. This feedback mechanism is based on an error bound derived analytically from the traditional annealing move-acceptance profile. Placement results are presented for real industry circuits, and the performance of an implementation on the Intel iPSC/2 Hypercube is summarized. The algorithm runs 5 to 16 times faster than a previous program developed for the Hypercube, while producing placements of equivalent quality. An integrated place and route program for the Intel iPSC/2 Hypercube is currently being developed.
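
    For readers unfamiliar with the annealing move-acceptance profile mentioned above, the sketch below shows the standard Metropolis acceptance rule. It is an assumption that this is the profile the author means; the function name and the optional `rng` parameter are hypothetical, and the error bound and the parallel sequence control themselves are not reproduced here.

```python
import math
import random


def accept_move(delta_cost, temperature, rng=random):
    """Metropolis rule: always accept improvements, sometimes accept uphill moves.

    delta_cost  -- change in placement cost caused by the proposed cell move
    temperature -- current annealing temperature (non-negative)
    rng         -- any object with a random() method returning a float in [0, 1)
    """
    if delta_cost <= 0:
        return True
    if temperature <= 0:
        return False  # at zero temperature only improvements are accepted
    return rng.random() < math.exp(-delta_cost / temperature)
```

    Because the acceptance probability depends on the cost change, a stale cost estimate caused by simultaneous moves on other processors perturbs the acceptance decision, which is why bounding that error matters in a parallel setting.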

  3. ePRO-MP: A Tool for Profiling and Optimizing Energy and Performance of Mobile Multiprocessor Applications

    Directory of Open Access Journals (Sweden)

    Wonil Choi

    2009-01-01

    For mobile multiprocessor applications, achieving high performance with low energy consumption is a challenging task. In order to help programmers to meet these design requirements, system development tools play an important role. In this paper, we describe one such development tool, ePRO-MP, which profiles and optimizes both performance and energy consumption of multi-threaded applications running on top of Linux for ARM11 MPCore-based embedded systems. One of the key features of ePRO-MP is that it can accurately estimate the energy consumption of multi-threaded applications without requiring power measurement equipment, using a regression-based energy model. We also describe another key benefit of ePRO-MP, an automatic optimization function, using two example problems. Using the automatic optimization function, ePRO-MP can achieve high performance and low power consumption without programmer intervention. Our experimental results show that ePRO-MP can improve the performance and energy consumption by 6.1% and 4.1%, respectively, over a baseline version for the co-running applications optimization example. For the producer-consumer application optimization example, ePRO-MP improves the performance and energy consumption by 60.5% and 43.3%, respectively, over a baseline version.

  4. HEP - A semaphore-synchronized multiprocessor with central control. [Heterogeneous Element Processor

    Science.gov (United States)

    Gilliland, M. C.; Smith, B. J.; Calvert, W.

    1976-01-01

    The paper describes the design concept of the Heterogeneous Element Processor (HEP), a system tailored to the special needs of scientific simulation. In order to achieve high-speed computation required by simulation, HEP features a hierarchy of processes executing in parallel on a number of processors, with synchronization being largely accomplished by hardware. A full-empty-reserve scheme of synchronization is realized by zero-one-valued hardware semaphores. A typical system has, besides the control computer and the scheduler, an algebraic module, a memory module, a first-in first-out (FIFO) module, an integrator module, and an I/O module. The architecture of the scheduler and the algebraic module is examined in detail.
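
    The full/empty part of the synchronization scheme can be sketched in software. The class below is only an illustration: HEP implements this in hardware, the "reserve" state is omitted, and the class and method names are hypothetical.

```python
import threading


class FullEmptyCell:
    """A one-word cell whose full/empty flag gates reads and writes."""

    def __init__(self):
        self._cond = threading.Condition()
        self._full = False
        self._value = None

    def write(self, value):
        """Block until the cell is empty, then store the value and mark it full."""
        with self._cond:
            while self._full:
                self._cond.wait()
            self._value = value
            self._full = True
            self._cond.notify_all()

    def read(self):
        """Block until the cell is full, then take the value and mark it empty."""
        with self._cond:
            while not self._full:
                self._cond.wait()
            value = self._value
            self._value = None
            self._full = False
            self._cond.notify_all()
            return value
```

    A producer calling `write` and a consumer calling `read` on the same cell alternate strictly, so values are handed over one at a time and in order without any further locking by the caller.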

  5. One-way shared memory

    DEFF Research Database (Denmark)

    Schoeberl, Martin

    2018-01-01

    Standard multicore processors use the shared main memory via the on-chip caches for communication between cores. However, this form of communication has two limitations: (1) it is hardly time-predictable and therefore not a good solution for real-time systems and (2) this single shared memory...... is a bottleneck in the system. This paper presents a communication architecture for time-predictable multicore systems where core-local memories are distributed on the chip. A network-on-chip constantly copies data from a sender core-local memory to a receiver core-local memory. As this copying is performed...... in one direction we call this architecture a one-way shared memory. With the use of time-division multiplexing for the memory accesses and the network-on-chip routers we achieve a time-predictable solution where the communication latency and bandwidth can be bounded. An example architecture for a 3...

  6. Global aspects of radiation memory

    International Nuclear Information System (INIS)

    Winicour, J

    2014-01-01

    Gravitational radiation has a memory effect represented by a net change in the relative positions of test particles. Both the linear and nonlinear sources proposed for this radiation memory are of the ‘electric’ type, or E mode, as characterized by the even parity of the polarization pattern. Although ‘magnetic’ type, or B mode, radiation memory is mathematically possible, no physically realistic source has been identified. There is an electromagnetic counterpart to radiation memory in which the velocity of charged test particles obtains a net ‘kick’. Again, the physically realistic sources of electromagnetic radiation memory that have been identified are of the electric type. In this paper, a global null cone description of the electromagnetic field is applied to establish the non-existence of B-mode radiation memory and the non-existence of E-mode radiation memory due to a bound charge distribution. (paper)

  7. Cognitive memory.

    Science.gov (United States)

    Widrow, Bernard; Aragon, Juan Carlos

    2013-05-01

    Regarding the workings of the human mind, memory and pattern recognition seem to be intertwined. You generally do not have one without the other. Taking inspiration from life experience, a new form of computer memory has been devised. Certain conjectures about human memory are keys to the central idea. The design of a practical and useful "cognitive" memory system is contemplated, a memory system that may also serve as a model for many aspects of human memory. The new memory does not function like a computer memory where specific data is stored in specific numbered registers and retrieval is done by reading the contents of the specified memory register, or done by matching key words as with a document search. Incoming sensory data would be stored at the next available empty memory location, and indeed could be stored redundantly at several empty locations. The stored sensory data would neither have key words nor would it be located in known or specified memory locations. Sensory inputs concerning a single object or subject are stored together as patterns in a single "file folder" or "memory folder". When the contents of the folder are retrieved, sights, sounds, tactile feel, smell, etc., are obtained all at the same time. Retrieval would be initiated by a query or a prompt signal from a current set of sensory inputs or patterns. A search through the memory would be made to locate stored data that correlates with or relates to the prompt input. The search would be done by a retrieval system whose first stage makes use of autoassociative artificial neural networks and whose second stage relies on exhaustive search. Applications of cognitive memory systems have been made to visual aircraft identification, aircraft navigation, and human facial recognition. Concerning human memory, reasons are given why it is unlikely that long-term memory is stored in the synapses of the brain's neural networks. Reasons are given suggesting that long-term memory is stored in DNA or RNA

  8. Asynchronous and corrected-asynchronous numerical solutions of parabolic PDEs on MIMD multiprocessors

    Science.gov (United States)

    Amitai, Dganit; Averbuch, Amir; Itzikowitz, Samuel; Turkel, Eli

    1991-01-01

    A major problem in achieving significant speed-up on parallel machines is the overhead involved in synchronizing the concurrent processes. Removing the synchronization constraint has the potential of speeding up the computation. The authors present asynchronous (AS) and corrected-asynchronous (CA) finite difference schemes for the multi-dimensional heat equation. Although the discussion concentrates on the Euler scheme for the solution of the heat equation, it has the potential for being extended to other schemes and other parabolic partial differential equations (PDEs). These schemes are analyzed and implemented on the shared memory multi-user Sequent Balance machine. Numerical results for one- and two-dimensional problems are presented. It is shown experimentally that the synchronization penalty can be about 50 percent of run time: in most cases, the asynchronous scheme runs twice as fast as the parallel synchronous scheme. In general, the efficiency of the parallel schemes increases with processor load, with the time level, and with the problem dimension. The efficiency of the AS may reach 90 percent and over, but it provides accurate results only for steady-state values. The CA, on the other hand, is less efficient, but provides more accurate results for intermediate (non steady-state) values.
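
    As a point of reference, the baseline that the AS and CA schemes relax is the ordinary explicit Euler finite difference step, sketched below for the one-dimensional heat equation u_t = u_xx. This is the plain serial scheme, not the authors' asynchronous variants; the function names, the fixed boundary values and the parameter r = dt/dx² are assumptions made for illustration.

```python
def euler_heat_step(u, r):
    """One explicit Euler step of u_t = u_xx with fixed (Dirichlet) end values.

    u -- list of grid values at the current time level
    r -- dt / dx**2; the explicit scheme is stable only for r <= 0.5
    """
    new = list(u)
    for i in range(1, len(u) - 1):
        # Every interior point reads its neighbours from the same (old) time level.
        new[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new


def solve(u0, r, steps):
    """Advance the initial profile u0 by the given number of time steps."""
    u = list(u0)
    for _ in range(steps):
        u = euler_heat_step(u, r)
    return u
```

    In a synchronous parallel version each processor owns a block of grid points and must wait at every step until its neighbours have finished the previous time level; an asynchronous scheme drops that wait and uses whatever neighbour values are currently available.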

  9. A Context-Dependent Role for IL-21 in Modulating the Differentiation, Distribution, and Abundance of Effector and Memory CD8 T Cell Subsets.

    Science.gov (United States)

    Tian, Yuan; Cox, Maureen A; Kahan, Shannon M; Ingram, Jennifer T; Bakshi, Rakesh K; Zajac, Allan J

    2016-03-01

    The activation of naive CD8 T cells typically results in the formation of effector cells (TE) as well as phenotypically distinct memory cells that are retained over time. Memory CD8 T cells can be further subdivided into central memory, effector memory (TEM), and tissue-resident memory (TRM) subsets, which cooperate to confer immunological protection. Using mixed bone marrow chimeras and adoptive transfer studies in which CD8 T cells either do or do not express IL-21R, we discovered that under homeostatic or lymphopenic conditions IL-21 acts directly on CD8 T cells to favor the accumulation of TE/TEM populations. The inability to perceive IL-21 signals under competitive conditions also resulted in lower levels of TRM phenotype cells and reduced expression of granzyme B in the small intestine. IL-21 differentially promoted the expression of the chemokine receptor CX3CR1 and the integrin α4β7 on CD8 T cells primed in vitro and on circulating CD8 T cells in the mixed bone marrow chimeras. The requirement for IL-21 to establish CD8 TE/TEM and TRM subsets was overcome by acute lymphocytic choriomeningitis virus infection; nevertheless, memory virus-specific CD8 T cells remained dependent on IL-21 for optimal accumulation in lymphopenic environments. Overall, this study reveals a context-dependent role for IL-21 in sustaining effector phenotype CD8 T cells and influencing their migratory properties, accumulation, and functions. Copyright © 2016 by The American Association of Immunologists, Inc.

  10. Memory Modulation

    NARCIS (Netherlands)

    Roozendaal, Benno; McGaugh, James L.

    2011-01-01

    Our memories are not all created equally strong: Some experiences are well remembered while others are remembered poorly, if at all. Research on memory modulation investigates the neurobiological processes and systems that contribute to such differences in the strength of our memories. Extensive

  11. Smoothing type buffer memory device

    International Nuclear Information System (INIS)

    Podorozhnyj, D.M.; Yashin, I.V.

    1990-01-01

    The layout of the micropower 4-bit smoothing type buffer memory device allowing one to record without counting the sequence of input randomly distributed pulses in multi-channel devices with serial poll is given. The power spent by a memory cell for one binary digit recording is not greater than 0.15 mW; the device dead time is 10 μs

  12. What happens when we compare the lifespan distributions of life script events and autobiographical memories of life story events? A cross-cultural study

    DEFF Research Database (Denmark)

    Zaragoza Scherman, Alejandra; Salgado, Sinué; Shao, Zhifang

    Cultural Life Script Theory (Berntsen and Rubin, 2004), provides a cultural explanation of the reminiscence bump: adults older than 40 years remember a significantly greater amount of life events happening between 15 - 30 years of age (Rubin, Rahal, & Poon, 1998), compared to other lifetime periods...... and memories of life story events, we can determine the degree to which the cultural life script serves as a recall template for autobiographical memories, especially of positive life events from adolescence and early adulthood, also known as the reminiscence bump period....

  13. Memory Dysfunction

    Science.gov (United States)

    Matthews, Brandy R.

    2015-01-01

    Purpose of Review: This article highlights the dissociable human memory systems of episodic, semantic, and procedural memory in the context of neurologic illnesses known to adversely affect specific neuroanatomic structures relevant to each memory system. Recent Findings: Advances in functional neuroimaging and refinement of neuropsychological and bedside assessment tools continue to support a model of multiple memory systems that are distinct yet complementary and to support the potential for one system to be engaged as a compensatory strategy when a counterpart system fails. Summary: Episodic memory, the ability to recall personal episodes, is the subtype of memory most often perceived as dysfunctional by patients and informants. Medial temporal lobe structures, especially the hippocampal formation and associated cortical and subcortical structures, are most often associated with episodic memory loss. Episodic memory dysfunction may present acutely, as in concussion; transiently, as in transient global amnesia (TGA); subacutely, as in thiamine deficiency; or chronically, as in Alzheimer disease. Semantic memory refers to acquired knowledge about the world. Anterior and inferior temporal lobe structures are most often associated with semantic memory loss. The semantic variant of primary progressive aphasia (svPPA) is the paradigmatic disorder resulting in predominant semantic memory dysfunction. Working memory, associated with frontal lobe function, is the active maintenance of information in the mind that can be potentially manipulated to complete goal-directed tasks. Procedural memory, the ability to learn skills that become automatic, involves the basal ganglia, cerebellum, and supplementary motor cortex. Parkinson disease and related disorders result in procedural memory deficits. Most memory concerns warrant bedside cognitive or neuropsychological evaluation and neuroimaging to assess for specific neuropathologies and guide treatment. PMID:26039844

  14. Optimizing survivability of multi-state systems with multi-level protection by multi-processor genetic algorithm

    International Nuclear Information System (INIS)

    Levitin, Gregory; Dai Yuanshun; Xie Min; Leng Poh, Kim

    2003-01-01

    In this paper we consider vulnerable systems which can have different states corresponding to different combinations of available elements composing the system. Each state can be characterized by a performance rate, which is the quantitative measure of a system's ability to perform its task. Both the impact of external factors (stress) and internal causes (failures) affect system survivability, which is defined as the probability of meeting a given demand. In order to increase the survivability of the system, multi-level protection is applied to its subsystems. This means that a subsystem and its inner level of protection are in their turn protected by the protection of an outer level. This double-protected subsystem has its own outer protection, and so forth. In such systems, the protected subsystems can be destroyed only if all of the levels of their protection are destroyed. Each level of protection can be destroyed only if all of the outer levels of protection are destroyed. We formulate the problem of finding the structure of a series-parallel multi-state system (including choice of system elements, choice of structure of multi-level protection and choice of protection methods) in order to achieve a desired level of system survivability at minimal cost. An algorithm based on the universal generating function method is used to determine the system survivability. A multi-processor version of a genetic algorithm is used as the optimization tool to solve the structure optimization problem. An application example is presented to illustrate the procedure presented in this paper

  15. Changing concepts of working memory

    Science.gov (United States)

    Ma, Wei Ji; Husain, Masud; Bays, Paul M

    2014-01-01

    Working memory is widely considered to be limited in capacity, holding a fixed, small number of items, such as Miller's ‘magical number’ seven or Cowan's four. It has recently been proposed that working memory might better be conceptualized as a limited resource that is distributed flexibly among all items to be maintained in memory. According to this view, the quality rather than the quantity of working memory representations determines performance. Here we consider behavioral and emerging neural evidence for this proposal. PMID:24569831

  16. Declarative memory.

    Science.gov (United States)

    Riedel, Wim J; Blokland, Arjan

    2015-01-01

    Declarative memory consists of memory for events (episodic memory) and facts (semantic memory). Methods to test declarative memory are key in investigating effects of potential cognition-enhancing substances--medicinal drugs or nutrients. A number of cognitive performance tests assessing declarative episodic memory tapping verbal learning, logical memory, pattern recognition memory, and paired associates learning are described. These tests have been used as outcome variables in 34 studies in humans that have been described in the literature in the past 10 years. The use of episodic tests in animal research is also discussed in relation to the drug effects in these tasks. The results show that nutritional supplementation of polyunsaturated fatty acids has been investigated most abundantly and, in a number of cases, but not all, shows indications of positive effects on declarative memory, more so in elderly than in young subjects. Studies investigating effects of registered anti-Alzheimer drugs, cholinesterase inhibitors in mild cognitive impairment, show positive and negative effects on declarative memory. Studies mainly carried out in healthy volunteers investigating the effects of acute dopamine stimulation indicate enhanced memory consolidation as manifested specifically by better delayed recall, especially at time points long after learning and more so when the drug is administered after learning and if word lists are longer. The animal studies reveal a different picture with respect to the effects of different drugs on memory performance. This suggests that at least for episodic memory tasks, the translational value is rather poor. For the human studies, detailed parameters of the compositions of word lists for declarative memory tests are discussed and it is concluded that tailored adaptations of tests to fit the hypothesis under study, rather than "off-the-shelf" use of existing tests, are recommended.

  17. Parallel diffusion calculation for the PHAETON on-line multiprocessor computer

    International Nuclear Information System (INIS)

    Collart, J.M.; Fedon-Magnaud, C.; Lautard, J.J.

    1987-04-01

    The aim of the PHAETON project is the design of an on-line computer to increase the immediate knowledge of the main operating and safety parameters in power plants. A significant stage is the computation of the three-dimensional flux distribution. For cost and safety reasons, a computer based on a parallel microprocessor architecture has been studied. This paper presents a first approach to parallelized three-dimensional diffusion calculation. Computing software has been written and built into a four-processor demonstrator. We present the work in progress on the final equipment. 8 refs

  18. Quantum memory

    Science.gov (United States)

    Le Gouët, Jean-Louis; Moiseev, Sergey

    2012-06-01

    Interaction of quantum radiation with multi-particle ensembles has sparked off intense research efforts during the past decade. Emblematic of this field is the quantum memory scheme, where a quantum state of light is mapped onto an ensemble of atoms and then recovered in its original shape. While opening new access to the basics of light-atom interaction, quantum memory also appears as a key element for information processing applications, such as linear optics quantum computation and long-distance quantum communication via quantum repeaters. Not surprisingly, it is far from trivial to practically recover a stored quantum state of light and, although impressive progress has already been accomplished, researchers are still struggling to reach this ambitious objective. This special issue provides an account of the state-of-the-art in a fast-moving research area that makes physicists, engineers and chemists work together at the forefront of their discipline, involving quantum fields and atoms in different media, magnetic resonance techniques and material science. Various strategies have been considered to store and retrieve quantum light. The explored designs belong to three main—while still overlapping—classes. In architectures derived from photon echo, information is mapped over the spectral components of inhomogeneously broadened absorption bands, such as those encountered in rare earth ion doped crystals and atomic gases in external gradient magnetic field. Protocols based on electromagnetic induced transparency also rely on resonant excitation and are ideally suited to the homogeneous absorption lines offered by laser cooled atomic clouds or ion Coulomb crystals. Finally off-resonance approaches are illustrated by Faraday and Raman processes. Coupling with an optical cavity may enhance the storage process, even for negligibly small atom number. Multiple scattering is also proposed as a way to enlarge the quantum interaction distance of light with matter. The

  19. A universal multiprocessor system for the fast acquisition and processing of positron camera data

    International Nuclear Information System (INIS)

    Deluigi, B.

    1982-01-01

    In this study the main components of a suitable detection system were worked out and their properties examined. To measure the three-dimensional distribution of radiopharmaceuticals labeled with positron emitters in animal experiments, a positron camera was first constructed. The annihilation quanta are detected by two opposed position-sensitive gamma detectors operated in coincidence. Two commercial camera heads working on the Anger principle were rebuilt for this purpose and connected through a special interface to form the positron camera. This arrangement achieved a spatial resolution of 0.8 cm FWHM for a line source in the symmetry plane and a coincidence resolving time 2T of 16 ns FW0.1M. For three-dimensional image reconstruction from the positron camera data, a maximum-likelihood procedure was developed and tested with a Monte Carlo procedure. With this application in mind, a highly flexible multi-microprocessor system was developed. High computing capacity is achieved by distributing several subproblems to different processors and processing them in parallel. The architecture was designed so that the system has high fault tolerance and its computing capacity can be extended without a fundamental limit. (orig./HSI) [de

  20. Development of Ada language control software for the NASA power management and distribution test bed

    Science.gov (United States)

    Wright, Ted; Mackin, Michael; Gantose, Dave

    1989-01-01

    The Ada language software developed to control the NASA Lewis Research Center's Power Management and Distribution testbed is described. The testbed is a reduced-scale prototype of the electric power system to be used on space station Freedom. It is designed to develop and test hardware and software for a 20-kHz power distribution system. The distributed, multiprocessor, testbed control system has an easy-to-use operator interface with an understandable English-text format. A simple interface for algorithm writers that uses the same commands as the operator interface is provided, encouraging interactive exploration of the system.

  1. Memory design

    DEFF Research Database (Denmark)

    Tanderup, Sisse

    by cultural forms, often specifically by the concept of memory in philosophy, sociology and psychology, while Danish design traditionally has been focusing on form and function with frequent references to the forms of nature. Alessi's motivation for investigating the concept of memory is that it adds......Mind and Matter - Nordik 2009 Conference for Art Historians Design Matters Contributed Memory design BACKGROUND My research concerns the use of memory categories in the designs by the companies Alessi and Georg Jensen. When Alessi's designers create their products, they are usually inspired...... a cultural dimension to the design objects, enabling the objects to make an identity-forming impact. Whether or not the concept of memory plays a significant role in Danish design has not yet been elucidated fully. TERMINOLOGY The concept of "memory design" refers to the idea that design carries...

  2. Disputed Memory

    DEFF Research Database (Denmark)

    , individual and political discourse and electronic social media. Analyzing memory disputes in various local, national and transnational contexts, the chapters demonstrate the political power and social impact of painful and disputed memories. The book brings new insights into current memory disputes...... in Central, Eastern and Southeastern Europe. It contributes to the understanding of processes of memory transmission and negotiation across borders and cultures in Europe, emphasizing the interconnectedness of memory with emotions, mediation and politics....... century in the region. Written by an international group of scholars from a diversity of disciplines, the chapters approach memory disputes in methodologically innovative ways, studying representations and negotiations of disputed pasts in different media, including monuments, museum exhibitions...

  3. Main Memory

    OpenAIRE

    Boncz, Peter; Liu, Lei; Özsu, M.

    2008-01-01

    Primary storage, presently known as main memory, is the largest memory directly accessible to the CPU in the prevalent Von Neumann model and stores both data and instructions (program code). The CPU continuously reads instructions stored there and executes them. It is also called Random Access Memory (RAM), to indicate that load/store instructions can access data at any location at the same cost, is usually implemented using DRAM chips, which are connected to the CPU and other per...

  4. Efficiency Analysis of the Parallel Implementation of the SIMPLE Algorithm on Multiprocessor Computers

    Science.gov (United States)

    Lashkin, S. V.; Kozelkov, A. S.; Yalozo, A. V.; Gerasimov, V. Yu.; Zelensky, D. K.

    2017-12-01

    This paper describes the details of the parallel implementation of the SIMPLE algorithm for numerical solution of the Navier-Stokes system of equations on arbitrary unstructured grids. The iteration schemes for the serial and parallel versions of the SIMPLE algorithm are implemented. In the description of the parallel implementation, special attention is paid to computational data exchange among processors under the condition of the grid model decomposition using fictitious cells. We discuss the specific features for the storage of distributed matrices and implementation of vector-matrix operations in parallel mode. It is shown that the proposed way of matrix storage reduces the number of interprocessor exchanges. A series of numerical experiments illustrates the effect of the multigrid SLAE solver tuning on the general efficiency of the algorithm; the tuning involves the types of the cycles used (V, W, and F), the number of iterations of a smoothing operator, and the number of cells for coarsening. Two ways (direct and indirect) of efficiency evaluation for parallelization of the numerical algorithm are demonstrated. The paper presents the results of solving some internal and external flow problems with the evaluation of parallelization efficiency by two algorithms. It is shown that the proposed parallel implementation enables efficient computations for the problems on a thousand processors. Based on the results obtained, some general recommendations are made for the optimal tuning of the multigrid solver, as well as for selecting the optimal number of cells per processor.
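
    The abstract does not spell out its direct and indirect evaluation methods. As background only, the usual textbook definitions of speedup and parallel efficiency can be sketched as follows; the function names and the timings in the example are hypothetical and are not taken from the paper.

```python
def speedup(t_serial, t_parallel):
    """Speedup S = T_1 / T_p: serial run time over parallel run time."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, processors):
    """Parallel efficiency E = S / p, a value between 0 and 1 in the ideal range."""
    return speedup(t_serial, t_parallel) / processors

# Hypothetical timings: 100 s serially, 12.5 s on 10 processors.
s = speedup(100.0, 12.5)
e = efficiency(100.0, 12.5, 10)
```

    With these made-up numbers the run is 8 times faster on 10 processors, so each processor is used at 80 percent of the ideal.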

  5. A Self Consistent Multiprocessor Space Charge Algorithm that is Almost Embarrassingly Parallel

    International Nuclear Information System (INIS)

    Nissen, Edward; Erdelyi, B.; Manikonda, S.L.

    2012-01-01

    We present a space charge code that is self consistent, massively parallelizable, and requires very little communication between computer nodes, making the calculation almost embarrassingly parallel. This method is implemented in the code COSY Infinity, where the differential algebras used in this code are important to the algorithm's proper functioning. The method works by calculating the self consistent space charge distribution using the statistical moments of the test particles, and converting them into polynomial series coefficients. These coefficients are combined with differential algebraic integrals to form the potential and electric fields. The result is a map which contains the effects of space charge. This method allows for massive parallelization since its statistics-based solver doesn't require any binning of particles, and only requires a vector containing the partial sums of the statistical moments for the different nodes to be passed. All other calculations are done independently. The resulting maps can be used to analyze the system using normal form analysis, as well as advance particles in numbers and at speeds that were previously impossible.
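
    The communication pattern described above, in which each node passes only a vector of partial moment sums, can be sketched in a few lines. This is a one-dimensional illustration with raw moments and hypothetical function names; it is not the COSY Infinity implementation, and the conversion to polynomial coefficients and differential algebraic integrals is not shown.

```python
def partial_moment_sums(xs, max_order):
    """Per-node work: sums of x**k for k = 0..max_order over local particles."""
    return [sum(x ** k for x in xs) for k in range(max_order + 1)]

def combine(partials):
    """The only communication step: add the per-node vectors element-wise."""
    return [sum(column) for column in zip(*partials)]

def raw_moments(totals):
    """Turn global sums into raw moments <x**k>, k >= 1 (totals[0] is the count)."""
    count = totals[0]
    return [s / count for s in totals[1:]]

# Hypothetical split of three test particles over two nodes.
node_a = partial_moment_sums([1.0, 2.0], max_order=2)
node_b = partial_moment_sums([3.0], max_order=2)
totals = combine([node_a, node_b])
moments = raw_moments(totals)
```

    Because the sums are additive, the combined vector is the same as if one node had processed all particles, and its length depends only on the moment order, not on the particle count.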

  6. Collaging Memories

    Science.gov (United States)

    Wallach, Michele

    2011-01-01

    Even middle school students can have memories of their childhoods, of an earlier time. The art of Romare Bearden and the writings of Paul Auster can be used to introduce ideas about time and memory to students and inspire works of their own. Bearden is an exceptional role model for young artists, not only because of his astounding art, but also…

  7. Memory Magic.

    Science.gov (United States)

    Hartman, Thomas G.; Nowak, Norman

    This paper outlines several "tricks" that aid students in improving their memories. The distinctions between operational and figural thought processes are noted. Operational memory is described as something that allows adults to make generalizations about numbers and the rules by which they may be combined, thus leading to easier memorization.…

  8. Memory loss

    Science.gov (United States)

    ... barbiturates or ( hypnotics ) ECT (electroconvulsive therapy) (most often short-term memory loss) Epilepsy that is not well controlled Illness that ... appointment. Medical history questions may include: Type of memory loss, such as short-term or long-term Time pattern, such as how ...

  9. Episodic Memories

    Science.gov (United States)

    Conway, Martin A.

    2009-01-01

    An account of episodic memories is developed that focuses on the types of knowledge they represent, their properties, and the functions they might serve. It is proposed that episodic memories consist of "episodic elements," summary records of experience often in the form of visual images, associated to a "conceptual frame" that provides a…

  10. Flavor Memory

    NARCIS (Netherlands)

    Mojet, Jos; Köster, Ep

    2016-01-01

    Odor, taste, texture, temperature, and pain all contribute to the perception and memory of food flavor. Flavor memory is also strongly linked to the situational aspects of previous encounters with the flavor, but does not depend on the precise recollection of its sensory features as in vision and

  11. Main Memory

    NARCIS (Netherlands)

    P.A. Boncz (Peter); L. Liu (Lei); M. Tamer Özsu

    2008-01-01

    Primary storage, presently known as main memory, is the largest memory directly accessible to the CPU in the prevalent Von Neumann model and stores both data and instructions (program code). The CPU continuously reads instructions stored there and executes them. It is also called Random

  12. Accessing memory

    Science.gov (United States)

    Yoon, Doe Hyun; Muralimanohar, Naveen; Chang, Jichuan; Ranganthan, Parthasarathy

    2017-09-26

    A disclosed example method involves performing simultaneous data accesses on at least first and second independently selectable logical sub-ranks to access first data via a wide internal data bus in a memory device. The memory device includes a translation buffer chip, memory chips in independently selectable logical sub-ranks, a narrow external data bus to connect the translation buffer chip to a memory controller, and the wide internal data bus between the translation buffer chip and the memory chips. A data access is performed on only the first independently selectable logical sub-rank to access second data via the wide internal data bus. The example method also involves locating a first portion of the first data, a second portion of the first data, and the second data on the narrow external data bus during separate data transfers.

  13. Digital Extension of Music Memory Music as a Collective Cultural Memory

    Directory of Open Access Journals (Sweden)

    Dimitrije Buzarovski

    2014-11-01

    Artistic works represent a very important part of collective cultural memory. Every artistic work, by definition, can confirm its existence only through its presence in collective cultural memory. The migration from the author’s individual memory to common collective cultural memory forms the cultural heritage. This applies equally to tangible and intangible cultural artifacts. Being part of collective cultural memory, music reflects the spatial (geographic) and temporal (historic) dimensions of this memory. Until the appearance of written signs (scores), music was preserved only through collective cultural memory. Scores have facilitated further distribution of music artifacts. The appearance of different means for audio, and later audio/video, recording has greatly improved the distribution of music. The transition from analog to digital recording and carriers has been a revolutionary step which substantially extended the chances for the survival of music artifacts in collective memory.

  14. Memory Reconsolidation.

    Science.gov (United States)

    Haubrich, Josue; Nader, Karim

    2018-01-01

    Scientific advances in recent decades have revealed that memory is not a stable, fixed entity. Apparently stable memories may become transiently labile and susceptible to modifications when retrieved due to the process of reconsolidation. Here, we review the initial evidence and the logic on which reconsolidation theory is based, the wide range of conditions in which it has been reported and recent findings further revealing the fascinating nature of this process. Special focus is given to conceptual issues of when and why reconsolidation happens and its possible outcomes. Last, we discuss the potential clinical implications of memory modifications by reconsolidation.

  15. Olfactory Memory

    Science.gov (United States)

    Eichenbaum, Howard; Robitsek, R. Jonathan

    2009-01-01

    Odor-recognition memory in rodents may provide a valuable model of cognitive aging. In a recent study we used signal detection analyses to distinguish odor recognition based on recollection versus that based on familiarity. Aged rats were selectively impaired in recollection, with relative sparing of familiarity, and the deficits in recollection were correlated with spatial memory impairments. These results complement electro-physiological findings indicating age-associated deficits in the ability of hippocampal neurons to differentiate contextual information, and this information-processing impairment may underlie the common age-associated decline in olfactory and spatial memory. PMID:19686208

  16. Distributed computing feasibility in a non-dedicated homogeneous distributed system

    Science.gov (United States)

    Leutenegger, Scott T.; Sun, Xian-He

    1993-01-01

    The low cost and availability of clusters of workstations have led researchers to re-explore distributed computing using independent workstations. This approach may provide better cost/performance than tightly coupled multiprocessors. In practice, this approach often utilizes wasted cycles to run parallel jobs. The feasibility of such a non-dedicated parallel processing environment, assuming workstation processes have preemptive priority over parallel tasks, is addressed. An analytical model is developed to predict parallel job response times. Our model provides insight into how significantly workstation owner interference degrades parallel program performance. A new term, task ratio, which relates the parallel task demand to the mean service demand of nonparallel workstation processes, is introduced. It is proposed that task ratio is a useful metric for determining how large the demand of a parallel application must be in order to make efficient use of a non-dedicated distributed system.

  17. Thermodynamic Model of Spatial Memory

    Science.gov (United States)

    Kaufman, Miron; Allen, P.

    1998-03-01

    We develop and test a thermodynamic model of spatial memory. Our model is an application of statistical thermodynamics to cognitive science. It is related to applications of the statistical mechanics framework in parallel distributed processes research. Our macroscopic model allows us to evaluate an entropy associated with spatial memory tasks. We find that older adults exhibit higher levels of entropy than younger adults. Thurstone's Law of Categorical Judgment, according to which the discriminal processes along the psychological continuum produced by presentations of a single stimulus are normally distributed, is explained by using a Hooke spring model of spatial memory. We have also analyzed a nonlinear modification of the ideal spring model of spatial memory. This work is supported by NIH/NIA grant AG09282-06.

  18. Improved polycrystalline Ni54Mn16Fe9Ga21 high-temperature shape memory alloy by γ phase distributing along grain boundaries

    Energy Technology Data Exchange (ETDEWEB)

    Yang, Shuiyuan; Zhang, Fan; Zhang, Kaixin; Huang, Yangyang; Wang, Cuiping; Liu, Xingjun [Xiamen Univ. (China). Fujian Key Laboratory of Materials Genome

    2016-09-15

    In this study, the shape recovery and mechanical properties of Ni54Mn16Fe9Ga21 high-temperature shape memory alloy are improved simultaneously. This results from the low, about 4.4%, volume fraction of γ phase being almost completely distributed along grain boundaries. The recovery strain gradually increases with the increase in residual strain with a shape recovery rate of above 68%, up to a maximum value of 5.3%. The compressive fracture strain of Ni54Mn16Fe9Ga21 alloy is about 35%. The results further reveal that when applying a high compression deformation two types of cracks form and propagate either within martensite grains (type I) or along the boundaries between martensite phase and γ phase (type II) in the present two-phase alloy.

  19. Multiferroic Memories

    Directory of Open Access Journals (Sweden)

    Amritendu Roy

    2012-01-01

    Multiferroism implies the simultaneous presence of more than one ferroic characteristic, such as the coexistence of ferroelectric and magnetic ordering. This phenomenon has led to the development of various kinds of materials and to the conception of many novel applications, such as a memory device that exploits the multifunctionality of multiferroic materials to give a multistate memory with electrical writing and nondestructive magnetic reading. Although the interdependence of the electrical and magnetic order parameters makes this difficult to accomplish, restricting the device to only two switchable states, recent research has shown that such problems can be circumvented by novel device designs such as the formation of a tunnel junction or the use of exchange bias. In this paper, we review the operational aspects of multiferroic memories as well as the materials used for these applications, along with the designs that hold promise for future memory devices.

  20. Color Memory

    OpenAIRE

    Pate, Monica; Raclariu, Ana-Maria; Strominger, Andrew

    2017-01-01

    A transient color flux across null infinity in classical Yang-Mills theory is considered. It is shown that a pair of test `quarks' initially in a color singlet generically acquire net color as a result of the flux. A nonlinear formula is derived for the relative color rotation of the quarks. For weak color flux the formula linearizes to the Fourier transform of the soft gluon theorem. This color memory effect is the Yang-Mills analog of the gravitational memory effect.

  1. Three Types of Memory in Emergency Medical Services Communication

    Science.gov (United States)

    Angeli, Elizabeth L.

    2015-01-01

    This article examines memory and distributed cognition involved in the writing practices of emergency medical services (EMS) professionals. Results from a 16-month study indicate that EMS professionals rely on distributed cognition and three kinds of memory: individual, collaborative, and professional. Distributed cognition and the three types of…

  2. The reminiscence bump in autobiographical memory and for public events

    DEFF Research Database (Denmark)

    Koppel, Jonathan; Berntsen, Dorthe

    2016-01-01

    The reminiscence bump has been found for both autobiographical memories and memories of public events. However, there have been few comparisons of the bump across each type of event. In the current study, therefore, we compared the bump for autobiographical memories versus the bump for memories of public events. We did so between-subjects, through two cueing methods administered within-subjects, the cue word method and the important memories method. For word-cued memories, we found a similar bump from ages 5 to 19 for both types of memories. However, the bump was more pronounced for autobiographical memories. For most important memories, we found a bump from ages 20 to 29 in autobiographical memory, but little discernible age pattern for public events. Rather, specific public events (e.g., the Fall of the Berlin Wall) dominated recall, producing a chronological distribution characterised...

  3. Event boundaries and memory improvement.

    Science.gov (United States)

    Pettijohn, Kyle A; Thompson, Alexis N; Tamplin, Andrea K; Krawietz, Sabine A; Radvansky, Gabriel A

    2016-03-01

    The structure of events can influence later memory for information that is embedded in them, with evidence indicating that event boundaries can both impair and enhance memory. The current study explored whether the presence of event boundaries during encoding can structure information to improve memory. In Experiment 1, memory for a list of words was tested in which event structure was manipulated by having participants walk through a doorway, or not, halfway through the word list. In Experiment 2, memory for lists of words was tested in which event structure was manipulated using computer windows. Finally, in Experiments 3 and 4, event structure was manipulated by having event shifts described in narrative texts. The consistent finding across all of these methods and materials was that memory was better when the information was distributed across two events rather than combined into a single event. Moreover, Experiment 4 demonstrated that increasing the number of event boundaries from one to two increased the memory benefit. These results are interpreted in the context of the Event Horizon Model of event cognition. Copyright © 2015 Elsevier B.V. All rights reserved.

  4. Self-powered information measuring wireless networks using the distribution of tasks within multicore processors

    Science.gov (United States)

    Zhuravska, Iryna M.; Koretska, Oleksandra O.; Musiyenko, Maksym P.; Surtel, Wojciech; Assembay, Azat; Kovalev, Vladimir; Tleshova, Akmaral

    2017-08-01

    The article presents basic approaches to developing self-powered information measuring wireless networks (SPIM-WN) for critical applications, using the distribution of tasks within multicore processors and based on the interaction of movable components, both in the direction of data transmission and in the wireless transfer of energy coming from polymetric sensors. The basic mathematical model for scheduling tasks within multiprocessor systems was updated to schedule and allocate tasks among the cores of a single-chip computer (SoC) in order to increase the energy efficiency of SPIM-WN objects.

  5. Portable software for distributed readout controllers and event builders in FASTBUS and VME

    International Nuclear Information System (INIS)

    Pordes, R.; Berg, D.; Berman, E.; Bernett, M.; Brown, D.; Constanta-Fanourakis, P.; Dorries, T.; Haire, M.; Joshi, U.; Kaczar, K.; Mackinnon, B.; Moore, C.; Nicinski, T.; Oleynik, G.; Petravick, D.; Sergey, G.; Slimmer, D.; Streets, J.; Votava, M.; White, V.

    1989-12-01

    We report on software developed as part of the PAN-DA system to support the functions of front end readout controllers and event builders in multiprocessor, multilevel, distributed data acquisition systems. For the next generation data acquisition system we have undertaken to design and implement software tools that are easily transportable to new modules. The first implementation of this software is for Motorola 68K series processor boards in FASTBUS and VME and will be used in the Fermilab accelerator run at the beginning of 1990. We use a Real Time Kernel Operating System. The software provides general connectivity tools for control, diagnosis and monitoring. 17 refs., 7 figs

  6. Application of the coupled code Athlet-Quabox/Cubbox for the extreme scenarios of the OECD/NRC BWR turbine trip benchmark and its performance on multi-processor computers

    International Nuclear Information System (INIS)

    Langenbuch, S.; Schmidt, K.D.; Velkov, K.

    2003-01-01

    The OECD/NRC BWR Turbine Trip (TT) Benchmark is investigated to perform code-to-code comparison of coupled codes, including a comparison to measured data which are available from turbine trip experiments at Peach Bottom 2. This Benchmark problem for a BWR over-pressure transient represents a challenging application of coupled codes which integrate 3-dimensional neutron kinetics into thermal-hydraulic system codes for best-estimate simulation of plant transients. This transient represents a typical application of coupled codes, which are usually run on powerful workstations using a single CPU. Nowadays, multi-CPU systems are much more readily available. Indeed, powerful workstations already provide 4 to 8 CPUs, and computer centers give access to multi-processor systems with numbers of CPUs in the order of 16 up to several 100. Therefore, the performance of the coupled code Athlet-Quabox/Cubbox on multi-processor systems is studied. Different cases of application lead to changing requirements for code efficiency, because the amount of computer time spent in different parts of the code varies. This paper presents the main results of the coupled code Athlet-Quabox/Cubbox for the extreme scenarios of the BWR TT Benchmark together with evaluations of the code performance on multi-processor computers. (authors)

  7. Holographic memories

    DEFF Research Database (Denmark)

    Ramanujam, P.S.; Berg, R.H.; Hvilsted, Søren

    1999-01-01

    A two-dimensional holographic memory for archival storage is described. Assuming a coherent transfer function, an A4 page can be stored at high resolution in an area of 1 mm². Recently developed side-chain liquid crystalline azobenzene polyesters are found to be suitable media for holographic...

  8. Sharing Memories

    DEFF Research Database (Denmark)

    Rodil, Kasper; Nielsen, Emil Byskov; Nielsen, Jonathan Bernstorff

    2018-01-01

    in which it was to be contextualized and through a close partnership between aphasics and their caretakers. The underlying design methodology for the MemoryBook is Participatory Design manifested through the collaboration and creations by two aphasic residents and one member of the support staff. The idea...

  9. Memory consolidation

    NARCIS (Netherlands)

    Takashima, A.; Bakker, I.; Schmid, H.-J.

    2016-01-01

    In order to make use of novel experiences and knowledge to guide our future behavior, we must keep large amounts of information accessible for retrieval. The memory system that stores this information needs to be flexible in order to rapidly incorporate incoming information, but also requires that

  10. Skilled Memory.

    Science.gov (United States)

    1980-11-06

    Woodworth, R. S. Experimental Psychology. New York: Henry Holt and Co., 1938. Yates, F. A. The art of memory. London: Routledge and Kegan Paul, 1966...

  11. Milestoning with transition memory

    Science.gov (United States)

    Hawk, Alexander T.; Makarov, Dmitrii E.

    2011-12-01

    Milestoning is a method used to calculate the kinetics and thermodynamics of molecular processes occurring on time scales that are not accessible to brute force molecular dynamics (MD). In milestoning, the conformation space of the system is sectioned by hypersurfaces (milestones), an ensemble of trajectories is initialized on each milestone, and MD simulations are performed to calculate transitions between milestones. The transition probabilities and transition time distributions are then used to model the dynamics of the system with a Markov renewal process, wherein a long trajectory of the system is approximated as a succession of independent transitions between milestones. This approximation is justified if the transition probabilities and transition times are statistically independent. In practice, this amounts to a requirement that milestones are spaced such that trajectories lose position and velocity memory between subsequent transitions. Unfortunately, limiting the number of milestones limits both the resolution at which a system's properties can be analyzed, and the computational speedup achieved by the method. We propose a generalized milestoning procedure, milestoning with transition memory (MTM), which accounts for memory of previous transitions made by the system. When a reaction coordinate is used to define the milestones, the MTM procedure can be carried out at no significant additional expense as compared to conventional milestoning. To test MTM, we have applied its version that allows for the memory of the previous step to the toy model of a polymer chain undergoing Langevin dynamics in solution. We have computed the mean first passage time for the chain to attain a cyclic conformation and found that the number of milestones that can be used, without incurring significant errors in the first passage time is at least 8 times that permitted by conventional milestoning. We further demonstrate that, unlike conventional milestoning, MTM permits

  12. The mysteries of remote memory

    Science.gov (United States)

    2018-01-01

    Long-lasting memories form the basis of our identity as individuals and lie central in shaping future behaviours that guide survival. Surprisingly, however, our current knowledge of how such memories are stored in the brain and retrieved, as well as the dynamics of the circuits involved, remains scarce despite seminal technical and experimental breakthroughs in recent years. Traditionally, it has been proposed that, over time, information initially learnt in the hippocampus is stored in distributed cortical networks. This process—the standard theory of memory consolidation—would stabilize the newly encoded information into a lasting memory, become independent of the hippocampus, and remain essentially unmodifiable throughout the lifetime of the individual. In recent years, several pieces of evidence have started to challenge this view and indicate that long-lasting memories might already ab ovo be encoded, and subsequently stored in distributed cortical networks, akin to the multiple trace theory of memory consolidation. In this review, we summarize these recent findings and attempt to identify the biologically plausible mechanisms based on which a contextual memory becomes remote by integrating different levels of analysis: from neural circuits to cell ensembles across synaptic remodelling and epigenetic modifications. From these studies, remote memory formation and maintenance appear to occur through a multi-trace, dynamic and integrative cellular process ranging from the synapse to the nucleus, and represent an exciting field of research primed to change quickly as new experimental evidence emerges. This article is part of a discussion meeting issue ‘Of mice and mental health: facilitating dialogue between basic and clinical neuroscientists’. PMID:29352028

  13. Episodic memory in aspects of large-scale brain networks

    Science.gov (United States)

    Jeong, Woorim; Chung, Chun Kee; Kim, June Sic

    2015-01-01

    Understanding human episodic memory in aspects of large-scale brain networks has become one of the central themes in neuroscience over the last decade. Traditionally, episodic memory was regarded as mostly relying on medial temporal lobe (MTL) structures. However, recent studies have suggested involvement of more widely distributed cortical network and the importance of its interactive roles in the memory process. Both direct and indirect neuro-modulations of the memory network have been tried in experimental treatments of memory disorders. In this review, we focus on the functional organization of the MTL and other neocortical areas in episodic memory. Task-related neuroimaging studies together with lesion studies suggested that specific sub-regions of the MTL are responsible for specific components of memory. However, recent studies have emphasized that connectivity within MTL structures and even their network dynamics with other cortical areas are essential in the memory process. Resting-state functional network studies also have revealed that memory function is subserved by not only the MTL system but also a distributed network, particularly the default-mode network (DMN). Furthermore, researchers have begun to investigate memory networks throughout the entire brain not restricted to the specific resting-state network (RSN). Altered patterns of functional connectivity (FC) among distributed brain regions were observed in patients with memory impairments. Recently, studies have shown that brain stimulation may impact memory through modulating functional networks, carrying future implications of a novel interventional therapy for memory impairment. PMID:26321939

  14. Episodic memory in aspects of large-scale brain networks

    Directory of Open Access Journals (Sweden)

    Woorim Jeong

    2015-08-01

    Understanding human episodic memory in aspects of large-scale brain networks has become one of the central themes in neuroscience over the last decade. Traditionally, episodic memory was regarded as mostly relying on medial temporal lobe (MTL) structures. However, recent studies have suggested involvement of more widely distributed cortical network and the importance of its interactive roles in the memory process. Both direct and indirect neuro-modulations of the memory network have been tried in experimental treatments of memory disorders. In this review, we focus on the functional organization of the MTL and other neocortical areas in episodic memory. Task-related neuroimaging studies together with lesion studies suggested that specific sub-regions of the MTL are responsible for specific components of memory. However, recent studies have emphasized that connectivity within MTL structures and even their network dynamics with other cortical areas are essential in the memory process. Resting-state functional network studies also have revealed that memory function is subserved by not only the MTL system but also a distributed network, particularly the default-mode network. Furthermore, researchers have begun to investigate memory networks throughout the entire brain not restricted to the specific resting-state network. Altered patterns of functional connectivity among distributed brain regions were observed in patients with memory impairments. Recently, studies have shown that brain stimulation may impact memory through modulating functional networks, carrying future implications of a novel interventional therapy for memory impairment.

  15. Ferrite materials for memory applications

    CERN Document Server

    Saravanan, R

    2017-01-01

    The book discusses the synthesis and characterization of various ferrite materials used for memory applications. The distinct feature of the book is the construction of charge density of ferrites by deploying the maximum entropy method (MEM). This charge density gives the distribution of charges in the ferrite unit cell, which is analyzed for charge related properties.

  16. Concrete Memories

    DEFF Research Database (Denmark)

    Wiegand, Frauke Katharina

    2015-01-01

    This article traces the presence of Atlantikwall bunkers in amateur holiday snapshots and discusses the ambiguous role of the bunker site in visual cultural memory. Departing from my family’s private photo collection from twenty years of vacationing at the Danish West coast, the different mundane and poetic appropriations and inscriptions of the bunker site are depicted. Ranging between overlooked side presences and an overwhelming visibility, the concrete remains of fascist war architecture are involved in and motivate different sensuous experiences and mnemonic appropriations. The article meets the bunkers’ changing visuality and the cultural topography they both actively transform and are being transformed by through juxtaposing different acts and objects of memory over time and in different visual articulations.

  17. Treadwell Memorial

    OpenAIRE

    Downey, Frances K

    2015-01-01

    This is a memorial to gold mining in Southeast Alaska. The structure takes visitors from the Treadwell trail onto the edge of a popular local beach, reclaiming a forgotten place that was once the largest gold mine in the world. A tangible tribute to this obscure period of history, this building kindles a connection between artifacts and the community. It is a liminal space, connecting ocean and mountain, past and present, civilization and wilderness. An investigation of the Treadwell Gold...

  18. Neuroanatomic organization of sound memory in humans.

    Science.gov (United States)

    Kraut, Michael A; Pitcock, Jeffery A; Calhoun, Vince; Li, Juan; Freeman, Thomas; Hart, John

    2006-11-01

    The neural interface between sensory perception and memory is a central issue in neuroscience, particularly initial memory organization following perceptual analyses. We used functional magnetic resonance imaging to identify anatomic regions extracting initial auditory semantic memory information related to environmental sounds. Two distinct anatomic foci were detected in the right superior temporal gyrus when subjects identified sounds representing either animals or threatening items. Threatening animal stimuli elicited signal changes in both foci, suggesting a distributed neural representation. Our results demonstrate both category- and feature-specific responses to nonverbal sounds in early stages of extracting semantic memory information from these sounds. This organization allows for these category-feature detection nodes to extract early, semantic memory information for efficient processing of transient sound stimuli. Neural regions selective for threatening sounds are similar to those of nonhuman primates, demonstrating semantic memory organization for basic biological/survival primitives are present across species.

  19. Testing for structural change in the presence of long memory

    OpenAIRE

    Krämer, Walter; Sibbertsen, Philipp

    2000-01-01

    We derive the limiting null distributions of the standard and OLS-based CUSUM-tests for structural change of the coefficients of a linear regression model in the context of long memory disturbances. We show that both tests behave fundamentally differently in a long memory environment, as compared to short memory, and that long memory is easily mistaken for structural change when standard critical values are employed.

  20. Enhancement of Immune Memory Responses to Respiratory Infection

    Science.gov (United States)

    2017-08-01

    Maintenance of long-term immunological memory against pathogens is crucial for the rapid... highly expressed in memory B cells in mice, and Atg7 is required for maintenance of long-term memory B cells needed to protect against influenza... AWARD NUMBER: W81XWH-16-1-0361 TITLE: Enhancement of Immune Memory Responses to Respiratory Infection PRINCIPAL INVESTIGATORs: Dr Farrah

  1. Optimal data replication: A new approach to optimizing parallel EM algorithms on a mesh-connected multiprocessor for 3D PET image reconstruction

    International Nuclear Information System (INIS)

    Chen, C.M.; Lee, S.Y.

    1995-01-01

    The EM algorithm promises an estimated image with the maximal likelihood for 3D PET image reconstruction. However, due to its long computation time, the EM algorithm has not been widely used in practice. While several parallel implementations of the EM algorithm have been developed to make the EM algorithm feasible, they do not guarantee an optimal parallelization efficiency. In this paper, the authors propose a new parallel EM algorithm which maximizes the performance by optimizing data replication on a mesh-connected message-passing multiprocessor. To optimize data replication, the authors have formally derived the optimal allocation of shared data, group sizes, integration and broadcasting of replicated data as well as the scheduling of shared data accesses. The proposed parallel EM algorithm has been implemented on an iPSC/860 with 16 PEs. The experimental and theoretical results, which are consistent with each other, have shown that the proposed parallel EM algorithm could improve performance substantially over those using unoptimized data replication

  2. Odor Preference Learning and Memory Modify GluA1 Phosphorylation and GluA1 Distribution in the Neonate Rat Olfactory Bulb: Testing the AMPA Receptor Hypothesis in an Appetitive Learning Model

    Science.gov (United States)

    Cui, Wen; Darby-King, Andrea; Grimes, Matthew T.; Howland, John G.; Wang, Yu Tian; McLean, John H.; Harley, Carolyn W.

    2011-01-01

    An increase in synaptic AMPA receptors is hypothesized to mediate learning and memory. AMPA receptor increases have been reported in aversive learning models, although it is not clear if they are seen with memory maintenance. Here we examine AMPA receptor changes in a cAMP/PKA/CREB-dependent appetitive learning model: odor preference learning in…

  3. Transactional Memory

    CERN Document Server

    Harris, Tim; Rajwar, Ravi

    2010-01-01

    The advent of multicore processors has renewed interest in the idea of incorporating transactions into the programming model used to write parallel programs. This approach, known as transactional memory, offers an alternative, and hopefully better, way to coordinate concurrent threads. The ACI (atomicity, consistency, isolation) properties of transactions provide a foundation to ensure that concurrent reads and writes of shared data do not produce inconsistent or incorrect results. At a higher level, a computation wrapped in a transaction executes atomically - either it completes successfully and

  4. Modeling Confidence and Response Time in Recognition Memory

    Science.gov (United States)

    Ratcliff, Roger; Starns, Jeffrey J.

    2009-01-01

    A new model for confidence judgments in recognition memory is presented. In the model, the match between a single test item and memory produces a distribution of evidence, with better matches corresponding to distributions with higher means. On this match dimension, confidence criteria are placed, and the areas between the criteria under the…

  5. Intentionally fabricated autobiographical memories

    OpenAIRE

    Justice, LV; Morrison, CM; Conway, MA

    2017-01-01

    Participants generated both autobiographical memories (AMs) that they believed to be true and intentionally fabricated autobiographical memories (IFAMs). Memories were constructed while a concurrent memory load (random 8-digit sequence) was held in mind or while there was no concurrent load. Amount and accuracy of recall of the concurrent memory load was reliably poorer following generation of IFAMs than following generation of AMs. There was no reliable effect of load on memory generation ti...

  6. STRUKTUR DAN PROSES MEMORI

    Directory of Open Access Journals (Sweden)

    Magda Bhinnety

    2015-09-01

    This paper describes the structures and processes of the human memory system according to the modal model. Sensory memory is described as the first system to store information from the outside world. Short‐term memory, now called working memory, represents a system characterized by limited ability in storing as well as retrieving information. Long‐term memory, on the other hand, stores more information, and for longer, than short‐term memory.

  7. STRUKTUR DAN PROSES MEMORI

    OpenAIRE

    Bhinnety, Magda

    2015-01-01

    This paper describes the structures and processes of the human memory system according to the modal model. Sensory memory is described as the first system to store information from the outside world. Short‐term memory, now called working memory, represents a system characterized by limited ability in storing as well as retrieving information. Long‐term memory, on the other hand, stores more information, and for longer, than short‐term memory.

  8. Electroconvulsive therapy and memory.

    Science.gov (United States)

    Harper, R G; Wiens, A N

    1975-10-01

    Recent research on the effects of electroconvulsive therapy (ECT) on memory is critically reviewed. Despite some inconsistent findings, unilateral nondominant ECT appears to affect verbal memory less than bilateral ECT. Adequate research on multiple monitored ECT is lacking. With few exceptions, the research methodologies for assessing memory have been inadequate. Many studies have confounded learning with retention, and only very recently has long term memory been adequately studied. Standardized assessment procedures for short term and long term memory are needed, in addition to more sophisticated assessment of memory processes, the duration of memory loss, and qualitative aspects of memories.

  9. Distributed 3-D iterative reconstruction for quantitative SPECT

    International Nuclear Information System (INIS)

    Ju, Z.W.; Frey, E.C.; Tsui, B.M.W.

    1995-01-01

    The authors describe a distributed three-dimensional (3-D) iterative reconstruction library for quantitative single-photon emission computed tomography (SPECT). This library includes 3-D projector-backprojector pairs (PBPs) and distributed 3-D iterative reconstruction algorithms. The 3-D PBPs accurately and efficiently model various combinations of the image degrading factors, including attenuation, detector response and scatter response. These PBPs were validated by comparing projection data computed using the projectors with that from direct Monte Carlo (MC) simulations. The distributed 3-D iterative algorithms spread the projection-backprojection operations for all the projection angles over a heterogeneous network of single or multi-processor computers to reduce the reconstruction time. Based on a master/slave paradigm, these distributed algorithms provide dynamic load balancing and fault tolerance. The distributed algorithms were verified by comparing images reconstructed using both the distributed and non-distributed algorithms. Computation times for distributed 3-D reconstructions running on up to 4 identical processors were reduced by a factor of approximately 80-90% of the number of processors participating, compared to those for non-distributed 3-D reconstructions running on a single processor. When combined with faster affordable computers, this library provides an efficient means for implementing accurate reconstruction and compensation methods to improve quality and quantitative accuracy in SPECT images
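
    As a rough illustration of the scaling reported in this abstract, the sketch below assumes that the reconstruction time shrinks by a factor equal to a parallel efficiency (0.8 to 0.9) multiplied by the number of processors. The function name and the 100-second baseline are hypothetical and are not taken from the SPECT library itself.

```python
def distributed_time(single_time, n_processors, efficiency):
    """Estimated run time when the work is spread over n_processors.

    Assumes the speedup factor is efficiency * n_processors, which is
    one reading of the "80-90% times the number of processors" figure.
    """
    if n_processors < 1 or not (0.0 < efficiency <= 1.0):
        raise ValueError("need n_processors >= 1 and 0 < efficiency <= 1")
    return single_time / (efficiency * n_processors)


if __name__ == "__main__":
    baseline = 100.0  # hypothetical single-processor time in seconds
    for eff in (0.8, 0.9):
        t = distributed_time(baseline, 4, eff)
        print(f"efficiency {eff:.0%}: speedup {baseline / t:.1f}x, time {t:.2f} s")
```

    Under these assumptions, four processors would give a speedup of roughly 3.2 to 3.6 rather than the ideal 4.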

  10. Distributed Processor/Memory Architectures Design Program

    Science.gov (United States)

    1975-02-01

    ...its assigned ID, short descriptor in English, size, production rate, producer, and all consumers. In addition, a communication link matrix describing

  11. Distributed Memory Programming on Many-Cores

    DEFF Research Database (Denmark)

    Berthold, Jost; Dieterle, Mischa; Lobachev, Oleg

    2009-01-01

    Eden is a parallel extension of the lazy functional language Haskell providing dynamic process creation and automatic data exchange. As a Haskell extension, Eden takes a high-level approach to parallel programming and thereby simplifies parallel program development. The current implementation is ...

  12. Detailed sensory memory, sloppy working memory

    NARCIS (Netherlands)

    Sligte, I.G.; Vandenbroucke, A.R.E.; Scholte, H.S.; Lamme, V.A.F.

    2010-01-01

    Visual short-term memory (VSTM) enables us to actively maintain information in mind for a brief period of time after stimulus disappearance. According to recent studies, VSTM consists of three stages - iconic memory, fragile VSTM, and visual working memory - with increasingly stricter capacity

  13. Episodic memory, semantic memory, and amnesia.

    Science.gov (United States)

    Squire, L R; Zola, S M

    1998-01-01

    Episodic memory and semantic memory are two types of declarative memory. There have been two principal views about how this distinction might be reflected in the organization of memory functions in the brain. One view, that episodic memory and semantic memory are both dependent on the integrity of medial temporal lobe and midline diencephalic structures, predicts that amnesic patients with medial temporal lobe/diencephalic damage should be proportionately impaired in both episodic and semantic memory. An alternative view is that the capacity for semantic memory is spared, or partially spared, in amnesia relative to episodic memory ability. This article reviews two kinds of relevant data: 1) case studies where amnesia has occurred early in childhood, before much of an individual's semantic knowledge has been acquired, and 2) experimental studies with amnesic patients of fact and event learning, remembering and knowing, and remote memory. The data provide no compelling support for the view that episodic and semantic memory are affected differently in medial temporal lobe/diencephalic amnesia. However, episodic and semantic memory may be dissociable in those amnesic patients who additionally have severe frontal lobe damage.

  14. Smell your way back to childhood: autobiographical odor memory.

    Science.gov (United States)

    Willander, Johan; Larsson, Maria

    2006-04-01

    This study addressed age distributions and experiential qualities of autobiographical memories evoked by different sensory cues. Ninety-three older adults were presented with one of three cue types (word, picture, or odor) and were asked to relate any autobiographical event for the given cue. The main aims were to explore whether (1) the age distribution of olfactory-evoked memories differs from memories cued by words and pictures and (2) the experiential qualities of the evoked memories vary over the different cues. The results showed that autobiographical memories triggered by olfactory information were older than memories associated with verbal and visual information. Specifically, most odor-cued memories were located to the first decade of life (<10 years), whereas memories associated with verbal and visual cues peaked in early adulthood (11-20 years). Also, odor-evoked memories were associated with stronger feelings of being brought back in time and had been thought of less often than memories evoked by verbal and visual information. This pattern of findings suggests that odor-evoked memories may be different from other memory experiences.

  15. Life story chapters, specific memories and the reminiscence bump

    DEFF Research Database (Denmark)

    Thomsen, Dorthe Kirkegaard; Pillemer, David B.; Ivcevic, Zorana

    2011-01-01

    Theories of autobiographical memory posit that extended time periods (here termed chapters) and memories are organised hierarchically. If chapters organise memories and guide their recall, then chapters and memories should show similar temporal distributions over the life course. Previous research...... are over-represented at the beginning of chapters. Potential connections between chapters and the cultural life script are also examined. Adult participants first divided their life story into chapters and identified their most positive and most negative chapter. They then recalled a specific memory from...... demonstrates that positive but not negative memories show a reminiscence bump and that memories cluster at the beginning of extended time periods. The current study tested the hypotheses that (1) ages marking the beginning of positive but not negative chapters produce a bump, and that (2) specific memories...

  16. Solution of equations of electric power networks in multiprocessor computers; Solucao de equacoes de redes de energia eletrica em computadores multiprocessadores

    Energy Technology Data Exchange (ETDEWEB)

    Feltrin, Antonio Padilha

    1991-05-01

    This thesis describes a methodology for decomposing the repeated solution of the equation Ax = b into independent tasks to be done in parallel, based on the matrix inverse factors (W-matrix) with partitions. The partitioning scheme proposed in this thesis consists of breaking up the W-matrix according to the depths of the factorization path tree. In this scheme, all the information needed to generate the partitions can be obtained straightforwardly from the network factorization path tree. The partitioning algorithm is simple and easy to implement. The elements of the W-matrices, except for the last partition, can be obtained directly from the L-matrix elements, requiring no extra work. The proposed scheme guarantees that additional fills are created only in the last partition. The forward and backward solutions are performed by rows, and the proposed strategy is to schedule on each processor the operations corresponding to a row of each partition. It should be kept in mind that multiprocessor environments are equipped with powerful processor units, so it seems a sound strategy to perform the multi-add elementary operations inside the hardware in order to exploit its computing efficiency. This strategy seeks to match the parallel algorithm to the parallel architecture. The precedence relations, which give rise to delays, are replaced by multi-add operations performed inside the processor mode without external communication. In the partition, the forward, diagonal and backward solutions may be gathered, so that all the operations can be expressed as the product of the matrix W_lp by the updated components of vector b. The performance results show that the potential speedup of the solution time is essentially bounded by the floating point operation capability of each processor, indicating that the methodology is a suitable way to exploit the growing power of computing technology. 46 figs, 40 refs, 49 tabs.
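
    The idea of replacing forward and backward substitution with products by inverse factors can be sketched as follows. This is a minimal illustration, not the thesis's partitioned scheme: it assumes a small symmetric matrix factored as A = L D L^T, forms a single unpartitioned W = L^{-1}, and solves Ax = b as x = W^T D^{-1} W b. All function names are hypothetical. Each row of a matrix-vector product is an independent dot product, which is the kind of work that could be scheduled on separate processors.

```python
def ldl_factor(a):
    """Factor a symmetric matrix as A = L D L^T (L unit lower triangular)."""
    n = len(a)
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = a[j][j] - sum(l[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            s = sum(l[i][k] * l[j][k] * d[k] for k in range(j))
            l[i][j] = (a[i][j] - s) / d[j]
    return l, d


def invert_unit_lower(l):
    """Return W = L^{-1}, built column by column with forward substitution."""
    n = len(l)
    w = [[0.0] * n for _ in range(n)]
    for c in range(n):
        for i in range(n):
            rhs = 1.0 if i == c else 0.0
            w[i][c] = rhs - sum(l[i][k] * w[k][c] for k in range(i))
    return w


def matvec(m, v):
    """Matrix-vector product; every row is an independent dot product."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def solve_with_w(w, d, b):
    """Solve A x = b as x = W^T D^{-1} W b, using products only."""
    y = matvec(w, b)                          # forward step:  y = W b
    z = [yi / di for yi, di in zip(y, d)]     # diagonal step: z = D^{-1} y
    w_t = [list(col) for col in zip(*w)]
    return matvec(w_t, z)                     # backward step: x = W^T z


# Illustrative 3x3 system (values chosen only for the example).
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
L, D = ldl_factor(A)
W = invert_unit_lower(L)
x = solve_with_w(W, D, b)
```

    Once W is available, repeated solutions for new right-hand sides b need only `solve_with_w`, which is where a repeated-solution setting would benefit.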

  17. Semantic graphs and associative memories

    Science.gov (United States)

    Pomi, Andrés; Mizraji, Eduardo

    2004-12-01

    Graphs have been increasingly utilized in the characterization of complex networks from diverse origins, including different kinds of semantic networks. Human memories are associative and are known to support complex semantic nets; these nets are represented by graphs. However, it is not known how the brain can sustain these semantic graphs. The vision of cognitive brain activities, shown by modern functional imaging techniques, assigns renewed value to classical distributed associative memory models. Here we show that these neural network models, also known as correlation matrix memories, naturally support a graph representation of the stored semantic structure. We demonstrate that the adjacency matrix of this graph of associations is just the memory coded with the standard basis of the concept vector space, and that the spectrum of the graph is a code invariant of the memory. As long as the assumptions of the model remain valid, this result provides a practical method to predict and modify the evolution of the cognitive dynamics. Also, it could provide us with a way to comprehend how individual brains that map the external reality, almost surely with different particular vector representations, are nevertheless able to communicate and share a common knowledge of the world. We finish by presenting adaptive association graphs, an extension of the model that makes use of the tensor product, which provides a solution to the known problem of branching in semantic nets.
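
    The claim that a correlation matrix memory coded with the standard basis is an adjacency matrix can be illustrated with a small sketch. It assumes the usual outer-product construction, M = sum of g f^T over stored pairs (f, g), with one-hot concept vectors; the concept names and helper functions are hypothetical, and the indexing convention (rows as targets, columns as sources) is an assumption of this example.

```python
def one_hot(index, size):
    """Standard-basis vector for the concept at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def build_memory(associations, size):
    """Correlation matrix memory: sum of outer products g f^T over pairs (f -> g)."""
    m = [[0.0] * size for _ in range(size)]
    for src, dst in associations:
        f, g = one_hot(src, size), one_hot(dst, size)
        for i in range(size):
            for j in range(size):
                m[i][j] += g[i] * f[j]
    return m


def recall(m, f):
    """Query the memory with an input vector: output = M f."""
    return [sum(mij * fj for mij, fj in zip(row, f)) for row in m]


concepts = ["dog", "animal", "cat", "pet"]
edges = [(0, 1), (2, 1), (0, 3)]   # dog->animal, cat->animal, dog->pet
memory = build_memory(edges, len(concepts))

# With one-hot codes, memory[dst][src] is 1 exactly where the graph has an
# edge src -> dst, i.e. the memory is the (transposed) adjacency matrix.
dog_output = recall(memory, one_hot(0, len(concepts)))
```

    Here `dog_output` is the superposition of the "animal" and "pet" vectors, a small instance of the branching situation the abstract mentions.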

  18. Optical memory

    Science.gov (United States)

    Mao, Samuel S; Zhang, Yanfeng

    2013-07-02

    Optical memory comprising: a semiconductor wire, a first electrode, a second electrode, a light source, a means for producing a first voltage at the first electrode, a means for producing a second voltage at the second electrode, and a means for determining the presence of an electrical voltage across the first electrode and the second electrode exceeding a predefined voltage. The first voltage, preferably less than 0 volts, is different from said second voltage. The semiconductor wire is optically transparent and has a bandgap less than the energy produced by the light source. The light source is optically connected to the semiconductor wire. The first electrode and the second electrode are electrically insulated from each other and said semiconductor wire.

  19. Long-term pitch memory for music recordings is related to auditory working memory precision.

    Science.gov (United States)

    Van Hedger, Stephen C; Heald, Shannon Lm; Nusbaum, Howard C

    2018-04-01

    Most individuals have reliable long-term memories for the pitch of familiar music recordings. This pitch memory (1) appears to be normally distributed in the population, (2) does not depend on explicit musical training and (3) only seems to be weakly related to differences in listening frequency estimates. The present experiment was designed to assess whether individual differences in auditory working memory could explain variance in long-term pitch memory for music recordings. In Experiment 1, participants first completed a musical note adjustment task that has been previously used to assess working memory of musical pitch. Afterward, participants were asked to judge the pitch of well-known music recordings, which either had or had not been shifted in pitch. We found that performance on the pitch working memory task was significantly related to performance in the pitch memory task using well-known recordings, even when controlling for overall musical experience and familiarity with each recording. In Experiment 2, we replicated these findings in a separate group of participants while additionally controlling for fluid intelligence and non-pitch-based components of auditory working memory. In Experiment 3, we demonstrated that participants could not accurately judge the pitch of unfamiliar recordings, suggesting that our method of pitch shifting did not result in unwanted acoustic cues that could have aided participants in Experiments 1 and 2. These results, taken together, suggest that the ability to maintain pitch information in working memory might lead to more accurate long-term pitch memory.

  20. Memory operation mechanism of fullerene-containing polymer memory

    Energy Technology Data Exchange (ETDEWEB)

    Nakajima, Anri, E-mail: anakajima@hiroshima-u.ac.jp; Fujii, Daiki [Research Institute for Nanodevice and Bio Systems, Hiroshima University, 1-4-2 Kagamiyama, Higashihiroshima, Hiroshima 739-8527 (Japan)

    2015-03-09

    The memory operation mechanism in fullerene-containing nanocomposite gate insulators was investigated while varying the kind of fullerene in a polymer gate insulator. It was clarified in what kind of traps and at which positions in the nanocomposite the injected electrons or holes are stored. The reason for the difference in the ease of programming was clarified by taking the role of the charging energy of an injected electron into account. The dependence of the carrier dynamics on the kind of fullerene molecule was investigated. A nonuniform distribution of injected carriers occurred after application of a large-magnitude programming voltage due to the width distribution of the polystyrene barrier between adjacent fullerene molecules. Through the investigations, we demonstrated a nanocomposite gate with fullerene molecules having excellent retention characteristics and a programming capability. This will lead to the realization of practical organic memories with fullerene-containing polymer nanocomposites.

  1. Memory, microprocessor, and ASIC

    CERN Document Server

    Chen, Wai-Kai

    2003-01-01

    System Timing. ROM/PROM/EPROM. SRAM. Embedded Memory. Flash Memories. Dynamic Random Access Memory. Low-Power Memory Circuits. Timing and Signal Integrity Analysis. Microprocessor Design Verification. Microprocessor Layout Method. Architecture. ASIC Design. Logic Synthesis for Field Programmable Gate Array (FPGA) Technology. Testability Concepts and DFT. ATPG and BIST. CAD Tools for BIST/DFT and Delay Faults.

  2. Infant Visual Recognition Memory

    Science.gov (United States)

    Rose, Susan A.; Feldman, Judith F.; Jankowski, Jeffery J.

    2004-01-01

    Visual recognition memory is a robust form of memory that is evident from early infancy, shows pronounced developmental change, and is influenced by many of the same factors that affect adult memory; it is surprisingly resistant to decay and interference. Infant visual recognition memory shows (a) modest reliability, (b) good discriminant…

  3. The Generalized Quantum Episodic Memory Model.

    Science.gov (United States)

    Trueblood, Jennifer S; Hemmer, Pernille

    2017-11-01

    Recent evidence suggests that experienced events are often mapped to too many episodic states, including those that are logically or experimentally incompatible with one another. For example, episodic over-distribution patterns show that the probability of accepting an item under different mutually exclusive conditions violates the disjunction rule. A related example, called subadditivity, occurs when the probability of accepting an item under mutually exclusive and exhaustive instruction conditions sums to a number >1. Both the over-distribution effect and subadditivity have been widely observed in item and source-memory paradigms. These phenomena are difficult to explain using standard memory frameworks, such as signal-detection theory. A dual-trace model called the over-distribution (OD) model (Brainerd & Reyna, 2008) can explain the episodic over-distribution effect, but not subadditivity. Our goal is to develop a model that can explain both effects. In this paper, we propose the Generalized Quantum Episodic Memory (GQEM) model, which extends the Quantum Episodic Memory (QEM) model developed by Brainerd, Wang, and Reyna (2013). We test GQEM by comparing it to the OD model using data from a novel item-memory experiment and a previously published source-memory experiment (Kellen, Singmann, & Klauer, 2014) examining the over-distribution effect. Using the best-fit parameters from the over-distribution experiments, we conclude by showing that the GQEM model can also account for subadditivity. Overall these results add to a growing body of evidence suggesting that quantum probability theory is a valuable tool in modeling recognition memory. Copyright © 2016 Cognitive Science Society, Inc.

  4. Nanoscale memory devices

    International Nuclear Information System (INIS)

    Chung, Andy; Deen, Jamal; Lee, Jeong-Soo; Meyyappan, M

    2010-01-01

    This article reviews the current status and future prospects for the use of nanomaterials and devices in memory technology. First, the status and continuing scaling trends of the flash memory are discussed. Then, a detailed discussion on technologies trying to replace flash in the near-term is provided. This includes phase change random access memory, Fe random access memory and magnetic random access memory. The long-term nanotechnology prospects for memory devices include carbon-nanotube-based memory, molecular electronics and memristors based on resistive materials such as TiO2. (topical review)

  5. Visual memory needs categories

    OpenAIRE

    Olsson, Henrik; Poom, Leo

    2005-01-01

    Capacity limitations in the way humans store and process information in working memory have been extensively studied, and several memory systems have been distinguished. In line with previous capacity estimates for verbal memory and memory for spatial information, recent studies suggest that it is possible to retain up to four objects in visual working memory. The objects used have typically been categorically different colors and shapes. Because knowledge about categories is stored in long-t...

  6. Non-volatile memories

    CERN Document Server

    Lacaze, Pierre-Camille

    2014-01-01

    Written for scientists, researchers, and engineers, Non-volatile Memories describes the recent research and implementations in relation to the design of a new generation of non-volatile electronic memories. The objective is to replace existing memories (DRAM, SRAM, EEPROM, Flash, etc.) with a universal memory model likely to reach better performances than the current types of memory: extremely high commutation speeds, high implantation densities and retention time of information of about ten years.

  7. Memory: sins and virtues

    OpenAIRE

    Schacter, Daniel L.

    2013-01-01

    Memory plays an important role in everyday life but does not provide an exact and unchanging record of experience: research has documented that memory is a constructive process that is subject to a variety of errors and distortions. Yet these memory “sins” also reflect the operation of adaptive aspects of memory. Memory can thus be characterized as an adaptive constructive process, which plays a functional role in cognition but produces distortions, errors, or illusions as a consequence of d...

  8. Memory for Light as a Quantum Process

    International Nuclear Information System (INIS)

    Lobino, M.; Kupchak, C.; Lvovsky, A. I.; Figueroa, E.

    2009-01-01

    We report complete characterization of an optical memory based on electromagnetically induced transparency. We recover the superoperator associated with the memory, under two different working conditions, by means of a quantum process tomography technique that involves storage of coherent states and their characterization upon retrieval. In this way, we can predict the quantum state retrieved from the memory for any input, for example, the squeezed vacuum or the Fock state. We employ the acquired superoperator to verify the nonclassicality benchmark for the storage of a Gaussian distributed set of coherent states.

  9. Implementation and Performance of Munin

    OpenAIRE

    Bennett, J.K.; Carter, J.B.; Zwaenepoel, W

    1991-01-01

    Munin is a distributed shared memory (DSM) system that allows shared memory parallel programs to be executed efficiently on distributed memory multiprocessors. Munin is unique among existing DSM systems in its use of multiple consistency protocols and in its use of release consistency. In Munin, shared program variables are annotated with their expected access pattern, and these annotations are then used by the runtime system to choose a consistency protocol best suited to that access patt...

  10. Memory Transformation Enhances Reinforcement Learning in Dynamic Environments.

    Science.gov (United States)

    Santoro, Adam; Frankland, Paul W; Richards, Blake A

    2016-11-30

    Over the course of systems consolidation, there is a switch from a reliance on detailed episodic memories to generalized schematic memories. This switch is sometimes referred to as "memory transformation." Here we demonstrate a previously unappreciated benefit of memory transformation, namely, its ability to enhance reinforcement learning in a dynamic environment. We developed a neural network that is trained to find rewards in a foraging task where reward locations are continuously changing. The network can use memories for specific locations (episodic memories) and statistical patterns of locations (schematic memories) to guide its search. We find that switching from an episodic to a schematic strategy over time leads to enhanced performance due to the tendency for the reward location to be highly correlated with itself in the short-term, but regress to a stable distribution in the long-term. We also show that the statistics of the environment determine the optimal utilization of both types of memory. Our work recasts the theoretical question of why memory transformation occurs, shifting the focus from the avoidance of memory interference toward the enhancement of reinforcement learning across multiple timescales. As time passes, memories transform from a highly detailed state to a more gist-like state, in a process called "memory transformation." Theories of memory transformation speak to its advantages in terms of reducing memory interference, increasing memory robustness, and building models of the environment. However, the role of memory transformation from the perspective of an agent that continuously acts and receives reward in its environment is not well explored. In this work, we demonstrate a view of memory transformation that defines it as a way of optimizing behavior across multiple timescales. Copyright © 2016 the authors 0270-6474/16/3612228-15$15.00/0.

  11. Examining procedural working memory processing in obsessive-compulsive disorder.

    Science.gov (United States)

    Shahar, Nitzan; Teodorescu, Andrei R; Anholt, Gideon E; Karmon-Presser, Anat; Meiran, Nachshon

    2017-07-01

    Previous research has suggested that a deficit in working memory might underlie the difficulty of obsessive-compulsive disorder (OCD) patients to control their thoughts and actions. However, a recent meta-analysis found only small effect sizes for working memory deficits in OCD. Recently, a distinction has been made between declarative and procedural working memory. Working memory in OCD was tested mostly using declarative measurements. However, OCD symptoms typically concern actions, making procedural working memory more relevant. Here, we tested the operation of procedural working memory in OCD. Participants with OCD and healthy controls performed a battery of choice reaction tasks under high and low procedural working memory demands. Reaction times (RT) were estimated using ex-Gaussian distribution fitting, revealing no group differences in the size of the RT distribution tail (i.e., τ parameter), known to be sensitive to procedural working memory manipulations. Group differences, unrelated to working memory manipulations, were found in the leading edge of the RT distribution and analyzed using a two-stage evidence accumulation model. Modeling results suggested that perceptual difficulties might underlie the current group differences. In conclusion, our results suggest that procedural working memory processing is most likely intact in OCD, and raise a novel, yet untested assumption regarding perceptual deficits in OCD. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.
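
    To make the role of the τ parameter concrete, the sketch below simulates reaction times as the sum of a Gaussian component (μ, σ) and an exponential component (mean τ), then recovers the parameters. It deliberately swaps in a simple method-of-moments estimator rather than whatever fitting procedure the study used, and the parameter values and function names are hypothetical. It relies on the standard ex-Gaussian moments: mean = μ + τ, variance = σ² + τ², third central moment = 2τ³.

```python
import random


def sample_ex_gaussian(mu, sigma, tau, n, rng):
    """Draw n ex-Gaussian samples: Normal(mu, sigma) plus Exponential(mean tau)."""
    return [rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau) for _ in range(n)]


def fit_ex_gaussian_moments(rts):
    """Method-of-moments estimates (mu, sigma, tau) from a list of RTs."""
    n = len(rts)
    mean = sum(rts) / n
    var = sum((x - mean) ** 2 for x in rts) / n
    m3 = sum((x - mean) ** 3 for x in rts) / n
    # Third central moment of an ex-Gaussian is 2 * tau^3 (the tail term).
    tau = (max(m3, 0.0) / 2.0) ** (1.0 / 3.0)
    # Clamp at zero in case sampling noise makes var - tau^2 negative.
    sigma = max(var - tau ** 2, 0.0) ** 0.5
    mu = mean - tau
    return mu, sigma, tau


rng = random.Random(0)  # fixed seed so the illustration is repeatable
rts = sample_ex_gaussian(mu=400.0, sigma=40.0, tau=100.0, n=100_000, rng=rng)
mu_hat, sigma_hat, tau_hat = fit_ex_gaussian_moments(rts)
```

    A larger τ stretches the slow tail of the distribution without moving its leading edge much, which is why the abstract treats the tail and the leading edge as separate sources of evidence.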

  12. Salam Memorial

    CERN Document Server

    Rubbia, Carlo

    1997-01-01

    by T.W.B. KIBBLE / Blackett Laboratory, Imperial College, London. Recollections of Abdus Salam at Imperial College I shall give a personal account of Professor Salam's life and work from the perspective of a colleague at Imperial College, concentrating particularly but not exclusively on the period leading up to the discovery of the electro-weak theory. If necessary I could perhaps give more detail, but only once I have given more thought to what ground I shall cover. by Sheldon Lee GLASHOW / Harvard University, Cambridge, MA, USA. Memories of Abdus Salam. My interactions with Abdus Salam, weak as they have been, extended over five decades. I regret that we never once collaborated in print or by correspondence. I visited Abdus only twice in London and twice again in Trieste, and met him at the occasional conference or summer school. Our face-to-face encounters could be counted on one's fingers and toes, but we became the best of friends. Others will discuss Abdus as an inspiring teacher, as a great scientist,...

  13. Distributed Sensor Networks

    Science.gov (United States)

    1979-09-30

    University, Pittsburgh, Pennsylvania (1976). 14. R. L. Kirby, "ULISP for PDP-11s with Memory Management," Report MCS-76-23763, University of Maryland...teletype or graphics output. The recording process must also monitor its own command queue and acknowledge commands sent to it by the user interfa...kernel. By a network kernel we mean a multicomputer distributed operating system kernel that includes processor schedulers, "core" memory managers, and

  14. Organizational memory: from expectations memory to procedural memory

    NARCIS (Netherlands)

    Ebbers, J.J.; Wijnberg, N.M.

    2009-01-01

    Organizational memory is not just the stock of knowledge about how to do things, but also of expectations of organizational members vis-à-vis each other and the organization as a whole. The central argument of this paper is that this second type of organizational memory -organizational expectations

  15. Visual Memories Bypass Normalization.

    Science.gov (United States)

    Bloem, Ilona M; Watanabe, Yurika L; Kibbe, Melissa M; Ling, Sam

    2018-05-01

    How distinct are visual memory representations from visual perception? Although evidence suggests that briefly remembered stimuli are represented within early visual cortices, the degree to which these memory traces resemble true visual representations remains something of a mystery. Here, we tested whether both visual memory and perception succumb to a seemingly ubiquitous neural computation: normalization. Observers were asked to remember the contrast of visual stimuli, which were pitted against each other to promote normalization either in perception or in visual memory. Our results revealed robust normalization between visual representations in perception, yet no signature of normalization occurring between working memory stores, neither between representations in memory nor between memory representations and visual inputs. These results provide unique insight into the nature of visual memory representations, illustrating that visual memory representations follow a different set of computational rules, bypassing normalization, a canonical visual computation.

  16. Stochastic memory: getting memory out of noise

    Science.gov (United States)

    Stotland, Alexander; di Ventra, Massimiliano

    2011-03-01

    Memory circuit elements, namely memristors, memcapacitors and meminductors, can store information without the need of a power source. These systems are generally defined in terms of deterministic equations of motion for the state variables that are responsible for memory. However, in real systems noise sources can never be eliminated completely. One would then expect noise to be detrimental for memory. Here, we show that under specific conditions on the noise intensity memory can actually be enhanced. We illustrate this phenomenon using a physical model of a memristor in which the addition of white noise into the state variable equation improves the memory and helps the operation of the system. We discuss under which conditions this effect can be realized experimentally, discuss its implications on existing memory systems discussed in the literature, and also analyze the effects of colored noise. Work supported in part by NSF.

  17. Detailed Sensory Memory, Sloppy Working Memory

    OpenAIRE

    Sligte, Ilja G.; Vandenbroucke, Annelinde R. E.; Scholte, H. Steven; Lamme, Victor A. F.

    2010-01-01

    Visual short-term memory (VSTM) enables us to actively maintain information in mind for a brief period of time after stimulus disappearance. According to recent studies, VSTM consists of three stages - iconic memory, fragile VSTM, and visual working memory - with increasingly stricter capacity limits and progressively longer lifetimes. Still, the resolution (or amount of visual detail) of each VSTM stage has remained unexplored and we test this in the present study. We presented people with a...

  18. Hippocampal functional connectivity and episodic memory in early childhood.

    Science.gov (United States)

    Riggins, Tracy; Geng, Fengji; Blankenship, Sarah L; Redcay, Elizabeth

    2016-06-01

    Episodic memory relies on a distributed network of brain regions, with the hippocampus playing a critical and irreplaceable role. Few studies have examined how changes in this network contribute to episodic memory development early in life. The present study addressed this gap by examining relations between hippocampal functional connectivity and episodic memory in 4- and 6-year-old children (n=40). Results revealed similar hippocampal functional connectivity between age groups, which included lateral temporal regions, precuneus, and multiple parietal and prefrontal regions, and functional specialization along the longitudinal axis. Despite these similarities, developmental differences were also observed. Specifically, 3 (of 4) regions within the hippocampal memory network were positively associated with episodic memory in 6-year-old children, but negatively associated with episodic memory in 4-year-old children. In contrast, all 3 regions outside the hippocampal memory network were negatively associated with episodic memory in older children, but positively associated with episodic memory in younger children. These interactions are interpreted within an interactive specialization framework and suggest the hippocampus becomes functionally integrated with cortical regions that are part of the hippocampal memory network in adults and functionally segregated from regions unrelated to memory in adults, both of which are associated with age-related improvements in episodic memory ability. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.

  19. A dual-trace model for visual sensory memory.

    Science.gov (United States)

    Cappiello, Marcus; Zhang, Weiwei

    2016-11-01

    Visual sensory memory refers to a transient memory lingering briefly after the stimulus offset. Although previous literature suggests that visual sensory memory is supported by a fine-grained trace for continuous representation and a coarse-grained trace of categorical information, simultaneous separation and assessment of these traces can be difficult without a quantitative model. The present study used a continuous estimation procedure to test a novel mathematical model of the dual-trace hypothesis of visual sensory memory, according to which visual sensory memory could be modeled as a mixture of 2 von Mises (2VM) distributions differing in standard deviation. When visual sensory memory and working memory (WM) for colors were distinguished using different experimental manipulations in the first 3 experiments, the 2VM model outperformed Zhang and Luck's (2008) standard mixture model (SM), which represents a mixture of a single memory trace and random guesses, even though SM outperformed 2VM for WM. Experiment 4 generalized 2VM's advantages of fitting visual sensory memory data over SM from color to orientation. Furthermore, a single trace model and 4 other alternative models were ruled out, suggesting the necessity and sufficiency of dual traces for visual sensory memory. Together these results support the dual-trace model of visual sensory memory and provide a preliminary inquiry into the nature of information loss from visual sensory memory to WM. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
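
    The two competing models can be written down as densities over the response error θ (in radians). The sketch below is an illustration under stated assumptions, not the authors' code: both von Mises components are assumed to share the mean `mu`, spread is expressed through the concentration parameter κ rather than a standard deviation, and all function and parameter names are hypothetical.

```python
import math


def bessel_i0(x, terms=60):
    """Modified Bessel function I0(x) via its power series (fine for moderate x)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total


def von_mises_pdf(theta, mu, kappa):
    """Von Mises density on the circle with mean mu and concentration kappa."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))


def two_vm_pdf(theta, mu, kappa_fine, kappa_coarse, w_fine):
    """2VM: mixture of a fine-grained and a coarse-grained von Mises trace."""
    return (w_fine * von_mises_pdf(theta, mu, kappa_fine)
            + (1.0 - w_fine) * von_mises_pdf(theta, mu, kappa_coarse))


def sm_pdf(theta, mu, kappa, p_mem):
    """SM: one von Mises memory trace mixed with uniform random guessing."""
    return p_mem * von_mises_pdf(theta, mu, kappa) + (1.0 - p_mem) / (2.0 * math.pi)


def log_likelihood(errors, pdf, *params):
    """Sum of log densities, the quantity a model comparison would build on."""
    return sum(math.log(pdf(e, *params)) for e in errors)


# Hypothetical response errors (radians) and parameter values, for illustration.
errors = [-0.4, -0.1, 0.0, 0.05, 0.2, 0.9, -1.3]
ll_2vm = log_likelihood(errors, two_vm_pdf, 0.0, 20.0, 2.0, 0.6)
ll_sm = log_likelihood(errors, sm_pdf, 0.0, 8.0, 0.8)
```

    In practice the parameters would be estimated (for example by maximum likelihood) and the models compared with a criterion that penalizes the extra parameter of 2VM; the fixed values above only show how the two densities differ in form.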

  20. [Neuroscience and collective memory: memory schemas linking brain, societies and cultures].

    Science.gov (United States)

    Legrand, Nicolas; Gagnepain, Pierre; Peschanski, Denis; Eustache, Francis

    2015-01-01

    During the last two decades, the effect of intersubjective relationships on cognition has been an emerging topic in cognitive neuroscience, leading through a so-called "social turn" to the formation of new domains that integrate society and culture into this research area. Such inquiry has recently been extended to collective memory studies. Collective memory refers to shared representations that are constitutive of the identity of a group and distributed among all its members, who are connected by a common history. After briefly describing these developments in the study of the human brain and behavior, we review recent research that has brought together cognitive psychology, neuroscience and the social sciences in collective memory studies. Using the reemerging concept of memory schema, we propose a theoretical framework that accounts for the formation of collective memories, with a specific focus on the encoding of historical events. We suggest that (1) although the concept of schema has mainly been used to describe a rather passive framework of knowledge, such a structure may also be involved more actively in the understanding of significant collective events, and (2) although some schema research has restricted itself to the individual level of inquiry, there is a strong coherence between memory and cultural frameworks. Integrating the neural basis and properties of memory schemas into collective memory studies may pave the way toward a better understanding of the reciprocal interaction between individual memories and cultural resources such as media or education. © Société de Biologie, 2016.

  1. Detailed sensory memory, sloppy working memory.

    Science.gov (United States)

    Sligte, Ilja G; Vandenbroucke, Annelinde R E; Scholte, H Steven; Lamme, Victor A F

    2010-01-01

    Visual short-term memory (VSTM) enables us to actively maintain information in mind for a brief period of time after stimulus disappearance. According to recent studies, VSTM consists of three stages - iconic memory, fragile VSTM, and visual working memory - with increasingly stricter capacity limits and progressively longer lifetimes. Still, the resolution (or amount of visual detail) of each VSTM stage has remained unexplored and we test this in the present study. We presented people with a change detection task that measures the capacity of all three forms of VSTM, and we added an identification display after each change trial that required people to identify the "pre-change" object. Accurate change detection plus pre-change identification requires subjects to have a high-resolution representation of the "pre-change" object, whereas change detection or identification only can be based on the hunch that something has changed, without exactly knowing what was presented before. We observed that people maintained 6.1 objects in iconic memory, 4.6 objects in fragile VSTM, and 2.1 objects in visual working memory. Moreover, when people detected the change, they could also identify the pre-change object on 88% of the iconic memory trials, on 71% of the fragile VSTM trials and merely on 53% of the visual working memory trials. This suggests that people maintain many high-resolution representations in iconic memory and fragile VSTM, but only one high-resolution object representation in visual working memory.

  2. Prospective memory, working memory, retrospective memory and self-rated memory performance in persons with intellectual disability

    OpenAIRE

    Levén, Anna; Lyxell, Björn; Andersson, Jan; Danielsson, Henrik; Rönnberg, Jerker

    2008-01-01

    The purpose of the present study was to examine the relationship between prospective memory, working memory, retrospective memory and self-rated memory capacity in adults with and without intellectual disability. Prospective memory was investigated by means of a picture-based task. Working memory was measured as performance on span tasks. Retrospective memory was scored as recall of subject performed tasks. Self-ratings of memory performance were based on the prospective and retrospective mem...

  3. Main Memory DBMS

    NARCIS (Netherlands)

    P.A. Boncz (Peter); L. Liu (Lei); M. Tamer Özsu

    2008-01-01

    A main memory database system is a DBMS that primarily relies on main memory for computer data storage. In contrast, conventional database management systems employ hard-disk-based persistent storage.

  4. Coping with Memory Loss

    Science.gov (United States)

    ... be evaluated by a health professional. What Causes Memory Loss? Anything that affects cognition—the process of ...

  5. Memory and Aging

    Science.gov (United States)

    Losing keys, misplacing a wallet, or forgetting someone’s name are common experiences. But for people nearing or over age 65, such memory lapses can be frightening. They wonder if they ...

  6. Tracing Cultural Memory

    DEFF Research Database (Denmark)

    Wiegand, Frauke Katharina

    We encounter, relate to and make use of our past and that of others in multifarious and increasingly mobile ways. Tourism is one of the main paths for encountering sites of memory. This thesis examines tourists’ creative appropriations of sites of memory – the objects and future memories inspired by their encounters – to address a question that thirty years of ground-breaking research into memory has not yet sufficiently answered: What can we learn about the dynamics of cultural memory by examining mundane accounts of touristic encounters with sites of memory? From Blaavand Beach in Western Denmark ... of memory. They highlight the role of mundane uses of the past and indicate the need for cross-disciplinary research on the visual and on memory...

  7. Odor preference learning and memory modify GluA1 phosphorylation and GluA1 distribution in the neonate rat olfactory bulb: testing the AMPA receptor hypothesis in an appetitive learning model.

    Science.gov (United States)

    Cui, Wen; Darby-King, Andrea; Grimes, Matthew T; Howland, John G; Wang, Yu Tian; McLean, John H; Harley, Carolyn W

    2011-01-01

    An increase in synaptic AMPA receptors is hypothesized to mediate learning and memory. AMPA receptor increases have been reported in aversive learning models, although it is not clear if they are seen with memory maintenance. Here we examine AMPA receptor changes in a cAMP/PKA/CREB-dependent appetitive learning model: odor preference learning in the neonate rat. Rat pups were given a single pairing of peppermint and 2 mg/kg isoproterenol, which produces a 24-h, but not a 48-h, peppermint preference in the 7-d-old rat pup. GluA1 PKA-dependent phosphorylation peaked 10 min after the 10-min training trial and returned to baseline within 90 min. At 24 h, GluA1 subunits did not change overall but were significantly increased in synaptoneurosomes, consistent with increased membrane insertion. Immunohistochemistry revealed a significant increase in GluA1 subunits in olfactory bulb glomeruli, the targets of olfactory nerve axons. Glomerular increases were seen at 3 and 24 h after odor exposure in trained pups, but not in control pups. GluA1 increases were not seen as early as 10 min after training and were no longer observed 48 h after training when odor preference is no longer expressed behaviorally. Thus, the pattern of increased GluA1 membrane expression closely follows the memory timeline. Further, blocking GluA1 insertion using an interference peptide derived from the carboxyl tail of the GluA1 subunit inhibited 24 h odor preference memory providing causative support for our hypothesis. PKA-mediated GluA1 phosphorylation and later GluA1 insertion could, conjointly, provide increased AMPA function to support both short-term and long-term appetitive memory.

  8. System and method for programmable bank selection for banked memory subsystems

    Energy Technology Data Exchange (ETDEWEB)

    Blumrich, Matthias A. (Ridgefield, CT); Chen, Dong (Croton on Hudson, NY); Gara, Alan G. (Mount Kisco, NY); Giampapa, Mark E. (Irvington, NY); Hoenicke, Dirk (Seebruck-Seeon, DE); Ohmacht, Martin (Yorktown Heights, NY); Salapura, Valentina (Chappaqua, NY); Sugavanam, Krishnan (Mahopac, NY)

    2010-09-07

    A programmable memory system and method for enabling one or more processor devices access to shared memory in a computing environment, the shared memory including one or more memory storage structures having addressable locations for storing data. The system comprises: one or more first logic devices associated with a respective one or more processor devices, each first logic device for receiving physical memory address signals and programmable for generating a respective memory storage structure select signal upon receipt of pre-determined address bit values at selected physical memory address bit locations; and, a second logic device responsive to each of the respective select signals for generating an address signal used for selecting a memory storage structure for processor access. The system thus enables each processor device of a computing environment memory storage access distributed across the one or more memory storage structures.
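
    The bank-selection idea in the abstract above can be illustrated with a minimal sketch. It is not the patented design: the function name `bank_select`, the choice of address bits 7 and 8, and the four-bank layout are all assumptions made for illustration. The sketch only shows how programmable bit positions of a physical address can be gathered into an index that picks one of several memory banks.

```python
def bank_select(addr: int, bit_positions: list[int]) -> int:
    """Gather the chosen physical-address bits into a bank index.

    bit_positions is the programmable part: changing it changes which
    address bits decide the bank (hypothetical stand-in for the
    patent's programmable select logic).
    """
    index = 0
    for i, pos in enumerate(bit_positions):
        # Extract bit `pos` of the address and place it at bit `i` of the index.
        index |= ((addr >> pos) & 1) << i
    return index


# Assumed configuration: two select bits, so four banks.
SELECT_BITS = [7, 8]

# Consecutive 128-byte blocks land in different banks.
banks = [bank_select(a, SELECT_BITS) for a in (0x000, 0x080, 0x100, 0x180)]
print(banks)
```

    Choosing different values for `SELECT_BITS` changes how consecutive addresses are spread over the banks, which is the sense in which the selection is programmable in this sketch.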

  9. Contrasting single and multi-component working-memory systems in dual tasking.

    Science.gov (United States)

    Nijboer, Menno; Borst, Jelmer; van Rijn, Hedderik; Taatgen, Niels

    2016-05-01

    Working memory can be a major source of interference in dual tasking. However, there is no consensus on whether this interference is the result of a single working memory bottleneck, or of interactions between different working memory components that together form a complete working-memory system. We report a behavioral and an fMRI dataset in which working memory requirements are manipulated during multitasking. We show that a computational cognitive model that assumes a distributed version of working memory accounts for both behavioral and neuroimaging data better than a model that takes a more centralized approach. The model's working memory consists of an attentional focus, declarative memory, and a subvocalized rehearsal mechanism. Thus, the data and model favor an account where working memory interference in dual tasking is the result of interactions between different resources that together form a working-memory system. Copyright © 2016 Elsevier Inc. All rights reserved.

  10. Emotional Memory Persists Longer than Event Memory

    Science.gov (United States)

    Kuriyama, Kenichi; Soshi, Takahiro; Fujii, Takeshi; Kim, Yoshiharu

    2010-01-01

    The interaction between amygdala-driven and hippocampus-driven activities is expected to explain why emotion enhances episodic memory recognition. However, overwhelming behavioral evidence regarding the emotion-induced enhancement of immediate and delayed episodic memory recognition has not been obtained in humans. We found that the recognition…

  11. Music, memory and emotion

    Science.gov (United States)

    Jäncke, Lutz

    2008-01-01

    Because emotions enhance memory processes and music evokes strong emotions, music could be involved in forming memories, either about pieces of music or about episodes and information associated with particular music. A recent study in BMC Neuroscience has given new insights into the role of emotion in musical memory. PMID:18710596

  12. Attending to auditory memory.

    Science.gov (United States)

    Zimmermann, Jacqueline F; Moscovitch, Morris; Alain, Claude

    2016-06-01

    Attention to memory describes the process of attending to memory traces when the object is no longer present. It has been studied primarily for representations of visual stimuli, with only a few studies examining attention to sound object representations in short-term memory. Here, we review the interplay of attention and auditory memory with an emphasis on 1) attending to auditory memory in the absence of related external stimuli (i.e., reflective attention) and 2) effects of existing memory on guiding attention. Attention to auditory memory is discussed in the context of change deafness, and we argue that failures to detect changes in our auditory environments are most likely the result of a faulty comparison system of incoming and stored information. Also, objects are the primary building blocks of auditory attention, but attention can also be directed to individual features (e.g., pitch). We review short-term and long-term memory guided modulation of attention based on characteristic features, location, and/or semantic properties of auditory objects, and propose that auditory attention to memory pathways emerge after sensory memory. A neural model for auditory attention to memory is developed, which comprises two separate pathways in the parietal cortex, one involved in attention to higher-order features and the other involved in attention to sensory information. This article is part of a Special Issue entitled SI: Auditory working memory. Copyright © 2015 Elsevier B.V. All rights reserved.

  13. Saving Malta's music memory

    OpenAIRE

    Sant, Toni

    2013-01-01

    Maltese music is being lost. Along with it Malta loses its culture, way of life, and memories. Dr Toni Sant is trying to change this trend through the Malta Music Memory Project (M3P) http://www.um.edu.mt/think/saving-maltas-music-memory-2/

  14. Associative Memory Acceptors.

    Science.gov (United States)

    Card, Roger

    The properties of an associative memory are examined in this paper from the viewpoint of automata theory. A device called an associative memory acceptor is studied under real-time operation. The family "L" of languages accepted by real-time associative memory acceptors is shown to properly contain the family of languages accepted by one-tape,…

  15. Generation and Context Memory

    Science.gov (United States)

    Mulligan, Neil W.; Lozito, Jeffrey P.; Rosner, Zachary A.

    2006-01-01

    Generation enhances memory for occurrence but may not enhance other aspects of memory. The present study further delineates the negative generation effect in context memory reported in N. W. Mulligan (2004). First, the negative generation effect occurred for perceptual attributes of the target item (its color and font) but not for extratarget…

  16. Music, memory and emotion.

    Science.gov (United States)

    Jäncke, Lutz

    2008-08-08

    Because emotions enhance memory processes and music evokes strong emotions, music could be involved in forming memories, either about pieces of music or about episodes and information associated with particular music. A recent study in BMC Neuroscience has given new insights into the role of emotion in musical memory.

  17. Design issues for block-oriented reflective memory system

    Energy Technology Data Exchange (ETDEWEB)

    Jovanovic, M; Tomasevic, M; Milutinovic, V

    1996-12-31

    The block-oriented reflective memory (BORM) system represents a modular bus-based system architecture that belongs to the class of distributed shared memory systems. The results of the evaluation study of the BORM implementation strategies and design decisions in regard to the different values of input parameters are presented. 5 refs.

  18. Short-Term Memory in Orthogonal Neural Networks

    Science.gov (United States)

    White, Olivia L.; Lee, Daniel D.; Sompolinsky, Haim

    2004-04-01

    We study the ability of linear recurrent networks obeying discrete time dynamics to store long temporal sequences that are retrievable from the instantaneous state of the network. We calculate this temporal memory capacity for both distributed shift register and random orthogonal connectivity matrices. We show that the memory capacity of these networks scales with system size.
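
    As a rough illustration of the shift-register case mentioned in the abstract above, the following sketch runs a linear recurrent network in discrete time, x(t) = W x(t-1) + v s(t). The network size, the input sequence, and the helper names `shift_register` and `step` are assumptions chosen for illustration, not taken from the paper. With a shift-register W, the state after each step holds the most recent N inputs, so the number of recoverable inputs grows with the number of units.

```python
def shift_register(n: int) -> list[list[float]]:
    """Connectivity matrix that moves the activity of unit j to unit j + 1."""
    return [[1.0 if i == j + 1 else 0.0 for j in range(n)] for i in range(n)]


def step(W, v, x, s):
    """One update of the linear network: x_new = W x + v * s."""
    n = len(x)
    return [sum(W[i][j] * x[j] for j in range(n)) + v[i] * s for i in range(n)]


N = 4                               # assumed network size
W = shift_register(N)
v = [1.0] + [0.0] * (N - 1)         # the input feeds the first unit only
state = [0.0] * N

for s in [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]:   # assumed input sequence
    state = step(W, v, state, s)

# The instantaneous state holds the last N inputs, newest first.
print(state)
```

    The random orthogonal case studied in the paper needs a decoder to read the inputs back out of the state and is not covered by this sketch.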

  19. Short-term memory in orthogonal neural networks

    International Nuclear Information System (INIS)

    White, Olivia L.; Lee, Daniel D.; Sompolinsky, Haim

    2004-01-01

    We study the ability of linear recurrent networks obeying discrete time dynamics to store long temporal sequences that are retrievable from the instantaneous state of the network. We calculate this temporal memory capacity for both distributed shift register and random orthogonal connectivity matrices. We show that the memory capacity of these networks scales with system size

  20. Neural systems for tactual memories.

    Science.gov (United States)

    Bonda, E; Petrides, M; Evans, A

    1996-04-01

    1. The aim of this study was to investigate the neural systems involved in the memory processing of experiences through touch. 2. Regional cerebral blood flow was measured with positron emission tomography by means of the water bolus H₂¹⁵O methodology in human subjects as they performed tasks involving different levels of tactual memory. In one of the experimental tasks, the subjects had to palpate nonsense shapes to match each one to a previously learned set, thus requiring constant reference to long-term memory. The other experimental task involved judgements of the recent recurrence of shapes during the scanning period. A set of three control tasks was used to control for the type of exploratory movements and sensory processing inherent in the two experimental tasks. 3. Comparisons of the distribution of activity between the experimental and the control tasks were carried out by means of the subtraction method. In relation to the control conditions, the two experimental tasks requiring memory resulted in significant changes within the posteroventral insula and the central opercular region. In addition, the task requiring recall from long-term memory yielded changes in the perirhinal cortex. 4. The above findings demonstrated that a ventrally directed parietoinsular pathway, leading to the posteroventral insula and the perirhinal cortex, constitutes a system by which long-lasting representations of tactual experiences are formed. It is proposed that the posteroventral insula is involved in tactual feature analysis, by analogy with the similar role of the inferotemporal cortex in vision, whereas the perirhinal cortex is further involved in the integration of these features into long-lasting representations of somatosensory experiences.