WorldWideScience

Sample records for nonshared memory architectures

  1. Memory architecture

    NARCIS (Netherlands)

    2012-01-01

    A memory architecture is presented. The memory architecture comprises a first memory and a second memory. The first memory has at least a bank with a first width addressable by a single address. The second memory has a plurality of banks of a second width, said banks being addressable by components

  2. Programming parallel architectures - The BLAZE family of languages

    Science.gov (United States)

    Mehrotra, Piyush

    1989-01-01

    This paper gives an overview of the various approaches to programming multiprocessor architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive, since they remove much of the burden of exploiting parallel architectures from the user. This paper also describes recent work in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described.

  3. Programming parallel architectures: The BLAZE family of languages

    Science.gov (United States)

    Mehrotra, Piyush

    1988-01-01

    Programming multiprocessor architectures is a critical research issue. An overview is given of the various approaches to programming these architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive since they remove much of the burden of exploiting parallel architectures from the user. Also described is recent work by the author in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described, as well as the relations of this work to other current language research projects.

  4. Architecture and memory

    Directory of Open Access Journals (Sweden)

    Eneida de Almeida

    2015-12-01

    This paper investigates the links between architectural design and restoration, considering the blurry frontier that distinguishes these actions. The study focuses on the work of two contemporary architects: Lina Bo Bardi (1914-1992) and Aldo Rossi (1931-1997). The analysis of their built production, represented here by one work of each architect (Sesc Pompeia and the Teatro Del Mondo), is based on reflection on the role of memory in architecture: not only the memory in the materiality of buildings and urban fabrics, but also memory as an active instrument in the mental processes adopted by the projects' authors. Drawing on the architects' writings as well as on authors who analyze these interventions, the paper seeks to reconstruct the design development path, recognizing the strategy that reinterprets past experiences in order to overcome the traditional opposition between “old” and “new”, protection and innovation.

  5. Emerging memory technologies design, architecture, and applications

    CERN Document Server

    2014-01-01

    This book explores the design implications of emerging, non-volatile memory (NVM) technologies on future computer memory hierarchy architecture designs. Since NVM technologies combine the speed of SRAM, the density of DRAM, and the non-volatility of Flash memory, they are very attractive as the basis for future universal memories. This book provides a holistic perspective on the topic, covering modeling, design, architecture and applications. The practical information included in this book will enable designers to exploit emerging memory technologies to improve significantly the performance/power/reliability of future, mainstream integrated circuits. • Provides a comprehensive reference on designing modern circuits with emerging, non-volatile memory technologies, such as MRAM and PCRAM; • Explores new design opportunities offered by emerging memory technologies, from a holistic perspective; • Describes topics in technology, modeling, architecture and applications; • Enables circuit designers to ex...

  6. Overlapping genetic and child-specific nonshared environmental influences on listening comprehension, reading motivation, and reading comprehension.

    Science.gov (United States)

    Schenker, Victoria J; Petrill, Stephen A

    2015-01-01

    This study investigated the genetic and environmental influences on observed associations between listening comprehension, reading motivation, and reading comprehension. Univariate and multivariate quantitative genetic models were conducted in a sample of 284 pairs of twins at a mean age of 9.81 years. Genetic and nonshared environmental factors accounted for statistically significant variance in listening and reading comprehension, and nonshared environmental factors accounted for variance in reading motivation. Furthermore, listening comprehension demonstrated unique genetic and nonshared environmental influences but also had overlapping genetic influences with reading comprehension. Reading motivation and reading comprehension each had unique and overlapping nonshared environmental contributions. Therefore, listening comprehension appears to be related to reading primarily due to genetic factors whereas motivation appears to affect reading via child-specific, nonshared environmental effects. Copyright © 2015 Elsevier Inc. All rights reserved.

  7. Overlapping Genetic and Child-Specific Nonshared Environmental Influences on Listening Comprehension, Reading Motivation, and Reading Comprehension

    Science.gov (United States)

    Schenker, Victoria J.; Petrill, Stephen A.

    2015-01-01

    This study investigated the genetic and environmental influences on observed associations between listening comprehension, reading motivation, and reading comprehension. Univariate and multivariate quantitative genetic models were conducted in a sample of 284 pairs of twins at a mean age of 9.81 years. Genetic and nonshared environmental factors accounted for statistically significant variance in listening and reading comprehension, and nonshared environmental factors accounted for variance in reading motivation. Furthermore, listening comprehension demonstrated unique genetic and nonshared environmental influences but also had overlapping genetic influences with reading comprehension. Reading motivation and reading comprehension each had unique and overlapping nonshared environmental contributions. Therefore, listening comprehension appears to be related to reading primarily due to genetic factors whereas motivation appears to affect reading via child-specific, nonshared environmental effects. PMID:26321677

  8. Fast, Accurate Memory Architecture Simulation Technique Using Memory Access Characteristics

    OpenAIRE

    小野, 貴継; 井上, 弘士; 村上, 和彰

    2007-01-01

    This paper proposes a fast and accurate memory architecture simulation technique. To design memory architecture, the first steps commonly involve using trace-driven simulation. However, expanding the design space makes the evaluation time increase. A fast simulation is achieved by a trace size reduction, but it reduces the simulation accuracy. Our approach can reduce the simulation time while maintaining the accuracy of the simulation results. In order to evaluate validity of proposed techniq...
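As background for the trace-driven simulation that this abstract takes as its starting point, the following is a minimal sketch of replaying an address trace through a cache model. It is not the authors' technique (the abstract is truncated before it is described); the function name `simulate_direct_mapped`, the direct-mapped organization, and the parameters `num_lines` and `line_size` are all illustrative assumptions.

```python
def simulate_direct_mapped(trace, num_lines=4, line_size=16):
    """Replay a list of byte addresses through a direct-mapped cache.

    Returns (hits, misses). The cache organization is an assumption made
    for illustration only; it is not taken from the paper.
    """
    tags = [None] * num_lines  # tag currently held by each cache line
    hits = misses = 0
    for addr in trace:
        block = addr // line_size   # memory block containing this address
        index = block % num_lines   # cache line the block maps to
        tag = block // num_lines    # identifies which block occupies the line
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag       # fill the line on a miss
    return hits, misses


# Re-running the same trace for each candidate configuration is what makes
# evaluation time grow with the size of the design space.
trace = [0, 4, 16, 64, 0, 20]
small = simulate_direct_mapped(trace, num_lines=4)
large = simulate_direct_mapped(trace, num_lines=8)
```

Each additional configuration costs one more full pass over the trace, which is why shortening the trace speeds up exploration but can distort the hit and miss counts.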

  9. Virtual Prototyping and Performance Analysis of Two Memory Architectures

    Directory of Open Access Journals (Sweden)

    Huda S. Muhammad

    2009-01-01

    The gap between CPU and memory speed has always been a critical concern that motivated researchers to study and analyze the performance of memory hierarchical architectures. In the early stages of the design cycle, performance evaluation methodologies can be used to leverage exploration at the architectural level and assist in making early design tradeoffs. In this paper, we use simulation platforms developed using the VisualSim tool to compare the performance of two memory architectures, namely, the Direct Connect architecture of the Opteron, and the Shared Bus of the Xeon multicore processors. Key variations exist between the two memory architectures and both design approaches provide rich platforms that call for the early use of virtual system prototyping and simulation techniques to assess performance at an early stage in the design cycle.

  10. Enhanced memory architecture for massively parallel vision chip

    Science.gov (United States)

    Chen, Zhe; Yang, Jie; Liu, Liyuan; Wu, Nanjian

    2015-04-01

    Local memory architecture plays an important role in high-performance massively parallel vision chips. In this paper, we propose an enhanced memory architecture with compact circuit area designed in a full-custom flow. The memory consists of separate master-stage static latches and shared slave-stage dynamic latches. We use split transmission transistors on the input data path to enhance tolerance for charge sharing and to achieve random read/write capabilities. The memory is designed in a 0.18 μm CMOS process. The area overhead of the memory is 16.6 μm²/bit. Simulation results show that the maximum operating frequency reaches 410 MHz and the corresponding peak dynamic power consumption for a 64-bit memory unit is 190 μW under a 1.8 V supply voltage.

  11. Communication and Memory Architecture Design of Application-Specific High-End Multiprocessors

    Directory of Open Access Journals (Sweden)

    Yahya Jan

    2012-01-01

    This paper is devoted to the design of communication and memory architectures of massively parallel hardware multiprocessors necessary for the implementation of highly demanding applications. We demonstrate that, for massively parallel hardware multiprocessors, the traditionally used flat communication architectures and multi-port memories do not scale well, and that the influence of the memory and communication network on both throughput and circuit area dominates that of the processors. To resolve these problems and ensure scalability, we propose designing highly optimized application-specific hierarchical and/or partitioned communication and memory architectures by exploring and exploiting the regularity and hierarchy of the actual data flows of a given application. Furthermore, we propose data distribution and related data mapping schemes for the shared (global) partitioned memories, with the aim of eliminating memory access conflicts and ensuring that our communication design strategies remain applicable. We incorporated these architecture synthesis strategies into our quality-driven model-based multi-processor design method and the related automated architecture exploration framework. Using this framework, we performed a large series of experiments that demonstrate many important features of the synthesized memory and communication architectures. They also demonstrate that our method and framework can efficiently synthesize well-scalable memory and communication architectures even for high-end multiprocessors. Gains of up to 12 times in performance and 25 times in area can be obtained when using hierarchical communication networks instead of flat networks. However, for high parallelism levels, only the partitioned approach ensures scalability in performance.

  12. FPGA Based Intelligent Co-operative Processor in Memory Architecture

    DEFF Research Database (Denmark)

    Ahmed, Zaki; Sotudeh, Reza; Hussain, Dil Muhammad Akbar

    2011-01-01

    In a continuing effort to improve computer system performance, Processor-In-Memory (PIM) architecture has emerged as an alternative solution. PIM architecture incorporates computational units and control logic directly on the memory to provide immediate access to the data. To exploit the potential benefits of PIM, a concept of Co-operative Intelligent Memory (CIM) was developed by the intelligent system group of University of Hertfordshire, based on the previously developed Co-operative Pseudo Intelligent Memory (CPIM). This paper provides an overview on previous works (CPIM, CIM) and realization...

  13. Temporality and Memory in Architecture: Hagia Sophia

    Directory of Open Access Journals (Sweden)

    Yüksel Burçin Nur

    2017-12-01

    Istanbul, having hosted many civilizations and cultures, has a long and important past. Due to its geopolitical location, the city has been the capital of two civilizations, the Ottoman and Byzantine Empires, which left their traces in the history of the world. Architectural and symbolic monuments built by these civilizations made an impression on all communities, making the city a center of attraction. After every instance of damage caused by wars, civil strife, and natural disasters, maximum effort has been made to restore these symbolic buildings. The attitude of a society toward a piece of art or an architectural construction defined as a historical artifact is shown in the interventions, architectural additions, and restorations made to buildings to keep them alive. As a result of this attitude, it is accepted that buildings are perceived as places of memory and become symbols of the city. The most important symbolic monument of the city, Ayasofya (Hagia Sophia), was founded as the church of the Byzantine emperor in the year 360, then converted into the mosque of the Ottoman sultan, and now serves as one of the best-known museums of Turkey. With the architectural additions requested by Byzantine emperors and Ottoman sultans, restorations, and other functional changes, Hagia Sophia has become a monument witnessing its own changes as well as those of its surroundings while collecting memories. Accordingly, Hagia Sophia can be described as an immortal building. Immortality is a notion outside of time; however, it is also a reflection of the effects of time. Immortality is about resisting time. A construction from the past that appreciates as time passes will also exist in the future, preserving its value. The building has been strengthened by the memory phenomenon formed during its construction, the incidents that the building witnessed in its location, restorations, architectural additions, and its perception as world heritage. The main purpose of this presentation is to show how

  14. Mapping robust parallel multigrid algorithms to scalable memory architectures

    Science.gov (United States)

    Overman, Andrea; Vanrosendale, John

    1993-01-01

    The convergence rate of standard multigrid algorithms degenerates on problems with stretched grids or anisotropic operators. The usual cure for this is the use of line or plane relaxation. However, multigrid algorithms based on line and plane relaxation have limited and awkward parallelism and are quite difficult to map effectively to highly parallel architectures. Newer multigrid algorithms that overcome anisotropy through the use of multiple coarse grids rather than relaxation are better suited to massively parallel architectures because they require only simple point-relaxation smoothers. In this paper, we look at the parallel implementation of a V-cycle multiple semicoarsened grid (MSG) algorithm on distributed-memory architectures such as the Intel iPSC/860 and Paragon computers. The MSG algorithms provide two levels of parallelism: parallelism within the relaxation or interpolation on each grid and across the grids on each multigrid level. Both levels of parallelism must be exploited to map these algorithms effectively to parallel architectures. This paper describes a mapping of an MSG algorithm to distributed-memory architectures that demonstrates how both levels of parallelism can be exploited. The result is a robust and effective multigrid algorithm for distributed-memory machines.

  15. Deaf Children Building Narrative Texts. Effect of Adult-Shared vs. Non-Shared Perception of a Picture Story

    Directory of Open Access Journals (Sweden)

    Tarwacka-Odolczyk Agata

    2014-08-01

    This paper discusses the communicative competence of deaf children. It illustrates the process in which such children build narrative texts in interaction with a deaf teacher, and presents the diversity of this process due to the shared vs. non-shared perception of a picture - the source of the topic. Detailed analyses focus on the formal and semantic aspect of the stories, including the length of the text in sign language, the content selected, information categories, and types of answers to the teacher’s questions. This text is our contribution in memory of Professor Grace Wales Shugar, whose idea of dual agentivity of child-adult interaction inspired the research presented here.

  16. Switch/router architectures shared-bus and shared-memory based systems

    CERN Document Server

    Aweya, James

    2018-01-01

    A practicing engineer's inclusive review of communication systems based on shared-bus and shared-memory switch/router architectures. This book delves into the inner workings of router and switch design in a comprehensive manner that is accessible to a broad audience. It begins by describing the role of switch/routers in a network, then moves on to the functional composition of a switch/router. A comparison of centralized versus distributed design of the architecture is also presented. The author discusses use of bus versus shared-memory for communication within a design, and also covers Quality of Service (QoS) mechanisms and configuration tools. Written in a simple style and language to allow readers to easily understand and appreciate the material presented, Switch/Router Architectures: Shared-Bus and Shared-Memory Based Systems discusses the design of multilayer switches—starting with the basic concepts and on to the basic architectures. It describes the evolution of multilayer switch designs and highli...

  17. A Memory-based Robot Architecture based on Contextual Information

    OpenAIRE

    Pratama, Ferdian; Mastrogiovanni, Fulvio; Chong, Nak Young

    2014-01-01

    In this paper, we present a preliminary conceptual design for a robot long-term memory architecture based on the notion of context. Contextual information is used to organize the data flow between Working Memory (including Perceptual Memory) and Long-Term Memory components. We discuss the major influence of the notion of context within Episodic Memory on Semantic and Procedural Memory, respectively. We address how the occurrence of specific object-related events in time impacts on the semanti...

  18. Resistive content addressable memory based in-memory computation architecture

    KAUST Repository

    Salama, Khaled N.; Zidan, Mohammed A.; Kurdahi, Fadi; Eltawil, Ahmed M.

    2016-01-01

    Various examples are provided related to resistive content addressable memory (RCAM) based in-memory computation architectures. In one example, a system includes a content addressable memory (CAM) including an array of cells having a memristor based crossbar and an interconnection switch matrix having a gateless memristor array, which is coupled to an output of the CAM. In another example, a method includes comparing activated bit values stored in a key register with corresponding bit values in a row of a CAM, setting a tag bit value to indicate that the activated bit values match the corresponding bit values, and writing masked key bit values to corresponding bit locations in the row of the CAM based on the tag bit value.
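The compare, tag, and masked-write steps named in this abstract can be sketched in software as follows. This is a toy model under stated assumptions, not the patented memristor circuit: the class name `ToyCAM`, its methods, and the representation of rows and masks as lists of 0/1 integers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyCAM:
    """Toy software model of a content addressable memory (illustrative only)."""
    rows: List[List[int]]                          # each row is a list of 0/1 bits
    tags: List[int] = field(default_factory=list)  # one tag bit per row

    def compare(self, key: List[int], compare_mask: List[int]) -> List[int]:
        """Set tag[i] to 1 when row i equals the key on every activated bit.

        A bit position is "activated" when compare_mask holds 1 there.
        """
        self.tags = [
            int(all(r == k for r, k, m in zip(row, key, compare_mask) if m))
            for row in self.rows
        ]
        return self.tags

    def write(self, key: List[int], write_mask: List[int]) -> None:
        """Copy the masked key bits into every row whose tag bit is set."""
        for row, tag in zip(self.rows, self.tags):
            if tag:
                for i, m in enumerate(write_mask):
                    if m:
                        row[i] = key[i]


cam = ToyCAM(rows=[[1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1]])
# Tag every row whose first bit is 1, then set the last bit of the tagged rows.
cam.compare(key=[1, 0, 0, 0], compare_mask=[1, 0, 0, 0])
cam.write(key=[0, 0, 0, 1], write_mask=[0, 0, 0, 1])
```

Because the comparison and the write apply to all rows at once in hardware, the two loops over `rows` stand in for what would be a single parallel step in an actual CAM.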

  19. Resistive content addressable memory based in-memory computation architecture

    KAUST Repository

    Salama, Khaled N.

    2016-12-08

    Various examples are provided related to resistive content addressable memory (RCAM) based in-memory computation architectures. In one example, a system includes a content addressable memory (CAM) including an array of cells having a memristor based crossbar and an interconnection switch matrix having a gateless memristor array, which is coupled to an output of the CAM. In another example, a method includes comparing activated bit values stored in a key register with corresponding bit values in a row of a CAM, setting a tag bit value to indicate that the activated bit values match the corresponding bit values, and writing masked key bit values to corresponding bit locations in the row of the CAM based on the tag bit value.

  20. A memory-array architecture for computer vision

    Energy Technology Data Exchange (ETDEWEB)

    Balsara, P.T.

    1989-01-01

    With the fast advances in the area of computer vision and robotics there is a growing need for machines that can understand images at a very high speed. A conventional von Neumann computer is not suited for this purpose because it takes a tremendous amount of time to solve most typical image processing problems. Exploiting the inherent parallelism present in various vision tasks can significantly reduce the processing time. Fortunately, parallelism is increasingly affordable as hardware gets cheaper. Thus it is now imperative to study computer vision in a parallel processing framework. One should first design a computational structure that is well suited for a wide range of vision tasks and then develop parallel algorithms that can run efficiently on this structure. Recent advances in VLSI technology have led to several proposals for parallel architectures for computer vision. This thesis demonstrates that a memory array architecture with efficient local and global communication capabilities can be used for high speed execution of a wide range of computer vision tasks. This architecture, called the Access Constrained Memory Array Architecture (ACMAA), is efficient for VLSI implementation because of its modular structure, simple interconnect and limited global control. Several parallel vision algorithms have been designed for this architecture. The choice of vision problems demonstrates the versatility of ACMAA for a wide range of vision tasks. These algorithms were simulated on a high level ACMAA simulator running on the Intel iPSC/2 hypercube, a parallel architecture. The results of this simulation are compared with those of sequential algorithms running on a single hypercube node. Details of the ACMAA processor architecture are also presented.

  1. Image processing methods and architectures in diagnostic pathology.

    Directory of Open Access Journals (Sweden)

    Oscar Déniz

    2010-05-01

    Grid technology has enabled the clustering of, and efficient and secure access to and interaction among, a wide variety of geographically distributed resources such as supercomputers, storage systems, data sources, instruments, and special devices and services. Its main applications include large-scale computational and data-intensive problems in science and engineering. General grid structures and methodologies for both software and hardware in image analysis for virtual tissue-based diagnosis are considered in this paper. These methods focus on user-level middleware. The article describes the distributed programming system developed by the authors for virtual slide analysis in diagnostic pathology. The system supports different image analysis operations commonly done in anatomical pathology, and it takes into account security aspects and specialized infrastructures with high-level services designed to meet application requirements. Grids are likely to have a deep impact on health-related applications, and therefore they seem suitable for tissue-based diagnosis too. The implemented system is a joint application that mixes Web and Grid Service Architecture around a distributed architecture for image processing. It has proven to be a successful solution for analyzing a large and heterogeneous group of histological images on an architecture of massively parallel processors using message passing and non-shared memory.

  2. Cognitive Architecture for Direction of Attention Founded on Subliminal Memory Searches, Pseudorandom and Nonstop

    OpenAIRE

    Burger, J. R.

    2008-01-01

    By way of explaining how a brain works logically, human associative memory is modeled with logical and memory neurons, corresponding to standard digital circuits. The resulting cognitive architecture incorporates basic psychological elements such as short term and long term memory. Novel to the architecture are memory searches using cues chosen pseudorandomly from short term memory. Recalls alternated with sensory images, many tens per second, are analyzed subliminally as an ongoing process, ...

  3. Novel memory architecture for video signal processor

    Science.gov (United States)

    Hung, Jen-Sheng; Lin, Chia-Hsing; Jen, Chein-Wei

    1993-11-01

    An on-chip memory architecture for a video signal processor (VSP) is proposed. This memory structure is a two-level design that matches the different data localities in video applications. The upper level, Memory A, provides enough storage capacity to reduce the impact of the limited chip I/O bandwidth, and the lower level, Memory B, provides enough data parallelism and flexibility to meet the requirements of multiple reconfigurable pipeline function units in a single VSP chip. The required memory size is determined by memory usage analysis of video algorithms and by the number of function units. Both levels of memory adopt a dual-port memory scheme to sustain simultaneous read and write operations. In particular, Memory B uses multiple one-read-one-write memory banks to emulate a true multiport memory. Therefore, one can change the configuration of Memory B into several sets of memories with variable read/write ports by adjusting the bus switches. The numbers of read ports and write ports in the proposed memory can then meet the requirements of the data flow patterns in different video coding algorithms. We have finished a prototype memory design using 1.2-micrometer SPDM SRAM technology and will fabricate it through TSMC in Taiwan.

  4. Architectural design and simulation of a virtual memory

    Science.gov (United States)

    Kwok, G.; Chu, Y.

    1971-01-01

    Virtual memory is an imaginary main memory with a very large capacity which the programmer has at his disposal. It greatly contributes to the solution of the dynamic storage allocation problem. The architectural design of a virtual memory is presented which implements by hardware the idea of queuing and scheduling the page requests to a paging drum in such a way that the access of the paging drum is increased many times. With the design, an increase of up to 16 times in page transfer rate is achievable when the virtual memory is heavily loaded. This in turn makes feasible a great increase in the system throughput.
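To illustrate why queuing and scheduling page requests can raise the transfer rate of a rotating drum, the sketch below compares first-in-first-out service with serving whichever requested sector arrives under the head next. It is a simplified model under stated assumptions, not the hardware design from the paper: the function names, the one-page-per-sector-time transfer model, and the assumption that all requests are already queued at time zero are illustrative.

```python
from collections import Counter


def fifo_time(requests, sectors, start=0):
    """Sector-times needed to serve requests strictly in arrival order."""
    pos, elapsed = start, 0
    for s in requests:
        wait = (s - pos) % sectors  # rotate until sector s is under the head
        elapsed += wait + 1         # plus one sector-time to transfer the page
        pos = (s + 1) % sectors
    return elapsed


def sector_queue_time(requests, sectors, start=0):
    """Sector-times needed when the sector passing under the head is served first."""
    pending = Counter(requests)     # number of queued requests per sector
    remaining = len(requests)
    pos, elapsed = start, 0
    while remaining > 0:
        if pending[pos] > 0:
            pending[pos] -= 1       # transfer one page from this sector
            remaining -= 1
        elapsed += 1                # the drum advances by one sector
        pos = (pos + 1) % sectors
    return elapsed


# Four requests that arrive in the reverse of rotational order on an 8-sector drum.
reqs = [3, 2, 1, 0]
slow = fifo_time(reqs, sectors=8)
fast = sector_queue_time(reqs, sectors=8)
```

In this toy case arrival-order service pays nearly a full rotation per request, while sector queuing completes all four transfers within a single pass; the gap widens as more requests are queued, which is consistent with the abstract's remark that the gain appears when the virtual memory is heavily loaded.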

  5. A processing architecture for associative short-term memory in electronic noses

    Science.gov (United States)

    Pioggia, G.; Ferro, M.; Di Francesco, F.; DeRossi, D.

    2006-11-01

    Electronic nose (e-nose) architectures usually consist of several modules that process various tasks such as control, data acquisition, data filtering, feature selection and pattern analysis. Heterogeneous techniques derived from chemometrics, neural networks, and fuzzy rules used to implement such tasks may lead to issues concerning module interconnection and cooperation. Moreover, a new learning phase is mandatory once new measurements have been added to the dataset, thus causing changes in the previously derived model. Consequently, if a loss in the previous learning occurs (catastrophic interference), real-time applications of e-noses are limited. To overcome these problems this paper presents an architecture for dynamic and efficient management of multi-transducer data processing techniques and for saving an associative short-term memory of the previously learned model. The architecture implements an artificial model of a hippocampus-based working memory, enabling the system to be ready for real-time applications. Starting from the base models available in the architecture core, dedicated models for neurons, maps and connections were tailored to an artificial olfactory system devoted to analysing olive oil. In order to verify the ability of the processing architecture in associative and short-term memory, a paired-associate learning test was applied. The avoidance of catastrophic interference was observed.

  6. A Layered Active Memory Architecture for Cognitive Vision Systems

    OpenAIRE

    Kolonias, Ilias; Christmas, William; Kittler, Josef

    2007-01-01

    Recognising actions and objects from video material has attracted growing research attention and given rise to important applications. However, injecting cognitive capabilities into computer vision systems requires an architecture more elaborate than the traditional signal processing paradigm for information processing. Inspired by biological cognitive systems, we present a memory architecture enabling cognitive processes (such as selecting the processes required for scene understanding, laye...

  7. Linguistic representations and memory architectures: The devil is in the details.

    Science.gov (United States)

    Chacón, Dustin Alfonso; Momma, Shota; Phillips, Colin

    2016-01-01

    Attempts to explain linguistic phenomena as consequences of memory constraints require detailed specification of linguistic representations and memory architectures alike. We discuss examples of supposed locality biases in language comprehension and production, and their link to memory constraints. Findings do not generally favor Christiansen & Chater's (C&C's) approach. We discuss connections to debates that stretch back to the nineteenth century.

  8. A Compute Capable SSD Architecture for Next-Generation Non-volatile Memories

    Energy Technology Data Exchange (ETDEWEB)

    De, Arup [Univ. of California, San Diego, CA (United States)]

    2014-01-01

    Existing storage technologies (e.g., disks and flash) are failing to cope with the processor and main memory speed and are limiting the overall performance of many large scale I/O or data-intensive applications. Emerging fast byte-addressable non-volatile memory (NVM) technologies, such as phase-change memory (PCM), spin-transfer torque memory (STTM) and memristor are very promising and are approaching DRAM-like performance with lower power consumption and higher density as process technology scales. These new memories are narrowing down the performance gap between the storage and the main memory and are putting forward challenging problems on existing SSD architecture, I/O interface (e.g., SATA, PCIe) and software. This dissertation addresses those challenges and presents a novel SSD architecture called XSSD. XSSD offloads computation in storage to exploit fast NVMs and reduce the redundant data traffic across the I/O bus. XSSD offers a flexible RPC-based programming framework that developers can use for application development on SSD without dealing with the complication of the underlying architecture and communication management. We have built a prototype of XSSD on the BEE3 FPGA prototyping system. We implement various data-intensive applications and achieve speedup and energy efficiency of 1.5-8.9 and 1.7-10.27 respectively. This dissertation also compares XSSD with previous work on intelligent storage and intelligent memory. The existing ecosystem and these new enabling technologies make this system more viable than earlier ones.

  9. A Core Knowledge Architecture of Visual Working Memory

    Science.gov (United States)

    Wood, Justin N.

    2011-01-01

    Visual working memory (VWM) is widely thought to contain specialized buffers for retaining spatial and object information: a "spatial-object architecture." However, studies of adults, infants, and nonhuman animals show that visual cognition builds on core knowledge systems that retain more specialized representations: (1) spatiotemporal…

  10. KISS-Tree: Smart Latch-Free In-Memory Indexing on Modern Architectures

    OpenAIRE

    Kissinger, Thomas; Schlegel, Benjamin; Habich, Dirk; Lehner, Wolfgang

    2012-01-01

    Growing main memory capacities and an increasing number of hardware threads in modern server systems led to fundamental changes in database architectures. Most importantly, query processing is nowadays performed on data that is often completely stored in main memory. Despite high main memory scan performance, index structures are still important components, but they have to be designed from scratch to cope with the specific characteristics of main memory and to exploit the high degree of...

  11. A learnable parallel processing architecture towards unity of memory and computing.

    Science.gov (United States)

    Li, H; Gao, B; Chen, Z; Zhao, Y; Huang, P; Ye, H; Liu, L; Liu, X; Kang, J

    2015-08-14

    Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for the data-driven applications such as big data and Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named "iMemComp", where memory and logic are unified with single-type devices. Leveraging nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped "iMemComp" with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on "iMemComp" can improve the speed by 76.8% and the power dissipation by 60.3%, together with a 700 times aggressive reduction in the circuit area.

  12. A learnable parallel processing architecture towards unity of memory and computing

    Science.gov (United States)

    Li, H.; Gao, B.; Chen, Z.; Zhao, Y.; Huang, P.; Ye, H.; Liu, L.; Liu, X.; Kang, J.

    2015-08-01

    Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for the data-driven applications such as big data and Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named “iMemComp”, where memory and logic are unified with single-type devices. Leveraging nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped “iMemComp” with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on “iMemComp” can improve the speed by 76.8% and the power dissipation by 60.3%, together with a 700 times aggressive reduction in the circuit area.

  13. Architectures for a quantum random access memory

    Science.gov (United States)

    Giovannetti, Vittorio; Lloyd, Seth; Maccone, Lorenzo

    2008-11-01

    A random access memory, or RAM, is a device that, when interrogated, returns the content of a memory location in a memory array. A quantum RAM, or qRAM, allows one to access superpositions of memory sites, which may contain either quantum or classical information. RAMs and qRAMs with n-bit addresses can access 2^n memory sites. Any design for a RAM or qRAM then requires O(2^n) two-bit logic gates. At first sight this requirement might seem to make large scale quantum versions of such devices impractical, due to the difficulty of constructing and operating coherent devices with large numbers of quantum logic gates. Here we analyze two different RAM architectures (the conventional fanout and the “bucket brigade”) and propose some proof-of-principle implementations, which show that, in principle, only O(n) two-qubit physical interactions need take place during each qRAM call. That is, although a qRAM needs O(2^n) quantum logic gates, only O(n) need to be activated during a memory call. The resulting decrease in resources could give rise to the construction of large qRAMs that could operate without the need for extensive quantum error correction.
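
    The gap between O(2^n) gates in total and O(n) gates activated per call can be illustrated with a purely classical analogy: routing one address down a binary tree of switches. The sketch below is not a quantum simulation and is not taken from the paper; the function name and the tree numbering are assumptions made for illustration only.

```python
def route_address(address_bits):
    """Route one address down a binary tree of switches.

    Classical analogy only: each internal node is a switch, and only the
    switches on the root-to-leaf path are touched during a call.
    Returns (visited_switches, leaf_index).
    """
    node = 0          # index of the current switch within its level
    visited = []      # (level, index) of every switch activated
    for level, bit in enumerate(address_bits):
        visited.append((level, node))
        node = 2 * node + bit   # bit 0 goes left, bit 1 goes right
    return visited, node


n = 4
visited, leaf = route_address([1, 0, 1, 1])
total_switches = 2 ** n - 1     # internal nodes of a full binary tree with 2^n leaves
print(f"switches in the tree: {total_switches}, activated on this call: {len(visited)}")
print(f"memory site reached: {leaf}")
```

    For n address bits the tree holds 2^n - 1 switches, yet a single call touches exactly n of them, which mirrors the scaling argument made in the abstract.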

  14. Concurrent Operations of O2-Tree on Shared Memory Multicore Architectures

    OpenAIRE

    Daniel Ohene-Kwofie; E. J. Otoo; Gideon Nimako

    2014-01-01

    Modern computer architectures provide high performance computing capability by having multiple CPU cores. Such systems are also typically associated with very large main-memory capacities, thereby allowing them to be used for fast processing of in-memory database applications. However, most of the concurrency control mechanisms associated with the index structures of these memory resident databases do not scale well under high transaction rates. This paper presents the O2-Tree, a fast main me...

  15. Migration of vectorized iterative solvers to distributed memory architectures

    Energy Technology Data Exchange (ETDEWEB)

    Pommerell, C. [AT&T Bell Labs., Murray Hill, NJ (United States)]; Ruehl, R. [CSCS-ETH, Manno (Switzerland)]

    1994-12-31

    Both necessity and opportunity motivate the use of high-performance computers for iterative linear solvers. Necessity results from the size of the problems being solved; smaller problems are often better handled by direct methods. Opportunity arises from the formulation of the iterative methods in terms of simple linear algebra operations, even if this "natural" parallelism is not easy to exploit in irregularly structured sparse matrices and with good preconditioners. As a result, high-performance implementations of iterative solvers have attracted a lot of interest in recent years. Most efforts are geared to vectorize or parallelize the dominating operation (structured or unstructured sparse matrix-vector multiplication), or to increase locality and parallelism by reformulating the algorithm (reducing global synchronization in inner products or local data exchange in preconditioners). Target architectures for iterative solvers currently include mostly vector supercomputers and architectures with one or few optimized (e.g., super-scalar and/or super-pipelined RISC) processors and hierarchical memory systems. More recently, parallel computers with physically distributed memory and a better price/performance ratio have been offered by vendors as a very interesting alternative to vector supercomputers. However, programming comfort on such distributed memory parallel processors (DMPPs) still lags behind. Here the authors are concerned with iterative solvers and their changing computing environment. In particular, they are considering migration from traditional vector supercomputers to DMPPs. Application requirements force one to use flexible and portable libraries. They want to extend the portability of iterative solvers rather than reimplementing everything for each new machine, or even for each new architecture.

  16. Optical RAM-enabled cache memory and optical routing for chip multiprocessors: technologies and architectures

    Science.gov (United States)

    Pleros, Nikos; Maniotis, Pavlos; Alexoudi, Theonitsa; Fitsios, Dimitris; Vagionas, Christos; Papaioannou, Sotiris; Vyrsokinos, K.; Kanellos, George T.

    2014-03-01

    The processor-memory performance gap, commonly referred to as the "Memory Wall" problem, is due to the speed mismatch between processor and electronic RAM clock frequencies, forcing current Chip Multiprocessor (CMP) configurations to consume more than 50% of the chip real-estate for caching purposes. In this article, we present our recent work spanning from Si-based integrated optical RAM cell architectures up to complete optical cache memory architectures for Chip Multiprocessor configurations. Moreover, we discuss e/o router subsystems with up to Tb/s routing capacity for cache interconnection purposes within CMP configurations, currently pursued within the FP7 PhoxTrot project.

  17. Memory controllers for mixed-time-criticality systems architectures, methodologies and trade-offs

    CERN Document Server

    Goossens, Sven; Akesson, Benny; Goossens, Kees

    2016-01-01

    This book discusses the design and performance analysis of SDRAM controllers that cater to both real-time and best-effort applications, i.e. mixed-time-criticality memory controllers. The authors describe the state of the art, and then focus on an architecture template for reconfigurable memory controllers that addresses effectively the quickly evolving set of SDRAM standards, in terms of worst-case timing and power analysis, as well as implementation. A prototype implementation of the controller in SystemC and synthesizable VHDL for an FPGA development board are used as a proof of concept of the architecture template.

  18. Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures

    Science.gov (United States)

    2017-10-04

    to the memory architectures of CPUs and GPUs to obtain good performance and result in good memory performance using cache management. These methods ... Accomplishments: The PI and students have developed new methods for path and ray tracing and their ... The efficiency of our method makes it a good candidate for forming hybrid schemes with wave-based models. One possibility is to couple the ray curve

  19. Architecture and the Social Frameworks of Memory: A Postscript to Maurice Halbwachs’ “Collective Memory”

    Directory of Open Access Journals (Sweden)

    Can Bilsel

    2017-06-01

    This paper offers a commentary on Maurice Halbwachs’ writings on “collective memory” in the years between 1925-1945. Architectural and urban spaces figure prominently in the work of the French sociologist since he maintains that memories survive in the longue durée only to the extent they are indexed into architectural places, and mapped into an urban and historical topography. This comes with a caveat: in his pioneering study of “collective memory,” La topographie légendaire des Évangiles en Terre Sainte: etude de mémoire collective, Halbwachs highlights the discrepancy between the archaeological record preserved in material culture—for example ancient ruins and monuments—and the living memory of a religious community. Likewise, in his study of working classes, Halbwachs’ neologism, “collective memory” is defined as a deliberately unstable, and socially constructed category. The provisional and fluid definition that Halbwachs assigned to “collective memory” offers an insight into our present predicament. In the last decades, the ability of architecture, urban design, and architectural conservation in framing and preserving a stable and unified cultural heritage has been profoundly challenged. This paper makes the case for moving away from merely technical inquiries that understand architecture and places as “sites of memory” to a new direction that builds upon Halbwachs’ social frameworks of memory. It is thanks to Halbwachs’ pioneering, if incomplete, work on “collective memory” that we may understand how the emerging and open-ended social formations transform architecture and urban spaces.

  20. Applications of Case Based Organizational Memory Supported by the PAbMM Architecture

    Directory of Open Access Journals (Sweden)

    Martín

    2017-04-01

    With the aim of managing and retrieving organizational knowledge, numerous proposals of models and tools for knowledge management and knowledge representation have arisen in recent years. However, most of them store knowledge in a non-structured or semi-structured way, hindering the semantic and automatic processing of this knowledge. In this paper we present a more detailed case-based organizational memory ontology, which aims at contributing to the design of an organizational memory based on cases, so that it can be used to learn, reason, and solve problems, and to support better decision making as well. The objective of this Organizational Memory is to serve as a base for the organizational knowledge exchange in a processing architecture specialized in measurement and evaluation. In this way, our processing architecture is based on the C-INCAMI framework (Context-Information Need, Concept model, Attribute, Metric and Indicator) for defining the measurement projects. Additionally, the proposed architecture uses a big data repository to make the data available for consumption and to manage the Organizational Memory, which allows a feedback mechanism in relation to online processing. In order to illustrate its utility, two practical cases are explained: a pasture predictor system, using the data of the weather radar (WR) of the Experimental Agricultural Station (EAS) INTA Anguil (La Pampa State, Argentina), and an outpatient monitoring scenario. Future trends and concluding remarks are extended.

  1. A Vertical Organic Transistor Architecture for Fast Nonvolatile Memory.

    Science.gov (United States)

    She, Xiao-Jian; Gustafsson, David; Sirringhaus, Henning

    2017-02-01

    A new device architecture for fast organic transistor memory is developed, based on a vertical organic transistor configuration incorporating high-performance ambipolar conjugated polymers and unipolar small molecules as the transport layers, to achieve reliable and fast programming and erasing of the threshold voltage shift in less than 200 ns.

  2. Architecture and performance of radiation-hardened 64-bit SOS/MNOS memory

    International Nuclear Information System (INIS)

    Kliment, D.C.; Ronen, R.S.; Nielsen, R.L.; Seymour, R.N.; Splinter, M.R.

    1976-01-01

    This paper discusses the circuit architecture and performance of a nonvolatile 64-bit MNOS memory fabricated on silicon on sapphire (SOS). The circuit is a test vehicle designed to demonstrate the feasibility of a high-performance, high-density, radiation-hardened MNOS/SOS memory. The array is organized as 16 words by 4 bits and is fully decoded. It utilizes a two-(MNOS) transistor-per-bit cell and differential sensing scheme and is realized in PMOS static resistor load logic. The circuit was fabricated and tested as both a fast write random access memory (RAM) and an electrically alterable read only memory (EAROM) to demonstrate design and process flexibility. Discrete device parameters such as retention, circuit electrical characteristics, and tolerance to total dose and transient radiation are presented

  3. Memory intensive functional architecture for distributed computer control systems

    International Nuclear Information System (INIS)

    Dimmler, D.G.

    1983-10-01

    A memory-intensive functional architecture for distributed data-acquisition, monitoring, and control systems with large numbers of nodes has been conceptually developed and applied in several large-scale and some smaller systems. This discussion concentrates on: (1) the basic architecture; (2) recent expansions of the architecture which now become feasible in view of the rapidly developing component technologies in microprocessors and functional large-scale integration circuits; and (3) implementation of some key hardware and software structures and one system implementation which is a system for performing control and data acquisition of a neutron spectrometer at the Brookhaven High Flux Beam Reactor. The spectrometer is equipped with a large-area position-sensitive neutron detector

  4. Real-time stereo matching architecture based on 2D MRF model: a memory-efficient systolic array

    Directory of Open Access Journals (Sweden)

    Park Sungchan

    2011-01-01

    There is a growing need in computer vision applications for stereopsis, requiring not only accurate distance but also fast and compact physical implementation. Global energy minimization techniques provide remarkably precise results, but they suffer from huge computational complexity. One of the main challenges is to parallelize the iterative computation, solving the memory access problem between the big external memory and the massive processors. Remarkable memory saving can be obtained with our memory reduction scheme, and our new architecture is a systolic array. If we expand it into N multiple chips in a cascaded manner, we can cope with various ranges of image resolutions. We have realized it using FPGA technology. Our architecture requires 19 times less memory than the global minimization technique, which is a principal step toward real-time chip implementation of various iterative image processing algorithms with tiny and distributed memory resources, such as optical flow, image restoration, etc.

  5. A non-destructive crossbar architecture of multi-level memory-based resistor

    Science.gov (United States)

    Sahebkarkhorasani, Seyedmorteza

    Nowadays, researchers are trying to shrink the memory cell in order to increase the capacity of the memory system and reduce the hardware costs. In recent years, there has been a revolution in electronics by using fundamentals of physics to build a new memory for computer applications in order to increase the capacity and decrease the power consumption. Increasing the capacity of the memory causes a growth in the chip area. From 1971 to 2012, the semiconductor manufacturing process improved from 6 μm to 22 nm. In May 2008, S. Williams stated that "it is time to stop shrinking". In his paper, he declared that the process of shrinking the memory element has recently become very slow and it is time to use another alternative in order to create memory elements [9]. In this project, we present a new design of a memory array using the new element named Memristor [3]. Memristor is a two-terminal passive electrical element that relates the charge and magnetic flux to each other. The device remained unknown since 1971 when it was discovered by Chua and introduced as the fourth fundamental passive element like capacitor, inductor and resistor [3]. Memristor has a dynamic resistance and it can retain its previous value even after disconnecting the power supply. Due to this interesting behavior of the Memristor, it can be a good replacement for all of the Non-Volatile Memories (NVMs) in the near future. Combination of this newly introduced element with the nanowire crossbar architecture would be a great structure which is called Crossbar Memristor. Some frameworks have recently been introduced in literature that utilized Memristor crossbar array, but there are many challenges to implement the Memristor crossbar array due to fabrication and device limitations. In this work, we propose a simple design of Memristor crossbar array architecture which uses input feedback in order to preserve its data after each read operation.

  6. Flexible and twistable non-volatile memory cell array with all-organic one diode-one resistor architecture.

    Science.gov (United States)

    Ji, Yongsung; Zeigler, David F; Lee, Dong Su; Choi, Hyejung; Jen, Alex K-Y; Ko, Heung Cho; Kim, Tae-Wook

    2013-01-01

    Flexible organic memory devices are one of the integral components for future flexible organic electronics. However, high-density all-organic memory cell arrays on malleable substrates without cross-talk have not been demonstrated because of difficulties in their fabrication and relatively poor performances to date. Here we demonstrate the first flexible all-organic 64-bit memory cell array possessing one diode-one resistor architectures. Our all-organic one diode-one resistor cell exhibits excellent rewritable switching characteristics, even during and after harsh physical stresses. The write-read-erase-read output sequence of the cells perfectly correspond to the external pulse signal regardless of substrate deformation. The one diode-one resistor cell array is clearly addressed at the specified cells and encoded letters based on the standard ASCII character code. Our study on integrated organic memory cell arrays suggests that the all-organic one diode-one resistor cell architecture is suitable for high-density flexible organic memory applications in the future.

  7. Memory architecture for efficient utilization of SDRAM: a case study of the computation/memory access trade-off

    DEFF Research Database (Denmark)

    Gleerup, Thomas Møller; Holten-Lund, Hans Erik; Madsen, Jan

    2000-01-01

    This paper discusses the trade-off between calculations and memory accesses in a 3D graphics tile renderer for visualization of data from medical scanners. The performance requirement of this application is a frame rate of 25 frames per second when rendering 3D models with 2 million triangles, i... In software, forward differencing is usually better, but in this hardware implementation, the trade-off has made it possible to develop a very regular memory architecture with a buffering system, which can reach 95% bandwidth utilization using off-the-shelf SDRAM. This is achieved by changing the algorithm to use a memory access strategy with write-only and read-only phases, and a buffering system, which uses round-robin bank write-access combined with burst read-access.

  8. Shared and non-shared antigens from three different extracts of the metacestode of Echinococcus granulosus

    Directory of Open Access Journals (Sweden)

    David Carmena

    2005-12-01

    Hydatid cyst fluid (HCF), somatic antigens (S-Ag), and excretory-secretory products (ES-Ag) of Echinococcus granulosus protoscoleces are used as the main antigenic sources for immunodiagnosis of human and dog echinococcosis. In order to determine their non-shared as well as their shared antigenic components, these extracts were studied by ELISA-inhibition and immunoblot-inhibition. Assays were carried out using homologous rabbit polyclonal antisera, human sera from individuals with surgically confirmed hydatidosis, and sera from dogs naturally infected with E. granulosus. High levels of cross-reactivity were observed for all antigenic extracts, but especially for ES-Ag and S-Ag. Canine antibodies evidenced lesser avidity for their specific antigens than antibodies from human origin. The major antigenic components shared by HCF, S-Ag, and ES-Ag have apparent molecular masses of 4-6, 20-24, 52, 80, and 100-104 kDa, including doublets of 41/45, 54/57, and 65/68 kDa. Non-shared polypeptides of each antigenic extract of E. granulosus were identified, having apparent masses of 108 and 78 kDa for HCF, of 124, 94, 83, and 75 kDa for S-Ag, and of 89, 66, 42, 39, 37, and 35 kDa for ES-Ag.

  9. Extending and implementing the Self-adaptive Virtual Processor for distributed memory architectures

    NARCIS (Netherlands)

    van Tol, M.W.; Koivisto, J.

    2011-01-01

    Many-core architectures of the future are likely to have distributed memory organizations and need fine grained concurrency management to be used effectively. The Self-adaptive Virtual Processor (SVP) is an abstract concurrent programming model which can provide this, but the model and its current

  10. Long-term knowledge acquisition using contextual information in a memory-inspired robot architecture

    Science.gov (United States)

    Pratama, Ferdian; Mastrogiovanni, Fulvio; Lee, Soon Geul; Chong, Nak Young

    2017-03-01

    In this paper, we present a novel cognitive framework allowing a robot to form memories of relevant traits of its perceptions and to recall them when necessary. The framework is based on two main principles: on the one hand, we propose an architecture inspired by current knowledge in human memory organisation; on the other hand, we integrate such an architecture with the notion of context, which is used to modulate the knowledge acquisition process when consolidating memories and forming new ones, as well as with the notion of familiarity, which is employed to retrieve proper memories given relevant cues. Although much research has been carried out, which exploits Machine Learning approaches to provide robots with internal models of their environment (including objects and occurring events therein), we argue that such approaches may not be the right direction to follow if a long-term, continuous knowledge acquisition is to be achieved. As a case study scenario, we focus on both robot-environment and human-robot interaction processes. In case of robot-environment interaction, a robot performs pick and place movements using the objects in the workspace, at the same time observing their displacement on a table in front of it, and progressively forms memories defined as relevant cues (e.g. colour, shape or relative position) in a context-aware fashion. As far as human-robot interaction is concerned, the robot can recall specific snapshots representing past events using both sensory information and contextual cues upon request by humans.

  11. Concurrent Operations of O2-Tree on Shared Memory Multicore Architectures

    Directory of Open Access Journals (Sweden)

    Daniel Ohene-Kwofie

    2014-05-01

    Modern computer architectures provide high performance computing capability by having multiple CPU cores. Such systems are also typically associated with very large main-memory capacities, thereby allowing them to be used for fast processing of in-memory database applications. However, most of the concurrency control mechanisms associated with the index structures of these memory resident databases do not scale well under high transaction rates. This paper presents the O2-Tree, a fast main memory resident index, which is also highly scalable and tolerant of high transaction rates in a concurrent environment using the relaxed balancing tree algorithm. The O2-Tree is a modified Red-Black tree in which the leaf nodes are formed into blocks that hold key-value pairs, while each internal node stores a single key that results from splitting leaf nodes. Multi-threaded concurrent manipulation of the O2-Tree outperforms popular NoSQL based key-value stores considered in this paper.

  12. Ultra-High Density Holographic Memory Module with Solid-State Architecture

    Science.gov (United States)

    Markov, Vladimir B.

    2000-01-01

    NASA's terrestrial, space, and deep-space missions require technology that allows storing, retrieving, and processing a large volume of information. Holographic memory offers high-density data storage with parallel access and high throughput. Several methods exist for data multiplexing based on the fundamental principles of volume hologram selectivity. We recently demonstrated that a spatial (amplitude-phase) encoding of the reference wave (SERW) looks promising as a way to increase the storage density. The SERW hologram offers a method other than traditional methods of selectivity, such as spatial de-correlation between recorded and reconstruction fields. In this report we present the experimental results of the SERW-hologram memory module with solid-state architecture, which is of particular interest for space operations.

  13. Parallel k-means++ for Multiple Shared-Memory Architectures

    Energy Technology Data Exchange (ETDEWEB)

    Mackey, Patrick S.; Lewis, Robert R.

    2016-09-22

    In recent years k-means++ has become a popular initialization technique for improved k-means clustering. To date, most of the work done to improve its performance has involved parallelizing algorithms that are only approximations of k-means++. In this paper we present a parallelization of the exact k-means++ algorithm, with a proof of its correctness. We develop implementations for three distinct shared-memory architectures: multicore CPU, high performance GPU, and the massively multithreaded Cray XMT platform. We demonstrate the scalability of the algorithm on each platform. In addition we present a visual approach for showing which platform performed k-means++ the fastest for varying data sizes.
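
    For readers unfamiliar with the baseline, the following is a minimal sequential sketch of the standard k-means++ seeding rule (each new center is sampled with probability proportional to its squared distance from the nearest center chosen so far). It is not the parallel implementation described in the paper; the function names, the tuple-based point representation, and the fixed default seed are assumptions made for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seeds(points, k, rng=None):
    """Pick k initial centers from `points` using k-means++ (D^2) sampling."""
    rng = rng or random.Random(0)   # hypothetical default seed for repeatability
    centers = [rng.choice(points)]  # first center: uniform at random
    # d2[i] = squared distance from points[i] to its nearest chosen center
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if not any(d2):             # every point already coincides with a center
            break
        # Sample an index with probability proportional to d2
        idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
        centers.append(points[idx])
        # Update nearest-center distances with the newly added center
        d2 = [min(old, sq_dist(p, points[idx])) for old, p in zip(d2, points)]
    return centers


data = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0)]
print(kmeanspp_seeds(data, 3))
```

    The two list comprehensions over all points are the data-parallel steps; the weighted draw is the part that needs care to keep exact when the work is split across threads.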

  14. Compiling for Novel Scratch Pad Memory based Multicore Architectures for Extreme Scale Computing

    Energy Technology Data Exchange (ETDEWEB)

    Shrivastava, Aviral

    2016-02-05

    The objective of this proposal is to develop tools and techniques (in the compiler) to manage data of a task and communication among tasks on the scratch pad memory (SPM) of the core, so that any application (a set of tasks) can be executed efficiently on an SPM based manycore architecture.

  15. Cognitive Architectures for Multimedia Learning

    Science.gov (United States)

    Reed, Stephen K.

    2006-01-01

    This article provides a tutorial overview of cognitive architectures that can form a theoretical foundation for designing multimedia instruction. Cognitive architectures include a description of memory stores, memory codes, and cognitive operations. Architectures that are relevant to multimedia learning include Paivio's dual coding theory,…

  16. Strategies for memory-based decision making: Modeling behavioral and neural signatures within a cognitive architecture.

    Science.gov (United States)

    Fechner, Hanna B; Pachur, Thorsten; Schooler, Lael J; Mehlhorn, Katja; Battal, Ceren; Volz, Kirsten G; Borst, Jelmer P

    2016-12-01

    How do people use memories to make inferences about real-world objects? We tested three strategies based on predicted patterns of response times and blood-oxygen-level-dependent (BOLD) responses: one strategy that relies solely on recognition memory, a second that retrieves additional knowledge, and a third, lexicographic (i.e., sequential) strategy, that considers knowledge conditionally on the evidence obtained from recognition memory. We implemented the strategies as computational models within the Adaptive Control of Thought-Rational (ACT-R) cognitive architecture, which allowed us to derive behavioral and neural predictions that we then compared to the results of a functional magnetic resonance imaging (fMRI) study in which participants inferred which of two cities is larger. Overall, versions of the lexicographic strategy, according to which knowledge about many but not all alternatives is searched, provided the best account of the joint patterns of response times and BOLD responses. These results provide insights into the interplay between recognition and additional knowledge in memory, hinting at an adaptive use of these two sources of information in decision making. The results highlight the usefulness of implementing models of decision making within a cognitive architecture to derive predictions on the behavioral and neural level.

  17. Multiprocessor architecture: Synthesis and evaluation

    Science.gov (United States)

    Standley, Hilda M.

    1990-01-01

    Multiprocessor computer architecture evaluation for structural computations is the focus of the research effort described. Results obtained are expected to lead to more efficient use of existing architectures and to suggest designs for new, application specific, architectures. The brief descriptions given outline a number of related efforts directed toward this purpose. The difficulty in analyzing an existing architecture or in designing a new computer architecture lies in the fact that the performance of a particular architecture, within the context of a given application, is determined by a number of factors. These include, but are not limited to, the efficiency of the computation algorithm, the programming language and support environment, the quality of the program written in the programming language, the multiplicity of the processing elements, the characteristics of the individual processing elements, the interconnection network connecting processors and non-local memories, and the shared memory organization covering the spectrum from no shared memory (all local memory) to one global access memory. These performance determiners may be loosely classified as being software or hardware related. This distinction is not clear or even appropriate in many cases. The effect of the choice of algorithm is ignored by assuming that the algorithm is specified as given. Effort directed toward the removal of the effect of the programming language and program resulted in the design of a high-level parallel programming language. Two characteristics of the fundamental structure of the architecture (memory organization and interconnection network) are examined.

  18. An energy efficient and high speed architecture for convolution computing based on binary resistive random access memory

    Science.gov (United States)

    Liu, Chen; Han, Runze; Zhou, Zheng; Huang, Peng; Liu, Lifeng; Liu, Xiaoyan; Kang, Jinfeng

    2018-04-01

    In this work we present a novel convolution computing architecture based on metal oxide resistive random access memory (RRAM) to process the image data stored in the RRAM arrays. The proposed image storage architecture shows better speed-device consumption efficiency than the previous kernel storage architecture. Further, we improve the architecture for high-accuracy, low-power computing by utilizing binary storage and a series resistor. For a 28 × 28 image and 10 kernels with a size of 3 × 3, compared with the previous kernel storage approach, the newly proposed architecture shows excellent performance, including: 1) almost 100% accuracy within 20% LRS variation and 90% HRS variation; 2) more than 67 times speed boost; 3) 71.4% energy saving.
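    To put the workload quoted in this abstract in concrete terms, the following sketch counts output pixels and multiply-accumulate operations for a 28 × 28 image and 10 kernels of size 3 × 3. It assumes stride 1 and no padding, which the abstract does not state, and the function name `conv_workload` is illustrative only. It says nothing about how the RRAM arrays are actually organized.

```python
def conv_workload(image_size, kernel_size, num_kernels, stride=1):
    """Count outputs and multiply-accumulates for a 2D convolution.

    Assumptions (not taken from the abstract): square image, square
    kernels, no padding, and the given stride.
    """
    out_size = (image_size - kernel_size) // stride + 1
    outputs_per_kernel = out_size * out_size
    macs_per_output = kernel_size * kernel_size
    total_macs = outputs_per_kernel * macs_per_output * num_kernels
    return out_size, outputs_per_kernel, total_macs


if __name__ == "__main__":
    size, outputs, macs = conv_workload(28, 3, 10)
    print(f"output {size} x {size}, {outputs} outputs per kernel, {macs} MACs")
```

    Under these assumptions each kernel produces a 26 × 26 output, so the whole task is on the order of tens of thousands of multiply-accumulates.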

  19. Learning, memory, and the role of neural network architecture.

    Directory of Open Access Journals (Sweden)

    Ann M Hermundstad

    2011-06-01

    The performance of information processing systems, from artificial neural networks to natural neuronal ensembles, depends heavily on the underlying system architecture. In this study, we compare the performance of parallel and layered network architectures during sequential tasks that require both acquisition and retention of information, thereby identifying tradeoffs between learning and memory processes. During the task of supervised, sequential function approximation, networks produce and adapt representations of external information. Performance is evaluated by statistically analyzing the error in these representations while varying the initial network state, the structure of the external information, and the time given to learn the information. We link performance to complexity in network architecture by characterizing local error landscape curvature. We find that variations in error landscape structure give rise to tradeoffs in performance; these include the ability of the network to maximize accuracy versus minimize inaccuracy and produce specific versus generalizable representations of information. Parallel networks generate smooth error landscapes with deep, narrow minima, enabling them to find highly specific representations given sufficient time. While accurate, however, these representations are difficult to generalize. In contrast, layered networks generate rough error landscapes with a variety of local minima, allowing them to quickly find coarse representations. Although less accurate, these representations are easily adaptable. The presence of measurable performance tradeoffs in both layered and parallel networks has implications for understanding the behavior of a wide variety of natural and artificial learning systems.

  20. A Parallel Saturation Algorithm on Shared Memory Architectures

    Science.gov (United States)

    Ezekiel, Jonathan; Siminiceanu

    2007-01-01

    Symbolic state-space generators are notoriously hard to parallelize. However, the Saturation algorithm implemented in the SMART verification tool differs from other sequential symbolic state-space generators in that it exploits the locality of firing events in asynchronous system models. This paper explores whether event locality can be utilized to efficiently parallelize Saturation on shared-memory architectures. Conceptually, we propose to parallelize the firing of events within a decision diagram node, which is technically realized via a thread pool. We discuss the challenges involved in our parallel design and conduct experimental studies on its prototypical implementation. On a dual-processor, dual-core PC, our studies show speed-ups for several example models, e.g., of up to 50% for a Kanban model, when compared to running our algorithm only on a single core.
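    The idea of handing event firings to a thread pool can be illustrated with a toy sketch. This is an explicit-state analogue, not the symbolic Saturation algorithm and not the SMART implementation: there are no decision diagrams, and the names `make_increment`, `fire_events_parallel` and `reachable` are hypothetical. In CPython the global interpreter lock also means this shows the structure of the approach rather than a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def make_increment(index, limit):
    """Build an event that touches only one state component (event locality)."""
    def event(state):
        if state[index] >= limit:
            return None  # event is not enabled in this state
        successor = list(state)
        successor[index] += 1
        return tuple(successor)
    return event


def fire_events_parallel(states, events, max_workers=4):
    """Fire every event on the given states, one pool task per event."""
    def fire(event):
        successors = set()
        for state in states:
            nxt = event(state)
            if nxt is not None:
                successors.add(nxt)
        return successors

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fire, events))
    merged = set()
    for result in results:
        merged |= result
    return merged


def reachable(initial, events):
    """Breadth-first state-space generation using the parallel firing step."""
    seen = {initial}
    frontier = {initial}
    while frontier:
        frontier = fire_events_parallel(frontier, events) - seen
        seen |= frontier
    return seen
```

    For example, two independent counters that each run from 0 to 2 can be explored with `reachable((0, 0), [make_increment(0, 2), make_increment(1, 2)])`.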

  1. Concept of rewritable organic ferroelectric random access memory in two lateral transistors-in-one cell architecture

    International Nuclear Information System (INIS)

    Kim, Min-Hoi; Lee, Gyu Jeong; Keum, Chang-Min; Lee, Sin-Doo

    2014-01-01

    We propose a concept of rewritable ferroelectric random access memory (RAM) with two lateral organic transistors-in-one cell architecture. Lateral integration of a paraelectric organic field-effect transistor (OFET), being a selection transistor, and a ferroelectric OFET as a memory transistor is realized using a paraelectric depolarizing layer (PDL) which is patterned on a ferroelectric insulator by transfer-printing. For the selection transistor, the key roles of the PDL are to reduce the dipolar strength and the surface roughness of the gate insulator, leading to the low memory on–off ratio and the high switching on–off current ratio. A new driving scheme preventing the crosstalk between adjacent memory cells is also demonstrated for the rewritable operation of the ferroelectric RAM.

  2. Neurocognitive architecture of working memory

    Science.gov (United States)

    Eriksson, Johan; Vogel, Edward K.; Lansner, Anders; Bergström, Fredrik; Nyberg, Lars

    2015-01-01

    The crucial role of working memory for temporary information processing and guidance of complex behavior has been recognized for many decades. There is emerging consensus that working memory maintenance results from the interactions among long-term memory representations and basic processes, including attention, that are instantiated as reentrant loops between frontal and posterior cortical areas, as well as subcortical structures. The nature of such interactions can account for capacity limitations, lifespan changes, and restricted transfer after working-memory training. Recent data and models indicate that working memory may also be based on synaptic plasticity, and that working memory can operate on non-consciously perceived information. PMID:26447571

  3. Working Memory and Parent-Rated Components of Attention in Middle Childhood: A Behavioral Genetic Study

    Science.gov (United States)

    Deater-Deckard, Kirby; Cutting, Laurie; Thompson, Lee A.; Petrill, Stephen A.

    2012-01-01

    The purpose of the current study was to investigate potential genetic and environmental correlations between working memory and three behavioral aspects of the attention network (i.e., executive, alerting, and orienting) using a twin design. Data were from 90 monozygotic (39% male) and 112 same-sex dizygotic (41% male) twins. Individual differences in working memory performance (digit span) and parent-rated measures of executive, alerting, and orienting attention included modest to moderate genetic variance, modest shared environmental variance, and modest to moderate nonshared environmental variance. As hypothesized, working memory performance was correlated with executive and alerting attention, but not orienting attention. The correlation between working memory, executive attention, and alerting attention was completely accounted for by overlapping genetic covariance, suggesting a common genetic mechanism or mechanisms underlying the links between working memory and certain parent-rated indicators of attentive behavior. PMID:21948215

  4. Factor structure of overall autobiographical memory usage: the directive, self and social functions revisited.

    Science.gov (United States)

    Rasmussen, Anne S; Habermas, Tilmann

    2011-08-01

    According to theory, autobiographical memory serves three broad functions of overall usage: directive, self, and social. However, there is evidence to suggest that the tripartite model may be better conceptualised in terms of a four-factor model with two social functions. In the present study we examined the two models in Danish and German samples, using the Thinking About Life Experiences Questionnaire (TALE; Bluck, Alea, Habermas, & Rubin, 2005), which measures the overall usage of the three functions generalised across concrete memories. Confirmatory factor analysis supported the four-factor model and rejected the theoretical three-factor model in both samples. The results are discussed in relation to cultural differences in overall autobiographical memory usage as well as sharing versus non-sharing aspects of social remembering.

  5. The parallel processing system for fast 3D-CT image reconstruction by circular shifting float memory architecture

    International Nuclear Information System (INIS)

    Wang Shi; Kang Kejun; Wang Jingjin

    1996-01-01

    Computerized Tomography (CT) is expected to become an inevitable diagnostic technique in the future. However, the long time required to reconstruct an image has been one of the major drawbacks associated with this technique. Parallel processing is one of the best ways to solve this problem. This paper gives the architecture, hardware and software design of PIRS-4 (4-processor Parallel Image Reconstruction System), which is a parallel processing system for fast 3D-CT image reconstruction by circular shifting float memory architecture. It includes the structure and components of the system, the design of the crossbar switch and details of the control model, the description of RPBP image reconstruction, the choice of OS (operating system) and language, the principle of imitating EMS, direct memory R/W of float and programming in protected mode. Finally, the test results are given.

  6. Exploring memory hierarchy design with emerging memory technologies

    CERN Document Server

    Sun, Guangyu

    2014-01-01

    This book equips readers with tools for computer architecture of high performance, low power, and high reliability memory hierarchy in computer systems based on emerging memory technologies, such as STTRAM, PCM, FBDRAM, etc. The techniques described offer advantages of high density, near-zero static power, and immunity to soft errors, which have the potential of overcoming the “memory wall.” The authors discuss memory design from various perspectives: emerging memory technologies are employed in the memory hierarchy with novel architecture modification; hybrid memory structure is introduced to leverage advantages from multiple memory technologies; an analytical model named “Moguls” is introduced to explore quantitatively the optimization design of a memory hierarchy; finally, the vulnerability of the CMPs to radiation-based soft errors is improved by replacing different levels of on-chip memory with STT-RAMs. · Provides a holistic study of using emerging memory technologies i...

  7. Insights into Working Memory from The Perspective of The EPIC Architecture for Modeling Skilled Perceptual-Motor and Cognitive Human Performance

    National Research Council Canada - National Science Library

    Kieras, David

    1998-01-01

    Computational modeling of human perceptual-motor and cognitive performance based on a comprehensive detailed information- processing architecture leads to new insights about the components of working memory...

  8. Chip architecture - A revolution brewing

    Science.gov (United States)

    Guterl, F.

    1983-07-01

    Techniques being explored by microchip designers and manufacturers to speed up both memory access and instruction execution while protecting memory are discussed. Attention is given to hardwiring control logic, pipelining for parallel processing, devising orthogonal instruction sets for interchangeable instruction fields, and the development of hardware for implementation of virtual memory and multiuser systems to provide memory management and protection. The inclusion of microcode in mainframes eliminated logic circuits that control timing and gating of the CPU. However, improvements in memory architecture have reduced access time to below that needed for instruction execution. Hardwiring the functions as a virtual memory enhances memory protection. Parallelism involves a redundant architecture, which allows identical operations to be performed simultaneously, and can be directed with microcode to avoid abortion of intermediate instructions once one set of instructions has been completed.

  9. Genetic and environmental influences on individual differences in emotion regulation and its relation to working memory in toddlerhood.

    Science.gov (United States)

    Wang, Manjie; Saudino, Kimberly J

    2013-12-01

    This is the first study to explore genetic and environmental contributions to individual differences in emotion regulation in toddlers, and the first to examine the genetic and environmental etiology underlying the association between emotion regulation and working memory. In a sample of 304 same-sex twin pairs (140 MZ, 164 DZ) at age 3, emotion regulation was assessed using the Behavior Rating Scale of the Bayley Scales of Infant Development (BRS; Bayley, 1993), and working memory was measured by the visually cued recall (VCR) task (Zelazo, Jacques, Burack, & Frye, 2002) and several memory tasks from the Mental Scale of the BSID. Based on model-fitting analyses, both emotion regulation and working memory were significantly influenced by genetic and nonshared environmental factors. Shared environmental effects were significant for working memory, but not for emotion regulation. Only genetic factors significantly contributed to the covariation between emotion regulation and working memory.

  10. Scalable Multi-core Architectures Design Methodologies and Tools

    CERN Document Server

    Jantsch, Axel

    2012-01-01

    As Moore’s law continues to unfold, two important trends have recently emerged. First, the growth of chip capacity is translated into a corresponding increase in the number of cores. Second, the parallelization of the computation and 3D integration technologies lead to distributed memory architectures. This book provides a current snapshot of industrial and academic research, conducted as part of the European FP7 MOSART project, addressing urgent challenges in many-core architectures and application mapping. It addresses the architectural design of many-core chips, memory and data management, power management, design and programming methodologies. It also describes how new techniques have been applied in various industrial case studies. Describes trends towards distributed memory architectures and distributed power management; Integrates Network on Chip with distributed, shared memory architectures; Demonstrates novel design methodologies and frameworks for multi-core design space exploration; Shows how midll...

  11. Balance in machine architecture: Bandwidth on board and offboard, integer/control speed and flops versus memory

    International Nuclear Information System (INIS)

    Fischler, M.

    1992-04-01

    The issues to be addressed here are those of "balance" in machine architecture. By this, we mean how much emphasis must be placed on various aspects of the system to maximize its usefulness for physics. There are three components that contribute to the utility of a system: How the machine can be used, how big a problem can be attacked, and what the effective capabilities (power) of the hardware are like. The effective power issue is a matter of evaluating the impact of design decisions trading off architectural features such as memory bandwidth and interprocessor communication capabilities. What is studied is the effect these machine parameters have on how quickly the system can solve desired problems. There is a reasonable method for studying this: One selects a few representative algorithms and computes the impact of changing memory bandwidths, and so forth. The only room for controversy here is in the selection of representative problems. The issue of how big a problem can be attacked boils down to a balance of memory size versus power. Although this is a balance issue it is very different than the effective power situation, because no firm answer can be given at this time. The power to memory ratio is highly problem dependent, and optimizing it requires several pieces of physics input, including: how big a lattice is needed for interesting results; what sort of algorithms are best to use; and how many sweeps are needed to get valid results. We seem to be at the threshold of learning things about these issues, but for now, the memory size issue will necessarily be addressed in terms of best guesses, rules of thumb, and researchers' opinions.

  12. One-way shared memory

    DEFF Research Database (Denmark)

    Schoeberl, Martin

    2018-01-01

    Standard multicore processors use the shared main memory via the on-chip caches for communication between cores. However, this form of communication has two limitations: (1) it is hardly time-predictable and therefore not a good solution for real-time systems and (2) this single shared memory is a bottleneck in the system. This paper presents a communication architecture for time-predictable multicore systems where core-local memories are distributed on the chip. A network-on-chip constantly copies data from a sender core-local memory to a receiver core-local memory. As this copying is performed in one direction, we call this architecture a one-way shared memory. With the use of time-division multiplexing for the memory accesses and the network-on-chip routers we achieve a time-predictable solution where the communication latency and bandwidth can be bounded. An example architecture for a 3...
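    The copying scheme described in this abstract can be sketched as a small simulation. This is a minimal model under stated assumptions, not the paper's design: the round-robin schedule in `tdm_schedule`, the one-word-per-slot transfer, and the class name `OneWayMemory` are all hypothetical. It only illustrates why a fixed time-division schedule gives a bounded latency: every word of every channel is copied exactly once per schedule period.

```python
def tdm_schedule(num_cores):
    """One round per offset; in each round every core sends to one other core."""
    return [
        {sender: (sender + offset) % num_cores for sender in range(num_cores)}
        for offset in range(1, num_cores)
    ]


class OneWayMemory:
    """Toy model: data flows only from a sender's TX buffer to a receiver's RX copy."""

    def __init__(self, num_cores, words_per_channel):
        self.words = words_per_channel
        self.schedule = tdm_schedule(num_cores)
        # tx[s][r] is written by core s for core r; rx[r][s] is read by core r.
        self.tx = [[[0] * words_per_channel for _ in range(num_cores)]
                   for _ in range(num_cores)]
        self.rx = [[[0] * words_per_channel for _ in range(num_cores)]
                   for _ in range(num_cores)]
        self.cycle = 0

    def period(self):
        """Number of slots after which every word has been copied once."""
        return len(self.schedule) * self.words

    def step(self):
        """Advance one TDM slot: each core copies one word to one receiver."""
        slot = self.cycle % self.period()
        round_index, word = divmod(slot, self.words)
        for sender, receiver in self.schedule[round_index].items():
            self.rx[receiver][sender][word] = self.tx[sender][receiver][word]
        self.cycle += 1
```

    In this model a value written into `tx[s][r]` is visible in `rx[r][s]` after at most `period()` calls to `step()`, regardless of what the other cores do.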

  13. Hardware system of parallel processing for fast CT image reconstruction based on circular shifting float memory architecture

    International Nuclear Information System (INIS)

    Wang Shi; Kang Kejun; Wang Jingjin

    1995-01-01

    Computerized Tomography (CT) is expected to become an inevitable diagnostic technique in the future. However, the long time required to reconstruct an image has been one of the major drawbacks associated with this technique. Parallel processing is one of the best ways to solve this problem. This paper gives the architecture and hardware design of PIRS-4 (4-processor Parallel Image Reconstruction System), which is a parallel processing system for fast 3D-CT image reconstruction by circular shifting float memory architecture. It includes the structure and components of the system, the design of the crossbar switch and details of the control model. The test results are described.

  14. High-bandwidth memory interface

    CERN Document Server

    Kim, Chulwoo; Song, Junyoung

    2014-01-01

    This book provides an overview of recent advances in memory interface design at both the architecture and circuit levels. Coverage includes signal integrity and testing, TSV interface, high-speed serial interface including equalization, ODT, pre-emphasis, wide I/O interface including crosstalk, skew cancellation, and clock generation and distribution. Trends for further bandwidth enhancement are also covered.   • Enables readers with minimal background in memory design to understand the basics of high-bandwidth memory interface design; • Presents state-of-the-art techniques for memory interface design; • Covers memory interface design at both the circuit level and system architecture level.

  15. A Scalable Multicore Architecture With Heterogeneous Memory Structures for Dynamic Neuromorphic Asynchronous Processors (DYNAPs).

    Science.gov (United States)

    Moradi, Saber; Qiao, Ning; Stefanini, Fabio; Indiveri, Giacomo

    2018-02-01

    Neuromorphic computing systems comprise networks of neurons that use asynchronous events for both computation and communication. This type of representation offers several advantages in terms of bandwidth and power consumption in neuromorphic electronic systems. However, managing the traffic of asynchronous events in large scale systems is a daunting task, both in terms of circuit complexity and memory requirements. Here, we present a novel routing methodology that employs both hierarchical and mesh routing strategies and combines heterogeneous memory structures for minimizing both memory requirements and latency, while maximizing programming flexibility to support a wide range of event-based neural network architectures, through parameter configuration. We validated the proposed scheme in a prototype multicore neuromorphic processor chip that employs hybrid analog/digital circuits for emulating synapse and neuron dynamics together with asynchronous digital circuits for managing the address-event traffic. We present a theoretical analysis of the proposed connectivity scheme, describe the methods and circuits used to implement such scheme, and characterize the prototype chip. Finally, we demonstrate the use of the neuromorphic processor with a convolutional neural network for the real-time classification of visual symbols being flashed to a dynamic vision sensor (DVS) at high speed.

  16. Determining the relationship between sleep architecture, seizure variables and memory in patients with focal epilepsy.

    Science.gov (United States)

    Miller, Laurie A; Ricci, Monica; van Schalkwijk, Frank J; Mohamed, Armin; van der Werf, Ysbrand D

    2016-06-01

    Sleep has been shown to be important to memory. Both sleep and memory have been found to be abnormal in patients with epilepsy. In this study, we explored the effects that nocturnal epileptiform discharges and the presence of a hippocampal lesion have on sleep patterns and memory. Twenty-five patients with focal epilepsy who underwent a 24-hr ambulatory EEG also completed the Everyday Memory Questionnaire (EMQ). The EEG record was scored for length of time spent in the various sleep stages, time spent awake after sleep onset, and rapid eye movement (REM) latency. Of these sleep variables, only REM latency differed when the epilepsy patients were divided on the bases of either presence/absence of nocturnal discharges or presence/absence of a hippocampal lesion. In both cases, presence of the abnormality was associated with longer latency. Furthermore, longer REM latency was found to be a better predictor of EMQ score than either number of discharges or presence of a hippocampal lesion. Longer REM latency was associated with a smaller percentage of time spent in slow-wave sleep in the early part of the night and may serve as a particularly sensitive marker to disturbances in sleep architecture. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  17. Architectures for a quantum random access memory

    OpenAIRE

    Giovannetti, Vittorio; Lloyd, Seth; Maccone, Lorenzo

    2008-01-01

    A random access memory, or RAM, is a device that, when interrogated, returns the content of a memory location in a memory array. A quantum RAM, or qRAM, allows one to access superpositions of memory sites, which may contain either quantum or classical information. RAMs and qRAMs with n-bit addresses can access 2^n memory sites. Any design for a RAM or qRAM then requires O(2^n) two-bit logic gates. At first sight this requirement might seem to make large scale quantum versions of such devices ...
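    The scaling stated in this abstract can be made concrete with a purely classical sketch: an n-bit address is decoded by walking a binary tree of switches, one address bit per level, to reach one of 2^n memory sites. The function names are illustrative, and the switch count of 2^n - 1 applies to this particular full-binary-tree layout; it is offered only as an instance consistent with the O(2^n) figure, not as a model of the quantum designs.

```python
def tree_switch_count(n_bits):
    """Internal nodes of a full binary tree with 2**n_bits leaves."""
    return 2 ** n_bits - 1


def ram_lookup(memory, address_bits):
    """Route through the tree, most significant address bit first.

    `memory` must hold exactly 2**len(address_bits) cells.
    """
    if len(memory) != 2 ** len(address_bits):
        raise ValueError("memory size must be 2**n for an n-bit address")
    low, high = 0, len(memory)
    for bit in address_bits:
        middle = (low + high) // 2
        if bit:
            low = middle   # take the upper half
        else:
            high = middle  # take the lower half
    return memory[low]


if __name__ == "__main__":
    cells = ["a", "b", "c", "d", "e", "f", "g", "h"]
    print(ram_lookup(cells, [1, 0, 1]), tree_switch_count(3))
```

    Each lookup touches only n switches along one path, but the tree as a whole still needs a number of switches that grows exponentially with the address width.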

  18. Die-stacking architecture

    CERN Document Server

    Xie, Yuan

    2015-01-01

    The emerging three-dimensional (3D) chip architectures, with their intrinsic capability of reducing the wire length, promise attractive solutions to reduce the delay of interconnects in future microprocessors. 3D memory stacking enables much higher memory bandwidth for future chip-multiprocessor design, mitigating the "memory wall" problem. In addition, heterogeneous integration enabled by 3D technology can also result in innovative designs for future microprocessors. This book first provides a brief introduction to this emerging technology, and then presents a variety of approaches to design

  19. Neural Architecture for Feature Binding in Visual Working Memory.

    Science.gov (United States)

    Schneegans, Sebastian; Bays, Paul M

    2017-04-05

    Binding refers to the operation that groups different features together into objects. We propose a neural architecture for feature binding in visual working memory that employs populations of neurons with conjunction responses. We tested this model using cued recall tasks, in which subjects had to memorize object arrays composed of simple visual features (color, orientation, and location). After a brief delay, one feature of one item was given as a cue, and the observer had to report, on a continuous scale, one or two other features of the cued item. Binding failure in this task is associated with swap errors, in which observers report an item other than the one indicated by the cue. We observed that the probability of swapping two items strongly correlated with the items' similarity in the cue feature dimension, and found a strong correlation between swap errors occurring in spatial and nonspatial report. The neural model explains both swap errors and response variability as results of decoding noisy neural activity, and can account for the behavioral results in quantitative detail. We then used the model to compare alternative mechanisms for binding nonspatial features. We found the behavioral results fully consistent with a model in which nonspatial features are bound exclusively via their shared location, with no indication of direct binding between color and orientation. These results provide evidence for a special role of location in feature binding, and the model explains how this special role could be realized in the neural system. SIGNIFICANCE STATEMENT The problem of feature binding is of central importance in understanding the mechanisms of working memory. How do we remember not only that we saw a red and a round object, but that these features belong together to a single object rather than to different objects in our environment? Here we present evidence for a neural mechanism for feature binding in working memory, based on encoding of visual

  20. Memory, microprocessor, and ASIC

    CERN Document Server

    Chen, Wai-Kai

    2003-01-01

    System Timing. ROM/PROM/EPROM. SRAM. Embedded Memory. Flash Memories. Dynamic Random Access Memory. Low-Power Memory Circuits. Timing and Signal Integrity Analysis. Microprocessor Design Verification. Microprocessor Layout Method. Architecture. ASIC Design. Logic Synthesis for Field Programmable Gate Array (FPGA) Technology. Testability Concepts and DFT. ATPG and BIST. CAD Tools for BIST/DFT and Delay Faults.

  1. A Case Study on Neural Inspired Dynamic Memory Management Strategies for High Performance Computing.

    Energy Technology Data Exchange (ETDEWEB)

    Vineyard, Craig Michael [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Verzi, Stephen Joseph [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-09-01

    As high performance computing architectures pursue more computational power there is a need for increased memory capacity and bandwidth as well. A multi-level memory (MLM) architecture addresses this need by combining multiple memory types with different characteristics as varying levels of the same architecture. How to efficiently utilize this memory infrastructure is an unknown challenge, and in this research we sought to investigate whether neural inspired approaches can meaningfully help with memory management. In particular we explored neurogenesis inspired resource allocation, and were able to show a neural inspired mixed controller policy can beneficially impact how MLM architectures utilize memory.

  2. Outline of a novel architecture for cortical computation.

    Science.gov (United States)

    Majumdar, Kaushik

    2008-03-01

    In this paper a novel architecture for cortical computation has been proposed. This architecture is composed of computing paths consisting of neurons and synapses. These paths have been decomposed into lateral, longitudinal and vertical components. Cortical computation has then been decomposed into lateral computation (LaC), longitudinal computation (LoC) and vertical computation (VeC). It has been shown that various loop structures in the cortical circuit play important roles in cortical computation as well as in memory storage and retrieval, keeping in conformity with the molecular basis of short and long term memory. A new learning scheme for the brain has also been proposed and how it is implemented within the proposed architecture has been explained. A few mathematical results about the architecture have been proposed, some of which are without proof.

  3. Customizable Memory Schemes for Data Parallel Architectures

    NARCIS (Netherlands)

    Gou, C.

    2011-01-01

    Memory system efficiency is crucial for any processor to achieve high performance, especially in the case of data parallel machines. Processing capabilities of parallel lanes will be wasted, when data requests are not accomplished in a sustainable and timely manner. Irregular vector memory accesses

  4. Sparse distributed memory overview

    Science.gov (United States)

    Raugh, Mike

    1990-01-01

    The Sparse Distributed Memory (SDM) project is investigating the theory and applications of a massively parallel computing architecture, called sparse distributed memory, that will support the storage and retrieval of sensory and motor patterns characteristic of autonomous systems. The immediate objectives of the project are centered in studies of the memory itself and in the use of the memory to solve problems in speech, vision, and robotics. Investigation of methods for encoding sensory data is an important part of the research. Examples of NASA missions that may benefit from this work are Space Station, planetary rovers, and solar exploration. Sparse distributed memory offers promising technology for systems that must learn through experience and be capable of adapting to new circumstances, and for operating any large complex system requiring automatic monitoring and control. Sparse distributed memory is a massively parallel architecture motivated by efforts to understand how the human brain works. Sparse distributed memory is an associative memory, able to retrieve information from cues that only partially match patterns stored in the memory. It is able to store long temporal sequences derived from the behavior of a complex system, such as progressive records of the system's sensory data and correlated records of the system's motor controls.
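    The associative behavior described in this abstract, retrieval from a cue that only partially matches a stored pattern, can be sketched with a small model in the style of Kanerva's sparse distributed memory: random binary "hard location" addresses, activation by Hamming distance, and per-bit counters. The abstract gives no parameters, so the class name, the sizes and the radius below are assumptions chosen for illustration, not the project's implementation.

```python
import random


class SparseDistributedMemory:
    """Toy SDM: binary addresses, Hamming-radius activation, bit counters."""

    def __init__(self, num_locations, dim, radius, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.radius = radius
        # Fixed random addresses of the hard locations.
        self.addresses = [[rng.randint(0, 1) for _ in range(dim)]
                          for _ in range(num_locations)]
        # One signed counter per bit per location.
        self.counters = [[0] * dim for _ in range(num_locations)]

    def _active(self, address):
        """Indices of hard locations within the Hamming radius of `address`."""
        return [i for i, hard in enumerate(self.addresses)
                if sum(a != b for a, b in zip(hard, address)) <= self.radius]

    def write(self, address, data):
        """Add the data to every active location (+1 for a 1 bit, -1 for a 0 bit)."""
        for i in self._active(address):
            row = self.counters[i]
            for j, bit in enumerate(data):
                row[j] += 1 if bit else -1

    def read(self, address):
        """Sum the counters of all active locations and threshold at zero."""
        sums = [0] * self.dim
        for i in self._active(address):
            for j, count in enumerate(self.counters[i]):
                sums[j] += count
        return [1 if total > 0 else 0 for total in sums]
```

    A pattern written at its own address can then be read back with a cue that has a few bits flipped, provided the cue still activates at least one of the locations that were written. Long sequences could be stored by writing each pattern at the address of its predecessor, though that is not shown here.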

  5. Applications for Packetized Memory Interfaces

    OpenAIRE

    Watson, Myles Glen

    2015-01-01

    The performance of the memory subsystem has a large impact on the performance of modern computer systems. Many important applications are memory bound and others are expected to become memory bound in the future. The importance of memory performance makes it imperative to understand and optimize the interactions between applications and the system architecture. Prototyping and exploring various configurations of memory systems can give important insights, but current memory interfaces are lim...

  6. Extending the Soar Cognitive Architecture

    National Research Council Canada - National Science Library

    Laird, John E

    2007-01-01

    .... Specifically looking at extensions related to memory and learning (episodic, semantic) and emotion. The direction changed when an opportunity became available to collaborate with other biologically-inspired cognitive architecture...

  7. PIMS: Memristor-Based Processing-in-Memory-and-Storage.

    Energy Technology Data Exchange (ETDEWEB)

    Cook, Jeanine

    2018-02-01

    Continued progress in computing has augmented the quest for higher performance with a new quest for higher energy efficiency. This has led to the re-emergence of Processing-In-Memory (PIM) architectures that offer higher density and performance with some boost in energy efficiency. Past PIM work either integrated a standard CPU with a conventional DRAM to improve the CPU-memory link, or used a bit-level processor with Single Instruction Multiple Data (SIMD) control, but neither matched the energy consumption of the memory to the computation. We originally proposed to develop a new architecture derived from PIM that more effectively addressed energy efficiency for high performance scientific, data analytics, and neuromorphic applications. We also originally planned to implement a von Neumann architecture with arithmetic/logic units (ALUs) that matched the power consumption of an advanced storage array to maximize energy efficiency. Implementing this architecture in storage was our original idea, since by augmenting storage (instead of memory), the system could address both in-memory computation and applications that accessed larger data sets directly from storage, hence Processing-in-Memory-and-Storage (PIMS). However, as our research matured, we discovered several things that changed our original direction, the most important being that a PIM that implements a standard von Neumann-type architecture results in significant energy efficiency improvement, but only about an O(10) performance improvement. In addition to this, the emergence of new memory technologies moved us to proposing a non-von Neumann architecture, called Superstrider, implemented not in storage, but in a new DRAM technology called High Bandwidth Memory (HBM). HBM is a stacked DRAM technology that includes a logic layer where an architecture such as Superstrider could potentially be implemented.

  8. CMOL/CMOS hardware architectures and performance/price for Bayesian memory - The building block of intelligent systems

    Science.gov (United States)

    Zaveri, Mazad Shaheriar

    The semiconductor/computer industry has been following Moore's law for several decades and has reaped the benefits in speed and density of the resultant scaling. Transistor density has reached almost one billion per chip, and transistor delays are in picoseconds. However, scaling has slowed down, and the semiconductor industry is now facing several challenges. Hybrid CMOS/nano technologies, such as CMOL, are considered as an interim solution to some of the challenges. Another potential architectural solution includes specialized architectures for applications/models in the intelligent computing domain, one aspect of which includes abstract computational models inspired from the neuro/cognitive sciences. Consequently in this dissertation, we focus on the hardware implementations of Bayesian Memory (BM), which is a (Bayesian) Biologically Inspired Computational Model (BICM). This model is a simplified version of George and Hawkins' model of the visual cortex, which includes an inference framework based on Judea Pearl's belief propagation. We then present a "hardware design space exploration" methodology for implementing and analyzing the (digital and mixed-signal) hardware for the BM. This particular methodology involves: analyzing the computational/operational cost and the related micro-architecture, exploring candidate hardware components, proposing various custom hardware architectures using both traditional CMOS and hybrid nanotechnology - CMOL, and investigating the baseline performance/price of these architectures. The results suggest that CMOL is a promising candidate for implementing a BM. Such implementations can utilize the very high density storage/computation benefits of these new nano-scale technologies much more efficiently; for example, the throughput per 858 mm² (TPM) obtained for CMOL based architectures is 32 to 40 times better than the TPM for a CMOS based multiprocessor/multi-FPGA system, and almost 2000 times better than the TPM for a PC

  9. Architectural Techniques to Enable Reliable and Scalable Memory Systems

    OpenAIRE

    Nair, Prashant J.

    2017-01-01

    High capacity and scalable memory systems play a vital role in enabling our desktops, smartphones, and pervasive technologies like Internet of Things (IoT). Unfortunately, memory systems are becoming increasingly prone to faults. This is because we rely on technology scaling to improve memory density, and at small feature sizes, memory cells tend to break easily. Today, memory reliability is seen as the key impediment towards using high-density devices, adopting new technologies, and even bui...

  10. Impact of Cognitive Architectures on Human-Computer Interaction

    Science.gov (United States)

    2014-09-01

    activation, reinforced learning, emotion, semantic memory, episodic memory, and visual imagery.12 In 2010 Rosenbloom created a variant of the Soar...being added to almost every new version. In 2004 Nuxoll and Laird added episodic memory to the Soar architecture.11 In 2008 Laird presented...York (NY): Psychology Press; 2014; p. 1–50. 11. Nuxoll A, Laird JE. A cognitive model of episodic memory integrated with a general cognitive

  11. Flexible NAND-Like Organic Ferroelectric Memory Array

    NARCIS (Netherlands)

    Kam, B.; Ke, T.H.; Chasin, A.; Tyagi, M.; Cristoferi, C.; Tempelaars, K.; Breemen, A.J.J.M. van; Myny, K.; Schols, S.; Genoe, J.; Gelinck, G.H.; Heremans, P.

    2014-01-01

    We present a memory array of organic ferroelectric field-effect transistors (OFeFETs) on flexible substrates. The OFeFETs are connected serially, similar to the NAND architecture of flash memory, which offers the highest memory density of transistor memories. We demonstrate a reliable addressing

  12. Phase change memory

    CERN Document Server

    Qureshi, Moinuddin K

    2011-01-01

    As conventional memory technologies such as DRAM and Flash run into scaling challenges, architects and system designers are forced to look at alternative technologies for building future computer systems. This synthesis lecture begins by listing the requirements for a next generation memory technology and briefly surveys the landscape of novel non-volatile memories. Among these, Phase Change Memory (PCM) is emerging as a leading contender, and the authors discuss the material, device, and circuit advances underlying this exciting technology. The lecture then describes architectural solutions t

  13. Fabry-Perot confocal resonator optical associative memory

    Science.gov (United States)

    Burns, Thomas J.; Rogers, Steven K.; Vogel, George A.

    1993-03-01

    A unique optical associative memory architecture is presented that combines the optical processing environment of a Fabry-Perot confocal resonator with the dynamic storage and recall properties of volume holograms. The confocal resonator reduces the size and complexity of previous associative memory architectures by folding a large number of discrete optical components into an integrated, compact optical processing environment. Experimental results demonstrate the system is capable of recalling a complete object from memory when presented with partial information about the object. A Fourier optics model of the system's operation shows it implements a spatially continuous version of a discrete, binary Hopfield neural network associative memory.

  14. Preventing Out-of-Sequence for Multicast Input-Queued Space-Memory-Memory Clos-Network

    DEFF Research Database (Denmark)

    Yu, Hao; Ruepp, Sarah Renée; Berger, Michael Stübert

    2011-01-01

    This paper proposes an out-of-sequence (OOS) preventative cell dispatching algorithm, the multicast flow-based round robin (MFRR), for multicast input-queued space-memory-memory (IQ-SMM) Clos-network architecture. Independently treating each incoming cell, such as the desynchronized static round...

  15. The neural architecture of music-evoked autobiographical memories.

    Science.gov (United States)

    Janata, Petr

    2009-11-01

    The medial prefrontal cortex (MPFC) is regarded as a region of the brain that supports self-referential processes, including the integration of sensory information with self-knowledge and the retrieval of autobiographical information. I used functional magnetic resonance imaging and a novel procedure for eliciting autobiographical memories with excerpts of popular music dating to one's extended childhood to test the hypothesis that music and autobiographical memories are integrated in the MPFC. Dorsal regions of the MPFC (Brodmann area 8/9) were shown to respond parametrically to the degree of autobiographical salience experienced over the course of individual 30 s excerpts. Moreover, the dorsal MPFC also responded on a second, faster timescale corresponding to the signature movements of the musical excerpts through tonal space. These results suggest that the dorsal MPFC associates music and memories when we experience emotionally salient episodic memories that are triggered by familiar songs from our personal past. MPFC acted in concert with lateral prefrontal and posterior cortices both in terms of tonality tracking and overall responsiveness to familiar and autobiographically salient songs. These findings extend the results of previous autobiographical memory research by demonstrating the spontaneous activation of an autobiographical memory network in a naturalistic task with low retrieval demands.

  16. The Neural Architecture of Music-Evoked Autobiographical Memories

    Science.gov (United States)

    2009-01-01

    The medial prefrontal cortex (MPFC) is regarded as a region of the brain that supports self-referential processes, including the integration of sensory information with self-knowledge and the retrieval of autobiographical information. I used functional magnetic resonance imaging and a novel procedure for eliciting autobiographical memories with excerpts of popular music dating to one's extended childhood to test the hypothesis that music and autobiographical memories are integrated in the MPFC. Dorsal regions of the MPFC (Brodmann area 8/9) were shown to respond parametrically to the degree of autobiographical salience experienced over the course of individual 30 s excerpts. Moreover, the dorsal MPFC also responded on a second, faster timescale corresponding to the signature movements of the musical excerpts through tonal space. These results suggest that the dorsal MPFC associates music and memories when we experience emotionally salient episodic memories that are triggered by familiar songs from our personal past. MPFC acted in concert with lateral prefrontal and posterior cortices both in terms of tonality tracking and overall responsiveness to familiar and autobiographically salient songs. These findings extend the results of previous autobiographical memory research by demonstrating the spontaneous activation of an autobiographical memory network in a naturalistic task with low retrieval demands. PMID:19240137

  17. Quantum random access memory

    OpenAIRE

    Giovannetti, Vittorio; Lloyd, Seth; Maccone, Lorenzo

    2007-01-01

    A random access memory (RAM) uses n bits to randomly address N=2^n distinct memory cells. A quantum random access memory (qRAM) uses n qubits to address any quantum superposition of N memory cells. We present an architecture that exponentially reduces the requirements for a memory call: O(log N) switches need be thrown instead of the N used in conventional (classical or quantum) RAM designs. This yields a more robust qRAM algorithm, as it in general requires entanglement among exponentially l...
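
    The address-routing idea in this abstract can be illustrated classically. The following is a minimal sketch, assuming a binary routing tree in which each address bit sets one switch on the path from the root to a memory cell; it does not model quantum superposition, and the function names are hypothetical. It only contrasts the n = log2(N) switches on one path with the N - 1 switches in the whole tree.

```python
def route_to_cell(address_bits):
    """Walk a binary routing tree from the root to one leaf.

    Returns (cell_index, switches_thrown). Only the switches on the
    root-to-leaf path are touched: one per address bit.
    """
    index = 0
    switches_thrown = 0
    for bit in address_bits:        # most significant bit first
        index = 2 * index + bit     # 0 routes left, 1 routes right
        switches_thrown += 1
    return index, switches_thrown


def total_tree_switches(n_bits):
    """Number of internal nodes (switches) in a full binary tree with 2**n_bits leaves."""
    return 2 ** n_bits - 1


if __name__ == "__main__":
    bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]   # n = 10, so N = 1024 cells
    cell, thrown = route_to_cell(bits)
    print("cell:", cell)
    print("switches on path:", thrown)
    print("switches in whole tree:", total_tree_switches(len(bits)))
```

    For n = 10 the path touches 10 switches, while the tree as a whole contains 1023, which is the order-N cost the abstract attributes to conventional designs.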

  18. A Bandwidth-Optimized Multi-Core Architecture for Irregular Applications

    Energy Technology Data Exchange (ETDEWEB)

    Secchi, Simone; Tumeo, Antonino; Villa, Oreste

    2012-05-31

    This paper presents an architecture template for next-generation high performance computing systems specifically targeted to irregular applications. We start our work by considering that future generation interconnection and memory bandwidth full-system numbers are expected to grow by a factor of 10. In order to keep up with such a communication capacity, while still resorting to fine-grained multithreading as the main way to tolerate unpredictable memory access latencies of irregular applications, we show how overall performance scaling can benefit from the multi-core paradigm. At the same time, we also show how such an architecture template must be coupled with specific techniques in order to optimize bandwidth utilization and achieve the maximum scalability. We propose a technique based on memory references aggregation, together with the related hardware implementation, as one of such optimization techniques. We explore the proposed architecture template by focusing on the Cray XMT architecture and, using a dedicated simulation infrastructure, validate the performance of our template with two typical irregular applications. Our experimental results prove the benefits provided by both the multi-core approach and the bandwidth optimization reference aggregation technique.
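
    The abstract does not describe the hardware mechanism, so the following is only a software analogy of memory reference aggregation, assuming that small remote references are buffered per destination node and sent as one packet once a buffer fills. The class name, the `max_refs` parameter, and the addresses are hypothetical.

```python
from collections import defaultdict


class ReferenceAggregator:
    """Buffers small memory references per destination node and emits
    one packet when a buffer reaches `max_refs` entries."""

    def __init__(self, max_refs=4):
        self.max_refs = max_refs
        self.buffers = defaultdict(list)
        self.packets = []            # list of (destination, [addresses])

    def issue(self, destination, address):
        buf = self.buffers[destination]
        buf.append(address)
        if len(buf) == self.max_refs:
            self._flush(destination)

    def _flush(self, destination):
        refs = self.buffers.pop(destination, [])
        if refs:
            self.packets.append((destination, refs))

    def drain(self):
        """Send any partially filled buffers and return all packets."""
        for destination in list(self.buffers):
            self._flush(destination)
        return self.packets


if __name__ == "__main__":
    agg = ReferenceAggregator(max_refs=4)
    for i in range(10):                      # 10 references, 2 destinations
        agg.issue(destination=i % 2, address=0x1000 + 8 * i)
    packets = agg.drain()
    print("references issued: 10, packets sent:", len(packets))
```

    In this toy run, ten fine-grained references become four network packets; the trade-off is added latency for references that wait in a partially filled buffer.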

  19. A stacked memory device on logic 3D technology for ultra-high-density data storage

    International Nuclear Information System (INIS)

    Kim, Jiyoung; Hong, Augustin J; Kim, Sung Min; Shin, Kyeong-Sik; Song, Emil B; Hwang, Yongha; Xiu, Faxian; Galatsis, Kosmas; Chui, Chi On; Candler, Rob N; Wang, Kang L; Choi, Siyoung; Moon, Joo-Tae

    2011-01-01

    We have demonstrated, for the first time, a novel three-dimensional (3D) memory chip architecture of stacked-memory-devices-on-logic (SMOL) achieving up to 95% of cell-area efficiency by directly building up memory devices on top of front-end CMOS devices. In order to realize the SMOL, a unique 3D Flash memory device and vertical integration structure have been successfully developed. The SMOL architecture has great potential to achieve tera-bit level memory density by stacking memory devices vertically and maximizing cell-area efficiency. Furthermore, various emerging devices could replace the 3D memory device to develop new 3D chip architectures.

  20. A stacked memory device on logic 3D technology for ultra-high-density data storage

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Jiyoung; Hong, Augustin J; Kim, Sung Min; Shin, Kyeong-Sik; Song, Emil B; Hwang, Yongha; Xiu, Faxian; Galatsis, Kosmas; Chui, Chi On; Candler, Rob N; Wang, Kang L [Device Research Laboratory, Department of Electrical Engineering, University of California, Los Angeles, CA 90095 (United States); Choi, Siyoung; Moon, Joo-Tae, E-mail: hbt100@ee.ucla.edu [Advanced Technology Development Team and Process Development Team, Memory R and D Center, Samsung Electronics Co. Ltd (Korea, Republic of)

    2011-06-24

    We have demonstrated, for the first time, a novel three-dimensional (3D) memory chip architecture of stacked-memory-devices-on-logic (SMOL) achieving up to 95% of cell-area efficiency by directly building up memory devices on top of front-end CMOS devices. In order to realize the SMOL, a unique 3D Flash memory device and vertical integration structure have been successfully developed. The SMOL architecture has great potential to achieve tera-bit level memory density by stacking memory devices vertically and maximizing cell-area efficiency. Furthermore, various emerging devices could replace the 3D memory device to develop new 3D chip architectures.

  1. Key Technologies of Phone Storage Forensics Based on ARM Architecture

    Science.gov (United States)

    Zhang, Jianghan; Che, Shengbing

    2018-03-01

    Smartphones mainly run three mobile operating systems: Android, iOS, and Windows Phone. Android smartphones have the largest market share, and their processor chips are almost all based on the ARM architecture. The memory address mapping mechanism of ARM chips differs from that of the x86 architecture. To perform forensics on an Android smartphone, we need to understand three key technologies: memory data acquisition, the conversion mechanism from virtual addresses to physical addresses, and locating the system’s key data. This article presents a viable solution to these three issues that does not rely on the operating system API.

  2. Embedded memory design for multi-core and systems on chip

    CERN Document Server

    Mohammad, Baker

    2014-01-01

    This book describes the various tradeoffs systems designers face when designing embedded memory. Readers designing multi-core systems and systems on chip will benefit from the discussion of different topics from memory architecture, array organization, circuit design techniques and design for test. The presentation enables a multi-disciplinary approach to chip design, which bridges the gap between the architecture level and circuit level, in order to address yield, reliability and power-related issues for embedded memory. · Provides a comprehensive overview of embedded memory design and associated challenges and choices; · Explains tradeoffs and dependencies across different disciplines involved with multi-core and system on chip memory design; · Includes detailed discussion of memory hierarchy and its impact on energy and performance; · Uses real product examples to demonstrate embedded memory design flow from architecture, to circuit ...

  3. MC 68020 μp architecture

    International Nuclear Information System (INIS)

    Casals, O.; Dejuan, E.; Labarta, J.

    1988-01-01

    The MC68020 is a 32-bit microprocessor that is object-code compatible with the earlier MC68000 and MC68010. In this paper we describe its architecture and two coprocessors: the MC68851 paged memory management unit and the MC68882 floating point coprocessor. Its most important characteristics include addressing mode extensions for enhanced support of high level languages, an on-chip instruction cache, and full support of virtual memory. (Author)

  4. Processor-in-memory-and-storage architecture

    Science.gov (United States)

    DeBenedictis, Erik

    2018-01-02

    A method and apparatus for performing reliable general-purpose computing. Each sub-core of a plurality of sub-cores of a processor core processes a same instruction at a same time. A code analyzer receives a plurality of residues that represents a code word corresponding to the same instruction and an indication of whether the code word is a memory address code or a data code from the plurality of sub-cores. The code analyzer determines whether the plurality of residues are consistent or inconsistent. The code analyzer and the plurality of sub-cores perform a set of operations based on whether the code word is a memory address code or a data code and a determination of whether the plurality of residues are consistent or inconsistent.

  5. Peer deviance, alcohol expectancies, and adolescent alcohol use: explaining shared and nonshared environmental effects using an adoptive sibling pair design.

    Science.gov (United States)

    Samek, Diana R; Keyes, Margaret A; Iacono, William G; McGue, Matt

    2013-07-01

    Previous research suggests adolescent alcohol use is largely influenced by environmental factors, yet little is known about the specific nature of this influence. We hypothesized that peer deviance and alcohol expectancies would be sources of environmental influence because both have been consistently and strongly correlated with adolescent alcohol use. The sample included 206 genetically related and 407 genetically unrelated sibling pairs assessed in mid-to-late adolescence. The heritability of adolescent alcohol use (e.g., frequency, quantity last 12 months) was minimal and not significantly different from zero. The associations among peer deviance, alcohol expectancies, and alcohol use were primarily due to shared environmental factors. Of special note, alcohol expectancies also significantly explained nonshared environmental influence on alcohol use. This study is one of few that have identified specific environmental variants of adolescent alcohol use while controlling for genetic influence.

  6. Roofline model toolkit: A practical tool for architectural and program analysis

    Energy Technology Data Exchange (ETDEWEB)

    Lo, Yu Jung [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Williams, Samuel [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Van Straalen, Brian [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Ligocki, Terry J. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Cordery, Matthew J. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Wright, Nicholas J. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Hall, Mary W. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Oliker, Leonid [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2015-04-18

    We present preliminary results of the Roofline Toolkit for multicore, manycore, and accelerated architectures. This paper focuses on the processor architecture characterization engine, a collection of portable instrumented microbenchmarks implemented with Message Passing Interface (MPI), and OpenMP used to express thread-level parallelism. These benchmarks are specialized to quantify the behavior of different architectural features. Compared to previous work on performance characterization, these microbenchmarks focus on capturing the performance of each level of the memory hierarchy, along with thread-level parallelism, instruction-level parallelism and explicit SIMD parallelism, measured in the context of the compilers and run-time environments. We also measure sustained PCIe throughput with four GPU memory management mechanisms. By combining results from the architecture characterization with the Roofline model based solely on architectural specifications, this work offers insights for performance prediction of current and future architectures and their software systems. To that end, we instrument three applications and plot their resultant performance on the corresponding Roofline model when run on a Blue Gene/Q architecture.
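
    For readers unfamiliar with the Roofline model named in this abstract, its basic bound is that attainable performance is the smaller of peak compute throughput and peak memory bandwidth multiplied by arithmetic intensity (FLOPs per byte moved). The sketch below shows only that formula; the machine numbers are hypothetical and are not taken from the paper or from any Blue Gene/Q measurement.

```python
def roofline_gflops(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Attainable GFLOP/s = min(peak compute, bandwidth * arithmetic intensity)."""
    return min(peak_gflops, peak_bandwidth_gbs * arithmetic_intensity)


def ridge_point(peak_gflops, peak_bandwidth_gbs):
    """Arithmetic intensity (FLOPs/byte) at which a kernel stops being memory-bound."""
    return peak_gflops / peak_bandwidth_gbs


if __name__ == "__main__":
    PEAK_GFLOPS = 200.0     # hypothetical peak compute rate
    PEAK_BW_GBS = 25.0      # hypothetical peak memory bandwidth in GB/s
    print("ridge point:", ridge_point(PEAK_GFLOPS, PEAK_BW_GBS), "FLOPs/byte")
    for ai in (0.25, 1.0, 8.0, 32.0):
        bound = roofline_gflops(ai, PEAK_GFLOPS, PEAK_BW_GBS)
        print(f"AI={ai:>5} FLOPs/byte -> at most {bound} GFLOP/s")
```

    Kernels to the left of the ridge point are limited by memory bandwidth, and kernels to the right are limited by compute; measured microbenchmark peaks can be substituted for the specification-sheet values.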

  7. Architecture for robot intelligence

    Science.gov (United States)

    Peters, II, Richard Alan (Inventor)

    2004-01-01

    An architecture for robot intelligence enables a robot to learn new behaviors and create new behavior sequences autonomously and interact with a dynamically changing environment. Sensory information is mapped onto a Sensory Ego-Sphere (SES) that rapidly identifies important changes in the environment and functions much like short term memory. Behaviors are stored in a DBAM that creates an active map from the robot's current state to a goal state and functions much like long term memory. A dream state converts recent activities stored in the SES and creates or modifies behaviors in the DBAM.

  8. Emerging Non-volatile Memory Technologies Exploration Flow for Processor Architecture

    OpenAIRE

    Senni, Sophiane; Torres, Lionel; Sassatelli, Gilles; Gamatié, Abdoulaye; Mussard, Bruno

    2015-01-01

    Most die area of today's systems-on-chips is occupied by memories. Hence, a significant proportion of total power is spent on memory systems. Moreover, since processing elements have to be fed with instructions and data from memories, memory plays a key role in system performance. As a result, memories are a critical part of future embedded systems. Continuing CMOS scaling leads to manufacturing constraints and power consumption issues for the current three main mem...

  9. Utilizing a multiprocessor architecture - The performance of MIDAS

    International Nuclear Information System (INIS)

    Maples, C.; Logan, D.; Meng, J.; Rathbun, W.; Weaver, D.

    1983-01-01

    The MIDAS architecture organizes multiple CPUs into clusters called distributed subsystems. Each subsystem consists of an array of processors controlled by a supervisory CPU. The multiprocessor array is composed of commercial CPUs (with floating point hardware) and specialized processing elements. Interprocessor communication within the array may occur either through switched memory modules or common shared memory. The architecture permits multiple processors to be focused on single problems. A distributed subsystem has been constructed and tested. It currently consists of a supervisor CPU; 16 blocks of independently switchable memory; 9 general purpose, VAX-class CPUs; and 2 specialized pipelined processors to handle I/O. Results on a variety of problems indicate that the subsystem performs 8 to 15 times faster than a standard computer with an identical CPU. The difference in performance represents the effect of differing CPU and I/O requirements

  10. Program Execution on Reconfigurable Multicore Architectures

    Directory of Open Access Journals (Sweden)

    Sanjiva Prasad

    2016-06-01

    Based on the two observations that diverse applications perform better on different multicore architectures, and that different phases of an application may have vastly different resource requirements, Pal et al. proposed a novel reconfigurable hardware approach for executing multithreaded programs. Instead of mapping a concurrent program to a fixed architecture, the architecture adaptively reconfigures itself to meet the application's concurrency and communication requirements, yielding significant improvements in performance. Based on our earlier abstract operational framework for multicore execution with hierarchical memory structures, we describe execution of multithreaded programs on reconfigurable architectures that support a variety of clustered configurations. Such reconfiguration may not preserve the semantics of programs due to the possible introduction of race conditions arising from concurrent accesses to shared memory by threads running on the different cores. We present an intuitive partial ordering notion on the cluster configurations, and show that the semantics of multithreaded programs is always preserved for reconfigurations "upward" in that ordering, whereas semantics preservation for arbitrary reconfigurations can be guaranteed for well-synchronised programs. We further show that a simple approximate notion of efficiency of execution on the different configurations can be obtained using the notion of amortised bisimulations, and extend it to dynamic reconfiguration.

  11. Low-Power Architectures for Large Radio Astronomy Correlators

    Science.gov (United States)

    D'Addario, Larry R.

    2011-01-01

    The architecture of a cross-correlator for a synthesis radio telescope with N greater than 1000 antennas is studied with the objective of minimizing power consumption. It is found that the optimum architecture minimizes memory operations, and this implies preference for a matrix structure over a pipeline structure and avoiding the use of memory banks as accumulation registers when sharing multiply-accumulators among baselines. A straw-man design for N = 2000 and bandwidth of 1 GHz, based on ASICs fabricated in a 90 nm CMOS process, is presented. The cross-correlator proper (excluding per-antenna processing) is estimated to consume less than 35 kW.

  12. Out-of-Sequence Preventative Cell Dispatching for Multicast Input-Queued Space-Memory-Memory Clos-Network

    DEFF Research Database (Denmark)

    Yu, Hao; Ruepp, Sarah Renée; Berger, Michael Stübert

    2011-01-01

    This paper proposes two out-of-sequence (OOS) preventative cell dispatching algorithms for the multicast input-queued space-memory-memory (IQ-SMM) Clos-network switch architecture, i.e. the multicast flow-based DSRR (MF-DSRR) and the multicast flow-based round-robin (MFRR). Treating each cell...

  13. Real-time FPGA architectures for computer vision

    Science.gov (United States)

    Arias-Estrada, Miguel; Torres-Huitzil, Cesar

    2000-03-01

    This paper presents an architecture for real-time generic convolution of a mask and an image. The architecture is intended for fast low level image processing. The FPGA-based architecture takes advantage of the availability of registers in FPGAs to implement an efficient and compact module to process the convolutions. The architecture is designed to minimize the number of accesses to the image memory and is based on parallel modules with internal pipeline operation in order to improve its performance. The architecture is prototyped in an FPGA, but it can be implemented on a dedicated VLSI to reach higher clock frequencies. Complexity issues, FPGA resource utilization, FPGA limitations, and real time performance are discussed. Some results are presented and discussed.
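
    One common way to minimize image-memory accesses in a convolution engine is to stream the image row by row and keep only the last K rows in a small buffer, so each pixel is fetched once. The sketch below is a software illustration of that general idea under the assumption of a square K x K mask; it is not the paper's FPGA design, and the function name is hypothetical.

```python
from collections import deque


def stream_convolve(image, mask):
    """Valid-region 2D convolution that reads each image row once,
    holding only the last K rows in a line buffer (K x K mask assumed).

    Returns (output_rows, rows_read_from_image_memory).
    """
    k = len(mask)
    width = len(image[0])
    # True convolution flips the mask in both directions.
    flipped = [row[::-1] for row in mask[::-1]]
    line_buffer = deque(maxlen=k)    # oldest row drops out automatically
    rows_read = 0
    output = []
    for row in image:                # single pass over image memory
        line_buffer.append(list(row))
        rows_read += 1
        if len(line_buffer) < k:
            continue                 # not enough rows buffered yet
        out_row = []
        for x in range(width - k + 1):
            acc = 0
            for dy in range(k):
                for dx in range(k):
                    acc += flipped[dy][dx] * line_buffer[dy][x + dx]
            out_row.append(acc)
        output.append(out_row)
    return output, rows_read


if __name__ == "__main__":
    image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    result, reads = stream_convolve(image, identity)
    print(result, "rows read:", reads)
```

    In hardware, the inner multiply-accumulate loops would typically run in parallel on registers, which is where the buffer of K rows pays off; the Python loops here only show the data flow.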

  14. MulticoreBSP for C : A high-performance library for shared-memory parallel programming

    NARCIS (Netherlands)

    Yzelman, A. N.; Bisseling, R. H.; Roose, D.; Meerbergen, K.

    2014-01-01

    The bulk synchronous parallel (BSP) model, as well as parallel programming interfaces based on BSP, classically target distributed-memory parallel architectures. In earlier work, Yzelman and Bisseling designed a MulticoreBSP for Java library specifically for shared-memory architectures. In the

  15. Non-volatile memory based on the ferroelectric photovoltaic effect

    Science.gov (United States)

    Guo, Rui; You, Lu; Zhou, Yang; Shiuh Lim, Zhi; Zou, Xi; Chen, Lang; Ramesh, R.; Wang, Junling

    2013-01-01

    The quest for a solid state universal memory with high-storage density, high read/write speed, random access and non-volatility has triggered intense research into new materials and novel device architectures. Though the non-volatile memory market is dominated by flash memory now, it has very low operation speed with ~10 μs programming and ~10 ms erasing time. Furthermore, it can only withstand ~10^5 rewriting cycles, which prevents it from becoming the universal memory. Here we demonstrate that the significant photovoltaic effect of a ferroelectric material, such as BiFeO3 with a band gap in the visible range, can be used to sense the polarization direction non-destructively in a ferroelectric memory. A prototype 16-cell memory based on the cross-bar architecture has been prepared and tested, demonstrating the feasibility of this technique. PMID:23756366

  16. Architecture for Multiple Interacting Robot Intelligences

    Science.gov (United States)

    Peters, Richard Alan, II (Inventor)

    2008-01-01

    An architecture for robot intelligence enables a robot to learn new behaviors and create new behavior sequences autonomously and interact with a dynamically changing environment. Sensory information is mapped onto a Sensory Ego-Sphere (SES) that rapidly identifies important changes in the environment and functions much like short term memory. Behaviors are stored in a database associative memory (DBAM) that creates an active map from the robot's current state to a goal state and functions much like long term memory. A dream state converts recent activities stored in the SES and creates or modifies behaviors in the DBAM.

  17. Skin-Inspired Haptic Memory Arrays with an Electrically Reconfigurable Architecture.

    Science.gov (United States)

    Zhu, Bowen; Wang, Hong; Liu, Yaqing; Qi, Dianpeng; Liu, Zhiyuan; Wang, Hua; Yu, Jiancan; Sherburne, Matthew; Wang, Zhaohui; Chen, Xiaodong

    2016-02-24

    Skin-inspired haptic-memory devices, which can retain pressure information after the removal of external pressure by virtue of the nonvolatile nature of the memory devices, are achieved. The rise of haptic-memory devices will allow for mimicry of human sensory memory, opening new avenues for the design of next-generation high-performance sensing devices and systems. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  18. Architectures of electro-optical packet switched networks

    DEFF Research Database (Denmark)

    Berger, Michael Stubert

    2004-01-01

    and examines possible architectures for future high capacity networks with high capacity nodes. It is assumed that optics will play a key role in this scenario, and in this respect, the European IST research project DAVID aimed at proposing viable architectures for optical packet switching, exploiting the best...... from optics and electronics. An overview of the DAVID network architecture is given, focusing on the MAN and WAN architecture as well as the MPLS based network hierarchy. A statistical model of the optical slot generation process is presented and utilised to evaluate delay vs. efficiency. Furthermore...... architecture for a buffered crossbar switch is presented. The architecture uses two levels of backpressure (flow control) with different constraints on round trip time. No additional scheduling complexity is introduced, and for the actual example shown, a reduction in memory of 75% was obtained at the cost...

  19. Reflective memory recorder upgrade: an opportunity to benchmark PowerPC and Intel architectures for real time

    Science.gov (United States)

    Abuter, Roberto; Tischer, Helmut; Frahm, Robert

    2014-07-01

    Several high frequency loops are required to run the VLTI (Very Large Telescope Interferometer) [2], e.g. for fringe tracking [11, 5], angle tracking, vibration cancellation, data capture. All these loops rely on low latency real time computers based on the VME bus, Motorola PowerPC [14] hardware architecture. In this context, one highly demanding application in terms of cycle time, latency and data transfer volume is the VLTI centralized recording facility, the so-called RMN recorder [1] (Reflective Memory Recorder). This application captures and transfers data flowing through the distributed memory of the system in real time. Some of the VLTI data producers are running with frequencies up to 8 kHz. With the evolution from first generation instruments like MIDI [3], PRIMA [5], and AMBER [4] which use one or two baselines, to second generation instruments like MATISSE [10] and GRAVITY [9] which will use all six baselines simultaneously, the quantity of signals has increased by, at least, a factor of six. This has led to a significant overload of the RMN recorder [1] which has reached the natural limits imposed by the underlying hardware. At the same time, new, more powerful computers, based on the Intel multicore families of CPUs and PCI buses have become available. With the purpose of improving the performance of the RMN recorder [1] application and in order to make it capable of coping with the demands of the new generation instruments, a slightly modified implementation has been developed and integrated into an Intel based multicore computer [15] running the VxWorks [17] real time operating system. The core of the application is based on the standard VLT software framework for instruments [13]. The real time task reads from the reflective memory using the onboard DMA access [12] and captured data is transferred to the outside world via a TCP socket on a dedicated Ethernet connection. The diversity of the software and hardware that are involved makes this application suitable as a benchmarking platform. A

  20. Efficient Machine Learning Approach for Optimizing Scientific Computing Applications on Emerging HPC Architectures

    Energy Technology Data Exchange (ETDEWEB)

    Arumugam, Kamesh [Old Dominion Univ., Norfolk, VA (United States)

    2017-05-01

    Efficient parallel implementation of scientific applications on multi-core CPUs with accelerators such as GPUs and Xeon Phis is challenging. This requires exploiting the data parallel architecture of the accelerator along with the vector pipelines of modern x86 CPU architectures, load balancing, and efficient memory transfer between different devices. It is relatively easy to meet these requirements for highly structured scientific applications. In contrast, a number of scientific and engineering applications are unstructured. Getting performance on accelerators for these applications is extremely challenging because many of these applications employ irregular algorithms which exhibit data-dependent control-flow and irregular memory accesses. Furthermore, these applications are often iterative with dependency between steps, making it hard to parallelize across steps. As a result, parallelism in these applications is often limited to a single step. Numerical simulation of charged particle beam dynamics is one such application where the distribution of work and memory access pattern at each time step is irregular. Applications with these properties tend to present significant branch and memory divergence, load imbalance between different processor cores, and poor compute and memory utilization. Prior research on parallelizing such irregular applications has focused on optimizing the irregular, data-dependent memory accesses and control-flow during a single step of the application independent of the other steps, with the assumption that these patterns are completely unpredictable. We observed that the structure of computation leading to control-flow divergence and irregular memory accesses in one step is similar to that in the next step. It is possible to predict this structure in the current step by observing the computation structure of previous steps. In this dissertation, we present novel machine learning based optimization techniques to address

  1. Raexplore: Enabling Rapid, Automated Architecture Exploration for Full Applications

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Yao [Argonne National Lab. (ANL), Argonne, IL (United States); Balaprakash, Prasanna [Argonne National Lab. (ANL), Argonne, IL (United States); Meng, Jiayuan [Argonne National Lab. (ANL), Argonne, IL (United States); Morozov, Vitali [Argonne National Lab. (ANL), Argonne, IL (United States); Parker, Scott [Argonne National Lab. (ANL), Argonne, IL (United States); Kumaran, Kalyan [Argonne National Lab. (ANL), Argonne, IL (United States)

    2014-12-01

    We present Raexplore, a performance modeling framework for architecture exploration. Raexplore enables rapid, automated, and systematic search of architecture design space by combining hardware counter-based performance characterization and analytical performance modeling. We demonstrate Raexplore for two recent manycore processors, the IBM Blue Gene/Q compute chip and the Intel Xeon Phi, targeting a set of scientific applications. Our framework is able to capture complex interactions between architectural components including instruction pipeline, cache, and memory, and to achieve a 3–22% error for same-architecture and cross-architecture performance predictions. Furthermore, we apply our framework to assess the two processors, and discover and evaluate a list of architectural scaling options for future processor designs.

  2. The neural architecture of music-evoked autobiographical memories

    OpenAIRE

    Janata, P

    2009-01-01

    The medial prefrontal cortex (MPFC) is regarded as a region of the brain that supports self-referential processes, including the integration of sensory information with self-knowledge and the retrieval of autobiographical information. I used functional magnetic resonance imaging and a novel procedure for eliciting autobiographical memories with excerpts of popular music dating to one's extended childhood to test the hypothesis that music and autobiographical memories are integrated in the MPF...

  3. Array processor architecture

    Science.gov (United States)

    Barnes, George H. (Inventor); Lundstrom, Stephen F. (Inventor); Shafer, Philip E. (Inventor)

    1983-01-01

    A high speed parallel array data processing architecture fashioned under a computational envelope approach includes a data base memory for secondary storage of programs and data, and a plurality of memory modules interconnected to a plurality of processing modules by a connection network of the Omega gender. Programs and data are fed from the data base memory to the plurality of memory modules, and from there the programs are fed through the connection network to the array of processors (one copy of each program for each processor). Execution of the programs occurs with the processors operating normally quite independently of each other in a multiprocessing fashion. For data dependent operations and other suitable operations, all processors are instructed to finish one given task or program branch before all are instructed to proceed in parallel processing fashion on the next instruction. Even when functioning in the parallel processing mode, however, the processors are not in lock-step but execute their own copy of the program individually unless or until another overall processor array synchronization instruction is issued.
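    The synchronization behaviour described in this record can be pictured with a small sketch. It is not the patented design: it only assumes that the "overall processor array synchronization instruction" acts like a barrier, and it uses Python threads as stand-ins for processing modules. The names `run_array`, `processor`, and `NUM_PROCESSORS` are hypothetical.

```python
import threading

NUM_PROCESSORS = 4  # hypothetical array size, chosen for illustration


def run_array(num_processors=NUM_PROCESSORS):
    """Run independent workers that meet at one array-wide barrier."""
    barrier = threading.Barrier(num_processors)
    partial = [0] * num_processors  # one result slot per worker
    totals = [0] * num_processors

    def processor(pid):
        # Phase 1: each worker executes its own copy of the program
        # independently, with no lock-step between workers.
        partial[pid] = sum(range(pid * 100, (pid + 1) * 100))
        # Array-wide synchronization: every worker must finish phase 1
        # before any worker is allowed to continue.
        barrier.wait()
        # Phase 2: all partial results are now complete and safe to read.
        totals[pid] = sum(partial)

    threads = [threading.Thread(target=processor, args=(p,))
               for p in range(num_processors)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals


totals = run_array()
```

    Without the barrier, a fast worker could read slots in `partial` that slower workers have not yet written, which is the kind of data-dependent hazard the synchronization instruction is meant to rule out.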

  4. Time-Predictable Virtual Memory

    DEFF Research Database (Denmark)

    Puffitsch, Wolfgang; Schoeberl, Martin

    2016-01-01

    Virtual memory is an important feature of modern computer architectures. For hard real-time systems, memory protection is a particularly interesting feature of virtual memory. However, current memory management units are not designed for time-predictability and therefore cannot be used...... in such systems. This paper investigates the requirements on virtual memory from the perspective of hard real-time systems and presents the design of a time-predictable memory management unit. Our evaluation shows that the proposed design can be implemented efficiently. The design allows address translation...... and address range checking in constant time of two clock cycles on a cache miss. This constant time is in strong contrast to the possible cost of a miss in a translation look-aside buffer in traditional virtual memory organizations. Compared to a platform without a memory management unit, these two additional...

  5. Architecture, landscape architecture and interior- Hons B 2009

    CSIR Research Space (South Africa)

    Osman, A

    2010-03-01

    will be as follows: 1. History of Urban Form 2. Urban Renewal and Reactions 3. Urban Order, Security and Power 4. Colonial Impact on Urban Form 5. Memory and Conservation 6. Considering the Public and Private Realm 7. Housing and Urban Form - Type, Poetics 8....e. "interior design" / "interior architecture"). Interior design is the reaction to "found" space and follows three modes of production: installation, insertion and intervention. Architectural theory pertinent to the discipline's ontology...

  6. Designing Next Generation Massively Multithreaded Architectures for Irregular Applications

    Energy Technology Data Exchange (ETDEWEB)

    Tumeo, Antonino; Secchi, Simone; Villa, Oreste

    2012-08-31

    Irregular applications, such as data mining or graph-based computations, show unpredictable memory/network access patterns and control structures. Massively multi-threaded architectures with large node count, like the Cray XMT, have been shown to address their requirements better than commodity clusters. In this paper we present the approaches that we are currently pursuing to design future generations of these architectures. First, we introduce the Cray XMT and compare it to other multithreaded architectures. We then propose an evolution of the architecture, integrating multiple cores per node and next generation network interconnect. We advocate the use of hardware support for remote memory reference aggregation to optimize network utilization. For this evaluation we developed a highly parallel, custom simulation infrastructure for multi-threaded systems. Our simulator executes unmodified XMT binaries with very large datasets, capturing effects due to contention and hot-spotting, while predicting execution times with greater than 90% accuracy. We also discuss the FPGA prototyping approach that we are employing to study efficient support for irregular applications in next generation manycore processors.

  7. Exploring Hardware-Based Primitives to Enhance Parallel Security Monitoring in a Novel Computing Architecture

    National Research Council Canada - National Science Library

    Mott, Stephen

    2007-01-01

    .... In doing this, we propose a novel computing architecture, derived from a contemporary shared memory architecture, that facilitates efficient security-related monitoring in real-time, while keeping...

  8. Memory Efficient VLSI Implementation of Real-Time Motion Detection System Using FPGA Platform

    Directory of Open Access Journals (Sweden)

    Sanjay Singh

    2017-06-01

    Motion detection is the heart of a potentially complex automated video surveillance system, intended to be used as a standalone system. Therefore, in addition to being accurate and robust, a successful motion detection technique must also be economical in the use of computational resources on the selected FPGA development platform. This is because many other complex algorithms of an automated video surveillance system also run on the same platform. Keeping this key requirement as the main focus, a memory efficient VLSI architecture for real-time motion detection and its implementation on an FPGA platform is presented in this paper. This is accomplished by proposing a new memory efficient motion detection scheme and designing its VLSI architecture. The complete real-time motion detection system using the proposed memory efficient architecture along with proper input/output interfaces is implemented on the Xilinx ML510 (Virtex-5 FX130T) FPGA development platform and is capable of operating at 154.55 MHz clock frequency. The memory requirement of the proposed architecture is reduced by 41% compared to the standard clustering based motion detection architecture. The new memory efficient system robustly and automatically detects motion in real-world scenarios (both for the static backgrounds and the pseudo-stationary backgrounds) in real-time for standard PAL (720 × 576) size color video.

  9. An Innovative Radiation Hardened CAM Architecture

    CERN Document Server

    Shojaii, Seyed Ruhollah; The ATLAS collaboration

    2018-01-01

    This article describes an innovative Content Addressable Memory (CAM) cell with radiation hardened (RH) architecture. The RH-CAM is designed in a commercial 28 nm CMOS technology. The circuit has been simulated in worst-case conditions, and the effects due to single particles have been analyzed by injecting a current pulse into a circuit node. The proposed architecture is suitable for on-time pattern recognition tasks in harsh environments, such as front-end electronics in hadron colliders and in space applications.

  10. Using Runtime Systems Tools to Implement Efficient Preconditioners for Heterogeneous Architectures

    Directory of Open Access Journals (Sweden)

    Roussel Adrien

    2016-11-01

    Solving large sparse linear systems is a time-consuming step in basin modeling or reservoir simulation. The choice of a robust preconditioner strongly impacts the performance of the overall simulation. Heterogeneous architectures based on General Purpose computing on Graphic Processing Units (GPGPU) or many-core architectures introduce programming challenges which can be managed in a way that is transparent to the developer through the use of runtime systems. Nevertheless, algorithms need to be well suited for these massively parallel architectures. In this paper, we present preconditioning techniques which make it possible to take advantage of emerging architectures. We also present our task-based implementations through the use of the HARTS (Heterogeneous Abstract RunTime System) runtime system, which aims to manage the recent architectures. We focus on two preconditioners. The first is an ILU(0) preconditioner implemented on distributed-memory systems. The second is a multi-level domain decomposition method implemented on a shared-memory system. The results obtained are then presented on the corresponding architectures, which opens the way to a discussion of the scalability of such methods with respect to numerical performance, keeping in mind that the next step is to propose massively parallel implementations of these techniques.

  11. Developing a Complete and Effective ACT-R Architecture

    National Research Council Canada - National Science Library

    Anderson, John R; Lebiere, Christian

    2008-01-01

    The Carnegie Mellon University team focused on extending their current cognitive architecture, ACT-R, to show how visual imagery, language, emotion and meta-cognition affect learning, memory, and reasoning...

  12. Impulse: Memory System Support for Scientific Applications

    Directory of Open Access Journals (Sweden)

    John B. Carter

    1999-01-01

    Impulse is a new memory system architecture that adds two important features to a traditional memory controller. First, Impulse supports application‐specific optimizations through configurable physical address remapping. By remapping physical addresses, applications control how their data is accessed and cached, improving their cache and bus utilization. Second, Impulse supports prefetching at the memory controller, which can hide much of the latency of DRAM accesses. Because it requires no modification to processor, cache, or bus designs, Impulse can be adopted in conventional systems. In this paper we describe the design of the Impulse architecture, and show how an Impulse memory system can improve the performance of memory‐bound scientific applications. For instance, Impulse decreases the running time of the NAS conjugate gradient benchmark by 67%. We expect that Impulse will also benefit regularly strided, memory‐bound applications of commercial importance, such as database and multimedia programs.
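    The address remapping idea in this record can be illustrated with a toy model. The sketch below is not the Impulse controller; it only assumes that a remapping presents strided data (here, the diagonal of a row-major matrix) as a dense range, so that a cache-line-sized fetch carries only useful elements. The class `StridedRemap` and its methods are hypothetical names introduced for illustration.

```python
class StridedRemap:
    """Toy model: expose every `stride`-th element of memory as a dense array."""

    def __init__(self, memory, base, stride, count):
        self.memory = memory  # backing store, modelled as a flat list
        self.base = base      # address of the first remapped element
        self.stride = stride  # distance between consecutive elements
        self.count = count    # number of elements in the dense view

    def translate(self, dense_index):
        """Map an index in the dense view to an address in the backing store."""
        if not 0 <= dense_index < self.count:
            raise IndexError("dense index outside the remapped range")
        return self.base + dense_index * self.stride

    def read_line(self, start, length):
        """Gather up to `length` dense elements, as one 'cache line' fill."""
        stop = min(start + length, self.count)
        return [self.memory[self.translate(i)] for i in range(start, stop)]


# A 4x4 row-major matrix stored flat; its diagonal sits at stride 5.
matrix = list(range(16))
diagonal_view = StridedRemap(matrix, base=0, stride=5, count=4)
line = diagonal_view.read_line(0, 4)
```

    In this model a four-element line fill returns the four diagonal entries. Without remapping, the same four entries would be spread over four separate lines, each carrying mostly unused data.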

  13. Towards the Emergence of Procedural Memories from Lifelong Multi-Modal Streaming Memories for Cognitive Robots

    OpenAIRE

    Petit, M; Fischer, T; Demiris, Y

    2016-01-01

    Various research topics are emerging as the demand for intelligent lifelong interactions between robot and humans increases. Among them, we can find the examination of persistent storage, the continuous unsupervised annotation of memories and the usage of data at high-frequency over long periods of time. We recently proposed a lifelong autobiographical memory architecture tackling some of these challenges, allowing the iCub humanoid robot to 1) create new memories for both actions that are se...

  14. Database architecture optimized for the new bottleneck: Memory access

    NARCIS (Netherlands)

    P.A. Boncz (Peter); S. Manegold (Stefan); M.L. Kersten (Martin)

    1999-01-01

    In the past decade, advances in speed of commodity CPUs have far out-paced advances in memory latency. Main-memory access is therefore increasingly a performance bottleneck for many computer applications, including database systems. In this article, we use a simple scan test to show the

  15. Optimizing Database Architecture for the New Bottleneck: Memory Access

    NARCIS (Netherlands)

    S. Manegold (Stefan); P.A. Boncz (Peter); M.L. Kersten (Martin)

    2000-01-01

    In the past decade, advances in speed of commodity CPUs have far out-paced advances in memory latency. Main-memory access is therefore increasingly a performance bottleneck for many computer applications, including database systems. In this article, we use a simple scan test to show the

  16. Multicore technology architecture, reconfiguration, and modeling

    CERN Document Server

    Qadri, Muhammad Yasir

    2013-01-01

    The saturation of design complexity and clock frequencies for single-core processors has resulted in the emergence of multicore architectures as an alternative design paradigm. Nowadays, multicore/multithreaded computing systems are not only a de-facto standard for high-end applications, they are also gaining popularity in the field of embedded computing. The start of the multicore era has altered the concepts relating to almost all of the areas of computer architecture design, including core design, memory management, thread scheduling, application support, inter-processor communication, debu

  17. Fundamentals of computer architecture and design

    CERN Document Server

    Bindal, Ahmet

    2017-01-01

    This textbook provides semester-length coverage of computer architecture and design, providing a strong foundation for students to understand modern computer system architecture and to apply these insights and principles to future computer designs. It is based on the author’s decades of industrial experience with computer architecture and design, as well as with teaching students focused on pursuing careers in computer engineering. Unlike a number of existing textbooks for this course, this one focuses not only on CPU architecture, but also covers in great detail system buses, peripherals and memories. This book teaches every element in a computing system in two steps. First, it introduces the functionality of each topic (and subtopics) and then goes into “from-scratch design” of a particular digital block from its architectural specifications using timing diagrams. The author describes how the data-path of a certain digital block is generated using timing diagrams, a method which most textbo...

  18. Speculations on the representation of architecture in virtual reality

    DEFF Research Database (Denmark)

    Hermund, Anders; Klint, Lars; Bundgård, Ture Slot

    2017-01-01

    to the visual field of perception. However, this should not necessarily imply an acceptance of the dominance of vision over the other senses, and the much-criticized retinal architecture with its inherent loss of plasticity. Recent neurology studies indicate that 3D representation models in virtual reality...... are less demanding on the brain’s working memory than 3D models seen on flat two-dimensional screens. This paper suggests that virtual reality representational architectural models can, if used correctly, significantly improve the imaginative role of architectural representation....

  19. Homodyne detection of holographic memory systems

    Science.gov (United States)

    Urness, Adam C.; Wilson, William L.; Ayres, Mark R.

    2014-09-01

    We present a homodyne detection system implemented for a page-wise holographic memory architecture. Homodyne detection by holographic memory systems enables phase quadrature multiplexing (doubling address space), and lower exposure times (increasing read transfer rates). It also enables phase modulation, which improves signal-to-noise ratio (SNR) to further increase data capacity. We believe this is the first experimental demonstration of homodyne detection for a page-wise holographic memory system suitable for a commercial design.

  20. HTMT-class Latency Tolerant Parallel Architecture for Petaflops Scale Computation

    Science.gov (United States)

    Sterling, Thomas; Bergman, Larry

    2000-01-01

    Computational Aero Sciences and other numeric intensive computation disciplines demand computing throughputs substantially greater than the Teraflops scale systems only now becoming available. The related fields of fluids, structures, thermal, combustion, and dynamic controls are among the interdisciplinary areas that in combination with sufficient resolution and advanced adaptive techniques may force performance requirements towards Petaflops. This will be especially true for compute intensive models such as Navier-Stokes, or when such system models are only part of a larger design optimization computation involving many design points. Yet recent experience with conventional MPP configurations comprising commodity processing and memory components has shown that larger scale frequently results in higher programming difficulty and lower system efficiency. While important advances in system software and algorithmic techniques have had some impact on efficiency and programmability for certain classes of problems, in general it is unlikely that software alone will resolve the challenges to higher scalability. As in the past, future generations of high-end computers may require a combination of hardware architecture and system software advances to enable efficient operation at a Petaflops level. The NASA-led HTMT project has engaged the talents of a broad interdisciplinary team to develop a new strategy in high-end system architecture to deliver petaflops scale computing in the 2004/5 timeframe. The Hybrid-Technology, MultiThreaded parallel computer architecture incorporates several advanced technologies in combination with an innovative dynamic adaptive scheduling mechanism to provide unprecedented performance and efficiency within practical constraints of cost, complexity, and power consumption. The emerging superconductor Rapid Single Flux Quantum electronics can operate at 100 GHz (the record is 770 GHz) and one percent of the power required by convention

  1. PRISMA/DB: A Parallel Main-Memory Relational DBMS

    NARCIS (Netherlands)

    Apers, Peter M.G.; Flokstra, Jan; van den Berg, Carel A.; Grefen, P.W.P.J.; Wilschut, A.N.; Kersten, Martin L.; van den Berg, C.A.

    1992-01-01

    PRISMA/DB, a full-fledged parallel, main memory relational database management system (DBMS) is described. PRISMA/DB's high performance is obtained by the use of parallelism for query processing and main memory storage of the entire database. A flexible architecture for experimenting with

  2. Algorithms for computational fluid dynamics on parallel processors

    International Nuclear Information System (INIS)

    Van de Velde, E.F.

    1986-01-01

    A study of parallel algorithms for the numerical solution of partial differential equations arising in computational fluid dynamics is presented. The actual implementation on parallel processors of shared and nonshared memory design is discussed. The performance of these algorithms is analyzed in terms of machine efficiency, communication time, bottlenecks and software development costs. For elliptic equations, a parallel preconditioned conjugate gradient method is described, which has been used to solve pressure equations discretized with high order finite elements on irregular grids. A parallel full multigrid method and a parallel fast Poisson solver are also presented. Hyperbolic conservation laws were discretized with parallel versions of finite difference methods like the Lax-Wendroff scheme and with the Random Choice method. Techniques are developed for comparing the behavior of an algorithm on different architectures as a function of problem size and local computational effort. Effective use of these advanced architecture machines requires the use of machine dependent programming. It is shown that the portability problems can be minimized by introducing high level operations on vectors and matrices structured into program libraries.
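    As an illustration of the Lax-Wendroff scheme named in this record, the sketch below applies one step of the scheme to the linear advection equation u_t + a u_x = 0 on a periodic grid. This is a serial, textbook form chosen as an assumption for illustration; the record does not give the author's parallel formulation, and the function name `lax_wendroff_step` is hypothetical.

```python
def lax_wendroff_step(u, courant):
    """Advance u by one time step of Lax-Wendroff for u_t + a*u_x = 0.

    `courant` is the Courant number a*dt/dx; the grid is periodic.
    """
    n = len(u)
    new_u = []
    for i in range(n):
        left = u[(i - 1) % n]   # periodic neighbour on the left
        right = u[(i + 1) % n]  # periodic neighbour on the right
        # Centred first-order term plus the second-order correction.
        value = (u[i]
                 - 0.5 * courant * (right - left)
                 + 0.5 * courant ** 2 * (right - 2.0 * u[i] + left))
        new_u.append(value)
    return new_u


pulse = [0.0, 1.0, 0.0, 0.0]
shifted = lax_wendroff_step(pulse, 1.0)   # Courant number 1: exact shift
smeared = lax_wendroff_step(pulse, 0.5)   # Courant number 0.5
```

    Each updated value depends only on a cell and its two neighbours. On a nonshared memory machine a processor that owns a contiguous block of cells would therefore need only the edge values of the neighbouring blocks at each step, which is the communication cost the record's performance analysis refers to.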

  3. A second look at the structure of human olfactory memory.

    Science.gov (United States)

    White, Theresa L

    2009-07-01

    How do we remember olfactory information? Is the architecture of human olfactory memory unique compared with that of memory for other types of stimuli? Ten years ago, a review article evaluated these questions, as well as the distinction between long- and short-term olfactory memory, with three lines of evidence: capacity differences, coding differences, and neuropsychological evidence, though serial position effects were also considered. From the data available at the time, the article preliminarily suggested that olfactory memory was a two-component system that was not qualitatively different from memory systems for other types of stimuli. The decade that has elapsed since then has ushered in considerable changes in theories of memory structure and provided huge advances in neuroscience capabilities. Not only have many studies exploring various aspects of olfactory memory been published, but a model of olfactory perception that includes an integral unitary memory system also has been presented. Consequently, the structure of olfactory memory is reevaluated in the light of further information currently available with the same theoretical lines of evidence previously considered. This evaluation finds that the preponderance of evidence suggests that, as in memory for other types of sensory stimuli, the short-term-long-term distinction remains a valuable dissociation for conceptualizing olfactory memory, though perhaps not as architecturally separate systems.

  4. Framewise phoneme classification with bidirectional LSTM and other neural network architectures.

    Science.gov (United States)

    Graves, Alex; Schmidhuber, Jürgen

    2005-01-01

    In this paper, we present bidirectional Long Short Term Memory (LSTM) networks, and a modified, full gradient version of the LSTM learning algorithm. We evaluate Bidirectional LSTM (BLSTM) and several other network architectures on the benchmark task of framewise phoneme classification, using the TIMIT database. Our main findings are that bidirectional networks outperform unidirectional ones, and Long Short Term Memory (LSTM) is much faster and also more accurate than both standard Recurrent Neural Nets (RNNs) and time-windowed Multilayer Perceptrons (MLPs). Our results support the view that contextual information is crucial to speech processing, and suggest that BLSTM is an effective architecture with which to exploit it.

  5. The Mind and Brain of Short-Term Memory

    OpenAIRE

    Jonides, John; Lewis, Richard L.; Nee, Derek Evan; Lustig, Cindy A.; Berman, Marc G.; Moore, Katherine Sledge

    2008-01-01

    The past 10 years have brought near-revolutionary changes in psychological theories about short-term memory, with similarly great advances in the neurosciences. Here, we critically examine the major psychological theories (the “mind”) of short-term memory and how they relate to evidence about underlying brain mechanisms. We focus on three features that must be addressed by any satisfactory theory of short-term memory. First, we examine the evidence for the architecture of short-term memory, w...

  6. The Role of Short-term Consolidation in Memory Persistence

    OpenAIRE

    Timothy J. Ricker

    2015-01-01

    Short-term memory, often described as working memory, is one of the most fundamental information processing systems of the human brain. Short-term memory function is necessary for language, spatial navigation, problem solving, and many other daily activities. Given its importance to cognitive function, understanding the architecture of short-term memory is of crucial importance to understanding human behavior. Recent work from several laboratories investigating the entry of information into s...

  7. The storage and recall of auditory memory.

    Science.gov (United States)

    Nebenzahl, I; Albeck, Y

    1990-01-01

    The architecture of the auditory memory is investigated. The auditory information is assumed to be represented by f-t patterns. With the help of a psycho-physical experiment it is demonstrated that the storage of these patterns is highly folded in the sense that a long signal is broken into many short stretches before being stored in the memory. Recognition takes place by correlating newly heard input in the short term memory to information previously stored in the long term memory. We show that this correlation is performed after the input is accumulated and held statically in the short term memory.

  8. The GOES-R Product Generation Architecture

    Science.gov (United States)

    Dittberner, G. J.; Kalluri, S.; Hansen, D.; Weiner, A.; Tarpley, A.; Marley, S.

    2011-12-01

    The GOES-R system will substantially improve users' ability to succeed in their work by providing data with significantly enhanced instruments, higher resolution, much shorter relook times, and an increased number and diversity of products. The Product Generation architecture is designed to provide the computer and memory resources necessary to achieve the necessary latency and availability for these products. Over time, new and updated algorithms are expected to be added and old ones removed as science advances and new products are developed. The GOES-R GS architecture is being planned to maintain functionality so that when such changes are implemented, operational product generation will continue without interruption. The primary parts of the PG infrastructure are the Service Based Architecture (SBA) and the Data Fabric (DF). SBA is the middleware that encapsulates and manages science algorithms that generate products. It is divided into three parts, the Executive, which manages and configures the algorithm as a service, the Dispatcher, which provides data to the algorithm, and the Strategy, which determines when the algorithm can execute with the available data. SBA is a distributed architecture, with services connected to each other over a compute grid and is highly scalable. This plug-and-play architecture allows algorithms to be added, removed, or updated without affecting any other services or software currently running and producing data. Algorithms require product data from other algorithms, so a scalable and reliable messaging is necessary. The SBA uses the DF to provide this data communication layer between algorithms. The DF provides an abstract interface over a distributed and persistent multi-layered storage system (e.g., memory based caching above disk-based storage) and an event management system that allows event-driven algorithm services to know when instrument data are available and where they reside. Together, the SBA and the DF provide a

  9. Memory Circuit Fault Simulator

    Science.gov (United States)

    Sheldon, Douglas J.; McClure, Tucker

    2013-01-01

    Spacecraft are known to experience significant memory part-related failures and problems, both pre- and postlaunch. These memory parts include both static and dynamic memories (SRAM and DRAM). These failures manifest themselves in a variety of ways, such as pattern-sensitive failures, timing-sensitive failures, etc. Because of the mission-critical role memory devices play in spacecraft architecture and operation, understanding their failure modes is vital to successful mission operation. To support this need, a generic simulation tool that can model different data patterns in conjunction with variable write and read conditions was developed. This tool is a mathematical and graphical way to embed pattern, electrical, and physical information to perform what-if analysis as part of a root cause failure analysis effort.

  10. Carbon nanomaterials for non-volatile memories

    Science.gov (United States)

    Ahn, Ethan C.; Wong, H.-S. Philip; Pop, Eric

    2018-03-01

    Carbon can create various low-dimensional nanostructures with remarkable electronic, optical, mechanical and thermal properties. These features make carbon nanomaterials especially interesting for next-generation memory and storage devices, such as resistive random access memory, phase-change memory, spin-transfer-torque magnetic random access memory and ferroelectric random access memory. Non-volatile memories greatly benefit from the use of carbon nanomaterials in terms of bit density and energy efficiency. In this Review, we discuss sp2-hybridized carbon-based low-dimensional nanostructures, such as fullerene, carbon nanotubes and graphene, in the context of non-volatile memory devices and architectures. Applications of carbon nanomaterials as memory electrodes, interfacial engineering layers, resistive-switching media, and scalable, high-performance memory selectors are investigated. Finally, we compare the different memory technologies in terms of writing energy and time, and highlight major challenges in the manufacturing, integration and understanding of the physical mechanisms and material properties.

  11. An open architecture for medical image workstation

    Science.gov (United States)

    Liang, Liang; Hu, Zhiqiang; Wang, Xiangyun

    2005-04-01

    Dealing with the difficulties of integrating various medical image viewing and processing technologies with a variety of clinical and departmental information systems and, in the meantime, overcoming the performance constraints in transferring and processing large-scale and ever-increasing image data in healthcare enterprise, we design and implement a flexible, usable and high-performance architecture for medical image workstations. This architecture is not developed for radiology only, but for any workstations in any application environments that may need medical image retrieving, viewing, and post-processing. This architecture contains an infrastructure named Memory PACS and different kinds of image applications built on it. The Memory PACS is in charge of image data caching, pre-fetching and management. It provides image applications with a high speed image data access and a very reliable DICOM network I/O. In dealing with the image applications, we use dynamic component technology to separate the performance-constrained modules from the flexibility-constrained modules so that different image viewing or processing technologies can be developed and maintained independently. We also develop a weakly coupled collaboration service, through which these image applications can communicate with each other or with third party applications. We applied this architecture in developing our product line and it works well. In our clinical sites, this architecture is applied not only in Radiology Department, but also in Ultrasonic, Surgery, Clinics, and Consultation Center. Given that each concerned department has its particular requirements and business routines along with the facts that they all have different image processing technologies and image display devices, our workstations are still able to maintain high performance and high usability.

  12. Performance evaluation of scientific programs on advanced architecture computers

    International Nuclear Information System (INIS)

    Walker, D.W.; Messina, P.; Baille, C.F.

    1988-01-01

    Recently a number of advanced architecture machines have become commercially available. These new machines promise better cost-performance than traditional computers, and some of them have the potential of competing with current supercomputers, such as the Cray X-MP, in terms of maximum performance. This paper describes an on-going project to evaluate a broad range of advanced architecture computers using a number of complete scientific application programs. The computers to be evaluated include distributed-memory machines such as the NCUBE, INTEL and Caltech/JPL hypercubes, and the MEIKO computing surface; shared-memory, bus architecture machines such as the Sequent Balance and the Alliant; very long instruction word machines such as the Multiflow Trace 7/200 computer; traditional supercomputers such as the Cray X-MP and Cray-2; and SIMD machines such as the Connection Machine. Currently 11 application codes from a number of scientific disciplines have been selected, although it is not intended to run all codes on all machines. Results are presented for two of the codes (QCD and missile tracking), and future work is proposed.

  13. Reducing Competitive Cache Misses in Modern Processor Architectures

    OpenAIRE

    Prisagjanec, Milcho; Mitrevski, Pece

    2017-01-01

    The increasing number of threads inside the cores of a multicore processor, and competitive access to the shared cache memory, become the main reasons for an increased number of competitive cache misses and performance decline. Inevitably, the development of modern processor architectures leads to an increased number of cache misses. In this paper, we make an attempt to implement a technique for decreasing the number of competitive cache misses in the first level of cache memory. This tec...

  14. Exploring Hardware Support For Scaling Irregular Applications on Multi-node Multi-core Architectures

    Energy Technology Data Exchange (ETDEWEB)

    Secchi, Simone; Ceriani, Marco; Tumeo, Antonino; Villa, Oreste; Palermo, Gianluca; Raffo, Luigi

    2013-06-05

    With the recent emergence of large-scale knowledge discovery, data mining and social network analysis, irregular applications have gained renewed interest. Classic cache-based high-performance architectures do not provide optimal performances with such kind of workloads, mainly due to the very low spatial and temporal locality of the irregular control and memory access patterns. In this paper, we present a multi-node, multi-core, fine-grained multi-threaded shared-memory system architecture specifically designed for the execution of large-scale irregular applications, and built on top of three pillars, that we believe are fundamental to support these workloads. First, we offer transparent hardware support for Partitioned Global Address Space (PGAS) to provide a large globally-shared address space with no software library overhead. Second, we employ multi-threaded multi-core processing nodes to achieve the necessary latency tolerance required by accessing global memory, which potentially resides in a remote node. Finally, we devise hardware support for inter-thread synchronization on the whole global address space. We first model the performances by using an analytical model that takes into account the main architecture and application characteristics. We describe the hardware design of the proposed custom architectural building blocks that provide support for the above-mentioned three pillars. Finally, we present a limited-scale evaluation of the system on a multi-board FPGA prototype with typical irregular kernels and benchmarks. The experimental evaluation demonstrates the architecture performance scalability for different configurations of the whole system.

  15. Principles of Transactional Memory The Theory

    CERN Document Server

    Guerraoui, Rachid

    2010-01-01

    Transactional memory (TM) is an appealing paradigm for concurrent programming on shared memory architectures. With a TM, threads of an application communicate, and synchronize their actions, via in-memory transactions. Each transaction can perform any number of operations on shared data, and then either commit or abort. When the transaction commits, the effects of all its operations become immediately visible to other transactions; when it aborts, however, those effects are entirely discarded. Transactions are atomic: programmers get the illusion that every transaction executes all its operati

  16. Memory Based Machine Intelligence Techniques in VLSI hardware

    OpenAIRE

    James, Alex Pappachen

    2012-01-01

    We briefly introduce the memory based approaches to emulate machine intelligence in VLSI hardware, describing the challenges and advantages. Implementation of artificial intelligence techniques in VLSI hardware is a practical and difficult problem. Deep architectures, hierarchical temporal memories and memory networks are some of the contemporary approaches in this area of research. The techniques attempt to emulate low level intelligence tasks and aim at providing scalable solutions to high ...

  17. Memristor-based nanoelectronic computing circuits and architectures

    CERN Document Server

    Vourkas, Ioannis

    2016-01-01

    This book considers the design and development of nanoelectronic computing circuits, systems and architectures focusing particularly on memristors, which represent one of today’s latest technology breakthroughs in nanoelectronics. The book studies, explores, and addresses the related challenges and proposes solutions for the smooth transition from conventional circuit technologies to emerging computing memristive nanotechnologies. Its content spans from fundamental device modeling to emerging storage system architectures and novel circuit design methodologies, targeting advanced non-conventional analog/digital massively parallel computational structures. Several new results on memristor modeling, memristive interconnections, logic circuit design, memory circuit architectures, computer arithmetic systems, simulation software tools, and applications of memristors in computing are presented. High-density memristive data storage combined with memristive circuit-design paradigms and computational tools applied t...

  18. Performances of multiprocessor multidisk architectures for continuous media storage

    Science.gov (United States)

    Gennart, Benoit A.; Messerli, Vincent; Hersch, Roger D.

    1996-03-01

    Multimedia interfaces increase the need for large image databases, capable of storing and reading streams of data with strict synchronicity and isochronicity requirements. In order to fulfill these requirements, we consider a parallel image server architecture which relies on arrays of intelligent disk nodes, each disk node being composed of one processor and one or more disks. This contribution analyzes through bottleneck performance evaluation and simulation the behavior of two multi-processor multi-disk architectures: a point-to-point architecture and a shared-bus architecture similar to current multiprocessor workstation architectures. We compare the two architectures on the basis of two multimedia algorithms: the compute-bound frame resizing by resampling and the data-bound disk-to-client stream transfer. The results suggest that the shared bus is a potential bottleneck despite its very high hardware throughput (400 Mbytes/s) and that an architecture with addressable local memories located close to their respective processors could partially remove this bottleneck. The point-to-point architecture is scalable and able to sustain high throughputs for simultaneous compute-bound and data-bound operations.

  19. A Fault Tolerant Integrated Circuit Memory

    OpenAIRE

    Barton, Anthony Francis

    1980-01-01

    Most commercially produced integrated circuits are incapable of tolerating manufacturing defects. The area and function of the circuits is thus limited by the probability of faults occurring within the circuit. This thesis examines techniques for using redundancy in memory circuits to provide fault tolerance and to increase storage capacity. A hierarchical memory architecture using multiple Hamming codes is introduced and analysed to determine its resistance to manufa...

  20. Design issues for block-oriented reflective memory system

    Energy Technology Data Exchange (ETDEWEB)

    Jovanovic, M; Tomasevic, M; Milutinovic, V

    1996-12-31

    The block-oriented reflective memory (BORM) system represents a modular bus-based system architecture that belongs to the class of distributed shared memory systems. The results of the evaluation study of the BORM implementation strategies and design decisions in regard to the different values of input parameters are presented. 5 refs.

  1. Urban Sustainability through Public Architecture

    Directory of Open Access Journals (Sweden)

    Soomi Kim

    2018-04-01

    As the sustainability of contemporary cities has gained emphasis, interest in architecture has increased, due to its social and public responsibility. Since sustainability is linked to public values, research on sustainable public spaces is an important way to secure sustainability in cities. Based on this, we analyzed the sustainability of European cities by examining the design methods of public architecture according to the region. The aim of the study is to derive architectural methodology corresponding to local characteristics, and to suggest issues to consider in public architecture design to promote urban sustainability based on this. First, regarding the environmental aspect, it can be observed that there is an effort to secure sustainability. Second, in terms of social sustainability, historical value remains as a trace of architectural place, so that it continues in people’s memory. In addition, public architecture provides public places where citizens can gather and enjoy programs, while the architectural methods showed differences influenced by cultural conditions. Third, in economic sustainability, it was shown that energy saving was achieved through cost reduction through recycling of materials, facilities, or environmental factors. In conclusion, the issues to be considered in public architectural design are the voiding of urban space through architectural devices in the construction method. In other words, the intention is to form “ground” that attempts to be part of the city, and thereby create better places. Since skin and material have a deep relationship with the environment, they should have the durability and an outer skin that are suitable for the regional environment. Finally, sustainability is to be utilized through the influx of programs that meet local and environmental characteristics. Design research into public architecture that is oriented towards urban sustainability will be a task to be carried out by the

  2. A study on low-power, nanosecond operation and multilevel bipolar resistance switching in Ti/ZrO2/Pt nonvolatile memory with 1T1R architecture

    International Nuclear Information System (INIS)

    Wu, Ming-Chi; Tseng, Tseung-Yuen; Jang, Wen-Yueh; Lin, Chen-Hsi

    2012-01-01

    Low-power, bipolar resistive switching (RS) characteristics in the Ti/ZrO2/Pt nonvolatile memory with one transistor and one resistor (1T1R) architecture were reported. Multilevel storage behavior was observed by modulating the amplitude of the MOSFET gate voltage, in which the transistor functions as a current limiter. Furthermore, multilevel storage was also executed by controlling the reset voltage, leading the resistive random access memory (RRAM) to the multiple metastable low resistance state (LRS). The experimental results on the measured electrical properties of the various sized devices confirm that the RS mechanism of the Ti/ZrO2/Pt structure obeys the conducting filaments model. In application, the devices exhibit high-speed switching performances (250 ns) with suitable high/low resistance state ratio (HRS/LRS > 10). The LRS of the devices with 10 year retention ability at 80 °C, based on the Arrhenius equation, is also demonstrated in the thermal accelerating test. Furthermore, the ramping gate voltage method with fixed drain voltage is used to switch the 1T1R memory cells for upgrading the memory performances. Our experimental results suggest that the ZrO2-based RRAM is a prospective alternative for nonvolatile multilevel memory device applications.

  3. Scalable shared-memory multiprocessing

    CERN Document Server

    Lenoski, Daniel E

    1995-01-01

    Dr. Lenoski and Dr. Weber have experience with leading-edge research and practical issues involved in implementing large-scale parallel systems. They were key contributors to the architecture and design of the DASH multiprocessor. Currently, they are involved with commercializing scalable shared-memory technology.

  4. Parallel implementation of DNA sequences matching algorithms using PWM on GPU architecture.

    Science.gov (United States)

    Sharma, Rahul; Gupta, Nitin; Narang, Vipin; Mittal, Ankush

    2011-01-01

    Positional Weight Matrices (PWMs) are widely used in the representation and detection of Transcription Factor Binding Sites (TFBSs) on DNA. We implement an online PWM search algorithm on a parallel architecture. Large PWM data sets can be processed in parallel on Graphics Processing Unit (GPU) systems, which helps match sequences at a faster rate. Our method makes extensive use of the highly multithreaded architecture and shared memory of multi-core GPUs. Efficient use of shared memory is required to optimise parallel reduction in CUDA. Our optimised method achieves a speedup of 230-280x over a linear implementation on a GeForce GTX 280 GPU.
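
    As an illustration only (this is not the authors' GPU code), the sliding-window PWM scan that such a method parallelizes can be sketched serially in Python. The matrix values, the threshold, and the function names `window_score` and `scan` are assumptions made for this example. Each window is scored independently of the others, which is the property that lets a GPU assign windows to separate threads.

```python
# Hypothetical 3-column log-odds PWM; the values are illustrative only.
PWM = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
]


def window_score(pwm, sequence, start):
    """Sum the PWM entries for the window that begins at `start`."""
    return sum(column[sequence[start + i]] for i, column in enumerate(pwm))


def scan(pwm, sequence, threshold):
    """Return (position, score) for every window scoring at least `threshold`."""
    width = len(pwm)
    hits = []
    # Every iteration is independent, so the loop could be split across threads.
    for start in range(len(sequence) - width + 1):
        score = window_score(pwm, sequence, start)
        if score >= threshold:
            hits.append((start, score))
    return hits


if __name__ == "__main__":
    print(scan(PWM, "TTACGACG", threshold=3.0))
```

    In this toy case the motif "ACG" occurs at positions 2 and 5, so those two windows are the only ones that reach the threshold.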

  5. Ring interconnection for distributed memory automation and computing system

    Energy Technology Data Exchange (ETDEWEB)

    Vinogradov, V I [Inst. for Nuclear Research of the Russian Academy of Sciences, Moscow (Russian Federation)

    1996-12-31

    Problems of development of measurement, acquisition and central systems based on a distributed memory and a ring interface are discussed. It has been found that the RAM LINK-type protocol can be used for ringlet links in non-symmetrical distributed memory architecture multiprocessor system interaction. 5 refs.

  6. Kalman filter tracking on parallel architectures

    Science.gov (United States)

    Cerati, G.; Elmer, P.; Krutelyov, S.; Lantz, S.; Lefebvre, M.; McDermott, K.; Riley, D.; Tadel, M.; Wittich, P.; Wurthwein, F.; Yagil, A.

    2017-10-01

    We report on the progress of our studies towards a Kalman filter track reconstruction algorithm with optimal performance on manycore architectures. The combinatorial structure of these algorithms is not immediately compatible with an efficient SIMD (or SIMT) implementation; the challenge for us is to recast the existing software so it can readily generate hundreds of shared-memory threads that exploit the underlying instruction set of modern processors. We show how the data and associated tasks can be organized in a way that is conducive to both multithreading and vectorization. We demonstrate very good performance on Intel Xeon and Xeon Phi architectures, as well as promising first results on Nvidia GPUs.

  7. Investigation of fast initialization of spacecraft bubble memory systems

    Science.gov (United States)

    Looney, K. T.; Nichols, C. D.; Hayes, P. J.

    1984-01-01

    Bubble domain technology offers significant improvement in reliability and functionality for spacecraft onboard memory applications. In considering potential memory systems organizations, minimization of power in high capacity bubble memory systems necessitates the activation of only the desired portions of the memory. In power strobing arbitrary memory segments, a capability of fast turn on is required. Bubble device architectures, which provide redundant loop coding in the bubble devices, limit the initialization speed. Alternate initialization techniques are investigated to overcome this design limitation. An initialization technique using a small amount of external storage is demonstrated.

  8. Real-time field programmable gate array architecture for computer vision

    Science.gov (United States)

    Arias-Estrada, Miguel; Torres-Huitzil, Cesar

    2001-01-01

    This paper presents an architecture for real-time generic convolution of a mask and an image. The architecture is intended for fast low-level image processing. The field programmable gate array (FPGA)-based architecture takes advantage of the availability of registers in FPGAs to implement an efficient and compact module to process the convolutions. The architecture is designed to minimize the number of accesses to the image memory and it is based on parallel modules with internal pipeline operation in order to improve its performance. The architecture is prototyped in an FPGA, but it can be implemented on dedicated very-large-scale-integrated devices to reach higher clock frequencies. Complexity issues, FPGA resources utilization, FPGA limitations, and real-time performance are discussed. Some results are presented and discussed.
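
    One common way to limit image-memory accesses in a streaming convolution is to hold the most recent rows in local buffers so that each pixel is fetched only once. The Python sketch below is a software analogy under that assumption; the abstract does not say that the authors use row buffers, and the name `stream_convolve` and the read counter are hypothetical. The mask is applied without flipping, so a strict convolution would pass in a mask that has been flipped in both directions.

```python
from collections import deque


def stream_convolve(image, mask):
    """Apply a k x k mask to `image` (valid region only), row by row.

    Returns the output rows and the number of pixels fetched from `image`.
    """
    k = len(mask)
    rows = deque(maxlen=k)  # stands in for on-chip storage of the last k rows
    reads = 0
    output = []
    for row in image:
        rows.append(list(row))  # the only access to the image memory
        reads += len(row)
        if len(rows) < k:
            continue  # not enough rows buffered for a full window yet
        out_row = []
        for x in range(len(row) - k + 1):
            acc = 0
            for i in range(k):
                for j in range(k):
                    acc += mask[i][j] * rows[i][x + j]
            out_row.append(acc)
        output.append(out_row)
    return output, reads


if __name__ == "__main__":
    image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    print(stream_convolve(image, ones))
```

    With a 4 x 4 image and a 3 x 3 mask, the sketch fetches 16 pixels in total, whereas recomputing each of the four windows from memory would fetch 36.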

  9. Acute Kynurenine Challenge Disrupts Sleep-Wake Architecture and Impairs Contextual Memory in Adult Rats.

    Science.gov (United States)

    Pocivavsek, Ana; Baratta, Annalisa M; Mong, Jessica A; Viechweg, Shaun S

    2017-11-01

    Tryptophan metabolism via the kynurenine pathway may represent a key molecular link between sleep loss and cognitive dysfunction. Modest increases in the kynurenine pathway metabolite kynurenic acid (KYNA), which acts as an antagonist at N-methyl-d-aspartate and α7 nicotinic acetylcholine receptors in the brain, result in cognitive impairments. As glutamatergic and cholinergic neurotransmissions are critically involved in modulation of sleep, our current experiments tested the hypothesis that elevated KYNA adversely impacts sleep quality. Adult male Wistar rats were treated with vehicle (saline) and kynurenine (25, 50, 100, and 250 mg/kg), the direct bioprecursor of KYNA, intraperitoneally at zeitgeber time (ZT) 0 to rapidly increase brain KYNA. Levels of KYNA in the brainstem, cortex, and hippocampus were determined at ZT 0, ZT 2, and ZT 4, respectively. Analyses of vigilance state-related parameters categorized as wake, rapid eye movement (REM), and non-REM (NREM) as well as spectra power analysis during NREM and REM were assessed during the light phase. Separate animals were tested in the passive avoidance paradigm, testing contextual memory. When KYNA levels were elevated in the brain, total REM duration was reduced and total wake duration was increased. REM and wake architecture, assessed as number of vigilance state bouts and average duration of each bout, and theta power during REM were significantly impacted. Kynurenine challenge impaired performance in the hippocampal-dependent contextual memory task. Our results introduce kynurenine pathway metabolism and formation of KYNA as a novel molecular target contributing to sleep disruptions and cognitive impairments. © Sleep Research Society 2017. Published by Oxford University Press on behalf of the Sleep Research Society. All rights reserved. For permissions, please e-mail journals.permissions@oup.com.

  10. Word-serial Architectures for Filtering and Variable Rate Decimation

    Directory of Open Access Journals (Sweden)

    Eugene Grayver

    2002-01-01

    A new flexible architecture is proposed for word-serial filtering and variable rate decimation/interpolation. The architecture is targeted for low power applications requiring medium to low data rate and is ideally suited for implementation on either an ASIC or an FPGA. It combines the small size and low power of an ASIC with the programmability and flexibility of a DSP. An efficient memory addressing scheme eliminates the need for power hungry shift registers and allows full reconfiguration. The decimation ratio, filter length and filter coefficients can all be changed in real time. The architecture takes advantage of coefficient symmetries in linear phase filters and in polyphase components.
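
    The coefficient symmetry mentioned in the abstract can be illustrated with a short Python sketch. It assumes an even-length linear-phase filter with h[k] = h[N-1-k], stores only the first half of the taps, and computes outputs only at the decimated instants. The function name, the even-length restriction, and the choice to start at the first fully overlapped sample are assumptions of this example, not details of the proposed hardware.

```python
def symmetric_fir_decimate(samples, half_coeffs, ratio):
    """Filter `samples` with a symmetric FIR and keep every `ratio`-th output.

    `half_coeffs` holds h[0] .. h[N/2 - 1]; the remaining taps mirror them.
    """
    half = len(half_coeffs)
    length = 2 * half
    outputs = []
    # Only the outputs that survive decimation are ever computed.
    for n in range(length - 1, len(samples), ratio):
        acc = 0.0
        for k in range(half):
            # Pre-add the two samples that share coefficient h[k],
            # halving the number of multiplications.
            acc += half_coeffs[k] * (samples[n - k] + samples[n - (length - 1 - k)])
        outputs.append(acc)
    return outputs


def direct_fir(samples, coeffs, n):
    """Reference: plain FIR output y[n] = sum_k h[k] * x[n - k]."""
    return sum(h * samples[n - k] for k, h in enumerate(coeffs))


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7, 8]
    print(symmetric_fir_decimate(x, [0.25, 0.5], ratio=2))
```

    Changing `ratio` or `half_coeffs` between calls mimics, in software, the run-time reconfiguration of decimation ratio and coefficients that the abstract describes.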

  11. Shared Memory Parallelization of an Implicit ADI-type CFD Code

    Science.gov (United States)

    Hauser, Th.; Huang, P. G.

    1999-01-01

    A parallelization study designed for ADI-type algorithms is presented using the OpenMP specification for shared-memory multiprocessor programming. Details of optimizations specifically addressed to cache-based computer architectures are described and performance measurements for the single and multiprocessor implementation are summarized. The paper demonstrates that optimization of memory access on a cache-based computer architecture controls the performance of the computational algorithm. A hybrid MPI/OpenMP approach is proposed for clusters of shared memory machines to further enhance the parallel performance. The method is applied to develop a new LES/DNS code, named LESTool. A preliminary DNS calculation of a fully developed channel flow at a Reynolds number of 180, Re_τ = 180, has shown good agreement with existing data.

  12. Efficient Processing of a Rainfall Simulation Watershed on an FPGA-Based Architecture with Fast Access to Neighbourhood Pixels

    Directory of Open Access Journals (Sweden)

    Yeong LeeSeng

    2009-01-01

    This paper describes a hardware architecture to implement the watershed algorithm using rainfall simulation. The speed of the architecture is increased by utilizing a multiple memory bank approach to allow parallel access to the neighbourhood pixel values. In a single read cycle, the architecture is able to obtain all five values of the centre and four neighbours for a 4-connectivity watershed transform. The storage requirement of the multiple bank implementation is the same as a single bank implementation by using a graph-based memory bank addressing scheme. The proposed rainfall watershed architecture consists of two parts. The first part performs the arrowing operation and the second part assigns each pixel to its associated catchment basin. The paper describes the architecture datapath and control logic in detail and concludes with an implementation on a Xilinx Spartan-3 FPGA.
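
    To see how a centre pixel and its four neighbours can always fall in five different banks, consider the modular mapping bank = (x + 2y) mod 5. This mapping is an assumption chosen for illustration; the abstract only states that the authors use a graph-based addressing scheme and does not give its formula. The Python sketch below checks the conflict-free property over the interior of a small image; the function names are hypothetical.

```python
NUM_BANKS = 5


def bank_of(x, y):
    """Assumed mapping: horizontal steps change the bank by 1, vertical by 2."""
    return (x + 2 * y) % NUM_BANKS


def neighbourhood(x, y):
    """Centre pixel plus its four 4-connected neighbours."""
    return [(x, y), (x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]


def is_conflict_free(width, height):
    """True if every interior neighbourhood touches five distinct banks."""
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            banks = {bank_of(px, py) for px, py in neighbourhood(x, y)}
            if len(banks) != NUM_BANKS:
                return False
    return True


if __name__ == "__main__":
    print(is_conflict_free(16, 12))
    print([bank_of(px, py) for px, py in neighbourhood(3, 4)])
```

    Relative to the centre, the neighbours sit at bank offsets of +1, -1, +2 and -2 modulo 5, which are all distinct and non-zero, so the five reads can in principle be issued in the same cycle.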

  13. A novel multiplexer-based structure for random access memory cell in quantum-dot cellular automata

    Science.gov (United States)

    Naji Asfestani, Mazaher; Rasouli Heikalabad, Saeed

    2017-09-01

    Quantum-dot cellular automata (QCA) is a new nanoscale technology and is regarded as a strong candidate to replace CMOS circuits in the future. Memory is one of the basic components of any digital system, so designing a fast and optimal random access memory (RAM) in QCA is important. In this paper, by employing the structure of a multiplexer, a novel RAM cell architecture is proposed. The proposed architecture is implemented without the coplanar crossover approach. The proposed architecture is simulated using QCADesigner version 2.0.3 and QCAPro. The simulation results demonstrate that the proposed QCA RAM architecture has the best performance in terms of delay, circuit complexity, area, cell count and energy consumption in comparison with other QCA RAM architectures.

  14. Dynamic computing random access memory

    International Nuclear Information System (INIS)

    Traversa, F L; Bonani, F; Pershin, Y V; Di Ventra, M

    2014-01-01

    The present von Neumann computing paradigm involves a significant amount of information transfer between a central processing unit and memory, with concomitant limitations in the actual execution speed. However, it has been recently argued that a different form of computation, dubbed memcomputing (Di Ventra and Pershin 2013 Nat. Phys. 9 200–2) and inspired by the operation of our brain, can resolve the intrinsic limitations of present day architectures by allowing for computing and storing of information on the same physical platform. Here we show a simple and practical realization of memcomputing that utilizes easy-to-build memcapacitive systems. We name this architecture dynamic computing random access memory (DCRAM). We show that DCRAM provides massively-parallel and polymorphic digital logic, namely it allows for different logic operations with the same architecture, by varying only the control signals. In addition, by taking into account realistic parameters, its energy expenditures can be as low as a few fJ per operation. DCRAM is fully compatible with CMOS technology, can be realized with current fabrication facilities, and therefore can really serve as an alternative to the present computing technology.

  15. Explaining the gap between theoretical peak performance and real performance for supercomputer architectures

    International Nuclear Information System (INIS)

    Schoenauer, W.; Haefner, H.

    1993-01-01

    The basic architectures of vector and parallel computers with their properties are presented. Then the memory size and the arithmetic operations in the context of memory bandwidth are discussed. For the exemplary discussion of a single operation, micro-measurements of the vector triad for the IBM 3090 VF and the CRAY Y-MP/8 are presented. They reveal the details of the losses for a single operation. Then we analyze the global performance of a whole supercomputer by identifying reduction factors that bring down the theoretical peak performance to the poor real performance. The responsibilities of the manufacturer and of the user for these losses are discussed. Then the price-performance ratio for different architectures in a snapshot of January 1991 is briefly mentioned. Finally some remarks on a user-friendly architecture for a supercomputer will be made. (orig.)

  16. Control system architecture: The standard and non-standard models

    International Nuclear Information System (INIS)

    Thuot, M.E.; Dalesio, L.R.

    1993-01-01

    Control system architecture development has followed the advances in computer technology through mainframes to minicomputers to micros and workstations. This technology advance and increasingly challenging accelerator data acquisition and automation requirements have driven control system architecture development. In summarizing the progress of control system architecture at the last International Conference on Accelerator and Large Experimental Physics Control Systems (ICALEPCS), B. Kuiper asserted that the system architecture issue was resolved and presented a "standard model". The "standard model" consists of a local area network (Ethernet or FDDI) providing communication between front end microcomputers, connected to the accelerator, and workstations, providing the operator interface and computational support. Although this model represents many present designs, there are exceptions including reflected memory and hierarchical architectures driven by requirements for widely dispersed, large channel count or tightly coupled systems. This paper describes the performance characteristics and features of the "standard model" to determine if the requirements of "non-standard" architectures can be met. Several possible extensions to the "standard model" are suggested including software as well as the hardware architectural feature

  17. Remote memory and cortical synaptic plasticity require neuronal CCCTC-binding factor (CTCF).

    Science.gov (United States)

    Kim, Somi; Yu, Nam-Kyung; Shim, Kyu-Won; Kim, Ji-Il; Kim, Hyopil; Han, Dae Hee; Choi, Ja Eun; Lee, Seung-Woo; Choi, Dong Il; Kim, Myung Won; Lee, Dong-Sung; Lee, Kyungmin; Galjart, Niels; Lee, Yong-Seok; Lee, Jae-Hyung; Kaang, Bong-Kiun

    2018-04-30

    The molecular mechanism of long-term memory has been extensively studied in the context of the hippocampus-dependent recent memory examined within several days. However, months-old remote memory maintained in the cortex for long-term has not been investigated much at the molecular level yet. Various epigenetic mechanisms are known to be important for long-term memory, but how the three-dimensional (3D) chromatin architecture and its regulator molecules contribute to neuronal plasticity and systems consolidation are still largely unknown. CCCTC-binding factor (CTCF) is an eleven-zinc finger protein well known for its role as a genome architecture molecule. Male conditional knockout (cKO) mice in which CTCF is lost in excitatory neurons during adulthood showed normal recent memory in the contextual fear conditioning and spatial water maze tasks. However, they showed remarkable impairments in remote memory in both tasks. Underlying the remote memory-specific phenotypes, we observed that female CTCF cKO mice exhibit disrupted cortical long-term potentiation (LTP), but not hippocampal LTP. Similarly, we observed that CTCF deletion in inhibitory neurons caused partial impairment of remote memory. Through RNA-sequencing, we observed that CTCF knockdown in cortical neuron culture caused altered expression of genes that are highly involved in cell adhesion, synaptic plasticity, and memory. These results suggest that remote memory storage in the cortex requires CTCF-mediated gene regulation in neurons while recent memory formation in the hippocampus does not. SIGNIFICANCE STATEMENT CTCF is a well-known 3D genome architectural protein that regulates gene expression. Here, we use two different CTCF conditional knockout mouse lines and reveal for the first time that CTCF is critically involved in the regulation of remote memory. We also show that CTCF is necessary for appropriate expression of genes, many of which we found to be involved in the learning and memory related

  18. Concrete Memories

    DEFF Research Database (Denmark)

    Wiegand, Frauke Katharina

    2015-01-01

    This article traces the presence of Atlantikwall bunkers in amateur holiday snapshots and discusses the ambiguous role of the bunker site in visual cultural memory. Departing from my family’s private photo collection from twenty years of vacationing at the Danish West coast, the different mundane and poetic appropriations and inscriptions of the bunker site are depicted. Ranging between overlooked side presences and an overwhelming visibility, the concrete remains of fascist war architecture are involved in and motivate different sensuous experiences and mnemonic appropriations. The article meets the bunkers’ changing visuality and the cultural topography they both actively transform and are being transformed by through juxtaposing different acts and objects of memory over time and in different visual articulations.

  19. Speculations on the representation of architecture in virtual reality

    DEFF Research Database (Denmark)

    Hermund, Anders; Klint, Lars; Bundgård, Ture Slot

    2017-01-01

    This paper discusses the present and future possibilities of representation models of architecture in new media such as virtual reality, seen in the broader context of tradition, perception, and neurology. Through comparative studies of real and virtual scenarios using eye tracking, the paper ... to the visual field of perception. However, this should not necessarily imply an acceptance of the dominance of vision over the other senses, and the much-criticized retinal architecture with its inherent loss of plasticity. Recent neurology studies indicate that 3D representation models in virtual reality are less demanding on the brain’s working memory than 3D models seen on flat two-dimensional screens. This paper suggests that virtual reality representational architectural models can, if used correctly, significantly improve the imaginative role of architectural representation.

  20. The evolution of episodic memory

    Science.gov (United States)

    Allen, Timothy A.; Fortin, Norbert J.

    2013-01-01

    One prominent view holds that episodic memory emerged recently in humans and lacks a “(neo)Darwinian evolution” [Tulving E (2002) Annu Rev Psychol 53:1–25]. Here, we review evidence supporting the alternative perspective that episodic memory has a long evolutionary history. We show that fundamental features of episodic memory capacity are present in mammals and birds and that the major brain regions responsible for episodic memory in humans have anatomical and functional homologs in other species. We propose that episodic memory capacity depends on a fundamental neural circuit that is similar across mammalian and avian species, suggesting that protoepisodic memory systems exist across amniotes and, possibly, all vertebrates. The implication is that episodic memory in diverse species may primarily be due to a shared underlying neural ancestry, rather than the result of evolutionary convergence. We also discuss potential advantages that episodic memory may offer, as well as species-specific divergences that have developed on top of the fundamental episodic memory architecture. We conclude by identifying possible time points for the emergence of episodic memory in evolution, to help guide further research in this area. PMID:23754432

  1. Midcentury Modern High Schools: Rebooting the Architecture

    Science.gov (United States)

    Havens, Kevin

    2010-01-01

    A high school is more than a building; it's a repository of memories for many community members. High schools built at the turn of the century are not only cultural and civic landmarks, they are also often architectural treasures. When these facilities become outdated, a renovation that preserves the building's aesthetics and character is usually…

  2. Low Power LDPC Code Decoder Architecture Based on Intermediate Message Compression Technique

    Science.gov (United States)

    Shimizu, Kazunori; Togawa, Nozomu; Ikenaga, Takeshi; Goto, Satoshi

    Reducing the power dissipation of LDPC code decoders is a major challenge in applying them to practical digital communication systems. In this paper, we propose a low power LDPC code decoder architecture based on an intermediate message-compression technique with the following features: (i) An intermediate message compression technique enables the decoder to reduce the required memory capacity and write power dissipation. (ii) A clock gated shift register based intermediate message memory architecture enables the decoder to decompress the compressed messages in a single clock cycle while reducing the read power dissipation. The combination of the above two techniques enables the decoder to reduce the power dissipation while keeping the decoding throughput. The simulation results show that the proposed architecture improves the power efficiency by up to 52% and 18% compared to that of the decoders based on the overlapped schedule and the rapid convergence schedule without the proposed techniques, respectively.

  3. Emerging technology and architecture for big-data analytics

    CERN Document Server

    Chang, Chip; Yu, Hao

    2017-01-01

    This book describes the current state of the art in big-data analytics, from a technology and hardware architecture perspective. The presentation is designed to be accessible to a broad audience, with general knowledge of hardware design and some interest in big-data analytics. Coverage includes emerging technology and devices for data-analytics, circuit design for data-analytics, and architecture and algorithms to support data-analytics. Readers will benefit from the realistic context used by the authors, which demonstrates what works, what doesn’t work, and what are the fundamental problems, solutions, upcoming challenges and opportunities. Provides a single-source reference to hardware architectures for big-data analytics; Covers various levels of big-data analytics hardware design abstraction and flow, from device, to circuits and systems; Demonstrates how non-volatile memory (NVM) based hardware platforms can be a viable solution to existing challenges in hardware architecture for big-data analytics.

  4. On the architecture for the X part of a very large FX correlator using two-accumulator CMACs

    Science.gov (United States)

    Lapshev, Stepan; Rezaul Hasan, S. M.

    2016-02-01

    This paper presents an improved input-buffer architecture for the X part of a very large FX correlator that optimizes memory use to both increase performance and reduce the overall power consumption. The architecture uses an array of two-accumulator CMACs that are reused for different pairs of correlated signals. Using two accumulators in every CMAC allows the processing array to alternately correlate two sets of signal pairs selected in such a way so that they share some or all of the processed data samples. This leads to increased processing bandwidth and a significant reduction of the memory read rate due to not having to update some or all of the processing buffers in every second processing cycle. The overall memory access rate is at most 75 % of that of the single-accumulator CMAC array. This architecture is intended for correlators of very large multi-element radio telescopes such as the Square Kilometre Array (SKA), and is suitable for an ASIC implementation.

  5. Evaluating Multicore Algorithms on the Unified Memory Model

    Directory of Open Access Journals (Sweden)

    John E. Savage

    2009-01-01

    One of the challenges to achieving good performance on multicore architectures is the effective utilization of the underlying memory hierarchy. While this is an issue for single-core architectures, it is a critical problem for multicore chips. In this paper, we formulate the unified multicore model (UMM) to help understand the fundamental limits on cache performance on these architectures. The UMM seamlessly handles different types of multiple-core processors with varying degrees of cache sharing at different levels. We demonstrate that our model can be used to study a variety of multicore architectures on a variety of applications. In particular, we use it to analyze an option pricing problem using the trinomial model and develop an algorithm for it that has near-optimal memory traffic between cache levels. We have implemented the algorithm on two Quad-Core Intel Xeon 5310 1.6 GHz processors (8 cores). It achieves a peak performance of 19.5 GFLOPs, which is 38% of the theoretical peak of the multicore system. We demonstrate that our algorithm outperforms compiler-optimized and auto-parallelized code by a factor of up to 7.5.
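
    For readers unfamiliar with the trinomial model mentioned in this abstract, the following is a minimal sketch of a textbook trinomial lattice for a European call. It is not the cache-optimized algorithm of the paper, whose details the abstract does not give; the function name, parameters, and the choice of the standard Boyle-style probabilities are assumptions made for illustration. The one point of contact with the abstract is that the backward sweep reuses a single array in place, which keeps the working set small.

```python
import math


def trinomial_european_call(s0, strike, rate, sigma, maturity, steps):
    """Price a European call on a recombining trinomial lattice (illustrative sketch)."""
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(2.0 * dt))      # up factor; down factor is 1 / up
    growth = math.exp(rate * dt / 2.0)
    half_up = math.exp(sigma * math.sqrt(dt / 2.0))
    half_dn = 1.0 / half_up
    p_up = ((growth - half_dn) / (half_up - half_dn)) ** 2
    p_dn = ((half_up - growth) / (half_up - half_dn)) ** 2
    p_mid = 1.0 - p_up - p_dn
    disc = math.exp(-rate * dt)

    # Payoffs at maturity: index k corresponds to net up-moves j = k - steps.
    values = [max(s0 * up ** (k - steps) - strike, 0.0)
              for k in range(2 * steps + 1)]

    # Backward induction, overwriting the same array level by level.
    # Ascending k is safe: values[k] only reads entries at k, k + 1, k + 2.
    for level in range(steps - 1, -1, -1):
        for k in range(2 * level + 1):
            values[k] = disc * (p_dn * values[k]
                                + p_mid * values[k + 1]
                                + p_up * values[k + 2])
    return values[0]


price = trinomial_european_call(100.0, 100.0, 0.05, 0.2, 1.0, 200)
```

    With these hypothetical inputs the result should lie close to the Black-Scholes value of about 10.45.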

  6. Design and Analysis of Architectures for Structural Health Monitoring Systems

    Science.gov (United States)

    Mukkamala, Ravi; Sixto, S. L. (Technical Monitor)

    2002-01-01

    During the two-year project period, we have worked on several aspects of Health Usage and Monitoring Systems for structural health monitoring. In particular, we have made contributions in the following areas. 1. Reference HUMS architecture: We developed a high-level architecture for health monitoring and usage systems (HUMS). The proposed reference architecture is shown. It is compatible with the Generic Open Architecture (GOA) proposed as a standard for avionics systems. 2. HUMS kernel: One of the critical layers of HUMS reference architecture is the HUMS kernel. We developed a detailed design of a kernel to implement the high level architecture. 3. Prototype implementation of HUMS kernel: We have implemented a preliminary version of the HUMS kernel on a Unix platform. We have implemented both a centralized system version and a distributed version. 4. SCRAMNet and HUMS: SCRAMNet (Shared Common Random Access Memory Network) is a system that is found to be suitable to implement HUMS. For this reason, we have conducted a simulation study to determine its stability in handling the input data rates in HUMS. 5. Architectural specification.

  7. Memory bottlenecks and memory contention in multi-core Monte Carlo transport codes

    International Nuclear Information System (INIS)

    Tramm, J.R.; Siegel, A.R.

    2013-01-01

    The simulation of whole nuclear cores through the use of Monte Carlo codes requires an impracticably long time-to-solution. We have extracted a kernel that executes only the most computationally expensive steps of the Monte Carlo particle transport algorithm - the calculation of macroscopic cross sections - in an effort to expose bottlenecks within multi-core, shared memory architectures. (authors)

  8. Control system architecture: The standard and non-standard models

    International Nuclear Information System (INIS)

    Thuot, M.E.; Dalesio, L.R.

    1993-01-01

    Control system architecture development has followed the advances in computer technology through mainframes to minicomputers to micros and workstations. This technology advance and increasingly challenging accelerator data acquisition and automation requirements have driven control system architecture development. In summarizing the progress of control system architecture at the last International Conference on Accelerator and Large Experimental Physics Control Systems (ICALEPCS), B. Kuiper asserted that the system architecture issue was resolved and presented a "standard model". The "standard model" consists of a local area network (Ethernet or FDDI) providing communication between front end microcomputers, connected to the accelerator, and workstations, providing the operator interface and computational support. Although this model represents many present designs, there are exceptions including reflected memory and hierarchical architectures driven by requirements for widely dispersed, large channel count or tightly coupled systems. This paper describes the performance characteristics and features of the "standard model" to determine if the requirements of "non-standard" architectures can be met. Several possible extensions to the "standard model" are suggested, including software as well as hardware architectural features.

  9. A NEW OS ARCHITECTURE FOR IOT

    Directory of Open Access Journals (Sweden)

    Jean Y. Astier

    2018-03-01

    Current computer operating system architectures are not well suited for the coming world of connected objects, known as the Internet of Things (IoT), for multiple reasons: poor communication performance in both point-to-point and broadcast cases, poor operational reliability and network security, and excessive requirements in terms of both processor power and memory size, leading to excessive electrical power consumption. We introduce a new computer operating system architecture well adapted to IoT, from the most modest to the most complex, and more generally able to significantly raise the input/output capacities of any communicating computer. This architecture rests on the principles of the Von Neumann hardware model, and is composed of two types of asymmetric distributed containers, which communicate by message passing. We describe the sub-systems of both of these types of containers, where each sub-system has its own scheduler and a dedicated execution level.

  10. Evaluation of existing and proposed computer architectures for future ground-based systems

    Science.gov (United States)

    Schulbach, C.

    1985-01-01

    Parallel processing architectures and techniques used in current supercomputers are described and projections are made of future advances. Presently, the von Neumann sequential processing pattern has been accelerated by having separate I/O processors, interleaved memories, wide memories, independent functional units and pipelining. Recent supercomputers have featured single-instruction, multiple data stream architectures, which have different processors for performing various operations (vector or pipeline processors). Multiple-instruction, multiple data stream machines have also been developed. Data flow techniques, wherein program instructions are activated only when data are available, are expected to play a large role in future supercomputers, along with increased parallel processor arrays. The enhanced operational speeds are essential for adequately treating data from future spacecraft remote sensing instruments such as the Thematic Mapper.

  11. Disruptive Logic Architectures and Technologies From Device to System Level

    CERN Document Server

    Gaillardon, Pierre-Emmanuel; Clermidy, Fabien

    2012-01-01

    This book discusses the opportunities offered by disruptive technologies to overcome the economical and physical limits currently faced by the electronics industry. It provides a new methodology for the fast evaluation of an emerging technology from an architectural perspective and discusses the implications from simple circuits to complex architectures. Several technologies are discussed, ranging from 3-D integration of devices (Phase Change Memories, Monolithic 3-D, Vertical NanoWires-based transistors) to dense 2-D arrangements (Double-Gate Carbon Nanotubes, Sublithographic Nanowires, Lithographic Crossbar arrangements). Novel architectural organizations, as well as the associated tools, are presented in order to explore this freshly opened design space. Describes a novel architectural organization for future reconfigurable systems; Includes a complete benchmarking toolflow for emerging technologies; Generalizes the description of reconfigurable circuits in terms of hierarchical levels; Assesses disruptive...

  12. Fork-join and data-driven execution models on multi-core architectures: Case study of the FMM

    KAUST Repository

    Amer, Abdelhalim

    2013-01-01

    Extracting maximum performance of multi-core architectures is a difficult task primarily due to bandwidth limitations of the memory subsystem and its complex hierarchy. In this work, we study the implications of fork-join and data-driven execution models on this type of architecture at the level of task parallelism. For this purpose, we use a highly optimized fork-join based implementation of the FMM and extend it to a data-driven implementation using a distributed task scheduling approach. This study exposes some limitations of the conventional fork-join implementation in terms of synchronization overheads. We find that these are not negligible and their elimination by the data-driven method, with a careful data locality strategy, was beneficial. Experimental evaluation of both methods on state-of-the-art multi-socket multi-core architectures showed up to 22% speed-ups of the data-driven approach compared to the original method. We demonstrate that a data-driven execution of FMM not only improves performance by avoiding global synchronization overheads but also reduces the memory-bandwidth pressure caused by memory-intensive computations. © 2013 Springer-Verlag.

  13. Techniques for Reducing Consistency-Related Communication in Distributed Shared Memory System

    OpenAIRE

    Zwaenepoel, W; Bennett, J.K.; Carter, J.B.

    1995-01-01

    Distributed shared memory (DSM) is an abstraction of shared memory on a distributed memory machine. Hardware DSM systems support this abstraction at the architecture level; software DSM systems support the abstraction within the runtime system. One of the key problems in building an efficient software DSM system is to reduce the amount of communication needed to keep the distributed memories consistent. In this paper we present four techniques for doing so: 1) software release consistency; 2)...

  14. We, You, They? Spanish Traits in the Nationalist Memory of Mexican Architecture

    Directory of Open Access Journals (Sweden)

    Johanna Lozoya

    2010-01-01

    The new mestizo tradition in Mexican architectural historiography was invented after the 1930's by sublimating the racist character ever-present in the development of the philosophical, aesthetical, and scientific structures of modern Mexican architectural thought. In the face of the monopoly exerted by the State's cultural ideology on the imaginaries of late-nineteenth-century architectural historiography, still expressed in taxonomies such as “indian expressionism” or “Creole expressionism”, Spanish traits become at once national and foreign, Mexican and anti-Mexican, traditional and opposed to tradition, modernity and antiquity, universality and locality. This work deconstructs turn-of-the-century arguments by analyzing the continuity of Hispanic traits in the post-revolutionary invention of an old mestizo tradition.

  15. Parallel discrete ordinates algorithms on distributed and common memory systems

    International Nuclear Information System (INIS)

    Wienke, B.R.; Hiromoto, R.E.; Brickner, R.G.

    1987-01-01

    The S_n algorithm employs iterative techniques in solving the linear Boltzmann equation. These methods, both ordered and chaotic, were compared on both the Denelcor HEP and the Intel hypercube. Strategies are linked to the organization and accessibility of memory (common memory versus distributed memory architectures), with common concern for acquisition of global information. Apart from this, the inherent parallelism of the algorithm maps directly onto the two architectures. Results comparing execution times, speedup, and efficiency are based on a representative 16-group (full upscatter and downscatter) sample problem. Calculations were performed on both the Los Alamos National Laboratory (LANL) Denelcor HEP and the LANL Intel hypercube. The Denelcor HEP is a 64-bit multi-instruction, multidata (MIMD) machine consisting of up to 16 process execution modules (PEMs), each capable of executing 64 processes concurrently. Each PEM can cooperate on a job, or run several unrelated jobs, and share a common global memory through a crossbar switch. The Intel hypercube, on the other hand, is a distributed memory system composed of 128 processing elements, each with its own local memory. Processing elements are connected in a nearest-neighbor hypercube configuration, and sharing of data among processors requires execution of explicit message-passing constructs.

  16. Implicit Unstructured Aerodynamics on Emerging Multi- and Many-Core HPC Architectures

    KAUST Repository

    Al Farhan, Mohammed A.

    2017-03-13

    Shared memory parallelization of PETSc-FUN3D, an unstructured tetrahedral mesh Euler code previously characterized for distributed memory Single Program, Multiple Data (SPMD) for thousands of nodes, is hybridized with shared memory Single Instruction, Multiple Data (SIMD) for hundreds of threads per node. We explore thread-level performance optimizations on state-of-the-art multi- and many-core Intel processors, including the second generation of Xeon Phi, Knights Landing (KNL). We study the performance on the KNL with different configurations of memory and cluster modes, with code optimizations to minimize indirect addressing and enhance the cache locality. The optimizations employed are expected to be of value to other unstructured applications as many-core architecture evolves.

  17. An ACL2 Mechanization of an Axiomatic Framework for Weak Memory

    Directory of Open Access Journals (Sweden)

    Benjamin Selfridge

    2014-06-01

    Proving the correctness of programs written for multiple processors is a challenging problem, due in no small part to the weaker memory guarantees afforded by most modern architectures. In particular, the existence of store buffers means that the programmer can no longer assume that writes to different locations become visible to all processors in the same order. However, all practical architectures do provide a collection of weaker guarantees about memory consistency across processors, which enable the programmer to write provably correct programs in spite of a lack of full sequential consistency. In this work, we present a mechanization in the ACL2 theorem prover of an axiomatic weak memory model (introduced by Alglave et al.). In the process, we provide a new proof of an established theorem involving these axioms.
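
    The effect of store buffers described in this abstract can be illustrated with the classic store-buffering litmus test. The sketch below is a toy operational model written for this listing, not the axiomatic model of Alglave et al. or the ACL2 mechanization; the function name and the two-thread program are assumptions. It enumerates every interleaving of two threads (thread 0: x = 1; r0 = y, thread 1: y = 1; r1 = x), with or without a per-thread FIFO store buffer.

```python
def final_outcomes(use_store_buffers):
    """Return every reachable (r0, r1) pair for the store-buffering litmus test."""
    programs = (
        (("store", "x", 1), ("load", "y", "r0")),  # thread 0
        (("store", "y", 1), ("load", "x", "r1")),  # thread 1
    )
    results = set()

    def explore(pcs, memory, buffers, regs):
        # Both threads finished: record the values the loads observed.
        if all(pcs[t] == len(programs[t]) for t in range(2)):
            results.add((regs["r0"], regs["r1"]))
            return
        for t in range(2):
            # Option 1: drain the oldest buffered store of thread t to memory.
            if buffers[t]:
                addr, value = buffers[t][0]
                drained = tuple(b[1:] if i == t else b
                                for i, b in enumerate(buffers))
                explore(pcs, {**memory, addr: value}, drained, regs)
            # Option 2: execute the next instruction of thread t.
            if pcs[t] < len(programs[t]):
                op, addr, arg = programs[t][pcs[t]]
                next_pcs = tuple(pc + 1 if i == t else pc
                                 for i, pc in enumerate(pcs))
                if op == "store" and use_store_buffers:
                    queued = tuple(b + ((addr, arg),) if i == t else b
                                   for i, b in enumerate(buffers))
                    explore(next_pcs, memory, queued, regs)
                elif op == "store":
                    explore(next_pcs, {**memory, addr: arg}, buffers, regs)
                else:
                    # A load sees the thread's own newest buffered store first.
                    pending = [v for a, v in buffers[t] if a == addr]
                    value = pending[-1] if pending else memory[addr]
                    explore(next_pcs, memory, buffers, {**regs, arg: value})

    explore((0, 0), {"x": 0, "y": 0}, ((), ()), {"r0": None, "r1": None})
    return results


sc_outcomes = final_outcomes(use_store_buffers=False)
buffered_outcomes = final_outcomes(use_store_buffers=True)
```

    Under sequential consistency the outcome r0 = r1 = 0 is unreachable, whereas with store buffers both loads can read the initial values before either store reaches memory.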

  18. Embedded System Synthesis under Memory Constraints

    DEFF Research Database (Denmark)

    Madsen, Jan; Bjørn-Jørgensen, Peter

    1999-01-01

    This paper presents a genetic algorithm to solve the system synthesis problem of mapping a time constrained single-rate system specification onto a given heterogeneous architecture which may contain irregular interconnection structures. The synthesis is performed under memory constraints, that is, the algorithm takes into account the memory size of processors and the size of interface buffers of communication links, and in particular the complicated interplay of these. The presented algorithm is implemented as part of the LYCOS cosynthesis system.

  19. A semi-floating gate memory based on van der Waals heterostructures for quasi-non-volatile applications.

    Science.gov (United States)

    Liu, Chunsen; Yan, Xiao; Song, Xiongfei; Ding, Shijin; Zhang, David Wei; Zhou, Peng

    2018-04-09

    As conventional circuits based on field-effect transistors are approaching their physical limits due to quantum phenomena, semi-floating gate transistors have emerged as an alternative ultrafast and silicon-compatible technology. Here, we show a quasi-non-volatile memory featuring a semi-floating gate architecture with band-engineered van der Waals heterostructures. This two-dimensional semi-floating gate memory demonstrates 156 times longer refresh time with respect to that of dynamic random access memory and ultrahigh-speed writing operations on nanosecond timescales. The semi-floating gate architecture greatly enhances the writing operation performance and is approximately 10^6 times faster than other memories based on two-dimensional materials. The demonstrated characteristics suggest that the quasi-non-volatile memory has the potential to bridge the gap between volatile and non-volatile memory technologies and decrease the power consumption required for frequent refresh operations, enabling a high-speed and low-power random access memory.

  20. Save Now [Y/N]? Machine Memory at War in Iain Banks' "Look to Windward"

    Science.gov (United States)

    Blackmore, Tim

    2010-01-01

    Creating memory during and after wartime trauma is vexed by state attempts to control public and private discourse. Science fiction author Iain Banks' novel "Look to Windward" proposes different ways of preserving memory and culture, from posthuman memory devices, to artwork, to architecture, to personal, local ways of remembering.…

  1. Weak Memory Models with Matching Axiomatic and Operational Definitions

    OpenAIRE

    Zhang, Sizhuo; Vijayaraghavan, Muralidaran; Lustig, Dan; Arvind

    2017-01-01

    Memory consistency models are notorious for being difficult to define precisely, to reason about, and to verify. More than a decade of effort has gone into nailing down the definitions of the ARM and IBM Power memory models, and yet there still remain aspects of those models which (perhaps surprisingly) remain unresolved to this day. In response to these complexities, there has been somewhat of a recent trend in the (general-purpose) architecture community to limit new memory models to being ...

  2. Working memory training improves visual short-term memory capacity.

    Science.gov (United States)

    Schwarb, Hillary; Nail, Jayde; Schumacher, Eric H

    2016-01-01

    Since antiquity, philosophers, theologians, and scientists have been interested in human memory. However, researchers today are still working to understand the capabilities, boundaries, and architecture. While the storage capabilities of long-term memory are seemingly unlimited (Bahrick, J Exp Psychol 113:1-2, 1984), working memory, or the ability to maintain and manipulate information held in memory, seems to have stringent capacity limits (e.g., Cowan, Behav Brain Sci 24:87-185, 2001). Individual differences, however, do exist and these differences can often predict performance on a wide variety of tasks (cf. Engle What is working-memory capacity? 297-314, 2001). Recently, researchers have promoted the enticing possibility that simple behavioral training can expand the limits of working memory which indeed may also lead to improvements on other cognitive processes as well (cf. Morrison and Chein, Psychon Bull Rev 18:46-60 2011). However, initial investigations across a wide variety of cognitive functions have produced mixed results regarding the transferability of training-related improvements. Across two experiments, the present research focuses on the benefit of working memory training on visual short-term memory capacity, a cognitive process that has received little attention in the training literature. Data reveal training-related improvement of global measures of visual short-term memory as well as of measures of the independent sub-processes that contribute to capacity (Awh et al., Psychol Sci 18(7):622-628, 2007). These results suggest that the ability to inhibit irrelevant information within and between trials is enhanced via n-back training allowing for selective improvement on untrained tasks. Additionally, we highlight a potential limitation of the standard adaptive training procedure and propose a modified design to ensure variability in the training environment.

  3. A Survey of Phase Change Memory Systems

    Institute of Scientific and Technical Information of China (English)

    夏飞; 蒋德钧; 熊劲; 孙凝晖

    2015-01-01

    As applications scale up, the demand for main memory capacity increases in order to serve large working sets. It is difficult for DRAM (dynamic random access memory) based memory systems to satisfy this capacity requirement due to their limited scalability and high energy consumption. Compared to DRAM, PCM (phase change memory) has better scalability, lower energy leakage, and non-volatility. PCM memory systems have become a hot topic of academic and industrial research. However, PCM technology has three drawbacks: long write latency, limited write endurance, and high write energy, which raise challenges to its adoption in practice. This paper surveys architectural research work to optimize PCM memory systems. First, it introduces the background of PCM. Then, it surveys in detail research efforts on PCM memory systems in performance optimization, lifetime improvement, and energy saving. This paper also compares and summarizes these techniques from multiple dimensions. Finally, it draws conclusions about these optimization techniques and discusses possible future research directions for PCM memory systems.

  4. Bulk-memory processor for data acquisition

    International Nuclear Information System (INIS)

    Nelson, R.O.; McMillan, D.E.; Sunier, J.W.; Meier, M.; Poore, R.V.

    1981-01-01

    To meet the diverse needs and data rate requirements at the Van de Graaff and Weapons Neutron Research (WNR) facilities, a bulk memory system has been implemented which includes a fast and flexible processor. This bulk memory processor (BMP) utilizes bit slice and microcode techniques and features a 24-bit-wide internal architecture, allowing direct addressing of up to 16 megawords of memory and histogramming up to 16 million counts per channel without overflow. The BMP is interfaced to the MOSTEK MK 8000 bulk memory system and to the standard MODCOMP computer I/O bus. Coding for the BMP both at the microcode level and with macro instructions is supported. The generalized data acquisition system has been extended to support the BMP in a manner transparent to the user.
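
    Both figures quoted in this abstract follow from the 24-bit word width: 2^24 = 16,777,216 addresses (16 megawords) and a largest per-channel count of 2^24 - 1. The short sketch below only reproduces that arithmetic; the names are hypothetical, and the saturating behaviour at the maximum count is an assumption made for illustration, since the abstract does not say how the BMP handles a full channel.

```python
WORD_BITS = 24                          # internal word width stated in the abstract
ADDRESSABLE_WORDS = 1 << WORD_BITS      # 16,777,216 words = 16 megawords
MAX_COUNT = (1 << WORD_BITS) - 1        # largest count a 24-bit channel can hold


def increment_channel(histogram, channel):
    """Add one event to a channel; saturating at MAX_COUNT is an assumption."""
    if histogram[channel] < MAX_COUNT:
        histogram[channel] += 1
    return histogram[channel]


histogram = [0, MAX_COUNT - 1]
first = increment_channel(histogram, 1)   # reaches the 24-bit maximum
second = increment_channel(histogram, 1)  # stays there instead of wrapping
```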

  5. Summary Report for ASC L2 Milestone #4782: Assess Newly Emerging Programming and Memory Models for Advanced Architectures on Integrated Codes

    Energy Technology Data Exchange (ETDEWEB)

    Neely, J. R. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Hornung, R. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Black, A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Robinson, P. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2014-09-29

    This document serves as a detailed companion to the PowerPoint slides presented as part of the ASC L2 milestone review for Integrated Codes milestone #4782 titled “Assess Newly Emerging Programming and Memory Models for Advanced Architectures on Integrated Codes”, due on 9/30/2014, and presented for formal program review on 9/12/2014. The program review committee is represented by Mike Zika (A Program Project Lead for Kull), Brian Pudliner (B Program Project Lead for Ares), Scott Futral (DEG Group Lead in LC), and Mike Glass (Sierra Project Lead at Sandia). This document, along with the presentation materials and a letter of completion signed by the review committee, will act as proof of completion for this milestone.

  6. Assessing Programming Costs of Explicit Memory Localization on a Large Scale Shared Memory Multiprocessor

    Directory of Open Access Journals (Sweden)

    Silvio Picano

    1992-01-01

    We present detailed experimental work involving a commercially available large scale shared memory multiple instruction stream-multiple data stream (MIMD) parallel computer having a software controlled cache coherence mechanism. To make effective use of such an architecture, the programmer is responsible for designing the program's structure to match the underlying multiprocessor's capabilities. We describe the techniques used to exploit our multiprocessor (the BBN TC2000) on a network simulation program, showing the resulting performance gains and the associated programming costs. We show that an efficient implementation relies heavily on the user's ability to explicitly manage the memory system.

  7. Improving Software Performance in the Compute Unified Device Architecture

    Directory of Open Access Journals (Sweden)

    Alexandru PIRJAN

    2010-01-01

    Full Text Available This paper analyzes several aspects regarding the improvement of software performance for applications written in the Compute Unified Device Architecture (CUDA). We address an issue of great importance when programming a CUDA application: the Graphics Processing Unit's (GPU's) memory management through transpose kernels. We also benchmark and evaluate the performance of progressively optimizing a matrix transpose application in CUDA. One particular interest was to research how well the optimization techniques, applied to software applications written in CUDA, scale to the latest generation of general-purpose graphics processing units (GPGPU), like the Fermi architecture implemented in the GTX480 and the previous architecture implemented in the GTX280. Lately, there has been a lot of interest in the literature for this type of optimization analysis, but none of the works so far (to the best of our knowledge) tried to validate whether the optimizations apply to a GPU from the latest Fermi architecture and how well the Fermi architecture scales with these software performance improving techniques.
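
    The abstract does not list the transpose kernels it benchmarks. As a rough illustration only, the sketch below shows the tiling idea that optimized transpose kernels commonly rely on: the matrix is processed in small square blocks so that reads and writes stay within a limited region at a time. It is a CPU-side Python sketch, not CUDA and not the author's code; the function name `transpose_tiled` and the `tile` parameter are assumptions made for this example.

```python
def transpose_tiled(matrix, tile=2):
    """Transpose a list-of-lists matrix block by block.

    Hypothetical helper: each (tile x tile) block is copied to its
    mirrored position, which is the access pattern a tiled GPU kernel
    would stage through fast on-chip memory.
    """
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    out = [[None] * rows for _ in range(cols)]
    for r0 in range(0, rows, tile):          # top-left corner of each block
        for c0 in range(0, cols, tile):
            for r in range(r0, min(r0 + tile, rows)):
                for c in range(c0, min(c0 + tile, cols)):
                    out[c][r] = matrix[r][c]  # element (r, c) lands at (c, r)
    return out


example = transpose_tiled([[1, 2, 3], [4, 5, 6]])
```

    On a GPU the tile size would normally be chosen to match the hardware (for instance the warp width), which this sketch does not model.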

  8. An Architecture for Emotional and Context-Aware Associative Learning for Robot Companions

    OpenAIRE

    Rizzi Raymundo, C.; Johnson, C. G.; Vargas, P. A.

    2015-01-01

    This work proposes a theoretical architectural model based on the brain's fear learning system with the purpose of generating artificial fear conditioning at both stimuli and context abstraction levels in robot companions. The proposed architecture is inspired by the different brain regions involved in fear learning, here divided into four modules that work in an integrated and parallel manner: the sensory system, the amygdala system, the hippocampal system and the working memory. Each of the...

  9. Processing-in-Memory Enabled Graphics Processors for 3D Rendering

    Energy Technology Data Exchange (ETDEWEB)

    Xie, Chenhao; Song, Shuaiwen; Wang, Jing; Zhang, Weigong; Fu, Xin

    2017-02-06

    The performance of 3D rendering on a Graphics Processing Unit, which converts a 3D vector stream into 2D frames with 3D image effects, significantly impacts users' gaming experience on modern computer systems. Due to the high texture throughput in 3D rendering, main memory bandwidth becomes a critical obstacle for improving the overall rendering performance. 3D stacked memory systems such as Hybrid Memory Cube (HMC) provide opportunities to significantly overcome the memory wall by directly connecting logic controllers to DRAM dies. Based on the observation that texel fetches significantly impact off-chip memory traffic, we propose two architectural designs to enable Processing-In-Memory based GPU for efficient 3D rendering.

  10. Persistent Memory in Single Node Delay-Coupled Reservoir Computing.

    Science.gov (United States)

    Kovac, André David; Koall, Maximilian; Pipa, Gordon; Toutounji, Hazem

    2016-01-01

    Delays are ubiquitous in biological systems, ranging from genetic regulatory networks and synaptic conductances, to predator/prey population interactions. The evidence is mounting, not only to the presence of delays as physical constraints in signal propagation speed, but also to their functional role in providing dynamical diversity to the systems that comprise them. The latter observation in biological systems inspired the recent development of a computational architecture that harnesses this dynamical diversity, by delay-coupling a single nonlinear element to itself. This architecture is a particular realization of Reservoir Computing, where stimuli are injected into the system in time rather than in space as is the case with classical recurrent neural network realizations. This architecture also exhibits an internal memory which fades in time, an important prerequisite to the functioning of any reservoir computing device. However, fading memory is also a limitation to any computation that requires persistent storage. In order to overcome this limitation, the current work introduces an extended version to the single node Delay-Coupled Reservoir, that is based on trained linear feedback. We show by numerical simulations that adding task-specific linear feedback to the single node Delay-Coupled Reservoir extends the class of solvable tasks to those that require nonfading memory. We demonstrate, through several case studies, the ability of the extended system to carry out complex nonlinear computations that depend on past information, whereas the computational power of the system with fading memory alone quickly deteriorates. Our findings provide the theoretical basis for future physical realizations of a biologically-inspired ultrafast computing device with extended functionality.

  11. Exploring Heterogeneous Multicore Architectures for Advanced Embedded Uncertainty Quantification.

    Energy Technology Data Exchange (ETDEWEB)

    Phipps, Eric T.; Edwards, Harold C.; Hu, Jonathan J.

    2014-09-01

    We explore rearrangements of classical uncertainty quantification methods with the aim of achieving higher aggregate performance for uncertainty quantification calculations on emerging multicore and manycore architectures. We show a rearrangement of the stochastic Galerkin method leads to improved performance and scalability on several computational architectures whereby uncertainty information is propagated at the lowest levels of the simulation code improving memory access patterns, exposing new dimensions of fine grained parallelism, and reducing communication. We also develop a general framework for implementing such rearrangements for a diverse set of uncertainty quantification algorithms as well as computational simulation codes to which they are applied.

  12. A Scalable Unsegmented Multiport Memory for FPGA-Based Systems

    Directory of Open Access Journals (Sweden)

    Kevin R. Townsend

    2015-01-01

    Full Text Available On-chip multiport memory cores are crucial primitives for many modern high-performance reconfigurable architectures and multicore systems. Previous approaches for scaling memory cores come at the cost of operating frequency, communication overhead, and logic resources without increasing the storage capacity of the memory. In this paper, we present two approaches for designing multiport memory cores that are suitable for reconfigurable accelerators with substantial on-chip memory or complex communication. Our design approaches tackle these challenges by banking RAM blocks and utilizing interconnect networks which allows scaling without sacrificing logic resources. With banking, memory congestion is unavoidable and we evaluate our multiport memory cores under different memory access patterns to gain insights about different design trade-offs. We demonstrate our implementation with up to 256 memory ports using a Xilinx Virtex-7 FPGA. Our experimental results report high throughput memories with resource usage that scales with the number of ports.
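
    The remark that congestion is unavoidable with banking can be made concrete with a toy model. The sketch below is not the authors' design: it assumes low-order address interleaving (bank = address mod number of banks) and one request served per bank per cycle, and the function name `cycles_for_requests` is hypothetical. Under those assumptions, the number of cycles needed to serve one batch of port requests equals the largest number of requests that fall into the same bank.

```python
from collections import Counter


def cycles_for_requests(addresses, num_banks):
    """Cycles needed to serve one request per port under the toy model.

    Assumptions (not from the paper): bank index is address % num_banks,
    and each bank serves exactly one request per cycle.
    """
    if not addresses:
        return 0
    per_bank = Counter(addr % num_banks for addr in addresses)
    return max(per_bank.values())  # the most contended bank sets the pace


sequential = cycles_for_requests([0, 1, 2, 3], num_banks=4)   # one request per bank
strided = cycles_for_requests([0, 4, 8, 12], num_banks=4)     # all hit bank 0
```

    In this model a unit-stride pattern spreads across the banks, while a stride equal to the bank count serializes completely, which is why evaluating several access patterns matters.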

  13. Hierarchical architecture of active knits

    International Nuclear Information System (INIS)

    Abel, Julianna; Luntz, Jonathan; Brei, Diann

    2013-01-01

    Nature eloquently utilizes hierarchical structures to form the world around us. Applying the hierarchical architecture paradigm to smart materials can provide a basis for a new genre of actuators which produce complex actuation motions. One promising example of cellular architecture—active knits—provides complex three-dimensional distributed actuation motions with expanded operational performance through a hierarchically organized structure. The hierarchical structure arranges a single fiber of active material, such as shape memory alloys (SMAs), into a cellular network of interlacing adjacent loops according to a knitting grid. This paper defines a four-level hierarchical classification of knit structures: the basic knit loop, knit patterns, grid patterns, and restructured grids. Each level of the hierarchy provides increased architectural complexity, resulting in expanded kinematic actuation motions of active knits. The range of kinematic actuation motions are displayed through experimental examples of different SMA active knits. The results from this paper illustrate and classify the ways in which each level of the hierarchical knit architecture leverages the performance of the base smart material to generate unique actuation motions, providing necessary insight to best exploit this new actuation paradigm. (paper)

  14. A One-Pass Real-Time Decoder Using Memory-Efficient State Network

    Science.gov (United States)

    Shao, Jian; Li, Ta; Zhang, Qingqing; Zhao, Qingwei; Yan, Yonghong

    This paper presents our developed decoder which adopts the idea of statically optimizing part of the knowledge sources while handling the others dynamically. The lexicon, phonetic contexts and acoustic model are statically integrated to form a memory-efficient state network, while the language model (LM) is dynamically incorporated on the fly by means of extended tokens. The novelties of our approach for constructing the state network are (1) introducing two layers of dummy nodes to cluster the cross-word (CW) context dependent fan-in and fan-out triphones, (2) introducing a so-called “WI layer” to store the word identities and putting the nodes of this layer in the non-shared mid-part of the network, (3) optimizing the network at state level by a sufficient forward and backward node-merge process. The state network is organized as a multi-layer structure for distinct token propagation at each layer. By exploiting the characteristics of the state network, several techniques including LM look-ahead, LM cache and beam pruning are specially designed for search efficiency. Especially in beam pruning, a layer-dependent pruning method is proposed to further reduce the search space. The layer-dependent pruning takes account of the neck-like characteristics of the WI layer and the reduced variety of word endings, which enables a tighter beam without introducing many search errors. In addition, other techniques including LM compression, lattice-based bookkeeping and lattice garbage collection are also employed to reduce the memory requirements. Experiments are carried out on a Mandarin spontaneous speech recognition task where the decoder involves a trigram LM and CW triphone models. A comparison with HDecode of the HTK toolkit shows that, within 1% performance deviation, our decoder can run 5 times faster with half of the memory footprint.

  15. Scalable quantum computer architecture with coupled donor-quantum dot qubits

    Science.gov (United States)

    Schenkel, Thomas; Lo, Cheuk Chi; Weis, Christoph; Lyon, Stephen; Tyryshkin, Alexei; Bokor, Jeffrey

    2014-08-26

    A quantum bit computing architecture includes a plurality of single spin memory donor atoms embedded in a semiconductor layer, a plurality of quantum dots arranged with the semiconductor layer and aligned with the donor atoms, wherein a first voltage applied across at least one pair of the aligned quantum dot and donor atom controls a donor-quantum dot coupling. A method of performing quantum computing in a scalable architecture quantum computing apparatus includes arranging a pattern of single spin memory donor atoms in a semiconductor layer, forming a plurality of quantum dots arranged with the semiconductor layer and aligned with the donor atoms, applying a first voltage across at least one aligned pair of a quantum dot and donor atom to control a donor-quantum dot coupling, and applying a second voltage between one or more quantum dots to control a Heisenberg exchange J coupling between quantum dots and to cause transport of a single spin polarized electron between quantum dots.

  16. Dynamic Neural Fields as a Step Towards Cognitive Neuromorphic Architectures

    Directory of Open Access Journals (Sweden)

    Yulia Sandamirskaya

    2014-01-01

    Full Text Available Dynamic Field Theory (DFT) is an established framework for modelling embodied cognition. In DFT, elementary cognitive functions such as memory formation, formation of grounded representations, attentional processes, decision making, adaptation, and learning emerge from neuronal dynamics. The basic computational element of this framework is a Dynamic Neural Field (DNF). Under constraints on the time-scale of the dynamics, the DNF is computationally equivalent to a soft winner-take-all (WTA) network, which is considered one of the basic computational units in neuronal processing. Recently, it has been shown how a WTA network may be implemented in neuromorphic hardware, such as an analogue Very Large Scale Integration (VLSI) device. This paper leverages the relationship between DFT and soft WTA networks to systematically revise and integrate established DFT mechanisms that have previously been spread among different architectures. In addition, I also identify some novel computational and architectural mechanisms of DFT which may be implemented in neuromorphic VLSI devices using WTA networks as an intermediate computational layer. These specific mechanisms include the stabilization of working memory, the coupling of sensory systems to motor dynamics, intentionality, and autonomous learning. I further demonstrate how all these elements may be integrated into a unified architecture to generate behavior and autonomous learning.

  17. A Workload-Adaptive and Reconfigurable Bus Architecture for Multicore Processors

    Directory of Open Access Journals (Sweden)

    Shoaib Akram

    2010-01-01

    Full Text Available Interconnection networks for multicore processors are traditionally designed to serve a diversity of workloads. However, different workloads or even different execution phases of the same workload may benefit from different interconnect configurations. In this paper, we first motivate the need for workload-adaptive interconnection networks. Subsequently, we describe an interconnection network framework based on reconfigurable switches for use in medium-scale (up to 32 cores) shared memory multicore processors. Our cost-effective reconfigurable interconnection network is implemented on a traditional shared bus interconnect with snoopy-based coherence, and it enables improved multicore performance. The proposed interconnect architecture distributes the cores of the processor into clusters with reconfigurable logic between clusters to support workload-adaptive policies for inter-cluster communication. Our interconnection scheme is complemented by interconnect-aware scheduling and additional interconnect optimizations which help boost the performance of multiprogramming and multithreaded workloads. We provide experimental results that show that the overall throughput of multiprogramming workloads (consisting of two and four programs) can be improved by up to 60% with our configurable bus architecture. Similar gains can be achieved also for multithreaded applications as shown by further experiments. Finally, we present the performance sensitivity of the proposed interconnect architecture on shared memory bandwidth availability.

  18. Multiprocessor shared-memory information exchange

    International Nuclear Information System (INIS)

    Santoline, L.L.; Bowers, M.D.; Crew, A.W.; Roslund, C.J.; Ghrist, W.D. III

    1989-01-01

    In distributed microprocessor-based instrumentation and control systems, the inter- and intra-subsystem communication requirements ultimately form the basis for the overall system architecture. This paper describes a software protocol which addresses the intra-subsystem communications problem. Specifically, the protocol allows for multiple processors to exchange information via a shared-memory interface. The authors' primary goal is to provide a reliable means for information to be exchanged between central application processor boards (masters) and dedicated function processor boards (slaves) in a single computer chassis. The resultant Multiprocessor Shared-Memory Information Exchange (MSMIE) protocol, a standard master-slave shared-memory interface suitable for use in nuclear safety systems, is designed to pass unidirectional buffers of information between the processors while providing a minimum, deterministic cycle time for this data exchange.

  19. Wind Power Forecasting Based on Echo State Networks and Long Short-Term Memory

    DEFF Research Database (Denmark)

    López, Erick; Allende, Héctor; Gil, Esteban

    2018-01-01

    involved. In particular, two types of RNN, Long Short-Term Memory (LSTM) and Echo State Network (ESN), have shown good results in time series forecasting. In this work, we present an LSTM+ESN architecture that combines the characteristics of both networks. An architecture similar to an ESN is proposed...

  20. Distributed-memory matrix computations

    DEFF Research Database (Denmark)

    Balle, Susanne Mølleskov

    1995-01-01

    The main goal of this project is to investigate, develop, and implement algorithms for numerical linear algebra on parallel computers in order to acquire expertise in methods for parallel computations. An important motivation for analyzing and investigating the potential for parallelism in these algorithms is that many scientific applications rely heavily on the performance of the involved dense linear algebra building blocks. Even though we consider the distributed-memory as well as the shared-memory programming paradigm, the major part of the thesis is dedicated to distributed-memory architectures. We emphasize distributed-memory massively parallel computers - such as the Connection Machines model CM-200 and model CM-5/CM-5E - available to us at UNI-C and at Thinking Machines Corporation. The CM-200 was at the time this project started one of the few existing massively parallel computers...
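
    The abstract does not say which data layouts the thesis uses. As one common illustration of a dense linear algebra building block on a distributed-memory machine, the sketch below simulates a matrix-vector product with a row-block distribution: each of `p` hypothetical processes owns a contiguous block of rows, computes its slice of the result locally, and the slices are then gathered. The names `block_rows` and `distributed_matvec` are assumptions for this example, and the loop over ranks stands in for processes that would really run in parallel and communicate by message passing.

```python
def block_rows(n, p):
    """Split row indices 0..n-1 into p contiguous blocks whose sizes differ by at most one."""
    base, extra = divmod(n, p)
    blocks, start = [], 0
    for rank in range(p):
        size = base + (1 if rank < extra else 0)  # first `extra` ranks get one more row
        blocks.append(range(start, start + size))
        start += size
    return blocks


def distributed_matvec(A, x, p):
    """Simulate y = A @ x with rows of A distributed over p processes."""
    y = []
    for rows in block_rows(len(A), p):
        # Local work of one process: dot products for the rows it owns.
        local = [sum(a * b for a, b in zip(A[i], x)) for i in rows]
        y.extend(local)  # stands in for the gather of local results
    return y


y = distributed_matvec([[1, 2], [3, 4], [5, 6]], [1, 1], p=2)
```

    This layout assumes every process holds a full copy of `x`; other distributions trade that memory cost against extra communication.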

  1. INVESTIGATION OF FLIP-FLOP PERFORMANCE ON DIFFERENT TYPE AND ARCHITECTURE IN SHIFT REGISTER WITH PARALLEL LOAD APPLICATIONS

    Directory of Open Access Journals (Sweden)

    Dwi Purnomo

    2015-08-01

    Full Text Available Registers are computer components that play a key role in computer organisation. Every computer contains millions of registers, which are implemented with flip-flops. This research investigates flip-flop performance based on type (D, T, S-R, and J-K) and architecture (structural, behavioural, and hybrid). Each flip-flop type in each architecture is tested in shift registers of different bit widths with parallel load applications. The criteria assessed are power consumption, resources required, memory required, latency, and efficiency. The experiments show that the D flip-flop and the hybrid architecture perform best in required memory, latency, power consumption, and efficiency. In addition, the results show that the greater the register number, the less efficient the system becomes.

  2. Parallel-Architecture Simulator Development Using Hardware Transactional Memory

    OpenAIRE

    Armejach Sanosa, Adrià

    2009-01-01

    To address the need for a simpler parallel programming model, Transactional Memory (TM) has been developed and promises good parallel performance with easy-to-write parallel code. Unlike lock-based approaches, with TM, programmers do not need to explicitly specify and manage the synchronization among threads. Instead, programmers simply mark code segments as transactions, and the TM system manages the concurrency control for them. TM can be implemented either in software (STM) or hardware (HT...
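
    To illustrate what "marking a code segment as a transaction" can look like, here is a minimal software sketch of optimistic concurrency control. It is not the simulator or the hardware mechanism described in the thesis: the names `TVar` and `atomically`, the per-variable version counter, and the single global commit lock are all assumptions chosen to keep the example short. A transaction runs without locks, records which versions it read, and commits only if none of those versions changed; otherwise it is retried.

```python
import threading


class TVar:
    """Hypothetical transactional variable: a value plus a version counter."""

    def __init__(self, value):
        self.value = value
        self.version = 0


_commit_lock = threading.Lock()  # serializes validation and commit only


def atomically(transaction):
    """Run transaction(read, write) until it commits without conflicts."""
    while True:
        reads = {}   # TVar -> version observed on first read
        writes = {}  # TVar -> value buffered until commit

        def read(tvar):
            if tvar in writes:
                return writes[tvar]
            reads.setdefault(tvar, tvar.version)  # record version before the value
            return tvar.value

        def write(tvar, value):
            writes[tvar] = value

        result = transaction(read, write)
        with _commit_lock:
            if all(tvar.version == seen for tvar, seen in reads.items()):
                for tvar, value in writes.items():
                    tvar.value = value
                    tvar.version += 1
                return result
        # A variable we read was changed by another commit: retry.


account_a, account_b = TVar(10), TVar(0)


def transfer(read, write):
    # The programmer only describes the atomic step; no locks appear here.
    write(account_a, read(account_a) - 3)
    write(account_b, read(account_b) + 3)


atomically(transfer)
```

    A real TM system would also have to deal with transactions that observe inconsistent intermediate state before being retried, which this sketch ignores.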

  3. Next generation spin torque memories

    CERN Document Server

    Kaushik, Brajesh Kumar; Kulkarni, Anant Aravind; Prajapati, Sanjay

    2017-01-01

    This book offers detailed insights into spin transfer torque (STT) based devices, circuits and memories. Starting with the basic concepts and device physics, it then addresses advanced STT applications and discusses the outlook for this cutting-edge technology. It also describes the architectures, performance parameters, fabrication, and the prospects of STT based devices. Further, moving from the device to the system perspective it presents a non-volatile computing architecture composed of STT based magneto-resistive and all-spin logic devices and demonstrates that efficient STT based magneto-resistive and all-spin logic devices can turn the dream of instant on/off non-volatile computing into reality.

  4. A high-throughput readout architecture based on PCI-Express Gen3 and DirectGMA technology

    International Nuclear Information System (INIS)

    Rota, L.; Vogelgesang, M.; Perez, L.E. Ardila; Caselle, M.; Chilingaryan, S.; Dritschler, T.; Zilio, N.; Kopmann, A.; Balzer, M.; Weber, M.

    2016-01-01

    Modern physics experiments produce multi-GB/s data rates. Fast data links and high performance computing stages are required for continuous data acquisition and processing. Because of their intrinsic parallelism and computational power, GPUs emerged as an ideal solution to process this data in high performance computing applications. In this paper we present a high-throughput platform based on direct FPGA-GPU communication. The architecture consists of a Direct Memory Access (DMA) engine compatible with the Xilinx PCI-Express core, a Linux driver for register access, and high-level software to manage direct memory transfers using AMD's DirectGMA technology. Measurements with a Gen3 x8 link show a throughput of 6.4 GB/s for transfers to GPU memory and 6.6 GB/s to system memory. We also assess the possibility of using the architecture in low latency systems: preliminary measurements show a round-trip latency as low as 1 μs for data transfers to system memory, while the additional latency introduced by OpenCL scheduling is the current limitation for GPU based systems. Our implementation is suitable for real-time DAQ system applications ranging from photon science and medical imaging to High Energy Physics (HEP) systems.
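
    The reported throughput figures can be put in context with a short calculation. PCI-Express Gen3 signals at 8 GT/s per lane with 128b/130b line encoding, so an x8 link carries at most about 7.88 GB/s before packet overhead. The sketch below computes that bound and the fraction of it reached by the measured rates; the function name is hypothetical, and packet header and flow-control overhead are deliberately ignored, so the true achievable ceiling is somewhat lower.

```python
def pcie_gen3_raw_limit_gbytes(lanes):
    """Upper bound on PCIe Gen3 throughput in GB/s, ignoring packet overhead."""
    transfers_per_s = 8.0      # GT/s per lane for Gen3
    encoding = 128.0 / 130.0   # 128b/130b line encoding
    return transfers_per_s * encoding * lanes / 8.0  # bits -> bytes


limit = pcie_gen3_raw_limit_gbytes(8)   # about 7.88 GB/s for a x8 link
gpu_fraction = 6.4 / limit              # measured rate to GPU memory
system_fraction = 6.6 / limit           # measured rate to system memory
```

    Under these assumptions the measured rates correspond to roughly 81% and 84% of the encoding-limited bound.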

  5. A compact PE memory for vision chips

    Science.gov (United States)

    Cong, Shi; Zhe, Chen; Jie, Yang; Nanjian, Wu; Zhihua, Wang

    2014-09-01

    This paper presents a novel compact memory in the processing element (PE) for single-instruction multiple-data (SIMD) vision chips. The PE memory is constructed with 8 × 8 register cells, where one latch in the slave stage is shared by eight latches in the master stage. The memory supports simultaneous read and write on the same address in one clock cycle. Its compact area of 14.33 μm²/bit promises a higher integration level of the processor. A prototype chip with a 64 × 64 PE array is fabricated in a UMC 0.18 μm CMOS technology. Five types of the PE memory cell structure are designed and compared. The testing results demonstrate that the proposed PE memory architecture well satisfies the requirement of the vision chip in high-speed real-time vision applications, such as 1000 fps edge extraction.

  6. A compact PE memory for vision chips

    International Nuclear Information System (INIS)

    Shi Cong; Chen Zhe; Yang Jie; Wu Nanjian; Wang Zhihua

    2014-01-01

    This paper presents a novel compact memory in the processing element (PE) for single-instruction multiple-data (SIMD) vision chips. The PE memory is constructed with 8 × 8 register cells, where one latch in the slave stage is shared by eight latches in the master stage. The memory supports simultaneous read and write on the same address in one clock cycle. Its compact area of 14.33 μm²/bit promises a higher integration level of the processor. A prototype chip with a 64 × 64 PE array is fabricated in a UMC 0.18 μm CMOS technology. Five types of the PE memory cell structure are designed and compared. The testing results demonstrate that the proposed PE memory architecture well satisfies the requirement of the vision chip in high-speed real-time vision applications, such as 1000 fps edge extraction. (semiconductor integrated circuits)

  7. Heterogeneous reconfigurable processors for real-time baseband processing from algorithm to architecture

    CERN Document Server

    Zhang, Chenxin; Öwall, Viktor

    2016-01-01

    This book focuses on domain-specific heterogeneous reconfigurable architectures, demonstrating for readers a computing platform which is flexible enough to support multiple standards, multiple modes, and multiple algorithms. The content is multi-disciplinary, covering areas of wireless communication, computing architecture, and circuit design. The platform described provides real-time processing capability with reasonable implementation cost, achieving balanced trade-offs among flexibility, performance, and hardware costs. The authors discuss efficient design methods for wireless communication processing platforms, from both an algorithm and architecture design perspective. Coverage also includes computing platforms for different wireless technologies and standards, including MIMO, OFDM, Massive MIMO, DVB, WLAN, LTE/LTE-A, and 5G. •Discusses reconfigurable architectures, including hardware building blocks such as processing elements, memory sub-systems, Network-on-Chip (NoC), and dynamic hardware reconfigur...

  8. Languages, compilers and run-time environments for distributed memory machines

    CERN Document Server

    Saltz, J

    1992-01-01

    Papers presented within this volume cover a wide range of topics related to programming distributed memory machines. Distributed memory architectures, although having the potential to supply the very high levels of performance required to support future computing needs, present awkward programming problems. The major issue is to design methods which enable compilers to generate efficient distributed memory programs from relatively machine independent program specifications. This book is the compilation of papers describing a wide range of research efforts aimed at easing the task of programmin

  9. Neuromorphic Computing – From Materials Research to Systems Architecture Roundtable

    Energy Technology Data Exchange (ETDEWEB)

    Schuller, Ivan K. [Univ. of California, San Diego, CA (United States); Stevens, Rick [Argonne National Lab. (ANL), Argonne, IL (United States); Univ. of Chicago, IL (United States); Pino, Robinson [Dept. of Energy (DOE) Office of Science, Washington, DC (United States); Pechan, Michael [Dept. of Energy (DOE) Office of Science, Washington, DC (United States)

    2015-10-29

    Computation in its many forms is the engine that fuels our modern civilization. Modern computation, based on the von Neumann architecture, has allowed, until now, the development of continuous improvements, as predicted by Moore's law. However, computation using current architectures and materials will inevitably (within the next 10 years) reach a limit because of fundamental scientific reasons. DOE convened a roundtable of experts in neuromorphic computing systems, materials science, and computer science in Washington on October 29-30, 2015 to address the following basic questions: Can brain-like (“neuromorphic”) computing devices based on new material concepts and systems be developed to dramatically outperform conventional CMOS based technology? If so, what are the basic research challenges for materials science and computing? The overarching answer that emerged was: The development of novel functional materials and devices incorporated into unique architectures will allow a revolutionary technological leap toward the implementation of a fully “neuromorphic” computer. To address this challenge, the following issues were considered: (1) the main differences between neuromorphic and conventional computing as related to signaling models, timing/clock, non-volatile memory, architecture, fault tolerance, integrated memory and compute, noise tolerance, analog vs. digital, and in situ learning; (2) new neuromorphic architectures needed to produce lower energy consumption, potential novel nanostructured materials, and enhanced computation; (3) device and materials properties needed to implement functions such as hysteresis, stability, and fault tolerance; (4) comparisons of different implementations (spin torque, memristors, resistive switching, phase change, and optical schemes) for enhanced breakthroughs in performance, cost, fault tolerance, and/or manufacturability.

  10. Building the Rainbow Nation. A critical analysis of the role of architecture in materializing a post-apartheid South African identity

    Directory of Open Access Journals (Sweden)

    Kim Raedt

    2012-02-01

    Full Text Available Soon after apartheid was abolished in 1994, the quest for a new, ‘authentic’ South African identity resulted in the emergence of the "Rainbow Nation" idea, picturing an equal, multicultural and reconciled society. As architecture is considered a crucial element in the promotion of this Rainbow identity, the country witnessed a remarkable "building boom" with its apogee roughly between 1998 and 2010. Huge investments have been made in state-driven projects which place the apartheid memory at the center of the architectural debate – mostly museums and memorials. However, the focus of this paper shall lie on another, less highlighted tendency in current architectural practice. This paper demonstrates that, through the construction of urban community services, South African architects attempt to materialize the Rainbow Nation in a way that might be closer to the everyday reality of society. Key words: architecture, post apartheid, Cape Town, South Africa, identity

  11. Persistent Memory in Single Node Delay-Coupled Reservoir Computing.

    Directory of Open Access Journals (Sweden)

    André David Kovac

    Full Text Available Delays are ubiquitous in biological systems, ranging from genetic regulatory networks and synaptic conductances, to predator/prey population interactions. The evidence is mounting, not only to the presence of delays as physical constraints in signal propagation speed, but also to their functional role in providing dynamical diversity to the systems that comprise them. The latter observation in biological systems inspired the recent development of a computational architecture that harnesses this dynamical diversity, by delay-coupling a single nonlinear element to itself. This architecture is a particular realization of Reservoir Computing, where stimuli are injected into the system in time rather than in space as is the case with classical recurrent neural network realizations. This architecture also exhibits an internal memory which fades in time, an important prerequisite to the functioning of any reservoir computing device. However, fading memory is also a limitation to any computation that requires persistent storage. In order to overcome this limitation, the current work introduces an extended version to the single node Delay-Coupled Reservoir, that is based on trained linear feedback. We show by numerical simulations that adding task-specific linear feedback to the single node Delay-Coupled Reservoir extends the class of solvable tasks to those that require nonfading memory. We demonstrate, through several case studies, the ability of the extended system to carry out complex nonlinear computations that depend on past information, whereas the computational power of the system with fading memory alone quickly deteriorates. Our findings provide the theoretical basis for future physical realizations of a biologically-inspired ultrafast computing device with extended functionality.

  12. Associative memory through rigid origami

    Science.gov (United States)

    Murugan, Arvind; Brenner, Michael

    2015-03-01

    Mechanisms such as Miura Ori have proven useful in diverse contexts since they have only one degree of freedom that is easily controlled. We combine the theory of rigid origami and associative memory in frustrated neural networks to create structures that can "learn" multiple generic folding mechanisms and yet can be robustly controlled. We show that such rigid origami structures can "recall" a specific learned mechanism when induced by a physical impulse that only need resemble the desired mechanism (i.e. robust recall through association). Such associative memory in matter, seen before in self-assembly, arises due to a balance between local promiscuity (i.e., many local degrees of freedom) and global frustration which minimizes interference between different learned behaviors. Origami with associative memory can lead to a new class of deployable structures and kinetic architectures with multiple context-dependent behaviors.

  13. Aspects of GPU perfomance in algorithms with random memory access

    Science.gov (United States)

    Kashkovsky, Alexander V.; Shershnev, Anton A.; Vashchenkov, Pavel V.

    2017-10-01

    The numerical code for solving the Boltzmann equation on the hybrid computational cluster using the Direct Simulation Monte Carlo (DSMC) method showed that on Tesla K40 accelerators computational performance drops dramatically as the percentage of occupied GPU memory increases. Testing revealed that memory access time increases tens of times after a certain critical percentage of memory is occupied. Moreover, this seems to be a common problem of all NVIDIA GPUs, arising from their architecture. A few modifications of the numerical algorithm were suggested to overcome this problem. One of them, based on splitting the memory into "virtual" blocks, resulted in a 2.5 times speedup.

  14. Impact of memory bottleneck on the performance of graphics processing units

    Science.gov (United States)

    Son, Dong Oh; Choi, Hong Jun; Kim, Jong Myon; Kim, Cheol Hong

    2015-12-01

    Recent graphics processing units (GPUs) can process general-purpose applications as well as graphics applications with the help of various user-friendly application programming interfaces (APIs) supported by GPU vendors. Unfortunately, utilizing the hardware resources in the GPU efficiently is a challenging problem, since the GPU architecture is totally different from the traditional CPU architecture. To solve this problem, many studies have focused on techniques for improving system performance using GPUs. In this work, we analyze GPU performance while varying GPU parameters such as the number of cores and clock frequency. According to our simulations, GPU performance can be improved by 125.8% and 16.2% on average as the number of cores and clock frequency increase, respectively. However, the performance saturates when memory bottleneck problems occur due to huge data requests to the memory. The performance of GPUs can be improved as the memory bottleneck is reduced by changing GPU parameters dynamically.

  15. Sleep-dependent memory consolidation in healthy aging and mild cognitive impairment.

    Science.gov (United States)

    Pace-Schott, Edward F; Spencer, Rebecca M C

    2015-01-01

    Sleep quality and architecture as well as sleep's homeostatic and circadian controls change with healthy aging. Changes include reductions in slow-wave sleep's (SWS) percent and spectral power in the sleep electroencephalogram (EEG), number and amplitude of sleep spindles, rapid eye movement (REM) density and the amplitude of circadian rhythms, as well as a phase advance (moved earlier in time) of the brain's circadian clock. With mild cognitive impairment (MCI) there are further reductions of sleep quality, SWS, spindles, and percent REM, all of which further diminish, along with a profound disruption of circadian rhythmicity, with the conversion to Alzheimer's disease (AD). Sleep disorders may represent risk factors for dementias (e.g., REM Behavior Disorder presages Parkinson's disease) and sleep disorders are themselves extremely prevalent in neurodegenerative diseases. Working memory, formation of new episodic memories, and processing speed all decline with healthy aging whereas semantic, recognition, and emotional declarative memory are spared. In MCI, episodic and working memory further decline along with declines in semantic memory. In young adults, sleep-dependent memory consolidation (SDC) is widely observed for both declarative and procedural memory tasks. However, with healthy aging, although SDC for declarative memory is preserved, certain procedural tasks, such as motor-sequence learning, do not show SDC. In younger adults, fragmentation of sleep can reduce SDC, and a normative increase in sleep fragmentation may account for reduced SDC with healthy aging. Whereas sleep disorders such as insomnia, obstructive sleep apnea, and narcolepsy can impair SDC in the absence of neurodegenerative changes, the incidence of sleep disorders increases both with normal aging and, further, with neurodegenerative disease. Specific features of sleep architecture, such as sleep spindles and SWS are strongly linked to SDC. 
Diminution of these features with healthy aging

  16. Narratives in Mamluk architecture: Spatial and perceptual analyses of the madrassas and their mausoleums

    OpenAIRE

    Malhis, Shatha

    2017-01-01

    Mamluk sultans were known for their patronage of the arts and architecture. Their educational institutions were among the wide array of architectural projects that linked them as ruling elites to the religious scholars of their times. Their tombs were placed in a mausoleum attached to their educational–religious complexes to attest to their legacy. The evolution of their buildings such that both educational and memorial functions are integrated with the dense surroundings is scrutinized throu...

  17. Modeling aspects of human memory for scientific study.

    Energy Technology Data Exchange (ETDEWEB)

    Caudell, Thomas P. (University of New Mexico); Watson, Patrick (University of Illinois - Champaign-Urbana Beckman Institute); McDaniel, Mark A. (Washington University); Eichenbaum, Howard B. (Boston University); Cohen, Neal J. (University of Illinois - Champaign-Urbana Beckman Institute); Vineyard, Craig Michael; Taylor, Shawn Ellis; Bernard, Michael Lewis; Morrow, James Dan; Verzi, Stephen J.

    2009-10-01

    Working with leading experts in the field of cognitive neuroscience and computational intelligence, SNL has developed a computational architecture that represents neurocognitive mechanisms associated with how humans remember experiences in their past. The architecture represents how knowledge is organized and updated through information from individual experiences (episodes) via the cortical-hippocampal declarative memory system. We compared the simulated behavioral characteristics with those of humans measured under well established experimental standards, controlling for unmodeled aspects of human processing, such as perception. We used this knowledge to create robust simulations of human memory behaviors that should help move the scientific community closer to understanding how humans remember information. These behaviors were experimentally validated against actual human subjects, and the validation was published. An important outcome of the validation process will be the joining of specific experimental testing procedures from the field of neuroscience with computational representations from the field of cognitive modeling and simulation.

  18. Track recognition with an associative pattern memory

    International Nuclear Information System (INIS)

    Bok, H.W. den; Visschers, J.L.; Borgers, A.J.; Lourens, W.

    1991-01-01

    Using Programmable Gate Arrays (PGAs), a prototype for a fast Associative Pattern Memory module has been realized. The associative memory performs the recognition of tracks within the hadron detector data acquisition system at NIKHEF-K. The memory matches the detector state with a set of 24 predefined tracks to identify the particle tracks that occur during an event. This information enables the trigger hardware to classify and select or discriminate the event. Mounted on a standard size (6U) VME board, several PGAs together form an associative memory. The internal logic architecture of the Gate Array is used in such a way as to minimize signal propagation delay. The memory cells, containing a binary representation of the particle tracks, are dynamically loadable through a VME bus interface, providing a high level of flexibility. The hadron detector and its readout system are briefly described and our track representation method is presented. Results from measurements under experimental conditions are discussed. (orig.)
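
    The matching step described in this record can be sketched in a few lines of Python. This is a minimal sketch under assumptions that are not in the abstract: the detector state and each stored track are modeled as bit masks over detector cells, a track counts as matched when all of its cells are hit, and the function name `match_tracks` and the example patterns are hypothetical. The hardware compares all stored patterns at once; the loop here only imitates the result, not the timing.

```python
def match_tracks(detector_state, track_patterns):
    """Return indices of stored tracks whose cells are all set in detector_state.

    Both arguments use integers as bit masks (an assumption made for
    illustration; the abstract only says the cells hold a binary
    representation of the particle tracks).
    """
    return [
        index
        for index, pattern in enumerate(track_patterns)
        if (detector_state & pattern) == pattern
    ]


# Hypothetical 8-cell detector with three stored track patterns.
TRACKS = [0b00010011, 0b01100100, 0b10001000]
hits = 0b01110111
matched = match_tracks(hits, TRACKS)
```

    With these made-up patterns, the first two tracks are fully contained in the hit mask and the third is not.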

  19. T-CREST: Time-predictable multi-core architecture for embedded systems

    DEFF Research Database (Denmark)

    Schoeberl, Martin; Abbaspourseyedi, Sahar; Jordan, Alexander

    2015-01-01

    -core architectures that are optimized for the WCET instead of the average-case execution time. The resulting time-predictable resources (processors, interconnect, memory arbiter, and memory controller) and tools (compiler, WCET analysis) are designed to ease WCET analysis and to optimize WCET performance. Compared...... domain shows that the WCET can be reduced for computation-intensive tasks when distributing the tasks on several cores and using the network-on-chip for communication. With three cores the WCET is improved by a factor of 1.8 and with 15 cores by a factor of 5.7. The T-CREST project is the result...
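
    The two improvement factors quoted in this record can be read as parallel efficiency, that is, improvement factor divided by core count. The short Python sketch below does only that arithmetic; it assumes both factors are measured against a single-core baseline, which the truncated abstract does not state, and the function name is hypothetical.

```python
def parallel_efficiency(improvement_factor, cores):
    """Improvement factor per core (1.0 would mean ideal linear scaling)."""
    return improvement_factor / cores


# Figures quoted in the abstract: factor 1.8 on 3 cores, factor 5.7 on 15 cores.
efficiency_3 = parallel_efficiency(1.8, 3)    # about 0.60
efficiency_15 = parallel_efficiency(5.7, 15)  # about 0.38
```

    Under that assumption, the WCET gain per core falls from roughly 60% to roughly 38% of ideal as the core count grows from 3 to 15.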

  20. Working memory cells' behavior may be explained by cross-regional networks with synaptic facilitation.

    Directory of Open Access Journals (Sweden)

    Sergio Verduzco-Flores

    2009-08-01

    Neurons in the cortex exhibit a number of patterns that correlate with working memory. Specifically, averaged across trials of working memory tasks, neurons exhibit different firing rate patterns during the delay of those tasks. These patterns include: (1) persistent fixed-frequency elevated rates above baseline, (2) elevated rates that decay throughout the task's memory period, (3) rates that accelerate throughout the delay, and (4) patterns of inhibited firing (below baseline) analogous to each of the preceding excitatory patterns. Persistent elevated rate patterns are believed to be the neural correlate of working memory retention and preparation for execution of behavioral/motor responses as required in working memory tasks. Models have proposed that such activity corresponds to stable attractors in cortical neural networks with fixed synaptic weights. However, the variability in patterned behavior and the firing statistics of real neurons across the entire range of those behaviors, across and within trials of working memory tasks, are typically not reproduced. Here we examine the effect of dynamic synapses and network architectures with multiple cortical areas on the states and dynamics of working memory networks. The analysis indicates that the multiple pattern types exhibited by cells in working memory networks are inherent in networks with dynamic synapses, and that the variability and firing statistics in such networks with distributed architectures agree with that observed in the cortex.

  1. Two-dimensional systolic-array architecture for pixel-level vision tasks

    Science.gov (United States)

    Vijverberg, Julien A.; de With, Peter H. N.

    2010-05-01

    This paper presents ongoing work on the design of a two-dimensional (2D) systolic array for image processing. This component is designed to operate on a multi-processor system-on-chip. In contrast with other 2D systolic-array architectures and many other hardware accelerators, we investigate the applicability of executing multiple tasks in a time-interleaved fashion on the Systolic Array (SA). This leads to a lower external memory bandwidth and better load balancing of the tasks on the different processing tiles. To enable the interleaving of tasks, we add a shadow-state register for fast task switching. To reduce the number of accesses to the external memory, we propose to share the communication assist between consecutive tasks. A preliminary, non-functional version of the SA has been synthesized for an XV4S25 FPGA device and yields a maximum clock frequency of 150 MHz requiring 1,447 slices and 5 memory blocks. Mapping tasks from video content-analysis applications from literature on the SA yields reductions in the execution time of 1-2 orders of magnitude compared to the software implementation. We conclude that the choice for an SA architecture is useful, but a scaled version of the SA featuring less logic with fewer processing and pipeline stages yielding a lower clock frequency, would be sufficient for a video analysis system-on-chip.

  2. From green architecture to architectural green

    DEFF Research Database (Denmark)

    Earon, Ofri

    2011-01-01

    that describes the architectural exclusivity of this particular architecture genre. The adjective green expresses architectural qualities differentiating green architecture from none-green architecture. Currently, adding trees and vegetation to the building’s facade is the main architectural characteristics...... they have overshadowed the architectural potential of green architecture. The paper questions how a green space should perform, look like and function. Two examples are chosen to demonstrate thorough integrations between green and space. The examples are public buildings categorized as pavilions. One......The paper investigates the topic of green architecture from an architectural point of view and not an energy point of view. The purpose of the paper is to establish a debate about the architectural language and spatial characteristics of green architecture. In this light, green becomes an adjective...

  3. How aging affects sleep-dependent memory consolidation?

    Directory of Open Access Journals (Sweden)

    Caroline eHarand

    2012-02-01

    Sleep serves multiple functions, among them energy conservation and recuperative processes. In addition, growing evidence indicates that sleep also plays a major role in memory consolidation, a process by which recently acquired and labile memory traces are progressively strengthened into more permanent and/or enhanced forms. Indeed, memories are not stored as they were initially encoded but rather undergo a gradual reorganization process, which is favoured by the neurochemical environment and the electrophysiological activity observed during sleep. Two putative, probably not mutually exclusive, models (the hippocampo-neocortical dialogue and the synaptic homeostasis hypothesis) have been proposed to explain the beneficial effect of sleep on memory processes. It is worth noting that all data gathered until now emerged from studies conducted in young subjects. The investigation of the relationships between sleep and memory in older adults had sparked little interest until recently. Yet aging is characterized by memory impairment, changes in sleep architecture, as well as brain and neurochemical alterations. All these elements suggest that sleep-dependent memory consolidation may be impaired or may occur differently in older adults. Here, we give an overview of the mechanisms governing sleep-dependent memory consolidation, and of the crucial points of this complex process that may malfunction and result in impaired memory consolidation in aging.

  4. An investigation of the effects of interference speech on short-term memory for verbally presented prose

    Science.gov (United States)

    Lodico, Dana M.; Torres, Rendell R.; Shimizu, Yasushi; Hunter, Claudia

    2004-05-01

    This study investigates the effects of interference speech and the built acoustical environment on human performance, and the possibility of designing spaces to architecturally meet the acoustical goals of office and classroom environments. The effects of room size, geometry, and acoustical parameters on human performance are studied through human subject testing. Three experiments are used to investigate the effects of distracting background speech on short-term memory for verbally presented prose under constrained laboratory conditions. Short-term memory performance is rated within four different acoustical spaces and five background noise levels, as well as a quiet condition. The presentation will cover research methods, results, and possibilities for furthering this research. [Work supported by the Program in Architectural Acoustics, School of Architecture, Rensselaer Polytechnic Institute.]

  5. Architecture on Architecture

    DEFF Research Database (Denmark)

    Olesen, Karen

    2016-01-01

    that is not scientific or academic but is more like a latent body of data that we find embedded in existing works of architecture. This information, it is argued, is not limited by the historical context of the work. It can be thought of as a virtual capacity – a reservoir of spatial configurations that can...... correlation between the study of existing architectures and the training of competences to design for present-day realities.......This paper will discuss the challenges faced by architectural education today. It takes as its starting point the double commitment of any school of architecture: on the one hand the task of preserving the particular knowledge that belongs to the discipline of architecture, and on the other hand...

  6. Architecture of 32 bit CISC (Complex Instruction Set Computer) microprocessors

    International Nuclear Information System (INIS)

    Jove, T.M.; Ayguade, E.; Valero, M.

    1988-01-01

    In this paper we describe the main topics about the architecture of the best known 32-bit CISC microprocessors; i80386, MC68000 family, NS32000 series and Z80000. We focus on the high level languages support, operating system design facilities, memory management, techniques to speed up the overall performance and program debugging facilities. (Author)

  7. Method and system for training dynamic nonlinear adaptive filters which have embedded memory

    Science.gov (United States)

    Rabinowitz, Matthew (Inventor)

    2002-01-01

    Described herein is a method and system for training nonlinear adaptive filters (or neural networks) which have embedded memory. Such memory can arise in a multi-layer finite impulse response (FIR) architecture, or an infinite impulse response (IIR) architecture. We focus on filter architectures with separate linear dynamic components and static nonlinear components. Such filters can be structured so as to restrict their degrees of computational freedom based on a priori knowledge about the dynamic operation to be emulated. The method is detailed for an FIR architecture which consists of linear FIR filters together with nonlinear generalized single layer subnets. For the IIR case, we extend the methodology to a general nonlinear architecture which uses feedback. For these dynamic architectures, we describe how one can apply optimization techniques which make updates closer to the Newton direction than those of a steepest descent method, such as backpropagation. We detail a novel adaptive modified Gauss-Newton optimization technique, which uses an adaptive learning rate to determine both the magnitude and direction of update steps. For a wide range of adaptive filtering applications, the new training algorithm converges faster and to a smaller value of cost than both steepest-descent methods such as backpropagation-through-time, and standard quasi-Newton methods. We apply the algorithm to modeling the inverse of a nonlinear dynamic tracking system 5, as well as a nonlinear amplifier 6.

  8. Optimizations of Unstructured Aerodynamics Computations for Many-core Architectures

    KAUST Repository

    Al Farhan, Mohammed Ahmed

    2018-04-13

    We investigate several state-of-the-practice shared-memory optimization techniques applied to key routines of an unstructured computational aerodynamics application with irregular memory accesses. We illustrate for the Intel KNL processor, as a representative of the processors in contemporary leading supercomputers, identifying and addressing performance challenges without compromising the floating point numerics of the original code. We employ low and high-level architecture-specific code optimizations involving thread and data-level parallelism. Our approach is based upon a multi-level hierarchical distribution of work and data across both the threads and the SIMD units within every hardware core. On a 64-core KNL chip, we achieve nearly 2.9x speedup of the dominant routines relative to the baseline. These exhibit almost linear strong scalability up to 64 threads, and thereafter some improvement with hyperthreading. At substantially fewer Watts, we achieve up to 1.7x speedup relative to the performance of 72 threads of a 36-core Haswell CPU and roughly equivalent performance to 112 threads of a 56-core Skylake scalable processor. These optimizations are expected to be of value for many other unstructured mesh PDE-based scientific applications as multi and many-core architecture evolves.

  9. A pipeline of associative memory boards for track finding

    CERN Document Server

    Annovi, A; Bardi, A; Carosi, R; Dell'Orso, Mauro; Giannetti, P; Iannaccone, G; Morsani, F; Pietri, M; Varotto, G

    2000-01-01

    We present a pipeline of associative memory boards for track finding, which satisfies the requirements of level two triggers of the next LHC experiments. With respect to previous realizations, the pipelined architecture warrants full scalability of the memory bank, increased bandwidth (by one order of magnitude), increased number of detector layers (by a factor 2). Each associative memory board consists of four smaller boards, each containing 32 programmable associative memory chips, implemented with low-cost commercial FPGA. FPGA programming has been optimized for maximum efficiency in terms of pattern density and PCB design has been optimized in terms of modularity and FPGA chip density. A complete AM board has been successfully tested at 40 MHz, and can contain 6.6 × 10^3 particle trajectories. 7 Refs.

  10. Content-addressable read/write memories for image analysis

    Science.gov (United States)

    Snyder, W. E.; Savage, C. D.

    1982-01-01

    The commonly encountered image analysis problems of region labeling and clustering are found to be cases of search-and-rename problem which can be solved in parallel by a system architecture that is inherently suitable for VLSI implementation. This architecture is a novel form of content-addressable memory (CAM) which provides parallel search and update functions, allowing speed reductions down to constant time per operation. It has been proposed in related investigations by Hall (1981) that, with VLSI, CAM-based structures with enhanced instruction sets for general purpose processing will be feasible.
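
    The search-and-rename operation named in this record can be illustrated with a small software model. The Python sketch below is an assumption-laden illustration, not the authors' design: the class name, method names, and label values are hypothetical, and the sequential loop stands in for what the abstract describes as a parallel search and update across all cells.

```python
class ContentAddressableMemory:
    """Toy model of a CAM: cells are found by content, not by address."""

    def __init__(self, cells):
        self.cells = list(cells)

    def search(self, value):
        """Return the positions of every cell holding `value`."""
        return [i for i, cell in enumerate(self.cells) if cell == value]

    def rename(self, old, new):
        """Overwrite every cell holding `old` with `new`; return the count."""
        matches = self.search(old)
        for i in matches:
            self.cells[i] = new
        return len(matches)


# Region labeling: provisional labels 1 and 2 turn out to be one region,
# so every pixel labeled 2 is renamed to 1 in a single operation.
labels = ContentAddressableMemory([1, 1, 2, 3, 2])
renamed = labels.rename(2, 1)
```

    After the call, the label list reads 1, 1, 1, 3, 1 and two cells were renamed.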

  11. A class Hierarchical, object-oriented approach to virtual memory management

    Science.gov (United States)

    Russo, Vincent F.; Campbell, Roy H.; Johnston, Gary M.

    1989-01-01

    The Choices family of operating systems exploits class hierarchies and object-oriented programming to facilitate the construction of customized operating systems for shared memory and networked multiprocessors. The software is being used in the Tapestry laboratory to study the performance of algorithms, mechanisms, and policies for parallel systems. Described here are the architectural design and class hierarchy of the Choices virtual memory management system. The software and hardware mechanisms and policies of a virtual memory system implement a memory hierarchy that exploits the trade-off between response times and storage capacities. In Choices, the notion of a memory hierarchy is captured by abstract classes. Concrete subclasses of those abstractions implement a virtual address space, segmentation, paging, physical memory management, secondary storage, and remote (that is, networked) storage. Captured in the notion of a memory hierarchy are classes that represent memory objects. These classes provide a storage mechanism that contains encapsulated data and have methods to read or write the memory object. Each of these classes provides specializations to represent the memory hierarchy.

  12. Memory culture and the contemporary city : building sites

    DEFF Research Database (Denmark)

    Staiger, Uta; Steiner, Henriette; Webber, Andrew

    "These essays by leading figures from academia, architecture and the arts consider how cultures of memory are constructed for and in contemporary cities. They take Berlin as a key case of a historically burdened metropolis, but also extend to other global cities: Jerusalem, Buenos Aires, Cape Town...

  13. Fully Pipelined Parallel Architecture for Candidate Block and Pixel-Subsampling-Based Motion Estimation

    Directory of Open Access Journals (Sweden)

    Reeba Korah

    2008-01-01

    This paper presents a low power and high speed architecture for motion estimation with the Candidate Block and Pixel Subsampling (CBPS) Algorithm. A coarse-to-fine search approach is employed to find the motion vector so that the local minima problem is totally eliminated. Pixel subsampling is performed in the selected candidate blocks, which significantly reduces computational cost with low quality degradation. The architecture developed is a fully pipelined parallel design with 9 processing elements. Two methods are used to reduce power consumption: a parallel, pipelined implementation and parallel memory access. For processing 30 CIF frames per second our architecture requires a clock frequency of 4.5 MHz.
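
    Pixel subsampling inside candidate blocks can be illustrated with a sum of absolute differences (SAD) that visits only some pixels. The Python sketch below is a generic illustration, not the CBPS algorithm itself: the 2:1 subsampling in each direction, the SAD cost, the candidate list, and all names are assumptions, and the coarse-to-fine search mentioned in the abstract is not modeled.

```python
def subsampled_sad(current, reference, top, left, dy, dx, size, step=2):
    """SAD between a block and a displaced reference block, using every
    `step`-th pixel in each direction (step=2 visits a quarter of the pixels)."""
    total = 0
    for y in range(0, size, step):
        for x in range(0, size, step):
            total += abs(
                current[top + y][left + x]
                - reference[top + y + dy][left + x + dx]
            )
    return total


def best_motion_vector(current, reference, top, left, size, candidates):
    """Pick the candidate (dy, dx) with the smallest subsampled SAD."""
    return min(
        candidates,
        key=lambda mv: subsampled_sad(
            current, reference, top, left, mv[0], mv[1], size
        ),
    )


# Hypothetical 8x8 frames: the current frame is the reference moved by (1, 2).
reference = [[y * 10 + x for x in range(8)] for y in range(8)]
current = [
    [reference[y + 1][x + 2] if y + 1 < 8 and x + 2 < 8 else 0 for x in range(8)]
    for y in range(8)
]
candidates = [(0, 0), (1, 2), (-1, 0), (0, 1)]
best = best_motion_vector(current, reference, 2, 2, 4, candidates)
```

    For this synthetic frame pair the candidate (1, 2) gives a SAD of zero and is selected.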

  14. Collective Memory Transfers for Multi-Core Chips

    Energy Technology Data Exchange (ETDEWEB)

    Michelogiannakis, George; Williams, Alexander; Shalf, John

    2013-11-13

    Future performance improvements for microprocessors have shifted from clock frequency scaling towards increases in on-chip parallelism. Performance improvements for a wide variety of parallel applications require domain-decomposition of data arrays from a contiguous arrangement in memory to a tiled layout for on-chip L1 data caches and scratchpads. However, DRAM performance suffers under the non-streaming access patterns generated by many independent cores. We propose collective memory scheduling (CMS) that actively takes control of collective memory transfers such that requests arrive in a sequential and predictable fashion to the memory controller. CMS uses the hierarchically tiled arrays formalism to compactly express collective operations, which greatly improves programmability over conventional prefetch or list-DMA approaches. CMS reduces application execution time by up to 32% and DRAM read power by 2.2×, compared to a baseline DMA architecture such as STI Cell.
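
    The access pattern behind this record can be made concrete by listing the memory addresses one core touches when it fetches its tile from a contiguous array. The Python sketch below assumes a row-major layout and uses a hypothetical function name and array sizes; it illustrates the problem the abstract describes and is not an implementation of CMS.

```python
def tile_addresses(rows, cols, tile_rows, tile_cols, tile_r, tile_c):
    """Row-major linear addresses of the elements in tile (tile_r, tile_c)
    of a rows x cols array split into tile_rows x tile_cols tiles."""
    base_r = tile_r * tile_rows
    base_c = tile_c * tile_cols
    if base_r + tile_rows > rows or base_c + tile_cols > cols:
        raise ValueError("tile lies outside the array")
    return [
        (base_r + r) * cols + (base_c + c)
        for r in range(tile_rows)
        for c in range(tile_cols)
    ]


# A 4x8 array split into 2x4 tiles; addresses for the tile in row 0, column 1.
addresses = tile_addresses(4, 8, 2, 4, 0, 1)
```

    The result is 4, 5, 6, 7, 12, 13, 14, 15: each tile row is contiguous, but the jump from 7 to 12 breaks the stream, and many cores issuing such requests independently interleave them further.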

  15. vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design

    OpenAIRE

    Rhu, Minsoo; Gimelshein, Natalia; Clemons, Jason; Zulfiqar, Arslan; Keckler, Stephen W.

    2016-01-01

    The most widely used machine learning frameworks require users to carefully tune their memory usage so that the deep neural network (DNN) fits into the DRAM capacity of a GPU. This restriction hampers a researcher's flexibility to study different machine learning algorithms, forcing them to either use a less desirable network architecture or parallelize the processing across multiple GPUs. We propose a runtime memory manager that virtualizes the memory usage of DNNs such that both GPU and CPU...

  16. Systematic approach in optimizing numerical memory-bound kernels on GPU

    KAUST Repository

    Abdelfattah, Ahmad; Keyes, David E.; Ltaief, Hatem

    2013-01-01

    memory-bound DLA kernels on GPUs, by taking advantage of the underlying device's architecture (e.g., high throughput). This methodology proved to outperform existing state-of-the-art GPU implementations for the symmetric matrix-vector multiplication (SYMV

  17. Description and Simulation of a Fast Packet Switch Architecture for Communication Satellites

    Science.gov (United States)

    Quintana, Jorge A.; Lizanich, Paul J.

    1995-01-01

    The NASA Lewis Research Center has been developing the architecture for a multichannel communications signal processing satellite (MCSPS) as part of a flexible, low-cost meshed-VSAT (very small aperture terminal) network. The MCSPS architecture is based on a multifrequency, time-division-multiple-access (MF-TDMA) uplink and a time-division multiplex (TDM) downlink. There are eight uplink MF-TDMA beams, and eight downlink TDM beams, with eight downlink dwells per beam. The information-switching processor, which decodes, stores, and transmits each packet of user data to the appropriate downlink dwell onboard the satellite, has been fully described by using VHSIC (Very High Speed Integrated-Circuit) Hardware Description Language (VHDL). This VHDL code, which was developed in-house to simulate the information switching processor, showed that the architecture is both feasible and viable. This paper describes a shared-memory-per-beam architecture, its VHDL implementation, and the simulation efforts.

  18. Qualitative similarities in the visual short-term memory of pigeons and people.

    Science.gov (United States)

    Gibson, Brett; Wasserman, Edward; Luck, Steven J

    2011-10-01

    Visual short-term memory plays a key role in guiding behavior, and individual differences in visual short-term memory capacity are strongly predictive of higher cognitive abilities. To provide a broader evolutionary context for understanding this memory system, we directly compared the behavior of pigeons and humans on a change detection task. Although pigeons had a lower storage capacity and a higher lapse rate than humans, both species stored multiple items in short-term memory and conformed to the same basic performance model. Thus, despite their very different evolutionary histories and neural architectures, pigeons and humans have functionally similar visual short-term memory systems, suggesting that the functional properties of visual short-term memory are subject to similar selective pressures across these distant species.

  19. BLACKCOMB2: Hardware-software co-design for non-volatile memory in exascale systems

    Energy Technology Data Exchange (ETDEWEB)

    Mudge, Trevor [Univ. of Michigan, Ann Arbor, MI (United States)

    2017-12-15

    This work was part of a larger project, Blackcomb2, centered at Oak Ridge National Labs (Jeff Vetter PI) to investigate the opportunities for replacing or supplementing DRAM main memory with nonvolatile memory (NVmemory) in Exascale memory systems. The goal was to reduce the energy consumed by future supercomputer memory systems and to improve their resiliency. Building on the accomplishments of the original Blackcomb Project, funded in 2010, the goal for Blackcomb2 was to identify, evaluate, and optimize the most promising emerging memory technologies, architecture hardware and software technologies, which are essential to provide the necessary memory capacity, performance, resilience, and energy efficiency in Exascale systems. Capacity and energy are the key drivers.

  20. A scalable single-chip multi-processor architecture with on-chip RTOS kernel

    NARCIS (Netherlands)

    Theelen, B.D.; Verschueren, A.C.; Reyes Suarez, V.V.; Stevens, M.P.J.; Nunez, A.

    2003-01-01

    Now that system-on-chip technology is emerging, single-chip multi-processors are becoming feasible. A key problem of designing such systems is the complexity of their on-chip interconnects and memory architecture. It is furthermore unclear at what level software should be integrated. An example of a

  1. The Architecture, Dynamics, and Development of Mental Processing: Greek, Chinese, or Universal?

    Science.gov (United States)

    Demetriou, A.; Kui, Z.X.; Spanoudis, G.; Christou, C.; Kyriakides, L.; Platsidou, M.

    2005-01-01

    This study compared Greeks with Chinese, from 8 to 14 years of age, on measures of processing efficiency, working memory, and reasoning. All processes were addressed through three domains of relations: verbal/propositional, quantitative, and visuo/spatial. Structural equations modelling and rating scale analysis showed that the architecture and…

  2. Construction and Application of an AMR Algorithm for Distributed Memory Computers

    OpenAIRE

    Deiterding, Ralf

    2003-01-01

    While the parallelization of blockstructured adaptive mesh refinement techniques is relatively straight-forward on shared memory architectures, appropriate distribution strategies for the emerging generation of distributed memory machines are a topic of on-going research. In this paper, a locality-preserving domain decomposition is proposed that partitions the entire AMR hierarchy from the base level on. It is shown that the approach reduces the communication costs and simplifies the im...

  3. Adaptive Digital Predistortion Schemes to Linearize RF Power Amplifiers with Memory Effects

    Institute of Scientific and Technical Information of China (English)

    ZHANG Peng; WU Si-liang; ZHANG Qin

    2008-01-01

    To compensate for nonlinear distortion introduced by RF power amplifiers (PAs) with memory effects, two correlated models, namely an extended memory polynomial (EMP) model and a memory lookup table (LUT) model, are proposed for predistorter design. Two adaptive digital predistortion (ADPD) schemes with indirect learning architecture are presented. One adopts the EMP model and the recursive least squares (RLS) algorithm, and the other uses the memory LUT model and the least mean squares (LMS) algorithm. Simulation results demonstrate that the EMP-based ADPD yields the best linearization performance in terms of suppressing spectral regrowth. It is also shown that the ADPD based on the memory LUT achieves the best tradeoff between performance and computational complexity.
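
    The memory-polynomial idea behind such predistorters can be sketched as follows. This is a minimal illustration of the generic memory polynomial, not the extended (EMP) model or the adaptation loop of the paper; the function name and the coefficient layout are assumptions made for this example.

```python
def memory_polynomial(x, coeffs):
    """Apply a generic memory polynomial to a complex baseband sequence.

    coeffs[k][q] weights the term x[n-q] * |x[n-q]|**k, where k is the
    nonlinearity index (0 = linear term) and q is the memory tap.
    Samples before the start of x are treated as zero.
    (Hypothetical layout chosen for illustration only.)
    """
    y = []
    for n in range(len(x)):
        acc = 0j
        for k, taps in enumerate(coeffs):
            for q, a in enumerate(taps):
                if n - q >= 0:
                    s = x[n - q]
                    acc += a * s * abs(s) ** k
        y.append(acc)
    return y
```

    In an indirect learning setup, coefficients of this form would typically be fitted (for example by RLS or LMS) so that the polynomial approximates the inverse of the amplifier response.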

  4. SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks

    OpenAIRE

    Wang, Linnan; Ye, Jinmian; Zhao, Yiyang; Wu, Wei; Li, Ang; Song, Shuaiwen Leon; Xu, Zenglin; Kraska, Tim

    2018-01-01

    Going deeper and wider in neural architectures improves the accuracy, while the limited GPU DRAM places an undesired restriction on the network design domain. Deep Learning (DL) practitioners either need to change to less desired network architectures, or nontrivially dissect a network across multiple GPUs. These distract DL practitioners from concentrating on their original machine learning tasks. We present SuperNeurons: a dynamic GPU memory scheduling runtime to enable the network training far be...

  5. Design Example of Useful Memory Latency for Developing a Hazard Preventive Pipeline High-Performance Embedded-Microprocessor

    Directory of Open Access Journals (Sweden)

    Ching-Hwa Cheng

    2013-01-01

    The existence of structural, control, and data hazards presents a major challenge in designing an advanced pipeline/superscalar microprocessor. An efficient memory hierarchy cache-RAM-Disk design greatly enhances the microprocessor's performance. However, there are complex relationships among the memory hierarchy and the functional units in the microprocessor. Most past architectural design simulations focus on the instruction hazard detection/prevention scheme from the viewpoint of function units. This paper emphasizes that additional inboard memory can be well utilized to handle the hazardous conditions. When the instruction meets hazardous issues, the memory latency can be utilized to prevent performance degradation due to the hazard prevention mechanism. By using the proposed technique, a better architectural design can be rapidly validated by an FPGA at the start of the design stage. In this paper, the simulation results prove that our proposed methodology has a better performance and less power consumption compared to the conventional hazard prevention technique.

  6. The working memory networks of the human brain.

    Science.gov (United States)

    Linden, David E J

    2007-06-01

    Working memory and short-term memory are closely related in their cognitive architecture, capacity limitations, and functional neuroanatomy, which only partly overlap with those of long-term memory. The author reviews the functional neuroimaging literature on the commonalities and differences between working memory and short-term memory and the interplay of areas with modality-specific and supramodal representations in the brain networks supporting these fundamental cognitive processes. Sensory stores in the visual, auditory, and somatosensory cortex play a role in short-term memory, but supramodal parietal and frontal areas are often recruited as well. Classical working memory operations such as manipulation, protection against interference, or updating almost certainly require at least some degree of prefrontal support, but many pure maintenance tasks involve these areas as well. Although it seems that activity shifts from more posterior regions during encoding to more anterior regions during delay, some studies reported sustained delay activity in sensory areas as well. This spatiotemporal complexity of the short-term memory/working memory networks is mirrored in the activation patterns that may explain capacity constraints, which, although most prominent in the parietal cortex, seem to be pervasive across sensory and premotor areas. Finally, the author highlights open questions for cognitive neuroscience research of working memory, such as that of the mechanisms for integrating different types of content (binding) or those providing the link to long-term memory.

  7. Building a columnar database on shared main memory-based storage

    OpenAIRE

    Tinnefeld, Christian

    2014-01-01

    In the field of disk-based parallel database management systems exists a great variety of solutions based on a shared-storage or a shared-nothing architecture. In contrast, main memory-based parallel database management systems are dominated solely by the shared-nothing approach as it preserves the in-memory performance advantage by processing data locally on each server. We argue that this unilateral development is going to cease due to the combination of the following three trends: a) Nowad...

  8. Sparse distributed memory

    Science.gov (United States)

    Denning, Peter J.

    1989-01-01

    Sparse distributed memory was proposed by Pentti Kanerva as a realizable architecture that could store large patterns and retrieve them based on partial matches with patterns representing current sensory inputs. This memory exhibits behaviors, both in theory and in experiment, that resemble those previously unapproached by machines - e.g., rapid recognition of faces or odors, discovery of new connections between seemingly unrelated ideas, continuation of a sequence of events when given a cue from the middle, knowing that one doesn't know, or getting stuck with an answer on the tip of one's tongue. These behaviors are now within reach of machines that can be incorporated into the computing systems of robots capable of seeing, talking, and manipulating. Kanerva's theory is a break with the Western rationalistic tradition, allowing a new interpretation of learning and cognition that respects biology and the mysteries of individual human beings.
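
    Retrieval from a partial match can be illustrated with a toy sparse distributed memory. The sketch below follows the usual textbook description (random hard locations, activation within a Hamming radius, per-bit up/down counters); the class name, sizes, radius, and seed are assumptions chosen for this example rather than values from the record above.

```python
import random


class SparseDistributedMemory:
    """Toy Kanerva-style memory over binary vectors (illustrative sizes)."""

    def __init__(self, n_bits=256, n_locations=200, radius=128, seed=0):
        rng = random.Random(seed)
        self.n_bits = n_bits
        self.radius = radius
        # Fixed random "hard location" addresses.
        self.addresses = [[rng.randint(0, 1) for _ in range(n_bits)]
                          for _ in range(n_locations)]
        # One up/down counter per bit for every hard location.
        self.counters = [[0] * n_bits for _ in range(n_locations)]

    def _active(self, address):
        # Indices of hard locations within the Hamming radius of the address.
        return [i for i, a in enumerate(self.addresses)
                if sum(p != q for p, q in zip(a, address)) <= self.radius]

    def write(self, address, data):
        # Increment counters for 1 bits, decrement for 0 bits.
        for i in self._active(address):
            row = self.counters[i]
            for j, bit in enumerate(data):
                row[j] += 1 if bit else -1

    def read(self, address):
        # Sum counters over the active locations and threshold at zero.
        sums = [0] * self.n_bits
        for i in self._active(address):
            for j, c in enumerate(self.counters[i]):
                sums[j] += c
        return [1 if s > 0 else 0 for s in sums]
```

    Writing a pattern at its own address and then reading with a cue that has a few bits flipped should return the stored pattern, because most of the hard locations activated by the cue were also activated during the write.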

  9. Blocking of irrelevant memories by posterior alpha activity boosts memory encoding.

    Science.gov (United States)

    Park, Hyojin; Lee, Dong Soo; Kang, Eunjoo; Kang, Hyejin; Hahm, Jarang; Kim, June Sic; Chung, Chun Kee; Jensen, Ole

    2014-08-01

    In our daily lives, we are confronted with a large amount of information. Because only a small fraction can be encoded in long-term memory, the brain must rely on powerful mechanisms to filter out irrelevant information. To understand the neuronal mechanisms underlying the gating of information into long-term memory, we employed a paradigm where the encoding was directed by a "Remember" or a "No-Remember" cue. We found that posterior alpha activity increased prior to the "No-Remember" stimuli, whereas it decreased prior to the "Remember" stimuli. The sources were localized in the parietal cortex included in the dorsal attention network. Subjects with a larger cue-modulation of the alpha activity had better memory for the to-be-remembered items. Interestingly, alpha activity reflecting successful inhibition following the "No-Remember" cue was observed in the frontal midline structures suggesting preparatory inhibition was mediated by anterior parts of the dorsal attention network. During the presentation of the memory items, there was more gamma activity for the "Remember" compared to the "No-Remember" items in the same regions. Importantly, the anticipatory alpha power during cue predicted the gamma power during item. Our findings suggest that top-down controlled alpha activity reflects attentional inhibition of sensory processing in the dorsal attention network, which then finally gates information to long-term memory. This gating is achieved by inhibiting the processing of visual information reflected by neuronal synchronization in the gamma band. In conclusion, the functional architecture revealed by region-specific changes in the alpha activity reflects attentional modulation which has consequences for long-term memory encoding. Copyright © 2014 Wiley Periodicals, Inc.

  10. Computer architecture evaluation for structural dynamics computations: Project summary

    Science.gov (United States)

    Standley, Hilda M.

    1989-01-01

    The intent of the proposed effort is the examination of the impact of the elements of parallel architectures on the performance realized in a parallel computation. To this end, three major projects are developed: a language for the expression of high level parallelism, a statistical technique for the synthesis of multicomputer interconnection networks based upon performance prediction, and a queueing model for the analysis of shared memory hierarchies.

  11. Coupling Computer Codes for The Analysis of Severe Accident Using A Pseudo Shared Memory Based on MPI

    International Nuclear Information System (INIS)

    Cho, Young Chul; Park, Chang-Hwan; Kim, Dong-Min

    2016-01-01

    Four codes are needed for the analysis of severe accidents: an in-vessel analysis code (CSPACE), an ex-vessel analysis code (SACAP), a corium behavior analysis code (COMPASS), and a fission product behavior analysis code. Coupling them with methodologies similar to those used for RELAP and CONTEMPT or SPACE and CAP is therefore complex. For this reason, an efficient coupling scheme, a so-called pseudo shared memory architecture, was introduced. In this paper, coupling methodologies are compared and the methodology used for the analysis of severe accidents is discussed in detail. The barrier between in-vessel and ex-vessel analysis has been removed by coupling the computer codes with a pseudo shared memory architecture based on MPI. What remains is the proper choice and checking of variables and values for the selected severe accident scenarios, e.g., the TMI accident. Even though it is possible to couple more than two computer codes with the pseudo shared memory architecture, the methodology should be revised to couple parallel codes, especially when they are programmed using MPI.

  12. Coupling Computer Codes for The Analysis of Severe Accident Using A Pseudo Shared Memory Based on MPI

    Energy Technology Data Exchange (ETDEWEB)

    Cho, Young Chul; Park, Chang-Hwan; Kim, Dong-Min [FNC Technology Co., Yongin (Korea, Republic of)

    2016-10-15

    Four codes are needed for the analysis of severe accidents: an in-vessel analysis code (CSPACE), an ex-vessel analysis code (SACAP), a corium behavior analysis code (COMPASS), and a fission product behavior analysis code. Coupling them with methodologies similar to those used for RELAP and CONTEMPT or SPACE and CAP is therefore complex. For this reason, an efficient coupling scheme, a so-called pseudo shared memory architecture, was introduced. In this paper, coupling methodologies are compared and the methodology used for the analysis of severe accidents is discussed in detail. The barrier between in-vessel and ex-vessel analysis has been removed by coupling the computer codes with a pseudo shared memory architecture based on MPI. What remains is the proper choice and checking of variables and values for the selected severe accident scenarios, e.g., the TMI accident. Even though it is possible to couple more than two computer codes with the pseudo shared memory architecture, the methodology should be revised to couple parallel codes, especially when they are programmed using MPI.

  13. Architectural-level power estimation and experimentation

    Science.gov (United States)

    Ye, Wu

    With the emergence of a plethora of embedded and portable applications and ever increasing integration levels, power dissipation of integrated circuits has moved to the forefront as a design constraint. Recent years have also seen a significant trend towards designs starting at the architectural (or RT) level. These demand accurate yet fast RT level power estimation methodologies and tools. This thesis addresses issues and experiments associated with architectural level power estimation. An execution driven, cycle-accurate RT level power simulator, SimplePower, was developed using transition-sensitive energy models. It is based on the architecture of a five-stage pipelined RISC datapath for both 0.35 μm and 0.8 μm technology and can execute the integer subset of the instruction set of SimpleScalar. SimplePower measures the energy consumed in the datapath, memory and on-chip buses. During the development of SimplePower, a partitioning power modeling technique was proposed to model the energy consumed in complex functional units. The accuracy of this technique was validated with HSPICE simulation results for a register file and a shifter. A novel, selectively gated pipeline register optimization technique was proposed to reduce the datapath energy consumption. It uses the decoded control signals to selectively gate the data fields of the pipeline registers. Simulation results show that this technique can reduce the datapath energy consumption by 18-36% for a set of benchmarks. A low-level back-end compiler optimization, register relabeling, was applied to reduce the on-chip instruction cache data bus switch activities. Its impact was evaluated by SimplePower. Results show that it can reduce the energy consumed in the instruction data buses by 3.55-16.90%. A quantitative evaluation was conducted for the impact of six state-of-the-art high-level compilation techniques on both datapath and memory energy consumption. The experimental results provide a valuable insight for

  14. On the impact of communication complexity in the design of parallel numerical algorithms

    Science.gov (United States)

    Gannon, D.; Vanrosendale, J.

    1984-01-01

    This paper describes two models of the cost of data movement in parallel numerical algorithms. One model is a generalization of an approach due to Hockney, and is suitable for shared memory multiprocessors where each processor has vector capabilities. The other model is applicable to highly parallel nonshared memory MIMD systems. In the second model, algorithm performance is characterized in terms of the communication network design. Techniques used in VLSI complexity theory are also brought in, and algorithm independent upper bounds on system performance are derived for several problems that are important to scientific computation.
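
    For orientation, the classic two-parameter Hockney characterization that such cost models start from can be written in a few lines. This is a sketch of the basic form only, t(n) = (n + n_half) / r_inf, and not the generalization developed in the paper; the function and parameter names are assumptions for this example.

```python
def hockney_time(n, r_inf, n_half):
    """Time for an operation on n elements: t = (n + n_half) / r_inf.

    r_inf  -- asymptotic rate (elements per second) for very long vectors
    n_half -- vector length at which half of r_inf is achieved
    """
    return (n + n_half) / r_inf


def hockney_rate(n, r_inf, n_half):
    """Achieved rate n / t(n); it approaches r_inf as n grows."""
    return n / hockney_time(n, r_inf, n_half)
```

    The start-up cost is n_half / r_inf, so a machine with a large n_half needs long vectors (or large messages) before it gets close to its asymptotic rate.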

  15. Continuous-variable quantum computing in optical time-frequency modes using quantum memories.

    Science.gov (United States)

    Humphreys, Peter C; Kolthammer, W Steven; Nunn, Joshua; Barbieri, Marco; Datta, Animesh; Walmsley, Ian A

    2014-09-26

    We develop a scheme for time-frequency encoded continuous-variable cluster-state quantum computing using quantum memories. In particular, we propose a method to produce, manipulate, and measure two-dimensional cluster states in a single spatial mode by exploiting the intrinsic time-frequency selectivity of Raman quantum memories. Time-frequency encoding enables the scheme to be extremely compact, requiring a number of memories that are a linear function of only the number of different frequencies in which the computational state is encoded, independent of its temporal duration. We therefore show that quantum memories can be a powerful component for scalable photonic quantum information processing architectures.

  16. Implementation of Parallel Dynamic Simulation on Shared-Memory vs. Distributed-Memory Environments

    Energy Technology Data Exchange (ETDEWEB)

    Jin, Shuangshuang; Chen, Yousu; Wu, Di; Diao, Ruisheng; Huang, Zhenyu

    2015-12-09

    Power system dynamic simulation computes the system response to a sequence of large disturbances, such as sudden changes in generation or load, or a network short circuit followed by protective branch switching operations. It consists of a large set of differential and algebraic equations, which is computationally intensive and challenging to solve using a single-processor based dynamic simulation solution. High-performance computing (HPC) based parallel computing is a very promising technology to speed up the computation and facilitate the simulation process. This paper presents two different parallel implementations of power grid dynamic simulation using Open Multi-processing (OpenMP) on a shared-memory platform, and Message Passing Interface (MPI) on distributed-memory clusters, respectively. The differences between the parallel simulation algorithms and architectures of the two HPC technologies are illustrated, and their performance in running parallel dynamic simulation is compared and demonstrated.

  17. A performance evaluation of in-memory databases

    Directory of Open Access Journals (Sweden)

    Abdullah Talha Kabakus

    2017-10-01

    The popularity of NoSQL databases has increased due to the need for (1) processing vast amounts of data faster than relational database management systems by taking advantage of a highly scalable architecture, (2) a flexible (schema-free) data structure, and (3) low latency and high performance. Although memory usage is not a major criterion for evaluating the performance of algorithms, these databases serve data from memory, so their memory usage is also measured alongside the time taken to complete each operation, to reveal which one uses memory most efficiently. Currently there exist over 225 NoSQL databases that provide different features and characteristics, so it is necessary to reveal which one provides better performance for different data operations. In this paper, we test widely used in-memory databases to measure their performance in terms of (1) the time taken to complete operations, and (2) how efficiently they use memory during operations. According to the results reported in this paper, no database provides the best performance for all data operations. It is also shown that even though an RDBMS stores its data in memory, its overall performance is worse than that of NoSQL databases.

  18. Modular architectures for quantum networks

    Science.gov (United States)

    Pirker, A.; Wallnöfer, J.; Dür, W.

    2018-05-01

    We consider the problem of generating multipartite entangled states in a quantum network upon request. We follow a top-down approach, where the required entanglement is initially present in the network in form of network states shared between network devices, and then manipulated in such a way that the desired target state is generated. This minimizes generation times, and allows for network structures that are in principle independent of physical links. We present a modular and flexible architecture, where a multi-layer network consists of devices of varying complexity, including quantum network routers, switches and clients, that share certain resource states. We concentrate on the generation of graph states among clients, which are resources for numerous distributed quantum tasks. We assume minimal functionality for clients, i.e. they do not participate in the complex and distributed generation process of the target state. We present architectures based on shared multipartite entangled Greenberger–Horne–Zeilinger states of different size, and fully connected decorated graph states, respectively. We compare the features of these architectures to an approach that is based on bipartite entanglement, and identify advantages of the multipartite approach in terms of memory requirements and complexity of state manipulation. The architectures can handle parallel requests, and are designed in such a way that the network state can be dynamically extended if new clients or devices join the network. For generation or dynamical extension of the network states, we propose a quantum network configuration protocol, where entanglement purification is used to establish high fidelity states. The latter also allows one to show that the entanglement generated among clients is private, i.e. the network is secure.

  19. Building more powerful less expensive supercomputers using Processing-In-Memory (PIM) LDRD final report.

    Energy Technology Data Exchange (ETDEWEB)

    Murphy, Richard C.

    2009-09-01

    This report details the accomplishments of the 'Building More Powerful Less Expensive Supercomputers Using Processing-In-Memory (PIM)' LDRD ('PIM LDRD', number 105809) for FY07-FY09. Latency dominates all levels of supercomputer design. Within a node, increasing memory latency, relative to processor cycle time, limits CPU performance. Between nodes, the same increase in relative latency impacts scalability. Processing-In-Memory (PIM) is an architecture that directly addresses this problem using enhanced chip fabrication technology and machine organization. PIMs combine high-speed logic and dense, low-latency, high-bandwidth DRAM, and lightweight threads that tolerate latency by performing useful work during memory transactions. This work examines the potential of PIM-based architectures to support mission critical Sandia applications and an emerging class of more data intensive informatics applications. This work has resulted in a stronger architecture/implementation collaboration between 1400 and 1700. Additionally, key technology components have impacted vendor roadmaps, and we are in the process of pursuing these new collaborations. This work has the potential to impact future supercomputer design and construction, reducing power and increasing performance. This final report is organized as follows: this summary chapter discusses the impact of the project (Section 1), provides an enumeration of publications and other public discussion of the work (Section 1), and concludes with a discussion of future work and impact from the project (Section 1). The appendix contains reprints of the refereed publications resulting from this work.

  20. VOP memory management in MPEG-4

    Science.gov (United States)

    Vaithianathan, Karthikeyan; Panchanathan, Sethuraman

    2001-03-01

    MPEG-4 is a multimedia standard that requires Video Object Planes (VOPs). Generation of VOPs for any kind of video sequence is still a challenging problem that largely remains unsolved. Nevertheless, if this problem is treated by imposing certain constraints, solutions for specific application domains can be found. MPEG-4 applications in mobile devices are one such domain, where the opposing goals of low power and high throughput must both be met. Efficient memory management plays a major role in reducing the power consumption. Specifically, efficient memory management for VOPs is difficult because the lifetimes of these objects vary and may overlap. Varying object lifetimes require dynamic memory management, where memory fragmentation is a key problem that needs to be addressed. In general, memory management systems address this problem by following a combination of strategy, policy and mechanism. For MPEG4 based mobile devices that lack instruction processors, a hardware based memory management solution is necessary. In MPEG4 based mobile devices that have a RISC processor, using a real-time operating system (RTOS) for this memory management task is not expected to be efficient because the strategies and policies used by the RTOS are often tuned for handling memory segments of smaller sizes compared to object sizes. Hence, a memory management scheme specifically tuned for VOPs is important. In this paper, different strategies, policies and mechanisms for memory management are considered and an efficient combination is proposed for the case of VOP memory management along with a hardware architecture, which can handle the proposed combination.
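
    The fragmentation problem caused by overlapping object lifetimes can be made concrete with a toy allocator. The sketch below uses a plain first-fit policy with hole coalescing as an illustrative stand-in; it is not the combination proposed in the paper, and the class and method names are assumptions for this example.

```python
class FirstFitPool:
    """Toy first-fit allocator over a contiguous pool of `size` units."""

    def __init__(self, size):
        self.free = [(0, size)]   # sorted list of (offset, length) holes
        self.used = {}            # offset -> length of each live block

    def alloc(self, length):
        for i, (off, hole) in enumerate(self.free):
            if hole >= length:
                # Carve the request from the front of the first fitting hole.
                if hole == length:
                    del self.free[i]
                else:
                    self.free[i] = (off + length, hole - length)
                self.used[off] = length
                return off
        return None               # no single hole is large enough

    def release(self, off):
        length = self.used.pop(off)
        self.free.append((off, length))
        self.free.sort()
        # Coalesce holes that touch each other.
        merged = [self.free[0]]
        for o, l in self.free[1:]:
            po, pl = merged[-1]
            if po + pl == o:
                merged[-1] = (po, pl + l)
            else:
                merged.append((o, l))
        self.free = merged

    def free_total(self):
        return sum(l for _, l in self.free)
```

    If four 25-unit objects fill a 100-unit pool and the first and third are released, 50 units are free in total, yet a 40-unit request fails because the free space is split into two separate holes. That is the situation a VOP-specific scheme has to avoid.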

  1. Spatial-sequential working memory in younger and older adults: age predicts backward recall performance within both age groups

    OpenAIRE

    Louise A. Brown

    2016-01-01

    Working memory is vulnerable to age-related decline, but there is debate regarding the age-sensitivity of different forms of spatial-sequential working memory task, depending on their passive or active nature. The functional architecture of spatial working memory was therefore explored in younger (18–40 years) and older (64–85 years) adults, using passive and active recall tasks. Spatial working memory was assessed using a modified version of the Spatial Span subtest of the Wechsler Memory Sc...

  2. Homogeneous and Heterogeneous MPSoC Architectures with Network-On-Chip Connectivity for Low-Power and Real-Time Multimedia Signal Processing

    Directory of Open Access Journals (Sweden)

    Sergio Saponara

    2012-01-01

    Two multiprocessor system-on-chip (MPSoC) architectures are proposed and compared in the paper with reference to audio and video processing applications. One architecture exploits a homogeneous topology; it consists of 8 identical tiles, each made of a 32-bit RISC core enhanced by a 64-bit DSP coprocessor with local memory. The other MPSoC architecture exploits a heterogeneous-tile topology with on-chip distributed memory resources; the tiles act as application specific processors supporting a different class of algorithms. In both architectures, the multiple tiles are interconnected by a network-on-chip (NoC) infrastructure, through network interfaces and routers, which allows parallel operations of the multiple tiles. The functional performances and the implementation complexity of the NoC-based MPSoC architectures are assessed by synthesis results in submicron CMOS technology. Among the large set of supported algorithms, two case studies are considered: the real-time implementation of an H.264/MPEG AVC video codec and of a low-distortion digital audio amplifier. The heterogeneous architecture ensures a higher power efficiency and a smaller area occupation and is more suited for low-power multimedia processing, such as in mobile devices. The homogeneous scheme allows for a higher flexibility and easier system scalability and is more suited for general-purpose DSP tasks in power-supplied devices.

  3. A parallel 3-D discrete wavelet transform architecture using pipelined lifting scheme approach for video coding

    Science.gov (United States)

    Hegde, Ganapathi; Vaya, Pukhraj

    2013-10-01

    This article presents a parallel architecture for 3-D discrete wavelet transform (3-DDWT). The proposed design is based on the 1-D pipelined lifting scheme. The architecture is fully scalable beyond the present coherent Daubechies filter bank (9, 7). This 3-DDWT architecture has advantages such as no group of pictures restriction and reduced memory referencing. It offers low power consumption, low latency and high throughput. The computing technique is based on the concept that lifting scheme minimises the storage requirement. The application specific integrated circuit implementation of the proposed architecture is done by synthesising it using 65 nm Taiwan Semiconductor Manufacturing Company standard cell library. It offers a speed of 486 MHz with a power consumption of 2.56 mW. This architecture is suitable for real-time video compression even with large frame dimensions.
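
    The in-place nature of lifting, which is what keeps storage low, can be seen in a one-level 1-D example. For brevity the sketch uses the integer LeGall 5/3 lifting steps with symmetric boundary extension rather than the Daubechies (9, 7) filter bank named in the abstract; the function names are assumptions for this example, and the input length is assumed to be even.

```python
def lift_53_forward(x):
    """One level of the integer 5/3 lifting transform on an even-length list."""
    half = len(x) // 2
    even, odd = x[0::2], x[1::2]
    # Predict step: detail = odd sample minus the mean of its even neighbours.
    d = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1)
         for i in range(half)]
    # Update step: smooth the even samples using the neighbouring details.
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2)
         for i in range(half)]
    return s, d


def lift_53_inverse(s, d):
    """Undo lift_53_forward by reversing the update and predict steps."""
    half = len(s)
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2)
            for i in range(half)]
    odd = [d[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1)
           for i in range(half)]
    x = [0] * (2 * half)
    x[0::2], x[1::2] = even, odd
    return x
```

    Each step only needs the current sample and its immediate neighbours, and the inverse subtracts exactly what the forward pass added, so reconstruction is lossless for integer input. A 3-D transform applies such a 1-D pass along rows, columns, and time.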

  4. Flash memories economic principles of performance, cost and reliability optimization

    CERN Document Server

    Richter, Detlev

    2014-01-01

    The subject of this book is to introduce a model-based quantitative performance indicator methodology applicable for performance, cost and reliability optimization of non-volatile memories. The complex example of flash memories is used to introduce and apply the methodology. It has been developed by the author based on an industrial 2-bit to 4-bit per cell flash development project. For the first time, design and cost aspects of 3D integration of flash memory are treated in this book. Cell, array, performance and reliability effects of flash memories are introduced and analyzed. Key performance parameters are derived to handle the flash complexity. A performance and array memory model is developed and a set of performance indicators characterizing architecture, cost and durability is defined.   Flash memories are selected to apply the Performance Indicator Methodology to quantify design and technology innovation. A graphical representation based on trend lines is introduced to support a requirement based pr...

  5. Genetic and Environmental Architecture of Changes in Episodic Memory from Middle to Late Middle Age

    Science.gov (United States)

    Panizzon, Matthew S.; Neale, Michael C.; Docherty, Anna R.; Franz, Carol E.; Jacobson, Kristen C.; Toomey, Rosemary; Xian, Hong; Vasilopoulos, Terrie; Rana, Brinda K.; McKenzie, Ruth M.; Lyons, Michael J.; Kremen, William S.

    2015-01-01

    Episodic memory is a complex construct at both the phenotypic and genetic level. Ample evidence supports age-related cognitive stability and change being accounted for by general and domain-specific factors. We hypothesized that general and specific factors would underlie change even within this single cognitive domain. We examined six measures from three episodic memory tests in a narrow age cohort at middle and late middle age. The factor structure was invariant across occasions. At both timepoints two of three test-specific factors (story recall, design recall) had significant genetic influences independent of the general memory factor. Phenotypic stability was moderate to high, and primarily accounted for by genetic influences, except for one test-specific factor (list learning). Mean change over time was nonsignificant for one test-level factor; one declined; one improved. The results highlight the phenotypic and genetic complexity of memory and memory change, and shed light on an understudied period of life. PMID:25938244

  6. Visual Working Memory Is Independent of the Cortical Spacing Between Memoranda.

    Science.gov (United States)

    Harrison, William J; Bays, Paul M

    2018-03-21

    The sensory recruitment hypothesis states that visual short-term memory is maintained in the same visual cortical areas that initially encode a stimulus' features. Although it is well established that the distance between features in visual cortex determines their visibility, a limitation known as crowding, it is unknown whether short-term memory is similarly constrained by the cortical spacing of memory items. Here, we investigated whether the cortical spacing between sequentially presented memoranda affects the fidelity of memory in humans (of both sexes). In a first experiment, we varied cortical spacing by taking advantage of the log-scaling of visual cortex with eccentricity, presenting memoranda in peripheral vision sequentially along either the radial or tangential visual axis with respect to the fovea. In a second experiment, we presented memoranda sequentially either within or beyond the critical spacing of visual crowding, a distance within which visual features cannot be perceptually distinguished due to their nearby cortical representations. In both experiments and across multiple measures, we found strong evidence that the ability to maintain visual features in memory is unaffected by cortical spacing. These results indicate that the neural architecture underpinning working memory has properties inconsistent with the known behavior of sensory neurons in visual cortex. Instead, the dissociation between perceptual and memory representations supports a role of higher cortical areas such as posterior parietal or prefrontal regions or may involve an as yet unspecified mechanism in visual cortex in which stimulus features are bound to their temporal order. SIGNIFICANCE STATEMENT Although much is known about the resolution with which we can remember visual objects, the cortical representation of items held in short-term memory remains contentious. A popular hypothesis suggests that memory of visual features is maintained via the recruitment of the same neural

  7. A High Performance VLSI Computer Architecture For Computer Graphics

    Science.gov (United States)

    Chin, Chi-Yuan; Lin, Wen-Tai

    1988-10-01

    A VLSI computer architecture consisting of multiple processors is presented in this paper to satisfy the demands of modern computer graphics, e.g. high resolution, realistic animation, and real-time display. All processors share a global memory, which is partitioned into multiple banks. Through a crossbar network, data from one memory bank can be broadcast to many processors. Processors are physically interconnected through a hyper-crossbar network (a crossbar-like network). By programming the network, the topology of communication links among processors can be reconfigured to satisfy the specific dataflows of different applications. Each processor consists of a controller, arithmetic operators, local memory, a local crossbar network, and I/O ports to communicate with other processors, memory banks, and a system controller. Operations in each processor are characterized into two modes, i.e. object domain and space domain, to fully utilize the data-independency characteristics of graphics processing. Special graphics features such as 3D-to-2D conversion, shadow generation, texturing, and reflection can be easily handled. With current high density interconnection (HDI) technology, it is feasible to implement a 64-processor system to achieve 2.5 billion operations per second, a performance needed in most advanced graphics applications.

  8. Impact of multiplexed reading scheme on nanocrossbar memristor memory's scalability

    International Nuclear Information System (INIS)

    Zhu Xuan; Tang Yu-Hua; Wu Jun-Jie; Yi Xun; Wu Chun-Qing

    2014-01-01

    The nanocrossbar is a promising memory architecture for integrating memristors into large-scale, high-density memory. However, under the currently widely adopted parallel reading scheme, the scalability of nanocrossbar memory is limited, since the overhead of the reading circuits is proportional to the size of the nanocrossbar component. In this paper, a multiplexed reading scheme is adopted as the foundation of the discussion. Through HSPICE simulation, we reanalyze the scalability of nanocrossbar memristor memory by investigating the impact of various circuit parameters on the output voltage swing as the memory scales to larger sizes. We find that multiplexed reading maintains a sufficient noise margin in large nanocrossbar memristor memories. To improve the scalability of the memory, memristors with nonlinear I-V characteristics and high LRS (low resistive state) resistance should be adopted. (interdisciplinary physics and related areas of science and technology)

  9. Fork-join and data-driven execution models on multi-core architectures: Case study of the FMM

    KAUST Repository

    Amer, Abdelhalim; Maruyama, Naoya; Pericàs, Miquel; Taura, Kenjiro; Yokota, Rio; Matsuoka, Satoshi

    2013-01-01

    Extracting maximum performance of multi-core architectures is a difficult task primarily due to bandwidth limitations of the memory subsystem and its complex hierarchy. In this work, we study the implications of fork-join and data-driven execution

  10. Benchmarking high performance computing architectures with CMS’ skeleton framework

    Science.gov (United States)

    Sexton-Kennedy, E.; Gartung, P.; Jones, C. D.

    2017-10-01

    In 2012 CMS evaluated which underlying concurrency technology would be the best to use for its multi-threaded framework. The available technologies were evaluated on the high throughput computing systems dominating the resources in use at that time. A skeleton framework benchmarking suite that emulates the tasks performed within a CMSSW application was used to select Intel’s Thread Building Block library, based on the measured overheads in both memory and CPU on the different technologies benchmarked. In 2016 CMS will get access to high performance computing resources that use new many core architectures; machines such as Cori Phase 1&2, Theta, and Mira. Because of this we have revived the 2012 benchmark to test its performance and conclusions on these new architectures. This talk will discuss the results of this exercise.

  11. Visual perception and memory systems: from cortex to medial temporal lobe.

    Science.gov (United States)

    Khan, Zafar U; Martín-Montañez, Elisa; Baxter, Mark G

    2011-05-01

    Visual perception and memory are the most important components of vision processing in the brain. It was thought that the perceptual aspect of a visual stimulus occurs in visual cortical areas and that this serves as the substrate for the formation of visual memory in a distinct part of the brain called the medial temporal lobe. However, current evidence indicates that there is no functional separation of areas. Entire visual cortical pathways and connecting medial temporal lobe are important for both perception and visual memory. Though some aspects of this view are debated, evidence from both sides will be explored here. In this review, we will discuss the anatomical and functional architecture of the entire system and the implications of these structures in visual perception and memory.

  12. Volterra series based predistortion for broadband RF power amplifiers with memory effects

    Institute of Scientific and Technical Information of China (English)

    Jin Zhe; Song Zhihuan; He Jiaming

    2008-01-01

    RF power amplifiers (PAs) are usually considered as memoryless devices in most existing predistortion techniques. However, in broadband communication systems, such as WCDMA, the PA memory effects are significant, and memoryless predistortion cannot linearize the PAs effectively. After analyzing the PA memory effects, a novel predistortion method based on the simplified Volterra series is proposed to linearize broadband RF PAs with memory effects. The indirect learning architecture is adopted to design the predistortion scheme and the recursive least squares algorithm with forgetting factor is applied to identify the parameters of the predistorter. Simulation results show that the proposed predistortion method can compensate the nonlinear distortion and memory effects of broadband RF PAs effectively.

  13. Lateralised sleep spindles relate to false memory generation.

    Science.gov (United States)

    Shaw, John J; Monaghan, Padraic

    2017-12-01

    Sleep is known to enhance false memories: After presenting participants with lists of semantically related words, sleeping before recalling these words results in a greater acceptance of unseen "lure" words related in theme to previously seen words. Furthermore, the right hemisphere (RH) seems to be more prone to false memories than the left hemisphere (LH). In the current study, we investigated the sleep architecture associated with these false memory and lateralisation effects in a nap study. Participants viewed lists of related words, then stayed awake or slept for approximately 90 min, and were then tested for recognition of previously seen-old, unseen-new, or unseen-lure words presented either to the LH or RH. Sleep increased acceptance of unseen-lure words as previously seen compared to the wake group, particularly for RH presentations of word lists. RH lateralised stage 2 sleep spindle density relative to the LH correlated with this increase in false memories, suggesting that RH sleep spindles enhanced false memories in the RH. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Hardware architecture design of image restoration based on time-frequency domain computation

    Science.gov (United States)

    Wen, Bo; Zhang, Jing; Jiao, Zipeng

    2013-10-01

    Image restoration algorithms based on time-frequency domain computation (TFDC) are highly mature and widely applied in engineering. To enable high-speed implementation of these algorithms, the TFDC hardware architecture is proposed. Firstly, the main module is designed by analyzing the common processing and numerical calculation. Then, to improve commonality, an iteration control module is planned for iterative algorithms. In addition, to reduce the computational cost and memory requirements, the necessary optimizations are suggested for the time-consuming modules, which include the two-dimensional FFT/IFFT and the complex-number calculation. Finally, the TFDC hardware architecture is adopted for the hardware design of a real-time image restoration system. The results show that the TFDC hardware architecture and its optimizations can be applied to image restoration algorithms based on TFDC, with good algorithm commonality, hardware realizability and high efficiency.

  15. Scalable quantum memory in the ultrastrong coupling regime.

    Science.gov (United States)

    Kyaw, T H; Felicetti, S; Romero, G; Solano, E; Kwek, L-C

    2015-03-02

    Circuit quantum electrodynamics, consisting of superconducting artificial atoms coupled to on-chip resonators, represents a prime candidate to implement the scalable quantum computing architecture because of the presence of good tunability and controllability. Furthermore, recent advances have pushed the technology towards the ultrastrong coupling regime of light-matter interaction, where the qubit-resonator coupling strength reaches a considerable fraction of the resonator frequency. Here, we propose a qubit-resonator system operating in that regime, as a quantum memory device and study the storage and retrieval of quantum information in and from the Z2 parity-protected quantum memory, within experimentally feasible schemes. We are also convinced that our proposal might pave a way to realize a scalable quantum random-access memory due to its fast storage and readout performances.

  16. Implementation of digital equality comparator circuit on memristive memory crossbar array using material implication logic

    Science.gov (United States)

    Haron, Adib; Mahdzair, Fazren; Luqman, Anas; Osman, Nazmie; Junid, Syed Abdul Mutalib Al

    2018-03-01

    One of the most significant constraints of the Von Neumann architecture is the limited bandwidth between memory and processor. The cost of moving data back and forth between memory and processor is considerably higher than that of the computation in the processor itself. This architecture significantly impacts Big Data and data-intensive applications such as DNA analysis comparison, which spend most of their processing time moving data. Recently, the in-memory processing concept was proposed, which is based on the capability to perform logic operations on the physical memory structure using a crossbar topology and non-volatile resistive-switching memristor technology. This paper proposes a scheme to map a digital equality comparator circuit onto a memristive memory crossbar array. The 2-bit, 4-bit, 8-bit, 16-bit, 32-bit, and 64-bit equality comparator circuits are mapped onto the memristive memory crossbar array by using material implication logic in a sequential and a parallel method. The simulation results show that, for the 64-bit word size, the parallel mapping exhibits 2.8× better performance in total execution time than the sequential mapping but has a trade-off in terms of energy consumption and area utilization. Meanwhile, the total crossbar area can be reduced by 1.2× for sequential mapping and 1.5× for parallel mapping, both by using the overlapping technique.
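
The idea of building an equality comparator from material implication can be illustrated in software. The sketch below is not the mapping used in the paper: it only shows, under the assumption that each memristor holds one bit and that the only primitive is IMPLY (p IMP q = NOT p OR q) together with a constant 0 produced by a reset, how NOT, AND, XNOR and a bit-serial equality check can be composed. All function names are hypothetical, and crossbar layout, timing, and energy are not modelled.

```python
def imply(p: int, q: int) -> int:
    """Material implication on single bits: (NOT p) OR q."""
    return (1 - p) | q


def not_(p: int) -> int:
    """NOT p, obtained as p IMP 0 (the 0 stands for a reset memristor)."""
    return imply(p, 0)


def and_(p: int, q: int) -> int:
    """p AND q, obtained as NOT (p IMP (NOT q))."""
    return not_(imply(p, not_(q)))


def xnor(a: int, b: int) -> int:
    """Bit equality: (a IMP b) AND (b IMP a)."""
    return and_(imply(a, b), imply(b, a))


def equal(a: int, b: int, width: int) -> int:
    """Sequential (bit-by-bit) equality of two width-bit words; returns 1 or 0."""
    result = 1
    for i in range(width):
        bit_a = (a >> i) & 1
        bit_b = (b >> i) & 1
        result = and_(result, xnor(bit_a, bit_b))
    return result
```

For example, `equal(0b1011, 0b1011, 4)` evaluates to 1 and `equal(0b1011, 0b1001, 4)` evaluates to 0. A parallel variant would evaluate all per-bit XNORs at once and combine them afterwards, which is the kind of time-versus-area trade-off the abstract describes.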

  17. Data acquisition, storage and control architecture for the SuperNova Acceleration Probe

    International Nuclear Information System (INIS)

    Prosser, Alan; Fermilab; Cardoso, Guilherme; Chramowicz, John; Marriner, John; Rivera, Ryan; Turqueti, Marcos; Fermilab

    2007-01-01

    The SuperNova Acceleration Probe (SNAP) instrument is being designed to collect image and spectroscopic data for the study of dark energy in the universe. In this paper, we describe a distributed architecture for the data acquisition system which interfaces to visible light and infrared imaging detectors. The architecture includes the use of NAND flash memory for the storage of exposures in a file system. Also described is an FPGA-based lossless data compression algorithm with a configurable pre-scaler based on a novel square root data compression method to improve compression performance. The required interactions of the distributed elements with an instrument control unit will be described as well.

  18. VHDL vs. Bluespec system verilog: a case study on a Java embedded architecture

    NARCIS (Netherlands)

    Gruian, Flavius; Westmijze, M.

    2008-01-01

    This paper compares two hardware design flows, based on the classic VHDL on one side and the relatively new Bluespec System Verilog (BSV) on the other side. The comparison is based on a case study of a Java embedded architecture, comprising a Java native processor and a memory management unit. The

  19. A review of emerging non-volatile memory (NVM) technologies and applications

    Science.gov (United States)

    Chen, An

    2016-11-01

    This paper will review emerging non-volatile memory (NVM) technologies, with the focus on phase change memory (PCM), spin-transfer-torque random-access-memory (STTRAM), resistive random-access-memory (RRAM), and ferroelectric field-effect-transistor (FeFET) memory. These promising NVM devices are evaluated in terms of their advantages, challenges, and applications. Their performance is compared based on reported parameters of major industrial test chips. Memory selector devices and cell structures are discussed. Changing market trends toward low power (e.g., mobile, IoT) and data-centric applications create opportunities for emerging NVMs. High-performance and low-cost emerging NVMs may simplify memory hierarchy, introduce non-volatility in logic gates and circuits, reduce system power, and enable novel architectures. Storage-class memory (SCM) based on high-density NVMs could fill the performance and density gap between memory and storage. Some unique characteristics of emerging NVMs can be utilized for novel applications beyond the memory space, e.g., neuromorphic computing, hardware security, etc. In the beyond-CMOS era, emerging NVMs have the potential to fulfill more important functions and enable more efficient, intelligent, and secure computing systems.

  20. Architecture independent environment for developing engineering software on MIMD computers

    Science.gov (United States)

    Valimohamed, Karim A.; Lopez, L. A.

    1990-01-01

    Engineers are constantly faced with solving problems of increasing complexity and detail. Multiple Instruction stream Multiple Data stream (MIMD) computers have been developed to overcome the performance limitations of serial computers. The hardware architectures of MIMD computers vary considerably and are much more sophisticated than serial computers. Developing large scale software for a variety of MIMD computers is difficult and expensive. There is a need to provide tools that facilitate programming these machines. First, the issues that must be considered to develop those tools are examined. The two main areas of concern were architecture independence and data management. Architecture independent software facilitates software portability and improves the longevity and utility of the software product. It provides some form of insurance for the investment of time and effort that goes into developing the software. The management of data is a crucial aspect of solving large engineering problems. It must be considered in light of the new hardware organizations that are available. Second, the functional design and implementation of a software environment that facilitates developing architecture independent software for large engineering applications are described. The topics of discussion include: a description of the model that supports the development of architecture independent software; identifying and exploiting concurrency within the application program; data coherence; engineering data base and memory management.

  1. Multiscale Architectures and Parallel Algorithms for Video Object Tracking

    Science.gov (United States)

    2011-10-01

    larger number of cores using the IBM QS22 Blade for handling higher video processing workloads (but at higher cost per core), low power consumption and...Cell/B.E. Blade processors which have a lot more main memory but also higher power consumption. More detailed performance figures for HD and SD video...Parallelism in Algorithms and Architectures, pages 289–298, 2007. [3] S. Ali and M. Shah. COCOA - Tracking in aerial imagery. In Daniel J. Henry

  2. An efficient spectral crystal plasticity solver for GPU architectures

    Science.gov (United States)

    Malahe, Michael

    2018-03-01

    We present a spectral crystal plasticity (CP) solver for graphics processing unit (GPU) architectures that achieves a tenfold increase in efficiency over prior GPU solvers. The approach makes use of a database containing a spectral decomposition of CP simulations performed using a conventional iterative solver over a parameter space of crystal orientations and applied velocity gradients. The key improvements in efficiency come from reducing global memory transactions, exposing more instruction-level parallelism, reducing integer instructions and performing fast range reductions on trigonometric arguments. The scheme also makes more efficient use of memory than prior work, allowing for larger problems to be solved on a single GPU. We illustrate these improvements with a simulation of 390 million crystal grains on a consumer-grade GPU, which executes at a rate of 2.72 s per strain step.

  3. Contention Modeling for Multithreaded Distributed Shared Memory Machines: The Cray XMT

    Energy Technology Data Exchange (ETDEWEB)

    Secchi, Simone; Tumeo, Antonino; Villa, Oreste

    2011-07-27

    Distributed Shared Memory (DSM) machines are a wide class of multi-processor computing systems where a large virtually-shared address space is mapped on a network of physically distributed memories. High memory latency and network contention are two of the main factors that limit performance scaling of such architectures. Modern high-performance computing DSM systems have evolved toward exploitation of massive hardware multi-threading and fine-grained memory hashing to tolerate irregular latencies, avoid network hot-spots and enable high scaling. In order to model the performance of such large-scale machines, parallel simulation has been proved to be a promising approach to achieve good accuracy in reasonable times. One of the most critical factors in solving the simulation speed-accuracy trade-off is network modeling. The Cray XMT is a massively multi-threaded supercomputing architecture that belongs to the DSM class, since it implements a globally-shared address space abstraction on top of a physically distributed memory substrate. In this paper, we discuss the development of a contention-aware network model intended to be integrated in a full-system XMT simulator. We start by measuring the effects of network contention in a 128-processor XMT machine and then investigate the trade-off that exists between simulation accuracy and speed, by comparing three network models which operate at different levels of accuracy. The comparison and model validation is performed by executing a string-matching algorithm on the full-system simulator and on the XMT, using three datasets that generate noticeably different contention patterns.

  4. Associative Memory Computing Power and Its Simulation

    CERN Document Server

    Volpi, G; The ATLAS collaboration

    2014-01-01

    The associative memory (AM) system is a computing device made of hundreds of AM ASIC chips designed to perform “pattern matching” at very high speed. Since each AM chip stores a database of 130000 pre-calculated patterns and large numbers of chips can be easily assembled together, it is possible to produce huge AM banks. Speed and size of the system are crucial for real-time High Energy Physics applications, such as the ATLAS Fast TracKer (FTK) Processor. Using 80 million channels of the ATLAS tracker, FTK finds tracks within 100 microseconds. The simulation of such a parallelized system is an extremely complex task if executed in commercial computers based on normal CPUs. The algorithm performance is limited, due to the lack of parallelism, and in addition the memory requirement is very large. In fact the AM chip uses a content addressable memory (CAM) architecture. Any data inquiry is broadcast to all memory elements simultaneously, thus data retrieval time is independent of the database size. The gr...

  5. Associative Memory computing power and its simulation

    CERN Document Server

    Ancu, L S; The ATLAS collaboration; Britzger, D; Giannetti, P; Howarth, J W; Luongo, C; Pandini, C; Schmitt, S; Volpi, G

    2014-01-01

    The associative memory (AM) system is a computing device made of hundreds of AM ASIC chips designed to perform “pattern matching” at very high speed. Since each AM chip stores a database of 130000 pre-calculated patterns and large numbers of chips can be easily assembled together, it is possible to produce huge AM banks. Speed and size of the system are crucial for real-time High Energy Physics applications, such as the ATLAS Fast TracKer (FTK) Processor. Using 80 million channels of the ATLAS tracker, FTK finds tracks within 100 microseconds. The simulation of such a parallelized system is an extremely complex task if executed in commercial computers based on normal CPUs. The algorithm performance is limited, due to the lack of parallelism, and in addition the memory requirement is very large. In fact the AM chip uses a content addressable memory (CAM) architecture. Any data inquiry is broadcast to all memory elements simultaneously, thus data retrieval time is independent of the database size. The gr...

  6. Vertex trigger implementation using shared memory technology

    CERN Document Server

    Müller, H

    1998-01-01

    The implementation of a 1st level vertex trigger for LHC-B is particularly difficult due to the high (1 MHz) input data rate. With ca. 350 silicon hits per event, both the R strips and Phi strips of the detectors produce a total of ca. 2 Gbyte/s zero-suppressed data. This note follows up on the ideas to use R-phi coordinates for fast integer linefinding in programmable hardware, as described in LHB note 97-006. For an implementation we propose an FPGA preprocessing stage operating at 1 MHz, with the benefit of substantially reducing the amount of data to be transmitted to the CPUs and liberating a large fraction of CPU time. Interconnected via 4 Gbit/s SCI technology, a shared memory system can be built which allows data driven eventbuilding to be performed with or without preprocessing. A fully data driven architecture between source modules and destination memories provides a highly reliable memory-to-memory transfer mechanism of very low latency. The eventbuilding is performed via associating events at the sourc...

  7. 3D-LIN: A Configurable Low-Latency Interconnect for Multi-Core Clusters with 3D Stacked L1 Memory

    OpenAIRE

    Beanato, Giulia; Loi, Igor; De Micheli, Giovanni; Leblebici, Yusuf; Benini, Luca

    2012-01-01

    Shared L1 memories are of interest for tightly-coupled processor clusters in programmable accelerators as they provide a convenient shared memory abstraction while avoiding cache coherence overheads. The performance of a shared-L1 memory critically depends on the architecture of the low-latency interconnect between processors and memory banks, which needs to provide ultra-fast access to the largest possible L1 working set. The advent of 3D technology provides new opportunities to improve the...

  8. A high-throughput two channel discrete wavelet transform architecture for the JPEG2000 standard

    Science.gov (United States)

    Badakhshannoory, Hossein; Hashemi, Mahmoud R.; Aminlou, Alireza; Fatemi, Omid

    2005-07-01

    The Discrete Wavelet Transform (DWT) is increasingly recognized in image and video compression standards, as indicated by its use in JPEG2000. The lifting scheme algorithm is an alternative DWT implementation that has a lower computational complexity and reduced resource requirements. In the JPEG2000 standard, two lifting scheme based filter banks are introduced: the 5/3 and the 9/7. In this paper a high throughput, two channel DWT architecture for both of the JPEG2000 DWT filters is presented. The proposed pipelined architecture has two separate input channels that process the incoming samples simultaneously with a minimum memory requirement for each channel. The architecture has been implemented in VHDL and synthesized on a Xilinx Virtex2 XCV1000. The proposed architecture applies the DWT to a 2K by 1K image at 33 fps with a 75 MHz clock frequency. This performance is achieved with 70% fewer resources than two independent single channel modules. The high throughput and reduced resource requirements make this architecture a suitable choice for real time applications such as Digital Cinema.
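
As background for the lifting scheme mentioned above, the following is a minimal software sketch of one level of the reversible integer 5/3 lifting transform on a 1-D signal. It says nothing about the paper's pipelined two-channel hardware; the function names are hypothetical, an even-length input is assumed, and boundaries are handled by mirroring the nearest sample (whole-sample symmetric extension).

```python
def lift53_forward(x):
    """One level of the integer 5/3 lifting DWT; returns (lowpass s, highpass d)."""
    n = len(x)
    assert n >= 2 and n % 2 == 0, "sketch assumes an even-length signal"
    even, odd = x[0::2], x[1::2]
    half = n // 2
    # Predict step: each odd sample minus the floored mean of its even neighbours.
    d = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1)
         for i in range(half)]
    # Update step: each even sample plus a rounded quarter of neighbouring details.
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2)
         for i in range(half)]
    return s, d


def lift53_inverse(s, d):
    """Undo lift53_forward by running the lifting steps in reverse order."""
    half = len(s)
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2)
            for i in range(half)]
    odd = [d[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1)
           for i in range(half)]
    x = [0] * (2 * half)
    x[0::2] = even
    x[1::2] = odd
    return x
```

Because each lifting step only adds or subtracts a function of the other half of the samples, the inverse recovers the input exactly in integer arithmetic, and the transform needs only a few samples of state at a time, which is one reason lifting is attractive for low-memory hardware.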

  9. 3D-SoftChip: A Novel Architecture for Next-Generation Adaptive Computing Systems

    Directory of Open Access Journals (Sweden)

    Lee Mike Myung-Ok

    2006-01-01

    This paper introduces a novel architecture for next-generation adaptive computing systems, which we term 3D-SoftChip. The 3D-SoftChip is a 3-dimensional (3D) vertically integrated adaptive computing system combining state-of-the-art processing and 3D interconnection technology. It comprises the vertical integration of two chips (a configurable array processor and an intelligent configurable switch) through an indium bump interconnection array (IBIA). The configurable array processor (CAP) is an array of heterogeneous processing elements (PEs), while the intelligent configurable switch (ICS) comprises a switch block, a 32-bit dedicated RISC processor for control, on-chip program/data memory, and a data frame buffer, along with a direct memory access (DMA) controller. This paper introduces the novel 3D-SoftChip architecture for real-time communication and multimedia signal processing as a next-generation computing system. The paper further describes the advanced HW/SW codesign and verification methodology, including high-level system modeling of the 3D-SoftChip using SystemC, being used to determine the optimum hardware specification in the early design stage.

  10. Setting up the Nelson Mandela centre of memory and commemoration

    African Journals Online (AJOL)

    This article aims to outline the process of the audits, highlight the issues involved in the development of the web archive, as well as provide details on the development of the database architecture. Keywords: centre of memory and commemoration, Constitution Hill Heritage Site, Nelson Mandela Foundation, South Africa

  11. Memory-based attention capture when multiple items are maintained in visual working memory.

    Science.gov (United States)

    Hollingworth, Andrew; Beck, Valerie M

    2016-07-01

    Efficient visual search requires that attention is guided strategically to relevant objects, and most theories of visual search implement this function by means of a target template maintained in visual working memory (VWM). However, there is currently debate over the architecture of VWM-based attentional guidance. We contrasted a single-item-template hypothesis with a multiple-item-template hypothesis, which differ in their claims about structural limits on the interaction between VWM representations and perceptual selection. Recent evidence from van Moorselaar, Theeuwes, and Olivers (2014) indicated that memory-based capture during search, an index of VWM guidance, is not observed when memory set size is increased beyond a single item, suggesting that multiple items in VWM do not guide attention. In the present study, we maximized the overlap between multiple colors held in VWM and the colors of distractors in a search array. Reliable capture was observed when 2 colors were held in VWM and both colors were present as distractors, using both the original van Moorselaar et al. singleton-shape search task and a search task that required focal attention to array elements (gap location in outline square stimuli). In the latter task, memory-based capture was consistent with the simultaneous guidance of attention by multiple VWM representations. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  12. Compiling for Application Specific Computational Acceleration in Reconfigurable Architectures Final Report CRADA No. TSB-2033-01

    Energy Technology Data Exchange (ETDEWEB)

    De Supinski, B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Caliga, D. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2017-09-28

    The primary objective of this project was to develop memory optimization technology to efficiently deliver data to, and distribute data within, the SRC-6's Field Programmable Gate Array ("FPGA")-based Multi-Adaptive Processors (MAPs). The hardware/software approach was to explore efficient MAP configurations and generate the compiler technology to exploit those configurations. This memory accessing technology represents an important step towards making reconfigurable symmetric multi-processor (SMP) architectures that will be a cost-effective solution for large-scale scientific computing.

  13. Short-term Memory of Deep RNN

    OpenAIRE

    Gallicchio, Claudio

    2018-01-01

    The extension of deep learning towards temporal data processing is gaining an increasing research interest. In this paper we investigate the properties of state dynamics developed in successive levels of deep recurrent neural networks (RNNs) in terms of short-term memory abilities. Our results reveal interesting insights that shed light on the nature of layering as a factor of RNN design. Noticeably, higher layers in a hierarchically organized RNN architecture results to be inherently biased ...

  14. Architectural slicing

    DEFF Research Database (Denmark)

    Christensen, Henrik Bærbak; Hansen, Klaus Marius

    2013-01-01

    Architectural prototyping is a widely used practice, concerned with taking architectural decisions through experiments with lightweight implementations. However, many architectural decisions are only taken when systems are already (partially) implemented. This is problematic in the context...... of architectural prototyping since experiments with full systems are complex and expensive and thus architectural learning is hindered. In this paper, we propose a novel technique for harvesting architectural prototypes from existing systems, "architectural slicing", based on dynamic program slicing. Given...... a system and a slicing criterion, architectural slicing produces an architectural prototype that contains the elements in the architecture that are dependent on the elements in the slicing criterion. Furthermore, we present an initial design and implementation of an architectural slicer for Java....

  15. Scaling Non-Regular Shared-Memory Codes by Reusing Custom Loop Schedules

    Directory of Open Access Journals (Sweden)

    Dimitrios S. Nikolopoulos

    2003-01-01

    In this paper we explore the idea of customizing and reusing loop schedules to improve the scalability of non-regular numerical codes in shared-memory architectures with non-uniform memory access latency. The main objective is to implicitly set up affinity links between threads and data, by devising loop schedules that achieve balanced work distribution within irregular data spaces and reusing them as much as possible along the execution of the program for better memory access locality. This transformation provides a great deal of flexibility in optimizing locality, without compromising the simplicity of the shared-memory programming paradigm. In particular, the programmer does not need to explicitly distribute data between processors. The paper presents practical examples from real applications and experiments showing the efficiency of the approach.

  16. Robust quantum network architectures and topologies for entanglement distribution

    Science.gov (United States)

    Das, Siddhartha; Khatri, Sumeet; Dowling, Jonathan P.

    2018-01-01

    Entanglement distribution is a prerequisite for several important quantum information processing and computing tasks, such as quantum teleportation, quantum key distribution, and distributed quantum computing. In this work, we focus on two-dimensional quantum networks based on optical quantum technologies using dual-rail photonic qubits for the building of a fail-safe quantum internet. We lay out a quantum network architecture for entanglement distribution between distant parties using a Bravais lattice topology, with the technological constraint that quantum repeaters equipped with quantum memories are not easily accessible. We provide a robust protocol for simultaneous entanglement distribution between two distant groups of parties on this network. We also discuss a memory-based quantum network architecture that can be implemented on networks with an arbitrary topology. We examine networks with bow-tie lattice and Archimedean lattice topologies and use percolation theory to quantify the robustness of the networks. In particular, we provide figures of merit on the loss parameter of the optical medium that depend only on the topology of the network and quantify the robustness of the network against intermittent photon loss and intermittent failure of nodes. These figures of merit can be used to compare the robustness of different network topologies in order to determine the best topology in a given real-world scenario, which is critical in the realization of the quantum internet.

  17. NVL-C: Static Analysis Techniques for Efficient, Correct Programming of Non-Volatile Main Memory Systems

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Seyong [ORNL]; Vetter, Jeffrey S [ORNL]

    2016-01-01

    Computer architecture experts expect that non-volatile memory (NVM) hierarchies will play a more significant role in future systems including mobile, enterprise, and HPC architectures. With this expectation in mind, we present NVL-C: a novel programming system that facilitates the efficient and correct programming of NVM main memory systems. The NVL-C programming abstraction extends C with a small set of intuitive language features that target NVM main memory, and can be combined directly with traditional C memory model features for DRAM. We have designed these new features to enable compiler analyses and run-time checks that can improve performance and guard against a number of subtle programming errors, which, when left uncorrected, can corrupt NVM-stored data. Moreover, to enable recovery of data across application or system failures, these NVL-C features include a flexible directive for specifying NVM transactions. So that our implementation might be extended to other compiler front ends and languages, the majority of our compiler analyses are implemented in an extended version of LLVM's intermediate representation (LLVM IR). We evaluate NVL-C on a number of applications to show its flexibility, performance, and correctness.

  18. SUSTAINABLE ARCHITECTURE : WHAT ARCHITECTURE STUDENTS THINK

    OpenAIRE

    SATWIKO, PRASASTO

    2013-01-01

    Sustainable architecture has become a hot issue lately as the impacts of climate change become more intense. Architecture schools have responded by integrating knowledge of sustainable design into their curricula. However, in real life, new buildings keep appearing with designs that completely ignore sustainable principles. This paper discusses the results of two national competitions on sustainable architecture targeted at architecture students (conducted in 2012 and 2013). The results a...

  19. Software Alchemy: Turning Complex Statistical Computations into Embarrassingly-Parallel Ones

    Directory of Open Access Journals (Sweden)

    Norman Matloff

    2016-07-01

    The growth in the use of computationally intensive statistical procedures, especially with big data, has necessitated the use of parallel computation on diverse platforms such as multicore, GPUs, clusters and clouds. However, slowdown due to interprocess communication costs typically limits such methods to "embarrassingly parallel" (EP) algorithms, especially on non-shared memory platforms. This paper develops a broadly applicable method for converting many non-EP algorithms into statistically equivalent EP ones. The method is shown to yield excellent levels of speedup for a variety of statistical computations. It also overcomes certain problems of memory limitations.
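
    The abstract does not spell out the conversion method. One approach that fits its description is chunk averaging: split the data into chunks, run the estimator on each chunk independently, and average the per-chunk results. The sketch below assumes that reading; the function names are hypothetical, and the chunks are processed sequentially here only to keep the example short.

```python
import statistics
from typing import Callable, List, Sequence


def split_into_chunks(data: Sequence[float], n_chunks: int) -> List[Sequence[float]]:
    """Split data into n_chunks contiguous pieces of near-equal size."""
    size, extra = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional element each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def chunk_average(
    data: Sequence[float],
    estimator: Callable[[Sequence[float]], float],
    n_chunks: int,
) -> float:
    """Apply the estimator to each chunk and average the results.

    Each estimator call touches only its own chunk, so the calls need no
    communication and could be sent to separate workers. Assumes
    n_chunks <= len(data) so that no chunk is empty.
    """
    per_chunk = [estimator(chunk) for chunk in split_into_chunks(data, n_chunks)]
    return statistics.fmean(per_chunk)


if __name__ == "__main__":
    sample = list(range(1, 13))
    print(chunk_average(sample, statistics.median, 3))
```

    The averaged value is generally an approximation of the full-data estimate rather than an identical number, which is consistent with the abstract's wording "statistically equivalent".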

  20. Reprogrammable logic in memristive crossbar for in-memory computing

    Science.gov (United States)

    Cheng, Long; Zhang, Mei-Yun; Li, Yi; Zhou, Ya-Xiong; Wang, Zhuo-Rui; Hu, Si-Yu; Long, Shi-Bing; Liu, Ming; Miao, Xiang-Shui

    2017-12-01

    Memristive stateful logic has emerged as a promising next-generation in-memory computing paradigm to address escalating computing-performance pressures in traditional von Neumann architecture. Here, we present a nonvolatile reprogrammable logic method that can process data between different rows and columns in a memristive crossbar array based on material implication (IMP) logic. Arbitrary Boolean logic can be executed with a reprogrammable cell containing four memristors in a crossbar array. In the fabricated Ti/HfO2/W memristive array, some fundamental functions, such as universal NAND logic and data transfer, were experimentally implemented. Moreover, using eight memristors in a 2 × 4 array, a one-bit full adder was theoretically designed and verified by simulation to demonstrate the feasibility of our method for complex computing tasks. In addition, some critical logic-related performance issues were further discussed, such as the flexibility of data processing, the cascading problem and the bit error rate. Such a method could be a step forward in developing IMP-based memristive nonvolatile logic for large-scale in-memory computing architecture.
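
    To make the link between IMP logic and universal NAND concrete, the sketch below models the standard two-step construction on plain bits: a work cell is reset to 0, then overwritten by two successive IMP operations. This is the textbook Boolean identity, not the four-memristor cell of the paper, and the function names are hypothetical.

```python
def imp(p: int, q: int) -> int:
    """Material implication on bits: p IMP q = (NOT p) OR q."""
    return (1 - p) | q


def nand_via_imp(p: int, q: int) -> int:
    """Compute NAND(p, q) with one reset and two IMP steps.

    The work cell s stands in for a memristor whose state is
    overwritten by the result of each IMP operation.
    """
    s = 0          # reset the work cell
    s = imp(p, s)  # s = NOT p
    s = imp(q, s)  # s = (NOT q) OR (NOT p) = NAND(p, q)
    return s


if __name__ == "__main__":
    for p in (0, 1):
        for q in (0, 1):
            print(p, q, nand_via_imp(p, q))
```

    Because NAND is functionally complete, any Boolean function can in principle be composed from repeated resets and IMP steps of this kind.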

  1. Sleep-dependent memory consolidation in patients with sleep disorders.

    Science.gov (United States)

    Cipolli, Carlo; Mazzetti, Michela; Plazzi, Giuseppe

    2013-04-01

    Sleep can improve the off-line memory consolidation of new items of declarative and non-declarative information in healthy subjects, whereas acute sleep loss, as well as sleep restriction and fragmentation, impair consolidation. This suggests that, by modifying the amount and/or architecture of sleep, chronic sleep disorders may also lead to a lower gain in off-line consolidation, which in turn may be responsible for the varying levels of impaired performance at memory tasks usually observed in sleep-disordered patients. The experimental studies conducted to date have shown specific impairments of sleep-dependent consolidation overall for verbal and visual declarative information in patients with primary insomnia, for verbal declarative information in patients with obstructive sleep apnoeas, and for visual procedural skills in patients with narcolepsy-cataplexy. These findings corroborate the hypothesis that impaired consolidation is a consequence of the chronically altered organization of sleep. Moreover, they raise several novel questions as to: a) the reversibility of consolidation impairment in the case of effective treatment, b) the possible negative influence of altered prior sleep also on the encoding of new information, and c) the relationships between altered sleep and memory impairment in patients with other (medical, psychiatric or neurological) diseases associated with quantitative and/or qualitative changes of sleep architecture. Copyright © 2012 Elsevier Ltd. All rights reserved.

  2. Architecture

    OpenAIRE

    Clear, Nic

    2014-01-01

    When discussing science fiction’s relationship with architecture, the usual practice is to look at the architecture “in” science fiction—in particular, the architecture in SF films (see Kuhn 75-143) since the spaces of literary SF present obvious difficulties as they have to be imagined. In this essay, that relationship will be reversed: I will instead discuss science fiction “in” architecture, mapping out a number of architectural movements and projects that can be viewed explicitly as scien...

  3. A portable implementation of ARPACK for distributed memory parallel architectures

    Energy Technology Data Exchange (ETDEWEB)

    Maschhoff, K.J.; Sorensen, D.C.

    1996-12-31

    ARPACK is a package of Fortran 77 subroutines which implement the Implicitly Restarted Arnoldi Method used for solving large sparse eigenvalue problems. A parallel implementation of ARPACK is presented which is portable across a wide range of distributed memory platforms and requires minimal changes to the serial code. The communication layers used for message passing are the Basic Linear Algebra Communication Subprograms (BLACS) developed for the ScaLAPACK project and Message Passing Interface (MPI).

  4. Strategies for memory-based decision making : Modeling behavioral and neural signatures within a cognitive architecture

    NARCIS (Netherlands)

    Fechner, Hanna B; Pachur, Thorsten; Schooler, Lael J; Mehlhorn, Katja; Battal, Ceren; Volz, Kirsten G; Borst, Jelmer P.

    2016-01-01

    How do people use memories to make inferences about real-world objects? We tested three strategies based on predicted patterns of response times and blood-oxygen-level-dependent (BOLD) responses: one strategy that relies solely on recognition memory, a second that retrieves additional knowledge, and

  5. Transformation-based exploration of data parallel architecture for customizable hardware : a JPEG encoder case study

    NARCIS (Netherlands)

    Corvino, R.; Diken, E.; Gamatié, A.; Jozwiak, L.

    2012-01-01

    In this paper, we present a method for the design of MPSoCs for complex data-intensive applications. This method aims at a blended exploration of the communication, the memory system architecture and the computation resource parallelism. The proposed method is exemplified on a JPEG Encoder case study

  6. Shape-morphing composites with designed micro-architectures.

    Science.gov (United States)

    Rodriguez, Jennifer N; Zhu, Cheng; Duoss, Eric B; Wilson, Thomas S; Spadaccini, Christopher M; Lewicki, James P

    2016-06-15

    Shape memory polymers (SMPs) are attractive materials due to their unique mechanical properties, including high deformation capacity and shape recovery. SMPs are easier to process, lightweight, and inexpensive compared to their metallic counterparts, shape memory alloys. However, SMPs are limited to relatively small form factors due to their low recovery stresses. Lightweight, micro-architected composite SMPs may overcome these size limitations and offer the ability to combine functional properties (e.g., electrical conductivity) with shape memory behavior. Fabrication of 3D SMP thermoset structures via traditional manufacturing methods is challenging, especially for designs that are composed of multiple materials within porous microarchitectures designed for specific shape change strategies, e.g. sequential shape recovery. We report thermoset SMP composite inks containing some materials from renewable resources that can be 3D printed into complex, multi-material architectures that exhibit programmable shape changes with temperature and time. Through addition of fiber-based fillers, we demonstrate printing of electrically conductive SMPs where multiple shape states may induce functional changes in a device and that shape changes can be actuated via heating of printed composites. The ability of SMPs to recover their original shapes will be advantageous for a broad range of applications, including medical, aerospace, and robotic devices.

  7. The relationships between memory systems and sleep stages.

    Science.gov (United States)

    Rauchs, Géraldine; Desgranges, Béatrice; Foret, Jean; Eustache, Francis

    2005-06-01

    Sleep function remains elusive despite our rapidly increasing comprehension of the processes generating and maintaining the different sleep stages. Several lines of evidence support the hypothesis that sleep is involved in the off-line reprocessing of recently-acquired memories. In this review, we summarize the main results obtained in the field of sleep and memory consolidation in both animals and humans, and try to connect sleep stages with the different memory systems. To this end, we have collated data obtained using several methodological approaches, including electrophysiological recordings of neuronal ensembles, post-training modifications of sleep architecture, sleep deprivation and functional neuroimaging studies. Broadly speaking, all the various studies emphasize the fact that the four long-term memory systems (procedural memory, perceptual representation system, semantic and episodic memory, according to Tulving's SPI model; Tulving, 1995) benefit either from non-rapid eye movement (NREM) (not just SWS) or rapid eye movement (REM) sleep, or from both sleep stages. Tulving's classification of memory systems appears more pertinent than the declarative/non-declarative dichotomy when it comes to understanding the role of sleep in memory. Indeed, this model allows us to resolve several contradictions, notably the fact that episodic and semantic memory (the two memory systems encompassed in declarative memory) appear to rely on different sleep stages. Likewise, this model provides an explanation for why the acquisition of various types of skills (perceptual-motor, sensory-perceptual and cognitive skills) and priming effects, subserved by different brain structures but all designated by the generic term of implicit or non-declarative memory, may not benefit from the same sleep stages.

  8. Scaling to Nanotechnology Limits with the PIMS Computer Architecture and a new Scaling Rule

    Energy Technology Data Exchange (ETDEWEB)

    Debenedictis, Erik P. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2015-02-01

    We describe a new approach to computing that moves towards the limits of nanotechnology using a newly formulated scaling rule. This is in contrast to the current computer industry scaling away from von Neumann's original computer at the rate of Moore's Law. We extend Moore's Law to 3D, which leads generally to architectures that integrate logic and memory. To keep power dissipation constant through a 2D surface of the 3D structure requires using adiabatic principles. We call our newly proposed architecture Processor In Memory and Storage (PIMS). We propose a new computational model that integrates processing and memory into "tiles" that comprise logic, memory/storage, and communications functions. Since the programming model will be relatively stable as a system scales, programs represented by tiles could be executed in a PIMS system built with today's technology or could become the "schematic diagram" for implementation in an ultimate 3D nanotechnology of the future. We build a systems software approach that offers advantages over and above the technological and architectural advantages. First, the algorithms may be more efficient in the conventional sense of having fewer steps. Second, the algorithms may run with higher power efficiency per operation by being a better match for the adiabatic scaling rule. The performance analysis based on demonstrated ideas in physical science suggests 80,000x improvement in cost per operation for the (arguably) general purpose function of emulating neurons in Deep Learning.

  9. Software architecture for time-constrained machine vision applications

    Science.gov (United States)

    Usamentiaga, Rubén; Molleda, Julio; García, Daniel F.; Bulnes, Francisco G.

    2013-01-01

    Real-time image and video processing applications require skilled architects, and recent trends in the hardware platform make the design and implementation of these applications increasingly complex. Many frameworks and libraries have been proposed or commercialized to simplify the design and tuning of real-time image processing applications. However, they tend to lack flexibility, because they are normally oriented toward particular types of applications, or they impose specific data processing models such as the pipeline. Other issues include large memory footprints, difficulty for reuse, and inefficient execution on multicore processors. We present a novel software architecture for time-constrained machine vision applications that addresses these issues. The architecture is divided into three layers. The platform abstraction layer provides a high-level application programming interface for the rest of the architecture. The messaging layer provides a message-passing interface based on a dynamic publish/subscribe pattern. A topic-based filtering in which messages are published to topics is used to route the messages from the publishers to the subscribers interested in a particular type of message. The application layer provides a repository for reusable application modules designed for machine vision applications. These modules, which include acquisition, visualization, communication, user interface, and data processing, take advantage of the power of well-known libraries such as OpenCV, Intel IPP, or CUDA. Finally, the proposed architecture is applied to a real machine vision application: a jam detector for steel pickling lines.
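
    The topic-based publish/subscribe routing described for the messaging layer can be illustrated with a minimal in-process sketch. The class name, method names and topic strings below are hypothetical and are not taken from the paper; the sketch only shows how publishing to a topic routes a message to the subscribers registered for it.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

Handler = Callable[[Any], None]


class MessageBus:
    """Hypothetical in-process bus with topic-based filtering."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler for messages published to a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Any) -> int:
        """Deliver the message to every handler on the topic.

        Returns the number of handlers that received it; a topic with
        no subscribers simply yields 0.
        """
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(message)
        return len(handlers)


if __name__ == "__main__":
    bus = MessageBus()
    frames: List[Any] = []
    bus.subscribe("frame.acquired", frames.append)
    bus.publish("frame.acquired", {"id": 1})
    bus.publish("jam.detected", {"line": 3})  # no subscriber for this topic
    print(frames)
```

    Publishers and subscribers share only topic names, so modules can be added or removed without changing one another.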

  10. Modeling Architectural Patterns Using Architectural Primitives

    NARCIS (Netherlands)

    Zdun, Uwe; Avgeriou, Paris

    2005-01-01

    Architectural patterns are a key point in architectural documentation. Regrettably, there is poor support for modeling architectural patterns, because the pattern elements are not directly matched by elements in modeling languages, and, at the same time, patterns support an inherent variability that

  11. Ambos lados: deconstructing identity, performing memory

    OpenAIRE

    Marín Ezpeleta, Rakel; Daniel, Henry

    2015-01-01

    Project Barca’s central research question – how can embodied personal and collective memories be shaped into new architectures of identity and belonging in the form of innovative performance works that speak to wider sections of society? – has so far generated a number of mixed-media performance outcomes in its two research locations: Barcelona, Catalonia, and Vancouver, Canada. Some major works are the video essay Encounters 3; Barca-El otro lado – a performance work that utilizes dancers, ac...

  12. Adapting Memory Hierarchies for Emerging Datacenter Interconnects

    Institute of Scientific and Technical Information of China (English)

    江涛; 董建波; 侯锐; 柴琳; 张立新; 孙凝晖; 田斌

    2015-01-01

    Efficient resource utilization requires that emerging datacenter interconnects support both high performance communication and efficient remote resource sharing. These goals require that the network be more tightly coupled with the CPU chips. Designing a new interconnection technology thus requires considering not only the interconnection itself, but also the design of the processors that will rely on it. In this paper, we study memory hierarchy implications for the design of high-speed datacenter interconnects, particularly as they affect remote memory access, and we use PCIe as the vehicle for our investigations. To that end, we build three complementary platforms: a PCIe-interconnected prototype server with which we measure and analyze current bottlenecks; a software simulator that lets us model microarchitectural and cache hierarchy changes; and an FPGA prototype system with a streamlined, switchless, customized protocol, Thunder, with which we study hardware optimizations outside the processor. We highlight several architectural modifications to better support remote memory access and communication, and quantify their impact and limitations.

  13. High-Efficient Parallel CAVLC Encoders on Heterogeneous Multicore Architectures

    Directory of Open Access Journals (Sweden)

    H. Y. Su

    2012-04-01

    This article presents two highly efficient parallel realizations of context-based adaptive variable length coding (CAVLC) on heterogeneous multicore processors. By optimizing the architecture of the CAVLC encoder, three kinds of dependences are eliminated or weakened: the context-based data dependence, the memory access dependence and the control dependence. The CAVLC pipeline is divided into three stages (two scans, coding, and lag packing) and is implemented on two typical heterogeneous multicore architectures. One is a block-based SIMD parallel CAVLC encoder on the multicore stream processor STORM. The other is a component-oriented SIMT parallel encoder on a massively parallel GPU architecture. Both exploit rich data-level parallelism. Experimental results show that, compared with the CPU version, a speedup of more than 70 times can be obtained on STORM and of more than 50 times on the GPU. The encoder on STORM achieves real-time processing of 1080p at 30 fps, and the GPU-based version satisfies the requirements of 720p real-time encoding. The throughput of the presented CAVLC encoders is more than 10 times higher than that of published software encoders on DSP and multicore platforms.

  14. Collecting memories of the city through the conservation of heritage building

    Science.gov (United States)

    Nurliani Lukito, Yulia; Nurul Rizky, Amalia

    2018-03-01

    Heritage buildings play a role for the city and society that is associated with emotional, cultural, and use values. Those values are part of collective memory and create the identity of the city. Some heritage buildings are vulnerable to modernization, and even when the government conserves those buildings, some of their important values are lost. This paper discusses a colonial building in Jakarta that has been converted to different functions. The case study is Cut Meutia Mosque in Menteng, designed by the Dutch architect PAJ Moojen during the late Dutch colonial era. The building was initiated in 1922 as N.V. Bouwploeg, an architectural firm that developed the nearby residential area of New Gondangdia. This area was developed according to modern Garden City principles; the Bouwploeg was known as the gate to the Menteng area, and the architecture of the building was considered very modern and unique at that time, illustrating the importance of the building for the city. After Indonesia’s independence, the government converted the building to different functions such as an office and a mosque. Although the function of the building has changed, the building still triggers a collective memory of the new area, which should not be ignored in the effort to conserve the building. Through historical and field research, the paper aims to discuss some changes and lost values of the building resulting from the conservation of colonial heritage, especially with regard to collective memory. Hopefully, learning from the conservation of building heritage and city collective memory may support the idea of livable memory of heritage building and even a

  15. Respecting Relations: Memory Access and Antecedent Retrieval in Incremental Sentence Processing

    Science.gov (United States)

    Kush, Dave W.

    2013-01-01

    This dissertation uses the processing of anaphoric relations to probe how linguistic information is encoded in and retrieved from memory during real-time sentence comprehension. More specifically, the dissertation attempts to resolve a tension between the demands of a linguistic processor implemented in a general-purpose cognitive architecture and…

  16. VIPRAM_L1CMS: a 2-Tier 3D Architecture for Pattern Recognition for Track Finding

    Energy Technology Data Exchange (ETDEWEB)

    Hoff, J. R. [Fermilab]; Joshi, S. [Northwestern U.]; Liu [Fermilab]; Olsen, J. [Fermilab]; Shenai, A. [Fermilab]

    2017-06-15

    In HEP tracking trigger applications, flagging an individual detector hit is not important. Rather, the path of a charged particle through many detector layers is what must be found. Moreover, given the increased luminosity projected for future LHC experiments, this type of track finding will be required within the Level 1 Trigger system. This means that future LHC experiments require not just a chip capable of high-speed track finding but also one with a high-speed readout architecture. VIPRAM_L1CMS is a 2-Tier Vertically Integrated chip designed to fulfill these requirements. It is a complete pipelined Pattern Recognition Associative Memory (PRAM) architecture including pattern recognition, result sparsification, and readout for Level 1 trigger applications in CMS with 15-bit wide detector addresses and eight detector layers included in the track finding. Pattern recognition is based on classic Content Addressable Memories with a Current Race Scheme to reduce timing complexity and a 4-bit Selective Precharge to minimize power consumption. VIPRAM_L1CMS uses a pipelined set of priority-encoded binary readout structures to sparsify and read out active road flags at frequencies of at least 100 MHz. VIPRAM_L1CMS is designed to work directly with the Pulsar2b Architecture.
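
    The matching idea behind a pattern recognition associative memory can be sketched in software: each stored pattern (a "road") holds one detector address per layer, incoming hits are compared against every stored pattern, and a road is flagged once all of its layers have matched. The sketch below is a sequential model under stated assumptions; the hardware compares all patterns in parallel, the all-layers-must-match rule is assumed here, and the names are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

N_LAYERS = 8       # detector layers used in track finding (from the abstract)
ADDRESS_BITS = 15  # width of a detector address (from the abstract)

Road = Tuple[int, ...]  # one stored address per layer
Hit = Tuple[int, int]   # (layer index, detector address)


def find_roads(roads: List[Road], hits: Iterable[Hit]) -> List[int]:
    """Return indices of roads whose address matched on every layer."""
    matched: Dict[int, Set[int]] = {i: set() for i in range(len(roads))}
    for layer, address in hits:
        if not 0 <= address < 2 ** ADDRESS_BITS:
            raise ValueError("address does not fit in 15 bits")
        # A content addressable memory performs this comparison against
        # all stored patterns at once; the loop models it sequentially.
        for idx, road in enumerate(roads):
            if road[layer] == address:
                matched[idx].add(layer)
    return [idx for idx, layers in matched.items() if len(layers) == N_LAYERS]


if __name__ == "__main__":
    stored = [tuple(range(8)), tuple(range(100, 108))]
    event = [(layer, layer) for layer in range(8)] + [(0, 100)]
    print(find_roads(stored, event))
```

    Returning only the indices of fired roads mirrors, loosely, the sparsified readout mentioned in the abstract: the output lists active roads rather than the state of every stored pattern.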

  17. On the Performance of Three In-Memory Data Systems for On Line Analytical Processing

    Directory of Open Access Journals (Sweden)

    Ionut HRUBARU

    2017-01-01

    In-memory database systems are among the most recent and most promising Big Data technologies, being developed and released either as brand new distributed systems or as extensions of old monolithic (centralized) database systems. As the name suggests, in-memory systems cache all the data in special memory structures. Many are part of the NewSQL strand and aim to bridge the gap between OLTP and OLAP in so-called Hybrid Transactional Analytical Processing (HTAP) systems. This paper aims to test the performance of such systems for TPC-H analytical workloads. Performance is analyzed in terms of data loading, memory footprint and execution time of the TPC-H query set for three in-memory data systems: Oracle, SQL Server and MemSQL. Tests are subsequently deployed on classical on-disk architectures and the results are compared with those of the in-memory solutions. As in-memory is an enterprise edition feature, associated costs are also considered.

  18. Non-volatile memory devices with redox-active diruthenium molecular compound

    International Nuclear Information System (INIS)

    Pookpanratana, S; Zhu, H; Bittle, E G; Richter, C A; Li, Q; Hacker, C A; Natoli, S N; Ren, T

    2016-01-01

    Reduction-oxidation (redox) active molecules hold potential for memory devices due to their many unique properties. We report the use of a novel diruthenium-based redox molecule incorporated into a non-volatile Flash-based memory device architecture. The memory capacitor device consists of a Pd/Al2O3/molecule/SiO2/Si structure. The bulky ruthenium redox molecule is attached to the surface by using a ‘click’ reaction and the monolayer structure is characterized by x-ray photoelectron spectroscopy to verify the Ru attachment and molecular density. The ‘click’ reaction is particularly advantageous for memory applications because (1) it eases chemical design and synthesis, and (2) it provides an additional spatial barrier between the oxide/silicon and the diruthenium molecule. Ultraviolet photoelectron spectroscopy data identified the energy of the electronic levels of the surface before and after surface modification. The molecular memory devices display an unsaturated charge storage window attributed to the intrinsic properties of the redox-active molecule. Our findings demonstrate the strengths and challenges of integrating molecular layers within solid-state devices, which will influence the future design of molecular memory devices.

  19. Memory handling in the ATLAS submission system from job definition to sites limits

    Science.gov (United States)

    Forti, A. C.; Walker, R.; Maeno, T.; Love, P.; Rauschmayr, N.; Filipcic, A.; Di Girolamo, A.

    2017-10-01

    In the past few years the increased luminosity of the LHC, changes in the Linux kernel and a move to a 64-bit architecture have affected the memory usage of ATLAS jobs. The ATLAS workload management system had to be adapted to be more flexible and to pass memory parameters to the batch systems, which in the past wasn’t a necessity. This paper describes the steps required to add the capability to better handle memory requirements, including a review of how each component's definition and parametrization of the memory is mapped to the other components, and what changes had to be applied to make the submission chain work. These changes range from the definition of tasks and the way task memory requirements are set using scout jobs, through the new memory tool developed to do that, to how these values are used by the submission component of the system and how the jobs are treated by the sites through the CEs, batch systems and ultimately the kernel.

  20. Non-volatile flash memory with discrete bionanodot floating gate assembled by protein template

    International Nuclear Information System (INIS)

    Miura, Atsushi; Yamashita, Ichiro; Uraoka, Yukiharu; Fuyuki, Takashi; Tsukamoto, Rikako; Yoshii, Shigeo

    2008-01-01

    We demonstrated non-volatile flash memory fabrication by utilizing a uniformly sized cobalt oxide (Co3O4) bionanodot (Co-BND) architecture assembled by a cage-shaped supramolecular protein template. A fabricated high-density Co-BND array was buried in a metal-oxide-semiconductor field-effect-transistor (MOSFET) structure to serve as the charge storage node of a floating nanodot gate memory. We observed a clockwise hysteresis in the drain current-gate voltage characteristics of fabricated BND-embedded MOSFETs. The observed hysteresis indicates memory operation of the Co-BND-embedded MOSFETs due to charge confinement in the embedded BNDs, and successful functioning of the embedded BNDs as the charge storage nodes of the non-volatile flash memory. Fabricated Co-BND-embedded MOSFETs showed good memory properties such as wide memory windows, long charge retention and high tolerance to repeated write/erase operations. A new pathway for device fabrication by utilizing the versatile functionality of biomolecules is presented.

  1. Memory handling in the ATLAS submission system from job definition to sites limits

    CERN Document Server

    AUTHOR|(INSPIRE)INSPIRE-00027700; The ATLAS collaboration; Walker, Rodney; Maeno, Tadashi; Love, Peter; Rauschmayr, Nathalie; Filipcic, Andrej; Di Girolamo, Alessandro

    2017-01-01

    In the past few years the increased luminosity of the LHC, changes in the Linux kernel and a move to a 64-bit architecture have affected the memory usage of ATLAS jobs. The ATLAS workload management system had to be adapted to be more flexible and to pass memory parameters to the batch systems, which in the past wasn’t a necessity. This paper describes the steps required to add the capability to better handle memory requirements, including a review of how each component's definition and parametrization of the memory is mapped to the other components, and what changes had to be applied to make the submission chain work. These changes range from the definition of tasks and the way task memory requirements are set using scout jobs, through the new memory tool developed to do that, to how these values are used by the submission component of the system and how the jobs are treated by the sites through the CEs, batch systems and ultimately the kernel.

  2. An Incremental Time-delay Neural Network for Dynamical Recurrent Associative Memory

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    An incremental time-delay neural network based on synapse growth, which is suitable for dynamic control and learning of autonomous robots, is proposed to improve the learning and retrieving performance of dynamical recurrent associative memory architecture. The model allows steady and continuous establishment of associative memory for spatio-temporal regularities and time series in discrete sequences of inputs. The inserted hidden units can be taken as long-term memories that expand the capacity of the network and may sometimes fade away under certain conditions. A preliminary experiment has shown that this incremental network may be a promising approach to endow autonomous robots with the ability to adapt to new data without destroying the learned patterns. The system also benefits from its potential chaos character for emergence.

  3. Earth Orbiting Support Systems for commercial low Earth orbit data relay: Assessing architectures through tradespace exploration

    Science.gov (United States)

    Palermo, Gianluca; Golkar, Alessandro; Gaudenzi, Paolo

    2015-06-01

    As small satellites and Sun Synchronous Earth Observation systems are assuming an increased role in today's space activities, including commercial investments, it is of interest to assess how infrastructures could be developed to support the development of such systems and other spacecraft that could benefit from having a data relay service in Low Earth Orbit (LEO), as opposed to traditional Geostationary relays. This paper presents a tradespace exploration study of the architecture of such LEO commercial satellite data relay systems, here defined as Earth Orbiting Support Systems (EOSS). The paper proposes a methodology to formulate architectural decisions for EOSS constellations, and enumerate the corresponding tradespace of feasible architectures. Evaluation metrics are proposed to measure benefits and costs of architectures; lastly, a multicriteria Pareto criterion is used to downselect optimal architectures for subsequent analysis. The methodology is applied to two case studies for a set of 30 and 100 customer-spacecraft respectively, representing potential markets for LEO services in Exploration, Earth Observation, Science, and CubeSats. Pareto analysis shows how increased performance of the constellation is always achieved by an increased node size, as measured by the gain of the communications antenna mounted on EOSS spacecraft. On the other hand, nonlinear trends in optimal orbital altitude, number of satellites per plane, and number of orbital planes are found in both cases. An upward trend in individual node memory capacity is found, although never exceeding 256 Gbits of onboard memory for both cases that have been considered, assuming the availability of a polar ground station for EOSS data downlink. System architects can use the proposed methodology to identify optimal EOSS constellations for a given service pricing strategy and customer target, thus identifying alternatives for selection by decision makers.
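
    The Pareto downselection step mentioned in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two metrics (`benefit`, `cost`), the candidate names and all numbers are hypothetical and are not taken from the paper, which uses its own evaluation metrics.

```python
def pareto_front(designs):
    """Return designs not dominated by any other (maximise benefit, minimise cost)."""
    def dominates(a, b):
        # a dominates b if it is no worse on both metrics and strictly better on one
        no_worse = a["benefit"] >= b["benefit"] and a["cost"] <= b["cost"]
        better = a["benefit"] > b["benefit"] or a["cost"] < b["cost"]
        return no_worse and better

    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Hypothetical candidate architectures; values are illustrative only.
candidates = [
    {"name": "A", "benefit": 0.60, "cost": 100},
    {"name": "B", "benefit": 0.80, "cost": 150},
    {"name": "C", "benefit": 0.70, "cost": 180},  # dominated by B
    {"name": "D", "benefit": 0.95, "cost": 300},
    {"name": "E", "benefit": 0.60, "cost": 120},  # dominated by A
]
front_names = [d["name"] for d in pareto_front(candidates)]
print(front_names)
```

    The quadratic pairwise comparison is adequate for small tradespaces; larger enumerations would normally sort by one metric first.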

  4. Une approche de coloriage d’arrêtes pour la conception d’architectures parallèles d’entrelaceurs matériels

    OpenAIRE

    Awais Hussein, Sani

    2012-01-01

    Nowadays, Turbo and LDPC codes are two families of codes that are extensively used in current communication standards due to their excellent error correction capabilities. However, hardware design of coders and decoders for high data rate applications is not a straightforward process. For high data rates, decoders are implemented on parallel architectures in which more than one processing element decodes the received data. To achieve high memory bandwidth, the main memory is divided into smal...

  5. Architectures of a fragmented memory: imprisonment and liberation in W. G. Sebald's Austerlitz

    Directory of Open Access Journals (Sweden)

    Camila Marchesan Cargnelutti

    2015-07-01

    Austerlitz (2001), written by the German author Sebald, presents a fragmented narrative with various levels of relations and symbolic plans outlined by the story of Jacques Austerlitz. This form of literary construction is in perfect harmony with the fragmentation of the past and the oblivion that shape Austerlitz. As the character’s investigations and self-discovery process advance, we find that he was one of the Jewish children brought to London by the Kindertransports on the eve of World War II. In this study, we investigate a kind of dividing line in Austerlitz’s story, establishing itself as an ‘in-between’ that evokes two considerably distinct moments of the narrative. These moments sometimes evoke imprisonment and relate to imprisoned memories, and sometimes evoke liberation and relate to freed memory. First, we track images and descriptions that refer to imprisonment when Austerlitz feels trapped, isolated, without past or memories. Subsequently, we map descriptions of this kind of liberation that begins when the character starts to redraw his past, in a process of self-discovery and reconstruction of his story and his identity. In this work, both Austerlitz and Sebald evoke the need to remember the traumatic past and witness it, despite all the pain and incomprehension while facing it.

  6. Architectural prototyping

    DEFF Research Database (Denmark)

    Bardram, Jakob Eyvind; Christensen, Henrik Bærbak; Hansen, Klaus Marius

    2004-01-01

    A major part of software architecture design is learning how specific architectural designs balance the concerns of stakeholders. We explore the notion of "architectural prototypes", correspondingly architectural prototyping, as a means of using executable prototypes to investigate stakeholders...

  7. An efficient interpolation filter VLSI architecture for HEVC standard

    Science.gov (United States)

    Zhou, Wei; Zhou, Xin; Lian, Xiaocong; Liu, Zhenyu; Liu, Xiaoxiang

    2015-12-01

    The next-generation video coding standard of High-Efficiency Video Coding (HEVC) is especially efficient for coding high-resolution video such as 8K-ultra-high-definition (UHD) video. Fractional motion estimation in HEVC presents a significant challenge in clock latency and area cost as it consumes more than 40% of the total encoding time and thus results in high computational complexity. Aiming to support 8K-UHD video applications, an efficient interpolation filter VLSI architecture for HEVC is proposed in this paper. Firstly, a new interpolation filter algorithm based on the 8-pixel interpolation unit is proposed. It can save 19.7% processing time on average with acceptable coding quality degradation. Based on the proposed algorithm, an efficient interpolation filter VLSI architecture, composed of a reused data path of interpolation, an efficient memory organization, and a reconfigurable pipeline interpolation filter engine, is presented to reduce the implementation hardware area and achieve high throughput. The final VLSI implementation only requires 37.2k gates in a standard 90-nm CMOS technology at an operating frequency of 240 MHz. The proposed architecture can be reused for either half-pixel interpolation or quarter-pixel interpolation, which can reduce the area cost by about 131,040 bits of RAM. The processing latency of our proposed VLSI architecture can support the real-time processing of 4:2:0 format 7680 × 4320@78fps video sequences.
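
    For readers unfamiliar with fractional-pixel interpolation, the following Python sketch applies the standard HEVC 8-tap luma filters to a row of eight integer samples. It illustrates the baseline filtering that such an architecture accelerates, not the modified algorithm proposed in the paper. The single-stage rounding and 8-bit clipping are simplifying assumptions, and the function name `interpolate` is hypothetical.

```python
# Standard HEVC luma interpolation taps; each set sums to 64.
HALF_PEL = (-1, 4, -11, 40, 40, -11, 4, -1)
QUARTER_PEL = (-1, 4, -10, 58, 17, -5, 1, 0)


def interpolate(pixels, taps):
    """Filter 8 neighbouring integer samples into one fractional sample.

    Simplified one-dimensional case: round, divide by 64, clip to 8 bits.
    """
    if len(pixels) != 8:
        raise ValueError("the 8-tap filter needs exactly 8 samples")
    acc = sum(p * c for p, c in zip(pixels, taps))
    value = (acc + 32) >> 6          # rounding division by 64
    return max(0, min(255, value))   # clip to the 8-bit sample range


flat = [100] * 8
edge = [0, 0, 0, 0, 255, 255, 255, 255]
print(interpolate(flat, HALF_PEL), interpolate(edge, HALF_PEL))
```

    Because both tap sets share the same 8-sample window, a hardware data path can plausibly be reused for half-pixel and quarter-pixel positions by switching coefficients, which is the kind of reuse the abstract describes.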

  8. Nonvolatile “AND,” “OR,” and “NOT” Boolean logic gates based on phase-change memory

    Energy Technology Data Exchange (ETDEWEB)

    Li, Y.; Zhong, Y. P.; Deng, Y. F.; Zhou, Y. X.; Xu, L.; Miao, X. S., E-mail: miaoxs@mail.hust.edu.cn [Wuhan National Laboratory for Optoelectronics (WNLO), Huazhong University of Science and Technology (HUST), Wuhan 430074 (China); School of Optical and Electronic Information, Huazhong University of Science and Technology, Wuhan 430074 (China)

    2013-12-21

    Electronic devices or circuits that can implement both logic and memory functions are regarded as the building blocks for future massive parallel computing beyond von Neumann architecture. Here we proposed phase-change memory (PCM)-based nonvolatile logic gates capable of AND, OR, and NOT Boolean logic operations verified in SPICE simulations and circuit experiments. The logic operations are parallel computing and results can be stored directly in the states of the logic gates, facilitating the combination of computing and memory in the same circuit. These results are encouraging for ultralow-power and high-speed nonvolatile logic circuit design based on novel memory devices.

  9. Nonvolatile “AND,” “OR,” and “NOT” Boolean logic gates based on phase-change memory

    International Nuclear Information System (INIS)

    Li, Y.; Zhong, Y. P.; Deng, Y. F.; Zhou, Y. X.; Xu, L.; Miao, X. S.

    2013-01-01

    Electronic devices or circuits that can implement both logic and memory functions are regarded as the building blocks for future massive parallel computing beyond von Neumann architecture. Here we proposed phase-change memory (PCM)-based nonvolatile logic gates capable of AND, OR, and NOT Boolean logic operations verified in SPICE simulations and circuit experiments. The logic operations are parallel computing and results can be stored directly in the states of the logic gates, facilitating the combination of computing and memory in the same circuit. These results are encouraging for ultralow-power and high-speed nonvolatile logic circuit design based on novel memory devices.

  10. Architectural communication: Intra and extra activity of architecture

    Directory of Open Access Journals (Sweden)

    Stamatović-Vučković Slavica

    2013-01-01

    Apart from a brief overview of architectural communication viewed from the standpoint of theory of information and semiotics, this paper contains two forms of dualistically viewed architectural communication. The duality denotation/connotation (“primary” and “secondary” architectural communication) is one of the semiotic postulates taken from Umberto Eco, who viewed architectural communication as a semiotic phenomenon. In addition, architectural communication can be viewed as an intra and an extra activity of architecture, where the overall activity of the edifice performed through its spatial manifestation may be understood as an act of communication. In that respect, the activity may be perceived as the “behavior of architecture”, which corresponds to Lefebvre’s production of space.

  11. Can We Efficiently Check Concurrent Programs Under Relaxed Memory Models in Maude?

    DEFF Research Database (Denmark)

    Arrahman, Yehia Abd; Andric, Marina; Beggiato, Alessandro

    2014-01-01

    Relaxed memory models offer suitable abstractions of the actual optimizations offered by multi-core architectures and by compilers of concurrent programming languages. Using such abstractions for verification purposes is challenging in part due to their inherent non-determinism which contributes to the state space explosion. Several techniques have been proposed to mitigate those problems so as to make verification under relaxed memory models feasible. We discuss how to adopt some of those techniques in a Maude-based approach to language prototyping, and suggest the use of other techniques that have been...
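
    As a rough illustration of the extra non-determinism a relaxed memory model introduces, the Python sketch below (not Maude, and not the authors' method) enumerates the outcomes of the classic store-buffering litmus test. The relaxed model here is a deliberately crude assumption: it lets each thread's store and load take effect in either order.

```python
from itertools import permutations

# Store-buffering litmus test, all variables initially 0:
#   thread 0: x = 1; r0 = y        thread 1: y = 1; r1 = x
THREADS = {
    0: [("store", "x"), ("load", "y")],
    1: [("store", "y"), ("load", "x")],
}


def outcomes(relaxed):
    """Return (reachable (r0, r1) pairs, number of executions explored)."""
    events = [(t, i) for t in THREADS for i in range(2)]
    results, explored = set(), 0
    for order in permutations(events):
        if not relaxed and any(
            order.index((t, 0)) > order.index((t, 1)) for t in THREADS
        ):
            continue  # sequential consistency keeps program order
        explored += 1
        memory, regs = {"x": 0, "y": 0}, {}
        for t, i in order:
            kind, var = THREADS[t][i]
            if kind == "store":
                memory[var] = 1
            else:
                regs[t] = memory[var]
        results.add((regs[0], regs[1]))
    return results, explored


sc_results, sc_runs = outcomes(relaxed=False)
rx_results, rx_runs = outcomes(relaxed=True)
print(sorted(sc_results), sc_runs)
print(sorted(rx_results), rx_runs)
```

    Even for this four-event program the relaxed model explores four times as many executions and admits the additional outcome (0, 0), which hints at why state space explosion becomes a concern for realistic programs.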

  12. Quality Assurance in Architectural Education in Asia – On the Perspective of ‘Design’ Based Architectural Education and its Holistic Assessment

    Directory of Open Access Journals (Sweden)

    Lee Junsuk

    2018-01-01

    This paper starts by describing the importance of ‘Architecture’ to our society and to our everyday life. It dominates our living environment and inevitably forms visual memories of life experience. Undoubtedly, a building is a ‘container’ that holds one’s everyday living, and the grouping of architecture into an urban environment is a ‘container’ that holds one’s entire life. It is therefore of paramount importance to realize how we should properly educate our future ‘architects.’ First, it is important that the essentials of architecture education be based on a proper ‘Design Studio’ format. Many of the reasons are described by quotes such as ‘like art, architecture cannot be taught, but it can and should be cultivated,’ ‘Essentials of architecture can be learned through actual doing,’ and ‘architecture education is most perfect liberal arts education format invented.’ Secondly, it is important to realize the vast meaning of architecture and the architect to our society. Luckily, there is the UNESCO/UIA Charter for Architectural Education, which serves not only as a fundamental background to this discussion but also as a ‘prior principle’ in articulating proper education for architects. Thirdly, taking the Charter as a principle, it is worth discussing actual methodology as a working system for delivering ‘accredited/validated education.’ The KAAB’s Conditions are compared with the Charter. For an in-depth look into the most important set of KAAB Conditions, the Student Performance Criteria, the paper describes the origin, historic background, and evolution of the ‘Student Performance Criteria,’ which originated with the NAAB in the 1980s. As they carry much of the weight of the whole accrediting process, they must be carefully written to reflect society’s needs of an architect. Also, they must not be written too specifically, nor should they include quantitative measures, to avoid all school programs

  13. A Visual Approach to Investigating Shared and Global Memory Behavior of CUDA Kernels

    KAUST Repository

    Rosen, Paul

    2013-01-01

    We present an approach to investigate the memory behavior of a parallel kernel executing on thousands of threads simultaneously within the CUDA architecture. Our top-down approach allows for quickly identifying any significant differences between the execution of the many blocks and warps. As interesting warps are identified, we allow further investigation of memory behavior by visualizing the shared memory bank conflicts and global memory coalescence, first with an overview of a single warp with many operations and, subsequently, with a detailed view of a single warp and a single operation. We demonstrate the strength of our approach in the context of a parallel matrix transpose kernel and a parallel 1D Haar Wavelet transform kernel. © 2013 The Author(s) Computer Graphics Forum © 2013 The Eurographics Association and Blackwell Publishing Ltd.
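
    The shared memory bank conflicts that this tool visualizes can be reproduced with a small host-side model. The Python sketch below is an assumption-laden illustration, not the paper's visualization: it assumes 32 banks of 4-byte words and a 32-thread warp, and models a warp reading one column of a row-major tile, as in a naive matrix transpose. The function name `conflict_degree` is hypothetical.

```python
from collections import Counter

NUM_BANKS = 32   # assumed bank count (one 4-byte word per bank per cycle)
WARP_SIZE = 32   # assumed warp width


def conflict_degree(row_width, column=0):
    """Worst-case number of threads in a warp that hit the same bank.

    Thread t reads word index t * row_width + column of a row-major tile;
    consecutive words map to consecutive banks.
    """
    banks = Counter(
        (t * row_width + column) % NUM_BANKS for t in range(WARP_SIZE)
    )
    return max(banks.values())


unpadded = conflict_degree(row_width=32)  # tile[32][32]
padded = conflict_degree(row_width=33)    # tile[32][33], one padding column
print(unpadded, padded)
```

    Under these assumptions the unpadded tile serializes the whole warp on one bank, while a single padding column spreads the accesses across all banks, which is the usual remedy in transpose kernels.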

  14. A Visual Approach to Investigating Shared and Global Memory Behavior of CUDA Kernels

    KAUST Repository

    Rosen, Paul

    2013-06-01

    We present an approach to investigate the memory behavior of a parallel kernel executing on thousands of threads simultaneously within the CUDA architecture. Our top-down approach allows for quickly identifying any significant differences between the execution of the many blocks and warps. As interesting warps are identified, we allow further investigation of memory behavior by visualizing the shared memory bank conflicts and global memory coalescence, first with an overview of a single warp with many operations and, subsequently, with a detailed view of a single warp and a single operation. We demonstrate the strength of our approach in the context of a parallel matrix transpose kernel and a parallel 1D Haar Wavelet transform kernel. © 2013 The Author(s) Computer Graphics Forum © 2013 The Eurographics Association and Blackwell Publishing Ltd.

  15. Sleep reduces false memory in healthy older adults.

    Science.gov (United States)

    Lo, June C; Sim, Sam K Y; Chee, Michael W L

    2014-04-01

    To investigate the effects of post-learning sleep and sleep architecture on false memory in healthy older adults. Balanced, crossover design. False memory was induced using the Deese-Roediger-McDermott (DRM) paradigm and assessed following nocturnal sleep and following a period of daytime wakefulness. Post-learning sleep structure was evaluated using polysomnography (PSG). Sleep research laboratory. Fourteen healthy older adults from the Singapore-Longitudinal Aging Brain Study (mean age ± standard deviation = 66.6 ± 4.1 y; 7 males). At encoding, participants studied lists of words that were semantically related to non-presented critical lures. At retrieval, they made "remember"/"know" and "new" judgments. Compared to wakefulness, post-learning sleep was associated with reduced "remember" responses, but not "know" responses to critical lures. In contrast, there were no significant differences in the veridical recognition of studied words, false recognition of unrelated distractors, discriminability, or response bias between the sleep and the wake conditions. More post-learning slow wave sleep was associated with greater reduction in false memory. In healthy older adults, sleep facilitates the reduction in false memory without affecting veridical memory. This benefit correlates with the amount of slow wave sleep in the post-learning sleep episode.

  16. Memory, reasoning, and categorization: parallels and common mechanisms.

    Science.gov (United States)

    Hayes, Brett K; Heit, Evan; Rotello, Caren M

    2014-01-01

    Traditionally, memory, reasoning, and categorization have been treated as separate components of human cognition. We challenge this distinction, arguing that there is broad scope for crossover between the methods and theories developed for each task. The links between memory and reasoning are illustrated in a review of two lines of research. The first takes theoretical ideas (two-process accounts) and methodological tools (signal detection analysis, receiver operating characteristic curves) from memory research and applies them to important issues in reasoning research: relations between induction and deduction, and the belief bias effect. The second line of research introduces a task in which subjects make either memory or reasoning judgments for the same set of stimuli. Other than broader generalization for reasoning than memory, the results were similar for the two tasks, across a variety of experimental stimuli and manipulations. It was possible to simultaneously explain performance on both tasks within a single cognitive architecture, based on exemplar-based comparisons of similarity. The final sections explore evidence for empirical and processing links between inductive reasoning and categorization and between categorization and recognition. An important implication is that progress in all three of these fields will be expedited by further investigation of the many commonalities between these tasks.

  17. Memory, reasoning and categorization: Parallels and common mechanisms

    Directory of Open Access Journals (Sweden)

    Brett Hayes

    2014-06-01

    Traditionally, memory, reasoning and categorization have been treated as separate components of human cognition. We challenge this distinction, arguing that there is broad scope for crossover between the methods and theories developed for each task. The links between memory and reasoning are illustrated in a review of two lines of research. The first takes theoretical ideas (two-process accounts) and methodological tools (signal detection analysis, receiver operating characteristic curves) from memory research and applies them to important issues in reasoning research: relations between induction and deduction, and the belief bias effect. The second line of research introduces a task in which subjects make either memory or reasoning judgments for the same set of stimuli. Other than broader generalization for reasoning than memory, the results were similar for the two tasks, across a variety of experimental stimuli and manipulations. It was possible to simultaneously explain performance on both tasks within a single cognitive architecture, based on exemplar-based comparisons of similarity. The final sections explore evidence for empirical and processing links between inductive reasoning and categorization and between categorization and recognition. An important implication is that progress in all three of these fields will be expedited by further investigation of the many commonalities between these tasks.

  18. Optical interconnection network for parallel access to multi-rank memory in future computing systems.

    Science.gov (United States)

    Wang, Kang; Gu, Huaxi; Yang, Yintang; Wang, Kun

    2015-08-10

    With the number of cores increasing, there is an emerging need for a high-bandwidth low-latency interconnection network, serving core-to-memory communication. In this paper, aiming at the goal of simultaneous access to multi-rank memory, we propose an optical interconnection network for core-to-memory communication. In the proposed network, the wavelength usage is delicately arranged so that cores can communicate with different ranks at the same time and broadcast for flow control can be achieved. A distributed memory controller architecture that works in a pipeline mode is also designed for efficient optical communication and transaction address processes. The scaling method and wavelength assignment for the proposed network are investigated. Compared with traditional electronic bus-based core-to-memory communication, the simulation results based on the PARSEC benchmark show that the bandwidth enhancement and latency reduction are apparent.

  19. SCI based data acquisition architectures

    International Nuclear Information System (INIS)

    Bogaerts, J.A.C.; Divia, R.; Renardy, J.F.

    1992-01-01

    This paper discusses the Scalable Coherent Interface (SCI), an IEEE proposed standard (P1596) for interconnecting multiprocessor systems. The standard defines point-to-point connections between nodes, which can be processors, memories or I/O devices. Networks containing a maximum of 64K nodes, with a bandwidth of one Gbyte/s between nodes, may be constructed. SCI is an attractive candidate to serve as a backbone for high-speed, large-volume data acquisition systems such as those required by future experiments at the proposed Large Hadron Collider (LHC) at CERN. Work has started to simulate SCI-based architectures for data acquisition systems. The simulation program proved to be a useful tool to study SCI systems. First results are reported on a model of a large LHC experiment containing over 1000 nodes.

  20. Auditory short-term memory in the primate auditory cortex.

    Science.gov (United States)

    Scott, Brian H; Mishkin, Mortimer

    2016-06-01

    Sounds are fleeting, and assembling the sequence of inputs at the ear into a coherent percept requires auditory memory across various time scales. Auditory short-term memory comprises at least two components: an active ‘working memory’ bolstered by rehearsal, and a sensory trace that may be passively retained. Working memory relies on representations recalled from long-term memory, and their rehearsal may require phonological mechanisms unique to humans. The sensory component, passive short-term memory (pSTM), is tractable to study in nonhuman primates, whose brain architecture and behavioral repertoire are comparable to our own. This review discusses recent advances in the behavioral and neurophysiological study of auditory memory with a focus on single-unit recordings from macaque monkeys performing delayed-match-to-sample (DMS) tasks. Monkeys appear to employ pSTM to solve these tasks, as evidenced by the impact of interfering stimuli on memory performance. In several regards, pSTM in monkeys resembles pitch memory in humans, and may engage similar neural mechanisms. Neural correlates of DMS performance have been observed throughout the auditory and prefrontal cortex, defining a network of areas supporting auditory STM with parallels to that supporting visual STM. These correlates include persistent neural firing, or a suppression of firing, during the delay period of the memory task, as well as suppression or (less commonly) enhancement of sensory responses when a sound is repeated as a ‘match’ stimulus. Auditory STM is supported by a distributed temporo-frontal network in which sensitivity to stimulus history is an intrinsic feature of auditory processing. This article is part of a Special Issue entitled SI: Auditory working memory. Published by Elsevier B.V.

  1. Architectural freedom and industrialized architecture

    DEFF Research Database (Denmark)

    Vestergaard, Inge

    2012-01-01

    to explain that architecture can be thought as a complex and diverse design through customization, telling exactly the revitalized storey about the change to a contemporary sustainable and better performing expression in direct relation to the given context. Through the last couple of years we have...... proportions, to organize the process on site choosing either one room wall components or several rooms wall components – either horizontally or vertically. Combined with the seamless joint the playing with these possibilities the new industrialized architecture can deliver variations in choice of solutions...... for retrofit design. If we add the question of the installations e.g. ventilation to this systematic thinking of building technique we get a diverse and functional architecture, thereby creating a new and clearer story telling about new and smart system based thinking behind architectural expression....

  2. Architectural freedom and industrialized architecture

    DEFF Research Database (Denmark)

    Vestergaard, Inge

    2012-01-01

    to explain that architecture can be thought as a complex and diverse design through customization, telling exactly the revitalized storey about the change to a contemporary sustainable and better performing expression in direct relation to the given context. Through the last couple of years we have...... expression in the specific housing area. It is the aim of this article to expand the different design strategies which architects can use – to give the individual project attitudes and designs with architectural quality. Through the customized component production it is possible to choose different...... for retrofit design. If we add the question of the installations e.g. ventilation to this systematic thinking of building technique we get a diverse and functional architecture, thereby creating a new and clearer story telling about new and smart system based thinking behind architectural expression....

  3. Architectural freedom and industrialised architecture

    DEFF Research Database (Denmark)

    Vestergaard, Inge

    2012-01-01

    Architectural freedom and industrialized architecture. Inge Vestergaard, Associate Professor, Cand. Arch. Aarhus School of Architecture, Denmark Noerreport 20, 8000 Aarhus C Telephone +45 89 36 0000 E-mail inge.vestergaard@aarch.dk Based on the repetitive architecture from the "building boom" 1960...... customization, telling exactly the revitalized storey about the change to a contemporary sustainable and better performed expression in direct relation to the given context. Through the last couple of years we have in Denmark been focusing a more sustainable and low energy building technique, which also include...... to the building physic problems a new industrialized period has started based on light weight elements basically made of wooden structures, faced with different suitable materials meant for individual expression for the specific housing area. It is the purpose of this article to widen up the different design...

  4. Reduction of Used Memory Ensemble Kalman Filtering (RumEnKF): A data assimilation scheme for memory intensive, high performance computing

    Science.gov (United States)

    Hut, Rolf; Amisigo, Barnabas A.; Steele-Dunne, Susan; van de Giesen, Nick

    2015-12-01

    Reduction of Used Memory Ensemble Kalman Filtering (RumEnKF) is introduced as a variant on the Ensemble Kalman Filter (EnKF). RumEnKF differs from EnKF in that it does not store the entire ensemble, but rather only saves the first two moments of the ensemble distribution. In this way, the number of ensemble members that can be calculated is less dependent on available memory, and mainly on available computing power (CPU). RumEnKF is developed to make optimal use of current-generation supercomputer architecture, where the number of available floating point operations (flops) increases more rapidly than the available memory and where inter-node communication can quickly become a bottleneck. RumEnKF reduces the used memory compared to the EnKF when the number of ensemble members is greater than half the number of state variables. In this paper, three simple models are used (auto-regressive, low dimensional Lorenz and high dimensional Lorenz) to show that RumEnKF performs similarly to the EnKF. Furthermore, it is also shown that increasing the ensemble size has a similar impact on the estimation error from the three algorithms.
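
    The idea of keeping only the first two moments instead of the whole ensemble can be sketched with a streaming mean and covariance update. The Python below is a generic Welford-style accumulator under stated assumptions, not the RumEnKF implementation: the class and function names are hypothetical, and the storage count assumes the covariance is kept as a symmetric matrix (the sketch itself stores the full matrix for simplicity).

```python
class RunningMoments:
    """Accumulate ensemble mean and covariance one member at a time."""

    def __init__(self, n_state):
        self.count = 0
        self.mean = [0.0] * n_state
        # running sum of outer products of deviations (co-moment matrix)
        self.m2 = [[0.0] * n_state for _ in range(n_state)]

    def add(self, member):
        self.count += 1
        delta_old = [x - m for x, m in zip(member, self.mean)]
        self.mean = [m + d / self.count for m, d in zip(self.mean, delta_old)]
        delta_new = [x - m for x, m in zip(member, self.mean)]
        for i, di in enumerate(delta_old):
            for j, dj in enumerate(delta_new):
                self.m2[i][j] += di * dj

    def covariance(self):
        return [[v / (self.count - 1) for v in row] for row in self.m2]


def stored_numbers(n_state, n_members):
    """Floats kept by a full ensemble versus mean plus symmetric covariance."""
    ensemble = n_members * n_state
    moments = n_state + n_state * (n_state + 1) // 2
    return ensemble, moments


acc = RunningMoments(2)
for member in ([1.0, 2.0], [3.0, 6.0], [5.0, 10.0]):
    acc.add(member)  # each member can be discarded after this call
print(acc.mean, acc.covariance())
print(stored_numbers(100, 60), stored_numbers(100, 40))
```

    With these counts the moments need fewer numbers than the ensemble once the member count exceeds roughly half the state dimension, which appears consistent with the threshold quoted in the abstract.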

  5. Ferroelectric tunneling element and memory applications which utilize the tunneling element

    Science.gov (United States)

    Kalinin, Sergei V [Knoxville, TN; Christen, Hans M [Knoxville, TN; Baddorf, Arthur P [Knoxville, TN; Meunier, Vincent [Knoxville, TN; Lee, Ho Nyung [Oak Ridge, TN

    2010-07-20

    A tunneling element includes a thin film layer of ferroelectric material and a pair of dissimilar electrically-conductive layers disposed on opposite sides of the ferroelectric layer. Because of the dissimilarity in composition or construction between the electrically-conductive layers, the electron transport behavior of the electrically-conductive layers is polarization dependent when the tunneling element is below the Curie temperature of the layer of ferroelectric material. The element can be used as a basis of compact 1R type non-volatile random access memory (RAM). The advantages include extremely simple architecture, ultimate scalability and fast access times generic for all ferroelectric memories.

  6. Data Movement Dominates: Advanced Memory Technology to Address the Real Exascale Power Problem

    Energy Technology Data Exchange (ETDEWEB)

    Bergman, Keren

    2014-08-28

    Energy is the fundamental barrier to Exascale supercomputing and is dominated by the cost of moving data from one point to another, not computation. Similarly, performance is dominated by data movement, not computation. The solution to this problem requires three critical technologies: 3D integration, optical chip-to-chip communication, and a new communication model. The Sandia-led "Data Movement Dominates" project aimed to develop memory systems and new architectures based on these technologies that have the potential to lower the cost of local memory accesses by orders of magnitude and provide substantially more bandwidth. Only through these transformational advances can future systems reach the goals of Exascale computing with manageable power budgets. The Sandia-led team included co-PIs from Columbia University, Lawrence Berkeley Lab, and the University of Maryland. The Columbia effort of Data Movement Dominates focused on developing a physically accurate simulation environment and experimental verification for optically-connected memory (OCM) systems that can enable continued performance scaling through high-bandwidth capacity, energy-efficient bit-rate transparency, and time-of-flight latency. With OCM, memory device parallelism and total capacity can scale to match future high-performance computing requirements without sacrificing data-movement efficiency. When we consider systems with integrated photonics, links to memory can be seamlessly integrated with the interconnection network; in a sense, memory becomes a primary aspect of the interconnection network. At the core of the Columbia effort, toward expanding our understanding of OCM-enabled computing, we have created an integrated modeling and simulation environment that uniquely integrates the physical behavior of the optical layer. The PhoenixSim suite of design and software tools developed under this effort has enabled the co-design of and performance evaluation photonics-enabled OCM

  7. A direct metal transfer method for cross-bar type polymer non-volatile memory applications

    International Nuclear Information System (INIS)

    Kim, Tae-Wook; Lee, Kyeongmi; Oh, Seung-Hwan; Wang, Gunuk; Kim, Dong-Yu; Jung, Gun-Young; Lee, Takhee

    2008-01-01

    Polymer non-volatile memory devices in 8 x 8 array cross-bar architecture were fabricated by a non-aqueous direct metal transfer (DMT) method using a two-step thermal treatment. Top electrodes with a linewidth of 2 μm were transferred onto the polymer layer by the DMT method. The switching behaviour of memory devices fabricated by the DMT method was very similar to that of devices fabricated by the conventional shadow mask method. The devices fabricated using the DMT method showed three orders of magnitude of on/off ratio with stable resistance switching, demonstrating that the DMT method can be a simple process to fabricate organic memory array devices

  8. The Phenomenon of Touch in Architectural Design and a Field Study on Haptic Mapping

    Directory of Open Access Journals (Sweden)

    Pınar ÖKTEM ERKARTAL

    2015-02-01

    Ocular-centrism is the utilitarian-aesthetic perspective which dominates the perception of spatial quality and architectural success in the West. In locating vision as the dominant discourse in architectural design, this perspective has been criticized for ignoring the physical and psychological relation created between subject and space during the spatial experience, sensual memory, movement and time. The phenomenon of touch, which may be defined as the interaction between architecture and subject dependent on physical and cognitive perception, offers another way of thinking and interpreting architecture, and constitutes an alternative starting point for design. The aim of this study was three-fold: to research and describe the phenomenon of touch in design concepts, to present the effects of hapticity in spatial experience on the user, and to present a visualization study for this phenomenon, which is quite challenging to express. For the fieldwork, five buildings designed by Peter Zumthor were chosen. Zumthor stresses the importance of sensation, materiality and atmosphere in the architectural design process. Zumthor’s abstract design elements, their use in architectural space and their effect were determined using physical measurement. The findings were represented in “haptic mapping”. This visualization study consisted of a “haptic scatter chart”, “materiality-affect analysis” and “sensation analysis” and revealed that the phenomenon of touch and the concepts identified with it, such as sensations, influence, materiality and mental associations, are not abstract and inaccessible assumptions, but tools which can be included in the architectural design process.

  9. A highly efficient 3D level-set grain growth algorithm tailored for ccNUMA architecture

    Science.gov (United States)

    Mießen, C.; Velinov, N.; Gottstein, G.; Barrales-Mora, L. A.

    2017-12-01

    A highly efficient simulation model for 2D and 3D grain growth was developed based on the level-set method. The model introduces modern computational concepts to achieve excellent performance on parallel computer architectures. Strong scalability was measured on cache-coherent non-uniform memory access (ccNUMA) architectures. To achieve this, the proposed approach considers the application of local level-set functions at the grain level. Ideal and non-ideal grain growth was simulated in 3D with the objective to study the evolution of statistical representative volume elements in polycrystals. In addition, microstructure evolution in an anisotropic magnetic material affected by an external magnetic field was simulated.

  10. Enterprise architecture evaluation using architecture framework and UML stereotypes

    Directory of Open Access Journals (Sweden)

    Narges Shahi

    2014-08-01

    There is an increasing need for enterprise architecture in numerous organizations with complicated systems and various processes. The need to support information technology and organizational units whose elements maintain complex relationships is increasing. Enterprise architecture is so effective that its non-use in organizations is regarded as an institutional inability to manage information technology efficiently. The enterprise architecture process generally consists of three phases: strategic programming of information technology, enterprise architecture programming and enterprise architecture implementation. Each phase must be implemented sequentially, and a single flaw in any phase may result in a flaw in the whole architecture and, consequently, in extra costs and time. If a model is mapped for the issue and then evaluated before enterprise architecture implementation, in the second phase, possible flaws in the implementation process are prevented. In this study, the processes of enterprise architecture are illustrated through UML diagrams, and the architecture is evaluated in the programming phase by transforming the UML diagrams to Petri nets. The results indicate that the high costs of the implementation phase will be reduced.

  11. A neuromorphic circuit mimicking biological short-term memory.

    Science.gov (United States)

    Barzegarjalali, Saeid; Parker, Alice C

    2016-08-01

    Research shows that the way we remember things for a few seconds is a different mechanism from the way we remember things for a longer time. Short-term memory is based on persistently firing neurons, whereas storing information for a longer time is based on strengthening the synapses or even forming new neural connections. Information about location and appearance of an object is segregated and processed by separate neurons. Furthermore neurons can continue firing using different mechanisms. Here, we have designed a biomimetic neuromorphic circuit that mimics short-term memory by firing neurons, using biological mechanisms to remember location and shape of an object. Our neuromorphic circuit has a hybrid architecture. Neurons are designed with CMOS 45nm technology and synapses are designed with carbon nanotubes (CNT).

  12. Achieving High Performance With TCP Over 40 GbE on NUMA Architectures for CMS Data Acquisition

    Energy Technology Data Exchange (ETDEWEB)

    Bawej, Tomasz; et al.

    2014-01-01

    TCP and the socket abstraction have barely changed over the last two decades, but at the network layer there has been a giant leap from a few megabits to 100 gigabits in bandwidth. At the same time, CPU architectures have evolved into the multicore era and applications are expected to make full use of all available resources. Applications in the data acquisition domain based on the standard socket library running in a Non-Uniform Memory Access (NUMA) architecture are unable to reach full efficiency and scalability unless the software is adequately aware of the IRQ (Interrupt Request), CPU and memory affinities. During the first long shutdown of LHC, the CMS DAQ system is going to be upgraded for operation from 2015 onwards and a new software component has been designed and developed in the CMS online framework for transferring data with sockets. This software attempts to wrap the low-level socket library to ease higher-level programming with an API based on an asynchronous event-driven model similar to the DAT uDAPL API. It is an event-based application with NUMA optimizations that allows for a high throughput of data across a large distributed system. This paper describes the architecture, the technologies involved and the performance measurements of the software in the context of the CMS distributed event building.

  13. Optical computing, optical memory, and SBIRs at Foster-Miller

    Science.gov (United States)

    Domash, Lawrence H.

    1994-03-01

    A desktop design and manufacturing system for binary diffractive elements, MacBEEP, was developed with the optical researcher in mind. Optical processing systems for specialized tasks such as cellular automation computation and fractal measurement were constructed. A new family of switchable holograms has enabled several applications for control of laser beams in optical memories. New spatial light modulators and optical logic elements have been demonstrated based on a more manufacturable semiconductor technology. Novel synthetic and polymeric nonlinear materials for optical storage are under development in an integrated memory architecture. SBIR programs enable creative contributions from smaller companies, both product oriented and technology oriented, and support advances that might not otherwise be developed.

  14. Software architecture 2

    CERN Document Server

    Oussalah, Mourad Chabane

    2014-01-01

    Over the past 20 years, software architectures have significantly contributed to the development of complex and distributed systems. Nowadays, it is recognized that one of the critical problems in the design and development of any complex software system is its architecture, i.e. the organization of its architectural elements. Software Architecture presents the software architecture paradigms based on objects, components, services and models, as well as the various architectural techniques and methods, the analysis of architectural qualities, models of representation of architectural templa

  15. Lightweight enterprise architectures

    CERN Document Server

    Theuerkorn, Fenix

    2004-01-01

    STATE OF ARCHITECTURE; Architectural Chaos; Relation of Technology and Architecture; The Many Faces of Architecture; The Scope of Enterprise Architecture; The Need for Enterprise Architecture; The History of Architecture; The Current Environment; Standardization Barriers; The Need for Lightweight Architecture in the Enterprise; The Cost of Technology; The Benefits of Enterprise Architecture; The Domains of Architecture; The Gap between Business and IT; Where Does LEA Fit?; LEA's Framework; Frameworks, Methodologies, and Approaches; The Framework of LEA; Types of Methodologies; Types of Approaches; Actual System Environmen

  16. Software architecture 1

    CERN Document Server

    Oussalah, Mourad Chabane

    2014-01-01

    Over the past 20 years, software architectures have significantly contributed to the development of complex and distributed systems. Nowadays, it is recognized that one of the critical problems in the design and development of any complex software system is its architecture, i.e. the organization of its architectural elements. Software Architecture presents the software architecture paradigms based on objects, components, services and models, as well as the various architectural techniques and methods, the analysis of architectural qualities, models of representation of architectural template

  17. Sequence-to-Sequence Prediction of Vehicle Trajectory via LSTM Encoder-Decoder Architecture

    OpenAIRE

    Park, Seong Hyeon; Kim, ByeongDo; Kang, Chang Mook; Chung, Chung Choo; Choi, Jun Won

    2018-01-01

    In this paper, we propose a deep learning based vehicle trajectory prediction technique which can generate the future trajectory sequence of surrounding vehicles in real time. We employ the encoder-decoder architecture which analyzes the pattern underlying in the past trajectory using the long short-term memory (LSTM) based encoder and generates the future trajectory sequence using the LSTM based decoder. This structure produces the $K$ most likely trajectory candidates over occupancy grid ma...

  18. Indigenous architecture as a context-oriented architecture, a look at ...

    African Journals Online (AJOL)

    What has become problematic as the achievement of international style and globalization of architecture during the time has been the purely technological look at architecture, and the architecture without belonging to a place. In recent decades, the topic of sustainable architecture and reconsidering indigenous architecture ...

  19. Role of memory errors in quantum repeaters

    International Nuclear Information System (INIS)

    Hartmann, L.; Kraus, B.; Briegel, H.-J.; Duer, W.

    2007-01-01

    We investigate the influence of memory errors in the quantum repeater scheme for long-range quantum communication. We show that the communication distance is limited in standard operation mode due to memory errors resulting from unavoidable waiting times for classical signals. We show how to overcome these limitations by (i) improving local memory and (ii) introducing two operational modes of the quantum repeater. In both operational modes, the repeater is run blindly, i.e., without waiting for classical signals to arrive. In the first scheme, entanglement purification protocols based on one-way classical communication are used allowing to communicate over arbitrary distances. However, the error thresholds for noise in local control operations are very stringent. The second scheme makes use of entanglement purification protocols with two-way classical communication and inherits the favorable error thresholds of the repeater run in standard mode. One can increase the possible communication distance by an order of magnitude with reasonable overhead in physical resources. We outline the architecture of a quantum repeater that can possibly ensure intercontinental quantum communication

  20. Visual software system for memory interleaving simulation

    Directory of Open Access Journals (Sweden)

    Milenković Katarina

    2017-01-01

    This paper describes the visual software system for memory interleaving simulation (VSMIS), implemented for the purpose of the course Computer Architecture and Organization 1, at the School of Electrical Engineering, University of Belgrade. The simulator enables students to expand their knowledge through practical work in the laboratory, as well as through independent work at home. VSMIS gives users the possibility to initialize parts of the system and to control simulation steps. The user can monitor the simulation through a graphical representation. It is possible to navigate through the entire hierarchy of the system using simple navigation. During the simulation the user can observe and set the values of memory locations. At any time, the user can reset the simulation of the system and observe it for different memory states; in addition, it is possible to save the current state of the simulation and continue with the execution of the simulation later. [Project of the Serbian Ministry of Education, Science and Technological Development, Grant no. III44009]
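    The abstract does not say which interleaving schemes VSMIS simulates. As a rough illustration of the general idea only, the sketch below maps a linear address to a (bank, offset) pair under low-order and high-order interleaving; the function names and parameters are hypothetical and are not taken from the simulator.

```python
def low_order_interleave(address, num_banks):
    """Low-order interleaving: consecutive addresses fall in consecutive banks."""
    return address % num_banks, address // num_banks


def high_order_interleave(address, bank_size):
    """High-order interleaving: each bank holds one contiguous block of addresses."""
    return address // bank_size, address % bank_size


if __name__ == "__main__":
    # Hypothetical configuration: 4 banks for low-order, 4 words per bank for high-order.
    for addr in range(8):
        print(addr, low_order_interleave(addr, 4), high_order_interleave(addr, 4))
```

    With low-order interleaving a sequential access stream touches every bank in turn, which is what lets the banks overlap their access times.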

  1. Silicon photonic integrated circuits with electrically programmable non-volatile memory functions.

    Science.gov (United States)

    Song, J-F; Lim, A E-J; Luo, X-S; Fang, Q; Li, C; Jia, L X; Tu, X-G; Huang, Y; Zhou, H-F; Liow, T-Y; Lo, G-Q

    2016-09-19

    Conventional silicon photonic integrated circuits do not normally possess memory functions, which require on-chip power in order to maintain circuit states in tuned or field-configured switching routes. In this context, we present an electrically programmable add/drop microring resonator with a wavelength shift of 426 pm between the ON/OFF states. Electrical pulses are used to control the choice of the state. Our experimental results show a wavelength shift of 2.8 pm/ms and a light intensity variation of ~0.12 dB/ms for a fixed wavelength in the OFF state. Theoretically, our device can accommodate up to 65 states of multi-level memory functions. Such memory functions can be integrated into wavelength division multiplexing (WDM) filters and applied to optical routers and computing architectures fulfilling large data downloading demands.

  2. A Survey of Soft-Error Mitigation Techniques for Non-Volatile Memories

    Directory of Open Access Journals (Sweden)

    Sparsh Mittal

    2017-02-01

    Non-volatile memories (NVMs) offer superior density and energy characteristics compared to conventional memories; however, NVMs suffer from severe reliability issues that can easily eclipse their energy efficiency advantages. In this paper, we survey architectural techniques for improving the soft-error reliability of NVMs, specifically PCM (phase change memory) and STT-RAM (spin transfer torque RAM). We focus on soft errors, such as resistance drift and write disturbance in PCM, and read disturbance and write failures in STT-RAM. By classifying the research works based on key parameters, we highlight their similarities and distinctions. We hope that this survey will underline the crucial importance of addressing NVM reliability for ensuring their system integration and will be useful for researchers, computer architects and processor designers.

  3. From Augustine of Hippo's Memory Systems to Our Modern Taxonomy in Cognitive Psychology and Neuroscience of Memory: A 16-Century Nap of Intuition before Light of Evidence.

    Science.gov (United States)

    Cassel, Jean-Christophe; Cassel, Daniel; Manning, Lilianne

    2013-03-01

    Over the last half century, neuropsychologists, cognitive psychologists and cognitive neuroscientists interested in human memory have accumulated evidence showing that there is not one general memory function but a variety of memory systems deserving distinct (but for an organism, complementary) functional entities. The first attempts to organize memory systems within a taxonomic construct are often traced back to the French philosopher Maine de Biran (1766-1824), who, in his book first published in 1803, distinguished mechanical memory, sensitive memory and representative memory, without, however, providing any experimental evidence in support of his view. It turns out, however, that what might be regarded as the first elaborated taxonomic proposal is 14 centuries older and is due to Augustine of Hippo (354-430), also named St Augustine, who, in Book 10 of his Confessions, by means of an introspective process that did not aim at organizing memory systems, nevertheless distinguished and commented on sensible memory, intellectual memory, memory of memories, memory of feelings and passion, and memory of forgetting. These memories were envisaged as different and complementary instances. In the current study, after a short biographical synopsis of St Augustine, we provide an outline of the philosopher's contribution, both in terms of questions and answers, and focus on how this contribution almost perfectly fits with several viewpoints of modern psychology and neuroscience of memory about human memory functions, including the notion that episodic autobiographical memory stores events of our personal history in their what, where and when dimensions, and from there enables our mental time travel. It is not at all meant that St Augustine's elaboration was the basis for the modern taxonomy, but just that the similarity is striking, and that the architecture of our current viewpoints about memory systems might have preexisted as an outstanding intuition in the philosopher

  4. Constructing memory through artistic practices

    International Nuclear Information System (INIS)

    Massart, Cecile

    2015-01-01

    Cecile Massart is a visual artist who lives and works in Brussels, Belgium. Her teaching career includes Academy of Ixelles, Ecole Superieure des Arts Plastiques et Visuels in Mons, and Ecole Nationale Superieure des Arts Visuels La Cambre in Brussels. Cecile Massart has presented her extensive artistic research at numerous international conferences. Her works are featured in private and public collections. Since 1994, Cecile Massart has been investigating international sites for radioactive waste storage, exploring how this 21st-century archaeological stratum is being inscribed in the landscape. Having researched radioactive waste sites around the world for over 20 years, she has made their identification in the landscape her main focus. Her ideas are communicated through her visual research and writings, which aim to raise awareness of radioactive disposal sites and to study their life within their surroundings for future generations. Her drawings, films, books and exhibitions investigate a new kind of architecture of the sites that become research platforms. Her first graphic research, edited under the title Un site archive pour Alpha, Beta, Gamma, helps in revealing their true nature. Her photographs, silk-screen prints, installations and pictures testify to the need to preserve the memory and knowledge of such sites across generations, ensuring the safety of the living world. With this objective in mind, to build a memory, she has developed an architectural vocabulary functioning as warning sculptures to identify the nuclear repositories in the landscape: markers or archi-sculptures. In the following sections, Cecile Massart describes her work in her own words. For more details on her work, see www.cecilemassart.com

  5. Architecture in the Islamic Civilization: Muslim Building or Islamic Architecture

    OpenAIRE

    Yassin, Ayat Ali; Utaberta, Dr. Nangkula

    2012-01-01

    The main problem of the theory in the arena of Islamic architecture is affected by some of its Western thoughts, and stereotyping the Islamic architecture according to Western thoughts; this leads to the breakdown of the foundations in the Islamic architecture. It is a myth that Islamic architecture is subjected to the influence from foreign architectures. This paper will highlight the dialectical concept of Islamic architecture or Muslim buildings and the areas of recognition in Islamic architec...

  6. Biomorphic Multi-Agent Architecture for Persistent Computing

    Science.gov (United States)

    Lodding, Kenneth N.; Brewster, Paul

    2009-01-01

    A multi-agent software/hardware architecture, inspired by the multicellular nature of living organisms, has been proposed as the basis of design of a robust, reliable, persistent computing system. Just as a multicellular organism can adapt to changing environmental conditions and can survive despite the failure of individual cells, a multi-agent computing system, as envisioned, could adapt to changing hardware, software, and environmental conditions. In particular, the computing system could continue to function (perhaps at a reduced but still reasonable level of performance) if one or more component(s) of the system were to fail. One of the defining characteristics of a multicellular organism is unity of purpose. In biology, the purpose is survival of the organism. The purpose of the proposed multi-agent architecture is to provide a persistent computing environment in harsh conditions in which repair is difficult or impossible. A multi-agent, organism-like computing system would be a single entity built from agents or cells. Each agent or cell would be a discrete hardware processing unit that would include a data processor with local memory, an internal clock, and a suite of communication equipment capable of both local line-of-sight communications and global broadcast communications. Some cells, denoted specialist cells, could contain such additional hardware as sensors and emitters. Each cell would be independent in the sense that there would be no global clock, no global (shared) memory, no pre-assigned cell identifiers, no pre-defined network topology, and no centralized brain or control structure. Like each cell in a living organism, each agent or cell of the computing system would contain a full description of the system encoded as genes, but in this case, the genes would be components of a software genome.

  7. Architecture of security management unit for safe hosting of multiple agents

    Science.gov (United States)

    Gilmont, Tanguy; Legat, Jean-Didier; Quisquater, Jean-Jacques

    1999-04-01

    In such growing areas as remote applications in large public networks, electronic commerce, digital signature, intellectual property and copyright protection, and even operating system extensibility, the hardware security level offered by existing processors is insufficient. They lack protection mechanisms that prevent the user from tampering with critical data owned by those applications. Some devices are exceptions, but have neither enough processing power nor enough memory to stand up to such applications (e.g. smart cards). This paper proposes an architecture for a secure processor, in which the classical memory management unit is extended into a new security management unit. It allows ciphered code execution and ciphered data processing. An internal permanent memory can store cipher keys and critical data for several client agents simultaneously. The ordinary supervisor privilege scheme is replaced by a privilege inheritance mechanism that is more suited to operating system extensibility. The result is a secure processor that has hardware support for extensible multitask operating systems, and can be used for both general applications and critical applications needing strong protection. The security management unit and the internal permanent memory can be added to an existing CPU core without loss of performance, and do not require it to be modified.

  8. Hardware architecture design of a fast global motion estimation method

    Science.gov (United States)

    Liang, Chaobing; Sang, Hongshi; Shen, Xubang

    2015-12-01

    VLSI implementation of gradient-based global motion estimation (GME) faces two main challenges: irregular data access and high off-chip memory bandwidth requirement. We previously proposed a fast GME method that reduces computational complexity by choosing a certain number of small patches containing corners and using them in a gradient-based framework. A hardware architecture is designed to implement this method and further reduce the off-chip memory bandwidth requirement. On-chip memories are used to store coordinates of the corners and template patches, while the Gaussian pyramids of both the template and reference frame are stored in off-chip SDRAMs. By performing geometric transform only on the coordinates of the center pixel of a 3-by-3 patch in the template image, a 5-by-5 area containing the warped 3-by-3 patch in the reference image is extracted from the SDRAMs by burst read. Patch-based and burst-mode data access helps to keep the off-chip memory bandwidth requirement at the minimum. Although patch size varies at different pyramid levels, all patches are processed in terms of 3x3 patches, so the utilization of the patch-processing circuit reaches 100%. FPGA implementation results show that the design utilizes 24,080 bits of on-chip memory and, for a sequence with a resolution of 352x288 and a frequency of 60 Hz, the off-chip bandwidth requirement is only 3.96 Mbyte/s, compared with 243.84 Mbyte/s for the original gradient-based GME method. This design can be used in applications like video codec, video stabilization, and super-resolution, where real-time GME is a necessity and minimum memory bandwidth requirement is appreciated.

  9. Block RAM-based architecture for real-time reconfiguration using Xilinx® FPGAs

    Directory of Open Access Journals (Sweden)

    Rikus le Roux

    2015-07-01

    Despite the advantages dynamic reconfiguration adds to a system, it only improves system performance if the execution time exceeds the configuration time. As a result, dynamic reconfiguration is only capable of improving the performance of quasi-static applications. In order to improve the performance of dynamic applications, researchers focus on improving the reconfiguration throughput. These approaches are mostly limited by the bus commonly used to connect the configuration controller to the memory, which contributes to the configuration time. A method proposed to ameliorate this overhead is an architecture utilizing localised block RAM (BRAM) connected to the configuration controller to store the configuration bitstream. The aim of this paper is to illustrate the advantages of the proposed architecture, especially for reconfiguring real-time applications. This is done by validating the throughput of the architecture and comparing this to the maximum theoretical throughput of the internal configuration access port (ICAP). It was found that the proposed architecture is capable of reconfiguring an application within a time-frame suitable for real-time reconfiguration. The drawback of this method is that the BRAM is extremely limited and only a discrete set of configurations can be stored. This paper also proposes a method for mitigating this without affecting the throughput.

  10. A flexible algorithm for calculating pair interactions on SIMD architectures

    Science.gov (United States)

    Páll, Szilárd; Hess, Berk

    2013-12-01

    Calculating interactions or correlations between pairs of particles is typically the most time-consuming task in particle simulation or correlation analysis. Straightforward implementations using a double loop over particle pairs have traditionally worked well, especially since compilers usually do a good job of unrolling the inner loop. In order to reach high performance on modern CPU and accelerator architectures, single-instruction multiple-data (SIMD) parallelization has become essential. Avoiding memory bottlenecks is also increasingly important and requires reducing the ratio of memory to arithmetic operations. Moreover, when pairs only interact within a certain cut-off distance, good SIMD utilization can only be achieved by reordering input and output data, which quickly becomes a limiting factor. Here we present an algorithm for SIMD parallelization based on grouping a fixed number of particles, e.g. 2, 4, or 8, into spatial clusters. Calculating all interactions between particles in a pair of such clusters improves data reuse compared to the traditional scheme and results in a more efficient SIMD parallelization. Adjusting the cluster size allows the algorithm to map to SIMD units of various widths. This flexibility not only enables fast and efficient implementation on current CPUs and accelerator architectures like GPUs or Intel MIC, but it also makes the algorithm future-proof. We present the algorithm with an application to molecular dynamics simulations, where we can also make use of the effective buffering the method introduces.
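    The cluster idea in this abstract can be sketched in a few lines. The version below is only an illustration under stated assumptions: clusters are formed by a crude sort along x (the paper's spatial gridding is not reproduced), the pair potential is a hypothetical 1/r term, and plain Python loops stand in for the SIMD lanes. All function names are invented for the example.

```python
import itertools
import math


def make_clusters(positions, size):
    """Group particle indices into fixed-size clusters.

    Assumption: a crude spatial grouping by sorting on x; the last cluster may be short.
    """
    order = sorted(range(len(positions)), key=lambda i: positions[i][0])
    return [order[k:k + size] for k in range(0, len(order), size)]


def pair_term(p, q, cutoff):
    """Hypothetical pair potential: 1/r inside the cut-off, zero outside."""
    r = math.dist(p, q)
    return 1.0 / r if r < cutoff else 0.0


def cluster_pair_sum(positions, cutoff, size=4):
    """Sum pair terms by evaluating all particle pairs within each pair of clusters."""
    clusters = make_clusters(positions, size)
    x_min = [min(positions[i][0] for i in c) for c in clusters]
    x_max = [max(positions[i][0] for i in c) for c in clusters]
    total = 0.0
    for a in range(len(clusters)):
        for b in range(a, len(clusters)):
            # Clusters are ordered by x, so once the gap exceeds the cut-off
            # no later cluster can interact with cluster a.
            if x_min[b] - x_max[a] > cutoff:
                break
            # All size x size interactions of the cluster pair; the cut-off is
            # applied per pair inside pair_term, like a SIMD mask.
            for i in clusters[a]:
                for j in clusters[b]:
                    if a == b and j <= i:
                        continue  # count each pair inside one cluster only once
                    total += pair_term(positions[i], positions[j], cutoff)
    return total


def brute_force_sum(positions, cutoff):
    """Reference: the traditional double loop over all particle pairs."""
    return sum(pair_term(positions[i], positions[j], cutoff)
               for i, j in itertools.combinations(range(len(positions)), 2))
```

    Each particle loaded for a cluster pair is reused against every particle of the other cluster, which is the data-reuse benefit the abstract describes. Changing `size` corresponds to matching a different SIMD width.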

  11. Memory controllers for real-time embedded systems predictable and composable real-time systems

    CERN Document Server

    Akesson, Benny

    2012-01-01

    Verification of real-time requirements in systems-on-chip becomes more complex as more applications are integrated. Predictable and composable systems can manage the increasing complexity using formal verification and simulation. This book explains the concepts of predictability and composability and shows how to apply them to the design and analysis of a memory controller, which is a key component in any real-time system. This book is generally intended for readers interested in Systems-on-Chips with real-time applications. It is especially well-suited for readers looking to use SDRAM memories in systems with hard or firm real-time requirements. There is a strong focus on real-time concepts, such as predictability and composability, as well as a brief discussion about memory controller architectures for high-performance computing. Readers will learn step-by-step how to go from an unpredictable SDRAM memory, offering highly variable bandwidth and latency, to a predictable and composable shared memory...

  12. A portable approach for PIC on emerging architectures

    Science.gov (United States)

    Decyk, Viktor

    2016-03-01

    A portable approach for designing Particle-in-Cell (PIC) algorithms on emerging exascale computers is based on the recognition that three distinct programming paradigms are needed. They are: low-level vector (SIMD) processing, middle-level shared memory parallel programming, and high-level distributed memory programming. In addition, there is a memory hierarchy associated with each level. Such algorithms can be initially developed using vectorizing compilers, OpenMP, and MPI. This is the approach recommended by Intel for the Phi processor. These algorithms can then be translated and possibly specialized to other programming models and languages, as needed. For example, the vector processing and shared memory programming might be done with CUDA instead of vectorizing compilers and OpenMP, but generally the algorithm itself is not greatly changed. The UCLA PICKSC web site at http://www.idre.ucla.edu/ contains example open source skeleton codes (mini-apps) illustrating each of these three programming models, individually and in combination. Fortran2003 now supports abstract data types, and design patterns can be used to support a variety of implementations within the same code base. Fortran2003 also supports interoperability with C so that implementations in C languages are also easy to use. Finally, main codes can be translated into dynamic environments such as Python, while still taking advantage of high-performing compiled languages. Parallel languages are still evolving with interesting developments in co-Array Fortran, UPC, and OpenACC, among others, and these can also be supported within the same software architecture. Work supported by NSF and DOE Grants.

  13. A Cross-Layer Framework for Designing and Optimizing Deeply-Scaled FinFET-Based Cache Memories

    Directory of Open Access Journals (Sweden)

    Alireza Shafaei

    2015-08-01

    This paper presents a cross-layer framework in order to design and optimize energy-efficient cache memories made of deeply-scaled FinFET devices. The proposed design framework spans device, circuit and architecture levels and considers both super- and near-threshold modes of operation. Initially, at the device-level, seven FinFET devices on a 7-nm process technology are designed in which only one geometry-related parameter (e.g., fin width, gate length, gate underlap) is changed per device. Next, at the circuit-level, standard 6T and 8T SRAM cells made of these 7-nm FinFET devices are characterized and compared in terms of static noise margin, access latency, leakage power consumption, etc. Finally, cache memories with all different combinations of devices and SRAM cells are evaluated at the architecture-level using a modified version of the CACTI tool with FinFET support and other considerations for deeply-scaled technologies. Using this design framework, it is observed that L1 cache memory made of longer channel FinFET devices operating at the near-threshold regime achieves the minimum energy operation point.

  14. Implementation of collisions on GPU architecture in the Vorpal code

    Science.gov (United States)

    Leddy, Jarrod; Averkin, Sergey; Cowan, Ben; Sides, Scott; Werner, Greg; Cary, John

    2017-10-01

    The Vorpal code contains a variety of collision operators allowing for the simulation of plasmas containing multiple charge species interacting with neutrals, background gas, and EM fields. These existing algorithms have been improved and reimplemented to take advantage of the massive parallelization allowed by GPU architecture. The use of GPUs is most effective when algorithms are single-instruction multiple-data, so particle collisions are an ideal candidate for this parallelization technique due to their nature as a series of independent processes with the same underlying operation. This refactoring required data memory reorganization and careful consideration of device/host data allocation to minimize memory access and data communication per operation. Successful implementation has resulted in an order of magnitude increase in simulation speed for a test-case involving multiple binary collisions using the null collision method. Work supported by DARPA under contract W31P4Q-16-C-0009.

  15. From Augustine of Hippo’s Memory Systems to Our Modern Taxonomy in Cognitive Psychology and Neuroscience of Memory: A 16-Century Nap of Intuition before Light of Evidence

    Directory of Open Access Journals (Sweden)

    Jean-Christophe Cassel

    2012-12-01

    Over the last half century, neuropsychologists, cognitive psychologists and cognitive neuroscientists interested in human memory have accumulated evidence showing that there is not one general memory function but a variety of memory systems deserving distinct (but for an organism, complementary) functional entities. The first attempts to organize memory systems within a taxonomic construct are often traced back to the French philosopher Maine de Biran (1766–1824), who, in his book first published in 1803, distinguished mechanical memory, sensitive memory and representative memory, without, however, providing any experimental evidence in support of his view. It turns out, however, that what might be regarded as the first elaborated taxonomic proposal is 14 centuries older and is due to Augustine of Hippo (354–430), also named St Augustine, who, in Book 10 of his Confessions, by means of an introspective process that did not aim at organizing memory systems, nevertheless distinguished and commented on sensible memory, intellectual memory, memory of memories, memory of feelings and passion, and memory of forgetting. These memories were envisaged as different and complementary instances. In the current study, after a short biographical synopsis of St Augustine, we provide an outline of the philosopher’s contribution, both in terms of questions and answers, and focus on how this contribution almost perfectly fits with several viewpoints of modern psychology and neuroscience of memory about human memory functions, including the notion that episodic autobiographical memory stores events of our personal history in their what, where and when dimensions, and from there enables our mental time travel. It is not at all meant that St Augustine’s elaboration was the basis for the modern taxonomy, but just that the similarity is striking, and that the architecture of our current viewpoints about memory systems might have preexisted as an outstanding

  16. Cross-cultural differences in processing of architectural ranking: evidence from an event-related potential study.

    Science.gov (United States)

    Mecklinger, Axel; Kriukova, Olga; Mühlmann, Heiner; Grunwald, Thomas

    2014-01-01

    Visual object identification is modulated by perceptual experience. In a cross-cultural ERP study we investigated whether cultural expertise determines how buildings that vary in their ranking between high and low according to the Western architectural decorum are perceived. Two groups of German and Chinese participants performed an object classification task in which high- and low-ranking Western buildings had to be discriminated from everyday life objects. ERP results indicate that an early stage of visual object identification (i.e., object model selection) is facilitated for high-ranking buildings for the German participants, only. At a later stage of object identification, in which object knowledge is complemented by information from semantic and episodic long-term memory, no ERP evidence for cultural differences was obtained. These results suggest that the identification of architectural ranking is modulated by culturally specific expertise with Western-style architecture already at an early processing stage.

  17. Software architecture analysis tool : software architecture metrics collection

    NARCIS (Netherlands)

    Muskens, J.; Chaudron, M.R.V.; Westgeest, R.

    2002-01-01

    The Software Engineering discipline lacks the ability to evaluate software architectures. Here we describe a tool for software architecture analysis that is based on metrics. Metrics can be used to detect possible problems and bottlenecks in software architectures. Even though metrics do not give a

  18. Unstructured Computational Aerodynamics on Many Integrated Core Architecture

    KAUST Repository

    Al Farhan, Mohammed A.

    2016-06-08

    Shared memory parallelization of the flux kernel of PETSc-FUN3D, an unstructured tetrahedral mesh Euler flow code previously studied for distributed memory and multi-core shared memory, is evaluated on up to 61 cores per node and up to 4 threads per core. We explore several thread-level optimizations to improve flux kernel performance on the state-of-the-art many integrated core (MIC) Intel processor Xeon Phi “Knights Corner,” with a focus on strong thread scaling. While the linear algebraic kernel is bottlenecked by memory bandwidth for even modest numbers of cores sharing a common memory, the flux kernel, which arises in the control volume discretization of the conservation law residuals and in the formation of the preconditioner for the Jacobian by finite-differencing the conservation law residuals, is compute-intensive and is known to exploit effectively contemporary multi-core hardware. We extend study of the performance of the flux kernel to the Xeon Phi in three thread affinity modes, namely scatter, compact, and balanced, in both offload and native mode, with and without various code optimizations to improve alignment and reduce cache coherency penalties. Relative to baseline “out-of-the-box” optimized compilation, code restructuring optimizations provide about 3.8x speedup using the offload mode and about 5x speedup using the native mode. Even with these gains for the flux kernel, with respect to execution time the MIC simply achieves par with optimized compilation on a contemporary multi-core Intel CPU, the 16-core Sandy Bridge E5 2670. Nevertheless, the optimizations employed to reduce the data motion and cache coherency protocol penalties of the MIC are expected to be of value for CFD and many other unstructured applications as many-core architecture evolves. We explore large-scale distributed-shared memory performance on the Cray XC40 supercomputer, to demonstrate that optimizations employed on Phi hybridize to this context, where each of

  19. Unstructured Computational Aerodynamics on Many Integrated Core Architecture

    KAUST Repository

    Al Farhan, Mohammed A.; Kaushik, Dinesh K.; Keyes, David E.

    2016-01-01

    Shared memory parallelization of the flux kernel of PETSc-FUN3D, an unstructured tetrahedral mesh Euler flow code previously studied for distributed memory and multi-core shared memory, is evaluated on up to 61 cores per node and up to 4 threads per core. We explore several thread-level optimizations to improve flux kernel performance on the state-of-the-art many integrated core (MIC) Intel processor Xeon Phi “Knights Corner,” with a focus on strong thread scaling. While the linear algebraic kernel is bottlenecked by memory bandwidth for even modest numbers of cores sharing a common memory, the flux kernel, which arises in the control volume discretization of the conservation law residuals and in the formation of the preconditioner for the Jacobian by finite-differencing the conservation law residuals, is compute-intensive and is known to exploit effectively contemporary multi-core hardware. We extend study of the performance of the flux kernel to the Xeon Phi in three thread affinity modes, namely scatter, compact, and balanced, in both offload and native mode, with and without various code optimizations to improve alignment and reduce cache coherency penalties. Relative to baseline “out-of-the-box” optimized compilation, code restructuring optimizations provide about 3.8x speedup using the offload mode and about 5x speedup using the native mode. Even with these gains for the flux kernel, with respect to execution time the MIC simply achieves par with optimized compilation on a contemporary multi-core Intel CPU, the 16-core Sandy Bridge E5 2670. Nevertheless, the optimizations employed to reduce the data motion and cache coherency protocol penalties of the MIC are expected to be of value for CFD and many other unstructured applications as many-core architecture evolves. We explore large-scale distributed-shared memory performance on the Cray XC40 supercomputer, to demonstrate that optimizations employed on Phi hybridize to this context, where each of

  20. Aristotle: A performance Impact Indicator for the OpenCL Kernels Using Local Memory

    Directory of Open Access Journals (Sweden)

    Jianbin Fang

    2014-01-01

    Due to the increasing complexity of multi/many-core architectures (with their mix of caches and scratch-pad memories) and applications (with different memory access patterns), the performance of many workloads becomes increasingly variable. In this work, we address one of the main causes for this performance variability: the efficiency of the memory system. Specifically, based on an empirical evaluation driven by memory access patterns, we qualify and partially quantify the performance impact of using local memory in multi/many-core processors. To do so, we systematically describe memory access patterns (MAPs) in an application-agnostic manner. Next, for each identified MAP, we use OpenCL (for portability reasons) to generate two microbenchmarks: a “naive” version (without local memory) and an “optimized” version (using local memory). We then evaluate both of them on typically used multi-core and many-core platforms, and we log their performance. What we eventually obtain is a local memory performance database, indexed by various MAPs and platforms. Further, we propose a set of composing rules for multiple MAPs. Thus, we can get an indicator of whether using local memory is beneficial in the presence of multiple memory access patterns. This indication can be used to either avoid the hassle of implementing optimizations with too little gain or, alternatively, give a rough prediction of the performance gain.

  1. Optimization of the Brillouin operator on the KNL architecture

    Science.gov (United States)

    Dürr, Stephan

    2018-03-01

    Experiences with optimizing the matrix-times-vector application of the Brillouin operator on the Intel KNL processor are reported. Without adjustments to the memory layout, performance figures of 360 Gflop/s in single and 270 Gflop/s in double precision are observed. This is with Nc = 3 colors, Nv = 12 right-hand-sides, Nthr = 256 threads, on lattices of size 32^3 × 64, using exclusively OMP pragmas. Interestingly, the same routine performs quite well on Intel Core i7 architectures, too. Some observations on the much harder Wilson fermion matrix-times-vector optimization problem are added.

  2. Libeskind and the Holocaust Metanarrative; from Discourse to Architecture

    Directory of Open Access Journals (Sweden)

    Tsiftsi Xanthi

    2017-11-01

    The Holocaust today resides between memory and postmemory. Initially, children of survivors and their contemporaries inherited a mediated past and bore full responsibility for disseminating their ancestors’ experiences. However, with the prevalence of the Holocaust metanarrative and its absolutist historicism, it was realised that when memory needs to cross generational boundaries, it needs to cross medial as well. The discourse was not enough; there was a need for broadening the narrative beyond the verbal using a powerful medium with the capacity to affect cognition and provoke emotions. This would be architecture, a storyteller by nature. In the 2000s, there was a noticeable boom in innovative Holocaust museums and memorials. Deconstructivist designs and symbolic forms constituted a new language that would meet the demands of local narratives, influence public opinion, and contribute to social change. This paper examines the potential of this transmediation and addresses critical issues (the importance of the experience, the role of empathy and intersubjectivity, the association of emotions with personal and symbolic experiences) and ethical challenges of the transmedia “migration” of a story. To accomplish this, it draws upon Daniel Libeskind, a Polish-born architect who has narrated different aspects of the Holocaust experience through his works.

  3. Modality-specific Alpha Modulations Facilitate Long-term Memory Encoding in the Presence of Distracters

    NARCIS (Netherlands)

    Jiang, H.; Gerven, M.A.J. van; Jensen, O.

    2015-01-01

    It has been proposed that long-term memory encoding is not only dependent on engaging task-relevant regions but also on disengaging task-irrelevant regions. In particular, oscillatory alpha activity has been shown to be involved in shaping the functional architecture of the working brain because it

  4. Modality-specific Alpha Modulations Facilitate Long-term Memory Encoding in the Presence of Distracters

    NARCIS (Netherlands)

    Jiang, H.; Gerven, M.A.J. van; Jensen, O.

    2014-01-01

    It has been proposed that long-term memory encoding is not only dependent on engaging task-relevant regions but also on disengaging task-irrelevant regions. In particular, oscillatory alpha activity has been shown to be involved in shaping the functional architecture of the working brain because it

  5. A flexible analog memory address list manager for PHENIX

    International Nuclear Information System (INIS)

    Ericson, M.N.; Musrock, M.S.; Britton, C.L. Jr.; Walker, J.W.; Wintenberg, A.L.; Young, G.R.; Allen, M.D.

    1996-01-01

    A programmable analog memory address list manager has been developed for use with all analog memory-based detector subsystems of PHENIX. The unit provides simultaneous read/write control, cell write-over protection for both a Level-1 trigger decision delay and digitization latency, and re-ordering of AMU addresses following conversion, at a beam crossing rate of 105 ns. Addresses are handled such that up to 5 Level-1 (LVL-1) events can be maintained in the AMU without write-over. Data tagging is implemented for handling overlapping and shared beam-event data packets. Full usage in all PHENIX analog memory-based detector subsystems is accomplished by the use of detector-specific programmable parameters--the number of data samples per valid LVL-1 trigger and the sample spacing. Architectural candidates for the system are discussed with emphasis on implementation implications. Details of the design are presented including application specifics, timing information, and test results from a full implementation using field programmable gate arrays (FPGAs)

  6. Exploring Shared-Memory Optimizations for an Unstructured Mesh CFD Application on Modern Parallel Systems

    KAUST Repository

    Mudigere, Dheevatsa

    2015-05-01

    In this work, we revisit the 1999 Gordon Bell Prize winning PETSc-FUN3D aerodynamics code, extending it with highly-tuned shared-memory parallelization and detailed performance analysis on modern highly parallel architectures. An unstructured-grid implicit flow solver, which forms the backbone of computational aerodynamics, poses particular challenges due to its large irregular working sets, unstructured memory accesses, and variable/limited amount of parallelism. This code, based on a domain decomposition approach, exposes tradeoffs between the number of threads assigned to each MPI-rank subdomain, and the total number of domains. By applying several algorithm- and architecture-aware optimization techniques for unstructured grids, we show a 6.9X speed-up in performance on a single-node Intel® Xeon™ E5 2690 v2 processor relative to the out-of-the-box compilation. Our scaling studies on the TACC Stampede supercomputer show that our optimizations continue to provide performance benefits over the baseline implementation as we scale up to 256 nodes.

  7. Open architecture design and approach for the Integrated Sensor Architecture (ISA)

    Science.gov (United States)

    Moulton, Christine L.; Krzywicki, Alan T.; Hepp, Jared J.; Harrell, John; Kogut, Michael

    2015-05-01

    Integrated Sensor Architecture (ISA) is designed in response to stovepiped integration approaches. The design, based on the principles of Service Oriented Architectures (SOA) and Open Architectures, addresses the problem of integration, and is not designed for specific sensors or systems. The use of SOA and Open Architecture approaches has led to a flexible, extensible architecture. Using these approaches, and supported with common data formats, open protocol specifications, and Department of Defense Architecture Framework (DoDAF) system architecture documents, an integration-focused architecture has been developed. ISA can help move the Department of Defense (DoD) from costly stovepipe solutions to a more cost-effective plug-and-play design to support interoperability.

  8. Parallel SN algorithms in shared- and distributed-memory environments

    International Nuclear Information System (INIS)

    Haghighat, Alireza; Hunter, Melissa A.; Mattis, Ronald E.

    1995-01-01

    Different 2-D spatial domain partitioning Sn transport theory algorithms have been developed on the basis of the Block-Jacobi iterative scheme. These algorithms have been incorporated into TWOTRAN-II, and tested on a shared-memory CRAY Y-MP C90 and a distributed-memory IBM SP1. For a series of fixed source r-z geometry homogeneous problems, parallel efficiencies in a range of 50-90% are achieved on the C90 with 6 processors, and lower values (20-60%) are obtained on the SP1. It is demonstrated that better performance is attainable if one addresses issues such as convergence rate, load-balancing, and granularity for both architectures, as well as message passing (network bandwidth and latency) for SP1. (author). 17 refs, 4 figs
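
    The parallel efficiencies quoted in this record follow the usual definition: speedup divided by the number of processors. A minimal sketch of that arithmetic is given below; the function name and the timings are hypothetical and are not taken from the paper.

```python
def parallel_efficiency(t_serial, t_parallel, n_procs):
    """Return speedup / n_procs for measured wall-clock times (seconds)."""
    if t_parallel <= 0 or n_procs <= 0:
        raise ValueError("t_parallel and n_procs must be positive")
    speedup = t_serial / t_parallel
    return speedup / n_procs


# Hypothetical timings chosen for illustration only (not from the paper).
serial_seconds = 120.0
parallel_seconds = 25.0
efficiency = parallel_efficiency(serial_seconds, parallel_seconds, n_procs=6)
print(f"parallel efficiency on 6 processors: {efficiency:.0%}")
```

    With these made-up numbers the speedup is 4.8 on 6 processors, which lands inside the 50-90% range reported for the C90.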

  9. Analysis of Architecture Pattern Usage in Legacy System Architecture Documentation

    NARCIS (Netherlands)

    Harrison, Neil B.; Avgeriou, Paris

    2008-01-01

    Architecture patterns are an important tool in architectural design. However, while many architecture patterns have been identified, there is little in-depth understanding of their actual use in software architectures. For instance, there is no overview of how many patterns are used per system or

  10. Fault-tolerant NAND-flash memory module for next-generation scientific instruments

    Science.gov (United States)

    Lange, Tobias; Michel, Holger; Fiethe, Björn; Michalik, Harald; Walter, Dietmar

    2015-10-01

    Remote sensing instruments on today's space missions deliver a high amount of data which is typically evaluated on ground. Especially for deep space missions the telemetry downlink is very limited which creates the need for the scientific evaluation and thereby a reduction of data volume already on-board the spacecraft. A demanding example is the Polarimetric and Helioseismic Imager (PHI) instrument on Solar Orbiter. To enable on-board offline processing for data reduction, the instrument has to be equipped with a high capacity memory module. The module is based on non-volatile NAND-Flash technology, which requires more advanced operation than volatile DRAM. Unlike classical mass memories, the module is integrated into the instrument and allows readback of data for processing. The architecture and safe operation of such kind of memory module is described in the following paper.

  11. PGHPF – An Optimizing High Performance Fortran Compiler for Distributed Memory Machines

    Directory of Open Access Journals (Sweden)

    Zeki Bozkus

    1997-01-01

    High Performance Fortran (HPF) is the first widely supported, efficient, and portable parallel programming language for shared and distributed memory systems. HPF is realized through a set of directive-based extensions to Fortran 90. It enables application developers and Fortran end-users to write compact, portable, and efficient software that will compile and execute on workstations, shared memory servers, clusters, traditional supercomputers, or massively parallel processors. This article describes a production-quality HPF compiler for a set of parallel machines. Compilation techniques such as data and computation distribution, communication generation, run-time support, and optimization issues are elaborated as the basis for an HPF compiler implementation on distributed memory machines. The performance of this compiler on benchmark programs demonstrates that high efficiency can be achieved executing HPF code on parallel architectures.
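
    The data distribution mentioned in this record can be pictured with the two basic mappings that HPF's DISTRIBUTE directive offers, BLOCK and CYCLIC. The sketch below is only an illustration of those mappings in Python with 0-based indices (HPF itself is 1-based); the function names are hypothetical and this is not the PGHPF implementation.

```python
import math


def block_owner(i, n, p):
    """Owner of 0-based index i when n elements are split into p contiguous blocks."""
    block = math.ceil(n / p)  # BLOCK uses a block size of ceil(n / p)
    return i // block


def cyclic_owner(i, p):
    """Owner of 0-based index i when elements are dealt out round-robin to p processors."""
    return i % p


n_elements, n_procs = 10, 4
block_map = [block_owner(i, n_elements, n_procs) for i in range(n_elements)]
cyclic_map = [cyclic_owner(i, n_procs) for i in range(n_elements)]
print(block_map)   # [0, 0, 0, 1, 1, 1, 2, 2, 2, 3]
print(cyclic_map)  # [0, 1, 2, 3, 0, 1, 2, 3, 0, 1]
```

    BLOCK keeps neighbouring elements on the same processor, which tends to suit stencil-like access, while CYCLIC spreads work more evenly when the cost per element varies.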

  12. On the Architectural Engineering Competences in Architectural Design

    DEFF Research Database (Denmark)

    Kirkegaard, Poul Henning

    2007-01-01

    In 1997 a new education in Architecture & Design at Department of Architecture and Design, Aalborg University was started with 50 students. During the recent years this number has increased to approximately 100 new students each year, i.e. approximately 500 students are following the 3 years...... bachelor (BSc) and the 2 years master (MSc) programme. The first 5 semesters are common for all students followed by 5 semesters with specialization into Architectural Design, Urban Design, Industrial Design or Digital Design. The present paper gives a short summary of the architectural engineering...

  13. Using EDUCache Simulator for the Computer Architecture and Organization Course

    Directory of Open Access Journals (Sweden)

    Sasko Ristov

    2013-07-01

    The computer architecture and organization course is essential in all computer science and engineering programs, and the most selected and liked elective course for related engineering disciplines. However, the attractiveness brings a new challenge: it requires a lot of effort by the instructor to explain rather complicated concepts to beginners or to those who study related disciplines. The usage of visual simulators can improve both the teaching and learning processes. The overall goal is twofold: (1) to enable a visual environment to explain the basic concepts and (2) to increase the student's willingness and ability to learn the material. A lot of visual simulators have been used for the computer architecture and organization course. However, due to the lack of visual simulators for simulation of the cache memory concepts, we have developed a new visual simulator, the EDUCache simulator. In this paper we present that it can be effectively and efficiently used as a supporting tool in the learning process of modern multi-layer, multi-cache and multi-core multi-processors. EDUCache's features enable an environment for performance evaluation and engineering of software systems, i.e. the students will also understand the importance of computer architecture building parts and hopefully, will increase their curiosity for hardware courses in general.
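
    The kind of cache behaviour such a teaching simulator visualizes can be sketched in a few lines. The example below models a single direct-mapped cache and counts hits and misses for an address trace; it is a hypothetical illustration, not EDUCache's code, and the cache geometry and trace are assumptions chosen to keep the numbers small.

```python
def simulate_direct_mapped(addresses, n_lines=4, line_bytes=16):
    """Count (hits, misses) for byte addresses in a direct-mapped cache."""
    tags = [None] * n_lines          # one stored tag per cache line
    hits = misses = 0
    for addr in addresses:
        block = addr // line_bytes   # memory block containing this byte
        index = block % n_lines      # the only line this block may occupy
        tag = block // n_lines       # distinguishes blocks sharing a line
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag        # evict whatever was there before
    return hits, misses


# Hypothetical trace: addresses 0 and 64 map to the same line and evict each other.
trace = [0, 4, 16, 64, 0, 20]
hits, misses = simulate_direct_mapped(trace)
print(f"hits={hits} misses={misses}")
```

    In this trace the second access to address 0 misses even though the cache is not full, which is the conflict-miss effect that set-associative designs are meant to reduce.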

  14. Quantum memory

    Science.gov (United States)

    Le Gouët, Jean-Louis; Moiseev, Sergey

    2012-06-01

    Interaction of quantum radiation with multi-particle ensembles has sparked off intense research efforts during the past decade. Emblematic of this field is the quantum memory scheme, where a quantum state of light is mapped onto an ensemble of atoms and then recovered in its original shape. While opening new access to the basics of light-atom interaction, quantum memory also appears as a key element for information processing applications, such as linear optics quantum computation and long-distance quantum communication via quantum repeaters. Not surprisingly, it is far from trivial to practically recover a stored quantum state of light and, although impressive progress has already been accomplished, researchers are still struggling to reach this ambitious objective. This special issue provides an account of the state-of-the-art in a fast-moving research area that makes physicists, engineers and chemists work together at the forefront of their discipline, involving quantum fields and atoms in different media, magnetic resonance techniques and material science. Various strategies have been considered to store and retrieve quantum light. The explored designs belong to three main—while still overlapping—classes. In architectures derived from photon echo, information is mapped over the spectral components of inhomogeneously broadened absorption bands, such as those encountered in rare earth ion doped crystals and atomic gases in external gradient magnetic field. Protocols based on electromagnetic induced transparency also rely on resonant excitation and are ideally suited to the homogeneous absorption lines offered by laser cooled atomic clouds or ion Coulomb crystals. Finally off-resonance approaches are illustrated by Faraday and Raman processes. Coupling with an optical cavity may enhance the storage process, even for negligibly small atom number. Multiple scattering is also proposed as a way to enlarge the quantum interaction distance of light with matter. The

  15. Architectural heritage in post-disaster society: a tool for resilience in Banda Aceh after the 2004 tsunami disaster

    Science.gov (United States)

    Dewi, Cut; Nopera Rauzi, Era

    2018-05-01

    This paper discusses the role of architectural heritage as a tool for resilience in a community after a surpassing disaster. It argues that architectural heritage is not merely a passive victim needing to be rescued; rather, it is also an active agent in providing resilience for survivors. This is evident in the ways it acts as a signifier of collective memories and place identities, and as a place to seek refuge in an emergency and to make central decisions during the reconstruction process. This paper explores several theories related to architectural heritage in a post-disaster context and juxtaposes them in a case study of Banda Aceh after the 2004 Tsunami Disaster. The paper is based on six months of anthropological fieldwork in 2012 in Banda Aceh after the Tsunami Disaster. During the fieldwork, 166 respondents were interviewed to gain extensive insight into the ways architecture might play a role in post-disaster reconstruction.

  16. Solarbus Solar Array Innovative Light Weight Mechanical Architecture with Thin Lateral Panels Deployed with Shape Memory Alloy Regulator

    Science.gov (United States)

    D'Abrigeon, Laurent; Carpine, Anne; Laduree, Gregory

    2005-05-01

    The standard ALCATEL solar array planar concept in flight on the telecom market today is named SOLARBUS. This concept comprises: 3 to 10 identical panels covered with Si Hi-η cell technology; a central mast constituted by 3 to 4 panels and 1 yoke linked together by hinges and synchronized by cables; and from 2 to 6 lateral panels. This concept is able to fit the customer requirements in order to have a competitive "global offer at system level" (power-to-mass ratio 48-50 W/kg). But, for the near future, in line with the market trend, and based on the previous experience, an improvement of the SOLARBUS Solar Array concept in terms of W/kg/€ is essential in order to maintain the competitiveness of the global ALCATEL offer at system level. In order to increase the W/kg performance, Alcatel has developed a new architecture named Lightweight Panel Structure (LPS). The objectives of this new structure are: to decrease the kg/m² ratio; and to be compatible with all promising cell technologies, including Si Hi-η, GaAs, and GaAs + small reflectors. This new architecture is based on the fact that during the 3 major life phases of a Solar Array (launch, deployment, deployed orbital life), the structural needs are more important for the central panels than for the lateral panels. So two different panels have been designed: central panels (named LPS1) and lateral panels (named LPS2). The stowing configuration has been adapted: 2 thin lateral panels LPS2 between 2 structural central panels LPS1, and local bumpers to transfer the loads from LPS2 to LPS1. Also, among the most stringent loads applied to the panels are the deployment loads. In order to limit the mass of reinforcement of the panels, a deployment speed regulator shall be used. In the frame of the new generation of solar arrays, Alcatel has developed a new actuator based on a shape memory alloy torsional rod. This lightweight component is directly connected to heater lines and is able to provide great actuation torque

  17. Connecting Architecture and Implementation

    Science.gov (United States)

    Buchgeher, Georg; Weinreich, Rainer

    Software architectures are still typically defined and described independently from implementation. To avoid architectural erosion and drift, architectural representation needs to be continuously updated and synchronized with system implementation. Existing approaches for architecture representation like informal architecture documentation, UML diagrams, and Architecture Description Languages (ADLs) provide only limited support for connecting architecture descriptions and implementations. Architecture management tools like Lattix, SonarJ, and Sotoarc and UML-tools tackle this problem by extracting architecture information directly from code. This approach works for low-level architectural abstractions like classes and interfaces in object-oriented systems but fails to support architectural abstractions not found in programming languages. In this paper we present an approach for linking and continuously synchronizing a formalized architecture representation to an implementation. The approach is a synthesis of functionality provided by code-centric architecture management and UML tools and higher-level architecture analysis approaches like ADLs.

  18. mAgic-FPU and MADE: A customizable VLIW core and the modular VLIW processor architecture description environment

    Science.gov (United States)

    Paolucci, Pier S.; Kajfasz, Philippe; Bonnot, Philippe; Candaele, Bernard; Maufroid, Daniel; Pastorelli, Elena; Ricciardi, Andrea; Fusella, Yves; Guarino, Eugenio

    2001-09-01

    mAgic-FPU is the architecture of a family of VLIW cores for configurable system-level integration of floating- and fixed-point computing power. mAgic customization permits the designer to tune basic parameters, such as the computing power/memory access ratio of the core processor, the number of available arithmetic operations per cycle, the register file size and number of ports, as well as the number of arithmetic operators. The reconfiguration (e.g., of register file size and number of ports, as well as of the number of arithmetic operators) is supported by the software environment MADE (Modular VLIW processor Architecture and Assembler Description Environment). MADE reads an architecture description file and produces a customized assembler-scheduler for the target VLIW architecture, configuring a general-purpose VLIW optimizer-scheduler engine. The mAgic-FPU core architecture satisfies the requirement of portability among silicon foundries. The first members of the mAgic-FPU core family architecture fit the requirements of 'Smart Antenna for Adaptive Beam-Forming processing' and 'Physical Sound Synthesis'. The first 1 GigaFlops mAgic core will run at 100 MHz within an area of 40 mm² in 0.25 μm ATMEL CMOS technology in the first half of 2002.

  19. An Artificial Flexible Visual Memory System Based on an UV-Motivated Memristor.

    Science.gov (United States)

    Chen, Shuai; Lou, Zheng; Chen, Di; Shen, Guozhen

    2018-02-01

    For the mimicry of human visual memory, a prominent challenge is how to detect and store image information with electronic devices, which demands a multifunctional integration to sense light like the eyes and to memorize image information like the brain by transforming optical signals into electrical signals that can be recognized by electronic devices. Although current image sensors can perceive simple images in real time, the image information fades away when the external image stimuli are removed. The gap between state-of-the-art image sensors and a visual memory system inspires the logical integration of image sensors and memory devices to realize the sensing and memory process for light information in the bionic design of human visual memory. Hence, a facile architecture is designed to construct an artificial flexible visual memory system by employing a UV-motivated memristor. The visual memory arrays can realize the detection and memory process of a UV light distribution with a patterned image with long-term retention, and the stored image information can be reset by a negative voltage sweep and reprogrammed to the same or another image distribution, which proves effective reusability. These results provide new opportunities for the mimicry of human visual memory and enable the flexible visual memory device to be applied in future wearable electronics, electronic eyes, multifunctional robotics, and auxiliary equipment for the visually handicapped. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  20. Memory-Relevant Mushroom Body Output Synapses Are Cholinergic.

    Science.gov (United States)

    Barnstedt, Oliver; Owald, David; Felsenberg, Johannes; Brain, Ruth; Moszynski, John-Paul; Talbot, Clifford B; Perrat, Paola N; Waddell, Scott

    2016-03-16

    Memories are stored in the fan-out fan-in neural architectures of the mammalian cerebellum and hippocampus and the insect mushroom bodies. However, whereas key plasticity occurs at glutamatergic synapses in mammals, the neurochemistry of the memory-storing mushroom body Kenyon cell output synapses is unknown. Here we demonstrate a role for acetylcholine (ACh) in Drosophila. Kenyon cells express the ACh-processing proteins ChAT and VAChT, and reducing their expression impairs learned olfactory-driven behavior. Local ACh application, or direct Kenyon cell activation, evokes activity in mushroom body output neurons (MBONs). MBON activation depends on VAChT expression in Kenyon cells and is blocked by ACh receptor antagonism. Furthermore, reducing nicotinic ACh receptor subunit expression in MBONs compromises odor-evoked activation and redirects odor-driven behavior. Lastly, peptidergic corelease enhances ACh-evoked responses in MBONs, suggesting an interaction between the fast- and slow-acting transmitters. Therefore, olfactory memories in Drosophila are likely stored as plasticity of cholinergic synapses. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  1. Centaure: an heterogeneous parallel architecture for computer vision

    International Nuclear Information System (INIS)

    Peythieux, Marc

    1997-01-01

    This dissertation deals with the architecture of parallel computers dedicated to computer vision. In the first chapter, the problem to be solved is presented, as well as the architecture of the Sympati and Symphonie computers, on which this work is based. The second chapter is about the state of the art of computers and integrated processors that can execute computer vision and image processing codes. The third chapter contains a description of the architecture of Centaure. It has a heterogeneous structure: it is composed of a multiprocessor system based on the Analog Devices ADSP21060 Sharc digital signal processor, and of a set of Symphonie computers working in a multi-SIMD fashion. Centaure also has a modular structure. Its basic node is composed of one Symphonie computer, tightly coupled to a Sharc by means of a dual-ported memory. The nodes of Centaure are linked together by the Sharc communication links. The last chapter deals with a performance validation of Centaure. The execution times on Symphonie and on Centaure of a benchmark which is typical of industrial vision are presented and compared. First, these results show that the basic node of Centaure allows a faster execution than Symphonie, and that increasing the size of the tested computer leads to a better speed-up with Centaure than with Symphonie. Second, these results validate the choice of running the low-level structure of Centaure in a multi-SIMD fashion. (author)

  2. Comparative Evaluation and Case Studies of Shared-Memory and Data-Parallel Execution Patterns

    Directory of Open Access Journals (Sweden)

    Xiaodong Zhang

    1999-01-01

    Shared-memory and data-parallel programming models are two important paradigms for scientific applications. Both models provide high-level program abstractions, and simple and uniform views of network structures. The common features of the two models significantly simplify program coding and debugging for scientific applications. However, the underlying execution and overhead patterns are significantly different between the two models due to their programming constraints, and due to the different and complex structures of the interconnection networks and systems which support the two models. We performed this experimental study to present implications and comparisons of execution patterns on two commercial architectures. We implemented a standard electromagnetic simulation program (EM) and a linear system solver using the shared-memory model on the KSR-1 and the data-parallel model on the CM-5. Our objectives are to examine the execution pattern changes required for an implementation transformation between the two models; to study memory access patterns; to address scalability issues; and to investigate relative costs and advantages/disadvantages of using the two models for scientific computations. Our results indicate that the EM program tends to become computation-intensive in the KSR-1 shared-memory system, and memory-demanding in the CM-5 data-parallel system when the systems and the problems are scaled. The EM program, a highly data-parallel program, performed extremely well, and the linear system solver, a highly control-structured program, suffered significantly in the data-parallel model on the CM-5. Our study provides further evidence that matching execution patterns of algorithms to parallel architectures would achieve better performance.

  3. Computational performance of a smoothed particle hydrodynamics simulation for shared-memory parallel computing

    Science.gov (United States)

    Nishiura, Daisuke; Furuichi, Mikito; Sakaguchi, Hide

    2015-09-01

    The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code.

  4. How organisation of architecture documentation affects architectural knowledge retrieval

    NARCIS (Netherlands)

    de Graaf, K.A.; Liang, P.; Tang, A.; Vliet, J.C.

    A common approach to software architecture documentation in industry projects is the use of file-based documents. This approach offers a single-dimensional arrangement of the architectural knowledge. Knowledge retrieval from file-based architecture documentation is efficient if the organisation of

  5. Architectures drawn / digital models: the Venices (impossible on line

    Directory of Open Access Journals (Sweden)

    Malvina Borgherini

    2011-12-01

    A contemporary city representation talks not only about architecture and landscape but also about the effects that political institutions, cultural traditions and economic enterprises have on the urban community. The methods traditionally used to present the complexity and, at the same time, the personality of a town, or on a smaller scale of one of its monuments, are not suited to contemporary reality. A very famous panel presented at the Venice Biennale in 1976, Aldo Rossi's «Città Analoga», with its real and ideal architectures, ancient monuments and contemporary landscapes, individual and collective memories, human presences and empty spaces, can be taken as an example for the preparation of a new 'story' or a new 'map' for a city like Venice. The space of a digital model can become a place for discussion and analysis, a place where historical records and never-realized projects can be seen together, and where subjective and objective visions, which overlap daily and occasional tracks, can be placed.

  6. A shared, flexible neural map architecture reflects capacity limits in both visual short-term memory and enumeration.

    Science.gov (United States)

    Knops, André; Piazza, Manuela; Sengupta, Rakesh; Eger, Evelyn; Melcher, David

    2014-07-23

    Human cognition is characterized by severe capacity limits: we can accurately track, enumerate, or hold in mind only a small number of items at a time. It remains debated whether capacity limitations across tasks are determined by a common system. Here we measure brain activation of adult subjects performing either a visual short-term memory (vSTM) task consisting of holding in mind precise information about the orientation and position of a variable number of items, or an enumeration task consisting of assessing the number of items in those sets. We show that task-specific capacity limits (three to four items in enumeration and two to three in vSTM) are neurally reflected in the activity of the posterior parietal cortex (PPC): an identical set of voxels in this region, commonly activated during the two tasks, changed its overall response profile reflecting task-specific capacity limitations. These results, replicated in a second experiment, were further supported by multivariate pattern analysis in which we could decode the number of items presented over a larger range during enumeration than during vSTM. Finally, we simulated our results with a computational model of PPC using a saliency map architecture in which the level of mutual inhibition between nodes gives rise to capacity limitations and reflects the task-dependent precision with which objects need to be encoded (high precision for vSTM, lower precision for enumeration). Together, our work supports the existence of a common, flexible system underlying capacity limits across tasks in PPC that may take the form of a saliency map. Copyright © 2014 the authors.

  7. Bus Arbitration for FDUMA Shared Memory Architecture

    OpenAIRE

    森垣,利彦; 弘中,哲夫; 児島,彰; 藤野,清次

    1997-01-01

    In recent years, research has sought to widen memory bandwidth by integrating a processor and DRAM on a single LSI. However, outside vector-style processing this approach cannot make effective use of the available memory bandwidth, and it is difficult to use as shared memory for, e.g., an on-chip multiprocessor. We have therefore proposed the FDUMA multiport memory system as a memory architecture that solves this problem. This paper describes the bus arbitration used in the prototype FDUMA memory system currently under development, and then evaluates the characteristics of the FDUMA memory system using a software simulator.

  8. Autotuning of Adaptive Mesh Refinement PDE Solvers on Shared Memory Architectures

    KAUST Repository

    Nogina, Svetlana

    2012-01-01

    Many multithreaded, grid-based, dynamically adaptive solvers for partial differential equations permanently have to traverse subgrids (patches) of different and changing sizes. The parallel efficiency of this traversal depends on the interplay of the patch size, the architecture used, the operations triggered throughout the traversal, and the grain size, i.e. the size of the subtasks the patch is broken into. We propose an oracle mechanism delivering grain sizes on-the-fly. It takes historical runtime measurements for different patch and grain sizes as well as the traversal's operations into account, and it yields reasonable speedups. Neither magic configuration settings nor an expensive pre-tuning phase are necessary. It is an autotuning approach. © 2012 Springer-Verlag.
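
    The oracle idea in this abstract can be illustrated with a minimal sketch. This is not the authors' mechanism: the class name, the candidate grain sizes, and the "try each candidate once, then pick the lowest mean runtime" policy are all assumptions made for illustration, and the sketch ignores the traversal operations that the abstract says are also taken into account.

```python
from collections import defaultdict


class GrainSizeOracle:
    """Hypothetical oracle: picks a grain size from past runtime measurements."""

    def __init__(self, candidate_grains):
        self.candidates = list(candidate_grains)
        # (patch_size, grain_size) -> [total_runtime, sample_count]
        self.history = defaultdict(lambda: [0.0, 0])

    def record(self, patch_size, grain_size, runtime):
        """Store one measured traversal runtime for a patch/grain combination."""
        entry = self.history[(patch_size, grain_size)]
        entry[0] += runtime
        entry[1] += 1

    def suggest(self, patch_size):
        """Return a grain size to use for the next traversal of this patch size."""
        # Try every candidate at least once before trusting the averages.
        for grain in self.candidates:
            if self.history[(patch_size, grain)][1] == 0:
                return grain

        # Otherwise return the candidate with the lowest mean runtime so far.
        def mean_runtime(grain):
            total, count = self.history[(patch_size, grain)]
            return total / count

        return min(self.candidates, key=mean_runtime)
```

    In use, a solver would call `suggest(patch_size)` before each traversal and `record(patch_size, grain_size, runtime)` afterwards, so the choice adapts during the run without a separate pre-tuning phase.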

  9. DESTINY: A Comprehensive Tool with 3D and Multi-Level Cell Memory Modeling Capability

    Directory of Open Access Journals (Sweden)

    Sparsh Mittal

    2017-09-01

    To enable the design of large capacity memory structures, novel memory technologies such as non-volatile memory (NVM) and novel fabrication approaches, e.g., 3D stacking and multi-level cell (MLC) design, have been explored. The existing modeling tools, however, cover only a few memory technologies, technology nodes and fabrication approaches. We present DESTINY, a tool for modeling 2D/3D memories designed using SRAM, resistive RAM (ReRAM), spin transfer torque RAM (STT-RAM), phase change RAM (PCM) and embedded DRAM (eDRAM), and 2D memories designed using spin orbit torque RAM (SOT-RAM), domain wall memory (DWM) and Flash memory. In addition to single-level cell (SLC) designs for all of these memories, DESTINY also supports modeling MLC designs for NVMs. We have extensively validated DESTINY against commercial and research prototypes of these memories. DESTINY is very useful for performing design-space exploration across several dimensions, such as optimizing for a target (e.g., latency, area or energy-delay product) for a given memory technology, choosing the suitable memory technology or fabrication method (i.e., 2D v/s 3D) for a given optimization target, etc. We believe that DESTINY will boost studies of next-generation memory architectures used in systems ranging from mobile devices to extreme-scale supercomputers. The latest source-code of DESTINY is available from the following git repository: https://bitbucket.org/sparshmittal/destinyv2.

  10. Capital Architecture: Situating symbolism parallel to architectural methods and technology

    Science.gov (United States)

    Daoud, Bassam

    Capital Architecture is a symbol of a nation's global presence and the cultural and social focal point of its inhabitants. Since the advent of High-Modernism in Western cities, and subsequently decolonised capitals, civic architecture no longer seems to be strictly grounded in the philosophy that national buildings shape the legacy of government and the way a nation is regarded through its built environment. Amidst an exceedingly globalized architectural practice and with the growing concern of key heritage foundations over the shortcomings of international modernism in representing its immediate socio-cultural context, the contextualization of public architecture within its sociological, cultural and economic framework in capital cities became the key denominator of this thesis. Civic architecture in capital cities is essential to confront the challenges of symbolizing a nation and demonstrating the legitimacy of the government. In today's dominantly secular Western societies, governmental architecture, especially where the seat of political power lies, is the ultimate form of architectural expression in conveying a sense of identity and underlining a nation's status. Departing from these convictions, this thesis investigates the embodied symbolic power, the representative capacity, and the inherent permanence in contemporary architecture, and in its modes of production. Through a vast study on Modern architectural ideals and heritage -- in parallel to methodologies -- the thesis stimulates the future of large scale governmental building practices and aims to identify and index the key constituents that may respond to the lack of representation in civic architecture in capital cities.

  11. Software architecture evolution

    DEFF Research Database (Denmark)

    Barais, Olivier; Le Meur, Anne-Francoise; Duchien, Laurence

    2008-01-01

    Software architectures must frequently evolve to cope with changing requirements, and this evolution often implies integrating new concerns. Unfortunately, when the new concerns are crosscutting, existing architecture description languages provide little or no support for this kind of evolution... The software architect must modify multiple elements of the architecture manually, which risks introducing inconsistencies. This chapter provides an overview, comparison and detailed treatment of the various state-of-the-art approaches to describing and evolving software architectures. Furthermore, we discuss... one particular framework named TranSAT, which addresses the above problems of software architecture evolution. TranSAT provides a new element in the software architecture description language, called an architectural aspect, for describing new concerns and their integration into an existing...

  12. A generic library for large scale solution of PDEs on modern heterogeneous architectures

    DEFF Research Database (Denmark)

    Glimberg, Stefan Lemvig; Engsig-Karup, Allan Peter

    2012-01-01

    Adapting to new programming models for modern multi- and many-core architectures requires code-rewriting and changing algorithms and data structures, in order to achieve good efficiency and scalability. We present a generic library for solving large scale partial differential equations (PDEs...), capable of utilizing heterogeneous CPU/GPU environments. The library can be used for fast prototyping of PDE solvers, based on finite difference approximations of spatial derivatives in one, two, or three dimensions. In order to efficiently solve large scale problems, we keep memory consumption... and memory access low, using a low-storage implementation of flexible-order finite difference operators. We will illustrate the use of library components by assembling such matrix-free operators to be used with one of the supported iterative solvers, such as GMRES, CG, Multigrid or Defect Correction...
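
    To make "matrix-free operator used with an iterative solver" concrete, here is a small sketch under stated assumptions. It does not use or imitate the library's API: the function names are hypothetical, the operator is a fixed second-order 1D stencil for -u'' with zero Dirichlet boundaries (not a flexible-order operator), and the solver is a textbook conjugate gradient (CG) written in plain Python for a CPU.

```python
def apply_neg_laplacian(u, h):
    """Matrix-free second-order stencil for -u'' with zero Dirichlet boundaries."""
    n = len(u)
    out = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out[i] = (2.0 * u[i] - left - right) / (h * h)
    return out


def dot(a, b):
    """Euclidean inner product of two equally long lists."""
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(apply_op, b, tol=1e-10, max_iter=1000):
    """Textbook CG; the operator is only ever used through apply_op(vector)."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the initial guess x = 0
    p = list(r)
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 < tol:
            break
        ap = apply_op(p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x
```

    The point of the design is that no matrix is ever stored: memory use stays proportional to the number of grid points, because the solver only needs the action of the operator on a vector.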

  13. Overcoming the drawback of lower sense margin in tunnel FET based dynamic memory along with enhanced charge retention and scalability

    Science.gov (United States)

    Navlakha, Nupur; Kranti, Abhinav

    2017-11-01

    The work reports on the use of a planar tri-gate tunnel field effect transistor (TFET) to operate as dynamic memory at 85 °C with an enhanced sense margin (SM). Two symmetric gates (G1), aligned to the source at a partial region of the intrinsic film, result in better electrostatic control that regulates the read mechanism based on band-to-band tunneling, while the other gate (G2), positioned adjacent to the first front gate, is responsible for charge storage and sustenance. The proposed architecture results in an enhanced SM of ~1.2 μA μm⁻¹ along with a longer retention time (RT) of ~1.8 s at 85 °C, for a total length of 600 nm. The double gate architecture towards the source increases the tunneling current and also reduces short channel effects, enhancing SM and scalability, thereby overcoming the critical bottleneck faced by TFET based dynamic memories. The work also discusses the impact of overlap/underlap and interface charges on the performance of TFET based dynamic memory. Insights into device operation demonstrate that the choice of appropriate architecture and biases not only limits the trade-off between SM and RT, but also results in improved scalability, with drain voltage and total length being scaled down to 0.8 V and 115 nm, respectively.

  14. Architectural design decisions

    NARCIS (Netherlands)

    Jansen, Antonius Gradus Johannes

    2008-01-01

    A software architecture can be considered as the collection of key decisions concerning the design of the software of a system. Knowledge about this design, i.e. architectural knowledge, is key for understanding a software architecture and thus the software itself. Architectural knowledge is mostly

  15. A programmable associative memory for track finding

    International Nuclear Information System (INIS)

    Bardi, A.; Belforte, S.; Donati, S.; Galeotti, S.; Giannetti, P.; Morsani, F.; Passuello, D.; Spinella, F.; Cerri, A.; Punzi, G.; Ristori, L.; Dell'Orso, M.; Meschi, E.; Leger, A.; Speer, T.; Wu, X.

    1998-01-01

    We present a device, based on the concept of associative memory for pattern recognition, dedicated to on-line track finding in high-energy physics experiments. A large pattern bank, describing all possible tracks, can be organized into field programmable gate arrays where all patterns are compared in parallel to data coming from the detector during readout. Patterns, recognized among 2^66 possible combinations, are output in a few 30 MHz clock cycles. Programmability results in a flexible, simple architecture and makes it possible to keep up smoothly with technology improvements. (orig.)
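
    The matching scheme described here can be mimicked in software with a toy model. Everything in it is an assumption for illustration: the class name, the representation of a pattern as one coarse hit address per detector layer, and the per-layer match flags. The real device performs the comparisons in parallel in hardware; the loop below only emulates that behaviour sequentially.

```python
class AssociativeMemory:
    """Toy model: every stored pattern 'sees' each hit as it is read out."""

    def __init__(self, patterns):
        # Each pattern is a tuple with one coarse hit address per detector layer.
        # Assumes a non-empty bank whose patterns all have the same length.
        self.patterns = [tuple(p) for p in patterns]
        self.n_layers = len(self.patterns[0])
        self.reset()

    def reset(self):
        """Clear all match flags at the start of an event."""
        self.flags = [[False] * self.n_layers for _ in self.patterns]

    def feed(self, layer, address):
        """Present one detector hit to the whole pattern bank."""
        # In hardware all comparisons happen at once; the loop only emulates that.
        for flags, pattern in zip(self.flags, self.patterns):
            if pattern[layer] == address:
                flags[layer] = True

    def matched(self):
        """Return the indices of patterns whose layers have all been hit."""
        return [i for i, flags in enumerate(self.flags) if all(flags)]
```

    Because each hit is compared against the whole bank as it arrives, the list of candidate tracks is available as soon as readout ends, which is what makes the approach attractive for on-line use.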

  16. A flexible analog memory address list manager/controller for PHENIX

    International Nuclear Information System (INIS)

    Ericson, M.N.; Walker, J.W.; Britton, C.L.; Wintenberg, A.L.; Young, G.R.

    1995-01-01

    A programmable analog memory address list manager/controller has been developed for use with all analog memory-based detector subsystems of PHENIX. The unit provides simultaneous read/write control, cell write-over protection for both a Level-1 trigger decision delay and digitization latency, and re-ordering of AMU addresses following conversion, at a beam crossing rate of 112 ns. Addresses are handled such that up to 5 Level-1 events can be maintained in the AMU without write-over. Data tagging is implemented for handling overlapping and shared beam event data packets. Full usage in all PHENIX analog memory-based detector sub-systems is accomplished by the use of detector-specific programmable parameters -- the number of data samples per Level-1 trigger valid and the sample spacing. Architectural candidates for the system are discussed with emphasis on implementation implications. Details of the design are presented including design simulations, timing information, and test results from a full implementation using programmable logic devices.

  17. Evaluation of a Connectionless NoC for a Real-Time Distributed Shared Memory Many-Core System

    NARCIS (Netherlands)

    Rutgers, J.H.; Bekooij, Marco Jan Gerrit; Smit, Gerardus Johannes Maria

    2012-01-01

    Real-time embedded systems like smartphones tend to comprise an ever increasing number of processing cores. For scalability and the need for guaranteed performance, the use of a connection-oriented network-on-chip (NoC) is advocated. Furthermore, a distributed shared memory architecture is preferred

  18. Architecture Governance: The Importance of Architecture Governance for Achieving Operationally Responsive Ground Systems

    Science.gov (United States)

    Kolar, Mike; Estefan, Jeff; Giovannoni, Brian; Barkley, Erik

    2011-01-01

    Topics covered (1) Why Governance and Why Now? (2) Characteristics of Architecture Governance (3) Strategic Elements (3a) Architectural Principles (3b) Architecture Board (3c) Architecture Compliance (4) Architecture Governance Infusion Process. Governance is concerned with decision making (i.e., setting directions, establishing standards and principles, and prioritizing investments). Architecture governance is the practice and orientation by which enterprise architectures and other architectures are managed and controlled at an enterprise-wide level

  19. Minimalism in architecture: Architecture as a language of its identity

    Directory of Open Access Journals (Sweden)

    Vasilski Dragana

    2012-01-01

    Every architectural work is created on a principle that includes meaning, and the work is then read as an artifact of that particular meaning. The resources by which meaning is primarily built, which are susceptible to transformation, as well as the routing of understanding (decoding the messages carried by a work of architecture), are the subject of semiotics and communication theories, which have played a significant role for architecture and the architect. Minimalism in architecture, as a paradigm of XXI century architecture, means searching for the essence located in the irreducible minimum. The inspired use of architectural units (archetypal elements), through the phantasm of simplicity, assumes the primary responsibility for providing the object's identity, because it participates in the formation of the language and therefore in its reading. Volume is formed by a clean language that builds the expression of fluid areas liberated of recharge needs. A reduced architectural language is appropriate to an age marked by electronic communications.

  20. A State-Based Modeling Approach for Efficient Performance Evaluation of Embedded System Architectures at Transaction Level

    Directory of Open Access Journals (Sweden)

    Anthony Barreteau

    2012-01-01

    Abstract models are necessary to assist system architects in the evaluation process of hardware/software architectures and to cope with the still increasing complexity of embedded systems. Efficient methods are required to create reliable models of system architectures and to allow early performance evaluation and fast exploration of the design space. In this paper, we present a specific transaction level modeling approach for performance evaluation of hardware/software architectures. This approach relies on a generic execution model that requires little modeling effort. The created models are used to evaluate, by simulation, the expected processing and memory resources for various architectures. The proposed execution model relies on a specific computation method defined to improve the simulation speed of transaction level models. The benefits of the proposed approach are highlighted through two case studies. The first case study is a didactic example illustrating the modeling approach. In this example, a simulation speed-up by a factor of 7.62 is achieved by using the proposed computation method. The second case study concerns the analysis of a communication receiver supporting part of the physical layer of the LTE protocol. In this case study, architecture exploration is carried out in order to improve the allocation of processing functions.

  1. Simulating Hydrologic Flow and Reactive Transport with PFLOTRAN and PETSc on Emerging Fine-Grained Parallel Computer Architectures

    Science.gov (United States)

    Mills, R. T.; Rupp, K.; Smith, B. F.; Brown, J.; Knepley, M.; Zhang, H.; Adams, M.; Hammond, G. E.

    2017-12-01

    As the high-performance computing community pushes towards the exascale horizon, power and heat considerations have driven the increasing importance and prevalence of fine-grained parallelism in new computer architectures. High-performance computing centers have become increasingly reliant on GPGPU accelerators and "manycore" processors such as the Intel Xeon Phi line, and 512-bit SIMD registers have even been introduced in the latest generation of Intel's mainstream Xeon server processors. The high degree of fine-grained parallelism and more complicated memory hierarchy considerations of such "manycore" processors present several challenges to existing scientific software. Here, we consider how the massively parallel, open-source hydrologic flow and reactive transport code PFLOTRAN - and the underlying Portable, Extensible Toolkit for Scientific Computation (PETSc) library on which it is built - can best take advantage of such architectures. We will discuss some key features of these novel architectures and our code optimizations and algorithmic developments targeted at them, and present experiences drawn from working with a wide range of PFLOTRAN benchmark problems on these architectures.

  2. Architectural Narratives

    DEFF Research Database (Denmark)

    Kiib, Hans

    2010-01-01

    a functional framework for these concepts, but tries increasingly to endow the main idea of the cultural project with a spatially aesthetic expression - a shift towards “experience architecture.” A great number of these projects typically recycle and reinterpret narratives related to historical buildings......In this essay, I focus on the combination of programs and the architecture of cultural projects that have emerged within the last few years. These projects are characterized as “hybrid cultural projects,” because they intend to combine experience with entertainment, play, and learning. This essay...... and architectural heritage; another group tries to embed new performative technologies in expressive architectural representation. Finally, this essay provides a theoretical framework for the analysis of the political rationales of these projects and for the architectural representation bridges the gap between...

  3. Three-Dimensional Cellular Structures Enhanced By Shape Memory Alloys

    Science.gov (United States)

    Nathal, Michael V.; Krause, David L.; Wilmoth, Nathan G.; Bednarcyk, Brett A.; Baker, Eric H.

    2014-01-01

    This research effort explored lightweight structural concepts married with advanced smart materials to achieve a wide variety of benefits in airframe and engine components. Lattice block structures were cast from an aerospace structural titanium alloy Ti-6Al-4V and a NiTi shape memory alloy (SMA), and preliminary properties have been measured. A finite element-based modeling approach that can rapidly and accurately capture the deformation response of lattice architectures was developed. The Ti-6-4 and SMA material behavior was calibrated via experimental tests of ligaments machined from the lattice. Benchmark testing of complete lattice structures verified the main aspects of the model as well as demonstrated the advantages of the lattice structure. Shape memory behavior of a sample machined from a lattice block was also demonstrated.

  4. Room-Temperature Single-photon level Memory for Polarization States

    Science.gov (United States)

    Kupchak, Connor; Mittiga, Thomas; Jordaan, Bertus; Namazi, Mehdi; Nölleke, Christian; Figueroa, Eden

    2015-01-01

    An optical quantum memory is a stationary device that is capable of storing and recreating photonic qubits with a higher fidelity than any classical device. Thus far, these two requirements have been fulfilled for polarization qubits in systems based on cold atoms and cryogenically cooled crystals. Here, we report a room-temperature memory capable of storing arbitrary polarization qubits with a signal-to-background ratio higher than 1 and an average fidelity surpassing the classical benchmark for weak laser pulses containing 1.6 photons on average, without taking into account non-unitary operation. Our results demonstrate that a common vapor cell can reach the low background noise levels necessary for polarization qubit storage using single-photon level light, and propels atomic-vapor systems towards a level of functionality akin to other quantum information processing architectures.

  5. Insights into operation of planar tri-gate tunnel field effect transistor for dynamic memory application

    Science.gov (United States)

    Navlakha, Nupur; Kranti, Abhinav

    2017-07-01

    Insights into device physics and operation through the control of energy barriers are presented for a planar tri-gate Tunnel Field Effect Transistor (TFET) based dynamic memory. The architecture consists of a double gate (G1) at the source side and a single gate (G2) at the drain end of the silicon film. Dual gates (G1) effectively enhance the tunneling based read mechanism through the enhanced coupling and improved electrostatic control over the channel. The single gate (G2) controls the holes in the potential barrier induced through the proper selection of bias and workfunction. The results indicate that the planar tri-gate achieves optimum performance evaluated in terms of two composite metrics (M1 and M2), namely, product of (i) Sense Margin (SM) and Retention Time (RT) i.e., M1 = SM × RT and (ii) Sense Margin and Current Ratio (CR) i.e., M2 = SM × CR. The regulation of barriers created by the gates (G1 and G2) through the optimal use of device parameters leads to better performance metrics, with significant improvement at scaled lengths as compared to other tunneling based dynamic memory architectures. The investigation shows that lengths of G1, G2 and lateral spacing can be scaled down to 25 nm, 50 nm, and 30 nm, respectively, while achieving reasonable values for (M1, M2). The work demonstrates a systematic approach to showcase the advancement in TFET based Dynamic Random Access Memory (DRAM) through the use of planar tri-gate topology at a lower bias value. The concept, design, and operation of planar tri-gate architecture provide valuable viewpoints for TFET based DRAM.
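
    The two composite metrics defined in this abstract, M1 = SM × RT and M2 = SM × CR, can be written out as a minimal sketch. The function name and the numeric inputs below are hypothetical placeholders chosen for illustration; they are not values reported by the authors.

```python
def composite_metrics(sense_margin, retention_time, current_ratio):
    """Return (M1, M2), where M1 = SM x RT and M2 = SM x CR."""
    m1 = sense_margin * retention_time   # sense margin x retention time
    m2 = sense_margin * current_ratio    # sense margin x current ratio
    return m1, m2


if __name__ == "__main__":
    # Placeholder inputs (illustrative only, not measured data).
    m1, m2 = composite_metrics(sense_margin=2.0, retention_time=0.5, current_ratio=100.0)
    print("M1 =", m1, "M2 =", m2)
```

    Because both metrics are products, a device only scores well when the sense margin and the second quantity are both reasonable, which is presumably why the abstract reports them as a pair.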

  6. Architecture & Environment

    Science.gov (United States)

    Erickson, Mary; Delahunt, Michael

    2010-01-01

    Most art teachers would agree that architecture is an important form of visual art, but they do not always include it in their curriculums. In this article, the authors share core ideas from "Architecture and Environment," a teaching resource that they developed out of a long-term interest in teaching architecture and their fascination with the…

  7. Exporting Humanist Architecture

    DEFF Research Database (Denmark)

    Nielsen, Tom

    2016-01-01

    The article is a chapter in the catalogue for the Danish exhibition at the 2016 Architecture Biennale in Venice. The catalogue is conceived as an independent book exploring the theme Art of Many - The Right to Space. The chapter is an essay in this anthology tracing and discussing the different values and ethical stands involved in the export of Danish Architecture. Abstract: Danish architecture has, in a sense, been driven by an unwritten contract between the architects and the democratic state and its institutions. This contract may be viewed as an ethos – an architectural tradition with inherent aesthetic and moral values. Today, however, Danish architecture is also an export commodity. That raises questions, which should be debated as openly as possible. What does it mean for architecture and architects to practice in cultures and under political systems that do not use architecture...

  8. A model for Intelligent Random Access Memory architecture (IRAM) cellular automata algorithms on the Associative String Processing machine (ASTRA)

    CERN Document Server

    Rohrbach, F; Vesztergombi, G

    1997-01-01

    In the near future, computer performance will be completely determined by how long it takes to access memory. There are bottlenecks in memory latency and in memory-to-processor interface bandwidth. The IRAM initiative could be the answer, by putting the processor in memory (Processor-In-Memory, PIM). Starting from the massively parallel processing concept, one reaches a similar conclusion. The MPPC (Massively Parallel Processing Collaboration) project and the 8K processor ASTRA machine (Associative String Test bench for Research & Applications) developed at CERN \cite{kuala} can be regarded as a forerunner of the IRAM concept. The computing power of the ASTRA machine, regarded as an IRAM with 64 one-bit processors on a 64×64 bit-matrix memory chip machine, has been demonstrated by running statistical physics algorithms: one-dimensional stochastic cellular automata, as a simple model for dynamical phase transitions. As a relevant result for physics, the damage spreading of this model has been investigated.
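
    To make "damage spreading" in a one-dimensional stochastic cellular automaton concrete, here is a minimal sketch. The abstract does not state which update rule was used, so the Domany-Kinzel-style rule below (a site becomes active with probability p1 or p2 when one or two of its neighbours are active) is an assumption, as are all parameter values and function names. Two replicas that differ in a single site are evolved with the same random numbers, and the number of differing sites is recorded at each step.

```python
import random


def step(state, p1, p2, rands):
    """One synchronous update of a Domany-Kinzel-style rule on a ring."""
    n = len(state)
    new = []
    for i in range(n):
        # Number of active neighbours (left and right), with periodic boundaries.
        active = state[i - 1] + state[(i + 1) % n]
        prob = (0.0, p1, p2)[active]
        new.append(1 if rands[i] < prob else 0)
    return new


def damage_spreading(n=64, steps=50, p1=0.8, p2=0.3, seed=0):
    """Return the Hamming distance between two replicas after each step."""
    rng = random.Random(seed)
    a = [rng.randint(0, 1) for _ in range(n)]
    b = list(a)
    b[n // 2] ^= 1  # flip one site: the initial "damage"
    history = []
    for _ in range(steps):
        # Both replicas see the same noise, so differences come only from the damage.
        rands = [rng.random() for _ in range(n)]
        a = step(a, p1, p2, rands)
        b = step(b, p1, p2, rands)
        history.append(sum(x != y for x, y in zip(a, b)))
    return history


if __name__ == "__main__":
    print(damage_spreading())
```

    Each site only needs its two neighbours and one random number per step, which is the kind of local, bit-level workload that maps naturally onto an array of one-bit processors.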

  9. Nonvolatile Memory Materials for Neuromorphic Intelligent Machines.

    Science.gov (United States)

    Jeong, Doo Seok; Hwang, Cheol Seong

    2018-04-18

    Recent progress in deep learning extends the capability of artificial intelligence to various practical tasks, making the deep neural network (DNN) an extremely versatile hypothesis. While such DNN is virtually built on contemporary data centers of the von Neumann architecture, physical (in part) DNN of non-von Neumann architecture, also known as neuromorphic computing, can remarkably improve learning and inference efficiency. Particularly, resistance-based nonvolatile random access memory (NVRAM) highlights its handy and efficient application to the multiply-accumulate (MAC) operation in an analog manner. Here, an overview is given of the available types of resistance-based NVRAMs and their technological maturity from the material- and device-points of view. Examples within the strategy are subsequently addressed in comparison with their benchmarks (virtual DNN in deep learning). A spiking neural network (SNN) is another type of neural network that is more biologically plausible than the DNN. The successful incorporation of resistance-based NVRAM in SNN-based neuromorphic computing offers an efficient solution to the MAC operation and spike timing-based learning in nature. This strategy is exemplified from a material perspective. Intelligent machines are categorized according to their architecture and learning type. Also, the functionality and usefulness of NVRAM-based neuromorphic computing are addressed.
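
    The analog multiply-accumulate operation mentioned above is commonly pictured as a resistive crossbar: each stored conductance multiplies the voltage on its row (Ohm's law) and the currents on each column add up (Kirchhoff's current law). The abstract does not describe a specific array layout, so the crossbar arrangement, the function name, and the example values below are assumptions made for illustration.

```python
def crossbar_mac(conductances, voltages):
    """Column currents I[j] = sum over rows i of G[i][j] * V[i].

    conductances: list of rows, each a list of conductances (siemens).
    voltages:     one input voltage per row (volts).
    """
    n_rows = len(conductances)
    if len(voltages) != n_rows:
        raise ValueError("need exactly one voltage per crossbar row")
    n_cols = len(conductances[0])
    return [
        sum(conductances[i][j] * voltages[i] for i in range(n_rows))
        for j in range(n_cols)
    ]


if __name__ == "__main__":
    # Hypothetical 2x2 array: conductances in siemens, inputs in volts.
    g = [[1e-6, 2e-6],
         [3e-6, 4e-6]]
    v = [0.1, 0.2]
    print(crossbar_mac(g, v))  # one output current per column, in amperes
```

    In this picture the weight matrix is the stored conductance pattern, so a full matrix-vector product is obtained in a single read step rather than through a sequence of digital multiplications.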

  10. Fragments of Architecture

    DEFF Research Database (Denmark)

    Bang, Jacob Sebastian

    2016-01-01

    Topic 3: “Case studies dealing with the artistic and architectural work of architects worldwide, and the ties between specific artistic and architectural projects, methodologies and products”...

  11. Enterprise architecture management

    DEFF Research Database (Denmark)

    Rahimi, Fatemeh; Gøtze, John; Møller, Charles

    2017-01-01

    Despite the growing interest in enterprise architecture management, researchers and practitioners lack a shared understanding of its applications in organizations. Building on findings from a literature review and eight case studies, we develop a taxonomy that categorizes applications of enterprise architecture management based on three classes of enterprise architecture scope. Organizations may adopt enterprise architecture management to help form, plan, and implement IT strategies; help plan and implement business strategies; or to further complement the business strategy-formation process. The findings challenge the traditional IT-centric view of enterprise architecture management application and suggest enterprise architecture management as an approach that could support the consistent design and evolution of an organization as a whole.

  12. Enterprise architecture management

    DEFF Research Database (Denmark)

    Rahimi, Fatemeh; Gøtze, John; Møller, Charles

    2017-01-01

    Despite the growing interest in enterprise architecture management, researchers and practitioners lack a shared understanding of its applications in organizations. Building on findings from a literature review and eight case studies, we develop a taxonomy that categorizes applications of enterprise architecture management based on three classes of enterprise architecture scope. Organizations may adopt enterprise architecture management to help form, plan, and implement IT strategies; help plan and implement business strategies; or to further complement the business strategy-formation process. The findings challenge the traditional IT-centric view of enterprise architecture management application and suggest enterprise architecture management as an approach that could support the consistent design and evolution of an organization as a whole.

  13. A high performance architecture for accelerator controls

    International Nuclear Information System (INIS)

    Allen, M.; Hunt, S.M.; Lue, H.; Saltmarsh, C.G.; Parker, C.R.C.B.

    1991-01-01

    The demands placed on the Superconducting Super Collider (SSC) control system due to large distances, high bandwidth and fast response time required for operation will require a fresh approach to the data communications architecture of the accelerator. The prototype design effort aims at providing deterministic communication across the accelerator complex with a response time of < 100 ms and total bandwidth of 2 Gbits/sec. It will offer a consistent interface for a large number of equipment types, from vacuum pumps to beam position monitors, providing appropriate communications performance for each equipment type. It will consist of highly parallel links to all equipment: those with computing resources, non-intelligent direct control interfaces, and data concentrators. This system will give each piece of equipment a dedicated link of fixed bandwidth to the control system. Application programs will have access to all accelerator devices, which will be memory mapped into a global virtual addressing scheme. Links to devices in the same geographical area will be multiplexed using commercial Time Division Multiplexing equipment. Low-level access will use reflective memory techniques, eliminating processing overhead and complexity of traditional data communication protocols. The use of commercial standards and equipment will enable a high performance system to be built at low cost.

  14. A high performance architecture for accelerator controls

    International Nuclear Information System (INIS)

    Allen, M.; Hunt, S.M.; Lue, H.; Saltmarsh, C.G.; Parker, C.R.C.B.

    1991-03-01

    The demands placed on the Superconducting Super Collider (SSC) control system due to large distances, high bandwidth and fast response time required for operation will require a fresh approach to the data communications architecture of the accelerator. The prototype design effort aims at providing deterministic communication across the accelerator complex with a response time of <100 ms and total bandwidth of 2 Gbits/sec. It will offer a consistent interface for a large number of equipment types, from vacuum pumps to beam position monitors, providing appropriate communications performance for each equipment type. It will consist of highly parallel links to all equipment: those with computing resources, non-intelligent direct control interfaces, and data concentrators. This system will give each piece of equipment a dedicated link of fixed bandwidth to the control system. Application programs will have access to all accelerator devices, which will be memory mapped into a global virtual addressing scheme. Links to devices in the same geographical area will be multiplexed using commercial Time Division Multiplexing equipment. Low-level access will use reflective memory techniques, eliminating processing overhead and complexity of traditional data communication protocols. The use of commercial standards and equipment will enable a high performance system to be built at low cost. 1 fig

  15. Modeling Architectural Patterns’ Behavior Using Architectural Primitives

    NARCIS (Netherlands)

    Waqas Kamal, Ahmad; Avgeriou, Paris

    2008-01-01

    Architectural patterns have an impact on both the structure and the behavior of a system at the architecture design level. However, it is challenging to model patterns’ behavior in a systematic way because modeling languages do not provide the appropriate abstractions and because each pattern

  16. Space and Architecture's Current Line of Research? A Lunar Architecture Workshop With An Architectural Agenda.

    Science.gov (United States)

    Solomon, D.; van Dijk, A.

    The "2002 ESA Lunar Architecture Workshop" (June 3-16, ESTEC, Noordwijk, NL and V2_Lab, Rotterdam, NL) is the first-of-its-kind workshop for exploring the design of extra-terrestrial (infra) structures for human exploration of the Moon and Earth-like planets, introducing 'architecture's current line of research' and adopting architectural criteria. The workshop intends to inspire, engage and challenge 30-40 European masters students from the fields of aerospace engineering, civil engineering, architecture, and art to design, validate and build models of (infra) structures for Lunar exploration. The workshop also aims to open up new physical and conceptual terrain for an architectural agenda within the field of space exploration. A sound introduction to the issues, conditions, resources, technologies, and architectural strategies will initiate the workshop participants into the context of lunar architecture scenarios. In my paper and presentation about the development of the ideology behind this workshop, I will comment on the following questions:
    * Can the contemporary architectural agenda offer solutions that affect the scope of space exploration? It certainly has had an impression on urbanization and colonization of previously sparsely populated parts of Earth.
    * Does the current line of research in architecture offer any useful strategies for combining scientific interests, commercial opportunity, and public space? What can be learned from 'state of the art' architecture that blends commercial and public programmes within one location?
    * Should commercial 'colonisation' projects in space be required to provide public space in a location where all humans present are likely to be there in a commercial context? Is the wave in Koolhaas' new Prada flagship store just a gesture to public space, or does this new concept in architecture and shopping evolve the public space?
    * What can we learn about designing (infra-) structures on the Moon or any other

  17. Genetic Complexity of Episodic Memory: A Twin Approach to Studies of Aging

    Science.gov (United States)

    Kremen, William S.; Spoon, Kelly M.; Jacobson, Kristen C.; Vasilopoulos, Terrie; McCaffery, Jeanne M.; Panizzon, Matthew S.; Franz, Carol E.; Vuoksimaa, Eero; Xian, Hong; Rana, Brinda K.; Toomey, Rosemary; McKenzie, Ruth; Lyons, Michael J.

    2016-01-01

    Episodic memory change is a central issue in cognitive aging, and understanding that process will require elucidation of its genetic underpinnings. A key limiting factor in genetically informed research on memory has been lack of attention to genetic and phenotypic complexity, as if “memory is memory” and all well-validated assessments are essentially equivalent. Here we applied multivariate twin models to data from late-middle-aged participants in the Vietnam Era Twin Study of Aging to examine the genetic architecture of 6 measures from 3 standard neuropsychological tests: the California Verbal Learning Test-2, and Wechsler Memory Scale-III Logical Memory (LM) and Visual Reproductions (VR). An advantage of the twin method is that it can estimate the extent to which latent genetic influences are shared or independent across different measures before knowing which specific genes are involved. The best-fitting model was a higher order common pathways model with a heritable higher order general episodic memory factor and three test-specific subfactors. More importantly, substantial genetic variance was accounted for by genetic influences that were specific to the latent LM and VR subfactors (28% and 30%, respectively) and independent of the general factor. Such unique genetic influences could partially account for replication failures. Moreover, if different genes influence different memory phenotypes, they could well have different age-related trajectories. This approach represents an important step toward providing critical information for all types of genetically informative studies of aging and memory. PMID:24956007

  18. Application of source biasing technique for energy efficient DECODER circuit design: memory array application

    Science.gov (United States)

    Gupta, Neha; Parihar, Priyanka; Neema, Vaibhav

    2018-04-01

    Researchers have proposed many circuit techniques to reduce leakage power dissipation in memory cells. To reduce the overall power in the memory system, the input circuitry of the memory architecture, i.e. the row and column decoders, must also be addressed. In this research work, a low-leakage, high-speed row and column decoder for memory array applications is designed and four new techniques are proposed. Cluster DECODER, body bias DECODER, source bias DECODER, and source coupling DECODER circuits are designed, analyzed and compared for memory array applications. Simulation is performed for the comparative analysis of different DECODER design parameters with a 180 nm GPDK technology file using the CADENCE tool. Simulation results show that the proposed source bias DECODER circuit technique decreases the leakage current by 99.92% and static energy by 99.92% at a supply voltage of 1.2 V. The proposed circuit also improves dynamic power dissipation by 5.69%, dynamic PDP/EDP by 65.03% and delay by 57.25% at a 1.2 V supply voltage.

  19. Looking back, thinking ahead : A neuropsychological view on cognitive correlates of time, space and prospective memory

    NARCIS (Netherlands)

    Kant, N.

    2017-01-01

    The aim of this thesis was to better understand the cognitive architecture underlying complex functions relevant for daily life. A special focus was placed on prospective memory (PM). PM is defined as remembering to carry out intended actions at an appropriate moment in the future. This moment can

  20. Birth weight, working memory and epigenetic signatures in IGF2 and related genes: a MZ twin study.

    Directory of Open Access Journals (Sweden)

    Aldo Córdova-Palomera

    Neurodevelopmental disruptions caused by obstetric complications play a role in the etiology of several phenotypes associated with neuropsychiatric diseases and cognitive dysfunctions. Importantly, it has been noticed that epigenetic processes occurring early in life may mediate these associations. Here, DNA methylation signatures at IGF2 (insulin-like growth factor 2) and IGF2BP1-3 (IGF2-binding proteins 1-3) were examined in a sample consisting of 34 adult monozygotic (MZ) twins informative for obstetric complications and cognitive performance. Multivariate linear regression analysis of twin data was implemented to test for associations between methylation levels and both birth weight (BW) and adult working memory (WM) performance. Familial and unique environmental factors underlying these potential relationships were evaluated. A link was detected between DNA methylation levels of two CpG sites in the IGF2BP1 gene and both BW and adult WM performance. The BW-IGF2BP1 methylation association seemed due to non-shared environmental factors influencing BW, whereas the WM-IGF2BP1 methylation relationship seemed mediated by both genes and environment. Our data are in agreement with previous evidence indicating that DNA methylation status may be related to prenatal stress and later neurocognitive phenotypes. While former reports independently detected associations between DNA methylation and either BW or WM, current results suggest that these relationships are not confounded by each other.

  1. Modality-specific alpha modulations facilitate long-term memory encoding in the presence of distracters.

    Science.gov (United States)

    Jiang, Haiteng; van Gerven, Marcel A J; Jensen, Ole

    2015-03-01

    It has been proposed that long-term memory encoding is not only dependent on engaging task-relevant regions but also on disengaging task-irrelevant regions. In particular, oscillatory alpha activity has been shown to be involved in shaping the functional architecture of the working brain because it reflects the functional disengagement of specific regions in attention and memory tasks. We here ask if such allocation of resources by alpha oscillations generalizes to long-term memory encoding in a cross-modal setting in which we acquired the ongoing brain activity using magnetoencephalography. Participants were asked to encode pictures while ignoring simultaneously presented words and vice versa. We quantified the brain activity during rehearsal reflecting subsequent memory in the different attention conditions. The key finding was that successful long-term memory encoding is reflected by alpha power decreases in the sensory region of the to-be-attended modality and increases in the sensory region of the to-be-ignored modality to suppress distraction during the rehearsal period. Our results corroborate related findings from attention studies by demonstrating that alpha activity is also important for the allocation of resources during long-term memory encoding in the presence of distracters.

  2. Methodical Design of Software Architecture Using an Architecture Design Assistant (ArchE)

    Science.gov (United States)

    2005-04-01

    Methodical Design of Software Architecture Using an Architecture Design Assistant (ArchE). Felix Bachmann and Mark Klein, Software... ...important for architecture design: quality requirements and constraints are most important. Here’s some evidence: If the only concern is

  3. Poly(3,4-ethylenedioxythiophene)-Poly(styrenesulfonate) Interlayer Insertion Enables Organic Quaternary Memory.

    Science.gov (United States)

    Cheng, Xue-Feng; Hou, Xiang; Qian, Wen-Hu; He, Jing-Hui; Xu, Qing-Feng; Li, Hua; Li, Na-Jun; Chen, Dong-Yun; Lu, Jian-Mei

    2017-08-23

    Herein, for the first time, quaternary resistive memory based on an organic molecule is achieved via surface engineering. A layer of poly(3,4-ethylenedioxythiophene)-poly(styrenesulfonate) (PEDOT-PSS) was inserted between the indium tin oxide (ITO) electrode and the organic layer (squaraine, SA-Bu) to form an ITO/PEDOT-PSS/SA-Bu/Al architecture. The modified resistive random-access memory (RRAM) devices achieve quaternary memory switching with the highest yield (∼41%) to date. Surface morphology, crystallinity, and mosaicity of the deposited organic grains are greatly improved after insertion of a PEDOT-PSS interlayer, which provides better contacts at the grain boundaries as well as the electrode/active layer interface. The PEDOT-PSS interlayer also reduces the hole injection barrier from the electrode to the active layer. Thus, the threshold voltage of each switching is greatly reduced, allowing for more quaternary switching in a certain voltage window. Our results provide a simple yet powerful strategy as an alternative to molecular design to achieve organic quaternary resistive memory.

  4. Olfactory memories are intensity specific in larval Drosophila.

    Science.gov (United States)

    Mishra, Dushyant; Chen, Yi-Chun; Yarali, Ayse; Oguz, Tuba; Gerber, Bertram

    2013-05-01

    Learning can rely on stimulus quality, stimulus intensity, or a combination of these. Regarding olfaction, the coding of odour quality is often proposed to be combinatorial along the olfactory pathway, and working hypotheses are available concerning short-term associative memory trace formation of odour quality. However, it is less clear how odour intensity is coded, and whether olfactory memory traces include information about the intensity of the learnt odour. Using odour-sugar associative conditioning in larval Drosophila, we first describe the dose-effect curves of learnability across odour intensities for four different odours (n-amyl acetate, 3-octanol, 1-octen-3-ol and benzaldehyde). We then chose odour intensities such that larvae were trained at an intermediate odour intensity, but were tested for retention with either that trained intermediate odour intensity, or with respectively higher or lower intensities. We observed a specificity of retention for the trained intensity for all four odours used. This adds to the appreciation of the richness in 'content' of olfactory short-term memory traces, even in a system as simple as larval Drosophila, and to define the demands on computational models of associative olfactory memory trace formation. We suggest two kinds of circuit architecture that have the potential to accommodate intensity learning, and discuss how they may be implemented in the insect brain.

  5. Resting state networks' corticotopy: the dual intertwined rings architecture.

    Directory of Open Access Journals (Sweden)

    Salma Mesmoudi

    How does the brain integrate multiple sources of information to support normal sensorimotor and cognitive functions? To investigate this question we present an overall brain architecture (called "the dual intertwined rings architecture") that relates the functional specialization of cortical networks to their spatial distribution over the cerebral cortex (or "corticotopy"). Recent results suggest that the resting state networks (RSNs) are organized into two large families: (1) a sensorimotor family that includes visual, somatic, and auditory areas and (2) a large association family that comprises parietal, temporal, and frontal regions and also includes the default mode network. We used two large databases of resting state fMRI data, from which we extracted 32 robust RSNs. We estimated: (1) the RSN functional roles by using a projection of the results on task based networks (TBNs) as referenced in large databases of fMRI activation studies; and (2) the relationship of the RSNs with the Brodmann Areas. In both classifications, the 32 RSNs are organized into a remarkable architecture of two intertwined rings per hemisphere and so four rings linked by homotopic connections. The first ring forms a continuous ensemble and includes visual, somatic, and auditory cortices, with interspersed bimodal cortices (auditory-visual, visual-somatic and auditory-somatic), abbreviated as the VSA ring. The second ring integrates distant parietal, temporal and frontal regions (PTF ring) through a network of association fiber tracts which closes the ring anatomically and ensures a functional continuity within the ring. The PTF ring relates association cortices specialized in attention, language and working memory, to the networks involved in motivation and biological regulation and rhythms.
This "dual intertwined architecture" suggests a dual integrative process: the VSA ring performs fast real-time multimodal integration of sensorimotor information whereas the PTF ring performs multi

  6. Architectural freedom and industrialised architecture

    DEFF Research Database (Denmark)

    Vestergaard, Inge

    2012-01-01

    ...to the building physics problems, a new industrialized period has started based on light weight elements basically made of wooden structures, faced with different suitable materials meant for individual expression for the specific housing area. It is the purpose of this article to widen up the different design... to this systematic thinking of the building technique we get a diverse and functional architecture. Creating a new and clearer story telling about new and smart system based thinking behind the architectural expression...

  7. Preemptive Architecture: Explosive Art and Future Architectures in Cursed Urban Zones

    Directory of Open Access Journals (Sweden)

    Stahl Stenslie

    2017-04-01

    This article describes the art and architectural research project Preemptive Architecture that uses artistic strategies and approaches to create bomb-ready architectural structures that act as instruments for the undoing of violence in war. Increasing environmental usability through destruction represents an inverse strategy that reverses common thinking patterns about warfare, art and architecture. Building structures predestined for a constructive destruction becomes a creative act. One of the main motivations behind this paper is to challenge and expand the material thinking as well as the socio-political conditions related to artistic, architectural and design based practices. Article received: December 12, 2016; Article accepted: January 10, 2017; Published online: April 20, 2017. Original scholarly paper. How to cite this article: Stenslie, Stahl, and Magne Wiggen. "Preemptive Architecture: Explosive Art and Future Architectures in Cursed Urban Zones." AM Journal of Art and Media Studies 12 (2017): 29-39. doi: 10.25038/am.v0i12.165

  8. Mapping of H.264 decoding on a multiprocessor architecture

    Science.gov (United States)

    van der Tol, Erik B.; Jaspers, Egbert G.; Gelderblom, Rob H.

    2003-05-01

    Due to the increasing significance of development costs in the competitive domain of high-volume consumer electronics, generic solutions are required to enable reuse of the design effort and to increase the potential market volume. As a result, Systems-on-Chip (SoCs) contain a growing number of fully programmable media processing devices as opposed to application-specific systems, which offered the most attractive solutions due to a high performance density. The following motivates this trend. First, SoCs are increasingly dominated by their communication infrastructure and embedded memory, thereby making the cost of the functional units less significant. Moreover, the continuously growing design costs require generic solutions that can be applied over a broad product range. Hence, powerful programmable SoCs are becoming increasingly attractive. However, to enable power-efficient designs that are also scalable over the advancing VLSI technology, parallelism should be fully exploited. Both task-level and instruction-level parallelism can be provided by means of e.g. a VLIW multiprocessor architecture. To provide the above-mentioned scalability, we propose to partition the data over the processors, instead of traditional functional partitioning. An advantage of this approach is the inherent locality of data, which is extremely important for communication-efficient software implementations. Consequently, a software implementation is discussed, enabling e.g. SD resolution H.264 decoding with a two-processor architecture, whereas High-Definition (HD) decoding can be achieved with an eight-processor system, executing the same software. Experimental results show that the data communication is considerably reduced, by up to 65%, directly improving the overall performance. Apart from considerable improvement in memory bandwidth, this novel concept of partitioning offers a natural approach for optimally balancing the load of all processors, thereby further improving the
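
    The idea of partitioning the data over the processors, rather than assigning each processor a decoding function, can be illustrated with a minimal sketch. The abstract does not say how the picture is divided, so splitting by contiguous macroblock rows is an assumption here, and the function name is hypothetical. The row counts follow from 16×16 macroblocks: 36 rows for a 576-line SD picture and 68 rows for a 1080-line HD picture padded to 1088 lines.

```python
def partition_rows(num_rows, num_procs):
    """Split rows 0..num_rows-1 into num_procs contiguous, balanced ranges."""
    base, extra = divmod(num_rows, num_procs)
    ranges = []
    start = 0
    for p in range(num_procs):
        # The first `extra` processors take one additional row each.
        size = base + (1 if p < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    # Same code, different processor counts (SD on two, HD on eight).
    for label, rows, procs in (("SD", 36, 2), ("HD", 68, 8)):
        sizes = [len(r) for r in partition_rows(rows, procs)]
        print(label, sizes)
```

    The sketch ignores the dependencies between neighbouring macroblocks (for example intra prediction and deblocking across a partition boundary), which a real decoder would have to synchronize; it only shows how the same software can scale by changing the processor count.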

  9. 3D memory: etch is the new litho

    Science.gov (United States)

    Petti, Christopher

    2018-03-01

    This paper discusses the process challenges and limitations for 3D NAND processes, focusing on vertical 3D architectures. The effect of deep memory hole etches on die cost is calculated, with die cost showing a minimum at a given number of layers because of aspect-ratio dependent etch effects. Techniques to mitigate these etch effects are summarized, as are other etch issues, such as bowing and twisting. Metal replacement gate processes and their challenges are also described. Lastly, future directions of vertical 3D NAND technologies are explored.

  10. Enterprise architecture patterns practical solutions for recurring IT-architecture problems

    CERN Document Server

    Perroud, Thierry

    2013-01-01

    Every enterprise architect faces similar problems when designing and governing the enterprise architecture of a medium to large enterprise. Design patterns are a well-established concept in software engineering, used to define universally applicable solution schemes. By applying this approach to enterprise architectures, recurring problems in the design and implementation of enterprise architectures can be solved over all layers, from the business layer to the application and data layer down to the technology layer. Inversini and Perroud describe patterns at the level of enterprise architecture

  11. Interactive volume exploration of petascale microscopy data streams using a visualization-driven virtual memory approach

    KAUST Repository

    Hadwiger, Markus; Beyer, Johanna; Jeong, Wonki; Pfister, Hanspeter

    2012-01-01

    This paper presents the first volume visualization system that scales to petascale volumes imaged as a continuous stream of high-resolution electron microscopy images. Our architecture scales to dense, anisotropic petascale volumes because it: (1) decouples construction of the 3D multi-resolution representation required for visualization from data acquisition, and (2) decouples sample access time during ray-casting from the size of the multi-resolution hierarchy. Our system is designed around a scalable multi-resolution virtual memory architecture that handles missing data naturally, does not pre-compute any 3D multi-resolution representation such as an octree, and can accept a constant stream of 2D image tiles from the microscopes. A novelty of our system design is that it is visualization-driven: we restrict most computations to the visible volume data. Leveraging the virtual memory architecture, missing data are detected during volume ray-casting as cache misses, which are propagated backwards for on-demand out-of-core processing. 3D blocks of volume data are only constructed from 2D microscope image tiles when they have actually been accessed during ray-casting. We extensively evaluate our system design choices with respect to scalability and performance, compare to previous best-of-breed systems, and illustrate the effectiveness of our system for real microscopy data from neuroscience. © 1995-2012 IEEE.
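
    The miss-driven behaviour described in this abstract can be illustrated with a small toy model. The sketch below is not the authors' system: the class `BlockCache`, the function `cast_rays`, and the `(x, y, z)` block key are hypothetical names chosen for illustration. It only shows the general idea that lookups during ray-casting record cache misses, and that blocks are built afterwards only for keys that were actually requested.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

BlockKey = Tuple[int, int, int]  # hypothetical (x, y, z) index of a 3D block


class BlockCache:
    """Toy virtual-memory cache: lookups never block, misses are recorded."""

    def __init__(self, build_block: Callable[[BlockKey], bytes]) -> None:
        self._build_block = build_block          # assumed on-demand builder
        self._resident: Dict[BlockKey, bytes] = {}
        self._misses: List[BlockKey] = []

    def lookup(self, key: BlockKey) -> Optional[bytes]:
        """Return the block if resident; otherwise record a miss once."""
        block = self._resident.get(key)
        if block is None and key not in self._misses:
            self._misses.append(key)
        return block

    def service_misses(self) -> int:
        """Build only the blocks that were requested; return how many."""
        served = 0
        for key in self._misses:
            self._resident[key] = self._build_block(key)
            served += 1
        self._misses.clear()
        return served


def cast_rays(cache: BlockCache, keys: Iterable[BlockKey]) -> int:
    """Stand-in for ray-casting: count samples that hit resident blocks."""
    return sum(1 for key in keys if cache.lookup(key) is not None)
```

    In this sketch a first pass over the visible keys hits nothing and only records misses; after `service_misses()` a second pass finds every requested block resident, while blocks that were never visible are never built.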

  12. Interactive volume exploration of petascale microscopy data streams using a visualization-driven virtual memory approach

    KAUST Repository

    Hadwiger, Markus

    2012-12-01

    This paper presents the first volume visualization system that scales to petascale volumes imaged as a continuous stream of high-resolution electron microscopy images. Our architecture scales to dense, anisotropic petascale volumes because it: (1) decouples construction of the 3D multi-resolution representation required for visualization from data acquisition, and (2) decouples sample access time during ray-casting from the size of the multi-resolution hierarchy. Our system is designed around a scalable multi-resolution virtual memory architecture that handles missing data naturally, does not pre-compute any 3D multi-resolution representation such as an octree, and can accept a constant stream of 2D image tiles from the microscopes. A novelty of our system design is that it is visualization-driven: we restrict most computations to the visible volume data. Leveraging the virtual memory architecture, missing data are detected during volume ray-casting as cache misses, which are propagated backwards for on-demand out-of-core processing. 3D blocks of volume data are only constructed from 2D microscope image tiles when they have actually been accessed during ray-casting. We extensively evaluate our system design choices with respect to scalability and performance, compare to previous best-of-breed systems, and illustrate the effectiveness of our system for real microscopy data from neuroscience. © 1995-2012 IEEE.

  13. MUF architecture /art London

    DEFF Research Database (Denmark)

    Svenningsen Kajita, Heidi

    2009-01-01

    About MUF architecture, with an interview with Liza Fior and Katherine Clarke, partners in muf architecture/art...

  14. Towards Terabit Memories

    Science.gov (United States)

    Hoefflinger, Bernd

    Memories have been the major yardstick for the continuing validity of Moore's law. In single-transistor-per-bit dynamic random-access memories (DRAM), the number of bits per chip pretty much gives us the number of transistors. For decades, DRAMs have offered the largest storage capacity per chip. However, DRAM does not scale any longer, both in density and voltage, severely limiting its power efficiency to 10 fJ/b. A differential DRAM would gain four times in density and eight times in energy. Static CMOS RAM (SRAM) with its six transistors/cell is gaining in reputation because it scales well in cell size and operating voltage so that its fundamental advantage of speed, non-destructive read-out and low-power standby could lead to just 2.5 electrons/bit in standby and to a dynamic power efficiency of 2 aJ/b. With a projected 2020 density of 16 Gb/cm², the SRAM would be as dense as normal DRAM and vastly better in power efficiency, which would mean a major change in the architecture and market scenario for DRAM versus SRAM. Non-volatile Flash memories have seen two quantum jumps in density well beyond the roadmap: multi-bit storage per transistor and high-density TSV (through-silicon via) technology. The number of electrons required per bit on the storage gate has been reduced since their first realization in 1996 by more than an order of magnitude to 400 electrons/bit in 2010 for a complexity of 32 Gbit per chip at the 32 nm node. Chip stacking of eight chips with TSV has produced a 32 GByte solid-state drive (SSD). A stack of 32 chips with 2 b/cell at the 16 nm node will reach a density of 2.5 Terabit/cm². Non-volatile memory with a density of 10 × 10 nm²/bit is the target for widespread development. Phase-change memory (PCM) and resistive memory (RRAM) lead in cell density, and they will reach 20 Gb/cm² in 2D and higher with 3D chip stacking. This is still almost an order of magnitude less than Flash. However, their read-out speed is ~10 times faster, with as yet

  15. A quantum annealing architecture with all-to-all connectivity from local interactions.

    Science.gov (United States)

    Lechner, Wolfgang; Hauke, Philipp; Zoller, Peter

    2015-10-01

    Quantum annealers are physical devices that aim at solving NP-complete optimization problems by exploiting quantum mechanics. The basic principle of quantum annealing is to encode the optimization problem in Ising interactions between quantum bits (qubits). A fundamental challenge in building a fully programmable quantum annealer is the competing requirements of fully controllable all-to-all connectivity and the quasi-locality of the interactions between physical qubits. We present a scalable architecture with full connectivity, which can be implemented with local interactions only. The input of the optimization problem is encoded in local fields acting on an extended set of physical qubits. The output is, in the spirit of topological quantum memories, redundantly encoded in the physical qubits, resulting in an intrinsic fault tolerance. Our model can be understood as a lattice gauge theory, where long-range interactions are mediated by gauge constraints. The architecture can be realized on various platforms with local controllability, including superconducting qubits, NV-centers, quantum dots, and atomic systems.
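
    The idea of moving a problem from pairwise couplings to local fields on an extended set of variables can be checked classically on a small example. The sketch below is an illustration under stated assumptions, not the construction from the paper: it assumes one extended variable per pair of logical spins (their product), and it uses a triangle-consistency check as a stand-in for the constraints, whose actual form is not given in the abstract. All function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Sequence, Tuple

Pair = Tuple[int, int]


def ising_energy(spins: Sequence[int], couplings: Dict[Pair, float]) -> float:
    """Energy with all-to-all pairwise couplings J_ab * s_a * s_b."""
    return sum(j * spins[a] * spins[b] for (a, b), j in couplings.items())


def to_parity(spins: Sequence[int]) -> Dict[Pair, int]:
    """One extended variable per pair (assumption): p_ab = s_a * s_b."""
    n = len(spins)
    return {(a, b): spins[a] * spins[b] for a, b in combinations(range(n), 2)}


def parity_energy(parity: Dict[Pair, int], couplings: Dict[Pair, float]) -> float:
    """Same energy, but each coupling now acts as a local field on p_ab."""
    return sum(j * parity[pair] for pair, j in couplings.items())


def loops_consistent(parity: Dict[Pair, int], n: int) -> bool:
    """Products around every triangle must be +1 for a valid spin state."""
    return all(
        parity[(a, b)] * parity[(b, c)] * parity[(a, c)] == 1
        for a, b, c in combinations(range(n), 3)
    )
```

    For any spin configuration, `parity_energy(to_parity(spins), couplings)` equals `ising_energy(spins, couplings)`, and the triangle check passes. An arbitrary assignment of the extended variables need not pass it, which is why some form of constraint is required in such an encoding.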

  16. Low power test architecture for dynamic read destructive fault detection in SRAM

    Science.gov (United States)

    Takher, Vikram Singh; Choudhary, Rahul Raj

    2018-06-01

    Dynamic Read Destructive Fault (dRDF) is the outcome of resistive open defects in the core cells of static random-access memories (SRAMs). The sensitisation of dRDF involves either performing multiple read operations or creating a number of read equivalent stresses (RESs) on the core cell under test. Though creating RESs is preferred over performing multiple read operations on the core cell, the cell dissipates more power during RES than during a read or write operation. This paper focuses on reducing power dissipation by optimising the number of RESs required to sensitise the dRDF during the test mode of operation of the SRAM. A novel pre-charge architecture is proposed to reduce the power dissipation by limiting the number of RESs to an optimised number of two. The proposed low power architecture is simulated and analysed, and shows a reduction in power dissipation of up to 18.18% from reducing the number of RESs.

  17. VERNACULAR ARCHITECTURE: AN INTRODUCTORY COURSE TO LEARN ARCHITECTURE IN INDIA

    Directory of Open Access Journals (Sweden)

    Miki Desai

    2010-07-01

    “The object in view of both my predecessors in office and by myself has been rather to bring out the reasoning powers of individual students, so that they may understand the inner meaning of the old forms and their original function and may develop and modernize and gradually produce an architecture, Indian in character, but at the same time as suited to present day India as the old styles were to their own times and environment.” Claude Batley, 1940; Lang, Desai, Desai, 1997 (p. 143). The article introduces the teaching philosophy, content and method of Basic Design I and II for first year students of architecture at the Faculty of Architecture, Centre for Environmental Planning and Technology (CEPT) University, Ahmedabad, India. It is framed within the Indian perspective of architectural education from the British colonial times. Commencing with important academic literature and biases of the initial colonial period, it briefly traces architectural education in CEPT, the sixteenth school of post-independent India, set up in 1962, discussing the foundation year teaching imparted. The school was Modernist and avant-garde. The author introduced these two courses against the backdrop of the Universalist Modernist credo of architecture and education. In the courses, the primary philosophy behind learning design emerges from the heuristic method. The aim of the first course is seen as infusing interest in the visual world and developing manual skills and dexterity through the dictums of ‘Look-feel-reason out-evaluate’ and ‘observe-record-interpret-synthesize-transform-express’. Due to the lack of architectural orientation in Indian schooling, the second course assumes vernacular architecture as a reasonable tool for a novice to understand the triangular relationship of society, architecture and physical context and its impact on design. The students are analytically exposed to the regional variety of architectures logically stemming from the geo

  18. Architecture Descriptions. A Contribution to Modeling of Production System Architecture

    DEFF Research Database (Denmark)

    Jepsen, Allan Dam; Hvam, Lars

    a proper understanding of the architecture phenomenon and the ability to describe it in a manner that allow the architecture to be communicated to and handled by stakeholders throughout the company. Despite the existence of several design philosophies in production system design such as Lean, that focus...... a diverse set of stakeholder domains and tools in the production system life cycle. To support such activities, a contribution is made to the identification and referencing of production system elements within architecture descriptions as part of the reference architecture framework. The contribution...

  19. Virtual memory support for distributed computing environments using a shared data object model

    Science.gov (United States)

    Huang, F.; Bacon, J.; Mapp, G.

    1995-12-01

    Conventional storage management systems provide one interface for accessing memory segments and another for accessing secondary storage objects. This hinders application programming and affects overall system performance due to mandatory data copying and user/kernel boundary crossings, which in the microkernel case may involve context switches. Memory-mapping techniques may be used to provide programmers with a unified view of the storage system. This paper extends such techniques to support a shared data object model for distributed computing environments in which good support for coherence and synchronization is essential. The approach is based on a microkernel, typed memory objects, and integrated coherence control. A microkernel architecture is used to support multiple coherence protocols and the addition of new protocols. Memory objects are typed and applications can choose the most suitable protocols for different types of object to avoid protocol mismatch. Low-level coherence control is integrated with high-level concurrency control so that the number of messages required to maintain memory coherence is reduced and system-wide synchronization is realized without severely impacting the system performance. These features together contribute a novel approach to the support for flexible coherence under application control.
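
    The unified view that memory mapping gives a programmer can be shown with a short, single-machine example. The sketch below uses Python's standard `mmap` module and is only an illustration of the basic technique; it does not model the paper's microkernel, typed memory objects, or coherence protocols. The function names and the one-byte counter layout are hypothetical.

```python
import mmap
import os
import tempfile


def increment_counter_via_mmap(path: str) -> int:
    """Treat the first byte of a file as a counter and bump it through memory."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the file must not be empty.
        with mmap.mmap(f.fileno(), 0) as view:
            view[0] = (view[0] + 1) % 256  # ordinary indexing, no read()/write()
            view.flush()                   # push the change back to the file
            return view[0]


def demo() -> int:
    """Update a storage object twice via memory, then read it back as a file."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, bytes(16))  # 16 zero bytes as the backing object
        os.close(fd)
        increment_counter_via_mmap(path)
        increment_counter_via_mmap(path)
        with open(path, "rb") as f:
            return f.read(1)[0]
    finally:
        os.remove(path)
```

    Here the same bytes are reachable both as a memory region and as a file, so the program needs no explicit copy between a buffer and secondary storage. Sharing such objects between machines, which is the subject of the abstract, additionally requires the coherence and synchronization support it describes.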

  20. Sculpture Versus Architecture?

    Directory of Open Access Journals (Sweden)

    Alexander Rappaport

    2007-07-01

    Many critics consider Richard Serra the leading sculptor of the 20th century. He is famous not so much for inventing something new in sculpture (abstract sculptural compositions existed before him, having been introduced by the constructivist avant-garde of the beginning of the 20th century). Material selections by Vladimir Tatlin and sculptures by Ossip Zadkine, as well as compositions by Henry Moore, appeared before Serra. Serra is famous for shifting the emphasis of his works from the works themselves, which could be installed in any place, to their environment. That is, he saw in sculpture a key to understanding urban space. His crude metal sheets and profiles, rectangular and curvilinear, exceeding the regular scale of sculpture, come closer to architecture. Richard Serra places them near architectural constructions as checkpoints of an intermediate scale category of space located between so-called «street furniture» – lamp posts, stalls, fountains and benches – and buildings, especially huge modern ones. But the matter is not only one of scale. Serra's sculptures are not only abstract compositions that harmoniously add to the space with their spacious scale. They have some mystery, some implicit sense appearing before a pedestrian as an enigma. Their mystique opposes both street furniture and architecture. But first of all it opposes historical sculpture, whose enigma is always overshadowed by a historical or biographical topic. Krylov's sculpture in the Summer Garden or the monument to Minin and Pozharsky on Red Square do not strike us, because we know that those monuments are erected IN COMMEMORATION of prominent people, as fellow citizens' tribute to their great contribution to the national history. But the crude metal sheets welded at different angles – what are they for? Who needs them? As the art critic Edward Goldman said, fame came to Richard Serra in 1989, when the sculpture composition Tilted Arc erected eight years before it was demolished by

  1. The Relationship Between Digital Technology Experience, Daily Media Exposure and Working Memory Capacity

    Directory of Open Access Journals (Sweden)

    Muhterem DİNDAR

    2016-06-01

    Today’s youngsters interact with digital technologies to a great extent, which leads scholars to question the influence of this exposure on human cognitive structure. Drawing on digital nativity assumptions, it is presumed that the cognitive architecture of the youth may change in accordance with digital technology use. In this regard, the current study investigated the relationship between digital technology experience, daily media exposure and working memory capacity of so-called digital native participants. A total of 572 undergraduate students responded to self-report measures, which addressed years of experience for 7 different digital devices and the daily time spent on 14 different digital activities. Participants’ working memory capacity was measured through the Computation Span and the Dot Matrix Test. While the former was used to measure the phonological loop capacity, the latter was used to address the visuo-spatial sketchpad capacity. Correlational analyses revealed that neither the phonological loop capacity nor the visuo-spatial sketchpad capacity was related to digital technology experience and daily media exposure. Thus, the transformative contribution of digital technology experience to human cognitive architecture could not be observed through the current measures.

  2. Investigation of a Novel Common Subexpression Elimination Method for Low Power and Area Efficient DCT Architecture

    Directory of Open Access Journals (Sweden)

    M. F. Siddiqui

    2014-01-01

    A wide interest has been observed to find a low power and area efficient hardware design of the discrete cosine transform (DCT) algorithm. This research work proposed a novel Common Subexpression Elimination (CSE) based pipelined architecture for DCT, aimed at reducing the cost metrics of power and area while maintaining high speed and accuracy in DCT applications. The proposed design combines the techniques of Canonical Signed Digit (CSD) representation and CSE to implement the multiplier-less method for fixed constant multiplication of DCT coefficients. Furthermore, symmetry in the DCT coefficient matrix is used with CSE to further decrease the number of arithmetic operations. This architecture needs a single-port memory to feed the inputs instead of multiport memory, which leads to reduction of the hardware cost and area. From the analysis of experimental results and performance comparisons, it is observed that the proposed scheme uses minimum logic, utilizing a mere 340 slices and 22 adders. Moreover, this design meets the real time constraints of different video/image coders and peak-signal-to-noise-ratio (PSNR) requirements. Furthermore, the proposed technique has significant advantages over recent well-known methods along with accuracy in terms of power reduction, silicon area usage, and maximum operating frequency by 41%, 15%, and 15%, respectively.

  3. Investigation of a novel common subexpression elimination method for low power and area efficient DCT architecture.

    Science.gov (United States)

    Siddiqui, M F; Reza, A W; Kanesan, J; Ramiah, H

    2014-01-01

    A wide interest has been observed to find a low power and area efficient hardware design of the discrete cosine transform (DCT) algorithm. This research work proposed a novel Common Subexpression Elimination (CSE) based pipelined architecture for DCT, aimed at reducing the cost metrics of power and area while maintaining high speed and accuracy in DCT applications. The proposed design combines the techniques of Canonical Signed Digit (CSD) representation and CSE to implement the multiplier-less method for fixed constant multiplication of DCT coefficients. Furthermore, symmetry in the DCT coefficient matrix is used with CSE to further decrease the number of arithmetic operations. This architecture needs a single-port memory to feed the inputs instead of multiport memory, which leads to reduction of the hardware cost and area. From the analysis of experimental results and performance comparisons, it is observed that the proposed scheme uses minimum logic, utilizing a mere 340 slices and 22 adders. Moreover, this design meets the real time constraints of different video/image coders and peak-signal-to-noise-ratio (PSNR) requirements. Furthermore, the proposed technique has significant advantages over recent well-known methods along with accuracy in terms of power reduction, silicon area usage, and maximum operating frequency by 41%, 15%, and 15%, respectively.
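
    Multiplier-less constant multiplication with CSD digits can be illustrated in software. The sketch below is a generic textbook-style conversion of a non-negative integer to canonical signed digits, followed by a shift-and-add/subtract multiply; it is not the authors' hardware design and does not perform the CSE step (sharing repeated digit patterns between constants). The function names are hypothetical, and integer constants are assumed in place of the scaled DCT coefficients.

```python
from typing import List


def to_csd(n: int) -> List[int]:
    """Canonical signed digits of n >= 0, least significant first.

    Each digit is -1, 0 or +1, and no two adjacent digits are nonzero.
    """
    if n < 0:
        raise ValueError("this sketch handles non-negative constants only")
    digits: List[int] = []
    while n != 0:
        if n & 1:
            digit = 2 - (n % 4)  # +1 when n % 4 == 1, -1 when n % 4 == 3
            n -= digit
        else:
            digit = 0
        digits.append(digit)
        n //= 2
    return digits


def multiply_by_constant(x: int, csd_digits: List[int]) -> int:
    """Multiply x by the constant using only shifts, adds and subtracts."""
    total = 0
    for shift, digit in enumerate(csd_digits):
        if digit == 1:
            total += x << shift
        elif digit == -1:
            total -= x << shift
    return total
```

    For example, `to_csd(7)` gives `[-1, 0, 0, 1]`, that is 7 = 8 - 1, so multiplying by 7 needs one shift and one subtraction rather than the two additions implied by the plain binary form 111.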

  4. Architecture and Film

    OpenAIRE

    Mohammad Javaheri, Saharnaz

    2016-01-01

    Film does not exist without architecture. In every movie that has ever been made throughout history, the cinematic image of architecture is embedded within the picture. Throughout my studies and research, I began to see that there is no director who can consciously or unconsciously deny the use of architectural elements in his or her movies. Architecture offers a strong profile to distinguish characters and story. In the early days, films were shot in streets surrounde...

  5. Time, Language and Action - A Unified Long-Term Memory Model for Sensory-Motor Chains and Word Schemata

    OpenAIRE

    Chersi, Fabian; Ferro, Marcello; Pezzulo, Giovanni; Pirrelli, Vito

    2011-01-01

    Action and language are known to be organized as closely-related brain subsystems. An Italian CNR project implemented a computational neural model where the ability to form chains of goal-directed actions and chains of linguistic units relies on a unified memory architecture obeying the same organizing principles.

  6. Architectural geometry

    KAUST Repository

    Pottmann, Helmut

    2014-11-26

    Around 2005 it became apparent in the geometry processing community that freeform architecture contains many problems of a geometric nature to be solved, and many opportunities for optimization which however require geometric understanding. This area of research, which has been called architectural geometry, meanwhile contains a great wealth of individual contributions which are relevant in various fields. For mathematicians, the relation to discrete differential geometry is significant, in particular the integrable system viewpoint. Besides, new application contexts have become available for quite some old-established concepts. Regarding graphics and geometry processing, architectural geometry yields interesting new questions but also new objects, e.g. replacing meshes by other combinatorial arrangements. Numerical optimization plays a major role but in itself would be powerless without geometric understanding. Summing up, architectural geometry has become a rewarding field of study. We here survey the main directions which have been pursued, we show real projects where geometric considerations have played a role, and we outline open problems which we think are significant for the future development of both theory and practice of architectural geometry.

  7. Architectural geometry

    KAUST Repository

    Pottmann, Helmut; Eigensatz, Michael; Vaxman, Amir; Wallner, Johannes

    2014-01-01

    Around 2005 it became apparent in the geometry processing community that freeform architecture contains many problems of a geometric nature to be solved, and many opportunities for optimization which however require geometric understanding. This area of research, which has been called architectural geometry, meanwhile contains a great wealth of individual contributions which are relevant in various fields. For mathematicians, the relation to discrete differential geometry is significant, in particular the integrable system viewpoint. Besides, new application contexts have become available for quite some old-established concepts. Regarding graphics and geometry processing, architectural geometry yields interesting new questions but also new objects, e.g. replacing meshes by other combinatorial arrangements. Numerical optimization plays a major role but in itself would be powerless without geometric understanding. Summing up, architectural geometry has become a rewarding field of study. We here survey the main directions which have been pursued, we show real projects where geometric considerations have played a role, and we outline open problems which we think are significant for the future development of both theory and practice of architectural geometry.

  8. Solving the integration problem of one transistor one memristor architecture with a Bi-layer IGZO film through synchronous process

    Science.gov (United States)

    Chang, Che-Chia; Liu, Po-Tsun; Chien, Chen-Yu; Fan, Yang-Shun

    2018-04-01

    This study demonstrates the integration of a thin film transistor (TFT) and resistive random-access memory (RRAM) to form a one-transistor-one-resistor (1T1R) configuration. Based on the current conduction directions in RRAM and TFT, a triple-layer stack design of Pt/InGaZnO/Al2O3 is proposed for both the switching layer of the RRAM and the channel layer of the TFT. This proposal decreases the complexity of fabrication and the number of photomasks required. The 1T1R architecture also exhibits robust endurance and stable retention characteristics, making it promising for applications in memory-embedded flat panel displays.

  9. Elements of Architecture

    DEFF Research Database (Denmark)

    Elements of Architecture explores new ways of engaging architecture in archaeology. It conceives of architecture both as the physical evidence of past societies and as existing beyond the physical environment, considering how people in the past have not just dwelled in buildings but have existed...

  10. Digitally-Driven Architecture

    Directory of Open Access Journals (Sweden)

    Henriette Bier

    2014-07-01

    The shift from mechanical to digital forces architects to reposition themselves: Architects generate digital information, which can be used not only in designing and fabricating building components but also in embedding behaviours into buildings. This implies that, similar to the way that industrial design and fabrication with its concepts of standardisation and serial production influenced modernist architecture, digital design and fabrication influences contemporary architecture. While standardisation focused on processes of rationalisation of form, mass-customisation as a new paradigm that replaces mass-production, addresses non-standard, complex, and flexible designs. Furthermore, knowledge about the designed object can be encoded in digital data pertaining not just to the geometry of a design but also to its physical or other behaviours within an environment. Digitally-driven architecture implies, therefore, not only digitally-designed and fabricated architecture, it also implies architecture – built form – that can be controlled, actuated, and animated by digital means. In this context, this sixth Footprint issue examines the influence of digital means as pragmatic and conceptual instruments for actuating architecture. The focus is not so much on computer-based systems for the development of architectural designs, but on architecture incorporating digital control, sensing, actuating, or other mechanisms that enable buildings to interact with their users and surroundings in real time in the real world through physical or sensory change and variation.

  11. PICNIC Architecture.

    Science.gov (United States)

    Saranummi, Niilo

    2005-01-01

    The PICNIC architecture aims at supporting inter-enterprise integration and the facilitation of collaboration between healthcare organisations. The concept of a Regional Health Economy (RHE) is introduced to illustrate the varying nature of inter-enterprise collaboration between healthcare organisations collaborating in providing health services to citizens and patients in a regional setting. The PICNIC architecture comprises a number of PICNIC IT Services and the interfaces between them, and presents a way to assemble these into a functioning Regional Health Care Network (RHCN) meeting the needs and concerns of its stakeholders. The PICNIC architecture is presented through a number of views relevant to different stakeholder groups. The stakeholders of the first view are national and regional health authorities and policy makers. The view describes how the architecture enables the implementation of national and regional health policies, strategies and organisational structures. The stakeholders of the second view, the service viewpoint, are the care providers, health professionals, patients and citizens. The view describes how the architecture supports and enables regional care delivery and process management including continuity of care (shared care) and citizen-centred health services. The stakeholders of the third view, the engineering view, are those that design, build and implement the RHCN. The view comprises four sub-views: software engineering, IT services engineering, security and data. The proposed architecture is grounded in the mainstream of how distributed computing environments are evolving. The architecture is realised using the web services approach. A number of well established technology platforms and generic standards exist that can be used to implement the software components. The software components that are specified in PICNIC are implemented in Open Source.

  12. Memory and innovation in the spaces of higher education. The contribution of the architectural limit

    Directory of Open Access Journals (Sweden)

    Pablo Campos Calvo-Sotelo

    2016-03-01

    The current panorama of Higher Education calls for a review of the space/time places where teaching/learning processes are hosted. The spatial consequences derived from innovation in teaching demand the incorporation of new academic places, as alternatives to the traditional typology of the classroom, in order to optimize the integral formation of the student (the ultimate mission of Universities). The historical and obsolete architectural design of the classroom, as a rigid space/time container, must start a process of de-materialization, in such a way that it fosters more versatile learning methods that can be activated at any time and place. To accomplish such a goal, more creative settings must be generated, adapted to a modern understanding of the idea of learning, which must abandon its old-fashioned passive and static format in order to be transformed into a dynamic modality, close to the student and committed to him. Innovation regarding the architectural configuration is directly connected to the nature and transformation of the architectural limits which embrace and give shape to those places where formation occurs. The current demand for diversification and flexibility in learning areas must be satisfied by means of a correct articulation between the internal space of the classroom and its direct surrounding context, together with its social and cultural environment. Spatial and visual continuity generate new atmospheres that increase the quality of the teaching/learning processes. The new course of Higher Education needs a proactive review of the space/time dimensions of the traditional classroom, associated with the paradigm shift affecting modalities of teaching/learning, with the aim of generating new opportunities for innovation in Universities.

  13. Architectural Theatricality

    DEFF Research Database (Denmark)

    Tvedebrink, Tenna Doktor Olsen

    environments and a knowledge gap therefore exists in present hospital designs. Consequently, the purpose of this thesis has been to investigate whether any research-based knowledge exists supporting the hypothesis that the interior architectural qualities of eating environments influence patient food intake, health...... and well-being, as well as to outline a set of basic design principles ‘predicting’ the future interior architectural qualities of patient eating environments. Methodologically the thesis is based on an explorative study employing an abductive approach and hermeneutic-interpretative strategy utilizing tactics...... and food intake, as well as a series of references exist linking the interior architectural qualities of healthcare environments with the health and wellbeing of patients. On the basis of these findings, the thesis presents the concept of Architectural Theatricality as well as a set of design principles...

  14. Prospective memory, working memory, retrospective memory and self-rated memory performance in persons with intellectual disability

    OpenAIRE

    Levén, Anna; Lyxell, Björn; Andersson, Jan; Danielsson, Henrik; Rönnberg, Jerker

    2008-01-01

    The purpose of the present study was to examine the relationship between prospective memory, working memory, retrospective memory and self-rated memory capacity in adults with and without intellectual disability. Prospective memory was investigated by means of a picture-based task. Working memory was measured as performance on span tasks. Retrospective memory was scored as recall of subject performed tasks. Self-ratings of memory performance were based on the prospective and retrospective mem...

  15. Asymmetrical access to color and location in visual working memory.

    Science.gov (United States)

    Rajsic, Jason; Wilson, Daryl E

    2014-10-01

    Models of visual working memory (VWM) have benefitted greatly from the use of the delayed-matching paradigm. However, in this task, the ability to recall a probed feature is confounded with the ability to maintain the proper binding between the feature that is to be reported and the feature (typically location) that is used to cue a particular item for report. Given that location is typically used as a cue-feature, we used the delayed-estimation paradigm to compare memory for location to memory for color, rotating which feature was used as a cue and which was reported. Our results revealed several novel findings: 1) the likelihood of reporting a probed object's feature was superior when reporting location with a color cue than when reporting color with a location cue; 2) location report errors were composed entirely of swap errors, with little to no random location reports; and 3) both color and location reports greatly benefitted from the presence of nonprobed items at test. This last finding suggests that it is uncertainty over the bindings between locations and colors at memory retrieval, not at encoding, that drives swap errors. We interpret our findings as consistent with a representational architecture that nests remembered object features within remembered locations.

  16. Architecture of Institution & Home. Architecture as Cultural Medium

    NARCIS (Netherlands)

    Robinson, J.W.

    2004-01-01

    This dissertation addresses how architecture functions as a cultural medium. It does so by investigating how the architecture of institution and home each construct and support different cultural practices. By studying the design of ordinary settings in terms of how qualitative differences in

  17. On Detailing in Contemporary Architecture

    DEFF Research Database (Denmark)

    Kristensen, Claus; Kirkegaard, Poul Henning

    2010-01-01

    Details in architecture have a significant influence on how architecture is experienced. One can touch the materials and analyse the detailing - thus details give valuable information about the architectural scheme as a whole. The absence of perceptual stimulation like details and materiality...... / tactility can blur the meaning of the architecture and turn it into an empty statement. The present paper will outline detailing in contemporary architecture and discuss the issue with respect to architectural quality. Architectural cases considered sublime pieces of architecture will be presented...

  18. FY 1996 Report on the industrial science and technology research and development project. R and D of brain type computer architecture; 1996 nendo nogata computer architecture no kenkyu kaihatsu seika hokokusho

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-03-01

    The objective of this project is to develop an information processing device based on a completely new architecture, in order to technologically realize human-oriented information processing mechanisms, e.g., memory, learning, association of ideas, perception, intuition and value judgement. Described herein are the FY 1996 results. For development of an LSI based on a neural network in the primary visual cortex, it is confirmed that the basic circuit structure comprising the position-signal generators, memories, signal selectors and adders is suitable for development of the LSI circuit for a neural network function (Hough transform). For development of the realtime parallel distributed processor (RPDP), the basic specifications are established for, e.g., local memory capacity of the RPDP, functions incorporated in the RPDP and number of RPDPs incorporated in the RPDP chip, operating frequency and clock supply method, and estimated power consumption and package, in order to realize the RPDP chip. For development and advanced evaluation of a large-scale neural network silicon chip, learning rules, cell models and failure-detection circuits are incorporated into the chip developed by the advanced research project, and an evaluation substrate carrying this chip is designed. The evaluation methods and implementation procedures are drawn up. (NEDO)

  19. Systemic Architecture

    DEFF Research Database (Denmark)

    Poletto, Marco; Pasquero, Claudia

    -up or tactical design, behavioural space and the boundary of the natural and the artificial realms within the city and architecture. A new kind of "real-time world-city" is illustrated in the form of an operational design manual for the assemblage of proto-architectures, the incubation of proto-gardens...... and the coding of proto-interfaces. These prototypes of machinic architecture materialize as synthetic hybrids embedded with biological life (proto-gardens), computational power, behavioural responsiveness (cyber-gardens), spatial articulation (coMachines and fibrous structures), remote sensing (FUNclouds...

  20. Architectural practice and theory: the case of Bruno Taut's house in Berlin-Dahlewitz

    Directory of Open Access Journals (Sweden)

    Paola Ardizzola

    2017-06-01

    In 1926 Bruno Taut built his own house in Berlin-Dahlewitz. The German architect had already declared his ideas on housing in the book Die neue Wohnung (1924), exemplifying the new concept of modern living style according to Neues Bauen. In other theoretical writings he defines the Neues Bauen in relation to new needs, tendencies and aesthetics of architecture, referring to important issues such as climate, topography and tradition. The book Ein Wohnhaus (1927) documents the coeval construction process of his house: the thirteen chapters are a detailed analysis which gives evidence for every technological and morphological choice. Taut focuses on the relationship between architecture and landscape, type of furniture, functional plan layout, and use of glass; in particular he enlightens the reader as to the use of colour as a construction material. The house has an unconventional shape, a quarter of a circle; in his writings the architect painstakingly explains the impressive plan. With the book Ein Wohnhaus Taut delivers to memory his home design, transforming process and ideas related to the modern house. He breaks through conventions and changes the notions of what Modernism could produce. The paper highlights the theoretical production related to the architect’s own house as praxis for doing architecture, emphasizing Taut’s contribution to a dialectic mutual relationship between theoretical and architectural practice, in order to achieve a more conscious and effective design process.