WorldWideScience

Sample records for fault-tolerant distributed computing

  1. Cooperative Fault Tolerant Distributed Computing

    Energy Technology Data Exchange (ETDEWEB)

    Fagg, Graham E.

    2006-03-15

    HARNESS was proposed as a system that combined the best of the emerging technologies found in current distributed computing research and commercial products into a very flexible, dynamically adaptable framework that applications could use to evolve and better handle their execution environment. The HARNESS system was designed using the considerable experience gained from previous projects such as PVM, MPI, IceT and CUMULVS. As such, the system was designed to avoid the common problems found with these systems: it has no single point of failure and can survive machine, node and software failures. Additional features included improved inter-component connectivity, with full support for dynamic downloading of additional components at run-time, thus reducing the pressure on application developers to build in all the libraries they need in advance.

  2. Fault tolerant computing systems

    International Nuclear Information System (INIS)

    Randell, B.

    1981-01-01

    Fault tolerance involves the provision of strategies for error detection, damage assessment, fault treatment and error recovery. A survey is given of the different sorts of strategies used in highly reliable computing systems, together with an outline of recent research on the problems of providing fault tolerance in parallel and distributed computing systems. (orig.)

  3. Algorithm-dependent fault tolerance for distributed computing

    Energy Technology Data Exchange (ETDEWEB)

    P. D. Hough; M. E. Goldsby; E. J. Walsh

    2000-02-01

    Large-scale distributed systems assembled from commodity parts, like CPlant, have become common tools in the distributed computing world. Because of their size and diversity of parts, these systems are prone to failures. Applications that are being run on these systems have not been equipped to efficiently deal with failures, nor is there vendor support for fault tolerance. Thus, when a failure occurs, the application crashes. While most programmers make use of checkpoints to allow for restarting of their applications, this is cumbersome and incurs substantial overhead. In many cases, there are more efficient and more elegant ways in which to address failures. The goal of this project is to develop a software architecture for the detection of and recovery from faults in a cluster computing environment. The detection phase relies on the latest techniques developed in the fault tolerance community. Recovery is being addressed in an application-dependent manner, thus allowing the programmer to take advantage of algorithmic characteristics to reduce the overhead of fault tolerance. This architecture will allow large-scale applications to be more robust in high-performance computing environments that are comprised of clusters of commodity computers such as CPlant and SMP clusters.

  4. Fault Tolerant Computer Architecture

    CERN Document Server

    Sorin, Daniel

    2009-01-01

    For many years, most computer architects have pursued one primary goal: performance. Architects have translated the ever-increasing abundance of ever-faster transistors provided by Moore's law into remarkable increases in performance. Recently, however, the bounty provided by Moore's law has been accompanied by several challenges that have arisen as devices have become smaller, including a decrease in dependability due to physical faults. In this book, we focus on the dependability challenge and the fault tolerance solutions that architects are developing to overcome it. The two main purposes

  5. A study of standard building blocks for the design of fault-tolerant distributed computer systems

    Science.gov (United States)

    Rennels, D. A.; Avizienis, A.; Ercegovac, M.

    1978-01-01

    This paper presents the results of a study that has established a standard set of four semiconductor VLSI building-block circuits. These circuits can be assembled with off-the-shelf microprocessors and semiconductor memory modules into fault-tolerant distributed computer configurations. The resulting multi-computer architecture uses self-checking computer modules backed up by a limited number of spares. A redundant bus system is employed for communication between computer modules.

  6. Designing Fault Tolerance Strategy by Iterative Redundancy for Component-Based Distributed Computing Systems

    Directory of Open Access Journals (Sweden)

    Hui Wang

    2014-01-01

    Reliability is a critical issue for component-based distributed computing systems; some distributed software allows large numbers of potentially faulty components to exist on an open network. Faults are inevitable in this large-scale, complex, distributed component setting, which may include many untrustworthy parts. How to provide highly reliable component-based distributed systems is a challenging problem and a critical research topic. Generally, redundancy and replication are used to achieve fault tolerance. In this paper, we propose a CFI (critical fault iterative) redundancy technique, which makes efficient use of resources (e.g., computation and storage) to create fault-tolerant applications. When operating in an environment where the components’ reliability is unknown, CFI redundancy is more efficient and adaptive than other techniques (e.g., K-Modular Redundancy and N-Version Programming). In the CFI redundancy strategy, the function invocation relationships and invocation frequencies are used to rank the functions’ importance and identify the most vulnerable function implemented via functionally equivalent components. A tradeoff has to be made between efficiency and reliability. In this paper, a formal theoretical analysis and an experimental analysis are presented. Compared with existing methods, the reliability of a component-based distributed system can be greatly improved by tolerating faults in a small number of significant components.
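
    As a point of reference for the baseline named in this abstract, K-Modular Redundancy can be sketched as a strict-majority vote over functionally equivalent components. This is a minimal illustration, not the CFI technique itself; the names `k_modular_vote`, `good` and `faulty` are hypothetical and do not come from the paper.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence


def k_modular_vote(replicas: Sequence[Callable[[int], Hashable]], x: int) -> Hashable:
    """Run every replica on x and return the result a strict majority agrees on."""
    results = [replica(x) for replica in replicas]
    value, count = Counter(results).most_common(1)[0]
    # A strict majority is required; a tie means the vote is inconclusive.
    if count * 2 <= len(results):
        raise RuntimeError("no strict majority among replica results")
    return value


def good(x: int) -> int:
    """Hypothetical correct component."""
    return x * x


def faulty(x: int) -> int:
    """Hypothetical faulty component that returns a wrong answer."""
    return x * x + 1


if __name__ == "__main__":
    print(k_modular_vote([good, good, faulty], 3))
```

    With three replicas, one faulty component is outvoted; with only two replicas that disagree, the vote is inconclusive and the sketch raises an error instead of guessing.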

  7. Software fault tolerance in computer operating systems

    Science.gov (United States)

    Iyer, Ravishankar K.; Lee, Inhwan

    1994-01-01

    This chapter provides data and analysis of the dependability and fault tolerance for three operating systems: the Tandem/GUARDIAN fault-tolerant system, the VAX/VMS distributed system, and the IBM/MVS system. Based on measurements from these systems, basic software error characteristics are investigated. Fault tolerance in operating systems resulting from the use of process pairs and recovery routines is evaluated. Two levels of models are developed to analyze error and recovery processes inside an operating system and interactions among multiple instances of an operating system running in a distributed environment. The measurements show that the use of process pairs in Tandem systems, which was originally intended for tolerating hardware faults, allows the system to tolerate about 70% of defects in system software that result in processor failures. The loose coupling between processors which results in the backup execution (the processor state and the sequence of events occurring) being different from the original execution is a major reason for the measured software fault tolerance. The IBM/MVS system fault tolerance almost doubles when recovery routines are provided, in comparison to the case in which no recovery routines are available. However, even when recovery routines are provided, there is almost a 50% chance of system failure when critical system jobs are involved.

  8. Fault Injection and Monitoring Capability for a Fault-Tolerant Distributed Computation System

    Science.gov (United States)

    Torres-Pomales, Wilfredo; Yates, Amy M.; Malekpour, Mahyar R.

    2010-01-01

    The Configurable Fault-Injection and Monitoring System (CFIMS) is intended for the experimental characterization of effects caused by a variety of adverse conditions on a distributed computation system running flight control applications. A product of research collaboration between NASA Langley Research Center and Old Dominion University, the CFIMS is the main research tool for generating actual fault response data with which to develop and validate analytical performance models and design methodologies for the mitigation of fault effects in distributed flight control systems. Rather than a fixed design solution, the CFIMS is a flexible system that enables the systematic exploration of the problem space and can be adapted to meet the evolving needs of the research. The CFIMS has the capabilities of system-under-test (SUT) functional stimulus generation, fault injection and state monitoring, all of which are supported by a configuration capability for setting up the system as desired for a particular experiment. This report summarizes the work accomplished so far in the development of the CFIMS concept and documents the first design realization.

  9. Distributed consensus and fault tolerance - Lecture 2

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    In a world where clusters with thousands of nodes are becoming commonplace, we are often faced with the task of having them coordinate and share state. As the number of machines goes up, so does the probability that something goes wrong: a node could temporarily lose connectivity, crash because of some race condition, or have its hard drive fail. What are the challenges when designing fault-tolerant distributed systems, where a cluster is able to survive the loss of individual nodes? In this lecture, we will discuss some basics on this topic (consistency models, CAP theorem, failure modes, Byzantine faults), detail the Raft consensus algorithm, and showcase an interesting example of a highly resilient distributed system, Bitcoin.
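
    The scaling argument in this abstract can be made concrete with a small calculation. The sketch below assumes that nodes fail independently with the same probability `p`, which is a simplification that the lecture abstract does not state; the function names are hypothetical. It also shows how many crashed nodes a majority-based protocol such as Raft can survive in a cluster of `n` nodes.

```python
def p_any_failure(p: float, n: int) -> float:
    """Probability that at least one of n nodes fails, assuming independent failures."""
    return 1.0 - (1.0 - p) ** n


def majority_tolerance(n: int) -> int:
    """Crashed nodes a majority-quorum cluster of n nodes can lose and still make progress."""
    return (n - 1) // 2


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, round(p_any_failure(0.001, n), 4), majority_tolerance(n))
```

    Under these assumptions, a per-node failure probability of 0.1% already gives better-than-even odds of at least one failure in a 1000-node cluster.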

  10. Distributed consensus and fault tolerance - Lecture 1

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    In a world where clusters with thousands of nodes are becoming commonplace, we are often faced with the task of having them coordinate and share state. As the number of machines goes up, so does the probability that something goes wrong: a node could temporarily lose connectivity, crash because of some race condition, or have its hard drive fail. What are the challenges when designing fault-tolerant distributed systems, where a cluster is able to survive the loss of individual nodes? In this lecture, we will discuss some basics on this topic (consistency models, CAP theorem, failure modes, Byzantine faults), detail the Raft consensus algorithm, and showcase an interesting example of a highly resilient distributed system, Bitcoin.

  11. Method and system for environmentally adaptive fault tolerant computing

    Science.gov (United States)

    Copenhaver, Jason L. (Inventor); Jeremy, Ramos (Inventor); Wolfe, Jeffrey M. (Inventor); Brenner, Dean (Inventor)

    2010-01-01

    A method and system for adapting fault tolerant computing. The method includes the steps of measuring an environmental condition representative of an environment. An on-board processing system's sensitivity to the measured environmental condition is measured. It is determined whether to reconfigure a fault tolerance of the on-board processing system based in part on the measured environmental condition. The fault tolerance of the on-board processing system may be reconfigured based in part on the measured environmental condition.

  12. Fault-tolerant power distribution system

    Science.gov (United States)

    Volp, Jeffrey A. (Inventor)

    1987-01-01

    A fault-tolerant power distribution system which includes a plurality of power sources and a plurality of nodes responsive thereto for supplying power to one or more loads associated with each node. Each node includes a plurality of switching circuits, each of which preferably uses a power field effect transistor which provides a diode operation when power is first applied to the nodes and which thereafter provides bi-directional current flow through the switching circuit in a manner such that a low voltage drop is produced in each direction. Each switching circuit includes circuitry for disabling the power field effect transistor when the current in the switching circuit exceeds a preselected value.

  13. Fault-tolerant search algorithms: reliable computation with unreliable information

    CERN Document Server

    Cicalese, Ferdinando

    2013-01-01

    Why a book on fault-tolerant search algorithms? Searching is one of the fundamental problems in computer science. Time and again algorithmic and combinatorial issues originally studied in the context of search find application in the most diverse areas of computer science and discrete mathematics. On the other hand, fault-tolerance is a necessary ingredient of computing. Due to their inherent complexity, information systems are naturally prone to errors, which may appear at any level - as imprecisions in the data, bugs in the software, or transient or permanent hardware failures. This book pr

  14. Multiple Embedded Processors for Fault-Tolerant Computing

    Science.gov (United States)

    Bolotin, Gary; Watson, Robert; Katanyoutanant, Sunant; Burke, Gary; Wang, Mandy

    2005-01-01

    A fault-tolerant computer architecture has been conceived in an effort to reduce vulnerability to single-event upsets (spurious bit flips caused by impingement of energetic ionizing particles or photons). As in some prior fault-tolerant architectures, the redundancy needed for fault tolerance is obtained by use of multiple processors in one computer. Unlike prior architectures, the multiple processors are embedded in a single field-programmable gate array (FPGA). What makes this new approach practical is the recent commercial availability of FPGAs that are capable of having multiple embedded processors. A working prototype (see figure) consists of two embedded IBM PowerPC 405 processor cores and a comparator built on a Xilinx Virtex-II Pro FPGA. This relatively simple instantiation of the architecture implements an error-detection scheme. A planned future version, incorporating four processors and two comparators, would correct some errors in addition to detecting them.
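
    The error-detection scheme described here (two processors plus a comparator) can be illustrated in software as duplication with comparison. This is only a sketch of the general idea, not the FPGA design from the record; `lockstep`, `compute` and `compute_with_upset` are hypothetical names, and the bit flip simulates a single-event upset.

```python
from typing import Callable, Optional


def lockstep(core_a: Callable[[int], int], core_b: Callable[[int], int], x: int) -> Optional[int]:
    """Run both cores on x; return the result if they agree, or None on a mismatch."""
    a, b = core_a(x), core_b(x)
    return a if a == b else None


def compute(x: int) -> int:
    """Hypothetical workload executed by a healthy core."""
    return 3 * x + 1


def compute_with_upset(x: int) -> int:
    """Same workload with one result bit flipped to simulate a single-event upset."""
    return compute(x) ^ 0b100


if __name__ == "__main__":
    print(lockstep(compute, compute, 5))
    print(lockstep(compute, compute_with_upset, 5))
```

    Two copies and one comparator can reveal that an error occurred but not which copy is wrong, which is consistent with the abstract's note that correction is planned for a version with more processors and comparators.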

  15. Performance-Oriented Fault Tolerance in Computing Systems

    NARCIS (Netherlands)

    Borodin, D.

    2010-01-01

    In this dissertation we address the overhead reduction of fault tolerance (FT) techniques. Due to technology trends such as decreasing feature sizes and lowering voltage levels, FT is becoming increasingly important in modern computing systems. FT techniques are based on some form of redundancy. It

  16. Abstractions for Fault-Tolerant Distributed System Verification

    Science.gov (United States)

    Pike, Lee S.; Maddalon, Jeffrey M.; Miner, Paul S.; Geser, Alfons

    2004-01-01

    Four kinds of abstraction for the design and analysis of fault tolerant distributed systems are discussed. These abstractions concern system messages, faults, fault masking voting, and communication. The abstractions are formalized in higher order logic, and are intended to facilitate specifying and verifying such systems in higher order theorem provers.

  17. Clouds: A support architecture for fault tolerant, distributed systems

    Science.gov (United States)

    Dasgupta, P.; Leblanco, R. J., Jr.

    1986-01-01

    Clouds is a distributed operating system providing support for fault tolerance, location independence, reconfiguration, and transactions. The implementation paradigm uses objects and nested actions as building blocks. Subsystems and applications that can be supported by Clouds to further enhance the performance and utility of the system are also discussed.

  18. Effective Fault-Tolerant Quantum Computation with Slow Measurements

    International Nuclear Information System (INIS)

    DiVincenzo, David P.; Aliferis, Panos

    2007-01-01

    How important is fast measurement for fault-tolerant quantum computation? Using a combination of existing and new ideas, we argue that measurement times as long as even 1000 gate times or more have a very minimal effect on the quantum accuracy threshold. This shows that slow measurement, which appears to be unavoidable in many implementations of quantum computing, poses no essential obstacle to scalability.

  19. Fault-tolerant clock synchronization validation methodology. [in computer systems]

    Science.gov (United States)

    Butler, Ricky W.; Palumbo, Daniel L.; Johnson, Sally C.

    1987-01-01

    A validation method for the synchronization subsystem of a fault-tolerant computer system is presented. The high reliability requirement of flight-crucial systems precludes the use of most traditional validation methods. The method presented utilizes formal design proof to uncover design and coding errors and experimentation to validate the assumptions of the design proof. The experimental method is described and illustrated by validating the clock synchronization system of the Software Implemented Fault Tolerance computer. The design proof of the algorithm includes a theorem that defines the maximum skew between any two nonfaulty clocks in the system in terms of specific system parameters. Most of these parameters are deterministic. One crucial parameter is the upper bound on the clock read error, which is stochastic. The probability that this upper bound is exceeded is calculated from data obtained by the measurement of system parameters. This probability is then included in a detailed reliability analysis of the system.

  20. Faster quantum chemistry simulation on fault-tolerant quantum computers

    International Nuclear Information System (INIS)

    Cody Jones, N; McMahon, Peter L; Yamamoto, Yoshihisa; Whitfield, James D; Yung, Man-Hong; Aspuru-Guzik, Alán; Van Meter, Rodney

    2012-01-01

    Quantum computers can in principle simulate quantum physics exponentially faster than their classical counterparts, but some technical hurdles remain. We propose methods which substantially improve the performance of a particular form of simulation, ab initio quantum chemistry, on fault-tolerant quantum computers; these methods generalize readily to other quantum simulation problems. Quantum teleportation plays a key role in these improvements and is used extensively as a computing resource. To improve execution time, we examine techniques for constructing arbitrary gates which perform substantially faster than circuits based on the conventional Solovay–Kitaev algorithm (Dawson and Nielsen 2006 Quantum Inform. Comput. 6 81). For a given approximation error ϵ, arbitrary single-qubit gates can be produced fault-tolerantly and using a restricted set of gates in time which is O(log ϵ) or O(log log ϵ); with sufficient parallel preparation of ancillas, constant average depth is possible using a method we call programmable ancilla rotations. Moreover, we construct and analyze efficient implementations of first- and second-quantized simulation algorithms using the fault-tolerant arbitrary gates and other techniques, such as implementing various subroutines in constant time. A specific example we analyze is the ground-state energy calculation for lithium hydride. (paper)

  1. Verifiable fault tolerance in measurement-based quantum computation

    Science.gov (United States)

    Fujii, Keisuke; Hayashi, Masahito

    2017-09-01

    Quantum systems, in general, cannot be simulated efficiently by a classical computer, and hence are useful for solving certain mathematical problems and simulating quantum many-body systems. This also implies, unfortunately, that verification of the output of the quantum systems is not so trivial, since predicting the output is exponentially hard. As another problem, the quantum system is very sensitive to noise and thus needs error correction. Here, we propose a framework for verification of the output of fault-tolerant quantum computation in a measurement-based model. In contrast to existing analyses of fault tolerance, we do not assume any noise model on the resource state; instead, an arbitrary resource state is tested by using only single-qubit measurements to verify whether or not the output of measurement-based quantum computation on it is correct. Verifiability is provided by a constant-time repetition of the original measurement-based quantum computation in appropriate measurement bases. Since full characterization of quantum noise is exponentially hard for large-scale quantum computing systems, our framework provides an efficient way to practically verify experimental quantum error correction.

  2. Fault tolerance in computational grids: perspectives, challenges, and issues.

    Science.gov (United States)

    Haider, Sajjad; Nazir, Babar

    2016-01-01

    Computational grids are established with the intention of providing shared access to hardware- and software-based resources, with special reference to increased computational capabilities. Fault tolerance is one of the most important issues faced by computational grids. The main contribution of this survey is the creation of an extended classification of problems that occur in computational grid environments. The proposed classification will help researchers, developers, and maintainers of grids to understand the types of issues to be anticipated. Moreover, different types of problems, such as omission, interaction, and timing-related problems, have been identified that need to be handled on various layers of the computational grid. In this survey, fault tolerance and fault detection mechanisms are also analyzed and examined. Our conclusion is that a dependable and reliable grid can only be established when more emphasis is placed on fault identification. Moreover, our survey reveals that adaptive and intelligent fault identification and tolerance techniques can improve the dependability of grid working environments.

  3. Fault-tolerant clock synchronization in distributed systems

    Science.gov (United States)

    Ramanathan, Parameswaran; Shin, Kang G.; Butler, Ricky W.

    1990-01-01

    Existing fault-tolerant clock synchronization algorithms are compared and contrasted. These include the following: software synchronization algorithms, such as convergence-averaging, convergence-nonaveraging, and consistency algorithms, as well as probabilistic synchronization; hardware synchronization algorithms; and hybrid synchronization. The worst-case clock skews guaranteed by representative algorithms are compared, along with other important aspects such as time, message, and cost overhead imposed by the algorithms. More recent developments such as hardware-assisted software synchronization and algorithms for synchronizing large, partially connected distributed systems are especially emphasized.
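
    To illustrate the flavor of the convergence-based software algorithms mentioned here, the sketch below computes a fault-tolerant midpoint: each node discards the `f` lowest and `f` highest clock readings and takes the midpoint of what remains. This is an illustrative choice in the style of the Welch-Lynch algorithm, not necessarily one of the specific algorithms compared in the survey; the function name and the `n >= 3f + 1` check are assumptions of this sketch.

```python
from typing import Sequence


def fault_tolerant_midpoint(readings: Sequence[float], f: int) -> float:
    """Midpoint of the clock readings left after dropping the f lowest and f highest."""
    if f < 0 or len(readings) < 3 * f + 1:
        raise ValueError("need at least 3f + 1 readings to tolerate f faulty clocks")
    ordered = sorted(readings)
    trimmed = ordered[f:len(ordered) - f]
    # Any value a faulty clock could push to an extreme has been discarded.
    return (trimmed[0] + trimmed[-1]) / 2


if __name__ == "__main__":
    # One wildly wrong clock (500.0) among four readings, tolerating f = 1.
    print(fault_tolerant_midpoint([10.0, 10.2, 9.9, 500.0], f=1))
```

    Because up to `f` extreme values are dropped on each side, a single arbitrarily wrong clock cannot drag the correction outside the range spanned by the correct clocks.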

  4. Fault-tolerant quantum computation with nondeterministic entangling gates

    Science.gov (United States)

    Auger, James M.; Anwar, Hussain; Gimeno-Segovia, Mercedes; Stace, Thomas M.; Browne, Dan E.

    2018-03-01

    Performing entangling gates between physical qubits is necessary for building a large-scale universal quantum computer, but in some physical implementations—for example, those that are based on linear optics or networks of ion traps—entangling gates can only be implemented probabilistically. In this work, we study the fault-tolerant performance of a topological cluster state scheme with local nondeterministic entanglement generation, where failed entangling gates (which correspond to bonds on the lattice representation of the cluster state) lead to a defective three-dimensional lattice with missing bonds. We present two approaches for dealing with missing bonds; the first is a nonadaptive scheme that requires no additional quantum processing, and the second is an adaptive scheme in which qubits can be measured in an alternative basis to effectively remove them from the lattice, hence eliminating their damaging effect and leading to better threshold performance. We find that a fault-tolerance threshold can still be observed with a bond-loss rate of 6.5% for the nonadaptive scheme, and a bond-loss rate as high as 14.5% for the adaptive scheme.

  5. Coordinated Fault Tolerance for High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Dongarra, Jack; Bosilca, George; et al.

    2013-04-08

    Our work to meet our goal of end-to-end fault tolerance has focused on two areas: (1) improving fault tolerance in various software currently available and widely used throughout the HEC domain and (2) using fault information exchange and coordination to achieve holistic, systemwide fault tolerance and understanding how to design and implement interfaces for integrating fault tolerance features for multiple layers of the software stack, from the application, math libraries, and programming language runtime to other common system software such as job schedulers, resource managers, and monitoring tools.

  6. Experimental magic state distillation for fault-tolerant quantum computing.

    Science.gov (United States)

    Souza, Alexandre M; Zhang, Jingfu; Ryan, Colm A; Laflamme, Raymond

    2011-01-25

    Any physical quantum device for quantum information processing (QIP) is subject to errors in implementation. In order to be reliable and efficient, quantum computers will need error-correcting or error-avoiding methods. Fault-tolerance achieved through quantum error correction will be an integral part of quantum computers. Of the many methods that have been discovered to implement it, a highly successful approach has been to use transversal gates and specific initial states. A critical element for its implementation is the availability of high-fidelity initial states, such as |0〉 and the 'magic state'. Here, we report an experiment, performed in a nuclear magnetic resonance (NMR) quantum processor, showing sufficient quantum control to improve the fidelity of imperfect initial magic states by distilling five of them into one with higher fidelity.

  7. Roads towards fault-tolerant universal quantum computation

    Science.gov (United States)

    Campbell, Earl T.; Terhal, Barbara M.; Vuillot, Christophe

    2017-09-01

    A practical quantum computer must not merely store information, but also process it. To prevent errors introduced by noise from multiplying and spreading, a fault-tolerant computational architecture is required. Current experiments are taking the first steps toward noise-resilient logical qubits. But to convert these quantum devices from memories to processors, it is necessary to specify how a universal set of gates is performed on them. The leading proposals for doing so, such as magic-state distillation and colour-code techniques, have high resource demands. Alternative schemes, such as those that use high-dimensional quantum codes in a modular architecture, have potential benefits, but need to be explored further.

  8. Single event upset tests of a RISC-based fault-tolerant computer

    Energy Technology Data Exchange (ETDEWEB)

    Kimbrough, J.R.; Butner, D.N.; Colella, N.J.; Kaschmitter, J.L.; Shaeffer, D.L.; McKnett, C.L.; Coakley, P.G.; Casteneda, C.

    1996-03-23

    The project successfully demonstrated that dual lock-step comparison of commercial RISC processors is a viable fault-tolerant approach to handling SEU in the space environment. The on-orbit error rate of the fault-tolerant approach was 38 times lower than the single-processor error rate. The random nature of the upsets and their appearance in critical code sections show that it is essential to incorporate both hardware and software in the design and operation of fault-tolerant computers.

  9. Quorums Systems as a Method to Enhance Collaboration for Achieving Fault Tolerance in Distributed System

    Directory of Open Access Journals (Sweden)

    Ioan PETRI

    2009-01-01

    A system that implements the Byzantine agreement algorithm is supposed to be very reliable and robust because of its fault-tolerating feature. For very realistic environments, Byzantine agreement protocols become inadequate, because they are based on the assumption that failures are controlled and have unlimited severity. The Byzantine agreement model works with a bounded number of failures that have to be tolerated. It is never concerned with identifying these failures or excluding them from the system. In this paper, we tackle quorum systems, a particular kind of distributed system in which some storage or computations are replicated on various machines on the expectation that some of them work correctly and produce a reliable output at a given moment in time. Thus, by majority voting collaboration with quorums, one can achieve fault tolerance in distributed systems. Further, we argue that an algorithm to identify faulty-behaving machines is useful for identifying purposeful malicious behaviors.
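
    The two ideas in this abstract, majority voting over replicated machines and identification of faulty-behaving machines, can be sketched together. The code below is a minimal illustration under assumed names (`quorum_read`, the machine labels `m1` to `m3`); it is not the algorithm from the paper.

```python
from collections import Counter
from typing import Dict, Hashable, List, Tuple


def quorum_read(responses: Dict[str, Hashable]) -> Tuple[Hashable, List[str]]:
    """Return the majority value and the machines whose answers disagree with it."""
    value, count = Counter(responses.values()).most_common(1)[0]
    # More than half of the responding machines must agree on the value.
    if count <= len(responses) // 2:
        raise RuntimeError("no majority quorum among the responses")
    suspects = sorted(name for name, answer in responses.items() if answer != value)
    return value, suspects


if __name__ == "__main__":
    value, suspects = quorum_read({"m1": 42, "m2": 42, "m3": 7})
    print(value, suspects)
```

    Returning the dissenting machines alongside the agreed value lets a caller track repeat offenders over time, which is one simple way to move from merely tolerating faults to identifying their source.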

  10. Distributed Fault-Tolerant Control of Networked Uncertain Euler-Lagrange Systems Under Actuator Faults.

    Science.gov (United States)

    Chen, Gang; Song, Yongduan; Lewis, Frank L

    2016-05-03

    This paper investigates the distributed fault-tolerant control problem of networked Euler-Lagrange systems with actuator and communication link faults. An adaptive fault-tolerant cooperative control scheme is proposed to achieve the coordinated tracking control of networked uncertain Lagrange systems on a general directed communication topology, which contains a spanning tree with the root node being the active target system. The proposed algorithm is capable of compensating for the actuator bias fault, the partial loss of effectiveness actuation fault, the communication link fault, the model uncertainty, and the external disturbance simultaneously. The control scheme does not use any fault detection and isolation mechanism to detect, separate, and identify the actuator faults online, which largely reduces the online computation and expedites the responsiveness of the controller. To validate the effectiveness of the proposed method, a multiple robot-arm cooperative control test-bed is developed for real-time verification. Experiments on the networked robot-arms are conducted and the results confirm the benefits and the effectiveness of the proposed distributed fault-tolerant control algorithms.

  11. An Autonomous Distributed Fault-Tolerant Local Positioning System

    Science.gov (United States)

    Malekpour, Mahyar R.

    2017-01-01

    We describe a fault-tolerant, GPS-independent (Global Positioning System) distributed autonomous positioning system for static/mobile objects and present solutions for providing highly-accurate geo-location data for the static/mobile objects in dynamic environments. The reliability and accuracy of a positioning system fundamentally depend on two factors: its timeliness in broadcasting signals and the knowledge of its geometry, i.e., locations and distances of the beacons. Existing distributed positioning systems either synchronize to a common external source like GPS or establish their own time synchrony using a master-slave-like scheme in which a particular beacon is designated as the master and the other beacons synchronize to it, resulting in a single point of failure. Another drawback of existing positioning systems is that they do not address various fault manifestations, in particular communication link failures, which, as in wireless networks, increasingly dominate process failures and are typically transient and mobile, in the sense that they typically affect different messages to/from different processes over time.

  12. Fault-tolerant measurement-based quantum computing with continuous-variable cluster states.

    Science.gov (United States)

    Menicucci, Nicolas C

    2014-03-28

    A long-standing open question about Gaussian continuous-variable cluster states is whether they enable fault-tolerant measurement-based quantum computation. The answer is yes. Initial squeezing in the cluster above a threshold value of 20.5 dB ensures that errors from finite squeezing acting on encoded qubits are below the fault-tolerance threshold of known qubit-based error-correcting codes. By concatenating with one of these codes and using ancilla-based error correction, fault-tolerant measurement-based quantum computation of theoretically indefinite length is possible with finitely squeezed cluster states.

  13. Error Mitigation of Point-to-Point Communication for Fault-Tolerant Computing

    Science.gov (United States)

    Akamine, Robert L.; Hodson, Robert F.; LaMeres, Brock J.; Ray, Robert E.

    2011-01-01

    Fault tolerant systems require the ability to detect and recover from physical damage caused by the hardware's environment, faulty connectors, and system degradation over time. This ability applies to military, space, and industrial computing applications. The integrity of Point-to-Point (P2P) communication, between two microcontrollers for example, is an essential part of fault tolerant computing systems. In this paper, different methods of fault detection and recovery are presented and analyzed.

  14. Energy-Aware Synthesis of Fault-Tolerant Schedules for Real-Time Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Poulsen, Kåre Harbo; Pop, Paul; Izosimov, Viacheslav

    2007-01-01

    This paper presents a design optimisation tool for distributed embedded real-time systems that 1) decides mapping, fault-tolerance policy and generates a fault-tolerant schedule, 2) is targeted for hard real-time, 3) has hard reliability goal, 4) generates static schedule for processes and messages, 5) provides fault-tolerance for k transient/soft faults, 6) optimises for minimal energy consumption, while considering impact of lowering voltages on the probability of faults, 7) uses constraint logic programming (CLP) based implementation.

  15. Fault-tolerant quantum computation with asymmetric Bacon-Shor codes

    Science.gov (United States)

    Brooks, Peter; Preskill, John

    2013-03-01

    We develop a scheme for fault-tolerant quantum computation based on asymmetric Bacon-Shor codes, which works effectively against highly biased noise dominated by dephasing. We find the optimal Bacon-Shor block size as a function of the noise strength and the noise bias, and estimate the logical error rate and overhead cost achieved by this optimal code. Our fault-tolerant gadgets, based on gate teleportation, are well suited for hardware platforms with geometrically local gates in two dimensions.

  16. Fault-Tolerant Consensus of Multi-Agent System With Distributed Adaptive Protocol.

    Science.gov (United States)

    Chen, Shun; Ho, Daniel W C; Li, Lulu; Liu, Ming

    2015-10-01

    In this paper, fault-tolerant consensus in multi-agent system using distributed adaptive protocol is investigated. Firstly, distributed adaptive online updating strategies for some parameters are proposed based on local information of the network structure. Then, under the online updating parameters, a distributed adaptive protocol is developed to compensate the fault effects and the uncertainty effects in the leaderless multi-agent system. Based on the local state information of neighboring agents, a distributed updating protocol gain is developed which leads to a fully distributed continuous adaptive fault-tolerant consensus protocol design for the leaderless multi-agent system. Furthermore, a distributed fault-tolerant leader-follower consensus protocol for multi-agent system is constructed by the proposed adaptive method. Finally, a simulation example is given to illustrate the effectiveness of the theoretical analysis.

  17. Enhanced fault-tolerant quantum computing in d-level systems.

    Science.gov (United States)

    Campbell, Earl T

    2014-12-05

    Error-correcting codes protect quantum information and form the basis of fault-tolerant quantum computing. Leading proposals for fault-tolerant quantum computation require codes with an exceedingly rare property, a transversal non-Clifford gate. Codes with the desired property are presented for d-level qudit systems with prime d. The codes use n=d-1 qudits and can detect up to ∼d/3 errors. We quantify the performance of these codes for one approach to quantum computation known as magic-state distillation. Unlike prior work, we find performance is always enhanced by increasing d.

  18. Fault-tolerant linear optical quantum computing with small-amplitude coherent States.

    Science.gov (United States)

    Lund, A P; Ralph, T C; Haselgrove, H L

    2008-01-25

    Quantum computing using two coherent states as a qubit basis is a proposed alternative architecture with lower overheads but has been questioned as a practical way of performing quantum computing due to the fragility of diagonal states with large coherent amplitudes. We show that using error correction only small amplitudes (α > 1.2) are required for fault-tolerant quantum computing. We study fault tolerance under the effects of small amplitudes and loss using a Monte Carlo simulation. The first encoding level resources are orders of magnitude lower than the best single photon scheme.

  19. Shadow Replication: An Energy-Aware, Fault-Tolerant Computational Model for Green Cloud Computing

    Directory of Open Access Journals (Sweden)

    Xiaolong Cui

    2014-08-01

    As the demand for cloud computing continues to increase, cloud service providers face the daunting challenge of meeting the negotiated SLA, in terms of reliability and timely performance, while achieving cost-effectiveness. This challenge is compounded by the increasing likelihood of failure in large-scale clouds and the rising impact of energy consumption and CO2 emission on the environment. This paper proposes Shadow Replication, a novel fault-tolerance model for cloud computing, which seamlessly addresses failure at scale, while minimizing energy consumption and reducing its impact on the environment. The basic tenet of the model is to associate a suite of shadow processes to execute concurrently with the main process, but initially at a much reduced execution speed, to overcome failures as they occur. Two computationally-feasible schemes are proposed to achieve Shadow Replication. A performance evaluation framework is developed to analyze these schemes and compare their performance to traditional replication-based fault tolerance methods, focusing on the inherent tradeoff between fault tolerance, the specified SLA and profit maximization. The results show that Shadow Replication leads to significant energy reduction, and is better suited for compute-intensive execution models, where up to 30% more profit can be achieved due to reduced energy consumption.

  20. Adaptive Fault Tolerance for Many-Core Based Space-Borne Computing

    Science.gov (United States)

    James, Mark; Springer, Paul; Zima, Hans

    2010-01-01

    This paper describes an approach to providing software fault tolerance for future deep-space robotic NASA missions, which will require a high degree of autonomy supported by an enhanced on-board computational capability. Such systems have become possible as a result of the emerging many-core technology, which is expected to offer 1024-core chips by 2015. We discuss the challenges and opportunities of this new technology, focusing on introspection-based adaptive fault tolerance that takes into account the specific requirements of applications, guided by a fault model. Introspection supports runtime monitoring of the program execution with the goal of identifying, locating, and analyzing errors. Fault tolerance assertions for the introspection system can be provided by the user, domain-specific knowledge, or via the results of static or dynamic program analysis. This work is part of an on-going project at the Jet Propulsion Laboratory in Pasadena, California.

  1. Implementing a strand of a scalable fault-tolerant quantum computing fabric.

    Science.gov (United States)

    Chow, Jerry M; Gambetta, Jay M; Magesan, Easwar; Abraham, David W; Cross, Andrew W; Johnson, B R; Masluk, Nicholas A; Ryan, Colm A; Smolin, John A; Srinivasan, Srikanth J; Steffen, M

    2014-06-24

    With favourable error thresholds and requiring only nearest-neighbour interactions on a lattice, the surface code is an error-correcting code that has garnered considerable attention. At the heart of this code is the ability to perform a low-weight parity measurement of local code qubits. Here we demonstrate high-fidelity parity detection of two code qubits via measurement of a third syndrome qubit. With high-fidelity gates, we generate entanglement distributed across three superconducting qubits in a lattice where each code qubit is coupled to two bus resonators. Via high-fidelity measurement of the syndrome qubit, we deterministically entangle the code qubits in either an even or odd parity Bell state, conditioned on the syndrome qubit state. Finally, to fully characterize this parity readout, we develop a measurement tomography protocol. The lattice presented naturally extends to larger networks of qubits, outlining a path towards fault-tolerant quantum computing.
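
    The parity-conditioned entanglement described above can be illustrated with an idealized state-vector sketch. This is not the experiment's gate sequence: it assumes both code qubits start in |+>, a syndrome qubit starts in |0>, and two ideal CNOTs write the code-qubit parity onto the syndrome qubit. The function name `parity_project` is hypothetical.

```python
import math
from itertools import product


def parity_project(outcome: int) -> dict:
    """Post-measurement state of two code qubits after an ideal parity check.

    Assumes (for illustration only) code qubits in |+>|+> and a noiseless
    syndrome qubit; returns amplitudes keyed by (q0, q1) basis labels.
    """
    amp = 0.5  # each computational basis state of |+>|+> has amplitude 1/2
    # Two CNOTs (code -> syndrome) leave the syndrome qubit holding q0 XOR q1.
    joint = {(q0, q1, q0 ^ q1): amp for q0, q1 in product((0, 1), repeat=2)}
    # Keep the branch matching the measured syndrome value, then renormalise.
    kept = {(q0, q1): a for (q0, q1, s), a in joint.items() if s == outcome}
    norm = math.sqrt(sum(a * a for a in kept.values()))
    return {basis: a / norm for basis, a in kept.items()}


even_state = parity_project(0)  # support on (0, 0) and (1, 1): even-parity Bell state
odd_state = parity_project(1)   # support on (0, 1) and (1, 0): odd-parity Bell state
```

    In this idealized picture the syndrome outcome alone decides which Bell state the code qubits end up in, which is the sense in which the entanglement is conditioned on the syndrome qubit state.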

  2. Step-by-step magic state encoding for efficient fault-tolerant quantum computation.

    Science.gov (United States)

    Goto, Hayato

    2014-12-16

    Quantum error correction allows one to make quantum computers fault-tolerant against unavoidable errors due to decoherence and imperfect physical gate operations. However, the fault-tolerant quantum computation requires impractically large computational resources for useful applications. This is a current major obstacle to the realization of a quantum computer. In particular, magic state distillation, which is a standard approach to universality, consumes the most resources in fault-tolerant quantum computation. For the resource problem, here we propose step-by-step magic state encoding for concatenated quantum codes, where magic states are encoded step by step from the physical level to the logical one. To manage errors during the encoding, we carefully use error detection. Since the sizes of intermediate codes are small, it is expected that the resource overheads will become lower than previous approaches based on the distillation at the logical level. Our simulation results suggest that the resource requirements for a logical magic state will become comparable to those for a single logical controlled-NOT gate. Thus, the present method opens a new possibility for efficient fault-tolerant quantum computation.

  3. Step-by-step magic state encoding for efficient fault-tolerant quantum computation

    Science.gov (United States)

    Goto, Hayato

    2014-01-01

    Quantum error correction allows one to make quantum computers fault-tolerant against unavoidable errors due to decoherence and imperfect physical gate operations. However, the fault-tolerant quantum computation requires impractically large computational resources for useful applications. This is a current major obstacle to the realization of a quantum computer. In particular, magic state distillation, which is a standard approach to universality, consumes the most resources in fault-tolerant quantum computation. For the resource problem, here we propose step-by-step magic state encoding for concatenated quantum codes, where magic states are encoded step by step from the physical level to the logical one. To manage errors during the encoding, we carefully use error detection. Since the sizes of intermediate codes are small, it is expected that the resource overheads will become lower than previous approaches based on the distillation at the logical level. Our simulation results suggest that the resource requirements for a logical magic state will become comparable to those for a single logical controlled-NOT gate. Thus, the present method opens a new possibility for efficient fault-tolerant quantum computation. PMID:25511387

  4. Fault-tolerant quantum computation for local non-Markovian noise

    International Nuclear Information System (INIS)

    Terhal, Barbara M.; Burkard, Guido

    2005-01-01

    We derive a threshold result for fault-tolerant quantum computation for local non-Markovian noise models. The role of error amplitude in our analysis is played by the product of the elementary gate time t_0 and the spectral width of the interaction Hamiltonian between system and bath. We discuss extensions of our model and the applicability of our analysis.

  5. Fault Tolerant Distributed Portfolio Optimization in Smart Grids

    DEFF Research Database (Denmark)

    Juelsgaard, Morten; Wisniewski, Rafal; Bendtsen, Jan Dimon

    2014-01-01

    This work considers a portfolio of units for electrical power production and the problem of utilizing it to maintain power balance in the electrical grid. We treat the portfolio as a graph in which the nodes are distributed generators and the links are communication paths. We present a distributed optimization scheme for power balancing, where communication is allowed only between units that are linked in the graph. We include consumers with controllable consumption as an active part of the portfolio. We show that a suboptimal, but arbitrarily good power balancing can be obtained in an uncoordinated, distributed optimization framework, and argue that the scheme will work even if the computation time is limited. We further show that our approach can tolerate changes in the portfolio, in the sense that increasing or reducing the number of units in the portfolio requires only local updates. This ensures...

  6. Effect of correlated decay on fault-tolerant quantum computation

    Science.gov (United States)

    Lemberger, B.; Yavuz, D. D.

    2017-12-01

    We analyze noise in the circuit model of quantum computers when the qubits are coupled to a common bosonic bath and discuss the possible failure of scalability of quantum computation. Specifically, we investigate correlated (super-radiant) decay between the qubit energy levels from a two- or three-dimensional array of qubits without imposing any restrictions on the size of the sample. We first show that regardless of how the spacing between the qubits compares with the emission wavelength, correlated decay produces errors outside the applicability of the threshold theorem. This is because the sum of the norms of the two-body interaction Hamiltonians (which can be viewed as the upper bound on the single-qubit error) that decoheres each qubit scales with the total number of qubits and is unbounded. We then discuss two related results: (1) We show that the actual error (instead of the upper bound) on each qubit scales with the number of qubits. As a result, in the limit of large number of qubits in the computer, N → ∞, correlated decay causes each qubit in the computer to decohere in ever shorter time scales. (2) We find the complete eigenvalue spectrum of the exchange Hamiltonian that causes correlated decay in the same limit. We show that the spread of the eigenvalue distribution grows faster with N compared to the spectrum of the unperturbed system Hamiltonian. As a result, as N → ∞, quantum evolution becomes completely dominated by the noise due to correlated decay. These results argue that scalable quantum computing may not be possible in the circuit model in a two- or three-dimensional geometry when the qubits are coupled to a common bosonic bath.

  7. Quantum computation with topological codes from qubit to topological fault-tolerance

    CERN Document Server

    Fujii, Keisuke

    2015-01-01

    This book presents a self-consistent review of quantum computation with topological quantum codes. The book covers everything required to understand topological fault-tolerant quantum computation, ranging from the definition of the surface code to topological quantum error correction and topological fault-tolerant operations. The underlying basic concepts and powerful tools, such as universal quantum computation, quantum algorithms, stabilizer formalism, and measurement-based quantum computation, are also introduced in a self-consistent way. The interdisciplinary fields between quantum information and other fields of physics such as condensed matter physics and statistical physics are also explored in terms of the topological quantum codes. This book thus provides the first comprehensive description of the whole picture of topological quantum codes and quantum computation with them.

  8. Fault Tolerant and Optimal Control of Wind Turbines with Distributed High-Speed Generators

    Directory of Open Access Journals (Sweden)

    Urs Giger

    2017-01-01

    In this paper, the control scheme of a distributed high-speed generator system with a total of 12 generators and a nominal generator speed of 7000 min^-1 is studied. Specifically, a fault tolerant control (FTC) scheme is proposed to keep the turbine in operation in the presence of up to four simultaneous generator faults. The proposed controller structure consists of two layers: The upper layer is the baseline controller, which is separated into a partial load region with the generator torque as an actuating signal and the full-load operation region with the collective pitch angle as the other actuating signal. In addition, the lower layer is responsible for the fault diagnosis and FTC characteristics of the distributed generator drive train. The fault reconstruction and fault tolerant control strategy are tested in simulations with several actuator faults of different types.

  9. Hybrid magic state distillation for universal fault-tolerant quantum computation

    OpenAIRE

    Zheng, Wenqiang; Yu, Yafei; Pan, Jian; Zhang, Jingfu; Li, Jun; Li, Zhaokai; Suter, Dieter; Zhou, Xianyi; Peng, Xinhua; Du, Jiangfeng

    2014-01-01

    A set of stabilizer operations augmented by some special initial states known as 'magic states', gives the possibility of universal fault-tolerant quantum computation. However, magic state preparation inevitably involves nonideal operations that introduce noise. The most common method to eliminate the noise is magic state distillation (MSD) by stabilizer operations. Here we propose a hybrid MSD protocol by connecting a four-qubit H-type MSD with a five-qubit T-type MSD, in order to overcome s...

  10. Local rollback for fault-tolerance in parallel computing systems

    Science.gov (United States)

    Blumrich, Matthias A [Yorktown Heights, NY; Chen, Dong [Yorktown Heights, NY; Gara, Alan [Yorktown Heights, NY; Giampapa, Mark E [Yorktown Heights, NY; Heidelberger, Philip [Yorktown Heights, NY; Ohmacht, Martin [Yorktown Heights, NY; Steinmacher-Burow, Burkhard [Boeblingen, DE; Sugavanam, Krishnan [Yorktown Heights, NY

    2012-01-24

    A control logic device performs a local rollback in a parallel super computing system. The super computing system includes at least one cache memory device. The control logic device determines a local rollback interval. The control logic device runs at least one instruction in the local rollback interval. The control logic device evaluates whether an unrecoverable condition occurs while running the at least one instruction during the local rollback interval. The control logic device checks whether an error occurs during the local rollback. The control logic device restarts the local rollback interval if the error occurs and the unrecoverable condition does not occur during the local rollback interval.
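
    The decision rule in this abstract can be sketched in software as follows. This is a minimal illustration only, since the record describes hardware control logic: the snapshot-based state restore, the `max_restarts` bound, the return labels, and the helper `make_flaky` are all assumptions, and the abstract does not say what happens when an unrecoverable condition occurs (the sketch simply reports it to the caller).

```python
import copy
from typing import Callable, Dict, List, Tuple

# Each hypothetical instruction mutates `state` and reports
# (error_detected, unrecoverable_condition).
Instruction = Callable[[Dict[str, int]], Tuple[bool, bool]]


def run_interval(instructions: List[Instruction], state: Dict[str, int],
                 max_restarts: int = 3) -> str:
    """Run one rollback interval; return 'committed', 'escalate' or 'gave_up'."""
    snapshot = copy.deepcopy(state)  # state at the start of the interval
    for _ in range(max_restarts + 1):
        error = unrecoverable = False
        for instr in instructions:
            err, unrec = instr(state)
            error = error or err
            unrecoverable = unrecoverable or unrec
        if not error:
            return "committed"  # interval finished without an error
        if unrecoverable:
            return "escalate"   # local rollback is not applicable (assumed policy)
        # Error without an unrecoverable condition: restore and restart.
        state.clear()
        state.update(copy.deepcopy(snapshot))
    return "gave_up"


def make_flaky(failures: int) -> Instruction:
    """Instruction that increments state['x'] and reports an error `failures` times."""
    remaining = [failures]

    def instr(state: Dict[str, int]) -> Tuple[bool, bool]:
        state["x"] = state.get("x", 0) + 1
        if remaining[0] > 0:
            remaining[0] -= 1
            return True, False
        return False, False

    return instr
```

    For example, `run_interval([make_flaky(2)], {"x": 0})` restarts the interval twice and then commits, leaving the counter at 1 because each restart restores the snapshot taken at the start of the interval.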

  11. Fault tolerant embedded computers and power electronics for nuclear robotics

    Energy Technology Data Exchange (ETDEWEB)

    Giraud, A.; Robiolle, M.

    1995-12-31

    For requirements of nuclear industries, it is necessary to use embedded rad-tolerant electronics and high-level safety. In this paper, we first describe a computer architecture called MICADO designed for French nuclear industry. We then present outgoing projects on our industry. A special point is made on power electronics for remote-operated and legged robots. (authors). 7 refs., 2 figs.

  12. Fault tolerant embedded computers and power electronics for nuclear robotics

    International Nuclear Information System (INIS)

    Giraud, A.; Robiolle, M.

    1995-01-01

    For requirements of nuclear industries, it is necessary to use embedded rad-tolerant electronics and high-level safety. In this paper, we first describe a computer architecture called MICADO designed for French nuclear industry. We then present outgoing projects on our industry. A special point is made on power electronics for remote-operated and legged robots. (authors). 7 refs., 2 figs

  13. Plan for the Characterization of HIRF Effects on a Fault-Tolerant Computer Communication System

    Science.gov (United States)

    Torres-Pomales, Wilfredo; Malekpour, Mahyar R.; Miner, Paul S.; Koppen, Sandra V.

    2008-01-01

    This report presents the plan for the characterization of the effects of high intensity radiated fields on a prototype implementation of a fault-tolerant data communication system. Various configurations of the communication system will be tested. The prototype system is implemented using off-the-shelf devices. The system will be tested in a closed-loop configuration with extensive real-time monitoring. This test is intended to generate data suitable for the design of avionics health management systems, as well as redundancy management mechanisms and policies for robust distributed processing architectures.

  14. CEGB philosophy and experience with fault-tolerant micro-computer application for power plant controls

    International Nuclear Information System (INIS)

    Clinch, D.A.L.

    1986-01-01

    From the mid-1960s until the late 1970s, automatic modulating control of the main boiler plant on CEGB fossil-fired power stations was largely implemented with hard wired electronic equipment. Mid-way through this period, the CEGB formulated a set of design requirements for this type of equipment; these laid particular emphasis on the fault tolerance of a control system and specified the nature of the interfaces with a control desk and with plant regulators. However, the automatic control of an Advanced Gas Cooled Reactor (AGR) is based upon measured values which are derived by processing a large number of thermocouple signals. This is more readily implemented digitally than with hard-wired equipment. Essential to the operation of an AGR power station is a data processing (DP) computer for monitoring the plant; so the first group of AGR power stations, designed in the 1960s, employed their DP computers for modulating control. Since the late 1970s, automatic modulating control of major plants, for new power stations and for re-fits on established power stations, has been implemented with micro-computers. Wherever practicable, the policy formulated earlier for hard-wired equipment has been retained, particularly in respect of the interfaces. This policy forms the foundation of the fault tolerance of these micro-computer systems

  15. 2009 fault tolerance for extreme-scale computing workshop, Albuquerque, NM - March 19-20, 2009.

    Energy Technology Data Exchange (ETDEWEB)

    Katz, D. S.; Daly, J.; DeBardeleben, N.; Elnozahy, M.; Kramer, B.; Lathrop, S.; Nystrom, N.; Milfeld, K.; Sanielevici, S.; Scott, S.; Votta, L.; Louisiana State Univ.; Center for Exceptional Computing; LANL; IBM; Univ. of Illinois; Shodor Foundation; Pittsburgh Supercomputer Center; Texas Advanced Computing Center; ORNL; Sun Microsystems

    2009-02-01

    This is a report on the third in a series of petascale workshops co-sponsored by Blue Waters and TeraGrid to address challenges and opportunities for making effective use of emerging extreme-scale computing. This workshop was held to discuss fault tolerance on large systems for running large, possibly long-running applications. The main point of the workshop was to have systems people, middleware people (including fault-tolerance experts), and applications people talk about the issues and figure out what needs to be done, mostly at the middleware and application levels, to run such applications on the emerging petascale systems, without having faults cause large numbers of application failures. The workshop found that there is considerable interest in fault tolerance, resilience, and reliability of high-performance computing (HPC) systems in general, at all levels of HPC. The only way to recover from faults is through the use of some redundancy, either in space or in time. Redundancy in time, in the form of writing checkpoints to disk and restarting at the most recent checkpoint after a fault that causes an application to crash/halt, is the most common tool used in applications today, but there are questions about how long this can continue to be a good solution as systems and memories grow faster than I/O bandwidth to disk. There is interest in both modifications to this, such as checkpoints to memory, partial checkpoints, and message logging, and alternative ideas, such as in-memory recovery using residues. We believe that systematic exploration of these ideas holds the most promise for the scientific applications community. Fault tolerance has been an issue of discussion in the HPC community for at least the past 10 years; but much like other issues, the community has managed to put off addressing it during this period. There is a growing recognition that as systems continue to grow to petascale and beyond, the field is approaching the point where we don

  16. Fault-tolerant design

    CERN Document Server

    Dubrova, Elena

    2013-01-01

    This textbook serves as an introduction to fault-tolerance, intended for upper-division undergraduate students, graduate-level students and practicing engineers in need of an overview of the field. Readers will develop skills in modeling and evaluating fault-tolerant architectures in terms of reliability, availability and safety. They will gain a thorough understanding of fault tolerant computers, including both the theory of how to design and evaluate them and the practical knowledge of achieving fault-tolerance in electronic, communication and software systems. Coverage includes fault-tolerance techniques through hardware, software, information and time redundancy. The content is designed to be highly accessible, including numerous examples and exercises. Solutions and PowerPoint slides are available for instructors. Provides textbook coverage of the fundamental concepts of fault-tolerance; Describes a variety of basic techniques for achieving fault-toleran...

  17. Coordinated Fault-Tolerance for High-Performance Computing Final Project Report

    Energy Technology Data Exchange (ETDEWEB)

    Panda, Dhabaleswar Kumar [The Ohio State University; Beckman, Pete

    2011-07-28

    existing publish-subscribe tools. We enhanced the intrinsic fault tolerance capabilities of representative implementations of a variety of key HPC software subsystems and integrated them with the FTB. Targeted software subsystems included: MPI communication libraries, checkpoint/restart libraries, resource managers and job schedulers, and system monitoring tools. Leveraging the aforementioned infrastructure, as well as developing and utilizing additional tools, we have examined issues associated with expanded, end-to-end fault response from both system and application viewpoints. From the standpoint of system operations, we have investigated log and root cause analysis, anomaly detection and fault prediction, and generalized notification mechanisms. Our applications work has included libraries for fault-tolerant linear algebra, application frameworks for coupled multiphysics applications, and external frameworks to support the monitoring and response for general applications. Our final goal was to engage the high-end computing community to increase awareness of tools and issues around coordinated end-to-end fault management.

  18. Universal fault-tolerant adiabatic quantum computing with quantum dots or donors

    Science.gov (United States)

    Landahl, Andrew

    I will present a conceptual design for an adiabatic quantum computer that can achieve arbitrarily accurate universal fault-tolerant quantum computations with a constant energy gap and nearest-neighbor interactions. This machine can run any quantum algorithm known today or discovered in the future, in principle. The key theoretical idea is adiabatic deformation of degenerate ground spaces formed by topological quantum error-correcting codes. An open problem with the design is making the four-body interactions and measurements it uses more technologically accessible. I will present some partial solutions, including one in which interactions between quantum dots or donors in a two-dimensional array can emulate the desired interactions in second-order perturbation theory. I will conclude with some open problems, including the challenge of reformulating Kitaev's gadget perturbation theory technique so that it preserves fault tolerance. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.

  19. The use of automatic programming techniques for fault tolerant computing systems

    Science.gov (United States)

    Wild, C.

    1985-01-01

    It is conjectured that the production of software for ultra-reliable computing systems such as required by Space Station, aircraft, nuclear power plants and the like will require a high degree of automation as well as fault tolerance. In this paper, the relationship between automatic programming techniques and fault tolerant computing systems is explored. Initial efforts in the automatic synthesis of code from assertions to be used for error detection, as well as the automatic generation of assertions and test cases from abstract data type specifications, are outlined. Speculation on the ability to generate truly diverse designs capable of recovery from errors by exploring alternate paths in the program synthesis tree is discussed. Some initial thoughts on the use of knowledge based systems for the global detection of abnormal behavior using expectations and the goal-directed reconfiguration of resources to meet critical mission objectives are given. One of the sources of information for these systems would be the knowledge captured during the automatic programming process.

  20. A direct approach to fault-tolerance in measurement-based quantum computation via teleportation

    International Nuclear Information System (INIS)

    Silva, Marcus; Danos, Vincent; Kashefi, Elham; Ollivier, Harold

    2007-01-01

    We discuss a simple variant of the one-way quantum computing model (Raussendorf R and Briegel H-J 2001 Phys. Rev. Lett. 86 5188), called the Pauli measurement model, where measurements are restricted to be along the eigenbases of the Pauli X and Y operators, while qubits can be initially prepared both in the $|{+_{\pi/4}}\rangle := \frac{1}{\sqrt{2}}\left(|0\rangle + e^{i\pi/4}|1\rangle\right)$ state and the usual $|{+}\rangle := \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right)$ state. We prove the universality of this quantum computation model, and establish a standardization procedure which permits all entanglement and state preparation to be performed at the beginning of computation. This leads us to develop a direct approach to fault-tolerance by simple transformations of the entanglement graph and preparation operations, while error correction is performed naturally via syndrome-extracting teleportations.

  1. High-Intensity Radiated Field Fault-Injection Experiment for a Fault-Tolerant Distributed Communication System

    Science.gov (United States)

    Yates, Amy M.; Torres-Pomales, Wilfredo; Malekpour, Mahyar R.; Gonzalez, Oscar R.; Gray, W. Steven

    2010-01-01

    Safety-critical distributed flight control systems require robustness in the presence of faults. In general, these systems consist of a number of input/output (I/O) and computation nodes interacting through a fault-tolerant data communication system. The communication system transfers sensor data and control commands and can handle most faults under typical operating conditions. However, the performance of the closed-loop system can be adversely affected as a result of operating in harsh environments. In particular, High-Intensity Radiated Field (HIRF) environments have the potential to cause random fault manifestations in individual avionic components and to generate simultaneous system-wide communication faults that overwhelm existing fault management mechanisms. This paper presents the design of an experiment conducted at the NASA Langley Research Center's HIRF Laboratory to statistically characterize the faults that a HIRF environment can trigger on a single node of a distributed flight control system.

  2. Fault diagnosis and fault-tolerant control strategies for non-linear systems analytical and soft computing approaches

    CERN Document Server

    Witczak, Marcin

    2014-01-01

    This book presents selected fault diagnosis and fault-tolerant control strategies for non-linear systems in a unified framework. In particular, starting from advanced state estimation strategies up to modern soft computing, the discrete-time description of the system is employed. Part I of the book presents original research results regarding state estimation and neural networks for robust fault diagnosis. Part II is devoted to the presentation of integrated fault diagnosis and fault-tolerant systems. It starts with a general fault-tolerant control framework, which is then extended by introducing robustness with respect to various uncertainties. Finally, it is shown how to implement the proposed framework for fuzzy systems described by the well-known Takagi–Sugeno models. This research monograph is intended for researchers, engineers, and advanced postgraduate students in control and electrical engineering, computer science, as well as mechanical and chemical engineering.

  3. Fault-tolerant quantum computing in the Pauli or Clifford frame with slow error diagnostics

    Directory of Open Access Journals (Sweden)

    Christopher Chamberland

    2018-01-01

    We consider the problem of fault-tolerant quantum computation in the presence of slow error diagnostics, either caused by measurement latencies or slow decoding algorithms. Our scheme offers a few improvements over previously existing solutions, for instance it does not require active error correction and results in a reduced error-correction overhead when error diagnostics is much slower than the gate time. In addition, we adapt our protocol to cases where the underlying error correction strategy chooses the optimal correction amongst all Clifford gates instead of the usual Pauli gates. The resulting Clifford frame protocol is of independent interest as it can increase error thresholds and could find applications in other areas of quantum computation.

  4. A distributed fault tolerant architecture for nuclear reactor control and safety functions

    International Nuclear Information System (INIS)

    Hecht, M.; Agron, J.; Hochhauser, S.

    1989-01-01

    This paper reports on a fault tolerance architecture, currently under development, that provides tolerance to a broad scope of hardware, software, and communications faults. This architecture relies on widely commercially available operating systems, local area networks, and software standards. Thus, development time is significantly shortened, and modularity allows for continuous and inexpensive system enhancement throughout the expected 20-year life. The fault containment and parallel processing capabilities of computer networks are being exploited to provide a high performance, high availability network capable of tolerating a broad scope of hardware, software, and operating system faults. The system can tolerate all but one known (and avoidable) single fault, two known and avoidable dual faults, and will detect all higher order fault sequences and provide diagnostics to allow for rapid manual recovery.

  5. ECFS: A decentralized, distributed and fault-tolerant FUSE filesystem for the LHCb online farm

    Science.gov (United States)

    Rybczynski, Tomasz; Bonaccorsi, Enrico; Neufeld, Niko

    2014-06-01

    The LHCb experiment records millions of proton collisions every second, but only a fraction of them are useful for LHCb physics. In order to filter out the "bad events", a large farm of x86 servers (~2000 nodes) has been put in place. These servers boot from and run from NFS; however, they use their local disk to temporarily store data which cannot be processed in real time ("data-deferring"). These events are subsequently processed when there are no live data coming in. The effective CPU power is thus greatly increased. This gain in CPU power depends critically on the availability of the local disks. For cost and power reasons, mirroring (RAID-1) is not used, leading to a lot of operational headache with failing disks and disk errors or server failures induced by faulty disks. To mitigate these problems and increase the reliability of the LHCb farm, while at the same time keeping cost and power consumption low, an extensive study of existing highly available and distributed file systems has been carried out. While many distributed file systems provide reliability by "file replication", none of the evaluated ones supports erasure algorithms. A decentralised, distributed and fault-tolerant "write once read many" file system has been designed and implemented as a proof of concept, with two main goals: providing fault tolerance without using file replication techniques, which are expensive in terms of disk space, and providing a unique namespace. This paper describes the design and the implementation of the Erasure Codes File System (ECFS) and presents the specialised FUSE interface for Linux. Depending on the encoding algorithm, ECFS will use a certain number of target directories as a backend to store the segments that compose the encoded data. When target directories are mounted via nfs/autofs, ECFS will act as a file system over network/block-level RAID over multiple servers.
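
    The idea of tolerating a lost disk without full replication can be illustrated with the simplest possible erasure code. The sketch below is not ECFS's encoding, which the abstract does not specify; it assumes a single XOR parity segment (RAID-5 style), and the names `encode` and `decode` are hypothetical. It splits data into k segments, stores k + 1 segments, and rebuilds any one lost segment from the survivors.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size segments and append one XOR parity segment."""
    seg_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(seg_len * k, b"\0")  # zero-pad the last segment
    segments = [padded[i * seg_len:(i + 1) * seg_len] for i in range(k)]
    parity = reduce(xor_bytes, segments)
    return segments + [parity]


def decode(segments: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the original data when at most one segment (marked None) is lost."""
    missing = [i for i, s in enumerate(segments) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost segment")
    if missing:
        # XOR of all surviving segments equals the missing one.
        survivors = [s for s in segments if s is not None]
        segments = list(segments)
        segments[missing[0]] = reduce(xor_bytes, survivors)
    return b"".join(segments[:-1])[:length]


if __name__ == "__main__":
    data = b"deferred event data"
    stored = encode(data, k=4)   # 5 segments, e.g. one per target directory
    stored[2] = None             # simulate one failed disk
    assert decode(stored, len(data)) == data
```

    With k = 4 the storage overhead in this sketch is 5/4 of the original size, compared with 2x for mirroring; codes that survive more than one simultaneous loss (for example Reed-Solomon) follow the same pattern with more parity segments.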

  6. Application of a Resource Theory for Magic States to Fault-Tolerant Quantum Computing.

    Science.gov (United States)

    Howard, Mark; Campbell, Earl

    2017-03-03

    Motivated by their necessity for most fault-tolerant quantum computation schemes, we formulate a resource theory for magic states. First, we show that robustness of magic is a well-behaved magic monotone that operationally quantifies the classical simulation overhead for a Gottesman-Knill-type scheme using ancillary magic states. Our framework subsequently finds immediate application in the task of synthesizing non-Clifford gates using magic states. When magic states are interspersed with Clifford gates, Pauli measurements, and stabilizer ancillas (the most general synthesis scenario), then the class of synthesizable unitaries is hard to characterize. Our techniques can place nontrivial lower bounds on the number of magic states required for implementing a given target unitary. Guided by these results, we have found new and optimal examples of such synthesis.

  7. Low cost management of replicated data in fault-tolerant distributed systems

    Science.gov (United States)

    Joseph, Thomas A.; Birman, Kenneth P.

    1990-01-01

    Many distributed systems replicate data for fault tolerance or availability. In such systems, a logical update on a data item results in a physical update on a number of copies. The synchronization and communication required to keep the copies of replicated data consistent introduce a delay when operations are performed. A technique is described that relaxes the usual degree of synchronization, permitting replicated data items to be updated concurrently with other operations, while at the same time ensuring that correctness is not violated. The additional concurrency thus obtained results in better response time when performing operations on replicated data. How this technique performs in conjunction with a roll-back and a roll-forward failure recovery mechanism is also discussed.

  8. Fault-tolerant Semiquantum key Distribution Over a Collective-dephasing Noise Channel

    Science.gov (United States)

    Zhang, Ming-Hui; Li, Hui-Fang; Peng, Jin-Ye; Feng, Xiao-Yi

    2017-08-01

    Semiquantum key distribution (SQKD) allows two remote users, quantum Alice and classical Bob, to share a secret key via a quantum channel and an authenticated classical channel. In most of the existing SQKD protocols, SQKD is possible only under the assumption of ideal quantum channels. However, the noise in quantum channels is unavoidable. In this paper, we propose two fault-tolerant SQKD protocols, the randomization-based SQKD protocol and the measure-resend SQKD protocol, which are robust against the collective-dephasing noise. Logical qubits have been selected to build travelling blocks for constructing a decoherence-free subspace (DFS). Compared with the previous SQKD protocols, our protocols can provide higher communication fidelity. In addition, a security proof is given in the subsequent section.

  9. What does fault tolerant Deep Learning need from MPI?

    Energy Technology Data Exchange (ETDEWEB)

    Amatya, Vinay C.; Vishnu, Abhinav; Siegel, Charles M.; Daily, Jeffrey A.

    2017-09-25

    Deep Learning (DL) algorithms have become the de facto Machine Learning (ML) algorithm for large scale data analysis. DL algorithms are computationally expensive: even distributed DL implementations which use MPI require days of training (model learning) time on commonly studied datasets. Long running DL applications become susceptible to faults, requiring development of a fault tolerant system infrastructure, in addition to fault tolerant DL algorithms. This raises an important question: What is needed from MPI for designing fault tolerant DL implementations? In this paper, we address this problem for permanent faults. We motivate the need for a fault tolerant MPI specification by an in-depth consideration of recent innovations in DL algorithms and their properties, which drive the need for specific fault tolerance features. We present an in-depth discussion on the suitability of different parallelism types (model, data and hybrid); a need (or lack thereof) for check-pointing of any critical data structures; and most importantly, consideration for several fault tolerance proposals (user-level fault mitigation (ULFM), Reinit) in MPI and their applicability to fault tolerant DL implementations. We leverage a distributed memory implementation of Caffe, currently available under the Machine Learning Toolkit for Extreme Scale (MaTEx). We implement our approaches by extending MaTEx-Caffe for using ULFM-based implementation. Our evaluation using the ImageNet dataset and AlexNet neural network topology demonstrates the effectiveness of the proposed fault tolerant DL implementation using OpenMPI based ULFM.
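
    As a point of comparison for the check-pointing question raised above, the following is a minimal single-process sketch of checkpoint/restart for a training loop. It does not use MPI, ULFM, or MaTEx-Caffe; the state layout, the function names, and the `fail_at` fault-injection parameter are all assumptions made for illustration.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state atomically so a crash mid-write cannot corrupt it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)  # atomic rename within the same directory


def load_checkpoint(path):
    """Return the last saved state, or a fresh one if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "weight": 0.0}


def train(path, total_steps, every, fail_at=None):
    """Toy training loop: resumes from the checkpoint, saves every `every` steps."""
    state = load_checkpoint(path)
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated permanent fault")
        state["weight"] += 0.5  # stand-in for one optimizer update
        state["step"] += 1
        if state["step"] % every == 0:
            save_checkpoint(path, state)
    return state
```

    In this sketch a run that fails at step 5 with a checkpoint interval of 2 restarts from step 4, so at most `every - 1` updates are repeated. The interval trades checkpoint I/O against recomputation after a fault.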

  10. A Constraint Logic Programming Framework for the Synthesis of Fault-Tolerant Schedules for Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Poulsen, Kåre Harbo; Pop, Paul; Izosimov, Viacheslav

    2007-01-01

    We present a constraint logic programming (CLP) approach for synthesis of fault-tolerant hard real-time applications on distributed heterogeneous architectures. We address time-triggered systems, where processes and messages are statically scheduled based on schedule tables. We use process re-exe...

  11. Computer aided reliability, availability, and safety modeling for fault-tolerant computer systems with commentary on the HARP program

    Science.gov (United States)

    Shooman, Martin L.

    1991-01-01

    Many of the most challenging reliability problems of our present decade involve complex distributed systems such as interconnected telephone switching computers, air traffic control centers, aircraft and space vehicles, and local area and wide area computer networks. In addition to the challenge of complexity, modern fault-tolerant computer systems require very high levels of reliability, e.g., avionic computers with MTTF goals of one billion hours. Most analysts find that it is too difficult to model such complex systems without computer aided design programs. In response to this need, NASA has developed a suite of computer aided reliability modeling programs beginning with CARE 3 and including a group of new programs such as: HARP, HARP-PC, Reliability Analysts Workbench (Combination of model solvers SURE, STEM, PAWS, and common front-end model ASSIST), and the Fault Tree Compiler. The HARP program is studied and how well the user can model systems using this program is investigated. One of the important objectives will be to study how user friendly this program is, e.g., how easy it is to model the system, provide the input information, and interpret the results. The experiences of the author and his graduate students who used HARP in two graduate courses are described. Some brief comparisons were made with the ARIES program which the students also used. Theoretical studies of the modeling techniques used in HARP are also included. Of course no answer can be any more accurate than the fidelity of the model, thus an Appendix is included which discusses modeling accuracy. A broad viewpoint is taken and all problems which occurred in the use of HARP are discussed. Such problems include: computer system problems, installation manual problems, user manual problems, program inconsistencies, program limitations, confusing notation, long run times, accuracy problems, etc.
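
    To give a sense of scale for the MTTF goal quoted above, here is a short worked calculation. It assumes a constant failure rate (the exponential model) and a 10-hour mission; both are illustrative assumptions, not details taken from HARP or from the abstract.

\[
R(t) = e^{-\lambda t}, \qquad
\mathrm{MTTF} = \int_0^\infty R(t)\,dt = \frac{1}{\lambda}
\]

\[
\mathrm{MTTF} = 10^{9}\ \text{h} \;\Rightarrow\; \lambda = 10^{-9}\ \text{h}^{-1}, \qquad
1 - R(10\ \text{h}) = 1 - e^{-10^{-8}} \approx 10^{-8}
\]

    Under these assumptions the probability of failure during a single mission is about one in a hundred million, which is far too small to establish by testing alone and is one reason analytical modeling tools are needed.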

  12. Toward a Fault Tolerant Architecture for Vital Medical-Based Wearable Computing.

    Science.gov (United States)

    Abdali-Mohammadi, Fardin; Bajalan, Vahid; Fathi, Abdolhossein

    2015-12-01

    Advancements in computers and electronic technologies have led to the emergence of a new generation of efficient small intelligent systems. The products of such technologies include smartphones and wearable devices, which have attracted attention for medical applications. These products are used less in critical medical applications because of their resource constraints and failure sensitivity: without safety considerations, small integrated hardware will endanger patients' lives. Therefore, principles are required for constructing wearable systems in healthcare so that the existing concerns are dealt with. Accordingly, this paper proposes an architecture for constructing wearable systems in critical medical applications. The proposed architecture is a three-tier one, supporting data flow from body sensors to the cloud. The tiers of this architecture include wearable computers, mobile computing, and mobile cloud computing. One of the features of this architecture is its high potential fault tolerance due to the nature of its components. Moreover, the required protocols are presented to coordinate the components of this architecture. Finally, the reliability of this architecture is assessed by simulating the architecture and its components, and other aspects of the proposed architecture are discussed.

  13. Fault tolerant software modules for SIFT

    Science.gov (United States)

    Hecht, M.; Hecht, H.

    1982-01-01

    The implementation of software fault tolerance is investigated for critical modules of the Software Implemented Fault Tolerance (SIFT) operating system to support the computational and reliability requirements of advanced fly-by-wire transport aircraft. Fault tolerant designs generated for the error reporter and global executive are examined. A description of the alternate routines, implementation requirements, and software validation is included.

  14. Generalized state spaces and nonlocality in fault-tolerant quantum-computing schemes

    International Nuclear Information System (INIS)

    Ratanje, N.; Virmani, S.

    2011-01-01

    We develop connections between generalized notions of entanglement and quantum computational devices where the measurements available are restricted, either because they are noisy and/or because by design they are only along Pauli directions. By considering restricted measurements one can (by considering the dual positive operators) construct single-particle-state spaces that are different to the usual quantum-state space. This leads to a modified notion of entanglement that can be very different to the quantum version (for example, Bell states can become separable). We use this approach to develop alternative methods of classical simulation that have strong connections to the study of nonlocal correlations: we construct noisy quantum computers that admit operations outside the Clifford set and can generate some forms of multiparty quantum entanglement, but are otherwise classical in that they can be efficiently simulated classically and cannot generate nonlocal statistics. Although the approach provides new regimes of noisy quantum evolution that can be efficiently simulated classically, it does not appear to lead to significant reductions of existing upper bounds to fault tolerance thresholds for common noise models.

  15. Noise Threshold and Resource Cost of Fault-Tolerant Quantum Computing with Majorana Fermions in Hybrid Systems.

    Science.gov (United States)

    Li, Ying

    2016-09-16

    Fault-tolerant quantum computing in systems composed of both Majorana fermions and topologically unprotected quantum systems, e.g., superconducting circuits or quantum dots, is studied in this Letter. Errors caused by topologically unprotected quantum systems need to be corrected with error-correction schemes, for instance, the surface code. We find that the error-correction performance of such a hybrid topological quantum computer is not superior to a normal quantum computer unless the topological charge of Majorana fermions is insusceptible to noise. If errors changing the topological charge are rare, the fault-tolerance threshold is much higher than the threshold of a normal quantum computer and a surface-code logical qubit could be encoded in only tens of topological qubits instead of about 1,000 normal qubits.

  16. Adaptive and technology-independent architecture for fault-tolerant distributed AAL solutions.

    Science.gov (United States)

    Schmidt, Michael; Obermaisser, Roman

    2018-04-01

    Today's architectures for Ambient Assisted Living (AAL) must cope with a variety of challenges like flawless sensor integration and time synchronization (e.g. for sensor data fusion) while abstracting from the underlying technologies at the same time. Furthermore, an architecture for AAL must be capable of managing distributed application scenarios in order to support elderly people in all situations of their everyday life. This encompasses not just life at home but in particular the mobility of elderly people (e.g. when going for a walk or doing sports) as well. In this paper we introduce a novel architecture for distributed AAL solutions whose design follows a modern microservices approach by providing small core services instead of a monolithic application framework. The architecture comprises core services for sensor integration and service discovery while supporting several communication models (periodic, sporadic, streaming). We extend the state of the art by introducing a fault-tolerance model for our architecture on the basis of a fault hypothesis describing the fault-containment regions (FCRs) with their respective failure modes and failure rates in order to support safety-critical AAL applications.

  17. Advanced cloud fault tolerance system

    Science.gov (United States)

    Sumangali, K.; Benny, Niketa

    2017-11-01

    Cloud computing has become a prevalent on-demand service on the internet to store, manage and process data. A pitfall that accompanies cloud computing is the failures that can be encountered in the cloud. To overcome these failures, we require a fault tolerance mechanism to abstract faults from users. We have proposed a fault tolerant architecture, which is a combination of proactive and reactive fault tolerance. This architecture essentially increases the reliability and the availability of the cloud. In the future, we would like to compare evaluations of our proposed architecture with existing architectures and further improve it.

  18. Design Optimization of Time- and Cost-Constrained Fault-Tolerant Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Izosimov, Viacheslav; Pop, Paul; Eles, Petru

    2005-01-01

    In this paper we present an approach to the design optimization of fault-tolerant embedded systems for safety-critical applications. Processes are statically scheduled and communications are performed using the time-triggered protocol. We use process re-execution and replication for tolerating...

  19. Minimum Entropy Active Fault Tolerant Control of the Non-Gaussian Stochastic Distribution System Subjected to Mean Constraint

    Directory of Open Access Journals (Sweden)

    Haokun Jin

    2017-05-01

    Stochastic distribution control (SDC) systems are a group of systems where the output considered is the measured probability density function (PDF) of the system output whilst subjected to a normal crisp input. The purpose of the active fault tolerant control of such systems is to use the fault estimation information and other measured information to make the output PDF still track the given distribution when the objective PDF is known. However, if the target PDF is unavailable, the PDF tracking operation will be impossible. Minimum entropy control of the system output can be considered as an alternative strategy. The mean represents the center location of the stochastic variable, and it is reasonable that the minimum entropy fault tolerant controller can be designed subject to a mean constraint. In this paper, using the rational square-root B-spline model for the shape control of the system output PDF, a nonlinear adaptive observer based fault diagnosis algorithm is proposed to diagnose the fault. Through the controller reconfiguration, the system entropy subject to the mean restriction can still be minimized when a fault occurs. An illustrative example is utilized to demonstrate the use of the minimum entropy fault tolerant control algorithms.

  20. USAGE OF STANDARD PERSONAL COMPUTER PORTS FOR DESIGNING OF THE DOUBLE REDUNDANT FAULT-TOLERANT COMPUTER CONTROL SYSTEMS

    Directory of Open Access Journals (Sweden)

    Rafig SAMEDOV

    2005-01-01

    In this study, the ports of standard personal computers are investigated for designing fault-tolerant control systems; different structure versions are designed, and a method for choosing an optimal structure is suggested. First, the ÇİFTYAK system is defined and its working principle is determined. Then, the data transmission ports of standard personal computers are classified and analyzed. After that, the structure versions are designed and evaluated according to the data transmission methods used, the number of ports, and the criteria of reliability, performance, truth, control and cost. Finally, a method for choosing the optimal structure version is suggested.

  1. Optimal Configuration of Fault-Tolerance Parameters for Distributed Server Access

    DEFF Research Database (Denmark)

    Daidone, Alessandro; Renier, Thibault; Bondavalli, Andrea

    2013-01-01

    such as enhanced name servers. Such architectures provide an increased number of redundancy configuration choices. The influence of a (wide area) network connection can be quite significant and induce trade-offs between dependability and user-perceived performance. This paper develops a quantitative stochastic...... in replicated server architectures. In order to obtain insight into the system behaviour, a set of relevant environment parameters and controllable fault-tolerance parameters are chosen and the dependability/performance trade-off is evaluated....

  2. Fault Tolerant Feedback Control

    DEFF Research Database (Denmark)

    Stoustrup, Jakob; Niemann, H.

    2001-01-01

    An architecture for fault tolerant feedback controllers based on the Youla parameterization is suggested. It is shown that the Youla parameterization will give a residual vector directly in connection with the fault diagnosis part of the fault tolerant feedback controller. It turns out...... that there is a separation between the feedback controller and the fault tolerant part. The closed loop feedback properties are handled by the nominal feedback controller and the fault tolerant part is handled by the design of the Youla parameter. The design of the fault tolerant part will not affect the design...

  3. Stacked codes: Universal fault-tolerant quantum computation in a two-dimensional layout

    Science.gov (United States)

    Jochym-O'Connor, Tomas; Bartlett, Stephen D.

    2016-02-01

    We introduce a class of three-dimensional color codes, which we call stacked codes, together with a fault-tolerant transformation that will map logical qubits encoded in two-dimensional (2D) color codes into stacked codes and back. The stacked code allows for the transversal implementation of a non-Clifford π/8 logical gate, which when combined with the logical Clifford gates that are transversal in the 2D color code give a gate set that is both fault-tolerant and universal without requiring nonstabilizer magic states. We then show that the layers forming the stacked code can be unfolded and arranged in a 2D layout. As only Clifford gates can be implemented transversally for 2D topological stabilizer codes, a nonlocal operation must be incorporated in order to allow for this transversal application of a non-Clifford gate. Our code achieves this operation through the transformation from a 2D color code to the unfolded stacked code induced by measuring only geometrically local stabilizers and gauge operators within the bulk of 2D color codes together with a nonlocal operator that has support on a one-dimensional boundary between such 2D codes. We believe that this proposed method to implement the nonlocal operation is a realistic one for 2D stabilizer layouts and would be beneficial in avoiding the large overheads caused by magic state distillation.

  4. Designing fault-tolerant real-time computer systems with diversified bus architecture for nuclear power plants

    International Nuclear Information System (INIS)

    Behera, Rajendra Prasad; Murali, N.; Satya Murty, S.A.V.

    2014-01-01

    Fault-tolerant real-time computer (FT-RTC) systems are widely used to perform safe operation of nuclear power plants (NPP) and safe shutdown in the event of any untoward situation. Design requirements for such systems need high reliability, availability, computational ability for measurement via sensors, control action via actuators, data communication and human interface via keyboard or display. All these attributes of FT-RTC systems are required to be implemented using best known methods such as redundant system design using diversified bus architecture to avoid common cause failure, fail-safe design to avoid unsafe failure and diagnostic features to validate system operation. In this context, the system designer must select efficient as well as highly reliable diversified bus architecture in order to realize fault-tolerant system design. This paper presents a comparative study between CompactPCI bus and Versa Module Eurocard (VME) bus architecture for designing FT-RTC systems with switch over logic system (SOLS) for NPP. (author)

  5. Spacecraft Attitude Fault-tolerant Control Based on Dynamic Control Distribution

    Directory of Open Access Journals (Sweden)

    Zhou Hong-Cheng

    2014-09-01

    For the spacecraft attitude control system, we consider the aircraft's control surface deflection position saturation and rate constraints. Based on the dynamic control allocation method, we put forward a redistribution method for the event of actuator stuck and damage failures. First, to account for the system modeling error caused by uncertainty and external disturbance under actuator stuck and damage failures, we put forward a mathematical model of the attitude control system for angular rate control. We then design an actuator stuck fault diagnosis device and an adaptive sliding mode observer, respectively. The hidden failure and interference information is fed back to the controller and to the dynamic control allocation algorithm in order to realize fault tolerant control under actuator stuck and damage failures.

  6. Verification Methodology of Fault-tolerant, Fail-safe Computers Applied to MAGLEV Control Computer Systems

    Science.gov (United States)

    1993-05-01

    The Maglev control computer system should be designed to verifiably possess high reliability and safety as well as high availability to make Maglev a dependable and attractive transportation alternative to the public. A Maglev computer system has bee...

  7. Fault Tolerant Ethernet Based Network for Time Sensitive Applications in Electrical Power Distribution Systems

    Directory of Open Access Journals (Sweden)

    Leos Bohac

    2013-01-01

    The paper analyses and experimentally verifies the deployment of Ethernet-based network technology to enable fault-tolerant and timely exchange of data among a number of high-voltage protective relays. The relays use a proprietary serial communication line to exchange real-time data on the state of their high-voltage circuitry, facilitating fast protection switching in case of critical failures. The digital serial signal is first fed into a PCM multiplexer, where it is mapped to the corresponding E1 (2 Mbit/s) time-division multiplexed signal. The resulting E1 frames are then packetized and sent through the Ethernet control LAN to the opposite PCM demultiplexer, where the same processing is done in reverse, finally sending the signal into the opposite protective relay. The challenge of this setup is to assure very timely delivery of the control information between protective relays even in cases of potential failures of the Ethernet network itself. The tolerance of the Ethernet network to faults is assured using the widespread per-VLAN Rapid Spanning Tree Protocol, potentially extended by 1+1 PCM protection as a valuable option.

  8. Adaptive Fault-Tolerant Synchronization Control of a Class of Complex Dynamical Networks With General Input Distribution Matrices and Actuator Faults.

    Science.gov (United States)

    Li, Xiao-Jian; Yang, Guang-Hong

    2017-03-01

    This paper is concerned with the problem of adaptive fault-tolerant synchronization control of a class of complex dynamical networks (CDNs) with actuator faults and unknown coupling weights. The considered input distribution matrix is assumed to be an arbitrary matrix, instead of a unit one. Within this framework, an adaptive fault-tolerant controller is designed to achieve synchronization for the CDN. Moreover, a convex combination technique and an important graph theory result are developed, such that the rigorous convergence analysis of synchronization errors can be conducted. In particular, it is shown that the proposed fault-tolerant synchronization control approach is valid for the CDN with both time-invariant and time-varying coupling weights. Finally, two simulation examples are provided to validate the effectiveness of the theoretical results.

  9. An efficient fault-tolerant out-patient order entry system based on special distributed client/server architecture.

    Science.gov (United States)

    Chuang, C T

    1998-01-01

    An automatic order entry system is very important for processing out-patient information. Such a system not only helps physicians to enter their orders directly, but can also reduce order communication errors and thus improve medical quality. Therefore, many hospitals are keen to develop and implement direct order entry systems, but they are also concerned about the consequences of system failure. In this paper, we present an effective and efficient fault-tolerant order entry system based on a special distributed client/server architecture that satisfies the requirements of out-patient order entry very well. From the experimental results obtained on a prototype, we found that this system can improve the response time of order entry and provides a user-friendly mode of operation. Physicians can enter their orders easily, accurately, directly, flexibly and faster by making choices from standardized and personalized menus in this system.

  10. Communication and Agreement Abstractions for Fault-Tolerant Asynchronous Distributed Systems

    CERN Document Server

    Raynal, Michel

    2010-01-01

    Understanding distributed computing is not an easy task. This is due to the many facets of uncertainty one has to cope with and master in order to produce correct distributed software. Considering the uncertainty created by asynchrony and process crash failures in the context of message-passing systems, the book focuses on the main abstractions that one has to understand and master in order to be able to produce software with guaranteed properties. These fundamental abstractions are communication abstractions that allow the processes to communicate consistently (namely the register abstraction

  11. Designing a Scalable Fault Tolerance Model for High Performance Computational Chemistry: A Case Study with Coupled Cluster Perturbative Triples.

    Science.gov (United States)

    van Dam, Hubertus J J; Vishnu, Abhinav; de Jong, Wibe A

    2011-01-11

    In the past couple of decades, the massive computational power provided by the most modern supercomputers has enabled the simulation of higher-order computational chemistry methods previously considered intractable. As system sizes continue to increase, the computational chemistry domain continues to escalate this trend using parallel computing with programming models such as Message Passing Interface (MPI) and Partitioned Global Address Space (PGAS) models such as Global Arrays. The ever-increasing scale of these supercomputers comes at the cost of reduced Mean Time Between Failures (MTBF), currently on the order of days and projected to be on the order of hours for upcoming extreme-scale systems. While traditional disk-based checkpointing methods are ubiquitous for storing intermediate solutions, they suffer from the high overhead of writing and recovering from checkpoints. In practice, checkpointing itself often brings the system down. Clearly, methods beyond checkpointing are imperative for handling the worsening problem of shrinking MTBF. In this paper, we address this challenge by designing and implementing an efficient fault-tolerant version of the Coupled Cluster (CC) method with NWChem, using in-memory data redundancy. We present the challenges associated with our design, including an efficient data storage model, maintenance of at least one consistent data copy, and the recovery process. Our performance evaluation without faults shows that the current design exhibits a small overhead. In the presence of a simulated fault, the proposed design incurs negligible overhead in comparison to the state-of-the-art implementation without faults.
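
    The idea of in-memory data redundancy can be illustrated with a minimal Python sketch. It is not the NWChem or Global Arrays implementation, which the abstract does not detail; the class `RedundantStore`, the "primary plus next neighbour" placement rule, and all method names are assumptions made purely for illustration. Each data block is kept in the memory of two nodes, so one consistent copy survives a single node failure without any disk checkpoint.

```python
class RedundantStore:
    """Toy in-memory store: every block lives on a primary node and one backup node.

    Hypothetical illustration only; placement rule and names are assumptions.
    """

    def __init__(self, num_nodes: int) -> None:
        if num_nodes < 2:
            raise ValueError("need at least two nodes to hold a backup copy")
        self.nodes = [dict() for _ in range(num_nodes)]  # per-node memory
        self.alive = [True] * num_nodes

    def _holders(self, block_id: int):
        # Assumed placement: primary by modulo, backup on the next node.
        primary = block_id % len(self.nodes)
        backup = (primary + 1) % len(self.nodes)
        return primary, backup

    def put(self, block_id: int, value) -> None:
        # Write the block to every holder that is still alive.
        for node in self._holders(block_id):
            if self.alive[node]:
                self.nodes[node][block_id] = value

    def fail(self, node: int) -> None:
        # Simulate a node crash: its memory contents are lost.
        self.alive[node] = False
        self.nodes[node].clear()

    def get(self, block_id: int):
        # Read from the first surviving holder; fail only if both copies are gone.
        for node in self._holders(block_id):
            if self.alive[node] and block_id in self.nodes[node]:
                return self.nodes[node][block_id]
        raise KeyError(f"block {block_id} lost: all copies were on failed nodes")
```

    With four nodes, block 5 is held by nodes 1 and 2 in this sketch, so it remains readable after node 1 fails and is lost only if node 2 fails as well.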

  12. Superior model for fault tolerance computation in designing nano-sized circuit systems

    Energy Technology Data Exchange (ETDEWEB)

    Singh, N. S. S., E-mail: narinderjit@petronas.com.my; Muthuvalu, M. S., E-mail: msmuthuvalu@gmail.com [Fundamental and Applied Sciences Department, Universiti Teknologi PETRONAS, Bandar Seri Iskandar, Perak (Malaysia); Asirvadam, V. S., E-mail: vijanth-sagayan@petronas.com.my [Electrical and Electronics Engineering Department, Universiti Teknologi PETRONAS, Bandar Seri Iskandar, Perak (Malaysia)

    2014-10-24

    As CMOS technology scales to nanometre dimensions, reliability becomes a decisive issue in the design methodology of nano-sized circuit systems. As a result, several computational approaches have been developed to compute and evaluate the reliability of nano-electronic circuits. Computing reliability becomes very troublesome and time-consuming as the computational complexity builds up with circuit size. Therefore, being able to measure reliability quickly and accurately is fast becoming necessary in designing modern logic integrated circuits. For this purpose, the paper first looks into the development of an automated reliability evaluation tool based on the generalization of the Probabilistic Gate Model (PGM) and Boolean Difference-based Error Calculator (BDEC) models. The Matlab-based tool allows users to significantly speed up the task of reliability analysis for a very large number of nano-electronic circuits. Second, using the developed automated tool, the paper presents a comparative study of reliability computation and evaluation by the PGM and BDEC models for different implementations of circuits with the same functionality. Based on the reliability analysis, BDEC gives exact and transparent reliability measures, but as the complexity of the same-functionality circuits with respect to gate error increases, the reliability measure by BDEC tends to be lower than the reliability measure by PGM. The lower reliability measure by BDEC is explained in this paper using the distribution of different signal input patterns over time for same-functionality circuits. Simulation results show that the reliability measure by BDEC depends not only on faulty gates but also on circuit topology, the probability of input signals being one or zero, and the probability of error on signal lines.
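
    As a rough illustration of the PGM idea, the following Python sketch propagates signal probabilities through noisy NAND gates. It is not the paper's Matlab tool; the function names, the gate-flip error parameter `eps`, and the assumption of statistically independent inputs are choices made here for illustration. A gate that flips its output with probability `eps` produces a 1 with probability (1 - eps) * p_ideal + eps * (1 - p_ideal), where p_ideal is the fault-free probability of a 1.

```python
def nand_pgm(p_a: float, p_b: float, eps: float) -> float:
    """Probability that a noisy NAND gate outputs 1.

    p_a, p_b: probabilities that each input is 1 (assumed independent).
    eps: probability that the gate flips its output (assumed fault model).
    """
    for x in (p_a, p_b, eps):
        if not 0.0 <= x <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    ideal_one = 1.0 - p_a * p_b  # fault-free NAND outputs 1 unless both inputs are 1
    return (1.0 - eps) * ideal_one + eps * (1.0 - ideal_one)


def two_level_nand(p_a: float, p_b: float, p_c: float, p_d: float, eps: float) -> float:
    """Output probability of NAND(NAND(a, b), NAND(c, d)) with identical noisy gates.

    Treats the two intermediate signals as independent, which holds here
    because they share no inputs.
    """
    left = nand_pgm(p_a, p_b, eps)
    right = nand_pgm(p_c, p_d, eps)
    return nand_pgm(left, right, eps)
```

    Comparing the result for a given `eps` with the result for `eps = 0` indicates how far gate errors shift the output probability for a chosen input distribution.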

  13. Observer-based distributed adaptive fault-tolerant containment control of multi-agent systems with general linear dynamics.

    Science.gov (United States)

    Ye, Dan; Chen, Mengmeng; Li, Kui

    2017-11-01

    In this paper, we consider the distributed containment control problem of multi-agent systems with actuator bias faults based on observer method. The objective is to drive the followers into the convex hull spanned by the dynamic leaders, where the input is unknown but bounded. By constructing an observer to estimate the states and bias faults, an effective distributed adaptive fault-tolerant controller is developed. Different from the traditional method, an auxiliary controller gain is designed to deal with the unknown inputs and bias faults together. Moreover, the coupling gain can be adjusted online through the adaptive mechanism without using the global information. Furthermore, the proposed control protocol can guarantee that all the signals of the closed-loop systems are bounded and all the followers converge to the convex hull with bounded residual errors formed by the dynamic leaders. Finally, a decoupled linearized longitudinal motion model of the F-18 aircraft is used to demonstrate the effectiveness. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.

  14. Fault Diagnosis and Fault Tolerant Control for Non-Gaussian Singular Time-Delayed Stochastic Distribution Systems with Disturbance Based on the Rational Square-Root Model

    Directory of Open Access Journals (Sweden)

    Yuancheng Sun

    2016-01-01

    For the non-Gaussian singular time-delayed stochastic distribution control (SDC) system with unknown external disturbance, where the output probability density function (PDF) is approximated by the rational square-root B-spline basis function, a robust fault diagnosis and fault tolerant control algorithm is presented. A full-order observer is constructed to estimate the exogenous disturbance and an adaptive observer is used to estimate the fault size. A fault tolerant tracking controller is designed using the feedback of distribution tracking error, fault, and disturbance estimation so that the postfault output PDF still tracks the desired distribution. Finally, a simulation example is included to illustrate the effectiveness of the proposed algorithms, and encouraging results have been obtained.

  15. Strategies for Fault-Tolerant, Space-Based Computing: Lessons Learned from the ARGOS Testbed

    National Research Council Canada - National Science Library

    Levellette, M. N; Wood, K. S; Wood, D. L; Beall, J. H; Shirvani, P. P; Oh, N; McCluskey, E. J

    2002-01-01

    The Advanced Space Computing and Autonomy Testbed on the ARGOS Satellite provides the first direct, on orbit comparison of a modern radiation hardened 32 bit processor with a similar COTS processor...

  16. System-Level Development of Fault-Tolerant Distributed Aero-Engine Control Architecture, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — NASA's vision for an "intelligent engine" will be realized with the development of a truly distributed control system and reliable smart transducer node components;...

  17. Fault Detection, Identification, Reconstruction, and Fault-Tolerant Estimation for Distributed Spacecraft, Phase II

    Data.gov (United States)

    National Aeronautics and Space Administration — Formation flying enables new capabilities in distributed sensing, surveillance in Earth orbit and for interferometer imaging in deep space as envisioned by the...

  18. Distributed sensor and actuator reconfiguration for fault-tolerant networked control systems

    NARCIS (Netherlands)

    Herdeiro Teixeira, A.M.; Araujo, Jose; Sandberg, Henrik; Johansson, Karl H.

    2017-01-01

    In this paper, we address the problem of distributed reconfiguration of networked control systems upon the removal of misbehaving sensors and actuators. In particular, we consider systems with redundant sensors and actuators cooperating to recover from faults. Reconfiguration is performed while

  19. Distributed fault-tolerant time-varying formation control for high-order linear multi-agent systems with actuator failures.

    Science.gov (United States)

    Hua, Yongzhao; Dong, Xiwang; Li, Qingdong; Ren, Zhang

    2017-11-01

    This paper investigates the fault-tolerant time-varying formation control problems for high-order linear multi-agent systems in the presence of actuator failures. Firstly, a fully distributed formation control protocol is presented to compensate for the influences of both bias fault and loss of effectiveness fault. Using the adaptive online updating strategies, no global knowledge about the communication topology is required and the bounds of actuator failures can be unknown. Then an algorithm is proposed to determine the control parameters of the fault-tolerant formation protocol, where the time-varying formation feasible conditions and an approach to expand the feasible formation set are given. Furthermore, the stability of the proposed algorithm is proven based on the Lyapunov-like theory. Finally, two simulation examples are given to demonstrate the effectiveness of the theoretical results. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.

  20. Development and evaluation of a fault-tolerant multiprocessor (FTMP) computer. Volume 4: FTMP executive summary

    Science.gov (United States)

    Smith, T. B., III; Lala, J. H.

    1984-01-01

    The FTMP architecture is a high reliability computer concept modeled after a homogeneous multiprocessor architecture. Elements of the FTMP are operated in tight synchronism with one another and hardware fault-detection and fault-masking is provided which is transparent to the software. Operating system design and user software design is thus greatly simplified. Performance of the FTMP is also comparable to that of a simplex equivalent due to the efficiency of fault handling hardware. The FTMP project constructed an engineering module of the FTMP, programmed the machine and extensively tested the architecture through fault injection and other stress testing. This testing confirmed the soundness of the FTMP concepts.

  1. ECFS: A decentralized, distributed and fault-tolerant FUSE filesystem for the LHCb online farm

    CERN Document Server

    Rybczynski, Tomasz; Neufeld, Niko

    2014-01-01

    The LHCb experiment records millions of proton collisions every second, but only a fraction of them are useful for LHCb physics. In order to filter out the 'bad events' a large farm of x86-servers (~2000 nodes) has been put in place. These servers boot from and run from NFS; however, they use their local disk to temporarily store data which cannot be processed in real-time ('data-deferring'). These events are subsequently processed when there is no live data coming in. The effective CPU power is thus greatly increased. This gain in CPU power depends critically on the availability of the local disks. For cost and power reasons, mirroring (RAID-1) is not used, leading to a lot of operational headache with failing disks and disk-errors or server failures induced by faulty disks. To mitigate these problems and increase the reliability of the LHCb farm, while at the same time keeping cost and power-consumption low, an extensive research and study of existing highly available and distributed file systems has been don...

  2. A Fault-Tolerant Mobile Computing Model Based On Scalable Replica

    Directory of Open Access Journals (Sweden)

    Meenakshi Sati

    2014-06-01

    The most frequent challenge faced by mobile users is staying connected to online data and, while disconnected or poorly connected, keeping a replica of critical data. Nomadic users require replication to store copies of critical data on their mobile machines. Existing replication services do not provide all classes of mobile users with the capabilities they require, which include the ability for direct synchronization between any two replicas, support for large numbers of replicas, and detailed control over what files reside on their local (mobile) replica. Existing peer-to-peer solutions would enable direct communication, but suffer from dramatic scaling problems in the number of replicas, limiting the number of overall users and impacting performance. Roam is a replication system designed to satisfy the requirements of the mobile user. Roam is based on the Ward Model, a replication architecture for mobile environments. Using the Ward Model and new distributed algorithms, Roam provides a scalable replication solution for the mobile user. We describe the motivation, design, and implementation of Roam and report its performance. Replication is extremely important in mobile environments because nomadic users require local copies of important data.

  3. Towards scalable Byzantine fault-tolerant replication

    Science.gov (United States)

    Zbierski, Maciej

    2017-08-01

    Byzantine fault-tolerant (BFT) replication is a powerful technique, enabling distributed systems to remain available and correct even in the presence of arbitrary faults. Unfortunately, existing BFT replication protocols are mostly load-unscalable, i.e. they fail to respond with adequate performance increase whenever new computational resources are introduced into the system. This article proposes a universal architecture facilitating the creation of load-scalable distributed services based on BFT replication. The suggested approach exploits parallel request processing to fully utilize the available resources, and uses a load balancer module to dynamically adapt to the properties of the observed client workload. The article additionally provides a discussion on selected deployment scenarios, and explains how the proposed architecture could be used to increase the dependability of contemporary large-scale distributed systems.
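
    For context, the replica-count arithmetic that classical BFT replication protocols rely on can be sketched in Python. The abstract does not state these bounds; they are the standard result that tolerating f arbitrary faults requires at least 3f + 1 replicas, with quorums of 2f + 1 in a group of exactly 3f + 1. The function names are hypothetical.

```python
def min_replicas(f: int) -> int:
    """Smallest replica group that tolerates f Byzantine (arbitrary) faults."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def max_faults(n: int) -> int:
    """Largest number of Byzantine faults a group of n replicas can tolerate."""
    if n < 1:
        raise ValueError("n must be positive")
    return (n - 1) // 3


def quorum_size(f: int) -> int:
    """Quorum used in a group of exactly 3f + 1 replicas.

    Any two quorums of this size overlap in at least f + 1 replicas,
    so they share at least one correct replica.
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    return 2 * f + 1
```

    These bounds also suggest why adding replicas to a single group does not by itself raise throughput: every request still has to be handled by a quorum, which is the load-scalability problem the article sets out to address.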

  4. Fault Tolerant Control Systems

    DEFF Research Database (Denmark)

    Bøgh, S. A.

    This thesis considered the development of fault tolerant control systems. The focus was on the category of automated processes that do not necessarily comprise a high number of identical sensors and actuators to maintain safe operation, but still have a potential for improving immunity to component failures. It is often feasible to increase availability for these control loops by designing the control system to perform on-line detection and reconfiguration in case of faults before the safety system makes a close-down of the process. A general development methodology is given in the thesis that carried the control system designer through the steps necessary to consider fault handling in an early design phase. It was shown how an existing control loop with interface to the plant wide control system could be extended with three additional modules to obtain fault tolerance: Fault detection...

  5. An efficient fault-tolerant order entry management information system based on special distributed client/server architecture.

    Science.gov (United States)

    Chuang, C T

    1998-11-01

    An automatic order entry system is very important for the processing of out-patient information, not only helping doctors to enter their orders directly but also reducing errors of communication. Many hospitals are anxious to set up a direct order entry system but are concerned about possible system failures. In this paper we report on an effective and efficient fault-tolerant order entry management system which satisfies the requirements for out-patient order entry. From the results of experiments on a prototype we found that the system was user friendly and reduced the time taken. Doctors are able to enter their orders more easily, accurately and quickly by selecting from the standardized and personalized menus to be found in the system.

  6. Fault-Tolerant NDE Data Reduction Framework, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — A distributed fault tolerant nondestructive evaluation (NDE) data reduction framework is proposed in which large NDE datasets are mapped to thousands to millions of...

  7. Fault-Tolerant Precision Formation Guidance for Interferometry, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — A methodology is to be developed that will allow the development and implementation of fault-tolerant control system for distributed collaborative spacecraft. The...

  8. Fault tolerance and reliability in integrated ship control

    DEFF Research Database (Denmark)

    Nielsen, Jens Frederik Dalsgaard; Izadi-Zamanabadi, Roozbeh; Schiøler, Henrik

    2002-01-01

    Various strategies for achieving fault tolerance in large scale control systems are discussed. The positive and negative impacts of distribution through network communication are presented. The ATOMOS framework for standardized reliable marine automation is presented along with the corresponding...

  9. Design and simulation of advanced fault tolerant flight control schemes

    Science.gov (United States)

    Gururajan, Srikanth

    This research effort describes the design and simulation of a distributed Neural Network (NN) based fault tolerant flight control scheme and the interface of the scheme within a simulation/visualization environment. The goal of the fault tolerant flight control scheme is to recover an aircraft from failures to its sensors or actuators. A commercially available simulation package, Aviator Visual Design Simulator (AVDS), was used for the purpose of simulation and visualization of the aircraft dynamics and the performance of the control schemes. For the purpose of the sensor failure detection, identification and accommodation (SFDIA) task, it is assumed that the pitch, roll and yaw rate gyros onboard are without physical redundancy. The task is accomplished through the use of a Main Neural Network (MNN) and a set of three De-Centralized Neural Networks (DNNs), providing analytical redundancy for the pitch, roll and yaw gyros. The purpose of the MNN is to detect a sensor failure while the purpose of the DNNs is to identify the failed sensor and then to provide failure accommodation. The actuator failure detection, identification and accommodation (AFDIA) scheme also features the MNN, for detection of actuator failures, along with three Neural Network Controllers (NNCs) for providing the compensating control surface deflections to neutralize the failure induced pitching, rolling and yawing moments. All NNs continue to train on-line, in addition to an offline trained baseline network structure, using the Extended Back-Propagation Algorithm (EBPA), with the flight data provided by the AVDS simulation package. The above mentioned adaptive flight control schemes have been traditionally implemented sequentially on a single computer. This research addresses the implementation of these fault tolerant flight control schemes on parallel and distributed computer architectures, using Berkeley Software Distribution (BSD) sockets and Message Passing Interface (MPI) for inter

  10. Fault-tolerant architectures for superconducting qubits

    Energy Technology Data Exchange (ETDEWEB)

    DiVincenzo, David P [IBM Research Division, Thomas J Watson Research Center, Yorktown Heights, NY 10598 (United States)], E-mail: divince@watson.ibm.com

    2009-12-15

    In this short review, I draw attention to new developments in the theory of fault tolerance in quantum computation that may give concrete direction to future work in the development of superconducting qubit systems. The basics of quantum error-correction codes, which I will briefly review, have not significantly changed since their introduction 15 years ago. But an interesting picture has emerged of an efficient use of these codes that may put fault-tolerant operation within reach. It is now understood that two-dimensional surface codes, close relatives of the original toric code of Kitaev, can be adapted as shown by Raussendorf and Harrington to effectively perform logical gate operations in a very simple planar architecture, with error thresholds for fault-tolerant operation simulated to be 0.75%. This architecture uses topological ideas in its functioning, but it is not 'topological quantum computation'; there are no non-abelian anyons in sight. I offer some speculations on the crucial pieces of superconducting hardware that could be demonstrated in the next couple of years that would be clear stepping stones towards this surface-code architecture.

  11. What is Fault Tolerant Control

    DEFF Research Database (Denmark)

    Blanke, Mogens; Frei, C. W.; Kraus, K.

    2000-01-01

    Faults in automated processes will often cause undesired reactions and shut-down of a controlled plant, and the consequences could be damage to the plant, to personnel or the environment. Fault-tolerant control is the synonym for a set of recent techniques that were developed to increase plant availability and reduce the risk of safety hazards. Its aim is to prevent simple faults from developing into serious failures. Fault-tolerant control merges several disciplines to achieve this goal, including on-line fault diagnosis, automatic condition assessment and calculation of remedial actions when a fault...

  12. Steps toward fault-tolerant quantum chemistry.

    Energy Technology Data Exchange (ETDEWEB)

    Taube, Andrew Garvin

    2010-05-01

    Developing quantum chemistry programs on the coming generation of exascale computers will be a difficult task. The programs will need to be fault-tolerant and minimize the use of global operations. This work explores the use of a task-based model that uses a data-centric approach to allocate work to different processes as it applies to quantum chemistry. After introducing the key problems that appear when trying to parallelize a complicated quantum chemistry method such as coupled-cluster theory, we discuss the implications of that model as it pertains to the computational kernel of a coupled-cluster program: matrix multiplication. We also discuss the extensions that would be required to build a full coupled-cluster program using the task-based model. Current programming models for high-performance computing are fault-intolerant and use global operations. Those properties are unsustainable as computers scale to millions of CPUs; instead one must recognize that these systems will be hierarchical in structure and prone to constant faults, and that global operations will be infeasible. The FAST-OS HARE project is introducing a scale-free computing model to address these issues. This model is hierarchical and fault-tolerant by design, allows for the clean overlap of computation and communication, reduces the network load, does not require checkpointing, and avoids the complexity of many HPC runtimes. Development of an algorithm within this model requires a change in focus from imperative programming to a data-centric approach. Quantum chemistry (QC) algorithms, in particular electronic structure methods, are an ideal test bed for this computing model. These methods describe the distribution of electrons in a molecule, which determines the properties of the molecule. The computational cost of these methods is high, scaling quartically or higher in the size of the molecule, which is why QC applications are major users of HPC resources. The complexity of these algorithms means that

  13. Fault-Tolerant Heat Exchanger

    Science.gov (United States)

    Izenson, Michael G.; Crowley, Christopher J.

    2005-01-01

    A compact, lightweight heat exchanger has been designed to be fault-tolerant in the sense that a single-point leak would not cause mixing of heat-transfer fluids. This particular heat exchanger is intended to be part of the temperature-regulation system for habitable modules of the International Space Station and to function with water and ammonia as the heat-transfer fluids. The basic fault-tolerant design is adaptable to other heat-transfer fluids and heat exchangers for applications in which mixing of heat-transfer fluids would pose toxic, explosive, or other hazards: Examples could include fuel/air heat exchangers for thermal management on aircraft, process heat exchangers in the cryogenic industry, and heat exchangers used in chemical processing. The reason this heat exchanger can tolerate a single-point leak is that the heat-transfer fluids are everywhere separated by a vented volume and at least two seals. The combination of fault tolerance, compactness, and light weight is implemented in a unique heat-exchanger core configuration: Each fluid passage is entirely surrounded by a vented region bridged by solid structures through which heat is conducted between the fluids. Precise, proprietary fabrication techniques make it possible to manufacture the vented regions and heat-conducting structures with very small dimensions to obtain a very large coefficient of heat transfer between the two fluids. A large heat-transfer coefficient favors compact design by making it possible to use a relatively small core for a given heat-transfer rate. Calculations and experiments have shown that in most respects, the fault-tolerant heat exchanger can be expected to equal or exceed the performance of the non-fault-tolerant heat exchanger that it is intended to supplant (see table). The only significant disadvantages are a slight weight penalty and a small decrease in the mass-specific heat transfer.

  14. A logical structure based fault tolerant approach to handle leader election in mobile ad hoc networks

    Directory of Open Access Journals (Sweden)

    Bharti Sharma

    2017-07-01

    We propose a lightweight layered architecture to support the computation of a leader in mobile ad hoc networks. In distributed applications, the leader has to perform a number of synchronization activities among participating nodes and numerous applications; hence, it is a stressed node and consequently prone to failure. Thus, fast and fault-tolerant leader election is a major concern and a popular area of research in distributed computing networks in general, and in wireless ad hoc networks in particular. In the present article, we propose a fault-tolerant leader election approach. More importantly, the nodes elect the leader quickly on the basis of local information only. The approach is illustrated with suitable examples. A correctness proof and a performance evaluation are also presented.

  15. Fault Tolerant External Memory Algorithms

    DEFF Research Database (Denmark)

    Jørgensen, Allan Grønlund; Brodal, Gerth Stølting; Mølhave, Thomas

    2009-01-01

    Algorithms dealing with massive data sets are usually designed for I/O-efficiency, often captured by the I/O model by Aggarwal and Vitter. Another aspect of dealing with massive data is how to deal with memory faults, e.g. captured by the adversary based faulty memory RAM by Finocchi and Italiano. However, current fault tolerant algorithms do not scale beyond the internal memory. In this paper we investigate for the first time the connection between I/O-efficiency in the I/O model and fault tolerance in the faulty memory RAM, and we assume that both memory and disk are unreliable. We show a lower bound on the number of I/Os required for any deterministic dictionary that is resilient to memory faults. We design a static and a dynamic deterministic dictionary with optimal query performance as well as an optimal sorting algorithm and an optimal priority queue. Finally, we consider scenarios where...
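
    A basic building block in the faulty-memory literature is a "reliable variable": a value stored in 2δ + 1 copies and read by majority vote, where δ is an assumed upper bound on the number of corrupted memory cells. The Python sketch below shows only this replication trick, not the dictionaries, sorting algorithm, or priority queue of the paper, whose constructions the abstract does not describe. The class name `ResilientVar` and the parameter `delta` are hypothetical.

```python
from collections import Counter


class ResilientVar:
    """Value stored in 2*delta + 1 copies and read back by majority vote.

    Hypothetical illustration; values must be hashable for Counter to work.
    """

    def __init__(self, value, delta: int) -> None:
        if delta < 0:
            raise ValueError("delta must be non-negative")
        self.copies = [value] * (2 * delta + 1)

    def write(self, value) -> None:
        # Overwrite every copy so that earlier corruptions are discarded.
        self.copies = [value] * len(self.copies)

    def read(self):
        # If at most delta copies are corrupted, at least delta + 1 copies
        # are intact, which is a strict majority of the 2*delta + 1 copies.
        value, _count = Counter(self.copies).most_common(1)[0]
        return value
```

    The cost of this approach is a factor of 2δ + 1 in space and read time per variable, which is one reason resilient algorithms use it sparingly rather than replicating all data.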

  16. Fibre bundle framework for quantum fault tolerance

    Science.gov (United States)

    Zhang, Lucy Liuxuan; Gottesman, Daniel

    2014-03-01

    We introduce a differential geometric framework for describing families of quantum error-correcting codes and for understanding quantum fault tolerance. In particular, we use fibre bundles and a natural projectively flat connection thereon to study the transformation of codewords under unitary fault-tolerant evolutions. We'll explain how the fault-tolerant logical operations are given by the monodromy group for the bundles with projectively flat connection, which is always discrete. We will discuss the construction of the said bundles for two examples of fault-tolerant families of operations, the string operators in the toric code and the qudit transversal gates. This framework unifies topological fault tolerance and fault tolerance based on transversal gates, and is expected to apply for all unitary quantum fault-tolerant protocols.

  17. Scalability, performance, and fault tolerance of PACS architectures

    Science.gov (United States)

    Blume, Hartwig R.; Prior, Fred W.; di Pierro, Milan C.; Goble, John C.; Lodgberg, Jonas; Kenney, Robert S.; Goeringer, Fred

    1998-07-01

    Three data-base architectures may be distinguished among Picture Archiving and Communication Systems (PACSs): (1) configurations with logically and physically centralized data-base and file server, (2) systems with physically distributed file servers and a logically centralized data-base, and (3) installations with logically and physically distributed data-bases and file servers. A brief overview of these architectures and their scalability, performance, and fault-tolerance is given. A PACS for an existing large university hospital is designed for the first as well as the second architecture using given image production data and workflow. We evaluate the fault-tolerance of the two architectures. By modeling the workflow and employing queuing theory, solutions with practically realizable data transfer requirements are found for both architectures. With today's performance and cost of computers, storage, and information management technologies, the second and third architectures are preferably implemented, depending on the size of the installation. The architectures offer almost unlimited scalability, very high fault-tolerance, and optimized workflow. We describe a modern commercial PACS that adheres to the open-systems concept and consists of software application programs that run, independent of specific computer and network components, on off-the-shelf hardware and under standard multi-platform operating systems and utilize commercial data-base management systems and network managers. The system is based on the second architecture with multiple islands of functionality, each with servers and archive modules and a physically distributed data-base. Our PACS architecture supports browser technology: Workstations use the data-base to determine the location of needed information and then, through the image browser, mount the appropriate file server for access. The architecture supports a concept similar to domain name server (DNS) directory services on the
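
    The abstract does not say which queuing model was used. As a hedged illustration of the kind of estimate queuing theory provides, the Python sketch below evaluates a single-server M/M/1 queue, for example a file server receiving image requests. The choice of M/M/1 and the function name `mm1_metrics` are assumptions, not details taken from the paper.

```python
def mm1_metrics(arrival_rate: float, service_rate: float):
    """Steady-state metrics of an M/M/1 queue.

    arrival_rate: mean requests per unit time (lambda).
    service_rate: mean requests served per unit time (mu).
    Returns (utilization, mean number in system, mean time in system).
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival rate must be below service rate")
    rho = arrival_rate / service_rate                       # server utilization
    mean_in_system = rho / (1.0 - rho)                      # L = rho / (1 - rho)
    mean_time = 1.0 / (service_rate - arrival_rate)         # W = 1 / (mu - lambda)
    return rho, mean_in_system, mean_time
```

    In a centralized architecture all requests load one such server, while physically distributed file servers split the arrival rate among several queues, which lowers utilization and waiting time per server.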

  18. Fault tolerant homopolar magnetic bearings with flux invariant control

    International Nuclear Information System (INIS)

    Na, Uhn Joo

    2006-01-01

    The theory for a novel fault-tolerant 4-active-pole homopolar magnetic bearing is developed. If any one of the four coils in the bearing actuator fails, the remaining three coil currents change via an optimal distribution matrix such that the same opposing pole, C-core type, control fluxes as those of the un-failed bearing are produced. The homopolar magnetic bearing thus provides unaltered magnetic forces without any loss of the bearing load capacity even if any one coil suddenly fails. Numerical examples are provided to illustrate the novel fault-tolerant, 4-active-pole homopolar magnetic bearings.

  19. Scalable error correction in distributed ion trap computers

    International Nuclear Information System (INIS)

    Oi, Daniel K. L.; Devitt, Simon J.; Hollenberg, Lloyd C. L.

    2006-01-01

    A major challenge for quantum computation in ion trap systems is scalable integration of error correction and fault tolerance. We analyze a distributed architecture with rapid high-fidelity local control within nodes and entangled links between nodes alleviating long-distance transport. We demonstrate fault-tolerant operator measurements which are used for error correction and nonlocal gates. This scheme is readily applied to linear ion traps which cannot be scaled up beyond a few ions per individual trap but which have access to a probabilistic entanglement mechanism. A proof-of-concept system is presented which is within the reach of current experiment

  20. Fault-tolerant Supervisory Control

    DEFF Research Database (Denmark)

    Izadi-Zamanabadi, Roozbeh

    , is extended to cope with the important reconfiguration problem. In order to enable a designer to acquire knowledge about reconfiguration possibilities, the structural analysis method is added as an extension to the existing methodology. This extension builds upon the earlier method where fault propagation...... the selection of remedial actions. Furthermore, it is shown how sensor information fusion is obtained by using the SA method. The construction of the supervisor's decision logic is essential for the active form of fault-tolerant control. In this regard, two approaches have been presented. The first aims...... at constructing the decision logic in the form of a ``language''. This language is obtained as a direct result of the component based approach, presented in this thesis. This approach is based on the definition of a functional component, components placement in a control system hierarchy and the definition of system...

  1. Fault Tolerant Wind Farm Control

    DEFF Research Database (Denmark)

    Odgaard, Peter Fogh; Stoustrup, Jakob

    2013-01-01

    In recent years the wind turbine industry has focused on optimizing the cost of energy. One of the important factors in this is to increase the reliability of the wind turbines. Advanced fault detection, isolation and accommodation are important tools in this process. Clearly most faults are dealt...... with best at a wind turbine control level. However, some faults are better dealt with at the wind farm control level, if the wind turbine is located in a wind farm. In this paper a benchmark model for fault detection and isolation, and fault tolerant control of wind turbines implemented at the wind farm...... control level is presented. The benchmark model includes a small wind farm of nine wind turbines, based on simple models of the wind turbines as well as the wind and interactions between wind turbines in the wind farm. The model includes wind and power references scenarios as well as three relevant fault...

  2. Adiabatic Motion of Fault Tolerant Qubits

    Science.gov (United States)

    Drummond, David Edward

    This work proposes and analyzes the adiabatic motion of fault tolerant qubits in two systems as candidates for the building blocks of a quantum computer. The first proposal examines a pair of electron spins in double quantum dots, finding that the leading source of decoherence, hyperfine dephasing, can be suppressed by adiabatic rotation of the dots in real space. The additional spin-orbit effects introduced by this motion are analyzed, simulated, and found to result in an infidelity below the error-correction threshold. The second proposal examines topological qubits formed by Majorana zero modes theorized to exist at the ends of semiconductor nanowires coupled to conventional superconductors. A model is developed to design adiabatic movements of the Majorana bound states to produce entangled qubits. Analysis and simulations indicate that these adiabatic operations can also be used to demonstrate entanglement experimentally by testing Bell's theorem.

  3. Diagnosis and fault-tolerant control

    CERN Document Server

    Blanke, Mogens; Lunze, Jan; Staroswiecki, Marcel

    2016-01-01

    Fault-tolerant control aims at a gradual shutdown response in automated systems when faults occur. It satisfies the industrial demand for enhanced availability and safety, in contrast to traditional reactions to faults, which bring about sudden shutdowns and loss of availability. The book presents effective model-based analysis and design methods for fault diagnosis and fault-tolerant control. Architectural and structural models are used to analyse the propagation of the fault through the process, to test the fault detectability and to find the redundancies in the process that can be used to ensure fault tolerance. It also introduces design methods suitable for diagnostic systems and fault-tolerant controllers for continuous processes that are described by analytical models of discrete-event systems represented by automata. The book is suitable for engineering students, engineers in industry and researchers who wish to get an overview of the variety of approaches to process diagnosis and fault-tolerant contro...

  4. Synthesis of Fault-Tolerant Embedded Systems

    DEFF Research Database (Denmark)

    Eles, Petru; Izosimov, Viacheslav; Pop, Paul

    2008-01-01

    This work addresses the issue of design optimization for fault-tolerant hard real-time systems. In particular, our focus is on the handling of transient faults using both checkpointing with rollback recovery and active replication. Fault tolerant schedules are generated based on a conditional...... process graph representation. The formulated system synthesis approaches decide the assignment of fault-tolerance policies to processes, the optimal placement of checkpoints and the mapping of processes to processors, such that multiple transient faults are tolerated, transparency requirements...

  5. Fault tolerant control for switched linear systems

    CERN Document Server

    Du, Dongsheng; Shi, Peng

    2015-01-01

    This book presents up-to-date research and novel methodologies on fault diagnosis and fault tolerant control for switched linear systems. It provides a unified yet neat framework of filtering, fault detection, fault diagnosis and fault tolerant control of switched systems. It can therefore serve as a useful textbook for senior and/or graduate students who are interested in knowing the state-of-the-art of filtering, fault detection, fault diagnosis and fault tolerant control areas, as well as recent advances in switched linear systems.  

  6. Fault Tolerant Control: A Simultaneous Stabilization Result

    DEFF Research Database (Denmark)

    Stoustrup, Jakob; Blondel, V.D.

    2004-01-01

    This paper discusses the problem of designing fault tolerant compensators that stabilize a given system both in the nominal situation, as well as in the situation where one of the sensors or one of the actuators has failed. It is shown that such compensators always exist, provided that the system...... is detectable from each output and that it is stabilizable. The proof of this result is constructive, and a worked example shows how to design a fault tolerant compensator for a simple, yet challenging system. A family of second order systems is described that requires fault tolerant compensators of arbitrarily...

  7. Real-Time Fault Tolerant Networking Protocols

    National Research Council Canada - National Science Library

    Henzinger, Thomas A

    2004-01-01

    We made significant progress in the areas of video streaming, wireless protocols, mobile ad-hoc and sensor networks, peer-to-peer systems, fault tolerant algorithms, dependability and timing analysis...

  8. Reliable, fault tolerant control systems for nuclear generating stations

    International Nuclear Information System (INIS)

    McNeil, T.O.; Olmstead, R.A.; Schafer, S.

    1990-01-01

    Two operational features of CANDU Nuclear Power Stations provide for high plant availability. First, the plant re-fuels on-line, thereby eliminating the need for periodic and lengthy refuelling 'outages'. Second, all plants are controlled by real-time computer systems. Later plants are also protected using real-time computer systems. In the past twenty years, the control systems now operating in 21 plants have achieved an availability of 99.8%, making significant contributions to high CANDU plant capacity factors. This paper describes some of the features that ensure the high degree of system fault tolerance and hence high plant availability. The emphasis will be placed on the fault tolerant features of the computer systems included in the latest reactor design - the CANDU 3 (450MWe). (author)

  9. Assessing Server Fault Tolerance and Disaster Recovery Implementation in Thin Client Architectures

    National Research Council Canada - National Science Library

    Slaydon, Samuel L

    2007-01-01

    This thesis will focus on assessing server fault tolerance and disaster recovery procedures for thin-clients being implemented in smart classrooms and computer laboratories aboard the Naval Postgraduate School campus...

  10. Energy-efficient fault-tolerant systems

    CERN Document Server

    Mathew, Jimson; Pradhan, Dhiraj K

    2013-01-01

    This book describes the state-of-the-art in energy efficient, fault-tolerant embedded systems. It covers the entire product lifecycle of electronic systems design, analysis and testing and includes discussion of both circuit and system-level approaches. Readers will be enabled to meet the conflicting design objectives of energy efficiency and fault-tolerance for reliability, given the up-to-date techniques presented.

  11. Incorporating Fault Tolerance Tactics in Software Architecture Patterns

    NARCIS (Netherlands)

    Harrison, Neil B.; Avgeriou, Paris

    2008-01-01

    One important way that an architecture impacts fault tolerance is by making it easy or hard to implement measures that improve fault tolerance. Many such measures are described as fault tolerance tactics. We studied how various fault tolerance tactics can be implemented in the best-known

  12. Investigation of the applicability of a functional programming model to fault-tolerant parallel processing for knowledge-based systems

    Science.gov (United States)

    Harper, Richard

    1989-01-01

    In a fault-tolerant parallel computer, a functional programming model can facilitate distributed checkpointing, error recovery, load balancing, and graceful degradation. Such a model has been implemented on the Draper Fault-Tolerant Parallel Processor (FTPP). When used in conjunction with the FTPP's fault detection and masking capabilities, this implementation results in a graceful degradation of system performance after faults. Three graceful degradation algorithms have been implemented and are presented. A user interface has been implemented which requires minimal cognitive overhead by the application programmer, masking such complexities as the system's redundancy, distributed nature, variable complement of processing resources, load balancing, fault occurrence and recovery. This user interface is described and its use demonstrated. The applicability of the functional programming style to the Activation Framework, a paradigm for intelligent systems, is then briefly described.

  13. Towards an assessment of fault-tolerant design principles for software

    Science.gov (United States)

    Eckhardt, Dave E., Jr.

    1985-01-01

    Several topics related to the assessment of fault-tolerant design principles for software are presented in outline form. A coincident errors model, discrete intensity distribution and the effects of coincident errors are discussed.

  14. Fault-tolerant interface between quantum memories and quantum processors.

    Science.gov (United States)

    Poulsen Nautrup, Hendrik; Friis, Nicolai; Briegel, Hans J

    2017-11-06

    Topological error correction codes are promising candidates to protect quantum computations from the deteriorating effects of noise. While some codes provide high noise thresholds suitable for robust quantum memories, others allow straightforward gate implementation needed for data processing. To exploit the particular advantages of different topological codes for fault-tolerant quantum computation, it is necessary to be able to switch between them. Here we propose a practical solution, subsystem lattice surgery, which requires only two-body nearest-neighbor interactions in a fixed layout in addition to the indispensable error correction. This method can be used for the fault-tolerant transfer of quantum information between arbitrary topological subsystem codes in two dimensions and beyond. In particular, it can be employed to create a simple interface, a quantum bus, between noise resilient surface code memories and flexible color code processors.

  15. Fault tolerant microcomputer based alarm annunciator for Dhruva reactor

    International Nuclear Information System (INIS)

    Chandra, A.K.

    1988-01-01

    The Dhruva alarm annunciator displays the status of 624 alarm points on an array of display windows using the standard ringback sequence. Recognizing the need for very high availability, the system is implemented as a fault tolerant configuration. The annunciator is partitioned into three identical units; each unit is implemented using two microcomputers wired in a hot standby mode. In the event of one computer malfunctioning, the standby computer takes over control in a bouncefree transfer. The use of microprocessors has helped build flexibility into the system. The system also provides built-in capability to resolve the sequence of occurrence of events and conveys this information to another system for display on a CRT. This report describes the system features, the fault tolerant organisation used and the hardware and software developed for the annunciation function. (author). 8 figs
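
    The hot standby arrangement described above can be illustrated with a minimal sketch. It assumes a simplified model that is not taken from the report: the class names `Computer` and `HotStandbyUnit`, the `healthy` flag and the `scan` method are all hypothetical, and the real annunciator's ringback sequence and health monitoring are not modelled.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Computer:
    """One microcomputer of a duplex pair (hypothetical model)."""
    name: str
    healthy: bool = True


@dataclass
class HotStandbyUnit:
    """Both computers see the same inputs; only the active one drives the display."""
    primary: Computer
    standby: Computer
    active: Computer = field(init=False)

    def __post_init__(self) -> None:
        # The primary computer is in control until it is found unhealthy.
        self.active = self.primary

    def scan(self, alarm_inputs: list[bool]) -> list[bool]:
        """Return the display window states for one scan cycle."""
        if not self.active.healthy:
            other = self.standby if self.active is self.primary else self.primary
            if not other.healthy:
                raise RuntimeError("both computers have failed")
            # Switch control before producing output, so the display
            # never misses a scan cycle because of the failure.
            self.active = other
        return list(alarm_inputs)
```

    Because the standby computer is already running, the switchover happens inside a single scan cycle and the displayed alarm states are unchanged, which is the property the sketch is meant to show.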

  16. Quantum Control and Fault-tolerance

    Science.gov (United States)

    Paz Silva, Gerardo; Dominy, Jason; Lidar, Daniel

    2013-03-01

    Quantum control (QC) and the methods of fault-tolerant quantum computing (FTQC) are two of the cornerstones on which the hope for a quantum computer rests. However QC methods do not generally scale well with the size of the system, and it is not known how their performance is hindered when integration with FTQC methods, especially considering these demand a large system size overhead, is attempted under realistic noise models. Here we study this problem using dynamical decoupling in the bang-bang limit as a toy model, with a non-Markovian noise where interactions decay with distance, and show that there exists a regime of the norms of the relevant Hamiltonians, in which dynamical decoupling protected gates provide an advantage over the bare gate implementation. This is a first step towards showing that QC protocols designed for a small set of qubits can be extended to larger sets without a significant loss of performance, as long as the noise model behaves reasonably well.

  17. Dynamic surface fault tolerant control for underwater remotely operated vehicles.

    Science.gov (United States)

    Baldini, Alessandro; Ciabattoni, Lucio; Felicetti, Riccardo; Ferracuti, Francesco; Freddi, Alessandro; Monteriù, Andrea

    2018-03-01

    In this paper, we present a two-stage actuator Fault Tolerant Control (FTC) strategy for the trajectory tracking of a Remotely Operated Vehicle (ROV). Dynamic Surface Control (DSC) is used to generate the moment and forces required by the vehicle to perform the desired motion. In the second stage of the control system, a fault tolerant thruster allocation policy is employed to distribute moment and forces among the thrusters. Exhaustive simulations have been carried out in order to compare the performance of the proposed solution with respect to different control techniques (i.e., PID, backstepping and sliding mode approaches). Saturations, actuator dynamics, sensor noise and time discretization are considered, in fault-free and faulty conditions. Furthermore, in order to provide a fair and exhaustive comparison of the control techniques, the same meta-heuristic approach, namely the Artificial Bee Colony (ABC) algorithm, has been employed to tune the controllers' parameters. Copyright © 2018 ISA. Published by Elsevier Ltd. All rights reserved.

  18. Diagnosis and Fault-tolerant Control

    DEFF Research Database (Denmark)

    Blanke, Mogens; Kinnaert, Michel; Lunze, Jan

    The book presents effective model-based analysis and design methods for fault diagnosis and fault-tolerant control. Architectural and structural models are used to analyse the propagation of the fault through the process, to test the fault detectability and to find the redundancies in the process...... the applicability of the presented methods. The theoretical results are illustrated by two running examples which are used throughout the book. The book addresses engineering students, engineers in industry and researchers who wish to get a survey over the variety of approaches to process diagnosis and fault...... that can be used to ensure fault tolerance. Design methods for diagnostic systems and fault-tolerant controllers are presented for processes that are described by analytical models, by discrete-event models or that can be dealt with as quantised systems. Four case studies on pilot processes show...

  19. An architecture for fault tolerant controllers

    DEFF Research Database (Denmark)

    Niemann, Hans Henrik; Stoustrup, Jakob

    2005-01-01

    A general architecture for fault tolerant control is proposed. The architecture is based on the (primary) YJBK parameterization of all stabilizing compensators and uses the dual YJBK parameterization to quantify the performance of the fault tolerant system. The approach suggested can be applied...... for additive faults, parametric faults, and for system structural changes. The modeling for each of these fault classes is described. The method allows design for passive as well as for active fault handling. Also, the related design method can be fitted either to guarantee stability or to achieve graceful...... degradation in the sense of guaranteed degraded performance. A number of fault diagnosis problems, fault tolerant control problems, and feedback control with fault rejection problems are formulated/considered, mainly from a fault modeling point of view. The method is illustrated on a servo example including...

  20. Rollback recovery with low overhead for fault tolerance in mobile ad hoc networks

    Directory of Open Access Journals (Sweden)

    Parmeet Kaur Jaggi

    2015-10-01

    Mobile ad hoc networks (MANETs) have significantly enhanced wireless networks by eliminating the need for any fixed infrastructure. Hence, they are increasingly being used for expanding the computing capacity of existing networks or for implementation of autonomous mobile computing Grids. However, the fragile nature of MANETs makes the constituent nodes susceptible to failures, and the computing potential of these networks can be utilized only if they are fault tolerant. The technique of checkpointing-based rollback recovery has been used effectively for fault tolerance in static and cellular mobile systems; yet, the implementation of existing protocols for MANETs is not straightforward. The paper presents a novel rollback recovery protocol for handling the failures of mobile nodes in a MANET using checkpointing and sender-based message logging. The proposed protocol utilizes the routing protocol existing in the network for implementing a low overhead recovery mechanism. The presented recovery procedure at a node is completely domino-free and asynchronous. The protocol is resilient to the dynamic characteristics of the MANET, allowing a distributed application to be executed independently without access to any wired Grid or cellular network access points. We also present an algorithm to record a consistent global snapshot of the MANET.
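
    The general idea of combining checkpointing with sender-based message logging can be sketched as follows. This is a generic, single-sender illustration and not the protocol proposed in the paper: the classes `Sender` and `Receiver`, the sequence-number scheme and the running-total state are assumptions made for the example, and routing, mobility and multiple senders are not modelled.

```python
import copy


class Sender:
    """Keeps every sent message in its own log so it can be replayed later."""

    def __init__(self):
        self.log = []  # message payloads, indexed by sequence number

    def send(self, receiver, payload):
        seq = len(self.log)
        self.log.append(payload)
        receiver.deliver(seq, payload)

    def replay(self, receiver, from_seq):
        # Resend, in order, everything the receiver lost in the rollback.
        for seq in range(from_seq, len(self.log)):
            receiver.deliver(seq, self.log[seq])


class Receiver:
    """Accumulates a running total and checkpoints its state on request."""

    def __init__(self):
        self.state = {"total": 0, "next_seq": 0}
        self.checkpoint = copy.deepcopy(self.state)

    def deliver(self, seq, payload):
        if seq != self.state["next_seq"]:
            return  # ignore anything other than the next expected message
        self.state["total"] += payload
        self.state["next_seq"] += 1

    def take_checkpoint(self):
        self.checkpoint = copy.deepcopy(self.state)

    def crash_and_recover(self, sender):
        # Roll back to the last checkpoint, then ask the sender to replay.
        self.state = copy.deepcopy(self.checkpoint)
        sender.replay(self, self.state["next_seq"])
```

    Only the failed node rolls back in this sketch; the sender merely replays logged messages, which is why recovery of this kind does not cascade to other nodes.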

  1. Fault-Tolerant in Embedded Systems (MPSoC): Performance Estimation and Dynamic Migration Task

    Directory of Open Access Journals (Sweden)

    Kamel Smiri

    2017-07-01

    Multiprocessor Systems-on-Chip (MPSoC) allow the implementation of heterogeneous architectures with a high integration capacity. In recent years, the computational requirements of MPSoCs have been increasing exponentially. This complexity, coupled with constantly evolving specifications, has forced designers to consider intrinsically flexible implementations. Deploying applications typical of multimedia domains is difficult, not only due to the heterogeneous parallelism in the platforms, but also due to the performance constraints that typify these systems. An application can be modeled as a set of cooperative tasks. A task can be implemented in software or in hardware depending on its complexity and the involved cost. Our proposal is a fault tolerance approach that combines the results of a performance model with a fault tolerance technique. We focus on dynamic task migration to provide fault tolerance for multiprocessor embedded systems. We use an example multimedia application (an MJPEG decoder) to find optimal fault-tolerant systems. Our aim in this paper is to exploit the classic technique of fault tolerance. The solution chosen is the transformation of software processing into hardware processing, together with the exploitation of hybrid models (simulation/analytics). The goal is to obtain a fault-tolerant embedded system.

  2. Architecting Fault-Tolerant Software Systems

    NARCIS (Netherlands)

    Sözer, Hasan

    2009-01-01

    The increasing size and complexity of software systems makes it hard to prevent or remove all possible faults. Faults that remain in the system can eventually lead to a system failure. Fault tolerance techniques are introduced for enabling systems to recover and continue operation when they are

  3. Sputnik: ad hoc distributed computation.

    Science.gov (United States)

    Völkel, Gunnar; Lausser, Ludwig; Schmid, Florian; Kraus, Johann M; Kestler, Hans A

    2015-04-15

    In bioinformatic applications, computationally demanding algorithms are often parallelized to speed up computation. Nevertheless, setting up computational environments for distributed computation is often tedious. The aims of this project were lightweight ad hoc setup and fault-tolerant computation requiring only a Java runtime and no administrator rights, while utilizing all CPU cores effectively. The Sputnik framework provides ad hoc distributed computation on the Java Virtual Machine which uses all supplied CPU cores fully. It provides a graphical user interface for deployment setup and a web user interface displaying the status of current computation jobs. Neither a permanent setup nor administrator privileges are required. We demonstrate the utility of our approach on feature selection of microarray data. The Sputnik framework is available on GitHub http://github.com/sysbio-bioinf/sputnik under the Eclipse Public License. Contact: hkestler@fli-leibniz.de or hans.kestler@uni-ulm.de. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  4. A methodology for testing fault-tolerant software

    Science.gov (United States)

    Andrews, D. M.; Mahmood, A.; Mccluskey, E. J.

    1985-01-01

    A methodology for testing fault tolerant software is presented. There are problems associated with testing fault tolerant software because many errors are masked or corrected by voters, limiters, or automatic channel synchronization. This methodology illustrates how the same strategies used for testing fault tolerant hardware can be applied to testing fault tolerant software. For example, one strategy used in testing fault tolerant hardware is to disable the redundancy during testing. A similar testing strategy is proposed for software, namely, to move the major emphasis on testing earlier in the development cycle (before the redundancy is in place), thus reducing the possibility that undetected errors will be masked when limiters and voters are added.
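
    The masking problem described above can be illustrated with a small sketch. It is not taken from the paper: the three-channel system, the seeded bug in `buggy_channel` and the helper `failing_inputs` are all hypothetical, and a simple majority voter stands in for whatever voters and limiters a real system uses.

```python
from collections import Counter


def vote(outputs):
    """Majority voter: return the value produced by more than half the channels."""
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        raise ValueError("no majority")
    return value


def good_channel(x):
    return x * x


def buggy_channel(x):
    # Seeded fault for illustration: wrong result for one input only.
    return x * x + 1 if x == 3 else x * x


def redundant_system(x):
    """Three channels behind a voter; the single buggy channel is outvoted."""
    return vote([good_channel(x), good_channel(x), buggy_channel(x)])


def failing_inputs(unit, oracle, inputs):
    """Return the inputs on which `unit` disagrees with the expected result."""
    return [x for x in inputs if unit(x) != oracle(x)]
```

    Testing `redundant_system` over inputs 0 to 4 reports no failures because the voter hides the bug, whereas testing `buggy_channel` on its own, with the redundancy out of the way, exposes the failing input 3.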

  5. Fault tolerant control of systems with saturations

    DEFF Research Database (Denmark)

    Niemann, Hans Henrik

    2013-01-01

    This paper presents a framework for fault tolerant controllers (FTC) that includes input saturation. The controller architecture known from FTC, which is based on the Youla-Jabr-Bongiorno-Kucera (YJBK) parameterization, is extended to handle input saturation. Applying this controller architecture...... in connection with faulty systems including input saturation gives an additional YJBK transfer function related to the input saturation. In the fault free case, this additional YJBK transfer function can be applied directly for optimizing the feedback loop around the input saturation. In the faulty case......, the design problem is a mixed design problem involving both parametric faults and input saturation....

  6. Experimental Demonstration of Fault-Tolerant State Preparation with Superconducting Qubits

    Science.gov (United States)

    Takita, Maika; Cross, Andrew W.; Córcoles, A. D.; Chow, Jerry M.; Gambetta, Jay M.

    2017-11-01

    Robust quantum computation requires encoding delicate quantum information into degrees of freedom that are hard for the environment to change. Quantum encodings have been demonstrated in many physical systems by observing and correcting storage errors, but applications require not just storing information; we must accurately compute even with faulty operations. The theory of fault-tolerant quantum computing illuminates a way forward by providing a foundation and collection of techniques for limiting the spread of errors. Here we implement one of the smallest quantum codes in a five-qubit superconducting transmon device and demonstrate fault-tolerant state preparation. We characterize the resulting code words through quantum process tomography and study the free evolution of the logical observables. Our results are consistent with fault-tolerant state preparation in a protected qubit subspace.

  7. A Replication-Based Mechanism for Fault Tolerance in MapReduce Framework

    Directory of Open Access Journals (Sweden)

    Yang Liu

    2015-01-01

    MapReduce is a programming model and an associated implementation for processing and generating large data sets with a parallel, distributed algorithm on a cluster. In cloud environments, node and task failures are no longer accidental but a common feature of large-scale systems. The current rescheduling-based fault tolerance method in the MapReduce framework fails to fully consider the location of distributed data and the computation and storage overhead of rescheduling failed tasks. Thus, a single node failure will increase the completion time dramatically. In this paper, a replication-based mechanism is proposed, which takes both task and node failure into consideration. Experimental results show that, compared with the default mechanism in Hadoop, our mechanism can significantly improve performance at failure time, reducing execution time by more than 30%.

  8. Novel Design for Quantum Dots Cellular Automata to Obtain Fault-Tolerant Majority Gate

    Directory of Open Access Journals (Sweden)

    Razieh Farazkish

    2012-01-01

    Quantum-dot Cellular Automata (QCA) is one of the most attractive technologies for computing at nanoscale. The principal element in QCA is the majority gate. In this paper, the fault-tolerance properties of the majority gate are analyzed. This component is suitable for designing fault-tolerant QCA circuits. We analyze the fault-tolerance properties of the three-input majority gate in terms of misalignment, missing, and dislocation cells. In order to verify the functionality of the proposed component, some physical proofs using kink energy (the difference in electrostatic energy between the two polarization states) and computer simulations using the QCA Designer tool are provided. Our results clearly demonstrate that the redundant version of the majority gate is more robust than the standard style for this gate.
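
    As a reminder of why the three-input majority gate is the basic building block, its Boolean behaviour can be written down in a few lines. The sketch below covers only the logic function, not the physical cell layout, kink energy or the redundant design studied in the paper; the function names are chosen for illustration.

```python
from itertools import product


def majority(a, b, c):
    """Three-input majority: 1 when at least two of the inputs are 1."""
    return (a & b) | (b & c) | (a & c)


def and_gate(a, b):
    # Fixing one majority input to 0 yields a two-input AND.
    return majority(a, b, 0)


def or_gate(a, b):
    # Fixing one majority input to 1 yields a two-input OR.
    return majority(a, b, 1)


def truth_table():
    """Return (a, b, c, output) rows for all eight input combinations."""
    return [(a, b, c, majority(a, b, c)) for a, b, c in product((0, 1), repeat=3)]
```

    Since AND and OR both reduce to a majority gate with one input held constant, a fault in the majority gate affects every such derived gate, which is why its robustness matters for the whole circuit.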

  9. Development and Evaluation of Fault-Tolerant Flight Control Systems

    Science.gov (United States)

    Song, Yong D.; Gupta, Kajal (Technical Monitor)

    2004-01-01

    The research is concerned with developing a new approach to enhancing fault tolerance of flight control systems. The original motivation for fault-tolerant control comes from the need for safe operation of control elements (e.g. actuators) in the event of hardware failures in high reliability systems. One such example is a modern space vehicle subjected to actuator/sensor impairments. A major task in flight control is to revise the control policy to balance impairment detectability and to achieve sufficient robustness. This involves careful selection of types and parameters of the controllers and the impairment detecting filters used. It also involves a decision, upon the identification of some failures, on whether and how a control reconfiguration should take place in order to maintain a certain system performance level. In this project a new flight dynamic model under uncertain flight conditions is considered, in which the effects of both ramp and jump faults are reflected. Stabilization algorithms based on neural networks and adaptive methods are derived. The control algorithms are shown to be effective in dealing with uncertain dynamics due to external disturbances and unpredictable faults. The overall strategy is easy to set up and the computation involved is much less than with other strategies. Computer simulation software is developed. A series of simulation studies has been conducted with varying flight conditions.

  10. Fault-tolerance in Two-dimensional Topological Systems

    Science.gov (United States)

    Anderson, Jonas T.

    This thesis is a collection of ideas with the general goal of building, at least in the abstract, a local fault-tolerant quantum computer. The connection between quantum information and topology has proven to be an active area of research in several fields. The introduction of the toric code by Alexei Kitaev demonstrated the usefulness of topology for quantum memory and quantum computation. Many quantum codes used for quantum memory are modeled by spin systems on a lattice, with operators that extract syndrome information placed on vertices or faces of the lattice. It is natural to wonder whether the useful codes in such systems can be classified. This thesis presents work that leverages ideas from topology and graph theory to explore the space of such codes. Homological stabilizer codes are introduced and it is shown that, under a set of reasonable assumptions, any qubit homological stabilizer code is equivalent to either a toric code or a color code. Additionally, the toric code and the color code correspond to distinct classes of graphs. Many systems have been proposed as candidate quantum computers. It is very desirable to design quantum computing architectures with two-dimensional layouts and low complexity in parity-checking circuitry. Kitaev's surface codes provided the first example of codes satisfying this property. They provided a new route to fault tolerance with more modest overheads and thresholds approaching 1%. The recently discovered color codes share many properties with the surface codes, such as the ability to perform syndrome extraction locally in two dimensions. Some families of color codes admit a transversal implementation of the entire Clifford group. This work investigates color codes on the 4.8.8 lattice known as triangular codes. I develop a fault-tolerant error-correction strategy for these codes in which repeated syndrome measurements on this lattice generate a three-dimensional space-time combinatorial structure. I then develop an

  11. Fault Tolerance Assistant (FTA): An Exception Handling Programming Model for MPI Applications

    Energy Technology Data Exchange (ETDEWEB)

    Fang, Aiman [Univ. of Chicago, IL (United States). Dept. of Computer Science; Laguna, Ignacio [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Sato, Kento [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Islam, Tanzima [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Mohror, Kathryn [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2016-05-23

    Future high-performance computing systems may face frequent failures with their rapid increase in scale and complexity. Resilience to faults has become a major challenge for large-scale applications running on supercomputers, which demands fault tolerance support for prevalent MPI applications. Among failure scenarios, process failures are one of the most severe issues as they usually lead to termination of applications. However, the widely used MPI implementations do not provide mechanisms for fault tolerance. We propose FTA-MPI (Fault Tolerance Assistant MPI), a programming model that provides support for failure detection, failure notification and recovery. Specifically, FTA-MPI exploits a try/catch model that enables failure localization and transparent recovery of process failures in MPI applications. We demonstrate FTA-MPI with synthetic applications and a molecular dynamics code CoMD, and show that FTA-MPI provides high programmability for users and enables convenient and flexible recovery of process failures.
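
    The try/catch idea in the abstract above can be illustrated with a small simulation. This is only a sketch under stated assumptions: it does not use MPI or the FTA-MPI interface, and the names ProcessFailure, exchange, run and fail_plan are hypothetical. It shows a failure being raised inside a communication phase, caught by a handler, and the same step being retried without the failed rank.

```python
class ProcessFailure(Exception):
    """Hypothetical notification raised when a peer rank is detected as failed."""

    def __init__(self, rank):
        super().__init__(f"rank {rank} failed")
        self.rank = rank


def exchange(step, alive, fail_plan):
    """Stand-in for a communication phase; raises if a planned failure hits."""
    rank = fail_plan.get(step)
    if rank is not None and rank in alive:
        raise ProcessFailure(rank)
    return sum(alive)  # dummy reduction over the live ranks


def run(steps, ranks, fail_plan):
    """Run `steps` phases over `ranks` simulated processes with recovery."""
    alive = set(range(ranks))
    log = []
    step = 0
    while step < steps:
        try:
            exchange(step, alive, fail_plan)
            log.append(("ok", step))
            step += 1
        except ProcessFailure as err:
            # Recovery handler: drop the failed rank and retry the same step.
            alive.discard(err.rank)
            log.append(("recovered", step, err.rank))
    return alive, log
```

    Calling run(3, 4, {1: 2}) simulates rank 2 failing during step 1; the handler localizes the failure to that step and the loop continues with the three surviving ranks.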

  12. Interface For Fault-Tolerant Control System

    Science.gov (United States)

    Shaver, Charles; Williamson, Michael

    1989-01-01

    Interface unit and controller emulator developed for research on electronic helicopter-flight-control systems equipped with artificial intelligence. Interface unit interrupt-driven system designed to link microprocessor-based, quadruply-redundant, asynchronous, ultra-reliable, fault-tolerant control system (controller) with electronic servocontrol unit that controls set of hydraulic actuators. Receives digital feedforward messages from, and transmits digital feedback messages to, controller through differential signal lines or fiber-optic cables (thus far only differential signal lines have been used). Analog signals transmitted to and from servocontrol unit via coaxial cables.

  13. A Concept for fault tolerant controllers

    DEFF Research Database (Denmark)

    Niemann, Hans Henrik; Poulsen, Niels Kjølstad

    2009-01-01

    This paper describes a concept for fault tolerant controllers (FTC) based on the YJBK (after Youla, Jabr, Bongiorno and Kucera) parameterization. This controller architecture allows the controller to be changed on-line in the case of faults in the system. In the described FTC concept, a safe mode...... controller is applied as the basic feedback controller. A controller for normal operation with high performance is obtained by including certain YJBK parameters (transfer functions) in the controller. This will allow a fast switch from normal operation to safe mode operation in case of critical faults...

  14. Fault tolerance of artificial neural networks with applications in critical systems

    Science.gov (United States)

    Protzel, Peter W.; Palumbo, Daniel L.; Arras, Michael K.

    1992-01-01

    This paper investigates the fault tolerance characteristics of time continuous recurrent artificial neural networks (ANN) that can be used to solve optimization problems. The principle of operations and performance of these networks are first illustrated by using well-known model problems like the traveling salesman problem and the assignment problem. The ANNs are then subjected to 13 simultaneous 'stuck at 1' or 'stuck at 0' faults for network sizes of up to 900 'neurons'. The effects of these faults are demonstrated and the cause for the observed fault tolerance is discussed. An application is presented in which a network performs a critical task for a real-time distributed processing system by generating new task allocations during the reconfiguration of the system. The performance degradation of the ANN under the presence of faults is investigated by large-scale simulations, and the potential benefits of delegating a critical task to a fault tolerant network are discussed.

  15. Engineering scalable fault-tolerant quantum computation

    Science.gov (United States)

    Kimchi-Schwartz, Mollie; Rosenberg, Danna; Kim, David; Yoder, Jonilyn; Kjaergaard, Morten; Das, Rabindra; Grover, Jeff; Gustavsson, Simon; Oliver, William

    Recent demonstrations of quantum protocols comprising on the order of 5-10 superconducting qubits are foundational to the future development of quantum information processors. A next critical step in the development of resilient quantum processors will be the integration of coherent quantum circuits with a hardware platform that is amenable to extending the system size to hundreds of qubits and beyond. In this talk, we will discuss progress toward integrating coherent superconducting qubits with signal routing via the third dimension. This research was funded in part by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA) and by the Assistant Secretary of Defense for Research & Engineering under Air Force Contract No. FA8721-05-C-0002. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of ODNI, IARPA, or the US Government.

  16. Cooperative Fault Tolerant Tracking Control for Multiagent Systems: An Intermediate Estimator-Based Approach.

    Science.gov (United States)

    Zhu, Jun-Wei; Yang, Guang-Hong; Zhang, Wen-An; Yu, Li

    2017-10-17

    This paper studies the observer based fault tolerant tracking control problem for linear multiagent systems with multiple faults and mismatched disturbances. A novel distributed intermediate estimator based fault tolerant tracking protocol is presented. The leader's input is nonzero and unavailable to the followers. By applying a projection technique, the mismatched disturbances are separated into matched and unmatched components. For each node, a tracking error system is established, for which an intermediate estimator driven by the relative output measurements is constructed to estimate the sensor faults and a combined signal of the leader's input, process faults, and matched disturbance component. Based on the estimation, a fault tolerant tracking protocol is designed to eliminate the effects of the combined signal. Besides, the effect of unmatched disturbance component can be attenuated by directly adjusting some specified parameters. Finally, a simulation example of aircraft demonstrates the effectiveness of the designed tracking protocol.

  17. Task Mapping and Bandwidth Reservation for Mixed Hard/Soft Fault-Tolerant Embedded Systems

    DEFF Research Database (Denmark)

    Saraswat, Prabhat Kumar; Pop, Paul; Madsen, Jan

    2010-01-01

    In this paper we are interested in mixed hard/soft real-time fault-tolerant applications mapped on distributed heterogeneous architectures. We use the Earliest Deadline First (EDF) scheduling for the hard real-time tasks and the Constant Bandwidth Server (CBS) for the soft tasks. The bandwidth re...

  18. Control switching in high performance and fault tolerant control

    DEFF Research Database (Denmark)

    Niemann, Hans Henrik; Poulsen, Niels Kjølstad

    2010-01-01

    The problem of reliability in high performance control and in fault tolerant control is considered in this paper. A feedback controller architecture for high performance and fault tolerance is considered. The architecture is based on the Youla-Jabr-Bongiorno-Kucera (YJBK) parameterization. By usi...

  19. Fault Tolerant Controllers for Sampled-data Systems

    DEFF Research Database (Denmark)

    Niemann, H.; Stoustrup, Jakob

    2004-01-01

    A general compensator architecture for fault tolerant control (FTC) for sampled-data systems is proposed. The architecture is based on the YJBK parameterization of all stabilizing controllers, and uses the dual YJBK parameterization to quantify the performance of the fault tolerant system. The FT...

  20. Fault tolerant controllers for sampled-data systems

    DEFF Research Database (Denmark)

    Niemann, Hans Henrik; Stoustrup, Jakob

    2004-01-01

    A general compensator architecture for fault tolerant control (FTC) for sampled-data systems is proposed. The architecture is based on the YJBK parameterization of all stabilizing controllers, and uses the dual YJBK parameterization to quantify the performance of the fault tolerant system. The FTC...

  1. Fault tolerant control for uncertain systems with parametric faults

    DEFF Research Database (Denmark)

    Niemann, Hans Henrik; Poulsen, Niels Kjølstad

    2006-01-01

    A fault tolerant control (FTC) architecture based on active fault diagnosis (AFD) and the YJBK (Youla, Jabr, Bongiorno and Kucera) parameterization is applied in this paper. Based on the FTC architecture, fault tolerant control of uncertain systems with slowly varying parametric faults...... is investigated. Conditions are given for closed-loop stability in case of false alarms or missing fault detection/isolation....

  2. MCNP load balancing and fault tolerance with PVM

    International Nuclear Information System (INIS)

    McKinney, G.W.

    1995-01-01

    Version 4A of the Monte Carlo neutron, photon, and electron transport code MCNP, developed by LANL (Los Alamos National Laboratory), supports distributed-memory multiprocessing through the software package PVM (Parallel Virtual Machine, version 3.1.4). Using PVM for interprocessor communication, MCNP can simultaneously execute a single problem on a cluster of UNIX-based workstations. This capability provided system efficiencies that exceeded 80% on dedicated workstation clusters; however, on heterogeneous or multiuser systems, the performance was limited by the slowest processor (i.e., equal work was assigned to each processor). The next public release of MCNP will provide multiprocessing enhancements that include load balancing and fault tolerance, which are shown to dramatically increase multiuser system efficiency and reliability.

  3. MCNP load balancing and fault tolerance with PVM

    Energy Technology Data Exchange (ETDEWEB)

    McKinney, G.W.

    1995-07-01

    Version 4A of the Monte Carlo neutron, photon, and electron transport code MCNP, developed by LANL (Los Alamos National Laboratory), supports distributed-memory multiprocessing through the software package PVM (Parallel Virtual Machine, version 3.1.4). Using PVM for interprocessor communication, MCNP can simultaneously execute a single problem on a cluster of UNIX-based workstations. This capability provided system efficiencies that exceeded 80% on dedicated workstation clusters; however, on heterogeneous or multiuser systems, the performance was limited by the slowest processor (i.e., equal work was assigned to each processor). The next public release of MCNP will provide multiprocessing enhancements that include load balancing and fault tolerance, which are shown to dramatically increase multiuser system efficiency and reliability.
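
    The remark that equal work per processor ties performance to the slowest machine can be checked with a little arithmetic. The sketch below is not MCNP or PVM code; the function names and the example speeds are assumptions made for illustration. It compares the finish time of an equal split with a split proportional to processor speed.

```python
def static_makespan(total_work, speeds):
    """Equal work per processor: finish time is set by the slowest one."""
    share = total_work / len(speeds)
    return max(share / s for s in speeds)


def balanced_makespan(total_work, speeds):
    """Work proportional to speed: all processors finish together."""
    return total_work / sum(speeds)


def efficiency(total_work, speeds, makespan):
    """Fraction of the aggregate processing capacity actually used."""
    return total_work / (sum(speeds) * makespan)


# Hypothetical heterogeneous cluster: relative speeds 4, 2, 1, 1.
speeds = [4.0, 2.0, 1.0, 1.0]
work = 80.0
t_static = static_makespan(work, speeds)      # 20.0 time units
t_balanced = balanced_makespan(work, speeds)  # 10.0 time units
```

    With these assumed speeds the equal split uses half of the available capacity, while the proportional split uses all of it; real load balancing would also have to account for communication cost and changing machine load.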

  4. The superior fault tolerance of artificial neural network training with a fault/noise injection-based genetic algorithm.

    Science.gov (United States)

    Su, Feng; Yuan, Peijiang; Wang, Yangzhen; Zhang, Chen

    2016-10-01

    Artificial neural networks (ANNs) are powerful computational tools that are designed to replicate the human brain and adopted to solve a variety of problems in many different fields. Fault tolerance (FT), an important property of ANNs, ensures their reliability when significant portions of a network are lost. In this paper, a fault/noise injection-based (FIB) genetic algorithm (GA) is proposed to construct fault-tolerant ANNs. The FT performance of an FIB-GA was compared with that of a common genetic algorithm, the back-propagation algorithm, and the modification of weights algorithm. The FIB-GA showed a slower fitting speed when solving the exclusive OR (XOR) problem and the overlapping classification problem, but it significantly reduced the errors in cases of single or multiple faults in ANN weights or nodes. Further analysis revealed that the fit weights showed no correlation with the fitting errors in the ANNs constructed with the FIB-GA, suggesting a relatively even distribution of the various fitting parameters. In contrast, the output weights in the training of ANNs implemented with the use of the other three algorithms demonstrated a positive correlation with the errors. Our findings therefore indicate that a combination of the fault/noise injection-based method and a GA is capable of introducing FT to ANNs and imply that the distributed ANNs demonstrate superior FT performance.
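
    One way to picture fault injection during fitness evaluation is to score a network not only on its fault-free error but also on its average error when each weight in turn is forced to zero. The sketch below does that for a hand-built XOR threshold network. It is an assumption-laden illustration, not the authors' FIB-GA: the 2-2-1 layout, the stuck-at-0 fault model, the weight vector W and all function names are hypothetical.

```python
def step(x):
    """Hard threshold activation."""
    return 1.0 if x > 0 else 0.0


def xor_net(w, a, b):
    """2-2-1 threshold network; w holds 9 values (weights and biases)."""
    h1 = step(w[0] * a + w[1] * b + w[2])
    h2 = step(w[3] * a + w[4] * b + w[5])
    return step(w[6] * h1 + w[7] * h2 + w[8])


CASES = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]


def error(w):
    """Mean absolute error on the four XOR cases."""
    return sum(abs(xor_net(w, a, b) - t) for a, b, t in CASES) / len(CASES)


def fault_injected_error(w):
    """Mean error over all single 'weight stuck at 0' faults."""
    total = 0.0
    for i in range(len(w)):
        faulty = list(w)
        faulty[i] = 0.0  # inject one fault
        total += error(faulty)
    return total / len(w)


# Hand-chosen weights: h1 acts as OR, h2 as AND, output = h1 AND NOT h2.
W = [1.0, 1.0, -0.5, 1.0, 1.0, -1.5, 1.0, -1.0, -0.5]
```

    A search procedure that minimizes fault_injected_error rather than error alone would prefer weight sets whose behaviour degrades less when a single weight is lost, which is the property the abstract attributes to fault-tolerant training.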

  5. Fault tolerant operation of switched reluctance machine

    Science.gov (United States)

    Wang, Wei

    The energy crisis and environmental challenges have driven industry towards more energy efficient solutions. With nearly 60% of electricity consumed by various electric machines in the industrial sector, advancement in the efficiency of the electric drive system is of vital importance. An adjustable speed drive system (ASDS) provides excellent speed regulation and dynamic performance as well as dramatically improved system efficiency compared with conventional motors without electronic drives. Industry has witnessed tremendous growth in ASDS applications, not only as a driving force but also as an electric auxiliary system for replacing bulky and low efficiency auxiliary hydraulic and mechanical systems. With the vast penetration of ASDS, its fault tolerant operation capability is more widely recognized as an important feature of drive performance, especially for aerospace and automotive applications and other industrial drive applications demanding high reliability. The Switched Reluctance Machine (SRM), a low cost, highly reliable electric machine with fault tolerant operation capability, has drawn substantial attention in the past three decades. Nevertheless, the SRM is not free of faults. Certain faults such as converter faults, sensor faults, winding shorts, eccentricity and position sensor faults are commonly shared among all ASDS. In this dissertation, a thorough understanding of various faults and their influence on transient and steady state performance of SRM is developed via simulation and experimental study, providing necessary knowledge for fault detection and post fault management. Lumped parameter models are established for fast real time simulation and drive control. Based on the behavior of the faults, a fault detection scheme is developed for the purpose of fast and reliable fault diagnosis. In order to improve the SRM power and torque capacity under faults, the maximum torque per ampere excitation are conceptualized and validated through theoretical analysis and

  6. Diagnosis and Fault-tolerant Control, 3rd Edition

    DEFF Research Database (Denmark)

    Blanke, Mogens; Kinnaert, Michel; Lunze, Jan

    The book presents effective model-based analysis and design methods for fault diagnosis and fault-tolerant control. Architectural and structural models are used to analyse the propagation of the fault through the process, to test the fault detectability and to find the redundancies in the process...... that can be used to ensure fault tolerance. It also introduces design methods suitable for diagnostic systems and fault-tolerant controllers for continuous processes that are described by analytical models of discrete-event systems represented by automata....

  7. A two-stage approach for managing actuators redundancy and its application to fault tolerant flight control

    Directory of Open Access Journals (Sweden)

    Zhong Lunlong

    2015-04-01

    In safety-critical systems such as transportation aircraft, redundancy of actuators is introduced to improve fault tolerance. How to make the best use of remaining actuators to allow the system to continue achieving a desired operation in the presence of some actuator failures is the main subject of this paper. Considering that many dynamical systems, including flight dynamics of a transportation aircraft, can be expressed as an input affine nonlinear system, a new state representation is adopted here where the output dynamics are related with virtual inputs associated with the intended operation. This representation, as well as the distribution matrix associated with the effectiveness of the remaining operational actuators, allows us to define different levels of fault tolerant governability with respect to actuators’ failures. Then, a two-stage control approach is developed, leading first to the inversion of the output dynamics to get nominal values for the virtual inputs and then to the solution of a linear quadratic (LQ) problem to compute the solicitation of each operational actuator. The proposed approach is applied to the control of a transportation aircraft which performs a stabilized roll maneuver while a partial failure appears. Two fault scenarios are considered and the resulting performance of the proposed approach is displayed and discussed.
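
    The second stage described above, distributing a virtual input over the actuators that are still operational, can be sketched for the simplest case of a single virtual input. This is an illustrative assumption, not the paper's formulation: the function allocate, the per-actuator gains, the health factors (1 for healthy, 0 for failed, values in between for partial failure) and the quadratic weights are all hypothetical. The closed form follows from minimizing sum(w_i * u_i**2) subject to sum(g_i * h_i * u_i) == v with a Lagrange multiplier.

```python
def allocate(v, gains, health, weights):
    """Minimise sum(w_i * u_i**2) subject to sum(g_i * h_i * u_i) == v."""
    # Effective gain of each actuator after accounting for its health.
    eff = [g * h for g, h in zip(gains, health)]
    denom = sum(e * e / w for e, w in zip(eff, weights))
    if denom == 0.0:
        raise ValueError("no operational actuator can produce the virtual input")
    return [(e / w) * v / denom for e, w in zip(eff, weights)]


def produced(u, gains, health):
    """Virtual input actually delivered by the commands u."""
    return sum(g * h * x for g, h, x in zip(gains, health, u))


# Three identical actuators, the second one completely failed.
u_failed = allocate(3.0, [1.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 1.0])
# Same actuators, the second one at half effectiveness.
u_partial = allocate(3.0, [1.0, 1.0, 1.0], [1.0, 0.5, 1.0], [1.0, 1.0, 1.0])
```

    In the failed case the two healthy actuators each take a command of 1.5 and the failed one receives 0; in the partial case the degraded actuator is still used, but less than its healthy neighbours. A full treatment would add actuator limits and a vector of virtual inputs.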

  8. Award ER25750: Coordinated Infrastructure for Fault Tolerance Systems Indiana University Final Report

    Energy Technology Data Exchange (ETDEWEB)

    Lumsdaine, Andrew

    2013-03-08

    The main purpose of the Coordinated Infrastructure for Fault Tolerance in Systems initiative has been to conduct research with a goal of providing end-to-end fault tolerance on a systemwide basis for applications and other system software. While fault tolerance has been an integral part of most high-performance computing (HPC) system software developed over the past decade, it has been treated mostly as a collection of isolated stovepipes. Visibility and response to faults have typically been limited to the particular hardware and software subsystems in which they are initially observed. Little fault information is shared across subsystems, allowing little flexibility or control on a system-wide basis, making it practically impossible to provide cohesive end-to-end fault tolerance in support of scientific applications. As an example, consider faults such as communication link failures that can be seen by a network library but are not directly visible to the job scheduler, or consider faults related to node failures that can be detected by system monitoring software but are not inherently visible to the resource manager. If information about such faults could be shared by the network libraries or monitoring software, then other system software, such as a resource manager or job scheduler, could ensure that failed nodes or failed network links were excluded from further job allocations and that further diagnosis could be performed. As a founding member and one of the lead developers of the Open MPI project, our efforts over the course of this project have been focused on making Open MPI more robust to failures by supporting various fault tolerance techniques, and using fault information exchange and coordination between MPI and the HPC system software stack from the application, numeric libraries, and programming language runtime to other common system components such as job schedulers, resource managers, and monitoring tools.
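
    The example in the abstract above, where a monitoring tool sees a node failure that the job scheduler would otherwise never learn about, amounts to a publish/subscribe pattern. The sketch below is a toy in-process version under stated assumptions: FaultBus, Scheduler, the event name "node_failure" and the node names are hypothetical and do not correspond to the interfaces of Open MPI or of any real resource manager.

```python
from collections import defaultdict


class FaultBus:
    """Hypothetical in-process stand-in for a shared fault-information channel."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, kind, callback):
        self._subscribers[kind].append(callback)

    def publish(self, kind, component):
        for callback in self._subscribers[kind]:
            callback(component)


class Scheduler:
    """Toy job scheduler that avoids nodes reported as failed."""

    def __init__(self, nodes, bus):
        self.available = set(nodes)
        bus.subscribe("node_failure", self.exclude)

    def exclude(self, node):
        self.available.discard(node)

    def allocate(self, count):
        if count > len(self.available):
            raise RuntimeError("not enough healthy nodes")
        return sorted(self.available)[:count]


bus = FaultBus()
scheduler = Scheduler(["n1", "n2", "n3"], bus)
bus.publish("node_failure", "n2")  # e.g. reported by monitoring software
allocation = scheduler.allocate(2)
```

    After the monitoring side publishes the failure of n2, the scheduler allocates only n1 and n3, even though it never observed the fault itself.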

  9. Model-Based Fault Tolerant Control

    Science.gov (United States)

    Kumar, Aditya; Viassolo, Daniel

    2008-01-01

    The Model Based Fault Tolerant Control (MBFTC) task was conducted under the NASA Aviation Safety and Security Program. The goal of MBFTC is to develop and demonstrate real-time strategies to diagnose and accommodate anomalous aircraft engine events such as sensor faults, actuator faults, or turbine gas-path component damage that can lead to in-flight shutdowns, aborted takeoffs, asymmetric thrust/loss of thrust control, or engine surge/stall events. A suite of model-based fault detection algorithms was developed and evaluated. Based on the performance and maturity of the developed algorithms, two approaches were selected for further analysis: (i) multiple-hypothesis testing, and (ii) neural networks; both used residuals from an Extended Kalman Filter to detect the occurrence of the selected faults. A simple fusion algorithm was implemented to combine the results from each algorithm to obtain an overall estimate of the identified fault type and magnitude. The identification of the fault type and magnitude enabled the use of an online fault accommodation strategy to correct for the adverse impact of these faults on engine operability, thereby enabling continued engine operation in the presence of these faults. The performance of the fault detection and accommodation algorithm was extensively tested in a simulation environment.

  10. Fault tolerant control schemes using integral sliding modes

    CERN Document Server

    Hamayun, Mirza Tariq; Alwi, Halim

    2016-01-01

    The key attribute of a Fault Tolerant Control (FTC) system is its ability to maintain overall system stability and acceptable performance in the face of faults and failures within the feedback system. In this book Integral Sliding Mode (ISM) Control Allocation (CA) schemes for FTC are described, which have the potential to maintain close to nominal fault-free performance (for the entire system response), in the face of actuator faults and even complete failures of certain actuators. Broadly, the first approach builds an ISM controller around a model of the plant, with the aim of creating a nonlinear fault tolerant feedback controller whose closed-loop performance is established during the design process. The second approach involves retro-fitting an ISM scheme to an existing feedback controller to introduce fault tolerance. This may be advantageous from an industrial perspective, because fault tolerance can be introduced without changing the existing control loops. A high fidelity benchmark model of a large transport aircraft is u...

  11. Fault-Tolerant Control For A Robotic Inspection System

    Science.gov (United States)

    Tso, Kam Sing

    1995-01-01

    Report describes first phase of continuing program of research on fault-tolerant control subsystem of telerobotic visual-inspection system. Goal of program to develop robotic system for remotely controlled visual inspection of structures in outer space.

  12. Modular, Fault-Tolerant Electronics Supporting Space Exploration, Phase II

    Data.gov (United States)

    National Aeronautics and Space Administration — Modern electronic systems tolerate only as many point failures as there are redundant system copies, using mere macro-scale redundancy. Fault Tolerant Electronics...

  13. Fault-tolerant error correction with the gauge color code

    Science.gov (United States)

    Brown, Benjamin J.; Nickerson, Naomi H.; Browne, Dan E.

    2016-01-01

    The constituent parts of a quantum computer are inherently vulnerable to errors. To this end, we have developed quantum error-correcting codes to protect quantum information from noise. However, discovering codes that are capable of a universal set of computational operations with the minimal cost in quantum resources remains an important and ongoing challenge. One proposal of significant recent interest is the gauge color code. Notably, this code may offer a reduced resource cost over other well-studied fault-tolerant architectures by using a new method, known as gauge fixing, for performing the non-Clifford operations that are essential for universal quantum computation. Here we examine the gauge color code when it is subject to noise. Specifically, we make use of single-shot error correction to develop a simple decoding algorithm for the gauge color code, and we numerically analyse its performance. Remarkably, we find threshold error rates comparable to those of other leading proposals. Our results thus provide the first steps of a comparative study between the gauge color code and other promising computational architectures. PMID:27470619

  14. Real-time fault diagnosis and fault-tolerant control

    OpenAIRE

    Gao, Zhiwei; Ding, Steven X.; Cecati, Carlo

    2015-01-01

    This "Special Section on Real-Time Fault Diagnosis and Fault-Tolerant Control" of the IEEE Transactions on Industrial Electronics is motivated to provide a forum for academic and industrial communities to report recent theoretic/application results in real-time monitoring, diagnosis, and fault-tolerant design, and exchange the ideas about the emerging research direction in this field. Twenty-three papers were eventually selected through a strict peer-reviewed procedure, which represent the mo...

  15. A Fault-tolerant Development Methodology for Industrial Control Systems

    DEFF Research Database (Denmark)

    Izadi-Zamanabadi, Roozbeh; Thybo, C.

    2004-01-01

    Developing advanced detection schemes is not the lone factor for obtaining a successful fault diagnosis performance. Acquiring significant achievements in applying Fault-tolerance in industrial development requires that fault diagnosis and recovery schemes are developed in a consistent and logica...

  16. Lightweight storage and overlay networks for fault tolerance.

    Energy Technology Data Exchange (ETDEWEB)

    Oldfield, Ron A.

    2010-01-01

    The next generation of capability-class, massively parallel processing (MPP) systems is expected to have hundreds of thousands to millions of processors. In such environments, it is critical to have fault-tolerance mechanisms, including checkpoint/restart, that scale with the size of applications and the percentage of the system on which the applications execute. For application-driven, periodic checkpoint operations, the state-of-the-art does not provide a scalable solution. For example, on today's massive-scale systems that execute applications which consume most of the memory of the employed compute nodes, checkpoint operations generate I/O that consumes nearly 80% of the total I/O usage. Motivated by this observation, this project aims to improve I/O performance for application-directed checkpoints through the use of lightweight storage architectures and overlay networks. Lightweight storage provides direct access to underlying storage devices. Overlay networks provide caching and processing capabilities in the compute-node fabric. The combination has potential to significantly reduce I/O overhead for large-scale applications. This report describes our combined efforts to model and understand overheads for application-directed checkpoints, as well as implementation and performance analysis of a checkpoint service that uses available compute nodes as a network cache for checkpoint operations.

  17. Service for fault tolerance in the Ad Hoc Networks based on Multi Agent Systems

    Directory of Open Access Journals (Sweden)

    Ghalem Belalem

    2011-02-01

    Ad hoc networks are distributed, self-organized networks that do not require infrastructure. In such networks, mobile nodes are subject to disconnections, which may be voluntary or involuntary and are caused by the high mobility in the ad hoc network. This work contributes to solving these problems, in order to ensure continuous service, by proposing a fault tolerance service based on Multi Agent Systems (MAS) that predicts problems and makes decisions in relation to critical nodes. Our work studies the prediction of voluntary and involuntary disconnections in the ad hoc network; we therefore propose a fault tolerance service that allows effective distribution of information in the network by selecting some objects of the network to hold duplicates of the information.

  18. Resonant Tunneling Diodes-Based Cellular Nonlinear Networks with Fault Tolerance Analysis

    Directory of Open Access Journals (Sweden)

    Shukai Duan

    2013-01-01

    Resonant tunneling diodes (RTDs) have found numerous applications in high-speed digital and analog circuits owing to their folded-back negative differential resistance (NDR) in current-voltage (I-V) characteristics and nanometer size. By replacing the state resistor in a standard cell with an RTD, an RTD-based cellular neural/nonlinear network (RTD-CNN) can be obtained, in which the cell requires neither self-feedback nor a nonlinear output, thereby being more compact and versatile. This paper addresses the structure of the RTD-CNN in detail and investigates its fault-tolerant properties in image processing, taking horizontal line detection and edge extraction as examples. A series of computer simulations demonstrates the promising fault-tolerant abilities of the RTD-CNN.

  19. An Efficient Grid Scheduling Algorithm with Fault Tolerance and User Satisfaction

    Directory of Open Access Journals (Sweden)

    P. Keerthika

    2013-01-01

    Problem Statement. The advances in human civilization lead to more complications in problem solving. Grid computing serves as an efficient technology in solving those complicated problems. In computational grids, the grid scheduler schedules the task and finds the appropriate resource for each task. The scheduler must consider several factors such as user demand, communication time, failure handling mechanisms, and reduced makespan. Most of the existing algorithms do not consider user satisfaction. Thus a scheduling algorithm that handles failure of resources and achieves user satisfaction gains more importance. Approach. A new bicriteria scheduling algorithm (BSA) that considers user satisfaction along with fault tolerance has been introduced. The main contribution of this paper includes achieving user satisfaction along with fault tolerance and minimizing the makespan of jobs. Results. The performance of this proposed algorithm is evaluated using GridSim based on makespan and number of jobs completed successfully within user deadline. Conclusions/Recommendations. The proposed BSA algorithm achieves reduced makespan and better hit rate with higher user satisfaction and fault tolerance.

  20. Modeling and Design of Fault-Tolerant and Self-Adaptive Reconfigurable Networked Embedded Systems

    Directory of Open Access Journals (Sweden)

    Streichert Thilo

    2006-01-01

    Automotive, avionic, or body-area networks are systems that consist of several communicating control units specialized for certain purposes. Typically, different constraints regarding fault tolerance, availability and also flexibility are imposed on these systems. In this article, we present a novel framework for increasing fault tolerance and flexibility by solving the problem of hardware/software codesign online. Based on field-programmable gate arrays (FPGAs) in combination with CPUs, we allow migrating tasks implemented in hardware or software from one node to another. Moreover, if not enough hardware/software resources are available, the migration of functionality from hardware to software or vice versa is provided. Supporting such flexibility through services integrated in a distributed operating system for networked embedded systems is a substantial step towards self-adaptive systems. Besides the formal definition of methods and concepts, we describe in detail a first implementation of a reconfigurable networked embedded system running automotive applications.

  1. Fault-tolerance techniques for high-speed fiber-optic networks

    Science.gov (United States)

    Deruiter, John

    1991-01-01

    Four fiber optic network topologies (linear bus, ring, central star, and distributed star) are discussed relative to their application to high data throughput, fault tolerant networks. The topologies are also examined in terms of redundancy and the need to provide for single point, failure free (or better) system operation. Linear bus topology, although traditionally the method of choice for wire systems, presents implementation problems when larger fiber optic systems are considered. Ring topology works well for high speed systems when coupled with a token passing protocol, but it requires a significant increase in protocol complexity to manage system reconfiguration due to ring and node failures. Star topologies offer a natural fault tolerance, without added protocol complexity, while still providing high data throughput capability.

  2. Modeling and Design of Fault-Tolerant and Self-Adaptive Reconfigurable Networked Embedded Systems

    Directory of Open Access Journals (Sweden)

    Jürgen Teich

    2006-06-01

    Automotive, avionic, or body-area networks are systems that consist of several communicating control units specialized for certain purposes. Typically, different constraints regarding fault tolerance, availability and also flexibility are imposed on these systems. In this article, we present a novel framework for increasing fault tolerance and flexibility by solving the problem of hardware/software codesign online. Based on field-programmable gate arrays (FPGAs) in combination with CPUs, we allow migrating tasks implemented in hardware or software from one node to another. Moreover, if not enough hardware/software resources are available, the migration of functionality from hardware to software or vice versa is provided. Supporting such flexibility through services integrated in a distributed operating system for networked embedded systems is a substantial step towards self-adaptive systems. Besides the formal definition of methods and concepts, we describe in detail a first implementation of a reconfigurable networked embedded system running automotive applications.

  3. Fault-Tolerant Software-Defined Radio on Manycore

    Science.gov (United States)

    Ricketts, Scott

    2015-01-01

    Software-defined radio (SDR) platforms generally rely on field-programmable gate arrays (FPGAs) and digital signal processors (DSPs), but such architectures require significant software development. In addition, application demands for radiation mitigation and fault tolerance exacerbate programming challenges. MaXentric Technologies, LLC, has developed a manycore-based SDR technology that provides 100 times the throughput of conventional radiation-hardened general purpose processors. Manycore systems (30-100 cores and beyond) have the potential to provide high processing performance at error rates that are equivalent to current space-deployed uniprocessor systems. MaXentric's innovation is a highly flexible radio, providing over-the-air reconfiguration; adaptability; and uninterrupted, real-time, multimode operation. The technology is also compliant with NASA's Space Telecommunications Radio System (STRS) architecture. In addition to its many uses within NASA communications, the SDR can also serve as a highly programmable research-stage prototyping device for new waveforms and other communications technologies. It can also support noncommunication codes on its multicore processor, collocated with the communications workload, reducing the size, weight, and power of the overall system by aggregating processing jobs to a single board computer.

  4. ALLIANCE: An architecture for fault tolerant multi-robot cooperation

    Energy Technology Data Exchange (ETDEWEB)

    Parker, L.E.

    1995-02-01

    ALLIANCE is a software architecture that facilitates the fault tolerant cooperative control of teams of heterogeneous mobile robots performing missions composed of loosely coupled, largely independent subtasks. ALLIANCE allows teams of robots, each of which possesses a variety of high-level functions that it can perform during a mission, to individually select appropriate actions throughout the mission based on the requirements of the mission, the activities of other robots, the current environmental conditions, and the robot's own internal states. ALLIANCE is a fully distributed, behavior-based architecture that incorporates the use of mathematically modeled motivations (such as impatience and acquiescence) within each robot to achieve adaptive action selection. Since cooperative robotic teams usually work in dynamic and unpredictable environments, this software architecture allows the robot team members to respond robustly, reliably, flexibly, and coherently to unexpected environmental changes and modifications in the robot team that may occur due to mechanical failure, the learning of new skills, or the addition or removal of robots from the team by human intervention. The feasibility of this architecture is demonstrated in an implementation on a team of mobile robots performing a laboratory version of hazardous waste cleanup.

  5. ALLIANCE: An architecture for fault tolerant multi-robot cooperation

    International Nuclear Information System (INIS)

    Parker, L.E.

    1995-02-01

    ALLIANCE is a software architecture that facilitates the fault tolerant cooperative control of teams of heterogeneous mobile robots performing missions composed of loosely coupled, largely independent subtasks. ALLIANCE allows teams of robots, each of which possesses a variety of high-level functions that it can perform during a mission, to individually select appropriate actions throughout the mission based on the requirements of the mission, the activities of other robots, the current environmental conditions, and the robot's own internal states. ALLIANCE is a fully distributed, behavior-based architecture that incorporates the use of mathematically modeled motivations (such as impatience and acquiescence) within each robot to achieve adaptive action selection. Since cooperative robotic teams usually work in dynamic and unpredictable environments, this software architecture allows the robot team members to respond robustly, reliably, flexibly, and coherently to unexpected environmental changes and modifications in the robot team that may occur due to mechanical failure, the learning of new skills, or the addition or removal of robots from the team by human intervention. The feasibility of this architecture is demonstrated in an implementation on a team of mobile robots performing a laboratory version of hazardous waste cleanup

  6. Optimal structure of fault-tolerant software systems

    International Nuclear Information System (INIS)

    Levitin, Gregory

    2005-01-01

    This paper considers software systems consisting of fault-tolerant components. These components are built from functionally equivalent but independently developed versions characterized by different reliability and execution time. Because of hardware resource constraints, the number of versions that can run simultaneously is limited. The expected system execution time and its reliability (defined as the probability of obtaining the correct output within a specified time) strictly depend on the parameters of the software versions and the sequence of their execution. The system structure optimization problem is formulated in which one has to choose software versions for each component and find the sequence of their execution in order to achieve the greatest system reliability subject to cost constraints. The versions are to be chosen from a list of available products. Each version is characterized by its reliability, execution time and cost. The suggested optimization procedure is based on an algorithm for determining the system execution time distribution that uses the moment generating function approach and on the genetic algorithm. Both N-version programming and the recovery block scheme are considered within a universal model. An illustrative example is presented.
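
    The two schemes named in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's optimization model: it ignores execution time, cost and the limit on simultaneous versions, and the function names (n_version_vote, recovery_block) and the toy "square" versions are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter
from typing import Callable, Optional, Sequence


def n_version_vote(versions: Sequence[Callable[[int], int]], x: int) -> Optional[int]:
    """N-version programming: run every version, return the strict-majority output."""
    outputs = []
    for version in versions:
        try:
            outputs.append(version(x))
        except Exception:
            # A crashed version simply contributes no vote.
            continue
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    # Require agreement of more than half of all versions, crashed ones included.
    return value if count > len(versions) // 2 else None


def recovery_block(versions, acceptance_test, x):
    """Recovery block: try versions in order, return the first accepted output."""
    for version in versions:
        try:
            result = version(x)
        except Exception:
            continue
        if acceptance_test(x, result):
            return result
    return None


# Hypothetical, independently written versions of "square x".
def square_a(x):
    return x * x


def square_b(x):
    return x * x + 1  # deliberately faulty version


def square_c(x):
    return x ** 2


def is_perfect_square(x, result):
    """Toy acceptance test: a square of an integer must be a perfect square."""
    return result >= 0 and math.isqrt(result) ** 2 == result
```

    In this sketch the voter masks the faulty version when at least two of three versions agree, while the recovery block falls through to the next version when the acceptance test rejects an output.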

  7. Adaptive Fault-Tolerant Routing in 2D Mesh with Cracky Rectangular Model

    Directory of Open Access Journals (Sweden)

    Yi Yang

    2014-01-01

    This paper focuses on routing in two-dimensional mesh networks. We propose a novel faulty block model, the cracky rectangular block, for fault-tolerant adaptive routing. All faulty nodes and faulty links are enclosed in this type of block, which is a convex structure, in order to avoid routing livelock. Additionally, the model constructs an interior spanning forest for each block in order to keep in touch with the nodes inside the block. The block construction procedure is dynamic and fully distributed, and the construction algorithm is simple and easy to implement. The block is fully adaptive: it dynamically adjusts its scale in accordance with the state of the network, on either fault emergence or fault recovery, without shutting down the system. Based on this model, we also develop a distributed fault-tolerant routing algorithm. We then give a formal proof that this algorithm guarantees that messages always reach their destinations if and only if the destination nodes remain connected to the mesh network. The new model and routing algorithm thus maximize the availability of the nodes in the network, a noticeable overall improvement in the fault tolerance of the system.

  8. Improving Siakad Performance Using Load Balancing and Fault Tolerance Methods on the Universitas Halu Oleo Campus Network

    Directory of Open Access Journals (Sweden)

    Alimuddin Alimuddin

    2016-01-01

    A web-based academic information system (siakad) is essential for improving a college's academic services. The siakad application faces many obstacles, especially a high volume of access that causes overload. Moreover, a hardware or software failure leaves siakad inaccessible. The solution is to use several existing servers and distribute the load among them. Distributing the load evenly across the servers requires a load balancing method, here the round robin algorithm, which gives siakad high scalability. Handling the failure of a server requires fault tolerance, which gives siakad high availability. This research develops load balancing and fault tolerance methods using Linux Virtual Server software and additional programs such as ipvsadm and heartbeat, which can increase the scalability and availability of siakad. The results show that load balancing minimizes response time to 5.7%, increases throughput by 37% or 1.6 times, increases resource utilization 1.6 times, and avoids overload. High availability comes from the ability to fail over to another server in the event of a failure. Implementing load balancing and fault tolerance can thus improve the service performance of siakad and avoid errors.
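
    The combination of round robin distribution and failover described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Linux Virtual Server, ipvsadm or heartbeat setup used in the study; the class RoundRobinBalancer and the server names are hypothetical.

```python
class RoundRobinBalancer:
    """Round-robin request distribution that skips servers marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = {server: True for server in self.servers}
        self._next = 0  # index of the next server to try

    def mark_down(self, server):
        # In a real deployment a heartbeat monitor would call this.
        self.healthy[server] = False

    def mark_up(self, server):
        self.healthy[server] = True

    def pick(self):
        """Return the next healthy server in rotation (failover skips dead ones)."""
        n = len(self.servers)
        for _ in range(n):
            server = self.servers[self._next]
            self._next = (self._next + 1) % n
            if self.healthy[server]:
                return server
        raise RuntimeError("no healthy server available")


balancer = RoundRobinBalancer(["web1", "web2", "web3"])
first_round = [balancer.pick() for _ in range(4)]

balancer.mark_down("web2")  # simulated server failure
after_failure = [balancer.pick() for _ in range(3)]
```

    With all servers healthy the rotation visits each in turn; once one is marked down, requests continue to flow to the remaining servers without interruption.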

  9. Analysis of a hardware and software fault tolerant processor for critical applications

    Science.gov (United States)

    Dugan, Joanne B.

    1993-01-01

    Computer systems for critical applications must be designed to tolerate software faults as well as hardware faults. A unified approach to tolerating hardware and software faults is characterized by classifying faults in terms of duration (transient or permanent) rather than source (hardware or software). Errors arising from transient faults can be handled through masking or voting, but errors arising from permanent faults require system reconfiguration to bypass the failed component. Most errors which are caused by software faults can be considered transient, in that they are input-dependent. Software faults are triggered by a particular set of inputs. Quantitative dependability analysis of systems which exhibit a unified approach to fault tolerance can be performed by a hierarchical combination of fault tree and Markov models. A methodology for analyzing hardware and software fault tolerant systems is applied to the analysis of a hypothetical system, loosely based on the Fault Tolerant Parallel Processor. The models consider both transient and permanent faults, hardware and software faults, independent and related software faults, automatic recovery, and reconfiguration.
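
    As a small worked illustration of masking by voting, the reliability of a k-out-of-n voted group can be computed directly. The sketch below assumes independent, identical modules and a perfect voter, which are simplifications and not the hierarchical fault tree and Markov models of the paper; the function name k_of_n_reliability is hypothetical.

```python
from math import comb


def k_of_n_reliability(k: int, n: int, r: float) -> float:
    """Probability that at least k of n independent modules, each with reliability r, work."""
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))


# Triple modular redundancy: the voter masks one faulty module out of three.
tmr = k_of_n_reliability(2, 3, 0.9)  # equals 3r^2 - 2r^3 = 0.972 for r = 0.9
```

    Under these assumptions, voting raises a module reliability of 0.9 to 0.972 for the triplicated group; permanent faults that exhaust the redundancy would still require the reconfiguration the abstract mentions.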

  10. Fault Tolerant Control Using Gaussian Processes and Model Predictive Control

    Directory of Open Access Journals (Sweden)

    Yang Xiaoke

    2015-03-01

    Essential ingredients for fault-tolerant control are the ability to represent system behaviour following the occurrence of a fault, and the ability to exploit this representation for deciding control actions. Gaussian processes seem to be very promising candidates for the first of these, and model predictive control has a proven capability for the second. We therefore propose to use the two together to obtain fault-tolerant control functionality. Our proposal is illustrated by several reasonably realistic examples drawn from flight control.

  11. Interactive animation of fault-tolerant parallel algorithms

    Energy Technology Data Exchange (ETDEWEB)

    Apgar, S.W.

    1992-02-01

    Animation of algorithms makes understanding them intuitively easier. This paper describes the software tool Raft (Robust Animator of Fault Tolerant Algorithms). The Raft system allows the user to animate a number of parallel algorithms which achieve fault tolerant execution. In particular, we use it to illustrate the key Write-All problem. It has an extensive user-interface which allows a choice of the number of processors, the number of elements in the Write-All array, and the adversary to control the processor failures. The novelty of the system is that the interface allows the user to create new on-line adversaries as the algorithm executes.
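
    The Write-All problem mentioned above (P crash-prone processors must set every cell of an N-element array to 1) can be simulated with a toy Python sketch. It is not the Raft tool or any published Write-All algorithm: the random adversary, the guarantee that one processor always survives, and the idealized global view of unwritten cells are all assumptions made for illustration.

```python
import random


def write_all(n, p, fail_prob=0.3, seed=0):
    """Simulate p crash-prone processors cooperatively setting n array cells to 1.

    Returns (array, steps, number_of_surviving_processors).
    """
    rng = random.Random(seed)
    array = [0] * n
    alive = list(range(p))
    steps = 0
    while not all(array):
        steps += 1
        # Adversary: each live processor may crash, but one always survives.
        survivors = [pid for pid in alive if rng.random() >= fail_prob]
        alive = survivors or [alive[0]]
        # Idealized work allocation: live processors take distinct unwritten cells.
        pending = [i for i, value in enumerate(array) if value == 0]
        for rank, _pid in enumerate(alive):
            if rank < len(pending):
                array[pending[rank]] = 1
    return array, steps, len(alive)


array, steps, survivors = write_all(n=16, p=4)
```

    Because at least one processor survives each step and writes at least one cell, the loop finishes in at most N steps; the interesting question studied by real algorithms is how much total work is wasted when processors cannot see each other's progress for free.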

  12. Fault-tolerant design of picture archiving and communication systems

    International Nuclear Information System (INIS)

    Taira, R.K.; Chan, K.K.; Stewart, B.K.; Weinberg, W.S.

    1990-01-01

    Reliability is perhaps the most important attribute of a PACS. Any downtime of the system may seriously affect patient care. This paper describes fault-tolerant measures employed in the design of a hospital-wide PACS. Six fault-tolerant measures have been implemented: hardware redundancy (networks and archives), data-base backups, monitoring routines for local host processes and network status, uninterruptible power supplies, structured software design techniques, and in-service training of all radiology technologists. A PACS consisting of 13 acquisition nodes, two optical archiving nodes, two data-base server nodes, and five workstation nodes has been developed.

  13. Enhanced Maritime Safety through Diagnosis and Fault Tolerant Control

    DEFF Research Database (Denmark)

    Blanke, Mogens

    2001-01-01

    Faults in steering, navigation instruments or propulsion machinery are serious on a marine vessel since the consequence could be loss of maneuvering ability, implying risk of damage to vessel, personnel or environment. Early diagnosis and accommodation of faults could enhance safety. Fault-tolerant control is a methodology to help prevent faults from developing into failure. The means include on-line fault diagnosis, automatic condition assessment and calculation of remedial action to avoid hazards. This paper gives an overview of methods to obtain fault-tolerance: fault diagnosis; analysis...

  14. Rule-based fault-tolerant flight control

    Science.gov (United States)

    Handelman, Dave

    1988-01-01

    Fault tolerance has always been a desirable characteristic of aircraft. The ability to withstand unexpected changes in aircraft configuration has a direct impact on the ability to complete a mission effectively and safely. The possible synergistic effects of combining techniques of modern control theory, statistical hypothesis testing, and artificial intelligence in the attempt to provide failure accommodation for aircraft are investigated. This effort has resulted in the definition of a theory for rule based control and a system for development of such a rule based controller. Although presented here in response to the goal of aircraft fault tolerance, the rule based control technique is applicable to a wide range of complex control problems.

  15. Design and Experimental Validation for Direct-Drive Fault-Tolerant Permanent-Magnet Vernier Machines

    Directory of Open Access Journals (Sweden)

    Guohai Liu

    2014-01-01

    A fault-tolerant permanent-magnet vernier (FT-PMV) machine is designed for direct-drive applications, incorporating the merits of high torque density and high reliability. Based on the so-called magnetic gearing effect, PMV machines achieve high torque density by introducing flux-modulation poles (FMPs). This paper investigates the fault-tolerant characteristics of PMV machines and provides a design method that not only meets the fault-tolerant requirements but also retains high torque density. The operation principle of the proposed machine is analyzed. The design process and optimization are presented in detail, including the combination of slots and poles, the winding distribution, and the dimensions of the PMs and teeth. The machine performance is evaluated using the time-stepping finite element method (TS-FEM). Finally, the FT-PMV machine is manufactured, and experimental results are presented to validate the theoretical analysis.

  16. ALLIANCE: An architecture for fault tolerant, cooperative control of heterogeneous mobile robots

    Energy Technology Data Exchange (ETDEWEB)

    Parker, L.E.

    1995-02-01

    This research addresses the problem of achieving fault tolerant cooperation within small- to medium-sized teams of heterogeneous mobile robots. The author describes a novel behavior-based, fully distributed architecture, called ALLIANCE, that utilizes adaptive action selection to achieve fault tolerant cooperative control in robot missions involving loosely coupled, largely independent tasks. The robots in this architecture possess a variety of high-level functions that they can perform during a mission, and must at all times select an appropriate action based on the requirements of the mission, the activities of other robots, the current environmental conditions, and their own internal states. Since such cooperative teams often work in dynamic and unpredictable environments, the software architecture allows the team members to respond robustly and reliably to unexpected environmental changes and modifications in the robot team that may occur due to mechanical failure, the learning of new skills, or the addition or removal of robots from the team by human intervention. After presenting ALLIANCE, the author describes in detail experimental results of an implementation of this architecture on a team of physical mobile robots performing a cooperative box pushing demonstration. These experiments illustrate the ability of ALLIANCE to achieve adaptive, fault-tolerant cooperative control amidst dynamic changes in the capabilities of the robot team.

  17. Implementing Fault-Tolerant Services in Goal-Oriented Multi-Agent Systems

    Directory of Open Access Journals (Sweden)

    BORA, S.

    2014-08-01

    This paper details the implementation of fault tolerance services in a goal-oriented multi-agent systems development platform. Fault tolerance services are used to provide replication-based fault tolerance policies (i.e. static and adaptive) to multi-agent systems. This approach provided flexibility and reusability to multi-agent systems because fault tolerance policies were implemented as reusable plan structures. Thus, whenever an agent needed to be made fault-tolerant, plans for fault tolerance policies were simply activated by sending a request message.

  18. Fault tolerant UAVs are coming; Fault tolerant mujinki jidai no torai

    Energy Technology Data Exchange (ETDEWEB)

    Sumita, J.

    1999-06-05

    This paper explains the concept of the UAV (unmanned aviation vehicle). Previous UAVs have succeeded because of their simple systems and simple operation. However, future UAVs are strongly required to offer higher reliability and safety than ordinary aircraft as expectations for the missions to be executed rise. In other words, future UAVs should aim at a fault tolerant system featuring autonomous operation and a reliability of less than 10^-9 faults/hour. Recently, ordinary aircraft have also come to adopt auto-sequence control for flight control systems, achieving highly programmed automatic control from takeoff to landing. A UAV with an autonomous operation function that allows it to return to base has also been developed. A system reliability at the 10^-9 level against flight-critical phenomena is required for ordinary commercial aircraft. It is expected that reliability equal to or better than this will be required of UAVs as a system design requirement in the near future. (NEDO)

  19. Design of fault tolerant control system for steam generator using

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Myung Ki; Seo, Mi Ro [Korea Electric Power Research Institute, Taejon (Korea, Republic of)

    1998-12-31

    A controller and sensor fault-tolerant system for a steam generator is designed with fuzzy logic. The proposed fault-tolerant redundant system is composed of a supervisor and two fuzzy weighting modulators. The supervisor alternately checks controller-induced and sensor-induced performance to identify which part, a controller or a sensor, is faulty. To analyze controller-induced performance, both the error and the change in error of the system output are chosen as fuzzy variables. The fuzzy logic for sensor-induced performance uses two variables: the deviation between two sensor outputs and its frequency. Each fuzzy weighting modulator generates an output signal that compensates for a faulty input signal. Simulations show that the proposed fault-tolerant control scheme for a steam generator regulates water level well by suppressing the fault effects of either controllers or sensors. Therefore, by duplicating sensors and controllers with the proposed fault-tolerant scheme, the reliability of both the steam generator control and sensor system and the power plant increases. 2 refs., 9 figs., 1 tab. (Author)

  20. A Ship Propulsion System Model for Fault-tolerant Control

    DEFF Research Database (Denmark)

    Izadi-Zamanabadi, Roozbeh; Blanke, M.

    This report presents a propulsion system model for a low speed marine vehicle, which can be used as a test benchmark for Fault-Tolerant Control purposes. The benchmark serves the purpose of offering realistic and challenging problems relevant in both FDI and (autonomous) supervisory control area...

  1. A benchmark for fault tolerant flight control evaluation

    NARCIS (Netherlands)

    Smaili, H.; Breeman, J.; Lombaerts, T.; Stroosma, O.

    2013-01-01

    A large transport aircraft simulation benchmark (REconfigurable COntrol for Vehicle Emergency Return - RECOVER) has been developed within the GARTEUR (Group for Aeronautical Research and Technology in Europe) Flight Mechanics Action Group 16 (FM-AG(16)) on Fault Tolerant Control (2004-2008) for the

  2. A Fault tolerant Control Supervisory System development Procedure for Small Satellites

    DEFF Research Database (Denmark)

    Izadi-Zamanabadi, Roozbeh; Larsen, Jesper Abildgaard

    The paper presents a stepwise procedure to develop a fault tolerant control system for small satellites. The procedure is illustrated through implementation on the AAUSAT-II spacecraft. As it is shown the presented procedure requires expertise from several disciplines that are nevertheless...

  3. Modular Multilevel Converter Control Strategy with Fault Tolerance

    DEFF Research Database (Denmark)

    Teodorescu, Remus; Eni, Emanuel-Petre; Mathe, Laszlo

    2013-01-01

    The Modular Multilevel Converter (MMC) technology has recently emerged in VSC-HVDC applications where it demonstrated higher efficiency and fault tolerance compared to the classical 2-level topology. Due to the ability of MMC to connect to HV levels, MMC can be also used in transformerless STATCO...... communication infrastructure based on Industrial Ethernet....

  4. Fault-tolerant Sensor Fusion for Marine Navigation

    DEFF Research Database (Denmark)

    Blanke, Mogens

    2006-01-01

    Reliability of navigation data is critical for steering and manoeuvring control, and in particular so at high speed or in critical phases of a mission. Should faults occur, faulty instruments need be autonomously isolated and faulty information discarded. This paper designs a navigation solution...... events where the fault-tolerant sensor fusion provided uninterrupted navigation data despite temporal instrument defects...

  5. Electrical Steering of Vehicles - Fault-tolerant Analysis and Design

    DEFF Research Database (Denmark)

    Blanke, Mogens; Thomsen, Jesper Sandberg

    2006-01-01

    The topic of this paper is systems that need be designed such that no single fault can cause failure at the overall level. A methodology is presented for analysis and design of fault-tolerant architectures, where diagnosis and autonomous reconfiguration can replace high cost triple redundancy sol...

  6. Fault-tolerant Actuator System for Electrical Steering of Vehicles

    DEFF Research Database (Denmark)

    Sørensen, Jesper Sandberg; Blanke, Mogens

    2006-01-01

    Being critical to the safety of vehicles, the steering system is required to maintain the vehicle's ability to steer until it is brought to halt, should a fault occur. With electrical steering becoming a cost-effective candidate for electrical powered vehicles, a fault-tolerant architecture...

  7. Fault tolerant control - a residual based set-up

    DEFF Research Database (Denmark)

    Niemann, Hans Henrik; Poulsen, Niels Kjølstad

    2009-01-01

    A new set-up for fault tolerant control (FTC) for stable systems is presented in this paper. The new set-up is based on a simple implementation of the Youla-Jabr-Bongiorno-Kucera (YJBK) parameterization. This implementation of the YJBK parameterization will allow a direct and simple reconfigurati...

  8. Modular Multilevel Converter Control Strategy with Fault Tolerance

    DEFF Research Database (Denmark)

    Teodorescu, Remus; Eni, Emanuel-Petre; Mathe, Laszlo

    2013-01-01

    The Modular Multilevel Converter (MMC) technology has recently emerged in VSC-HVDC applications where it demonstrated higher efficiency and fault tolerance compared to the classical 2-level topology. Due to the ability of MMC to connect to HV levels, MMC can be also used in transformerless STATCO...

  9. Fault-tolerant system for catastrophic faults in AMR sensors

    NARCIS (Netherlands)

    Zambrano Constantini, A.C.; Kerkhoff, Hans G.

    Anisotropic Magnetoresistance angle sensors are widely used in automotive applications considered to be safety-critical applications. Therefore dependability is an important requirement and fault-tolerant strategies must be used to guarantee the correct operation of the sensors even in case of

  10. Fault-Tolerant Process Control Methods and Applications

    CERN Document Server

    Mhaskar, Prashant; Christofides, Panagiotis D

    2013-01-01

    Fault-Tolerant Process Control focuses on the development of general, yet practical, methods for the design of advanced fault-tolerant control systems; these ensure an efficient fault detection and a timely response to enhance fault recovery, prevent faults from propagating or developing into total failures, and reduce the risk of safety hazards. To this end, methods are presented for the design of advanced fault-tolerant control systems for chemical processes which explicitly deal with actuator/controller failures and sensor faults and data losses. Specifically, the book puts forward: a framework for detection, isolation and diagnosis of actuator and sensor faults for nonlinear systems; controller reconfiguration and safe-parking-based fault-handling methodologies; integrated-data- and model-based fault-detection and isolation and fault-tolerant control methods; methods for handling sensor faults and data losses; and ...

  11. Fault Tolerant Control System Design Using Automated Methods from Risk Analysis

    DEFF Research Database (Denmark)

    Blanke, M.

    Fault tolerant controls have the ability to be resilient to simple faults in control loop components.

  12. Organization of the secure distributed computing based on multi-agent system

    Science.gov (United States)

    Khovanskov, Sergey; Rumyantsev, Konstantin; Khovanskova, Vera

    2018-04-01

    Methods for distributed computing are currently receiving much attention. One such method is the use of multi-agent systems. Distributed computing organized on conventional networked computers can be exposed to security threats from computational processes. The authors have developed a unified agent algorithm for a control system that governs the operation of computing network nodes, with network PCs used as the computing nodes. The proposed multi-agent control system makes it possible to quickly harness the processing power of the computers in any existing network to solve a large task by creating a distributed computing system. Agents deployed on a computer network can configure a distributed computing system, distribute the computational load among the computers operated by agents, and optimize the distributed computing system according to the computing power of the computers on the network. The number of computers in the system can be increased by connecting new computers, which increases overall processing power. Adding a central agent to the multi-agent system increases the security of distributed computing. This organization of the distributed computing system reduces problem-solving time and increases the fault tolerance (vitality) of computing processes in a changing computing environment (dynamic change in the number of computers on the network). The developed multi-agent system detects cases of falsification of the results of a distributed system, which may lead to wrong decisions. In addition, the system checks and corrects wrong results.
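
    One way to picture "distributing the computational load according to computing power" is a proportional split of tasks among nodes, recomputed whenever the set of nodes changes. The Python sketch below is an assumption-laden illustration, not the authors' agent algorithm; the function distribute_load, the integer power scores and the node names are hypothetical.

```python
def distribute_load(total_tasks, node_power):
    """Split total_tasks among nodes in proportion to their integer power scores."""
    total_power = sum(node_power.values())
    # Floor of each node's proportional share.
    shares = {
        node: total_tasks * power // total_power
        for node, power in node_power.items()
    }
    leftover = total_tasks - sum(shares.values())
    # Hand the remaining tasks to the most powerful nodes first.
    ranked = sorted(node_power, key=node_power.get, reverse=True)
    for node in ranked[:leftover]:
        shares[node] += 1
    return shares


nodes = {"pc1": 1, "pc2": 2, "pc3": 2}
plan = distribute_load(10, nodes)

# A node leaves the network: recompute the plan over the remaining nodes.
del nodes["pc3"]
plan_after_loss = distribute_load(10, nodes)
```

    Recomputing the plan after a node drops out is what lets the remaining computers absorb its share, which is the sense in which a dynamically changing network can still complete the task.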

  13. Fault Tolerant, Radiation hard DSP, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — Commercial digital signal processors (DSP) are problematic for satellite computers due to damaging space radiation effects, particularly single event upsets (SEU)...

  14. Scheduling and Optimization of Fault-Tolerant Embedded Systems with Transparency/Performance Trade-Offs

    DEFF Research Database (Denmark)

    Izosimov, Viacheslav; Pop, Paul; Eles, Petru

    2012-01-01

    In this article, we propose a strategy for the synthesis of fault-tolerant schedules and for the mapping of fault-tolerant applications. Our techniques handle transparency/performance trade-offs and use the fault-occurrence information to reduce the overhead due to fault tolerance. Processes and m...

  15. Distributed Computing: An Overview

    OpenAIRE

    Md. Firoj Ali; Rafiqul Zaman Khan

    2015-01-01

    Decreases in hardware costs and advances in computer networking technologies have led to increased interest in the use of large-scale parallel and distributed computing systems. Distributed computing systems offer the potential for improved performance and resource sharing. This paper gives an overview of distributed computing. We studied the difference between parallel and distributed computing, terminologies used in distributed computing, task allocation in distribute...

  16. A universal, fault-tolerant, non-linear analytic network for modeling and fault detection

    Energy Technology Data Exchange (ETDEWEB)

    Mott, J.E. [Advanced Modeling Techniques Corp., Idaho Falls, ID (United States); King, R.W.; Monson, L.R.; Olson, D.L.; Staffon, J.D. [Argonne National Lab., Idaho Falls, ID (United States)

    1992-03-06

    The similarities and differences of a universal network to normal neural networks are outlined. The description and application of a universal network is discussed by showing how a simple linear system is modeled by normal techniques and by universal network techniques. A full implementation of the universal network as universal process modeling software on a dedicated computer system at EBR-II is described and example results are presented. It is concluded that the universal network provides different feature recognition capabilities than a neural network and that the universal network can provide extremely fast, accurate, and fault-tolerant estimation, validation, and replacement of signals in a real system.

  17. A universal, fault-tolerant, non-linear analytic network for modeling and fault detection

    International Nuclear Information System (INIS)

    Mott, J.E.; King, R.W.; Monson, L.R.; Olson, D.L.; Staffon, J.D.

    1992-01-01

    The similarities and differences of a universal network to normal neural networks are outlined. The description and application of a universal network is discussed by showing how a simple linear system is modeled by normal techniques and by universal network techniques. A full implementation of the universal network as universal process modeling software on a dedicated computer system at EBR-II is described and example results are presented. It is concluded that the universal network provides different feature recognition capabilities than a neural network and that the universal network can provide extremely fast, accurate, and fault-tolerant estimation, validation, and replacement of signals in a real system

  18. A fault-tolerant software strategy for digital systems

    Science.gov (United States)

    Hitt, E. F.; Webb, J. J.

    1984-01-01

    Techniques developed for producing fault-tolerant software are described. Tolerance is required because of the impossibility of defining fault-free software. Faults are caused by humans and can appear anywhere in the software life cycle. Tolerance is effected through error detection, damage assessment, recovery, and fault treatment, followed by return of the system to service. Multiversion software comprises two or more versions of the software yielding solutions which are examined by a decision algorithm. Errors can also be detected by extrapolation from previous results or by the acceptability of results. Violations of timing specifications can reveal errors, or the system can roll back to an error-free state when a defect is detected. The software, when used in flight control systems, must not impinge on time-critical responses. Efforts are still needed to reduce the costs of developing the fault-tolerant systems.
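
    The multiversion idea described above (several versions whose results are examined by a decision algorithm) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the three toy versions, and the choice of strict majority voting as the decision algorithm are not taken from the report.

```python
from typing import Callable, Sequence


def n_version_execute(
    versions: Sequence[Callable[[float], float]],
    x: float,
    tolerance: float = 1e-9,
) -> float:
    """Run every version on x and return a result backed by a strict majority.

    A version that raises an exception simply loses its vote. The majority is
    counted against the total number of versions, not only the survivors.
    """
    results = []
    for version in versions:
        try:
            results.append(version(x))
        except Exception:
            continue  # a crashed version contributes no result
    for candidate in results:
        votes = sum(1 for r in results if abs(r - candidate) <= tolerance)
        if 2 * votes > len(versions):
            return candidate
    raise RuntimeError("no majority among versions")


# Three hypothetical, independently written versions of the same function.
def square_a(x: float) -> float:
    return x * x


def square_b(x: float) -> float:
    return x ** 2


def square_faulty(x: float) -> float:
    return x * x + 1.0  # deliberately wrong, to stand in for a design fault


if __name__ == "__main__":
    print(n_version_execute([square_a, square_b, square_faulty], 3.0))
```

    With two correct versions out of three, the faulty result is outvoted; if no value reaches a majority, the sketch raises an error so that a caller could fall back to another recovery action.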

  19. Passive Fault tolerant Control of an Inverted Double Pendulum

    DEFF Research Database (Denmark)

    Niemann, H.; Stoustrup, Jakob

    2003-01-01

    A passive fault tolerant control scheme is suggested, in which a nominal controller is augmented with an additional block, which guarantees stability and performance after the occurrence of a fault. The method is based on the Youla parameterization, which requires the nominal controller to be implemented in the observer based form. The proposed method is applied to a double inverted pendulum system, for which an H∞ controller has been designed and verified in a lab setup. In this case study, the fault is a degradation of the tacho loop.

  20. FAULT-TOLERANT DESIGN FOR ADVANCED DIVERSE PROTECTION SYSTEM

    Directory of Open Access Journals (Sweden)

    YANG GYUN OH

    2013-11-01

    For the improvement of APR1400 Diverse Protection System (DPS) design, the Advanced DPS (ADPS) has recently been developed to enhance the fault tolerance capability of the system. Major fault masking features of the ADPS compared with the APR1400 DPS are the changes to the channel configuration and reactor trip actuation equipment. To minimize the fault occurrences within the ADPS, and to mitigate the consequences of common-cause failures (CCF) within the safety I&C systems, several fault avoidance design features have been applied in the ADPS. The fault avoidance design features include the changes to the system software classification, communication methods, equipment platform, MMI equipment, etc. In addition, the fault detection, location, containment, and recovery processes have been incorporated in the ADPS design. Therefore, it is expected that the ADPS can provide an enhanced fault tolerance capability against the possible faults within the system and its input/output equipment, and the CCF of safety systems.

  1. Fault-tolerant Control of a Cyber-physical System

    Science.gov (United States)

    Roxana, Rusu-Both; Eva-Henrietta, Dulf

    2017-10-01

    Cyber-physical systems represent a new emerging field in automatic control. The fault system is a key component, because modern, large scale processes must meet high standards of performance, reliability and safety. Fault propagation in large scale chemical processes can lead to loss of production, energy, raw materials and even environmental hazard. The present paper develops a multi-agent fault-tolerant control architecture using robust fractional order controllers for a (13C) cryogenic separation column cascade. The JADE (Java Agent DEvelopment Framework) platform was used to implement the multi-agent fault tolerant control system while the operational model of the process was implemented in Matlab/SIMULINK environment. MACSimJX (Multiagent Control Using Simulink with Jade Extension) toolbox was used to link the control system and the process model. In order to verify the performance and to prove the feasibility of the proposed control architecture several fault simulation scenarios were performed.

  2. Fault Tolerant Position-mooring Control for Offshore Vessels

    DEFF Research Database (Denmark)

    Blanke, Mogens; Nguyen, Trong Dong

    2018-01-01

    Fault-tolerance is crucial to maintain safety in offshore operations. The objective of this paper is to show how systematic analysis and design of fault-tolerance is conducted for a complex automation system, exemplified by thruster assisted Position-mooring. Using redundancy as required... Functional faults that are only detectable, are rendered isolable through an active isolation approach. Once functional faults are isolated, they are handled by fault accommodation techniques to meet overall control objectives specified by class requirements. The paper illustrates the generic methodology by a system to handle faults in mooring lines, sensors or thrusters. Simulations and model basin experiments are carried out to validate the concept for scenarios with single or multiple faults. The results demonstrate that enhanced availability and safety are obtainable with this design approach. While...

  3. Fault Tolerance in ZigBee Wireless Sensor Networks

    Science.gov (United States)

    Alena, Richard; Gilstrap, Ray; Baldwin, Jarren; Stone, Thom; Wilson, Pete

    2011-01-01

    Wireless sensor networks (WSN) based on the IEEE 802.15.4 Personal Area Network standard are finding increasing use in the home automation and emerging smart energy markets. The network and application layers, based on the ZigBee 2007 PRO Standard, provide a convenient framework for component-based software that supports customer solutions from multiple vendors. This technology is supported by System-on-a-Chip solutions, resulting in extremely small and low-power nodes. The Wireless Connections in Space Project addresses the aerospace flight domain for both flight-critical and non-critical avionics. WSNs provide the inherent fault tolerance required for aerospace applications utilizing such technology. The team from Ames Research Center has developed techniques for assessing the fault tolerance of ZigBee WSNs challenged by radio frequency (RF) interference or WSN node failure.

  4. Database mirroring in fault-tolerant continuous technological process control

    Directory of Open Access Journals (Sweden)

    R. Danel

    2015-10-01

    This paper describes the implementations of mirroring technology of the selected database systems – Microsoft SQL Server, MySQL and Caché. By simulating critical failures the systems' behavior and their resilience against failure were tested. The aim was to determine whether the database mirroring is suitable to use in continuous metallurgical processes for ensuring the fault-tolerant solution at affordable cost. The present day database systems are characterized by high robustness and are resistant to sudden system failure. Database mirroring technologies are reliable and even low-budget projects can be provided with a decent fault-tolerant solution. The database system technologies available for low-budget projects are not suitable for use in real-time systems.

  5. Robust and Fault Tolerant Control of CD-players

    DEFF Research Database (Denmark)

    Vidal, Enrique Sanchez

    ... is to be found in the fault-diagnosis and fault-tolerant control fields. One of the main challenges in the positioning control of the focus point in CD-players is to handle two types of disturbances with conflicting requirements in an effective way. While a high bandwidth is desired to better suppress shocks, a low bandwidth is preferred in the presence of surface defects. Traditionally, a simple defect detector is employed to deal with this trade-off. In this work, two fault diagnosis schemes are suggested which are able not only to detect but also to separate, to a certain extent, the characteristics of the signals originated by the surface defects. Furthermore two fault-tolerant control schemes are proposed such that the mentioned trade-off is handled in a more efficient way...

  6. Fault Tolerant Position-mooring Control for Offshore Vessels

    DEFF Research Database (Denmark)

    Blanke, Mogens; Nguyen, Trong Dong

    2018-01-01

    Fault-tolerance is crucial to maintain safety in offshore operations. The objective of this paper is to show how systematic analysis and design of fault-tolerance is conducted for a complex automation system, exemplified by thruster assisted Position-mooring. Using redundancy as required... Functional faults that are only detectable, are rendered isolable through an active isolation approach. Once functional faults are isolated, they are handled by fault accommodation techniques to meet overall control objectives specified by class requirements. The paper illustrates the generic methodology by a system to handle faults in mooring lines, sensors or thrusters. Simulations and model basin experiments are carried out to validate the concept for scenarios with single or multiple faults. The results demonstrate that enhanced availability and safety are obtainable with this design approach. While...

  7. Task Migration for Fault-Tolerance in Mixed-Criticality Embedded Systems

    DEFF Research Database (Denmark)

    Saraswat, Prabhat Kumar; Pop, Paul; Madsen, Jan

    2009-01-01

    In this paper we are interested in mixed-criticality embedded applications implemented on distributed architectures. Depending on their time-criticality, tasks can be hard or soft real-time and regarding safety-criticality, tasks can be fault-tolerant to transient faults, permanent faults, or have... For tolerating permanent faults in processors, we use task migration, i.e., restarting the safety-critical tasks on other processors. We propose a Greedy-based on-line heuristic for the migration of safety-critical tasks, in response to permanent faults, and the adjustment of CBS parameters on the target...

  8. Energy-Efficient Fault-Tolerant Dynamic Event Region Detection in Wireless Sensor Networks

    DEFF Research Database (Denmark)

    Enemark, Hans-Jacob; Zhang, Yue; Dragoni, Nicola

    2015-01-01

    Fault-tolerant event detection is fundamental to wireless sensor network applications. Existing approaches usually adopt neighborhood collaboration for better detection accuracy, but at the cost of higher energy consumption due to communication. Focusing on energy efficiency, this paper makes an improvement to a hybrid algorithm for dynamic event region detection, such as real-time tracking of chemical leakage regions. Considering the characteristics of the moving away dynamic events, we propose a return back condition for the hybrid algorithm from distributed neighborhood collaboration, in which a node makes...

  9. On the feasibility of a spaceborne fault-tolerant hypercube

    Science.gov (United States)

    Rennels, David A.; Mathur, Frank P.; Chau, Savio N.; Rohr, John A.

    1989-01-01

    The feasibility of implementing a fault-tolerant hypercube architecture for space applications is discussed. Node-level architectures and designs are considered and a first-order reliability model is presented. It is shown how error recovery can be implemented using program rollback or roll-forward techniques. Shared memory augmentations to the message-passing structure can be used to get around the inefficiencies of multicomputers to provide efficient use of hardware to achieve the needed reliabilities while maintaining performance.

  10. FAULT-TOLERANT DESIGN FOR ADVANCED DIVERSE PROTECTION SYSTEM

    OpenAIRE

    YANG GYUN OH; JIN KWON JEONG; CHANG JAE LEE; YOON HEE LEE; SEUNG MIN BAEK; SANG JEONG LEE

    2013-01-01

    For the improvement of APR1400 Diverse Protection System (DPS) design, the Advanced DPS (ADPS) has recently been developed to enhance the fault tolerance capability of the system. Major fault masking features of the ADPS compared with the APR1400 DPS are the changes to the channel configuration and reactor trip actuation equipment. To minimize the fault occurrences within the ADPS, and to mitigate the consequences of common-cause failures (CCF) within the safety I&C systems, several fault avo...

  11. Wind turbine fault detection and fault tolerant control

    DEFF Research Database (Denmark)

    Odgaard, Peter Fogh; Johnson, Kathryn

    2013-01-01

    In this updated edition of a previous wind turbine fault detection and fault tolerant control challenge, we present a more sophisticated wind turbine model and updated fault scenarios to enhance the realism of the challenge and therefore the value of the solutions. This paper describes the challenge model and the requirements for challenge participants. In addition, it motivates many of the faults by citing publications that give field data from wind turbine control tests...

  12. Fault Tolerance for Fight Through (FTFT)

    Science.gov (United States)

    2013-02-01

    ... Published by Springer, Delhi, India, May 2012, pp. 883-896. 27. “Inverting the Information Pyramid,” Federal Computer Week, Vol. 26, No. 4, March ... networks such that all networks can co-exist. A preliminary version of this work was presented in IEEE WCNC 2012 in Paris, France [35]. The complete ... 04g.html 4. David Leversage and Eric Byres, “Estimating a System’s Mean Time-to-Compromise,” Journal of Security and Privacy, Vol. 6, Issue 1

  13. SABRE: a bio-inspired fault-tolerant electronic architecture

    International Nuclear Information System (INIS)

    Bremner, P; Samie, M; Dragffy, G; Pipe, A G; Liu, Y; Tempesti, G; Timmis, J; Tyrrell, A M

    2013-01-01

    As electronic devices become increasingly complex, ensuring their reliable, fault-free operation is becoming correspondingly more challenging. It can be observed that, in spite of their complexity, biological systems are highly reliable and fault tolerant. Hence, we are motivated to take inspiration from biological systems in the design of electronic ones. In SABRE (self-healing cellular architectures for biologically inspired highly reliable electronic systems), we have designed a bio-inspired fault-tolerant hierarchical architecture for this purpose. As in biology, the foundation for the whole system is cellular in nature, with each cell able to detect faults in its operation and trigger intra-cellular or extra-cellular repair as required. At the next level in the hierarchy, arrays of cells are configured and controlled as function units in a transport triggered architecture (TTA), which is able to perform partial-dynamic reconfiguration to rectify problems that cannot be solved at the cellular level. Each TTA is, in turn, part of a larger multi-processor system which employs coarser grain reconfiguration to tolerate faults that cause a processor to fail. In this paper, we describe the details of operation of each layer of the SABRE hierarchy, and how these layers interact to provide a high systemic level of fault tolerance. (paper)

  14. Fault tolerant aggregation for power system services

    DEFF Research Database (Denmark)

    Kosek, Anna Magdalena; Gehrke, Oliver; Kullmann, Daniel

    2013-01-01

    Exploiting the flexibility in distributed energy resources (DER) is seen as an important contribution to allow high penetrations of renewable generation in electrical power systems. However, the present control infrastructure in power systems is not well suited for the integration of a very large number of small units. A common approach is to aggregate a portfolio of such units together and expose them to the power system as a single large virtual unit. In order to realize the vision of a Smart Grid, concepts for flexible, resilient and reliable aggregation infrastructures are required...

  15. Advanced information processing system: The Army fault tolerant architecture conceptual study. Volume 2: Army fault tolerant architecture design and analysis

    Science.gov (United States)

    Harper, R. E.; Alger, L. S.; Babikyan, C. A.; Butler, B. P.; Friend, S. A.; Ganska, R. J.; Lala, J. H.; Masotto, T. K.; Meyer, A. J.; Morton, D. P.

    1992-01-01

    Described here is the Army Fault Tolerant Architecture (AFTA) hardware architecture and components and the operating system. The architectural and operational theory of the AFTA Fault Tolerant Data Bus is discussed. The test and maintenance strategy developed for use in fielded AFTA installations is presented. An approach to be used in reducing the probability of AFTA failure due to common mode faults is described. Analytical models for AFTA performance, reliability, availability, life cycle cost, weight, power, and volume are developed. An approach is presented for using VHSIC Hardware Description Language (VHDL) to describe and design AFTA's developmental hardware. A plan is described for verifying and validating key AFTA concepts during the Dem/Val phase. Analytical models and partial mission requirements are used to generate AFTA configurations for the TF/TA/NOE and Ground Vehicle missions.

  16. Study, design and realization of a fault-tolerant and predictable synchronous communication protocol on off-the-shelf components

    International Nuclear Information System (INIS)

    Chabrol, D.

    2006-06-01

    This PhD thesis contributes to the design and realization of safety-critical real-time systems on multiprocessor architectures with distributed memory. Such architectures are essential for computing systems that have to ensure complex and critical functions. The thesis deals with communication media management, which strongly conditions the capability of the system to fulfill the timeliness property and the dependability requirements. Our contribution includes: - the design of a predictable and fault-tolerant synchronous communication protocol; - the study and definition of the execution model for efficient and safe communications management; - the proposal of a method to automatically generate the communications scheduling. Our approach is based on a communication model that allows the feasibility of a distributed safety-critical real-time system with timeliness and safety requirements to be analyzed before execution. This leads to the definition of an execution model based on time-triggered and parallel communication management. A system of linear constraints is generated automatically to compute the network scheduling and the network load while fulfilling timeliness. The proposed communication interface is then based on an advanced version of the TDMA protocol, which allows the use of proprietary components (TTP, FlexRay) as well as standard components (Ethernet). The concepts presented in this thesis led to the realisation and evaluation of a prototype within the framework of the OASIS project carried out at the CEA/List. (author)

  17. Techniques for modeling the reliability of fault-tolerant systems with the Markov state-space approach

    Science.gov (United States)

    Butler, Ricky W.; Johnson, Sally C.

    1995-01-01

    This paper presents a step-by-step tutorial of the methods and the tools that were used for the reliability analysis of fault-tolerant systems. The approach used in this paper is the Markov (or semi-Markov) state-space method. The paper is intended for design engineers with a basic understanding of computer architecture and fault tolerance, but little knowledge of reliability modeling. The representation of architectural features in mathematical models is emphasized. This paper does not present details of the mathematical solution of complex reliability models. Instead, it describes the use of several recently developed computer programs (SURE, ASSIST, STEM, and PAWS) that automate the generation and the solution of these models.
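
    As a rough illustration of the Markov state-space method, the sketch below models a triple modular redundant (TMR) system without repair. The TMR example, the failure rate, and all function names are assumptions chosen for illustration; the code does not reproduce SURE, ASSIST, STEM, or PAWS. The three states are "three units working", "two units working", and "system failed", with transition rates 3λ and 2λ.

```python
import math


def tmr_reliability_numeric(failure_rate: float, t_end: float, steps: int = 100_000) -> float:
    """Integrate the state equations with forward Euler and return R(t_end).

    p[0]: all three units working, p[1]: two working, p[2]: system failed.
    """
    p = [1.0, 0.0, 0.0]
    dt = t_end / steps
    for _ in range(steps):
        flow01 = 3 * failure_rate * p[0] * dt  # any of three units fails
        flow12 = 2 * failure_rate * p[1] * dt  # one of the two remaining fails
        p = [p[0] - flow01, p[1] + flow01 - flow12, p[2] + flow12]
    return p[0] + p[1]  # system works while at least two units work


def tmr_reliability_closed_form(failure_rate: float, t: float) -> float:
    """Closed-form solution of the same model: 3e^(-2λt) - 2e^(-3λt)."""
    return 3 * math.exp(-2 * failure_rate * t) - 2 * math.exp(-3 * failure_rate * t)


if __name__ == "__main__":
    lam, hours = 1e-3, 1000.0  # assumed failure rate per hour and mission time
    print(tmr_reliability_numeric(lam, hours))
    print(tmr_reliability_closed_form(lam, hours))
```

    The numeric and closed-form values should agree closely, which is a quick sanity check on the state-transition rates before a model is extended with repair or reconfiguration states.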

  18. Distributed computing system with dual independent communications paths between computers and employing split tokens

    Science.gov (United States)

    Rasmussen, Robert D. (Inventor); Manning, Robert M. (Inventor); Lewis, Blair F. (Inventor); Bolotin, Gary S. (Inventor); Ward, Richard S. (Inventor)

    1990-01-01

    This is a distributed computing system providing flexible fault tolerance; ease of software design and concurrency specification; and dynamic balance of the loads. The system comprises a plurality of computers each having a first input/output interface and a second input/output interface for interfacing to communications networks each second input/output interface including a bypass for bypassing the associated computer. A global communications network interconnects the first input/output interfaces for providing each computer the ability to broadcast messages simultaneously to the remainder of the computers. A meshwork communications network interconnects the second input/output interfaces providing each computer with the ability to establish a communications link with another of the computers bypassing the remainder of computers. Each computer is controlled by a resident copy of a common operating system. Communications between respective ones of computers is by means of split tokens each having a moving first portion which is sent from computer to computer and a resident second portion which is disposed in the memory of at least one of computer and wherein the location of the second portion is part of the first portion. The split tokens represent both functions to be executed by the computers and data to be employed in the execution of the functions. The first input/output interfaces each include logic for detecting a collision between messages and for terminating the broadcasting of a message whereby collisions between messages are detected and avoided.

  19. Fault-tolerant design of local controller for the poloidal field converter control system on ITER

    International Nuclear Information System (INIS)

    Shen, Jun; Fu, Peng; Gao, Ge; He, Shiying; Huang, Liansheng; Zhu, Lili; Chen, Xiaojiao

    2016-01-01

    Highlights: • The requirements on the Local Control Cubicles (LCC) for the ITER Poloidal Field Converter are analyzed. • A decoupled service-based software architecture is proposed to let control loops on the LCC run at varying cycle times. • Fault detection and recovery methods for the LCC are developed to enhance the system. • The performance of the LCC with and without the fault-tolerant feature is tested and compared. - Abstract: The control system for the Poloidal Field (PF) on ITER is a synchronously networked control system, which has several kinds of computational controllers. The Local Control Cubicles (LCC) play a critical role in the networked control system, for they are the interface to all input and output signals. Thus, some additional work must be done to guarantee the LCCs' proper operation under the influence of faults. This paper mainly analyzes the system demands of the LCCs and the faults which have been encountered recently. In order to handle these faults, a decoupled service-based software architecture has been proposed. Based on this architecture, fault detection and system recovery methods, such as redundancy and rejuvenation, have been incorporated to achieve a fault-tolerant private network with the aid of the QNX operating system. Unlike the conventional method, this method requires no additional hardware and can be achieved relatively easily. To demonstrate effectiveness, the LCCs have been successfully tested during the recent PF Converter Unit performance tests for ITER.

  20. Evaluation of Simple Causal Message Logging for Large-Scale Fault Tolerant HPC Systems

    Energy Technology Data Exchange (ETDEWEB)

    Bronevetsky, G; Meneses, E; Kale, L V

    2011-02-25

    The era of petascale computing brought machines with hundreds of thousands of processors. The next generation of exascale supercomputers will make available clusters with millions of processors. In those machines, mean time between failures will range from a few minutes to few tens of minutes, making the crash of a processor the common case, instead of a rarity. Parallel applications running on those large machines will need to simultaneously survive crashes and maintain high productivity. To achieve that, fault tolerance techniques will have to go beyond checkpoint/restart, which requires all processors to roll back in case of a failure. Incorporating some form of message logging will provide a framework where only a subset of processors are rolled back after a crash. In this paper, we discuss why a simple causal message logging protocol seems a promising alternative to provide fault tolerance in large supercomputers. As opposed to pessimistic message logging, it has low latency overhead, especially in collective communication operations. Besides, it saves messages when more than one thread is running per processor. Finally, we demonstrate that a simple causal message logging protocol has a faster recovery and a low performance penalty when compared to checkpoint/restart. Running NAS Parallel Benchmarks (CG, MG and BT) on 1024 processors, simple causal message logging has a latency overhead below 5%.
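
    The contrast drawn above, where message logging lets only the failed processor roll back, can be sketched with a toy model. This is a simplified log-and-replay illustration under assumed names (`Process`, `send`, a per-receiver log that survives the crash); it is not the causal message logging protocol evaluated in the paper, which additionally tracks the order information needed for deterministic replay.

```python
from typing import List


class Process:
    """Toy process whose state is the running sum of delivered messages."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state = 0
        self.delivered = 0        # number of messages delivered so far
        self.checkpoint_state = 0
        self.checkpoint_seq = 0   # value of `delivered` at the last checkpoint

    def deliver(self, value: int) -> None:
        self.state += value
        self.delivered += 1

    def take_checkpoint(self) -> None:
        self.checkpoint_state = self.state
        self.checkpoint_seq = self.delivered

    def recover(self, message_log: List[int]) -> None:
        """Restore the checkpoint, then replay only the messages logged after it."""
        self.state = self.checkpoint_state
        self.delivered = self.checkpoint_seq
        for value in message_log[self.checkpoint_seq:]:
            self.deliver(value)


def send(log: List[int], receiver: Process, value: int) -> None:
    """Log the message outside the receiver (assumed to survive its crash), then deliver it."""
    log.append(value)
    receiver.deliver(value)


if __name__ == "__main__":
    log_for_p: List[int] = []
    p = Process("p")
    send(log_for_p, p, 1)
    send(log_for_p, p, 2)
    p.take_checkpoint()
    send(log_for_p, p, 3)
    send(log_for_p, p, 4)
    p.state = -999            # simulate a crash that corrupts volatile state
    p.recover(log_for_p)      # only p rolls back; no other process is involved
    print(p.state)
```

    In plain checkpoint/restart, every process would return to its last checkpoint; here the log supplies what the failed process missed, so the others keep running.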

  1. A Fault-tolerable Control Scheme for an Open-frame Underwater Vehicle

    Directory of Open Access Journals (Sweden)

    Huang Hai

    2014-05-01

    Open-frame is one of the major types of structures of Remote Operated Vehicles (ROV) because it is easy to place sensors and operations equipment onboard. Firstly, this paper designed a petri-based recurrent neural network (PRFNN) to improve robustness against the nonlinear characteristics and strong disturbances of an open-frame underwater vehicle. A threshold has been set in the third layer to reduce the amount of calculations and regulate the training process. The whole network convergence is guaranteed with the selection of learning rate parameters. Secondly, a fault tolerance control (FTC) scheme is established with the optimal allocation of thrust. Infinity-norm optimization has been combined with 2-norm optimization to construct a bi-criteria primal-dual neural network FTC scheme. In the experiments and simulation, PRFNN outperformed fuzzy neural networks in motion control, while bi-criteria optimization outperformed 2-norm optimization in FTC, which demonstrates that the FTC controller can improve computational efficiency, reduce control errors, and implement fault tolerable thrust allocation.

  2. Design and Verification of Fault-Tolerant Components

    DEFF Research Database (Denmark)

    Zhang, Miaomiao; Liu, Zhiming; Ravn, Anders Peter

    2009-01-01

    We present a systematic approach to design and verification of fault-tolerant components with real-time properties as found in embedded systems. A state machine model of the correct component is augmented with internal transitions that represent hypothesized faults. Also, constraints on the occurrence or timing of faults are included in this model. This model of a faulty component is then extended with fault detection and recovery mechanisms, again in the form of state machines. Desired properties of the component are model checked for each of the successive models. The models can be made...

  3. A Benchmark Evaluation of Fault Tolerant Wind Turbine Control Concepts

    DEFF Research Database (Denmark)

    Odgaard, Peter Fogh; Stoustrup, Jakob

    2015-01-01

    As the world’s power supply to a larger and larger degree depends on wind turbines, it is consequently and increasingly important that these are as reliable and available as possible. Modern fault tolerant control (FTC) could play a substantial part in increasing reliability of modern wind turbin... accommodation is handled in software sensor and actuator blocks. This means that the wind turbine controller can continue operation as in the fault free case. The other two evaluated solutions show some potential but probably need improvements before industrial applications...

  4. Concepts and Methods in Fault-tolerant Control

    DEFF Research Database (Denmark)

    Blanke, Mogens; Staroswiecly, M.; Wu, N.E.

    2001-01-01

    in an intelligent way. The aim is to prevent that simple faults develop into serious failure and hence increase plant availability and reduce the risk of safety hazards. Fault-tolerant control merges several disciplines into a common framework to achieve these goals. The desired features are obtained through on......-line fault diagnosis, automatic condition assessment and calculation of appropriate remedial actions to avoid certain consequences of a fault. The envelope of the possible remedial actions is very wide. Sometimes, simple could be achieved by replacing a measurement from a faulty sensor by an estimate. In yet...

  5. Data center networks topologies, architectures and fault-tolerance characteristics

    CERN Document Server

    Liu, Yang; Veeraraghavan, Malathi; Lin, Dong; Hamdi, Mounir

    2013-01-01

    This SpringerBrief presents a survey of data center network designs and topologies and compares several properties in order to highlight their advantages and disadvantages. The brief also explores several routing protocols designed for these topologies and compares the basic algorithms to establish connections, the techniques used to gain better performance, and the mechanisms for fault-tolerance. Readers will be equipped to understand how current research on data center networks enables the design of future architectures that can improve performance and dependability of data centers. This con

  6. Prognostics-Enhanced Fault-Tolerant Control with an Application to a Hovercraft, Phase II

    Data.gov (United States)

    National Aeronautics and Space Administration — Fault-Tolerant Control (FTC) is an emerging area of engineering and scientific research that integrates prognostics, health management concepts and intelligent...

  7. Implementations of a four-level mechanical architecture for fault-tolerant robots

    International Nuclear Information System (INIS)

    Hooper, Richard; Sreevijayan, Dev; Tesar, Delbert; Geisinger, Joseph; Kapoor, Chelan

    1996-01-01

    This paper describes a fault tolerant mechanical architecture with four levels devised and implemented in concert with NASA (Tesar, D. and Sreevijayan, D., Four-level fault tolerance in manipulator design for space operations. In First Int. Symp. Measurement and Control in Robotics (ISMCR '90), Houston, Texas, 20-22 June 1990.) Subsequent work has clarified and revised the architecture. The four levels proceed from fault tolerance at the actuator level, to fault tolerance via in-parallel chains, to fault tolerance using serial kinematic redundancy, and finally to the fault tolerance multiple arm systems provide. This is a subsumptive architecture because each successive layer can incorporate the fault tolerance provided by all layers beneath. For instance a serially-redundant robot can incorporate dual fault-tolerant actuators. Redundant systems provide the fault tolerance, but the guiding principle of this architecture is that functional redundancies actively increase the performance of the system. Redundancies do not simply remain dormant until needed. This paper includes specific examples of hardware and/or software implementation at all four levels

  8. Novel neural networks-based fault tolerant control scheme with fault alarm.

    Science.gov (United States)

    Shen, Qikun; Jiang, Bin; Shi, Peng; Lim, Cheng-Chew

    2014-11-01

    In this paper, the problem of adaptive active fault-tolerant control for a class of nonlinear systems with unknown actuator fault is investigated. The actuator fault is assumed to have no traditional affine appearance of the system state variables and control input. The useful property of the basis function of the radial basis function neural network (NN), which will be used in the design of the fault-tolerant controller, is explored. Based on the analysis of the design of normal and passive fault-tolerant controllers, and by using the implicit function theorem, a novel NN-based active fault-tolerant control scheme with fault alarm is proposed. Compared with results in the literature, the fault-tolerant control scheme can minimize the time delay between fault occurrence and accommodation, which is called the time delay due to fault diagnosis, and reduce the adverse effect on system performance. In addition, the FTC scheme has the advantages of a passive fault-tolerant control scheme as well as the properties of the traditional active fault-tolerant control scheme. Furthermore, the fault-tolerant control scheme requires no additional fault detection and isolation model, which is necessary in the traditional active fault-tolerant control scheme. Finally, simulation results are presented to demonstrate the efficiency of the developed techniques.

  9. Fault-tolerant dead reckoning system for a modular vehicle

    Science.gov (United States)

    Hashimoto, Masafumi; Oba, Fuminori; Takahashi, Kazuhiko

    2005-12-01

    A fault-tolerant dead reckoning system is presented for a modular vehicle, which consists of one chassis unit and several wheel units. The units locally estimate the vehicle position based on their own internal sensors. The local estimates are exchanged among the units via an inter-communication system, and they are fused in a decentralized manner. The units can then determine the vehicle position accurately. The decentralized dead reckoning algorithm is formulated based on the information filter and the covariance intersection method. To enhance the reliability of the dead reckoning, a multi-model based fault detection and diagnosis (FDD) of the internal sensors is incorporated into the dead reckoning system. The units diagnose their sensors with the FDD system, and they apply only the normal sensors for the vehicle localization. In this paper two fault modes of the sensors (hard fault and noise fault) are handled: in the hard fault mode the sensor output is stuck at a constant value, and in the noise fault mode it is disturbed by large noise. The FDD algorithm is based on the variable structure interacting multiple-model estimator. The fault-tolerant dead reckoning algorithm was implemented on our indoor test vehicle, which consists of one chassis unit and four wheel units. Experimental results show that our dead reckoning provided better localization accuracy than the conventional one (i.e., dead reckoning without the sensor FDD system), even though the sensors partially failed.
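
    The abstract above names covariance intersection as the fusion method but gives no formulas. The following is a minimal sketch of the textbook covariance intersection rule for two 2-D position estimates, not the authors' implementation. The function names, the trace-minimizing criterion, and the grid search over the weight `w` are all assumptions made for illustration.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def inv2(m: Matrix) -> Matrix:
    """Invert a 2x2 matrix (hypothetical helper, pure Python)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0.0:
        raise ValueError("singular matrix")
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a 2x2 matrix by a 2-vector."""
    return [m[i][0] * v[0] + m[i][1] * v[1] for i in range(2)]


def covariance_intersection(
    xa: Vector, Pa: Matrix, xb: Vector, Pb: Matrix, steps: int = 100
) -> Tuple[Vector, Matrix, float]:
    """Fuse two estimates whose cross-correlation is unknown.

    P^-1   = w * Pa^-1      + (1 - w) * Pb^-1
    P^-1 x = w * Pa^-1 * xa + (1 - w) * Pb^-1 * xb
    Assumption: w in [0, 1] is picked by grid search to minimize trace(P).
    """
    Ia, Ib = inv2(Pa), inv2(Pb)
    best_trace, best_w, best_P = None, 0.0, Pa
    for k in range(steps + 1):
        w = k / steps
        info = [[w * Ia[i][j] + (1 - w) * Ib[i][j] for j in range(2)]
                for i in range(2)]
        P = inv2(info)
        trace = P[0][0] + P[1][1]
        if best_trace is None or trace < best_trace:
            best_trace, best_w, best_P = trace, w, P
    ya, yb = matvec(Ia, xa), matvec(Ib, xb)
    y = [best_w * ya[i] + (1 - best_w) * yb[i] for i in range(2)]
    return matvec(best_P, y), best_P, best_w
```

    In this sketch, a unit that is confident along one axis and a unit that is confident along the other are weighted equally, and the fused covariance is never tighter than what either estimate can justify without knowing their correlation.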

  10. Vertical Load Distribution for Cloud Computing via Multiple Implementation Options

    Science.gov (United States)

    Phan, Thomas; Li, Wen-Syan

    Cloud computing looks to deliver software as a provisioned service to end users, but the underlying infrastructure must be sufficiently scalable and robust. In our work, we focus on large-scale enterprise cloud systems and examine how enterprises may use a service-oriented architecture (SOA) to provide a streamlined interface to their business processes. To scale up the business processes, each SOA tier usually deploys multiple servers for load distribution and fault tolerance, a scenario which we term horizontal load distribution. One limitation of this approach is that load cannot be distributed further when all servers in the same tier are loaded. In complex multi-tiered SOA systems, a single business process may actually be implemented by multiple different computation pathways among the tiers, each with different components, in order to provide resilience and scalability. Such multiple implementation options give opportunities for vertical load distribution across tiers. In this chapter, we look at a novel request routing framework for SOA-based enterprise computing with multiple implementation options that takes into account the options of both horizontal and vertical load distribution.

  11. HEP@Home - A distributed computing system based on BOINC

    CERN Document Server

    Amorim, A; Andrade, P; Amorim, Antonio; Villate, Jaime; Andrade, Pedro

    2005-01-01

    Project SETI@HOME has proven to be one of the biggest successes of distributed computing in recent years. With a quite simple approach, SETI manages to process large volumes of data using a vast amount of distributed computer power. To extend the generic usage of this kind of distributed computing tool, BOINC is being developed. In this paper we propose HEP@HOME, a BOINC version tailored to the specific requirements of the High Energy Physics (HEP) community. HEP@HOME will be able to process large amounts of data using virtually unlimited computing power, as BOINC does, and it should be able to work according to HEP specifications. In HEP the amounts of data to be analyzed or reconstructed are of central importance. Therefore, one of the design principles of this tool is to avoid data transfer. This will allow scientists to run their analysis applications and take advantage of a large number of CPUs. This tool also satisfies other important requirements in HEP, namely, security, fault-tolerance an...

  12. An Evaluation of Fault Tolerant Wind Turbine Control Schemes applied to a Benchmark Model

    DEFF Research Database (Denmark)

    Odgaard, Peter Fogh; Stoustrup, Jakob

    2014-01-01

    an international competition on wind turbine fault tolerant control has been proposed. In this article the top three solutions from this wind fault tolerant control competition are introduced and evaluated. The evaluation presented in this paper shows that the winner of the competition performs very well...

  13. Evaluation of digital fault-tolerant architectures for nuclear power plant control systems

    International Nuclear Information System (INIS)

    Battle, R.E.

    1990-01-01

    Four fault tolerant architectures were evaluated for their potential reliability in service as control systems of nuclear power plants. The reliability analyses showed that human- and software-related common cause failures and single points of failure in the output modules are dominant contributors to system unreliability. The four architectures are triple-modular-redundant (TMR), both synchronous and asynchronous, and also dual synchronous and asynchronous. The evaluation includes a review of design features, an analysis of the importance of coverage, and reliability analyses of fault tolerant systems. An advantage of fault-tolerant controllers over those not fault tolerant, is that fault-tolerant controllers continue to function after the occurrence of most single hardware faults. However, most fault-tolerant controllers have single hardware components that will cause system failure, almost all controllers have single points of failure in software, and all are subject to common cause failures. Reliability analyses based on data from several industries that have fault-tolerant controllers were used to estimate the mean-time-between-failures of fault-tolerant controllers and to predict those failures modes that may be important in nuclear power plants. 7 refs., 4 tabs
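
    The triple-modular-redundant (TMR) arrangement mentioned above can be illustrated with a majority voter. This is a minimal sketch for discrete channel outputs only; the function name `tmr_vote` and the returned tuple are hypothetical and are not taken from the report.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence, Tuple


def tmr_vote(outputs: Sequence[Hashable]) -> Tuple[Optional[Hashable], bool]:
    """Majority-vote three redundant channel outputs.

    Returns (voted_value, unanimous). voted_value is None when all three
    channels disagree, i.e. the single-fault assumption has been violated.
    """
    if len(outputs) != 3:
        raise ValueError("TMR voting expects exactly three channel outputs")
    value, count = Counter(outputs).most_common(1)[0]
    if count >= 2:
        # One disagreeing channel is masked; flag it via unanimous=False.
        return value, count == 3
    return None, False
```

    Note that the voter in this sketch is itself a single component that every output passes through, which mirrors the abstract's point that output modules and software remain single points of failure even in redundant architectures.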

  14. Logical Specification and Analysis of Fault Tolerant Systems through Partial Model Checking

    NARCIS (Netherlands)

    Gnesi, S.; Etalle, Sandro; Mukhopadhyay, S.; Lenzini, Gabriele; Lenzini, G.; Martinelli, F.; Roychoudhury, A.

    2003-01-01

    This paper presents a framework for a logical characterisation of fault tolerance and its formal analysis based on partial model checking techniques. The framework requires a fault tolerant system to be modelled using a formal calculus, here the CCS process algebra. To this aim we propose a uniform

  15. Online Reconfigurable Self-Timed Links for Fault Tolerant NoC

    Directory of Open Access Journals (Sweden)

    Teijo Lehtonen

    2007-01-01

    of the links. The fault tolerance properties are analyzed using a fault model containing temporary, intermittent, and permanent faults that occur both as bursts and as single faults. The results show a considerable enhancement in the fault tolerance at the cost of performance and area, and with only a slight increase in power consumption.

  16. Diagnosis and Fault-tolerant Control for Ship Station Keeping

    DEFF Research Database (Denmark)

    Blanke, Mogens

    2005-01-01

    This paper addresses the design process of diagnosis and fault-tolerant control when a system should operate despite multiple failures in sensors or actuators. Graph-theory based analysis of systems structure is demonstrated to be a unique design methodology that can cope with the diagnosis...... design for systems of high complexity, and also analyse the cases of cascaded or multiple faults. The paper takes as example a ship with two CP propellers, rudders and a bow thruster as actuators, and instrumentation with a suite of global position sensors, inertial navigation units and conventional gyro...... units to provide ship motion information. A salient feature of the design method is the ability to analyse cases where faults have occurred and easily determine where in the faulty system diagnosability and controllability are retained....

  17. Fault-tolerant and Diagnostic Methods for Navigation

    DEFF Research Database (Denmark)

    Blanke, Mogens

    2003-01-01

    Precise and reliable navigation is crucial, and for reasons of safety, essential navigation instruments are often duplicated. Hardware redundancy is mostly used to manually switch between instruments should faults occur. In contrast, diagnostic methods are available that can use analytic redundancy...... to diagnose faults and autonomously provide valid navigation data, disregarding any faulty sensor data and use sensor fusion to obtain a best estimate for users. This paper discusses how diagnostic and fault-tolerant methods are applicable in marine systems. An example chosen is sensor fusion for navigation....... Diagnosis design is based on parity relations and statistical hypothesis tests. Sensor fusion on healthy signals is made using a Kalman filter with inverse covariance updating to deal with asynchronous or missing data from instruments. The paper is presented at a tutorial level....
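
    As a rough illustration of the parity-relation idea mentioned above, the sketch below compares three redundant sensors pairwise and uses the pattern of violated relations to isolate a single faulty sensor. It is a simplification under stated assumptions: the fixed `threshold` stands in for the statistical hypothesis tests, a plain average stands in for the Kalman-filter sensor fusion, and all names are hypothetical.

```python
from typing import Optional, Sequence, Tuple

# Fault signatures: which pairwise parity relations (a-b, a-c, b-c) are
# violated when exactly one sensor is faulty.
SIGNATURES = {
    (True, True, False): 0,   # sensor a disagrees with both others
    (True, False, True): 1,   # sensor b disagrees with both others
    (False, True, True): 2,   # sensor c disagrees with both others
}


def parity_pattern(readings: Sequence[float], threshold: float) -> Tuple[bool, bool, bool]:
    """Evaluate the three pairwise parity relations against a threshold."""
    a, b, c = readings
    return (abs(a - b) > threshold, abs(a - c) > threshold, abs(b - c) > threshold)


def fused_estimate(readings: Sequence[float], threshold: float) -> Tuple[float, Optional[int]]:
    """Return (estimate, faulty_index) using only sensors judged healthy."""
    pattern = parity_pattern(readings, threshold)
    faulty = SIGNATURES.get(pattern)
    if faulty is None and any(pattern):
        # Residuals fired but match no single-fault signature.
        raise ValueError("ambiguous residual pattern; cannot isolate fault")
    healthy = [r for i, r in enumerate(readings) if i != faulty]
    return sum(healthy) / len(healthy), faulty
```

    For example, `fused_estimate((10.0, 10.1, 14.0), 0.5)` discards the third reading and averages the remaining two.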

  18. Fault-Tolerant Onboard Monitoring and Decision Support Systems

    DEFF Research Database (Denmark)

    Lajic, Zoran

    The purpose of this research project is to improve current onboard decision support systems. Special focus is on the onboard prediction of the instantaneous sea state. In this project a new approach to increasing the overall reliability of a monitoring and decision support system has been...... advice regarding speed and course changes to decrease the wave-induced loads. The SeaSense system is based on the combined use of a mathematical model and measurements from a set of sensors. The overall dependability of a shipboard monitoring and decision support system such as the SeaSense system can...... of a fault. A supervisory function determines the severity of the fault once its origin has been isolated and its magnitude estimated. Fault-tolerant Sensor Fusion means that the monitoring and decision support system can accommodate faults so that the overall system continues to satisfy its goal...

  19. A Bypass-Ring Scheme for a Fault Tolerant Multicast

    Directory of Open Access Journals (Sweden)

    V. Dynda

    2003-01-01

    We present a fault tolerant scheme for recovery from single or multiple node failures in multi-directional multicast trees. The scheme is based on cyclic structures providing alternative paths to eliminate faulty nodes and reroute the traffic. Our scheme is independent of message source and direction in the tree, provides a basis for on-the-fly repair and can be used as a platform for various strategies for reconnecting tree partitions. It only requires an underlying infrastructure to provide a reliable routing service. Although it is described in the context of a message multicast, the scheme can be used universally in all systems using tree-based overlay networks for communication among components.

  20. Fault tolerant vector control of induction motor drive

    International Nuclear Information System (INIS)

    Odnokopylov, G; Bragin, A

    2014-01-01

    For electric drives used in technical facilities of hazardous industries, such as nuclear, military, and chemical, an urgent task is to increase their resiliency and survivability. The paper presents the construction principle of a fault-tolerant vector control system for an asynchronous electric drive. It shows how the operability of a three-phase induction motor drive is restored in emergency mode using a two-phase vector control system. The formation of a simulation model of the unbalanced asynchronous electric drive in emergency mode is described. The modeling uses a coordinate transformation that supports emergency operation of the unbalanced drive. Results of modeling the transient caused by the loss of a stator phase are given. After the loss of a supply phase, an induction motor cannot by itself maintain a circular rotating field in the air gap and restore its operability at rated torque and speed

  1. Fault tolerant strategies for automated operation of nuclear reactors

    International Nuclear Information System (INIS)

    Berkan, R.C.; Tsoukalas, L.

    1991-01-01

    This paper introduces an automatic control system incorporating a number of verification, validation, and command generation tasks within a fault-tolerant architecture. The integrated system utilizes recent methods of artificial intelligence such as neural networks and fuzzy logic control. Furthermore, advanced signal processing and nonlinear control methods are also included in the design. The primary goal is to create an on-line capability to validate signals, analyze plant performance, and verify the consistency of commands before control decisions are finalized. The application of this approach to the automated startup of the Experimental Breeder Reactor-II (EBR-II) is performed using a validated nonlinear model. The simulation results show that the advanced concepts have the potential to improve plant availability and safety

  2. Fault-tolerant control of heavy-haul trains

    Science.gov (United States)

    Zhuan, Xiangtao; Xia, Xiaohua

    2010-06-01

    The fault-tolerant control (FTC) of heavy-haul trains is discussed on the basis of the speed regulation proposed in previous works. The fault modes of trains are assumed and the corresponding fault detection and isolation (FDI) are studied. The FDI of sensor faults is based on a geometric approach for residual generators. The FDI of a braking system is based on the observation of the steady-state speed. From the difference of the steady-state speeds between the fault system and the faultless system, one can get fault information. Simulation tests were conducted on the suitability of the FDIs and the redesigned speed regulators. It is shown that the proposed FTC does not explicitly worsen the performance of the speed regulator in the case of a faultless system, while it obviously improves the performance of the speed regulator in the case of a faulty system.

  3. A fault tolerant superheat control strategy for supermarket refrigeration systems

    DEFF Research Database (Denmark)

    Vinther, Kasper; Izadi-Zamanabadi, Roozbeh; Rasmussen, Henrik

    2013-01-01

    In this paper, a fault tolerant control (FTC) strategy is proposed for evaporator superheat control in supermarket refrigeration systems. Conventional control uses a pressure and temperature sensor for this purpose, however, the pressure sensor can fail to function. A contingency control strategy......, based on a maximum slope-seeking control method and only a single temperature sensor, is developed to drive the evaporator outlet temperature to a level that gives a suitable superheat of the refrigerant. The FTC strategy requires no a priori system knowledge or additional hardware and functions...... in a plug & play fashion. The strategy is outlined by means of procedural steps as well as a flow chart that also illustrates the process of automatic tuning of the maximum slope-seeking controller. Test results are furthermore presented for a display case in a full scale CO2 supermarket refrigeration...

  4. Fault tolerant control based on active fault diagnosis

    DEFF Research Database (Denmark)

    Niemann, Hans Henrik

    2005-01-01

    An active fault diagnosis (AFD) method will be considered in this paper in connection with a Fault Tolerant Control (FTC) architecture based on the YJBK parameterization of all stabilizing controllers. The architecture consists of a fault diagnosis (FD) part and a controller reconfiguration (CR......) part. The FTC architecture can be applied for additive faults, parametric faults, and for system structural changes. Only parametric faults will be considered in this paper. The main focus in this paper is on the use of the new approach of active fault diagnosis in connection with FTC. The active fault...... diagnosis approach is based on including an auxiliary input in the system. A fault signature matrix is introduced in connection with AFD, given as the transfer function from the auxiliary input to the residual output. This can be considered as a generalization of the passive fault diagnosis case, where...

  5. Real-time fault tolerant full adder design for critical applications

    Directory of Open Access Journals (Sweden)

    Pankaj Kumar

    2016-09-01

    In complex computing systems, processing units deal with devices of smaller size, which are sensitive to transient faults. A transient fault in a circuit is caused by electromagnetic noise, cosmic rays, crosstalk, or power supply noise. It is very difficult to detect these faults during offline testing. Hence an area-efficient fault-tolerant full adder is proposed for testing and repairing transient and permanent faults occurring in single and multiple nets. This design incurs much lower hardware overhead relative to the traditional hardware architecture. In addition, the proposed design provides higher error detection and correction efficiency than existing designs.

  6. Fault diagnosis and fault-tolerant control based on adaptive control approach

    CERN Document Server

    Shen, Qikun; Shi, Peng

    2017-01-01

    This book presents recent theoretical developments in and practical applications of fault diagnosis and fault tolerant control for complex dynamical systems, including uncertain systems and linear and nonlinear systems. Combining adaptive control technique with other control methodologies, it investigates the problems of fault diagnosis and fault tolerant control for uncertain dynamic systems with or without time delay. As such, the book provides readers a solid understanding of fault diagnosis and fault tolerant control based on adaptive control technology. Given its depth and breadth, it is well suited for undergraduate and graduate courses on linear system theory, nonlinear system theory, fault diagnosis and fault tolerant control techniques. Further, it can be used as a reference source for academic research on fault diagnosis and fault tolerant control, and for postgraduates in the field of control theory and engineering.

  7. The Design of a Fault-Tolerant COTS-Based Bus Architecture for Space Applications

    Science.gov (United States)

    Chau, Savio N.; Alkalai, Leon; Tai, Ann T.

    2000-01-01

    The high-performance, scalability and miniaturization requirements together with the power, mass and cost constraints mandate the use of commercial-off-the-shelf (COTS) components and standards in the X2000 avionics system architecture for deep-space missions. In this paper, we report our experiences and findings on the design of an IEEE 1394 compliant fault-tolerant COTS-based bus architecture. While the COTS standard IEEE 1394 adequately supports power management, high performance and scalability, its topological criteria impose restrictions on fault tolerance realization. To circumvent the difficulties, we derive a "stack-tree" topology that not only complies with the IEEE 1394 standard but also facilitates fault tolerance realization in a spaceborne system with limited dedicated resource redundancies. Moreover, by exploiting pertinent standard features of the 1394 interface which are not purposely designed for fault tolerance, we devise a comprehensive set of fault detection mechanisms to support the fault-tolerant bus architecture.

  8. The Isis project: Fault-tolerance in large distributed systems

    Science.gov (United States)

    Birman, Kenneth P.; Marzullo, Keith

    1993-01-01

    This final status report covers activities of the Isis project during the first half of 1992. During the report period, the Isis effort has achieved a major milestone in its effort to redesign and reimplement the Isis system using Mach and Chorus as target operating system environments. In addition, we completed a number of publications that address issues raised in our prior work; some of these have recently appeared in print, while others are now being considered for publication in a variety of journals and conferences.

  9. Towards a Fault-Tolerant Architecture for Enterprise Application Integration Solutions

    Science.gov (United States)

    Frantz, Rafael Z.; Corchuelo, Rafael; Molina-Jimenez, Carlos

    Enterprise Application Integration (EAI) solutions rely on process support systems to implement exogenous message workflows whereby one can devise and deploy a process that helps keep a number of applications' data in synchrony or develop new functionality on top of them. EAI solutions are prone to failures due to the fact that they are highly distributed and combine stand-alone applications with specific-purpose integration processes. The literature provides two execution models for workflows, namely, synchronous and asynchronous. In this paper, we report on an architecture that addresses the problem of endowing the asynchronous model with fault-tolerance capabilities, which is a problem for which the literature does not provide a conclusion.

  10. A Fault Tolerant Self-Routing Computer Network Topology

    Science.gov (United States)

    1987-01-01

    Delay Characteristics’, IEEE Transactions on Communications, 153 pp. 1400-1416, December 1975. 29. R. M. Metcalfe and D.R. Boggs, " Ethernet ...loop, the local switch, the metro or interoffice facility, the tandem switch and the intercity facility. In addition to these, signaling is the "glue...Evolution 3. Metro /Interoffice Evolution 4. Tandem Switching Evolution 5. Intercity Facilities Evolution 6. Signaling Network Evolution 7. Interworklng of

  11. Intelligent distributed computing

    CERN Document Server

    Thampi, Sabu

    2015-01-01

    This book contains a selection of refereed and revised papers of the Intelligent Distributed Computing Track originally presented at the third International Symposium on Intelligent Informatics (ISI-2014), September 24-27, 2014, Delhi, India.  The papers selected for this Track cover several Distributed Computing and related topics including Peer-to-Peer Networks, Cloud Computing, Mobile Clouds, Wireless Sensor Networks, and their applications.

  12. A Novel Dual Separate Paths (DSP) Algorithm Providing Fault-Tolerant Communication for Wireless Sensor Networks.

    Science.gov (United States)

    Tien, Nguyen Xuan; Kim, Semog; Rhee, Jong Myung; Park, Sang Yoon

    2017-07-25

    Fault tolerance has long been a major concern for sensor communications in fault-tolerant cyber physical systems (CPSs). Network failure problems often occur in wireless sensor networks (WSNs) due to various factors such as the insufficient power of sensor nodes, the dislocation of sensor nodes, the unstable state of wireless links, and unpredictable environmental interference. Fault tolerance is thus one of the key requirements for data communications in WSN applications. This paper proposes a novel path redundancy-based algorithm, called dual separate paths (DSP), that provides fault-tolerant communication with the improvement of the network traffic performance for WSN applications, such as fault-tolerant CPSs. The proposed DSP algorithm establishes two separate paths between a source and a destination in a network based on the network topology information. These paths are node-disjoint paths and have optimal path distances. Unicast frames are delivered from the source to the destination in the network through the dual paths, providing fault-tolerant communication and reducing redundant unicast traffic for the network. The DSP algorithm can be applied to wired and wireless networks, such as WSNs, to provide seamless fault-tolerant communication for mission-critical and life-critical applications such as fault-tolerant CPSs. The analyzed and simulated results show that the DSP-based approach not only provides fault-tolerant communication, but also improves network traffic performance. For the case study in this paper, when the DSP algorithm was applied to high-availability seamless redundancy (HSR) networks, the proposed DSP-based approach reduced the network traffic by 80% to 88% compared with the standard HSR protocol, thus improving network traffic performance.
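
    The abstract states that DSP establishes two node-disjoint paths between a source and a destination but does not describe how they are computed. The sketch below is a simple greedy stand-in, not the DSP algorithm: it finds one shortest path by breadth-first search, blocks that path's interior nodes, and searches again. Unlike the paper's claim of optimal path distances, this heuristic can miss a valid pair in some topologies; a guaranteed method would need something like Suurballe's algorithm or a max-flow formulation. All names are hypothetical.

```python
from collections import deque
from typing import Dict, FrozenSet, Hashable, List, Optional, Sequence, Tuple

Graph = Dict[Hashable, Sequence[Hashable]]


def shortest_path(adj: Graph, src: Hashable, dst: Hashable,
                  blocked: FrozenSet[Hashable] = frozenset(),
                  forbid_edge: Optional[Tuple[Hashable, Hashable]] = None
                  ) -> Optional[List[Hashable]]:
    """Breadth-first shortest path that avoids blocked nodes and one edge."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for v in adj.get(u, ()):
            if (u, v) == forbid_edge or v in prev or v in blocked:
                continue
            prev[v] = u
            queue.append(v)
    return None


def two_node_disjoint_paths(adj: Graph, src: Hashable, dst: Hashable
                            ) -> Optional[Tuple[List[Hashable], List[Hashable]]]:
    """Greedy heuristic: returns two paths sharing only src and dst, or None."""
    first = shortest_path(adj, src, dst)
    if first is None:
        return None
    interior = frozenset(first[1:-1])
    # If the first path is the direct link, do not reuse that link.
    direct = (src, dst) if len(first) == 2 else None
    second = shortest_path(adj, src, dst, blocked=interior, forbid_edge=direct)
    if second is None:
        return None
    return first, second
```

    On a six-node ring, for instance, the two paths returned run in opposite directions around the ring, so the failure of any single intermediate node leaves one path intact.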

  13. Fault detection and fault-tolerant control for nonlinear systems

    CERN Document Server

    Li, Linlin

    2016-01-01

    Linlin Li addresses the analysis and design issues of observer-based FD and FTC for nonlinear systems. The author analyses the existence conditions for the nonlinear observer-based FD systems to gain a deeper insight into the construction of FD systems. Aided by the T-S fuzzy technique, she recommends different design schemes, among them the L_inf/L_2 type of FD systems. The derived FD and FTC approaches are verified by two benchmark processes. Contents: Overview of FD and FTC Technology; Configuration of Nonlinear Observer-Based FD Systems; Design of L2 Nonlinear Observer-Based FD Systems; Design of Weighted Fuzzy Observer-Based FD Systems; FTC Configurations for Nonlinear Systems; Application to Benchmark Processes. Target Groups: Researchers and students in the field of engineering with a focus on fault diagnosis and fault-tolerant control fields. The Author: Dr. Linlin Li completed her dissertation under the supervision of Prof. Steven X. Ding at the Faculty of Engineering, University of Duisburg-Essen, Germany...

  14. Fault-tolerant digital microfluidic biochips compilation and synthesis

    CERN Document Server

    Pop, Paul; Stuart, Elena; Madsen, Jan

    2016-01-01

    This book describes for researchers in the fields of compiler technology, design and test, and electronic design automation the new area of digital microfluidic biochips (DMBs), and thus offers a new application area for their methods. The authors present a routing-based model of operation execution, along with several associated compilation approaches, which progressively relax the assumption that operations execute inside fixed rectangular modules. Since operations can experience transient faults during the execution of a bioassay, the authors show how to use both offline (design time) and online (runtime) recovery strategies. The book also presents methods for the synthesis of fault-tolerant application-specific DMB architectures. · Presents the current models used for the research on compilation and synthesis techniques of DMBs in a tutorial fashion; · Includes a set of “benchmarks”, which are presented in great detail and include the source code of most of the t...

  15. Fault Tolerant Control for Civil Structures Based on LMI Approach

    Directory of Open Access Journals (Sweden)

    Chunxu Qu

    2013-01-01

    A control system may lose its ability to suppress structural vibration because of faults in sensors or actuators. This paper designs a filter to perform fault detection and isolation (FDI) and then reforms the control strategy to achieve fault tolerant control (FTC). The dynamic equation of the structure with an active mass damper (AMD) is first formulated. Then, an estimated system is built to transform the FDI filter design problem into a static gain optimization problem. The gain is designed to minimize the gap between the estimated system and the practical system, and it can be calculated by the linear matrix inequality (LMI) approach. The FDI filter is finally used to isolate the sensor faults and reform the FTC strategy. The efficiency of FDI and FTC is validated by the numerical simulation of a three-story structure with an AMD system, taking sensor faults into consideration. The results show that the proposed FDI filter can detect the sensor faults and that the FTC controller can effectively tolerate the faults and suppress the structural vibration.

  16. ATLAS Distributed Computing Automation

    CERN Document Server

    Schovancova, J; The ATLAS collaboration; Borrego, C; Campana, S; Di Girolamo, A; Elmsheuser, J; Hejbal, J; Kouba, T; Legger, F; Magradze, E; Medrano Llamas, R; Negri, G; Rinaldi, L; Sciacca, G; Serfon, C; Van Der Ster, D C

    2012-01-01

    The ATLAS Experiment benefits from computing resources distributed worldwide at more than 100 WLCG sites. The ATLAS Grid sites provide over 100k CPU job slots, over 100 PB of storage space on disk or tape. Monitoring of status of such a complex infrastructure is essential. The ATLAS Grid infrastructure is monitored 24/7 by two teams of shifters distributed world-wide, by the ATLAS Distributed Computing experts, and by site administrators. In this paper we summarize automation efforts performed within the ATLAS Distributed Computing team in order to reduce manpower costs and improve the reliability of the system. Different aspects of the automation process are described: from the ATLAS Grid site topology provided by the ATLAS Grid Information System, via automatic site testing by the HammerCloud, to automatic exclusion from production or analysis activities.

  17. Fault Tolerant Hardware/Software Architecture for Flight Critical Function

    Science.gov (United States)

    1985-09-01

    after Augusta Ada Byron, Countess Lovelace, often recognized as the first programmer. The first Ada language reference manual was published in 1981... "A Survivable Distributed Computing System for Embedded Application Programs Written in Ada," Ada Letters, Vol. III, No. 3, November/December 1983... TOLERATED?" by T. Anderson; 4 A DEPENDABLE AVIONIC DATA TRANSMISSION by D.R. Powell and J.C. Valadier; 5 MULTI-COMPUTER FAULT TOLERANT SYSTEMS USING ADA by

  18. Analysis and optimization of fault-tolerant embedded systems with hardened processors

    DEFF Research Database (Denmark)

    Izosimov, Viacheslav; Polian, Ilia; Pop, Paul

    2009-01-01

    In this paper we propose an approach to the design optimization of fault-tolerant hard real-time embedded systems, which combines hardware and software fault tolerance techniques. We trade off between selective hardening in hardware and process reexecution in software to provide the required levels of fault tolerance against transient faults with the lowest-possible system costs. We propose a system failure probability (SFP) analysis that connects the hardening level with the maximum number of reexecutions in software. We present design optimization heuristics to select the fault-tolerant architecture and decide process mapping such that the system cost is minimized, deadlines are satisfied, and the reliability requirements are fulfilled.
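
    The link between hardening level and number of reexecutions can be illustrated with a minimal sketch. It is not the SFP analysis of the paper: it assumes that every execution of a process fails independently with a fixed probability p_single (lower on a more hardened processor) and that a process is lost only when its first run and all of its reexecutions fail. All function names below are hypothetical.

```python
from math import prod


def process_failure_probability(p_single: float, reexecutions: int) -> float:
    """Probability that the first run and every reexecution all fail.

    Assumption: executions fail independently with probability p_single.
    """
    return p_single ** (reexecutions + 1)


def system_failure_probability(processes) -> float:
    """processes: iterable of (p_single, reexecutions) pairs.

    Assumption: processes fail independently, and the system fails
    if at least one process exhausts its reexecutions.
    """
    survive = prod(1.0 - process_failure_probability(p, k) for p, k in processes)
    return 1.0 - survive


def min_reexecutions(p_single: float, target: float, limit: int = 50) -> int:
    """Smallest number of reexecutions that meets a per-process target."""
    for k in range(limit + 1):
        if process_failure_probability(p_single, k) <= target:
            return k
    raise ValueError("target not reachable within the reexecution limit")
```

    Under these assumptions, lowering p_single through hardening reduces the number of reexecutions needed for the same target, which is the trade-off the abstract describes between hardware cost and software recovery time.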

  19. Passive Fault Tolerant Control of Piecewise Affine Systems Based on H Infinity Synthesis

    DEFF Research Database (Denmark)

    Gholami, Mehdi; Cocquempot, Vincent; Schiøler, Henrik

    2011-01-01

    In this paper we design a passive fault tolerant controller against actuator faults for discrete-time piecewise affine (PWA) systems. By using dissipativity theory and H∞ analysis, fault tolerant state feedback controller design is expressed as a set of Linear Matrix Inequalities (LMIs). In the current paper, the PWA system switches not only due to the state but also due to the control input. The method is applied on a large scale livestock ventilation model.

  20. Evaluation of digital fault-tolerant architectures for nuclear power plant control systems

    International Nuclear Information System (INIS)

    Battle, R.E.

    1990-01-01

    This paper reports on four fault-tolerant architectures that were evaluated for their potential reliability in service as control systems of nuclear power plants. The reliability analyses showed that human- and software-related common cause failures and single points of failure in the output modules are dominant contributors to system unreliability. The four architectures are triple-modular-redundant, both synchronous and asynchronous, and also dual synchronous and asynchronous. The evaluation includes a review of design features, an analysis of the importance of coverage, and reliability analyses of fault-tolerant systems. Reliability analyses based on data from several industries that have fault-tolerant controllers were used to estimate the mean-time-between-failures of fault-tolerant controllers and to predict those failure modes that may be important in nuclear power plants.
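
    As a rough illustration of triple modular redundancy, the sketch below shows a majority voter and the textbook reliability of a 2-out-of-3 arrangement. It assumes independent, identical channels and a perfect voter. That assumption deliberately leaves out the common cause failures and output-module single points of failure that the abstract identifies as dominant, so it gives an optimistic bound rather than the paper's result. Function names are illustrative.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value reported by a strict majority of channels, else None."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


def tmr_reliability(r: float) -> float:
    """Reliability of 2-out-of-3 voting.

    Assumptions: three independent channels, each with reliability r,
    and a voter that never fails. R = 3r^2 - 2r^3.
    """
    return 3 * r**2 - 2 * r**3
```

    With these assumptions, redundancy helps only when each channel is better than a coin flip: for r below 0.5 the voted system is less reliable than a single channel.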

  1. Fault-Tolerant Relative Navigation System (RNS) for Docking, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration: A method is proposed to develop a sensor fusion process for blending GPS/IMU/EO data for fault tolerant rendezvous and docking of spacecraft. The methodology takes...

  2. Design of a fault-tolerant decision-making system for biomedical applications.

    Science.gov (United States)

    Faust, Oliver; Acharya, U Rajendra; Sputh, Bernhard H C; Tamura, Toshiyo

    2013-01-01

    This paper describes the design of a fault-tolerant classification system for medical applications. The design process follows the systems engineering methodology: in the agreement phase, we make the case for fault tolerance in diagnosis systems for biomedical applications. The argument extends the idea that machine diagnosis systems mimic the functionality of human decision-making, but in many cases they do not achieve the fault tolerance of the human brain. After making the case for fault tolerance, both requirements and specification for the fault-tolerant system are introduced before the implementation is discussed. The system is tested with fault and use cases to build up trust in the implemented system. This structured approach aided in the realisation of the fault-tolerant classification system. During the specification phase, we produced a formal model that enabled us to discuss what fault tolerance, reliability and safety mean for this particular classification system. Furthermore, such a formal basis for discussion is extremely useful during the initial stages of the design, because it helps to avoid big mistakes caused by a lack of overview later on in the project. During the implementation, we practiced component reuse by incorporating a reliable classification block, which was developed during a previous project, into the current design. Using a well-structured approach and practicing component reuse we follow best practice for both research and industry projects, which enabled us to realise the fault-tolerant classification system on time and within budget. This system can serve in a wide range of future health care systems.

  3. An Analysis of Failure Handling in Chameleon, A Framework for Supporting Cost-Effective Fault Tolerant Services

    Science.gov (United States)

    Haakensen, Erik Edward

    1998-01-01

    The desire for low-cost reliable computing is increasing. Most current fault tolerant computing solutions are not very flexible, i.e., they cannot adapt to reliability requirements of newly emerging applications in business, commerce, and manufacturing. It is important that users have a flexible, reliable platform to support both critical and noncritical applications. Chameleon, under development at the Center for Reliable and High-Performance Computing at the University of Illinois, is a software framework for supporting cost-effective adaptable networked fault tolerant service. This thesis details a simulation of fault injection, detection, and recovery in Chameleon. The simulation was written in C++ using the DEPEND simulation library. The results obtained from the simulation included the amount of overhead incurred by the fault detection and recovery mechanisms supported by Chameleon. In addition, information about fault scenarios from which Chameleon cannot recover was gained. The results of the simulation showed that both critical and noncritical applications can be executed in the Chameleon environment with a fairly small amount of overhead. No single point of failure from which Chameleon could not recover was found. Chameleon was also found to be capable of recovering from several multiple failure scenarios.

  4. Parameter Estimation Analysis for Hybrid Adaptive Fault Tolerant Control

    Science.gov (United States)

    Eshak, Peter B.

    Research efforts have increased in recent years toward the development of intelligent fault tolerant control laws, which are capable of helping the pilot to safely maintain aircraft control under post-failure conditions. Researchers at West Virginia University (WVU) have been actively involved in the development of fault tolerant adaptive control laws in all three major categories: direct, indirect, and hybrid. The first implemented design to provide adaptation was a direct adaptive controller, which used artificial neural networks to generate augmentation commands in order to reduce the modeling error. Indirect adaptive laws were implemented in another controller, which utilized online PID to estimate and update the controller parameter. Finally, a new controller design was introduced, which integrated both direct and indirect control laws. This controller is known as the hybrid adaptive controller. This last control design outperformed the two earlier designs, requiring less neural network effort and achieving better tracking quality. The performance of online PID has an important role in the quality of the hybrid controller; therefore, the quality of the estimation is of great importance. Unfortunately, PID is not perfect and the online estimation process has some inherent issues; the online PID estimates are primarily affected by delays and biases. In order to ensure that reliable estimates are passed to the controller, the estimator consumes some time to converge. Moreover, the estimator will often converge to a biased value. This thesis conducts a sensitivity analysis for the estimation issues, delay and bias, and their effect on the tracking quality. In addition, the performance of the hybrid controller as compared to the direct adaptive controller is explored. In order to serve this purpose, a simulation environment in MATLAB/SIMULINK has been created. The simulation environment is customized to provide the user with the flexibility to add different combinations of biases and delays to

  5. Coping with distributed computing

    International Nuclear Information System (INIS)

    Cormell, L.

    1992-09-01

    The rapid increase in the availability of high performance, cost-effective RISC/UNIX workstations has been both a blessing and a curse. The blessing of having extremely powerful computing engines available on the desktop is well-known to many users. The user has tremendous freedom, flexibility, and control of his environment. That freedom can, however, become the curse of distributed computing. The user must become a system manager to some extent and must worry about backups, maintenance, upgrades, etc. Traditionally these activities have been the responsibility of a central computing group. The central computing group, however, may find that it can no longer provide all of the traditional services. With the plethora of workstations now found on so many desktops throughout the entire campus or lab, the central computing group may be swamped by support requests. This talk will address several of these computer support and management issues by providing some examples of the approaches taken at various HEP institutions. In addition, a brief review of commercial directions or products for distributed computing and management will be given.

  6. DIRAC distributed computing services

    International Nuclear Information System (INIS)

    Tsaregorodtsev, A

    2014-01-01

    DIRAC Project provides a general-purpose framework for building distributed computing systems. It is used now in several HEP and astrophysics experiments as well as for user communities in other scientific domains. There is a large interest from smaller user communities to have a simple tool like DIRAC for accessing grid and other types of distributed computing resources. However, small experiments cannot afford to install and maintain dedicated services. Therefore, several grid infrastructure projects are providing DIRAC services for their respective user communities. These services are used for user tutorials as well as to help porting the applications to the grid for a practical day-to-day work. The services are giving access typically to several grid infrastructures as well as to standalone computing clusters accessible by the target user communities. In the paper we will present the experience of running DIRAC services provided by the France-Grilles NGI and other national grid infrastructure projects.

  7. Network-Physics(NP) Bec DIGITAL(#)-VULNERABILITY Versus Fault-Tolerant Analog

    Science.gov (United States)

    Alexander, G. K.; Hathaway, M.; Schmidt, H. E.; Siegel, E.

    2011-03-01

    Siegel[AMS Joint Mtg.(2002)-Abs.973-60-124] digits logarithmic-(Newcomb(1881)-Weyl(1914; 1916)-Benford(1938)-"NeWBe"/"OLDbe")-law algebraic-inversion to ONLY BEQS BEC:Quanta/Bosons= digits: Synthesis reveals EMP-like SEVERE VULNERABILITY of ONLY DIGITAL-networks(VS. FAULT-TOLERANT ANALOG INvulnerability) via Barabasi "Network-Physics" relative-``statics''(VS.dynamics-[Willinger-Alderson-Doyle(Not.AMS(5/09)]-]critique); (so called)"Quantum-computing is simple-arithmetic(sans division/ factorization); algorithmic-complexities: INtractibility/ UNdecidability/ INefficiency/NONcomputability / HARDNESS(so MIScalled) "noise"-induced-phase-transitions(NITS) ACCELERATION: Cook-Levin theorem Reducibility is Renormalization-(Semi)-Group fixed-points; number-Randomness DEFINITION via WHAT? Query(VS. Goldreich[Not.AMS(02)] How? mea culpa)can ONLY be MBCS "hot-plasma" versus digit-clumping NON-random BEC; Modular-arithmetic Congruences= Signal X Noise PRODUCTS = clock-model; NON-Shor[Physica A,341,586(04)] BEC logarithmic-law inversion factorization:Watkins number-thy. U stat.-phys.); P=/=NP TRIVIAL Proof: Euclid!!! [(So Miscalled) computational-complexity J-O obviation via geometry.

  8. Robust fault-tolerant control for a biped robot using a recurrent cerebellar model articulation controller.

    Science.gov (United States)

    Lin, Chih-Min; Chen, Chiu-Hsiung

    2007-02-01

    A design technique of a recurrent cerebellar model articulation controller (RCMAC)-based fault-tolerant control (FTC) system is investigated to rectify the nonlinear faults of a biped robot. The proposed RCMAC-based FTC (RCFTC) scheme contains two components: 1) an online fault estimation module based on an RCMAC is used to provide approximation information for any nonnominal behavior due to the system failure and modeling error of the biped robot; and 2) a controller module consisting of a computed torque controller and a robust FTC is utilized to achieve FTC. In the controller module, the computed torque controller reveals a basic stabilizing controller to stabilize the system, and the robust FTC is utilized to compensate for the effects of the system failure so as to achieve fault accommodation. The adaptive laws of the RCFTC system are rigorously established based on the Lyapunov function, so that the stability of the system can be guaranteed. Finally, two simulation cases of a biped robot are presented to illustrate the effectiveness of the proposed design method. Simulation results show that the RCFTC system can effectively recover the control performance for the system in the presence of the nonlinear faults and modeling uncertainties.

  9. A Framework-Based Approach for Fault-Tolerant Service Robots

    Directory of Open Access Journals (Sweden)

    Heejune Ahn

    2012-11-01

    Recently the component-based approach has become a major trend in intelligent service robot development due to its reusability and productivity. The framework in a component-based system should provide essential services for application components. However, to our knowledge the existing robot frameworks do not yet support a fault tolerance service. Moreover, it is often believed that faults can be handled only at the application level. In this paper, by extending the robot framework with the fault tolerance function, we argue that the framework-based fault tolerance approach is feasible and even has many benefits, including that: 1) the system integrators can build fault tolerance applications from non-fault-aware components; 2) the constraints of the components and the operating environment, which cannot be anticipated easily at the time of component development, can be considered at the time of integration; 3) consistency in system reliability can be obtained in spite of diverse application component sources. In the proposed construction, we build XML rule files defining the rules for probing and determining the fault conditions of each component, contamination cases from a faulty component, and the possible recovery and safety methods. The rule files are established by a system integrator, and the fault manager in the framework controls the fault tolerance process according to the rules. We demonstrate that the fault-tolerant framework can incorporate widely accepted fault tolerance techniques. The effectiveness and real-time performance of the framework-based approach and its techniques are examined by testing an autonomous mobile robot in typical fault scenarios.

  10. A fault-tolerant control architecture for unmanned aerial vehicles

    Science.gov (United States)

    Drozeski, Graham R.

    Research has presented several approaches to achieve varying degrees of fault-tolerance in unmanned aircraft. Approaches in reconfigurable flight control are generally divided into two categories: those which incorporate multiple non-adaptive controllers and switch between them based on the output of a fault detection and identification element, and those that employ a single adaptive controller capable of compensating for a variety of fault modes. Regardless of the approach for reconfigurable flight control, certain fault modes dictate system restructuring in order to prevent a catastrophic failure. System restructuring enables active control of actuation not employed by the nominal system to recover controllability of the aircraft. After system restructuring, continued operation requires the generation of flight paths that adhere to an altered flight envelope. The control architecture developed in this research employs a multi-tiered hierarchy to allow unmanned aircraft to generate and track safe flight paths despite the occurrence of potentially catastrophic faults. The hierarchical architecture increases the level of autonomy of the system by integrating five functionalities with the baseline system: fault detection and identification, active system restructuring, reconfigurable flight control, reconfigurable path planning, and mission adaptation. Fault detection and identification algorithms continually monitor aircraft performance and issue fault declarations. When the severity of a fault exceeds the capability of the baseline flight controller, active system restructuring expands the controllability of the aircraft using unconventional control strategies not exploited by the baseline controller. Each of the reconfigurable flight controllers and the baseline controller employ a proven adaptive neural network control strategy. A reconfigurable path planner employs an adaptive model of the vehicle to re-shape the desired flight path. Generation of the revised

  11. Fault-Tolerant and Elastic Streaming MapReduce with Decentralized Coordination

    Energy Technology Data Exchange (ETDEWEB)

    Kumbhare, Alok [Univ. of Southern California, Los Angeles, CA (United States); Frincu, Marc [Univ. of Southern California, Los Angeles, CA (United States); Simmhan, Yogesh [Indian Inst. of Technology (IIT), Bangalore (India); Prasanna, Viktor K. [Univ. of Southern California, Los Angeles, CA (United States)

    2015-06-29

    The MapReduce programming model, due to its simplicity and scalability, has become an essential tool for processing large data volumes in distributed environments. Recent Stream Processing Systems (SPS) extend this model to provide low-latency analysis of high-velocity continuous data streams. However, integrating MapReduce with streaming poses challenges: first, the runtime variations in data characteristics such as data-rates and key-distribution cause resource overload, which in turn leads to fluctuations in the Quality of Service (QoS); and second, the stateful reducers, whose state depends on the complete tuple history, necessitate efficient fault-recovery mechanisms to maintain the desired QoS in the presence of resource failures. We propose an integrated streaming MapReduce architecture leveraging the concept of consistent hashing to support runtime elasticity along with locality-aware data and state replication to provide efficient load-balancing with low-overhead fault-tolerance and parallel fault-recovery from multiple simultaneous failures. Our evaluation on a private cloud shows up to 2.8x improvement in peak throughput compared to Apache Storm SPS, and a low recovery latency of 700-1500 ms from multiple failures.
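
    The role of consistent hashing in elasticity and recovery can be sketched generically. The class below is a plain consistent-hash ring with virtual nodes, not the architecture evaluated in the paper: the class name, the reducer names, the number of virtual nodes, and the rule of replicating state on the next distinct nodes clockwise are all assumptions made for illustration.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Generic consistent-hash ring (illustrative, not the paper's design)."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes      # virtual nodes per physical node (assumed value)
        self._ring = []           # sorted list of (hash position, node name)
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key: str) -> int:
        digest = hashlib.sha256(key.encode()).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def owners(self, key: str, replicas: int = 1):
        """Walk clockwise from the key and collect distinct nodes."""
        if not self._ring:
            return []
        start = bisect.bisect_right(self._ring, (self._hash(key), ""))
        found = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in found:
                found.append(node)
                if len(found) == replicas:
                    break
        return found
```

    In this sketch, removing a node moves only the keys that node owned, and each of those keys lands on the node that already held its replica, which is why a replica placed along the ring can take over without reshuffling other reducers.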

  12. Fault-diagnosis applications. Model-based condition monitoring. Actuators, drives, machinery, plants, sensors, and fault-tolerant systems

    Energy Technology Data Exchange (ETDEWEB)

    Isermann, Rolf [Technische Univ. Darmstadt (DE). Inst. fuer Automatisierungstechnik (IAT)

    2011-07-01

    Supervision, condition-monitoring, fault detection, fault diagnosis and fault management play an increasing role for technical processes and vehicles in order to improve reliability, availability, maintenance and lifetime. For safety-related processes, fault-tolerant systems with redundancy are required in order to reach comprehensive system integrity. This book is a sequel to the book "Fault-Diagnosis Systems" published in 2006, where the basic methods were described. After a short introduction to fault-detection and fault-diagnosis methods, the book shows how these methods can be applied for a selection of 20 real technical components and processes as examples, such as: electrical drives (DC, AC); electrical actuators; fluidic actuators (hydraulic, pneumatic); centrifugal and reciprocating pumps; pipelines (leak detection); industrial robots; machine tools (main and feed drive, drilling, milling, grinding); and heat exchangers. Realized fault-tolerant systems for electrical drives, actuators and sensors are also presented. The book describes why and how the various signal-model-based and process-model-based methods were applied and which experimental results could be achieved. In several cases a combination of different methods was most successful. The book is dedicated to graduate students of electrical, mechanical, chemical engineering and computer science and to engineers. (orig.)

  13. FPGAs and parallel architectures for aerospace applications soft errors and fault-tolerant design

    CERN Document Server

    Rech, Paolo

    2016-01-01

    This book introduces the concepts of soft errors in FPGAs, as well as the motivation for using commercial, off-the-shelf (COTS) FPGAs in mission-critical and remote applications, such as aerospace.  The authors describe the effects of radiation in FPGAs, present a large set of soft-error mitigation techniques that can be applied in these circuits, as well as methods for qualifying these circuits under radiation.  Coverage includes radiation effects in FPGAs, fault-tolerant techniques for FPGAs, use of COTS FPGAs in aerospace applications, experimental data of FPGAs under radiation, FPGA embedded processors under radiation, and fault injection in FPGAs. Since dedicated parallel processing architectures such as GPUs have become more desirable in aerospace applications due to high computational power, GPU analysis under radiation is also discussed. ·         Discusses features and drawbacks of reconfigurability methods for FPGAs, focused on aerospace applications; ·         Explains how radia...

  14. Fault-tolerant and QoS based Network Layer for Security Management

    Directory of Open Access Journals (Sweden)

    Mohamed Naceur Abdelkrim

    2013-07-01

    Wireless sensor networks have profound effects on many application fields, such as security management, which need immediate, fast and energy-efficient routes. In this paper, we define a fault-tolerant and QoS based network layer for security management of a chemical products warehouse, which can be classified as a real-time and mission critical application. This application generates routine data packets and alert packets caused by unusual events, which require high reliability, short end-to-end delay and a low packet loss rate. After each node computes its hop count and builds its neighbors table in the initialization phase, packets can be routed to the sink. We use the FELGossiping protocol for routine data packets and a node-disjoint multipath routing protocol for alert packets. Furthermore, we utilize the information gathering phase of FELGossiping to update the neighbors table and detect failed nodes, and we adapt to network topology changes by rerunning the initialization phase when chemical units are added to or removed from the warehouse. Analysis shows that the network layer is energy efficient and can meet the QoS constraints of unusual event packets.
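
    The initialization phase described above (hop counts and neighbor tables) can be sketched as a breadth-first search from the sink. The code is only an illustration under simplifying assumptions: links are symmetric and known as an adjacency map, and forwarding picks the neighbor with the lowest hop count. It does not implement FELGossiping or the node-disjoint multipath protocol, and all names are hypothetical.

```python
from collections import deque


def hop_counts(links, sink):
    """Breadth-first search from the sink; returns {node: hops to sink}."""
    hops = {sink: 0}
    queue = deque([sink])
    while queue:
        node = queue.popleft()
        for neighbor in links.get(node, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


def neighbor_tables(links, hops):
    """For each reachable node, map its reachable neighbors to their hop counts."""
    return {
        node: {nb: hops[nb] for nb in links.get(node, ()) if nb in hops}
        for node in hops
    }


def next_hop(table):
    """Neighbor closest to the sink (ties broken by name), or None."""
    return min(table, key=lambda nb: (table[nb], nb)) if table else None


def without_node(links, failed):
    """Topology after a node failure, used to rerun initialization."""
    return {
        node: [nb for nb in nbs if nb != failed]
        for node, nbs in links.items()
        if node != failed
    }
```

    Rerunning hop_counts and neighbor_tables on the output of without_node mimics the abstract's idea of repeating the initialization phase after a topology change.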

  15. Integral Sliding Mode Fault-Tolerant Control for Uncertain Linear Systems Over Networks With Signals Quantization.

    Science.gov (United States)

    Hao, Li-Ying; Park, Ju H; Ye, Dan

    2017-09-01

    In this paper, a new robust fault-tolerant compensation control method for uncertain linear systems over networks is proposed, where only quantized signals are assumed to be available. This approach is based on the integral sliding mode (ISM) method where two kinds of integral sliding surfaces are constructed. One is the continuous-state-dependent surface with the aim of sliding mode stability analysis and the other is the quantization-state-dependent surface, which is used for ISM controller design. A scheme that combines the adaptive ISM controller and quantization parameter adjustment strategy is then proposed. Through utilizing the H∞ control analytical technique, once the system is in the sliding mode, the nature of performing disturbance attenuation and fault tolerance from the initial time can be found without requiring any fault information. Finally, the effectiveness of our proposed ISM control fault-tolerant schemes against quantization errors is demonstrated in the simulation.

  16. Clustering and fault tolerance for target tracking using wireless sensor networks

    International Nuclear Information System (INIS)

    Bhatti, S.; Khanzada, S.; Memon, S.

    2012-01-01

    Over the last few years, the deployment of WSNs (Wireless Sensor Networks) has been fostered in diverse applications. WSNs have great potential for a variety of domains ranging from scientific experiments to commercial applications. Owing to the deployment of WSNs in dynamic and unpredictable environments, they have the potential to cope with a variety of faults. This paper proposes an energy-aware fault-tolerant clustering protocol for target tracking applications, termed the FTTT (Fault Tolerant Target Tracking) protocol. The identification of RNs (Redundant Nodes) makes SN (Sensor Node) fault tolerance plausible, and clustering supports the recovery of sensors supervised by a faulty CH (Cluster Head). The FTTT protocol reduces energy consumption in two steps: first, by identifying RNs in the network; second, by restricting the number of SNs sending data to the CH. Simulations validate the scalability and low power consumption of the FTTT protocol in comparison with the LEACH protocol. (author)

  17. Data-driven design of fault diagnosis and fault-tolerant control systems

    CERN Document Server

    Ding, Steven X

    2014-01-01

    Data-driven Design of Fault Diagnosis and Fault-tolerant Control Systems presents basic statistical process monitoring, fault diagnosis, and control methods, and introduces advanced data-driven schemes for the design of fault diagnosis and fault-tolerant control systems catering to the needs of dynamic industrial processes. With ever increasing demands for reliability, availability and safety in technical processes and assets, process monitoring and fault-tolerance have become important issues surrounding the design of automatic control systems. This text shows the reader how, thanks to the rapid development of information technology, key techniques of data-driven and statistical process monitoring and control can now become widely used in industrial practice to address these issues. To allow for self-contained study and facilitate implementation in real applications, important mathematical and control theoretical knowledge and tools are included in this book. Major schemes are presented in algorithm form and...

  18. Fault tolerant design of a servo manipulator system for hot cell operation

    International Nuclear Information System (INIS)

    Jin, Jae Hyun; Park, Byung Suk; Ahn, Sung Ho; Yoon, Ji Sup; Jung, Jae Hoo

    2003-01-01

    In this paper, fault tolerant mechanisms are presented for a servo manipulator system designed to operate in a hot cell. A hot cell is a sealed and shielded room for handling radioactive materials, and it is dangerous for people to work in it. Remote operations are therefore necessary to handle the radioactive materials in the hot cell. KAERI has developed a servo manipulator system to perform such remote operations. However, since electric components such as servo motors are weakened by radiation, fault tolerant mechanisms have to be considered. For fault tolerance of the servo manipulator system, hardware and software redundancy has been considered. In the case of hardware, radiation-resistant electric components such as cables and connectors have been adopted and the motors driving a transport have been duplicated. In the case of software, a reconfiguration algorithm accommodating one motor's failure has been developed. The algorithm uses redundant axes to recover the end effector's motion in spite of one motor's failure.

  19. Industrial Cost-Benefit Assessment for Fault-tolerant Control Systems

    DEFF Research Database (Denmark)

    Thybo, C.; Blanke, M.

    1998-01-01

    Economic aspects are decisive for industrial acceptance of research concepts including the promising ideas in fault tolerant control. Fault tolerance is the ability of a system to detect, isolate and accommodate a fault, such that simple faults in a sub-system do not develop into failures at a system level. In a design phase for an industrial system, possibilities span from fail safe design where any single point failure is accommodated by hardware, over fault-tolerant design where selected faults are handled without extra hardware, to fault-ignorant design where no extra precaution is taken against failure. The paper describes the assessments needed to find the right path for new industrial designs. The economic decisions in the design phase are discussed: cost of different failures, profits associated with available benefits, investments needed for development and life-time support...

  20. Fast architecture-level synthesis of fault-tolerant flow-based microfluidic biochips

    DEFF Research Database (Denmark)

    Huang, Wei Lun; Gupta, Ankur; Roy, Sudip

    2017-01-01

    in turn results in wastage of expensive reagent fluids. In order to make the chip fault-tolerant, the state-of-the-art technique adopts simulated annealing (SA) based approach to synthesize a fault-tolerant architecture. However, the SA method is time consuming and non-deterministic with over......-simplified model that usually derive sub-optimal results. Thus, we propose a progressive optimization procedure for the synthesis of fault-tolerant flow-based microfluidic biochips. Simulation results demonstrate that proposed method is efficient compared to the state-of-the-art techniques and can provide...... effective solutions in 88% (on average) less CPU time compared to state-of-the-art technique over three benchmark bioprotocols....

  1. Artificial neural networks contribution to the operational security of embedded systems. Artificial neural networks contribution to fault tolerance of on-board functions in space environment

    International Nuclear Information System (INIS)

    Vintenat, Lionel

    1999-01-01

    A good quality often attributed to artificial neural networks is fault tolerance. In general presentation works, this property is almost always introduced as 'natural', i.e. being obtained without any specific precaution during learning. Besides, the space environment is known to be aggressive towards on-board hardware, inducing various abnormal operations. In particular, digital components suffer from the upset phenomenon, i.e. spurious switching of memory flip-flops. These two observations lead to the question: would neural chips constitute an interesting and robust solution for implementing some on-board functions of spacecraft? First, the various aspects of the problem are detailed: artificial neural networks and their fault tolerance, neural chips, the space environment and the resulting failures. Following this presentation, a particular technique for implementing neural chips is selected because of its simplicity, and especially because it requires few memory flip-flops: random pulse streams. An original method for star recognition inside a field-of-view is then proposed for the on-board function 'attitude computation'. This method relies on a winner-takes-all competition network and on a Kohonen self-organized map. A hardware implementation of those two neural models is then proposed using random pulse streams. Thanks to this realization, on the one hand the difficulties related to that particular implementation technique can be highlighted, and on the other hand a first evaluation of its practical fault tolerance can be carried out. (author)

  2. A New Fault-tolerant Switched Reluctance Motor with reliable fault detection capability

    DEFF Research Database (Denmark)

    Lu, Kaiyuan

    2014-01-01

    For reliable fault detection, often, search coils are used in many fault-tolerant drives. The search coils occupy extra slot space. They are normally open-circuited and are not used for torque production. This degrades the motor performance, increases the cost and manufacture complexity. A new...... Fault-Tolerant Switched Reluctance (FTSR) motor is proposed in this paper. A unique feature of this special design is that it allows use of the unexcited phase coils as search coils for fault detection. Therefore this new motor has all the advantages of using search coils for reliable fault detection...

  3. Fault-tolerant Control of Inverter-fed Induction Motor Drives

    DEFF Research Database (Denmark)

    Thybo, C.

    The main purpose of this work was to investigate how fault-tolerant control (FTC) could be included in the control scheme of frequency converter fed induction motor applications. This was approached by identifying the potential failure modes for which fault tolerant control should be applied...... a current sensor fault, by switching to a closed loop scalar controller, was analysed. The main contributions of this work are · An investigation of the potential failure modes of inverter fed induction motor drives. · An extension of the FTC development cycle, to include economical cost-benefit analysis...

  4. The Aircraft Attitude Robust Inversion Fault-tolerant Control Based on Observer

    Directory of Open Access Journals (Sweden)

    Zhou Hong-Cheng

    2014-09-01

    For attitude control systems, a robust fault-tolerant control method based on instruction-filter back-stepping techniques is proposed. First, a mathematical model of the attitude control system is given; on this basis, the attitude control system is considered under modeling errors caused by uncertainty, external disturbances and control surface faults. The fault tolerant control design involves two main units: the auxiliary system design and the controller design using the auxiliary system. Finally, simulation results show that the proposed method can ensure the tracking performance of the flight control system.

  5. Evaporator unit as a benchmark for plug and play and fault tolerant control

    DEFF Research Database (Denmark)

    Izadi-Zamanabadi, Roozbeh; Vinther, Kasper; Mojallali, Hamed

    2012-01-01

    This paper presents a challenging industrial benchmark for implementation of control strategies under realistic working conditions. The developed control strategies should perform in a plug & play manner, i.e. adapt to varying working conditions, optimize their performance, and provide fault...... tolerance. A fault tolerant strategy is needed to deal with a faulty sensor measurement of the evaporation pressure. The design and algorithmic challenges in the control of an evaporator include: unknown model parameters, large parameter variations, varying loads, and external discrete phenomena...... such as compressor switch on/off or abrupt change in compressor speed....

  6. Particle Filter Based Fault-tolerant ROV Navigation using Hydro-acoustic Position and Doppler Velocity Measurements

    DEFF Research Database (Denmark)

    Zhao, Bo; Blanke, Mogens; Skjetne, Roger

    2012-01-01

    This paper presents a fault tolerant navigation system for a remotely operated vehicle (ROV). The navigation system uses hydro-acoustic position reference (HPR) and Doppler velocity log (DVL) measurements to achieve an integrated navigation. The fault tolerant functionality is based on a modified...... the ROV kinematic states, even when sensor failures appear frequently....

  7. Fault-tolerance thresholds for the surface code with fabrication errors

    Science.gov (United States)

    Auger, James M.; Anwar, Hussain; Gimeno-Segovia, Mercedes; Stace, Thomas M.; Browne, Dan E.

    2017-10-01

    The construction of topological error correction codes requires the ability to fabricate a lattice of physical qubits embedded on a manifold with a nontrivial topology such that the quantum information is encoded in the global degrees of freedom (i.e., the topology) of the manifold. However, the manufacturing of large-scale topological devices will undoubtedly suffer from fabrication errors (permanent faulty components such as missing physical qubits or failed entangling gates), introducing permanent defects into the topology of the lattice and hence significantly reducing the distance of the code and the quality of the encoded logical qubits. In this work we investigate how fabrication errors affect the performance of topological codes, using the surface code as the test bed. A known approach to mitigate defective lattices involves the use of primitive swap gates in a long sequence of syndrome extraction circuits. Instead, we show that in the presence of fabrication errors the syndrome can be determined using the supercheck operator approach and the outcome of the defective gauge stabilizer generators without any additional computational overhead or use of swap gates. We report numerical fault-tolerance thresholds in the presence of both qubit fabrication and gate fabrication errors using a circuit-based noise model and the minimum-weight perfect-matching decoder. Our numerical analysis is most applicable to two-dimensional chip-based technologies, but the techniques presented here can be readily extended to other topological architectures. We find that in the presence of 8% qubit fabrication errors, the surface code can still tolerate a computational error rate of up to 0.1%.

  8. Robust MPC for Actuator–Fault Tolerance Using Set–Based Passive Fault Detection and Active Fault Isolation

    Directory of Open Access Journals (Sweden)

    Xu Feng

    2017-03-01

    In this paper, a fault-tolerant control (FTC) scheme is proposed for actuator faults, which is built upon tube-based model predictive control (MPC) as well as set-based fault detection and isolation (FDI). In the class of MPC techniques, tube-based MPC can effectively deal with system constraints and uncertainties with relatively low computational complexity compared with other robust MPC techniques such as min-max MPC. Set-based FDI, generally considering the worst case of uncertainties, can robustly detect and isolate actuator faults. In the proposed FTC scheme, fault detection (FD) is passive by using invariant sets, while fault isolation (FI) is active by means of MPC and tubes. The active FI method proposed in this paper is implemented by making use of the constraint-handling ability of MPC to manipulate the bounds of inputs.
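
    As a minimal illustration of the passive fault detection idea in this abstract, the sketch below checks a scalar residual against a robust invariant interval. The scalar error dynamics e+ = a*e + w with |w| <= w_max, the function names and all numeric values are assumptions made for illustration; the paper itself works with general invariant sets and tube-based MPC, which are not reproduced here.

```python
def invariant_bound(a, w_max):
    """Half-width of the smallest interval that e+ = a*e + w, |w| <= w_max, never leaves."""
    if abs(a) >= 1:
        raise ValueError("error dynamics must be stable (|a| < 1)")
    return w_max / (1 - abs(a))


def simulate_residuals(a, b, u, gain, disturbances):
    """Residual between the real plant (actuator gain `gain`) and the nominal model."""
    e, out = 0.0, []
    for w in disturbances:
        # gain == 1.0 is a healthy actuator; any other value adds a fault term
        e = a * e + b * (gain - 1.0) * u + w
        out.append(e)
    return out


def first_detection(residuals, bound, tol=1e-12):
    """Index of the first residual outside [-bound, bound], or None if none leaves."""
    for k, e in enumerate(residuals):
        if abs(e) > bound + tol:
            return k
    return None
```

    With a healthy actuator the residual stays inside the interval, so no alarm is raised. A residual that leaves the interval cannot be explained by the disturbance bound alone and is therefore flagged as a fault.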

  9. Synthesis of Fault-Tolerant Embedded Systems with Checkpointing and Replication

    DEFF Research Database (Denmark)

    Izosimov, Viacheslav; Pop, Paul; Eles, Petru

    2006-01-01

    We present an approach to the synthesis of fault-tolerant hard real-time systems for safety-critical applications. We use checkpointing with rollback recovery and active replication for tolerating transient faults. Processes are statically scheduled and communications are performed using the time...
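
    Checkpointing with rollback recovery, as named in this abstract, can be sketched in a few lines. The sketch assumes that a process is a list of segments, that a transient fault surfaces as an exception, and that the retry budget is fixed; `TransientFault`, `run_with_rollback` and `max_retries` are hypothetical names, and neither the static scheduling nor the active replication of the paper is modelled.

```python
import copy


class TransientFault(Exception):
    """Hypothetical signal that a segment was hit by a transient fault."""


def run_with_rollback(state, segments, max_retries=2):
    """Run segments in order, re-executing a segment from its checkpoint on a fault."""
    for segment in segments:
        checkpoint = copy.deepcopy(state)  # save the state before the segment starts
        for _ in range(max_retries + 1):
            try:
                # Each attempt works on a fresh copy, so a failed attempt
                # cannot corrupt the checkpoint.
                state = segment(copy.deepcopy(checkpoint))
                break
            except TransientFault:
                continue  # roll back: the next attempt restarts from the checkpoint
        else:
            raise RuntimeError("fault persisted beyond the retry budget")
    return state
```

    Each segment receives a state object and returns the updated state. Any partial changes made by a failed attempt are discarded, because the next attempt starts again from a copy of the checkpoint.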

  10. Implementation of fault tolerant control for modular multilevel converter using EtherCAT communication

    DEFF Research Database (Denmark)

    Burlacu, Paul Dan; Mathe, Laszlo; Rejas, Marcos

    2015-01-01

    The Modular Multilevel Converter (MMC) is a very promising technology these days. It offers fault tolerant capabilities and ensures high efficiency with low output voltage harmonic content, which reduces the required filter size. A disadvantage of the system is that the control becomes more...

  11. Active and passive fault-tolerant LPV control of wind Turbines

    DEFF Research Database (Denmark)

    Sloth, Christoffer; Esbensen, Thomas; Stoustrup, Jakob

    2010-01-01

    This paper addresses the design and comparison of active and passive fault-tolerant linear parameter-varying (LPV) controllers for wind turbines. The considered wind turbine plant model is characterized by parameter variations along the nominal operating trajectory and includes a model of an inci...

  12. Passive Fault-tolerant Control of Discrete-time Piecewise Affine Systems against Actuator Faults

    DEFF Research Database (Denmark)

    Tabatabaeipour, Seyed Mojtaba; Izadi-Zamanabadi, Roozbeh; Bak, Thomas

    2012-01-01

    In this paper, we propose a new method for passive fault-tolerant control of discrete time piecewise affine systems. Actuator faults are considered. A reliable piecewise linear quadratic regulator (LQR) state feedback is designed such that it can tolerate actuator faults. A sufficient condition...

  13. Towards fault-tolerant decision support systems for ship operator guidance

    DEFF Research Database (Denmark)

    Nielsen, Ulrik Dam; Lajic, Zoran; Jensen, Jørgen Juncher

    2012-01-01

    Fault detection and isolation are very important elements in the design of fault-tolerant decision support systems for ship operator guidance. This study outlines remedies that can be applied for fault diagnosis, when the ship responses are assumed to be linear in the wave excitation. A novel num...

  14. Sliding Mode Fault Tolerant Control with Adaptive Diagnosis for Aircraft Engines

    Science.gov (United States)

    Xiao, Lingfei; Du, Yanbin; Hu, Jixiang; Jiang, Bin

    2018-03-01

    In this paper, a novel sliding mode fault tolerant control method is presented for aircraft engine systems with uncertainties and disturbances on the basis of an adaptive diagnostic observer. Taking both sensor faults and actuator faults into account, a general model of aircraft engine control systems subject to uncertainties and disturbances is considered. Then, the corresponding augmented dynamic model is established in order to facilitate the fault diagnosis and fault tolerant controller design. Next, a suitable detection observer is designed to detect the faults effectively. By creating an adaptive diagnostic observer and using a sliding mode strategy, the sliding mode fault tolerant controller is constructed. Robust stabilization is discussed, and it is shown that the closed-loop system can be stabilized robustly. It is also proven that the adaptive diagnostic observer output errors and the fault estimates converge to a set exponentially, and that the convergence rate is greater than some value which can be adjusted by choosing the design parameters properly. The simulation on a twin-shaft aircraft engine verifies the applicability of the proposed fault tolerant control method.

  15. FAULT TOLERANCE FOR TWO WHEEL MOBILE ROBOT USING FSM (FINITE STATE MACHINE)

    Directory of Open Access Journals (Sweden)

    Chan Shi Jing

    2017-02-01

    Fault Tolerance (FT) enables a system to continue operating in the event of failures. FT therefore serves as a backup component or procedure that can immediately take over to minimize any loss of service. FT exists in many forms: software, hardware, or a combination of both. Fault Tolerance is an umbrella term for fault detection, fault isolation, fault identification and fault solving. To better visualize the fault detection and isolation process, a two wheel robot is used in this study to represent the complex system. The aim of this research is to design and construct a Fault Tolerance algorithm intended to speed up the fault isolation procedure; it may also identify multiple faults with the same static fault signature. The Finite State Machine (FSM) model, a wide library of reusable models for fault tolerance, is used in this study to solve faults in the actuator or in the sensor by resetting and adjusting it to the correct position. Using the system sensors or actuators, the technique is able to recognize a fault from its data. This FSM method is capable of avoiding, replacing, resetting and recovering from any possible faults that occur in the system, offering an innovative solution to identify and solve a fault immediately.
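
    The detect, isolate and reset cycle described in this abstract maps naturally onto a small finite state machine. The sketch below is only an illustration: the state names, the event names and the transition table are hypothetical and are not taken from the paper's FSM library.

```python
from enum import Enum, auto


class State(Enum):
    NORMAL = auto()
    FAULT_DETECTED = auto()
    RESETTING = auto()
    FAILED = auto()


# Hypothetical transition table: (current state, event) -> next state
TRANSITIONS = {
    (State.NORMAL, "fault"): State.FAULT_DETECTED,
    (State.FAULT_DETECTED, "isolated"): State.RESETTING,
    (State.RESETTING, "reset_ok"): State.NORMAL,
    (State.RESETTING, "reset_failed"): State.FAILED,
}


def step(state, event):
    """Apply one event; unknown (state, event) pairs leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.NORMAL):
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

    Keeping the transitions in a table makes it easy to add further fault signatures or recovery actions without changing the stepping logic.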

  16. Fault Tolerance for Industrial Actuators in Absence of Accurate Models and Hardware Redundancy

    DEFF Research Database (Denmark)

    Papageorgiou, Dimitrios; Blanke, Mogens; Niemann, Hans Henrik

    2015-01-01

    This paper investigates Fault-Tolerant Control for closed-loop systems where only coarse models are available and there is lack of actuator and sensor redundancies. The problem is approached in the form of a typical servomotor in closed-loop. A linear model is extracted from input/output data to ...

  17. Fault tolerant control for unstable systems: A linear time varying approach

    DEFF Research Database (Denmark)

    Stoustrup, Jakob; Niemann, Hans Henrik

    2004-01-01

    In (passive) fault tolerant control design, the objective is to find a fixed compensator, which will maintain a suitable performance - or at least stability - in the event that a fault should occur. A major theoretical obstacle to obtain this objective, is that even if the system models correspon...

  18. Piloted Simulator Evaluation Results of New Fault-Tolerant Flight Control Algorithm

    NARCIS (Netherlands)

    Lombaerts, T.J.J.; Smaili, M.H.; Stroosma, O.; Chu, Q.P.; Mulder, J.A.; Joosten, D.A.

    2010-01-01

    A high fidelity aircraft simulation model, reconstructed using the Digital Flight Data Recorder (DFDR) of the 1992 Amsterdam Bijlmermeer aircraft accident (Flight 1862), has been used to evaluate a new Fault-Tolerant Flight Control Algorithm in an online piloted evaluation. This paper focuses on the

  19. Study of a Nine-Phase Fault Tolerant Permanent Magnet Starter-Alternator

    OpenAIRE

    RUBA Mircea; SURDU Felicia; SZABÓ Loránd

    2011-01-01

    The paper presents a study on a nine-phase permanent magnet synchronous starter-alternator for automotive applications, analyzing different converter topologies, detailing the simulation programs and discussing the results in different operating conditions, from entire healthy machine to several faulted phases. The comparison between the two converter topologies controlling the multiphase machine highlights the increased fault tolerance, hence the reliability of such starter-alternator structures. Nev...

  20. Fault-tolerant reference generation for model predictive control with active diagnosis of elevator jamming faults

    NARCIS (Netherlands)

    Ferranti, L.; Wan, Y.; Keviczky, T.

    2018-01-01

    This paper focuses on the longitudinal control of an Airbus passenger aircraft in the presence of elevator jamming faults. In particular, in this paper, we address permanent and temporary actuator jamming faults using a novel reconfigurable fault-tolerant predictive control design. Due to their

  1. Architecture Synthesis for Cost-Constrained Fault-Tolerant Flow-based Biochips

    DEFF Research Database (Denmark)

    Eskesen, Morten Chabert; Pop, Paul; Potluri, Seetal

    2016-01-01

    . This increase in fabrication complexity has led to an increase in defect rates during the manufacturing, thereby motivating the need to improve the yield, by designing these biochips such that they are fault tolerant. We propose an approach based on a Greedy Randomized Adaptive Search Procedure (GRASP...

  2. Design and Implementation of a Fault-Tolerant Magnetic Bearing System for MSCMG

    Directory of Open Access Journals (Sweden)

    Enqiong Tang

    2013-01-01

    The magnetically suspended control moment gyros (MSCMGs) are complex systems with multivariable, nonlinear, and strongly gyroscopic coupling. Therefore, their reliability is a key factor in determining whether they can be widely used in spacecraft. Fault-tolerant magnetic bearing systems have been proposed so that the system can operate normally in spite of some faults in the system. However, the conventional magnetic bearing and fault-tolerant control strategies are not suitable for MSCMGs because of the moving-gimbal effects and the requirement on the maximum load capacity after failure. A novel fault-tolerant magnetic bearing system which has low power loss and good robustness in rejecting the moving-gimbal effects is presented in this paper. Moreover, its maximum load capacity is unchanged before and after failure. In addition, compensation filters are designed to improve the bandwidth of the amplifiers so that the nutation stability of the high-speed rotor is not affected by the increase in coil currents. The experimental results show the effectiveness and superiority of the proposed fault-tolerant system.

  3. Reliability modeling of digital component in plant protection system with various fault-tolerant techniques

    International Nuclear Information System (INIS)

    Kim, Bo Gyung; Kang, Hyun Gook; Kim, Hee Eun; Lee, Seung Jun; Seong, Poong Hyun

    2013-01-01

    Highlights: • Integrated fault coverage is introduced to reflect the characteristics of fault-tolerant techniques in the reliability model of the digital protection system in NPPs. • The integrated fault coverage considers the process of fault-tolerant techniques from detection to fail-safe generation. • With integrated fault coverage, the unavailability of a repairable component of the DPS can be estimated. • The newly developed reliability model can reveal the effects of fault-tolerant techniques explicitly for risk analysis. • The reliability model makes it possible to confirm changes of unavailability according to variation of diverse factors. - Abstract: With the improvement of digital technologies, the digital protection system (DPS) incorporates multiple, more sophisticated fault-tolerant techniques (FTTs) in order to increase fault detection and to help the system safely perform the required functions in spite of the possible presence of faults. Fault detection coverage is a vital factor of FTTs in reliability. However, fault detection coverage alone is insufficient to reflect the effects of various FTTs in a reliability model. To reflect the characteristics of FTTs in the reliability model, integrated fault coverage is introduced. The integrated fault coverage considers the process of an FTT from detection to fail-safe generation. A model has been developed to estimate the unavailability of a repairable component of the DPS using the integrated fault coverage. The newly developed model can quantify unavailability under a variety of conditions. Sensitivity studies are performed to ascertain the important variables which affect the integrated fault coverage and unavailability.

  4. Fault Tolerant Software-Defined Radio on Manycore, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — Mobile communications systems require programmable embedded platforms that can handle computationally demanding signal processing codes without the burden of high...

  5. Fault Tolerant Software-Defined Radio on Manycore Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Mobile communications systems require programmable embedded platforms that can handle computationally demanding signal processing codes without the burden of high...

  6. Fault Tolerant Software-Defined Radio on Manycore, Phase II

    Data.gov (United States)

    National Aeronautics and Space Administration — Mobile communications systems require programmable embedded platforms that can handle computationally demanding signal processing codes without the burden of high...

  7. An Efficient Network Coding-Based Fault-Tolerant Mechanism in WBAN for Smart Healthcare Monitoring Systems

    Directory of Open Access Journals (Sweden)

    Yuhuai Peng

    2017-08-01

    As a key technology in smart healthcare monitoring systems, wireless body area networks (WBANs) can pre-embed sensors and sinks on the body surface or inside bodies for collecting different vital sign parameters, such as human Electrocardiograph (ECG), Electroencephalograph (EEG), Electromyogram (EMG), body temperature, blood pressure, blood sugar, blood oxygen, etc. Using real-time online healthcare, patients can be tracked and monitored in normal or emergency conditions at their homes, hospital rooms, and in Intensive Care Units (ICUs). In particular, the reliability and effectiveness of packet transmission is directly related to the timely rescue of critically ill patients with life-threatening injuries. However, traditional fault-tolerant schemes either have the deficiency of underutilised resources or react too slowly to failures. In future healthcare systems, the medical Internet of Things (IoT) for real-time monitoring can integrate sensor networks, cloud computing, and big data techniques to address these problems. It can collect and send patients’ vital parameter signals and safety monitoring information to intelligent terminals and enhance transmission reliability and efficiency. Therefore, this paper presents a design in healthcare monitoring systems for a proactive reliable data transmission mechanism with resilience requirements in a many-to-one stream model. This Network Coding-based Fault-tolerant Mechanism (NCFM) first proposes a greedy grouping algorithm to divide the topology into small logical units; it then constructs a spanning tree based on random linear network coding to generate linearly independent coding combinations. Numerical results indicate that this transmission scheme works better than traditional methods in reducing the probability of packet loss, the resource redundant rate, and average delay, and can increase the effective throughput rate.
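
    The random linear network coding step mentioned in this abstract can be illustrated with a small sketch. It assumes coding over GF(2) (coefficients are bits, addition is XOR) and equal-length packets; the paper's field size, its greedy grouping algorithm and its spanning-tree construction are not reproduced, and the names `encode_one` and `decode` are hypothetical.

```python
import random


def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings (addition over GF(2))."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_one(packets, rng):
    """Return (coeffs, payload): a random nonzero XOR combination of the packets.

    Bit i of the integer `coeffs` is set when packets[i] is part of the combination.
    """
    k = len(packets)
    coeffs = rng.randrange(1, 1 << k)
    payload = bytes(len(packets[0]))
    for i in range(k):
        if (coeffs >> i) & 1:
            payload = xor_bytes(payload, packets[i])
    return coeffs, payload


def decode(coded, k):
    """Gaussian elimination over GF(2); returns the k packets, or None if rank < k."""
    pivots = {}  # pivot bit -> (coeffs, payload), kept in reduced row echelon form
    for coeffs, payload in coded:
        # Remove every known pivot bit from the incoming combination.
        for bit, (pc, pp) in pivots.items():
            if (coeffs >> bit) & 1:
                coeffs ^= pc
                payload = xor_bytes(payload, pp)
        if coeffs == 0:
            continue  # linearly dependent on what was already received
        bit = coeffs.bit_length() - 1
        # Clear the new pivot bit from the rows stored so far.
        for b, (pc, pp) in list(pivots.items()):
            if (pc >> bit) & 1:
                pivots[b] = (pc ^ coeffs, xor_bytes(pp, payload))
        pivots[bit] = (coeffs, payload)
    if len(pivots) < k:
        return None
    return [pivots[i][1] for i in range(k)]
```

    The sink does not need any particular coded packet to arrive: any k linearly independent combinations are enough to recover the k source packets, which is what lets the scheme ride through individual packet losses.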

  8. Open-Phase Fault Tolerance Techniques of Five-Phase Dual-Rotor Permanent Magnet Synchronous Motor

    Directory of Open Access Journals (Sweden)

    Jing Zhao

    2015-11-01

    Multi-phase motors are gaining more attention due to advantages such as good fault tolerance capability and high power density. By applying dual-rotor technology to multi-phase machines, a five-phase dual-rotor permanent magnet synchronous motor (DRPMSM) is studied in this paper to further improve torque density and fault tolerance capability. It has two rotors and two sets of stator windings, and it can adopt a series drive mode or parallel drive mode. The fault-tolerance capability of the five-phase DRPMSM is investigated. All open circuit fault types and the corresponding fault tolerance techniques in different drive modes are analyzed. A fault-tolerance control strategy of injecting currents containing a certain third harmonic component is proposed for the five-phase DRPMSM to ensure performance after faults in the motor or drive circuit. For adjacent double-phase faults in the motor, based on where the additional degrees of freedom are used, two different fault-tolerance current calculation schemes are adopted and the torque results are compared. Decoupling of the inner motor and outer motor is investigated under fault-tolerant conditions in parallel drive mode. The finite element analysis (FEA) results and co-simulation results based on Simulink-Simplorer-Maxwell verify the effectiveness of the techniques.

  9. Reliability Evaluation Methodologies of Fault Tolerant Techniques of Digital I and C Systems in Nuclear Power Plants

    International Nuclear Information System (INIS)

    Kim, Bo Gyung; Kang, Hyun Gook; Seong, Poong Hyun; Lee, Seung Jun

    2011-01-01

    The reactor protection system has been converted from analog to digital; the digital reactor protection system has 4 redundant channels, and each channel has several modules. Because the DPPS uses complex components, various fault tolerant techniques are needed to improve its availability and reliability. Several studies have tried to evaluate the effects of fault tolerant techniques. However, these effects have not yet been properly considered in most fault tree models. The various fault-tolerant techniques used in digital systems in NPPs should be reflected in fault tree analysis to obtain lower system unavailability and a more reliable PSA. When fault-tolerant techniques are modeled in a fault tree, the categorization of modules by the fault tolerant technique that detects their faults, the fault coverage, the detection period and the fault recovery should be considered. Further work will concentrate on various aspects of fault tree modeling. We will identify other important factors and develop a new theory for constructing the fault tree model.

  10. Reconfiguration Schemes for Fault-Tolerant Processor Arrays

    Science.gov (United States)

    1992-10-15

    Computacion UDEM'90, Universidad de Monterrey, Monterrey, Mexico, September 28, 1990. [2] "Systolic Array Design in the Linear Algebra Framework...faster than the subdi- implicating structures for diagnosable systems," vision option. Thus, as a general heuristic, there is no IEEE Trans. Comput., v. C...Evaluator," ACM Trans. Prog. Lang. & Sys., v. 10. [31] H. J. Siegel, J. B. Armstrong, and D. W. Watson, Apr. 1988, pp. 248-266. "Mapping computer- vision

  11. Advanced I&C for Fault-Tolerant Supervisory Control of Small Modular Reactors

    Energy Technology Data Exchange (ETDEWEB)

    Cole, Daniel G. [Univ. of Pittsburgh, PA (United States)

    2018-01-30

    In this research, we have developed a supervisory control approach to enable automated control of SMRs. By design the supervisory control system has a hierarchical, interconnected, adaptive control architecture. A considerable advantage of this architecture is that it allows subsystems to communicate at different/finer granularity, facilitates monitoring of processes at the modular and plant levels, and enables supervisory control. We have investigated the deployment of automation, monitoring, and data collection technologies to enable operation of multiple SMRs. Each unit's controller collects and transfers information from local loops and optimizes that unit’s parameters. Information is passed from each SMR unit controller to the supervisory controller, which supervises the actions of the SMR units and manages plant processes. The information processed at the supervisory level will provide operators with the information needed for reactor, unit, and plant operation. In conjunction with the supervisory effort, we have investigated techniques for fault-tolerant networks, over which information is transmitted between local loops and the supervisory controller to maintain a safe level of operational normalcy in the presence of anomalies. The fault-tolerance of the supervisory control architecture, the network that supports it, and the impact of fault-tolerance on multi-unit SMR plant control have been a second focus of this research. To this end, we have investigated the deployment of advanced automation, monitoring, and data collection and communications technologies to enable operation of multiple SMRs. We have created a fault-tolerant multi-unit SMR supervisory controller that collects and transfers information from local loops, supervises their actions, and adaptively optimizes the controller parameters. The goal of this research has been to develop the methodologies and procedures for fault-tolerant supervisory control of small modular reactors. To achieve

  12. Non-determinism in Byzantine Fault-Tolerant Replication

    OpenAIRE

    Cachin, Christian; Vukolic, Marko; Schubert, Simon

    2016-01-01

    Service replication distributes an application over many processes for tolerating faults, attacks, and misbehavior among a subset of the processes. With the recent interest in blockchain technologies, distributed execution of one logical application has become a prominent topic. The established state-machine replication paradigm inherently requires the application to be deterministic. This paper distinguishes three models for dealing with non-determinism in replicated services, where some pro...

  13. Allocating application to group of consecutive processors in fault-tolerant deadlock-free routing path defined by routers obeying same rules for path selection

    Science.gov (United States)

    Leung, Vitus J [Albuquerque, NM; Phillips, Cynthia A [Albuquerque, NM; Bender, Michael A [East Northport, NY; Bunde, David P [Urbana, IL

    2009-07-21

    In a multiple processor computing apparatus, directional routing restrictions and a logical channel construct permit fault tolerant, deadlock-free routing. Processor allocation can be performed by creating a linear ordering of the processors based on routing rules used for routing communications between the processors. The linear ordering can assume a loop configuration, and bin-packing is applied to this loop configuration. The interconnection of the processors can be conceptualized as a generally rectangular 3-dimensional grid, and the MC allocation algorithm is applied with respect to the 3-dimensional grid.
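
    The idea of packing jobs into runs of consecutive processors along a loop-shaped linear ordering can be sketched as follows. The sketch assumes a simple first-fit rule and a boolean free/busy list indexed by position in the ordering; `first_fit_on_loop` is a hypothetical name, and neither the routing rules that define the ordering nor the MC allocation algorithm of the patent is reproduced.

```python
def first_fit_on_loop(free, size):
    """Find `size` consecutive free processors on a ring, allowing wrap-around.

    `free[i]` is True when the processor at position i of the linear ordering
    is available. Returns the list of positions, or None if no run fits.
    """
    n = len(free)
    if size <= 0 or size > n:
        return None
    for start in range(n):
        positions = [(start + j) % n for j in range(size)]
        if all(free[p] for p in positions):
            return positions
    return None


def allocate(free, size):
    """Allocate a run in place; returns the positions taken, or None on failure."""
    positions = first_fit_on_loop(free, size)
    if positions is not None:
        for p in positions:
            free[p] = False
    return positions
```

    Because the ordering is treated as a loop, a job may be placed across the end of the list, for example on the last two positions and the first one.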

  14. Testability and Fault Tolerance for Emerging Nanoelectronic Memories

    NARCIS (Netherlands)

    Haron, N.Z.B.

    2012-01-01

    Emerging nanoelectronic memories such as Resistive Random Access Memories (RRAMs) are possible candidates to replace the conventional memory technologies such as SRAMs, DRAMs and flash memories in future computer systems. Despite their advantages such as enormous storage capacity, low-power per unit

  15. Fault-Tolerant Topology Selection for TTEthernet Networks

    DEFF Research Database (Denmark)

    Gavrilut, Voica Maria; Tamas-Selicean, Domitian; Pop, Paul

    2015-01-01

    Many safety-critical real-time applications are implemented using distributed architectures, composed of heterogeneous processing elements (PEs) interconnected in a network. In this paper, we are interested in the TTEthernet protocol, which is a deterministic, synchronized and congestion-free net...

  16. Automating the fault tolerance process in Grid Environment

    OpenAIRE

    Maninder Singh; Inderpreet Chopra

    2010-01-01

    Because Grid encourages the dynamic addition of resources, it is unlikely to benefit from manual management techniques, which are time-consuming, insecure and more prone to errors. A new paradigm for self-management is replacing the old manual system to begin the next generation of computing. In this paper we discuss the different self-healing approaches that current grid middleware uses and, after analyzing these, propose a new approach, Selfhealing Manageme...

  17. OConGraX - Automatically Generating Data-Flow Test Cases for Fault-Tolerant Systems

    Science.gov (United States)

    Nunes, Paulo R. F.; Hanazumi, Simone; de Melo, Ana C. V.

    The more complex systems are to develop and manage, the more software design faults increase, making fault-tolerant systems highly required. To ensure their quality, the normal and exceptional behaviors must be tested and/or verified. Software testing is still a difficult and costly software development task, and a reasonable amount of effort has been employed to develop techniques for testing programs' normal behaviors. For the exceptional behavior, however, there is a lack of techniques and tools to test it effectively. To help in testing and analyzing fault-tolerant systems, we present in this paper a tool that provides automatic generation of data-flow test cases for objects and exception-handling mechanisms of Java programs, and of data/control-flow graphs for program analysis.

  18. Active fault tolerant control of piecewise affine systems with reference tracking and input constraints

    DEFF Research Database (Denmark)

    Gholami, M.; Cocquempot, V.; Schiøler, H.

    2014-01-01

    An active fault tolerant control (AFTC) method is proposed for discrete-time piecewise affine (PWA) systems. Only actuator faults are considered. The AFTC framework contains a supervisory scheme, which selects a suitable controller in a set of controllers such that the stability and an acceptable performance of the faulty system are held. The design of the supervisory scheme is not considered here. The set of controllers is composed of a normal controller for the fault-free case, an active fault detection and isolation controller for isolation and identification of the faults, and a set of passive fault tolerant controllers (PFTCs) modules designed to be robust against a set of actuator faults. In this research, the piecewise nonlinear model is approximated by a PWA system. The PFTCs are state feedback laws. Each one is robust against a fixed set of actuator faults and is able to track...

  19. Flow-Based Biochips: Fault-Tolerant Design and Error Recovery

    DEFF Research Database (Denmark)

    Pop, Paul

    2015-01-01

    ... VLSI). Biochips are currently being designed manually using tools such as AutoCAD. Physical defects can be introduced during the fabrication process, which reduces the yield, and may lead to the failure of the biochemical application. Failure is costly because of the need to redo lengthy experiments, using... ...prevent the failure during the operation of the biochip, we advocate the use of fault-tolerant biochip design. The vision is to provide application fault-tolerance at run-time (online), detecting the faults as they appear, and reconfiguring the application. However, in this paper our assumption is that the faults are detected during testing, and that the operation of the biochip is reconfigured offline (at design time) to avoid the faults. We are interested in introducing redundancy such that the applications can still successfully run on a defective biochip. Redundancy is the addition of extra resources...

  20. H infinity Integrated Fault Estimation and Fault Tolerant Control of Discrete-time Piecewise Linear Systems

    DEFF Research Database (Denmark)

    Tabatabaeipour, Seyed Mojtaba; Bak, Thomas

    2012-01-01

    In this paper we consider the problem of fault estimation and accommodation for discrete time piecewise linear systems. A robust fault estimator is designed to estimate the fault such that the estimation error converges to zero and H∞ performance of the fault estimation is minimized. Then, the estimate of fault is used to compensate for the effect of the fault. Hence, using the estimate of fault, a fault tolerant controller using a piecewise linear static output feedback is designed such that it stabilizes the system and provides an upper bound on the H∞ performance of the faulty system. Sufficient conditions for the existence of robust fault estimator and fault tolerant controller are derived in terms of linear matrix inequalities. Upper bounds on the H∞ performance can be minimized by solving convex optimization problems with linear matrix inequality constraints. The efficiency...

  1. Passive fault tolerant control of a double inverted pendulum - a case study

    DEFF Research Database (Denmark)

    Niemann, Hans Henrik; Stoustrup, Jakob

    2005-01-01

    A passive fault tolerant control scheme is suggested, in which a nominal controller is augmented with an additional block, which guarantees stability and performance after the occurrence of a fault. The method is based on the YJBK parameterization, which requires the nominal controller to be implemented in observer based form. The proposed method is applied to a double inverted pendulum system, for which an H_inf controller has been designed and verified in a lab setup. In this case study, the fault is a degradation of the tacho loop.

  2. Reliability of voting in fault-tolerant software systems for small output spaces

    Science.gov (United States)

    Mcallister, David F.; Sun, Chien-En; Vouk, Mladen A.

    1990-01-01

    Under a voting strategy in a fault-tolerant software system there is a difference between correctness and agreement. An independent N-version programming reliability model is proposed for treating small output spaces which distinguishes between correctness and agreement. System reliability is investigated using analytical relationships and simulation. A consensus majority voting strategy is proposed and its performance is analyzed and compared with other voting strategies. A consensus voting strategy automatically adapts the voting to different component reliability and output space cardinality characteristics. It is shown that the absolute majority voting strategy provides a lower bound on the reliability provided by the consensus majority, and the 2-of-n voting strategy an upper bound. If r is the cardinality of the output space, it is proved that 1/r is a lower bound on the average reliability of fault-tolerant system components below which the system reliability begins to deteriorate as more versions are added.
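
    The three voting strategies compared in this abstract can be illustrated with a minimal Python sketch. It is not the authors' model: the function names are hypothetical, and the tie-breaking rule (the first-seen output wins among equally large agreement groups) is a simplifying assumption that the abstract does not specify. The sketch only shows how agreement, rather than correctness, drives each decision.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence, Tuple


def _largest_group(outputs: Sequence[Hashable]) -> Tuple[Hashable, int]:
    """Return (output value, number of versions agreeing on it) for the biggest group.

    Counter.most_common orders equal counts by first occurrence, which is the
    (assumed) tie-breaking rule used throughout this sketch.
    """
    return Counter(outputs).most_common(1)[0]


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Absolute majority: accept an output only if more than half the versions agree."""
    value, count = _largest_group(outputs)
    return value if count > len(outputs) / 2 else None


def two_of_n_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """2-of-n: accept the largest agreement group if at least two versions agree."""
    value, count = _largest_group(outputs)
    return value if count >= 2 else None


def consensus_vote(outputs: Sequence[Hashable]) -> Hashable:
    """Consensus: always return the output with the largest agreement group."""
    value, _ = _largest_group(outputs)
    return value


# Five versions, output space {0, 1, 2, 3}: no absolute majority exists,
# but two versions agree on the value 2.
outputs = [2, 0, 2, 1, 3]
print(majority_vote(outputs), two_of_n_vote(outputs), consensus_vote(outputs))
```

    None of the voters can tell whether the agreed value is correct; with a small output space, faulty versions can agree on the same wrong value by chance, which is the distinction between agreement and correctness that the abstract draws.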

  3. Systematic Fault Tolerant Control Based on Adaptive Thau Observer Estimation for Quadrotor Uavs

    Directory of Open Access Journals (Sweden)

    Cen Zhaohui

    2015-03-01

    A systematic fault tolerant control (FTC) scheme based on fault estimation for a quadrotor actuator, which integrates normal control, active and passive FTC and fault parking, is proposed in this paper. Firstly, an adaptive Thau observer (ATO) is presented to estimate the quadrotor rotor fault magnitudes, and then faults with different magnitudes and time-varying natures are rated into corresponding fault severity levels based on the pre-defined fault-tolerant boundaries. Secondly, a systematic FTC strategy which can coordinate various FTC methods is designed to compensate for failures depending on the fault types and severity levels. Unlike former stand-alone passive FTC or active FTC, our proposed FTC scheme can compensate for faults in a way of condition-based maintenance (CBM), and especially considers the fatal failures that traditional FTC techniques cannot accommodate, to avoid the crashing of UAVs. Finally, various simulations are carried out to show the performance and effectiveness of the proposed method.

  4. Active Disturbance Rejection Approach for Robust Fault-Tolerant Control via Observer Assisted Sliding Mode Control

    Directory of Open Access Journals (Sweden)

    John Cortés-Romero

    2013-01-01

    This work proposes an active disturbance rejection approach for the establishment of a sliding mode control strategy in fault-tolerant operations. The core of the proposed active disturbance rejection assistance is a Generalized Proportional Integral (GPI) observer which is in charge of the active estimation of lumped nonlinear endogenous and exogenous disturbance inputs related to the creation of local sliding regimes with limited control authority. Possibilities are explored for the GPI observer assisted sliding mode control in fault-tolerant schemes. Convincing improvements are presented with respect to classical sliding mode control strategies. As a collateral advantage, the observer-based control architecture offers the possibility of chattering reduction given that a significant part of the control signal is of the continuous type. The case study considers a classical DC motor control affected by actuator faults, parametric failures, and perturbations. Experimental results and comparisons with other established sliding mode controller design methodologies, which validate the proposed approach, are provided.

  5. An evaluation method of fault-tolerance for digital plant protection system in nuclear power plants

    International Nuclear Information System (INIS)

    Lee, Jun Seok; Kim, Man Cheol; Seong, Poong Hyun; Kang, Hyun Gook; Jang, Seung Cheol

    2005-01-01

    In recent years, analog based nuclear power plant (NPP) safety related instrumentation and control (I and C) systems have been replaced by modern digital based I and C systems. NPP safety related I and C systems require very high design reliability compared to conventional digital systems, so reliability assessment is very important. In the reliability assessment of a digital system, fault tolerance evaluation is one of the crucial factors. However, the evaluation is very difficult because the digital system in an NPP is very complex. In this paper, a simulation based fault injection technique on a simplified processor is used to evaluate the fault-tolerance of the digital plant protection system (DPPS) with high efficiency and low cost.

  6. A Middleware Approach to Achieving Fault Tolerance of Kahn Process Networks on Networks on Chips

    Directory of Open Access Journals (Sweden)

    Onur Derin

    2011-01-01

    ... propose a task-aware middleware concept that allows adaptivity in KPN implemented over a Network on Chip (NoC). We also list our ideas on the development of a simulation platform as an initial step towards creating fault tolerance strategies for KPN applications running on NoCs. In doing that, we extend our SACRE (Self-Adaptive Component Run Time Environment) framework by integrating it with an open source NoC simulator, Noxim. We evaluate the overhead that the middleware brings to the total execution time and to the total amount of data transferred in the NoC. With this work, we also provide a methodology that can help in identifying the requirements and implementing fault tolerance and adaptivity support on real platforms.

  7. Fault tolerant control of multivariable processes using auto-tuning PID controller.

    Science.gov (United States)

    Yu, Ding-Li; Chang, T K; Yu, Ding-Wen

    2005-02-01

    Fault tolerant control of dynamic processes is investigated in this paper using an auto-tuning PID controller. A fault tolerant control scheme is proposed comprising an auto-tuning PID controller based on an adaptive neural network model. The model is trained online using the extended Kalman filter (EKF) algorithm to learn system post-fault dynamics. Based on this model, the PID controller adjusts its parameters to compensate for the effects of the faults, so that the control performance is recovered from degradation. The auto-tuning algorithm for the PID controller is derived with the Lyapunov method and, therefore, the model predicted tracking error is guaranteed to converge asymptotically. The method is applied to a simulated two-input two-output continuous stirred tank reactor (CSTR) with various faults, which demonstrates the applicability of the developed scheme to industrial processes.

  8. Guaranteed Cost Fault-Tolerant Control for Networked Control Systems with Sensor Faults

    Directory of Open Access Journals (Sweden)

    Qixin Zhu

    2015-01-01

    Owing to the large scale and complicated structure of networked control systems, time-varying sensor faults can inevitably occur when the system works in a poor environment. A guaranteed cost fault-tolerant controller for the new networked control systems with time-varying sensor faults is designed in this paper. Based on the time delay of the network transmission environment, the networked control systems with sensor faults are modeled as a discrete-time system with uncertain parameters, and the model of the networked control systems is related to the boundary values of the sensor faults. Moreover, using Lyapunov stability theory and the linear matrix inequalities (LMI) approach, the guaranteed cost fault-tolerant controller is verified to render such networked control systems asymptotically stable. Finally, simulations are included to demonstrate the theoretical results.

  9. An Adaptive Fault-Tolerant Communication Scheme for Body Sensor Networks

    Directory of Open Access Journals (Sweden)

    Zichuan Xu

    2010-10-01

    A high degree of reliability for critical data transmission is required in body sensor networks (BSNs). However, BSNs are usually vulnerable to channel impairments due to body fading effect and RF interference, which may potentially cause data transmission to be unreliable. In this paper, an adaptive and flexible fault-tolerant communication scheme for BSNs, namely AFTCS, is proposed. AFTCS adopts a channel bandwidth reservation strategy to provide reliable data transmission when channel impairments occur. In order to fulfill the reliability requirements of critical sensors, fault-tolerant priority and queue are employed to adaptively adjust the channel bandwidth allocation. Simulation results show that AFTCS can alleviate the effect of channel impairments, while yielding lower packet loss rate and latency for critical sensors at runtime.

  10. Fault-Tolerant Approach for Modular Multilevel Converters under Submodule Faults

    DEFF Research Database (Denmark)

    Deng, Fujin; Tian, Yanjun; Zhu, Rongwu

    2016-01-01

    The modular multilevel converter (MMC) is attractive for medium- or high-power applications because of its high modularity, availability, and high power quality. Fault-tolerant operation is one of the important issues for the MMC. This paper proposes a fault-tolerant approach for the MMC under submodule (SM) faults. The characteristic of the MMC with arms containing different numbers of healthy SMs under faults is analyzed. Based on this characteristic, the proposed approach can effectively keep the MMC operating as normal under SM faults. It can effectively improve the MMC performance under SM faults without knowledge of the number of faulty SMs in the arm and without extra demand on communication systems, which potentially increases the reliability. Time-domain simulation studies with PSCAD/EMTDC are conducted and a down-scale MMC prototype is also tested...

  11. Architecture Synthesis for Cost-Constrained Fault-Tolerant Flow-based Biochips

    DEFF Research Database (Denmark)

    Eskesen, Morten Chabert; Pop, Paul; Potluri, Seetal

    2016-01-01

    In this paper, we are interested in the synthesis of fault-tolerant architectures for flow-based microfluidic biochips, which use microvalves and channels to run biochemical applications. The growth rate of device integration in flow-based microfluidic biochips is scaling faster than Moore's law. ... The proposed algorithm has been evaluated using several benchmarks and compared to the results of a Simulated Annealing metaheuristic.

  12. Fault-Tolerant Topology Selection for TTEthernet Networks

    DEFF Research Database (Denmark)

    Gavrilut, Voica Maria; Tamas-Selicean, Domitian; Pop, Paul

    2015-01-01

    Many safety-critical real-time applications are implemented using distributed architectures, composed of heterogeneous processing elements (PEs) interconnected in a network. In this paper, we are interested in the TTEthernet protocol, which is a deterministic, synchronized and congestion-free net... ... We propose a Simulated Annealing meta-heuristic to solve this optimization problem. The proposed approach has been evaluated using a synthetic benchmark and a space case study, based on the Orion Crew Exploration Vehicle.

  13. Fault tolerance improvement for queuing systems under stress load

    International Nuclear Information System (INIS)

    Nikonov, Eh.G.; Florko, A.B.

    2009-01-01

    Various kinds of queuing information systems (exchange auction systems, web servers, SCADA) face unpredictable situations during operation, when the flow of information that must be analyzed and processed rises sharply. Such stress load situations often require human (dispatcher's or administrator's) intervention, which is why the time of the first denial of service is extremely important. A common queuing systems architecture is described. Existing approaches to computing resource management are considered. A new late-first-denial-of-service resource management approach is proposed.

  14. Distributed computing for global health

    CERN Multimedia

    CERN. Geneva; Schwede, Torsten; Moore, Celia; Smith, Thomas E; Williams, Brian; Grey, François

    2005-01-01

    Distributed computing harnesses the power of thousands of computers within organisations or over the Internet. In order to tackle global health problems, several groups of researchers have begun to use this approach to exceed by far the computing power of a single lab. This event illustrates how companies, research institutes and the general public are contributing their computing power to these efforts, and what impact this may have on a range of world health issues. Grids for neglected diseases Vincent Breton, CNRS/EGEE This talk introduces the topic of distributed computing, explaining the similarities and differences between Grid computing, volunteer computing and supercomputing, and outlines the potential of Grid computing for tackling neglected diseases where there is little economic incentive for private R&D efforts. Recent results on malaria drug design using the Grid infrastructure of the EU-funded EGEE project, which is coordinated by CERN and involves 70 partners in Europe, the US and Russi...

  15. Active Fault-Tolerant Control for Wind Turbine with Simultaneous Actuator and Sensor Faults

    Directory of Open Access Journals (Sweden)

    Lei Wang

    2017-01-01

    The purpose of this paper is to present a novel fault-tolerant tracking control (FTC) strategy with robust fault estimation and compensation for simultaneous actuator and sensor faults. Within the framework of fault-tolerant control, developing an FTC design method for wind turbines that can tolerate simultaneous pitch actuator and pitch sensor faults with bounded first time derivatives is a challenge. The paper's key contribution is a descriptor sliding mode method: an auxiliary descriptor state vector, composed of the system state vector, the actuator fault vector, and the sensor fault vector, is introduced to establish a novel augmented descriptor system, for which a descriptor sliding mode observer is designed to estimate the system state and reconstruct the faults. Using an LMI optimization method, stability conditions for the estimation error dynamics are established to facilitate the determination of the design parameters. With this estimate and a suitably designed fault-tolerant controller, the system's stability can be maintained. The effectiveness of the design strategy is verified by implementing the controller in the National Renewable Energy Laboratory's 5-MW nonlinear, high-fidelity wind turbine model (FAST) and simulating it in MATLAB/Simulink.

  16. Adaptive Fault-Tolerant Control of Uncertain Nonlinear Large-Scale Systems With Unknown Dead Zone.

    Science.gov (United States)

    Chen, Mou; Tao, Gang

    2016-08-01

    In this paper, an adaptive neural fault-tolerant control scheme is proposed and analyzed for a class of uncertain nonlinear large-scale systems with unknown dead zone and external disturbances. To tackle the unknown nonlinear interaction functions in the large-scale system, the radial basis function neural network (RBFNN) is employed to approximate them. To further handle the unknown approximation errors and the effects of the unknown dead zone and external disturbances, integrated as the compounded disturbances, the corresponding disturbance observers are developed for their estimations. Based on the outputs of the RBFNN and the disturbance observer, the adaptive neural fault-tolerant control scheme is designed for uncertain nonlinear large-scale systems by using a decentralized backstepping technique. The closed-loop stability of the adaptive control system is rigorously proved via Lyapunov analysis and the satisfactory tracking performance is achieved under the integrated effects of unknown dead zone, actuator fault, and unknown external disturbances. Simulation results of a mass-spring-damper system are given to illustrate the effectiveness of the proposed adaptive neural fault-tolerant control scheme for uncertain nonlinear large-scale systems.

  17. Fault-tolerant control for current sensors of doubly fed induction generators based on an improved fault detection method

    DEFF Research Database (Denmark)

    Li, Hui; Yang, Chao; Hu, Yaogang

    2014-01-01

    Fault-tolerant control of current sensors is studied in this paper to improve the reliability of a doubly fed induction generator (DFIG). A fault-tolerant control system of current sensors is presented for the DFIG, which consists of a new current observer and an improved current sensor fault detection algorithm. The current observer is constructed by using only voltage signals as inputs. The fault detection algorithm is based on the current observer, in which an adaptive threshold and different fault duration times are considered. The performance of the proposed observer, improved fault detection algorithm, and fault-tolerant control system are investigated by simulation. The results indicate that the outputs of the observer and the sensor are highly coherent. The fault detection algorithm can efficiently detect both soft and hard faults in current sensors, and the fault-tolerant control...
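
    The idea of comparing a sensor reading against an observer estimate, with an adaptive threshold and a minimum fault duration, can be sketched as follows. This is an illustrative assumption rather than the algorithm of the paper: the threshold form (a base value plus a term proportional to the estimated current), the parameter names `base`, `gain` and `min_duration`, and the function name are all hypothetical.

```python
from typing import Optional, Sequence


def detect_sensor_fault(
    measured: Sequence[float],
    estimated: Sequence[float],
    base: float = 0.05,
    gain: float = 0.1,
    min_duration: int = 3,
) -> Optional[int]:
    """Return the sample index at which a fault is declared, or None.

    A fault is declared once the residual |measured - estimated| has exceeded
    the adaptive threshold for `min_duration` consecutive samples.
    """
    consecutive = 0
    for k, (y, y_hat) in enumerate(zip(measured, estimated)):
        # Assumed adaptive threshold: grows with the magnitude of the estimate.
        threshold = base + gain * abs(y_hat)
        if abs(y - y_hat) > threshold:
            consecutive += 1
            if consecutive >= min_duration:
                return k
        else:
            # A short glitch is forgotten as soon as the residual recovers.
            consecutive = 0
    return None


# Observer output (hypothetical values) and a sensor with one glitch followed
# by a persistent offset.
estimated = [1.0] * 10
measured = [1.0, 1.0, 1.5, 1.0, 1.5, 1.5, 1.5, 1.0, 1.0, 1.0]
print(detect_sensor_fault(measured, estimated))
```

    Requiring several consecutive violations trades detection delay for fewer false alarms; a shorter `min_duration` would suit abrupt (hard) faults, a longer one slowly developing (soft) faults.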

  18. Fault-tolerant Agreement in Synchronous Message-passing Systems

    CERN Document Server

    Raynal, Michel

    2010-01-01

    The present book focuses on the way to cope with the uncertainty created by process failures (crash, omission failures and Byzantine behavior) in synchronous message-passing systems (i.e., systems whose progress is governed by the passage of time). To that end, the book considers fundamental problems that distributed synchronous processes have to solve. These fundamental problems concern agreement among processes (if processes are unable to agree in one way or another in presence of failures, no non-trivial problem can be solved). They are consensus, interactive consistency, k-set agreement an

  19. Computing Battery Lifetime Distributions

    NARCIS (Netherlands)

    Cloth, L.; Haverkort, Boudewijn R.H.M.; Jongerden, M.R.

    The usage of mobile devices like cell phones, navigation systems, or laptop computers, is limited by the lifetime of the included batteries. This lifetime depends naturally on the rate at which energy is consumed, however, it also depends on the usage pattern of the battery. Continuous drawing of a

  20. Fault-Tolerant, Multiple-Zone Temperature Control

    Science.gov (United States)

    Granger, James; Franklin, Brian; Michalik, Martin; Yates, Phillip; Peterson, Erik; Borders, James

    2008-01-01

    A computer program has been written as an essential part of an electronic temperature control system for a spaceborne instrument that contains several zones. The system was developed because the temperature and the rate of change of temperature in each zone are required to be maintained to within limits that amount to degrees of precision thought to be unattainable by use of simple bimetallic thermostats. The software collects temperature readings from six platinum resistance thermometers, calculates temperature errors from the readings, and implements a proportional + integral + derivative (PID) control algorithm that adjusts heater power levels. The software accepts, via a serial port, commands to change its operational parameters. The software attempts to detect and mitigate a host of potential faults. It is robust to many kinds of faults in that it can maintain PID control in the presence of those faults.
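
    A per-zone PID loop with a simple fault mitigation step can be sketched in Python as below. This is not the flight software described in the record: the class name, gains, plausibility window and the "reuse the last good reading" policy are assumptions chosen only to illustrate how PID control might be kept running when a thermometer reading is faulty.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ZonePID:
    """PID controller for one temperature zone (all parameters hypothetical)."""

    kp: float
    ki: float
    kd: float
    setpoint: float
    min_valid: float = -100.0  # assumed plausibility window for a reading
    max_valid: float = 200.0
    integral: float = 0.0
    prev_error: float = 0.0
    last_good: Optional[float] = None

    def update(self, reading: float, dt: float) -> float:
        """Return a heater power level in [0, 1] for one control step."""
        if self.min_valid <= reading <= self.max_valid:
            self.last_good = reading
        elif self.last_good is not None:
            # Fault mitigation: an implausible reading is replaced by the
            # last plausible one so that the loop keeps running.
            reading = self.last_good
        else:
            return 0.0  # no trustworthy data yet: keep the heater off

        error = self.setpoint - reading
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        power = self.kp * error + self.ki * self.integral + self.kd * derivative
        return min(max(power, 0.0), 1.0)  # clamp to the heater's range


# Six zones, one controller each, as in the six-thermometer setup.
zones = [ZonePID(kp=0.5, ki=0.01, kd=0.1, setpoint=20.0) for _ in range(6)]
readings = [19.0, 19.5, 20.0, 1e9, 18.0, 21.0]  # 1e9 mimics a failed sensor
powers = [pid.update(r, dt=1.0) for pid, r in zip(zones, readings)]
print(powers)
```

    Clamping the output and holding the last plausible reading are two of many possible mitigations; the record itself does not say which faults the software handles or how.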

  1. Fault tolerance in onboard processors - Protecting efficient FDM demultiplexers

    Science.gov (United States)

    Redinbo, Robert

    1992-01-01

    The application of convolutional codes to protect demultiplexer filter banks is demonstrated analytically for efficient implementations. An overview is given of the parameters for the efficient implementations of filter banks, and real convolutional codes are discussed in terms of DSP operations. Methods for composite filtering and parity generation are outlined, and attention is given to the protection of polyphase filter demultiplexing systems. Real convolutional codes can be applied to protect demultiplexer filter banks by employing two forms of low-rate parity calculation to each filter bank. The parity values are computed either by the output with an FIR parity filter or in parallel with the normal processing by a composite filter. Hardware similarities between the filter bank and the main demultiplexer bank permit efficient redeployment of the processing resources to the main processing function in any configuration.

  2. Fault-tolerant feature-based estimation of space debris rotational motion during active removal missions

    Science.gov (United States)

    Biondi, Gabriele; Mauro, Stefano; Pastorelli, Stefano; Sorli, Massimo

    2018-05-01

    One of the key functionalities required by an Active Debris Removal mission is the assessment of the target kinematics and inertial properties. Passive sensors, such as stereo cameras, are often included in the onboard instrumentation of a chaser spacecraft for capturing sequential photographs and for tracking features of the target surface. Plenty of methods, based on Kalman filtering, are available for the estimation of the target's state from feature positions; however, to guarantee the filter convergence, they typically require continuity of measurements and the capability of tracking a fixed set of pre-defined features of the object. These requirements clash with the actual tracking conditions: failures in feature detection often occur and the assumption of having some a-priori knowledge about the shape of the target could be restrictive in certain cases. The aim of the presented work is to propose a fault-tolerant alternative method for estimating the angular velocity and the relative magnitudes of the principal moments of inertia of the target. Raw data regarding the positions of the tracked features are processed to evaluate corrupted values of a 3-dimensional parameter which entirely describes the finite screw motion of the debris and which, primarily, is invariant to the particular set of considered features of the object. Missing values of the parameter are completely restored exploiting the typical periodicity of the rotational motion of an uncontrolled satellite: compressed sensing techniques, typically adopted for recovering images or for prognostic applications, are herein used in a completely original fashion for retrieving a kinematic signal that appears sparse in the frequency domain. Due to its invariance with respect to the features, no assumptions are needed about the target's shape and continuity of the tracking. The obtained signal is useful for the indirect evaluation of an attitude signal that feeds an unscented Kalman filter for the estimation of

  3. LQCD workflow execution framework: Models, provenance and fault-tolerance

    International Nuclear Information System (INIS)

    Piccoli, Luciano; Simone, James N; Kowalkowlski, James B; Dubey, Abhishek

    2010-01-01

    Large computing clusters used for scientific processing suffer from systemic failures when operated over long continuous periods for executing workflows. Diagnosing job problems and faults leading to eventual failures in this complex environment is difficult, specifically when the success of an entire workflow might be affected by a single job failure. In this paper, we introduce a model-based, hierarchical, reliable execution framework that encompasses workflow specification, data provenance, execution tracking and online monitoring of each workflow task, also referred to as participants. The sequence of participants is described in an abstract parameterized view, which is translated into a concrete data dependency based sequence of participants with defined arguments. As participants belonging to a workflow are mapped onto machines and executed, periodic and on-demand monitoring of vital health parameters on allocated nodes is enabled according to pre-specified rules. These rules specify conditions that must be true pre-execution, during execution and post-execution. Monitoring information for each participant is propagated upwards through the reflex and healing architecture, which consists of a hierarchical network of decentralized fault management entities, called reflex engines. They are instantiated as state machines or timed automata that change state and initiate reflexive mitigation action(s) upon occurrence of certain faults. We describe how this cluster reliability framework is combined with the workflow execution framework using formal rules and actions specified within a structure of first order predicate logic that enables a dynamic management design that reduces manual administrative workload and increases cluster productivity.

  4. On TTEthernet for Integrated Fault-Tolerant Spacecraft Networks

    Science.gov (United States)

    Loveless, Andrew

    2015-01-01

    There has recently been a push for adopting integrated modular avionics (IMA) principles in designing spacecraft architectures. This consolidation of multiple vehicle functions to shared computing platforms can significantly reduce spacecraft cost, weight, and design complexity. Ethernet technology is attractive for inclusion in more integrated avionic systems due to its high speed, flexibility, and the availability of inexpensive commercial off-the-shelf (COTS) components. Furthermore, Ethernet can be augmented with a variety of quality of service (QoS) enhancements that enable its use for transmitting critical data. TTEthernet introduces a decentralized clock synchronization paradigm enabling the use of time-triggered Ethernet messaging appropriate for hard real-time applications. TTEthernet can also provide two forms of event-driven communication, therefore accommodating the full spectrum of traffic criticality levels required in IMA architectures. This paper explores the application of TTEthernet technology to future IMA spacecraft architectures as part of the Avionics and Software (A&S) project chartered by NASA's Advanced Exploration Systems (AES) program.

  5. Stochastic Model Predictive Fault Tolerant Control Based on Conditional Value at Risk for Wind Energy Conversion System

    Directory of Open Access Journals (Sweden)

    Yun-Tao Shi

    2018-01-01

    Wind energy has been drawing considerable attention in recent years. However, due to the random nature of wind and the high failure rate of wind energy conversion systems (WECSs), how to implement fault-tolerant WECS control is becoming a significant issue. This paper addresses the fault-tolerant control problem of a WECS with a probable actuator fault. A new stochastic model predictive control (SMPC) fault-tolerant controller with the Conditional Value at Risk (CVaR) objective function is proposed in this paper. First, the Markov jump linear model is used to describe the WECS dynamics, which are affected by many stochastic factors, like the wind. The Markov jump linear model can precisely model the random WECS properties. Second, the scenario-based SMPC is used as the controller to address the control problem of the WECS. With this controller, all the possible realizations of the disturbance in the prediction horizon are enumerated by scenario trees so that an uncertain SMPC problem can be transformed into a deterministic model predictive control (MPC) problem. Finally, the CVaR objective function is adopted to improve the fault-tolerant control performance of the SMPC controller. CVaR can provide a balance between the performance and random failure risks of the system. The Min-Max performance index is introduced to compare the fault-tolerant control performance with the proposed controller. The comparison results show that the proposed method has better fault-tolerant control performance.
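
    To make the CVaR objective concrete, the following minimal sketch computes CVaR for a small set of discrete scenario costs, using the standard Rockafellar-Uryasev form CVaR_alpha = min over t of t + E[max(cost - t, 0)] / (1 - alpha). The scenario costs and probabilities are invented for illustration and have nothing to do with the WECS model in the paper.

```python
import math


def cvar(costs, probs, alpha):
    """CVaR_alpha of a discrete cost distribution (higher cost = worse).

    Uses the Rockafellar-Uryasev formula. For a discrete distribution the
    objective is piecewise linear and convex in t with breakpoints at the
    cost values, so it is enough to evaluate it at each cost value.
    """
    if not math.isclose(sum(probs), 1.0):
        raise ValueError("probabilities must sum to 1")
    best = math.inf
    for t in costs:
        expected_excess = sum(p * max(c - t, 0.0) for c, p in zip(costs, probs))
        best = min(best, t + expected_excess / (1.0 - alpha))
    return best


# Illustrative scenario tree leaves: three benign outcomes and one costly fault.
scenario_costs = [1.0, 2.0, 3.0, 10.0]
scenario_probs = [0.25, 0.25, 0.25, 0.25]
expected_cost = sum(c * p for c, p in zip(scenario_costs, scenario_probs))
tail_risk = cvar(scenario_costs, scenario_probs, alpha=0.75)
```

    With these numbers the expected cost averages over all four scenarios, whereas CVaR at alpha = 0.75 reflects only the worst quarter of outcomes; an alpha between 0 and 1 tunes the trade-off between average performance and tail risk.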

  6. Fault Tolerance and Scaling in e-Science Cloud Applications: Observations from the Continuing Development of MODISAzure

    Energy Technology Data Exchange (ETDEWEB)

    Li, Jie [Univ. of Virginia, Charlottesville, VA (United States). Dept. of Computer Science; Humphrey, Marty [Univ. of Virginia, Charlottesville, VA (United States). Dept. of Computer Science; Cheah, You-Wei [Indiana Univ., Bloomington, IN (United States); Ryu, Youngryel [Univ. of California, Berkeley, CA (United States). Dept. of Environmental Science, Policy, and Management; Agarwal, Deb [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Jackson, Keith [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); van Ingen, Catharine [Microsoft Research. San Francisco, CA (United States)

    2010-04-01

    It can be natural to believe that many of the traditional issues of scale have been eliminated or at least greatly reduced via cloud computing. That is, if one can create a seemingly well-functioning cloud application that operates correctly on small or moderate-sized problems, then the very nature of cloud programming abstractions means that the same application will run as well on potentially significantly larger problems. In this paper, we present our experiences taking MODISAzure, our satellite data processing system built on the Windows Azure cloud computing platform, from the proof-of-concept stage to a point of being able to run on significantly larger problem sizes (e.g., from national-scale data sizes to global-scale data sizes). To our knowledge, this is the longest-running eScience application on the nascent Windows Azure platform. We found that while many infrastructure-level issues were thankfully masked from us by the cloud infrastructure, it was valuable to design additional redundancy and fault-tolerance capabilities such as transparent idempotent task retry and logging to support debugging of user code encountering unanticipated data issues. Further, we found that using a commercial cloud means anticipating inconsistent performance and black-box behavior of virtualized compute instances, as well as leveraging changing platform capabilities over time. We believe that the experiences presented in this paper can help future eScience cloud application developers on Windows Azure and other commercial cloud providers.
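
    The "transparent idempotent task retry and logging" mentioned above can be illustrated with a small sketch. It is not MODISAzure code: the function name `run_idempotent`, the in-memory `done` dictionary (standing in for durable storage) and the retry limits are assumptions made for the example.

```python
import time


def run_idempotent(task_id, task_fn, done, max_attempts=3, delay_s=0.0):
    """Run a task with retries; a task that already completed is never re-run.

    `done` maps task ids to results and stands in for a durable store, so
    re-submitting the same task id is harmless (idempotent).
    """
    if task_id in done:
        return done[task_id]
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            result = task_fn()
            done[task_id] = result  # record completion before returning
            return result
        except Exception as exc:  # log and retry on any task failure
            last_error = exc
            print(f"task {task_id}: attempt {attempt} failed: {exc!r}")
            time.sleep(delay_s)
    raise RuntimeError(f"task {task_id} failed after {max_attempts} attempts") from last_error
```

    The log line on each failed attempt is the part that helps with debugging user code that hits unanticipated data issues; a real system would write it to persistent storage rather than standard output.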

  7. Distributed GPU Computing in GIScience

    Science.gov (United States)

    Jiang, Y.; Yang, C.; Huang, Q.; Li, J.; Sun, M.

    2013-12-01

    Geoscientists strive to discover potential principles and patterns hidden inside ever-growing Big Data for scientific discoveries. To better achieve this objective, more capable computing resources are required to process, analyze and visualize Big Data (Ferreira et al., 2003; Li et al., 2013). Current CPU-based computing techniques cannot promptly meet the computing challenges caused by the increasing number of datasets from different domains, such as social media, earth observation, and environmental sensing (Li et al., 2013). Meanwhile, CPU-based computing resources structured as clusters or supercomputers are costly. In the past several years, as GPU-based technology has matured in both capability and performance, GPU-based computing has emerged as a new computing paradigm. Compared to a traditional microprocessor, the modern GPU, as a compelling alternative microprocessor, has outstanding parallel processing capability with cost-effectiveness and efficiency (Owens et al., 2008), although it was initially designed for graphical rendering in the visualization pipeline. This presentation reports a distributed GPU computing framework for integrating GPU-based computing within a distributed environment. Within this framework, 1) for each single computer, both GPU-based and CPU-based computing resources can be fully utilized to improve the performance of visualizing and processing Big Data; 2) within a network environment, a variety of computers can be used to build up a virtual supercomputer to support CPU-based and GPU-based computing in a distributed computing environment; 3) GPUs, as graphics-targeted devices, are used to greatly improve the rendering efficiency in distributed geo-visualization, especially for 3D/4D visualization. Key words: Geovisualization, GIScience, Spatiotemporal Studies. Reference: 1. Ferreira de Oliveira, M. C., & Levkowitz, H. (2003). From visual data exploration to visual data mining: A survey. Visualization and Computer Graphics, IEEE

  8. Energy efficient distributed computing systems

    CERN Document Server

    Lee, Young-Choon

    2012-01-01

    The energy consumption issue in distributed computing systems raises various monetary, environmental and system performance concerns. Electricity consumption in the US doubled from 2000 to 2005.  From a financial and environmental standpoint, reducing the consumption of electricity is important, yet these reforms must not lead to performance degradation of the computing systems.  These contradicting constraints create a suite of complex problems that need to be resolved in order to lead to 'greener' distributed computing systems.  This book brings together a group of outsta

  9. From fault classification to fault tolerance for multi-agent systems

    CERN Document Server

    Potiron, Katia; Taillibert, Patrick

    2013-01-01

    Faults are a concern for Multi-Agent Systems (MAS) designers, especially if the MAS are built for industrial or military use because there must be some guarantee of dependability. Some fault classification exists for classical systems, and is used to define faults. When dependability is at stake, such fault classification may be used from the beginning of the system's conception to define fault classes and specify which types of faults are expected. Thus, one may want to use fault classification for MAS; however, From Fault Classification to Fault Tolerance for Multi-Agent Systems argues that

  10. Scheduling of Fault-Tolerant Embedded Systems with Soft and Hard Timing Constraints

    DEFF Research Database (Denmark)

    Izosimov, Viacheslav; Pop, Paul; Eles, Petru

    2008-01-01

    fails or completes, incurs an unacceptable overhead. Thus, we use a quasi-static scheduling strategy, where a set of schedules is synthesized off-line and, at run time, the scheduler will select the right schedule based on the occurrence of faults and the actual execution times of processes......In this paper we present an approach to the synthesis of fault-tolerant schedules for embedded applications with soft and hard real-time constraints. We are interested to guarantee the deadlines for the hard processes even in the case of faults, while maximizing the overall utility. We use time...

  11. Industrial Cost-Benefit Assessment for Fault-tolerant Control Systems

    DEFF Research Database (Denmark)

    Thybo, C.; Blanke, M.

    1998-01-01

    against failure. The paper describes the assessments needed to find the right path for new industrial designs. The economic decisions in the design phase are discussed: cost of different failures, profits associated with available benefits, investments needed for development and life-time support....... The objective of this paper is to help, in the early product development stage, to find the economically most suitable scheme. A salient result is that with increased customer awareness of total cost of ownership, new products can benefit significantly from applying fault tolerant control principles....

  12. Safety Verification of a Fault Tolerant Reconfigurable Autonomous Goal-Based Robotic Control System

    Science.gov (United States)

    Braman, Julia M. B.; Murray, Richard M; Wagner, David A.

    2007-01-01

    Fault tolerance and safety verification of control systems are essential for the success of autonomous robotic systems. A control architecture called Mission Data System (MDS), developed at the Jet Propulsion Laboratory, takes a goal-based control approach. In this paper, a method for converting goal network control programs into linear hybrid systems is developed. The linear hybrid system can then be verified for safety in the presence of failures using existing symbolic model checkers. An example task is simulated in MDS and successfully verified using HyTech, a symbolic model checking software for linear hybrid systems.

  13. Fault tolerant, multiplexed control rod position detection and indication system for nuclear power plants

    International Nuclear Information System (INIS)

    Dufek, W.L.; Jelovich, J.J.; Neuner, J.A.

    1977-01-01

    The majority of Westinghouse nuclear plants placed in service thus far have incorporated a Rod Position Indication system based upon an analog design philosophy. This system, while meeting all functional and accuracy requirements, has proven somewhat cumbersome, particularly in the area of initial field calibration and maintenance. This paper describes a new Digital Rod Position Indication system (DRPI) developed for use with pressurized water reactors. The system is based upon a digital design philosophy and meets all previous design constraints and environmental requirements. Further, fault tolerance, improved accuracy, interference from adjacent rods and the elimination of adjustments and calibration have been provided.

  14. Low-Cost Fault Tolerant Methodology for Real Time MPSoC Based Embedded System

    Directory of Open Access Journals (Sweden)

    Mohsin Amin

    2014-01-01

    We are proposing a design methodology for a fault tolerant homogeneous MPSoC having additional design objectives that include low hardware overhead and performance. We have implemented three different FT methodologies on MPSoCs and compared them against the defined constraints. The comparison of these FT methodologies is carried out by modelling their architectures in VHDL-RTL, on Spartan 3 FPGA. The results obtained through simulations helped us to identify the most relevant scheme in terms of the given design constraints.

  15. Lithium Ion Battery (LIB) Charger: Spacesuit Battery Charger Design with 2-Fault Tolerance to Catastrophic Hazards

    Science.gov (United States)

    Darcy, Eric; Davies, Frank

    2009-01-01

    A charger design that is 2-fault tolerant to catastrophic hazards has been achieved for the Spacesuit Li-ion Battery, with the following key features. Power supply control circuit and 2 microprocessors independently control against overcharge. 3 microprocessors control against undercharge (false positive: Go for EVA) conditions. 2 independent channels provide functional redundancy. Capable of charge balancing cell banks in series. Cell manufacturing and performance uniformity is excellent with both designs. Once a few outliers are removed, LV cells are slightly more uniform than MoliJ cells. If cell balance feature of charger is ever invoked, it will be an indication of a significant degradation issue, not a nominal condition.

  16. An experimental evaluation of the effectiveness of random testing of fault-tolerant software

    Science.gov (United States)

    Vouk, Mladen A.; Mcallister, David F.; Tai, K. C.

    1986-01-01

    Results of a fault-tolerant software (FTS) experiment are used to show deficiencies of the simple random testing approach. Testing was performed using randomly generated test cases supplemented with extremal and special value (ESV) cases. Error detection efficiency of the random testing approach, with emphasis on correlated errors, was compared to the error detecting capabilities of the ESV data and found deficient. The use of carefully designed test cases as a supplement to random testing, as well as the use of structure based testing are recommended.

  17. (m,n)-Semirings and a Generalized Fault-Tolerance Algebra of Systems

    Directory of Open Access Journals (Sweden)

    Syed Eqbal Alam

    2013-01-01

    We propose a new class of mathematical structures called (m,n)-semirings (which generalize the usual semirings) and describe their basic properties. We define partial ordering and generalize the concepts of congruence, homomorphism, and so forth, for (m,n)-semirings. Following earlier work by Rao (2008), we consider systems made up of several components whose failures may cause them to fail and represent the set of such systems algebraically as an (m,n)-semiring. Based on the characteristics of these components, we present a formalism to compare the fault-tolerance behavior of two systems using our framework of a partially ordered (m,n)-semiring.

  18. Full-Authority Fault-Tolerant Electronic Engine Control System for Variable Cycle Engines.

    Science.gov (United States)

    1982-04-01

    hydraulic actuation system controlled on corrected speed by the digital controller through an Electro-Hydraulic Servo Valve (EHSV) and a vane position...AD-A121 746 FULL-AUTHORITY FAULT-TOLERANT ELECTRONIC ENGINE CONTROL SYSTEM FOR VARIAB..(U) GENERAL MOTORS CORP INDIANAPOLIS IN DETROIT DIESEL...AFWAL-TR-82-2037 Full-Authority Fault-Tolerant Electronic Engine Control System for Variable

  19. A reliable fuzzy fault-tolerant automatic controller for nuclear plant equipment

    International Nuclear Information System (INIS)

    Rodriguez, R.J.; Liang, E.; Husseiny, A.A.; Sabri, Z.A.

    1989-01-01

    A reliable fuzzy fault-tolerant automatic controller (REFFTAC) has been designed for control of nuclear power equipment. The controller comprises an adaptive digital controller for on-line design of control actions based on analysis of input and output signals and a fuzzy controller as a backup system using a set of control rules based on operation experience. The controller is applied to a counter-flow heat exchanger. The applicability to control of nuclear reactor pumps is examined. A cost-benefit analysis is also provided. (orig.)

  20. Distributed-memory matrix computations

    DEFF Research Database (Denmark)

    Balle, Susanne Mølleskov

    1995-01-01

    in these algorithms is that many scientific applications rely heavily on the performance of the involved dense linear algebra building blocks. Even though we consider the distributed-memory as well as the shared-memory programming paradigm, the major part of the thesis is dedicated to distributed-memory architectures....... We emphasize distributed-memory massively parallel computers - such as the Connection Machines model CM-200 and model CM-5/CM-5E - available to us at UNI-C and at Thinking Machines Corporation. The CM-200 was at the time this project started one of the few existing massively parallel computers...... algorithm is investigated. this algorithm is built on top of several scan-operations. What difficulties occur when implementing this algorithm to massively parallel computers?...

  1. Distributed computing for macromolecular crystallography.

    Science.gov (United States)

    Krissinel, Evgeny; Uski, Ville; Lebedev, Andrey; Winn, Martyn; Ballard, Charles

    2018-02-01

    Modern crystallographic computing is characterized by the growing role of automated structure-solution pipelines, which represent complex expert systems utilizing a number of program components, decision makers and databases. They also require considerable computational resources and regular database maintenance, which is increasingly more difficult to provide at the level of individual desktop-based CCP4 setups. On the other hand, there is a significant growth in data processed in the field, which brings up the issue of centralized facilities for keeping both the data collected and structure-solution projects. The paradigm of distributed computing and data management offers a convenient approach to tackling these problems, which has become more attractive in recent years owing to the popularity of mobile devices such as tablets and ultra-portable laptops. In this article, an overview is given of developments by CCP4 aimed at bringing distributed crystallographic computations to a wide crystallographic community.

  2. Distributed computing at the SSCL

    International Nuclear Information System (INIS)

    Cormell, L.; White, R.

    1993-05-01

    The rapid increase in the availability of high performance, cost-effective RISC/UNIX workstations has been both a blessing and a curse. The blessing of having extremely powerful computing engines available on the desktop is well known to many users. The user has tremendous freedom, flexibility, and control of his environment. That freedom can, however, become the curse of distributed computing. The user must become a system manager to some extent; he must worry about backups, maintenance, upgrades, etc. Traditionally these activities have been the responsibility of a central computing group. The central computing group, however, may find that it can no longer provide all of the traditional services. With the plethora of workstations now found on so many desktops throughout the entire campus or lab, the central computing group may be swamped by support requests. This talk will address several of these computer support and management issues by discussing the approach taken at the Superconducting Super Collider Laboratory. In addition, a brief review of the future directions of commercial products for distributed computing and management will be given.

  3. Kinematic analysis and fault-tolerant trajectory planning of space manipulator under a single joint failure.

    Science.gov (United States)

    Mu, Zonggao; Han, Liang; Xu, Wenfu; Li, Bing; Liang, Bin

    2016-01-01

    A space manipulator plays an important role in spacecraft capturing, repairing, maintenance, and so on. However, the harsh space environment can cause its joints to fail. For a non-redundant manipulator, a single locked-joint failure will cause it to lose one degree of freedom (DOF), hence reducing its movement ability. In this paper, the key problems related to fault tolerance, including kinematics, workspace, and trajectory planning of a non-redundant space manipulator under single joint failure, are handled. First, the analytical inverse kinematics equations are derived for the 5-DOF manipulator formed by locking the failed joint of the original 6-DOF manipulator. Then, the reachable end-effector pose (position and orientation) is determined. Further, we define the missions that can be completed by the 5-DOF manipulator. According to the constraints of the on-orbit mission, we determine the grasp envelope required for the end-effector. Combining the manipulability of the manipulator and the performance of its end-effector, a fault tolerance parameter is defined and a planning method is proposed to generate a reasonable trajectory, based on which the 5-DOF manipulator can complete the desired tasks. Finally, typical cases are simulated and the simulation results verify the proposed method.

  4. Massive Sensor Array Fault Tolerance: Tolerance Mechanism and Fault Injection for Validation

    Directory of Open Access Journals (Sweden)

    Dugan Um

    2010-01-01

    As today's machines become increasingly complex in order to handle intricate tasks, the number of sensors must increase for intelligent operations. Given the large number of sensors, detecting, isolating, and then tolerating faulty sensors is especially important. In this paper, we propose a fault-tolerance architecture suitable for a massive sensor array often found in highly advanced systems such as autonomous robots. One example is the sensitive skin, a type of massive sensor array. The objective of the sensitive skin is autonomous guidance of machines in unknown environments, requiring prolonged operations at a remote site. The entirety of such a system needs to be able to work remotely without human attendance for an extended period of time. To that end, we propose a fault-tolerant architecture whereby component and analytical redundancies are integrated cohesively for effective failure tolerance of a massive array type sensor or sensor system. In addition, we discuss the evaluation results of the proposed tolerance scheme by means of fault injection and validation analysis as a measure of system reliability and performance.

  5. Fault-tolerant conversion between adjacent Reed–Muller quantum codes based on gauge fixing

    Science.gov (United States)

    Quan, Dong-Xiao; Zhu, Li-Li; Pei, Chang-Xing; Sanders, Barry C.

    2018-03-01

    We design forward and backward fault-tolerant conversion circuits, which convert between the Steane code and the 15-qubit Reed–Muller quantum code so as to provide a universal transversal gate set. In our method, only seven out of a total 14 code stabilizers need to be measured, and we further enhance the circuit by simplifying some stabilizers; thus, we need only to measure eight weight-4 stabilizers for one round of forward conversion and seven weight-4 stabilizers for one round of backward conversion. For conversion, we treat random single-qubit errors and their influence on syndromes of gauge operators, and our novel single-step process enables more efficient fault-tolerant conversion between these two codes. We make our method quite general by showing how to convert between any two adjacent Reed–Muller quantum codes $\overline{\textsf{RM}}(1,m)$ and $\overline{\textsf{RM}}(1,m+1)$, for which we need only measure stabilizers whose number scales linearly with m rather than exponentially with m obtained in previous work. We provide the explicit mathematical expression for the necessary stabilizers and the concomitant resources required.

  6. Modular Adder Designs Using Optimal Reversible and Fault Tolerant Gates in Field-Coupled QCA Nanocomputing

    Science.gov (United States)

    Bilal, Bisma; Ahmed, Suhaib; Kakkar, Vipan

    2018-02-01

    The challenges that CMOS technology is facing toward the end of the technology roadmap call for an investigation of various logical and technological solutions to CMOS at the nanoscale. Two such paradigms considered in this paper are reversible logic and quantum-dot cellular automata (QCA) nanotechnology. Firstly, a new 3 × 3 reversible and universal gate, RG-QCA, is proposed and implemented in QCA technology using conventional 3-input majority voter based logic. Further, the gate is optimized by using explicit interaction of cells, and this optimized gate is then used to design an optimized modular full adder in QCA. Another configuration of the RG-QCA gate, CRG-QCA, is then proposed, which is a 4 × 4 gate and includes fault tolerant characteristics and a parity preserving nature. The proposed CRG-QCA gate is then tested to design a fault tolerant full adder circuit. Extensive comparisons of gate and adder circuits are drawn with the existing literature and it is envisaged that our proposed designs perform better and are cost efficient in QCA technology.
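
    As background for the "3-input majority voter based logic" mentioned above, the sketch below shows the well-known way to build a one-bit full adder from three majority gates and two inverters, which are the natural primitives in QCA. It is a generic textbook construction written for illustration; it is not the reversible RG-QCA or CRG-QCA gate proposed in the paper, whose definitions are not given in this abstract.

```python
from itertools import product


def maj(a, b, c):
    """3-input majority voter: outputs 1 when at least two inputs are 1."""
    return (a & b) | (b & c) | (a & c)


def full_adder(a, b, cin):
    """One-bit full adder using only majority gates and inverters."""
    cout = maj(a, b, cin)
    # Sum = MAJ(NOT cout, cin, MAJ(a, b, NOT cin))
    s = maj(1 - cout, cin, maj(a, b, 1 - cin))
    return s, cout


# Exhaustive truth table as (a, b, cin, sum, cout) rows.
truth_table = [(a, b, c) + full_adder(a, b, c) for a, b, c in product((0, 1), repeat=3)]
```

    Each row of `truth_table` satisfies sum + 2 * cout = a + b + cin, which is the defining property of a full adder.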

  7. Design of passive fault-tolerant controllers of a quadrotor based on sliding mode theory

    Directory of Open Access Journals (Sweden)

    Merheb Abdel-Razzak

    2015-09-01

    In this paper, sliding mode control is used to develop two passive fault tolerant controllers for an AscTec Pelican UAV quadrotor. In the first approach, a regular sliding mode controller (SMC) augmented with an integrator uses the robustness property of variable structure control to tolerate partial actuator faults. The second approach is a cascaded sliding mode controller with inner and outer SMC loops. In this configuration, faults are tolerated in the fast inner loop controlling the velocity system. Tuning the controllers to find the optimal values of the sliding mode controller gains is made using the ecological systems algorithm (ESA), a biologically inspired stochastic search algorithm based on the natural equilibrium of animal species. The controllers are tested using SIMULINK in the presence of two different types of actuator faults, partial loss of motor power affecting all the motors at once, and partial loss of motor speed. Results of the quadrotor following a continuous path demonstrated the effectiveness of the controllers, which are able to tolerate a significant number of actuator faults despite the lack of hardware redundancy in the quadrotor system. Tuning the controller using a faulty system further improves its ability to tolerate more severe faults. Simulation results show that passive schemes retain their important role in fault tolerant control and are complementary to active techniques.

  8. Optimal fault-tolerant control strategy of a solid oxide fuel cell system

    Science.gov (United States)

    Wu, Xiaojuan; Gao, Danhui

    2017-10-01

    For solid oxide fuel cell (SOFC) development, load tracking, heat management, air excess ratio constraint, high efficiency, low cost and fault diagnosis are six key issues. However, no existing study addresses control techniques that combine optimization and fault diagnosis for the SOFC system. An optimal fault-tolerant control strategy is presented in this paper, which involves four parts: a fault diagnosis module, a switching module, two backup optimizers and a controller loop. The fault diagnosis part is presented to identify the SOFC current fault type, and the switching module is used to select the appropriate backup optimizer based on the diagnosis result. NSGA-II and TOPSIS are employed to design the two backup optimizers under normal and air compressor fault states. A PID algorithm is used to design the control loop, which includes a power tracking controller, an anode inlet temperature controller, a cathode inlet temperature controller and an air excess ratio controller. The simulation results show that the proposed optimal fault-tolerant control method can track the power, temperature and air excess ratio at the desired values, simultaneously achieving the maximum efficiency and the minimum unit cost under both normal operation and an air compressor fault.
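
    The PID control loop mentioned above can be illustrated with a textbook discrete PID controller tracking a setpoint. Everything in this sketch is an assumption for illustration: the gains, the time step and the toy first-order plant dx/dt = -x + u are invented and do not represent the SOFC model or the tuning used in the paper.

```python
class PID:
    """Textbook discrete PID controller (positional form)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measured):
        error = setpoint - measured
        self.integral += error * self.dt
        # No derivative term on the first call, to avoid a start-up kick.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(setpoint=1.0, steps=3000, dt=0.01):
    """Drive a toy first-order plant dx/dt = -x + u toward the setpoint."""
    pid = PID(kp=2.0, ki=1.0, kd=0.0, dt=dt)
    x = 0.0
    for _ in range(steps):
        u = pid.update(setpoint, x)
        x += dt * (-x + u)  # forward-Euler step of the toy plant
    return x


final_value = simulate()
```

    In the strategy described in the abstract, four such loops (power, anode inlet temperature, cathode inlet temperature and air excess ratio) would each track a reference supplied by whichever backup optimizer the switching module selects.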

  9. Novel fault tolerant modular system architecture for I and C applications

    International Nuclear Information System (INIS)

    Kumar, Ankit; Venkatesan, A.; Madhusoodanan, K.

    2013-01-01

    A novel fault tolerant 3U modular system architecture has been developed for safety related and safety critical I and C systems of the reactor. The design utilizes the simplest multi-drop serial bus, the Inter-Integrated Circuit (I²C) bus, for system operation with simplicity, fault tolerance and online maintainability (hot swap). An I²C bus failure mode analysis was done and the system design was hardened against possible failure modes. The system backplane uses only passive components, dual redundant I²C buses, data consistency checks and a geographical addressing scheme to tackle bus lock-ups/stuck buses and bit flips in data transactions. A dual CPU active/standby redundancy architecture with hot swap implements tolerance for CPU software stuck-up conditions and hardware faults. System cards implement hot swap for online maintainability, power supply fault containment, communication bus fault containment and I/O channel-to-channel isolation and independence. Typical applications, as a purely hardwired (without real time software) Core Temperature Monitoring System for FBRs, as a Universal Signal Conditioning System for safety related I and C systems and as a complete control system for non-nuclear safety systems, have also been discussed. (author)

  10. Achieving privacy-preserving big data aggregation with fault tolerance in smart grid

    Directory of Open Access Journals (Sweden)

    Zhitao Guan

    2017-11-01

    In a smart grid, a huge amount of data is collected for various applications, such as load monitoring and demand response. These data are used for analyzing the power state and formulating the optimal dispatching strategy. However, these big energy data in terms of volume, velocity and variety raise concern over consumers’ privacy. For instance, in order to optimize energy utilization and support demand response, numerous smart meters are installed at a consumer's home to collect energy consumption data at a fine granularity, but these fine-grained data may contain information on the appliances and thus the consumer's behaviors at home. In this paper, we propose a privacy-preserving data aggregation scheme based on secret sharing with fault tolerance in a smart grid, which ensures that the control center obtains the integrated data without compromising privacy. Meanwhile, we also consider fault tolerance and resistance to differential attack during the data aggregation. Finally, we perform a security analysis and performance evaluation of our scheme in comparison with the other similar schemes. The analysis shows that our scheme can meet the security requirement, and it also shows better performance than other popular methods.
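
    The general idea of aggregating meter readings through secret sharing can be sketched with plain additive sharing, shown below. This is a simplified stand-in, not the scheme proposed in the paper: the modulus, the number of aggregators and the function names are assumptions, and the sketch does not provide the fault tolerance or the resistance to differential attack that the paper addresses.

```python
import secrets

MODULUS = 2**61 - 1  # assumed modulus, chosen to be larger than any realistic total


def share(value, n):
    """Split `value` into n additive shares that sum to it modulo MODULUS."""
    parts = [secrets.randbelow(MODULUS) for _ in range(n - 1)]
    parts.append((value - sum(parts)) % MODULUS)
    return parts


def aggregate(readings, n_aggregators=3):
    """Each meter sends one share to each aggregator; only the total is revealed."""
    partial_sums = [0] * n_aggregators
    for reading in readings:
        for j, s in enumerate(share(reading, n_aggregators)):
            partial_sums[j] = (partial_sums[j] + s) % MODULUS
    # The control center combines the aggregators' partial sums.
    return sum(partial_sums) % MODULUS


total = aggregate([5, 12, 7])  # illustrative readings from three meters
```

    Each aggregator sees only uniformly random-looking shares, yet the combined partial sums equal the sum of the readings. A threshold scheme such as Shamir's would be one way to tolerate a missing aggregator, which plain additive sharing cannot do.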

  11. Fault Tolerant Mechanism for Multimedia Flows in Wireless Ad Hoc Networks Based on Fast Switching Paths

    Directory of Open Access Journals (Sweden)

    Juan R. Diaz

    2014-01-01

    Full Text Available Multimedia traffic can be forwarded through a wireless ad hoc network using the available resources of the nodes. Several models and protocols have been designed in order to organize and arrange the nodes to improve transmissions along the network. We use a cluster-based framework, called the MWAHCA architecture, which optimizes multimedia transmissions over a wireless ad hoc network and which we proposed in previous research work. This architecture focuses on decreasing quality of service (QoS) parameters like latency, jitter, and packet loss, but other network features, like load balancing or fault tolerance, were not developed. In this paper, we propose a new fault tolerance mechanism, based on the MWAHCA architecture, in order to recover any multimedia flow crossing the wireless ad hoc network when there is a node failure. The algorithm can run independently for each multimedia flow. The main objective is to keep the QoS parameters as low as possible. To achieve this goal, the convergence time must be controlled and reduced. This paper provides the designed protocol, the analytical model of the algorithm, and a software application developed to test its performance in a real laboratory.

  12. Redundant and fault-tolerant algorithms for real-time measurement and control systems for weapon equipment.

    Science.gov (United States)

    Li, Dan; Hu, Xiaoguang

    2017-03-01

    Because of the high availability requirements from weapon equipment, an in-depth study has been conducted on the real-time fault-tolerance of the widely applied Compact PCI (CPCI) bus measurement and control system. A redundancy design method that uses heartbeat detection to connect the primary and alternate devices has been developed. To address the low successful execution rate and relatively large waste of time slices in the primary version of the task software, an improved algorithm for real-time fault-tolerant scheduling is proposed based on the Basic Checking available time Elimination idle time (BCE) algorithm, applying a single-neuron self-adaptive proportion sum differential (PSD) controller. The experimental validation results indicate that this system has excellent redundancy and fault-tolerance, and the newly developed method can effectively improve the system availability. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.
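
    The heartbeat-based redundancy mentioned above can be illustrated with a small sketch. It assumes a simple timeout rule with integer millisecond timestamps; the class name HeartbeatMonitor, the 300 ms timeout and the no-fail-back policy are illustrative assumptions and are not taken from the paper.

```python
class HeartbeatMonitor:
    """Promote the standby device when the primary's heartbeat goes silent."""

    def __init__(self, timeout_ms, start_ms=0):
        self.timeout_ms = timeout_ms
        self.last_beat_ms = start_ms
        self.active = "primary"

    def beat(self, now_ms):
        """Record a heartbeat received from the primary at time `now_ms`."""
        self.last_beat_ms = now_ms

    def poll(self, now_ms):
        """Return the device that should be in control at time `now_ms`."""
        silent_for = now_ms - self.last_beat_ms
        if self.active == "primary" and silent_for > self.timeout_ms:
            self.active = "standby"  # failover; this sketch never fails back
        return self.active


monitor = HeartbeatMonitor(timeout_ms=300)
monitor.beat(100)
monitor.beat(200)  # last heartbeat before the primary is assumed to hang
states = [monitor.poll(t) for t in (300, 500, 600)]
```

    In this run the primary stays in control while the silence is at most 300 ms and the standby takes over at the 600 ms poll. A real-time system would additionally have to bound the polling period so that the detection delay fits the task deadlines.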

  13. Fault tolerant synchronization of chaotic systems based on T–S fuzzy model with fuzzy sampled-data controller

    International Nuclear Information System (INIS)

    Da-Zhong, Ma; Hua-Guang, Zhang; Zhan-Shan, Wang; Jian, Feng

    2010-01-01

    In this paper the fault tolerant synchronization of two chaotic systems based on a fuzzy model and sampled data is investigated. The problem of fault tolerant synchronization is formulated to study the global asymptotical stability of the error system with the fuzzy sampled-data controller, which contains a state feedback controller and a fault compensator. The synchronization can be achieved no matter whether the fault occurs or not. To investigate the stability of the error system and facilitate the design of the fuzzy sampled-data controller, a Takagi–Sugeno (T–S) fuzzy model is employed to represent the chaotic system dynamics. To acquire good performance and produce a less conservative analysis result, a new parameter-dependent Lyapunov–Krasovskii functional and a relaxed stabilization technique are considered. The stability conditions based on linear matrix inequality are obtained to achieve the fault tolerant synchronization of the chaotic systems. Finally, a numerical simulation is shown to verify the results. (general)

  14. A Parameter Communication Optimization Strategy for Distributed Machine Learning in Sensors

    Directory of Open Access Journals (Sweden)

    Jilin Zhang

    2017-09-01

    Full Text Available In order to utilize the distributed characteristic of sensors, distributed machine learning has become the mainstream approach, but the different computing capability of sensors and network delays greatly influence the accuracy and the convergence rate of the machine learning model. Our paper describes a reasonable parameter communication optimization strategy to balance the training overhead and the communication overhead. We extend the fault tolerance of iterative-convergent machine learning algorithms and propose the Dynamic Finite Fault Tolerance (DFFT). Based on the DFFT, we implement a parameter communication optimization strategy for distributed machine learning, named Dynamic Synchronous Parallel Strategy (DSP), which uses the performance monitoring model to dynamically adjust the parameter synchronization strategy between worker nodes and the Parameter Server (PS). This strategy makes full use of the computing power of each sensor, ensures the accuracy of the machine learning model, and avoids the situation that the model training is disturbed by any tasks unrelated to the sensors.
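
    The abstract does not give the DSP synchronization rule itself. As a loosely related illustration only, the sketch below shows a bounded-staleness gate of the kind a parameter server can use to let fast workers run ahead of slow ones by a limited number of iterations. The class name StalenessGate and the idea of retuning the staleness attribute at run time are assumptions for illustration, not the authors' algorithm.

```python
class StalenessGate:
    """Bounded-staleness gate: a worker may run ahead of the slowest worker
    by at most `staleness` iterations (staleness 0 is fully synchronous)."""

    def __init__(self, n_workers, staleness):
        self.clocks = [0] * n_workers
        self.staleness = staleness  # may be changed at run time by a monitor

    def try_advance(self, worker):
        """Advance the worker's clock if the bound allows it; report success."""
        if self.clocks[worker] - min(self.clocks) <= self.staleness:
            self.clocks[worker] += 1
            return True
        return False  # the worker must wait for slower workers to catch up


gate = StalenessGate(n_workers=2, staleness=1)
fast_worker = [gate.try_advance(0) for _ in range(3)]  # third attempt is blocked
slow_worker = gate.try_advance(1)                      # slow worker catches up by one
fast_again = gate.try_advance(0)                       # fast worker is released
```

    A dynamic strategy in the spirit of the abstract would adjust gate.staleness from monitored worker performance; how that adjustment is made is left open here because the source does not describe it.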

  15. A Parameter Communication Optimization Strategy for Distributed Machine Learning in Sensors.

    Science.gov (United States)

    Zhang, Jilin; Tu, Hangdi; Ren, Yongjian; Wan, Jian; Zhou, Li; Li, Mingwei; Wang, Jue; Yu, Lifeng; Zhao, Chang; Zhang, Lei

    2017-09-21

    In order to utilize the distributed characteristic of sensors, distributed machine learning has become the mainstream approach, but the different computing capability of sensors and network delays greatly influence the accuracy and the convergence rate of the machine learning model. Our paper describes a reasonable parameter communication optimization strategy to balance the training overhead and the communication overhead. We extend the fault tolerance of iterative-convergent machine learning algorithms and propose the Dynamic Finite Fault Tolerance (DFFT). Based on the DFFT, we implement a parameter communication optimization strategy for distributed machine learning, named Dynamic Synchronous Parallel Strategy (DSP), which uses the performance monitoring model to dynamically adjust the parameter synchronization strategy between worker nodes and the Parameter Server (PS). This strategy makes full use of the computing power of each sensor, ensures the accuracy of the machine learning model, and avoids the situation that the model training is disturbed by any tasks unrelated to the sensors.

  16. A Parameter Communication Optimization Strategy for Distributed Machine Learning in Sensors

    Science.gov (United States)

    Zhang, Jilin; Tu, Hangdi; Ren, Yongjian; Wan, Jian; Zhou, Li; Li, Mingwei; Wang, Jue; Yu, Lifeng; Zhao, Chang; Zhang, Lei

    2017-01-01

    In order to utilize the distributed characteristic of sensors, distributed machine learning has become the mainstream approach, but the different computing capability of sensors and network delays greatly influence the accuracy and the convergence rate of the machine learning model. Our paper describes a reasonable parameter communication optimization strategy to balance the training overhead and the communication overhead. We extend the fault tolerance of iterative-convergent machine learning algorithms and propose the Dynamic Finite Fault Tolerance (DFFT). Based on the DFFT, we implement a parameter communication optimization strategy for distributed machine learning, named Dynamic Synchronous Parallel Strategy (DSP), which uses the performance monitoring model to dynamically adjust the parameter synchronization strategy between worker nodes and the Parameter Server (PS). This strategy makes full use of the computing power of each sensor, ensures the accuracy of the machine learning model, and avoids the situation that the model training is disturbed by any tasks unrelated to the sensors. PMID:28934163

  17. Different-Level Simultaneous Minimization Scheme for Fault Tolerance of Redundant Manipulator Aided with Discrete-Time Recurrent Neural Network.

    Science.gov (United States)

    Jin, Long; Liao, Bolin; Liu, Mei; Xiao, Lin; Guo, Dongsheng; Yan, Xiaogang

    2017-01-01

    By incorporating the physical constraints in joint space, a different-level simultaneous minimization scheme, which takes both the robot kinematics and robot dynamics into account, is presented and investigated for fault-tolerant motion planning of redundant manipulator in this paper. The scheme is reformulated as a quadratic program (QP) with equality and bound constraints, which is then solved by a discrete-time recurrent neural network. Simulative verifications based on a six-link planar redundant robot manipulator substantiate the efficacy and accuracy of the presented acceleration fault-tolerant scheme, the resultant QP and the corresponding discrete-time recurrent neural network.

  18. Hydronic distribution system computer model

    Energy Technology Data Exchange (ETDEWEB)

    Andrews, J.W.; Strasser, J.J.

    1994-10-01

    A computer model of a hot-water boiler and its associated hydronic thermal distribution loop has been developed at Brookhaven National Laboratory (BNL). It is intended to be incorporated as a submodel in a comprehensive model of residential-scale thermal distribution systems developed at Lawrence Berkeley. This will give the combined model the capability of modeling forced-air and hydronic distribution systems in the same house using the same supporting software. This report describes the development of the BNL hydronics model, initial results and internal consistency checks, and its intended relationship to the LBL model. A method of interacting with the LBL model that does not require physical integration of the two codes is described. This will provide capability now, with reduced up-front cost, as long as the number of runs required is not large.

  19. A Fault-Tolerant Parallel Structure of Single-Phase Full-Bridge Rectifiers for a Wound-Field Doubly Salient Generator

    DEFF Research Database (Denmark)

    Chen, Zhihui; Chen, Ran; Chen, Zhe

    2013-01-01

    The fault-tolerance design is widely adopted for high-reliability applications. In this paper, a parallel structure of single-phase full-bridge rectifiers (FBRs) (PS-SPFBR) is proposed for a wound-field doubly salient generator. The analysis shows the potential fault-tolerance capability of the PS...

  20. On providing the fault-tolerant operation of information systems based on open content management systems

    Science.gov (United States)

    Kratov, Sergey

    2018-01-01

    Modern information systems designed to serve a wide range of users, regardless of their subject area, are increasingly based on Web technologies and are available to users via the Internet. The article discusses the issues of providing fault-tolerant operation of such information systems, based on free and open source content management systems. The toolkit available to administrators of such systems is shown, and scenarios for using these tools are described. Options for organizing backups and restoring the operability of systems after failures are suggested. Applying the proposed methods and approaches enables continuous monitoring of the state of systems, timely response to emerging problems, and their prompt resolution.

  1. Fault-Tolerant Region-Based Control of an Underwater Vehicle with Kinematically Redundant Thrusters

    Directory of Open Access Journals (Sweden)

    Zool H. Ismail

    2014-01-01

    Full Text Available This paper presents a new control approach for an underwater vehicle with a kinematically redundant thruster system. This control scheme is derived based on a fault-tolerant decomposition for thruster force allocation and a region control scheme for the tracking objective. Given a redundant thruster system, that is, six or more pairs of thrusters are used, the proposed redundancy resolution and region control scheme determine the number of thruster faults, as well as providing the reference thruster forces in order to keep the underwater vehicle within the desired region. The stability of the presented control law is proven in the sense of a Lyapunov function. Numerical simulations are performed with an omnidirectional underwater vehicle and the results of the proposed scheme illustrate the effectiveness in terms of optimizing the thruster forces.

  2. Decentralized Fault-Tolerant Control of Inland Navigation Networks: a Challenge

    Science.gov (United States)

    Segovia, P.; Rajaoarisoa, L.; Nejjari, F.; Blesa, J.; Puig, V.; Duviella, E.

    2017-01-01

    Inland waterways are large-scale networks used principally for navigation. Although transport planning is an important issue, water resource management is a crucial point. Indeed, navigation is not possible when there is too little or too much water inside the waterways. Hence, the water resource management of waterways has to be particularly efficient in a context of climate change and increasing water demand. This management has to be done by considering different time and space scales and still requires the development of new methodologies and tools in the topics of the Control and Informatics communities. This work addresses the problem of waterways management in terms of modeling, control, diagnosis and fault-tolerant control by focusing on the inland waterways of the north of France. A review of proposed tools and the ongoing research topics is provided in this paper.

  3. Fault diagnosis and fault-tolerant control and guidance for aerospace vehicles from theory to application

    CERN Document Server

    Zolghadri, Ali; Cieslak, Jerome; Efimov, Denis; Goupil, Philippe

    2014-01-01

    Fault Diagnosis and Fault-Tolerant Control and Guidance for Aerospace demonstrates the attractive potential of recent developments in control for resolving such issues as improved flight performance, self-protection and extended life of structures. Importantly, the text deals with a number of practically significant considerations: tuning, complexity of design, real-time capability, evaluation of worst-case performance, robustness in harsh environments, and extensibility when development or adaptation is required. Coverage of such issues helps to draw the advanced concepts arising from academic research back towards the technological concerns of industry. Initial coverage of basic definitions and ideas and a literature review gives way to a treatment of important electrical flight control system failures: the oscillatory failure case, runaway, and jamming. Advanced fault detection and diagnosis for linear and nonlinear systems are described. Lastly, recovery strategies appropriate to remaining actuator/sensor/c...

  4. A hybrid robust fault tolerant control based on adaptive joint unscented Kalman filter.

    Science.gov (United States)

    Shabbouei Hagh, Yashar; Mohammadi Asl, Reza; Cocquempot, Vincent

    2017-01-01

    In this paper, a new hybrid robust fault tolerant control scheme is proposed. A robust H∞ control law is used in the non-faulty situation, while a Non-Singular Terminal Sliding Mode (NTSM) controller is activated as soon as an actuator fault is detected. Since a linear robust controller is designed, the system is first linearized through the feedback linearization method. To switch from one controller to the other, a fuzzy based switching system is used. An Adaptive Joint Unscented Kalman Filter (AJUKF) is used for fault detection and diagnosis. The proposed method is based on the simultaneous estimation of the system states and parameters. In order to show the efficiency of the proposed scheme, a simulated 3-DOF robotic manipulator is used. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.

  5. Application of Joint Parameter Identification and State Estimation to a Fault-Tolerant Robot System

    DEFF Research Database (Denmark)

    Sun, Zhen; Yang, Zhenyu

    2011-01-01

    The joint parameter identification and state estimation technique is applied to develop a fault-tolerant space robot system. The potential faults in the considered system are abrupt parametric faults, which indicate that some system parameters will immediately deviate from their nominal values if a fault happens. The concerned system parameters consist of deterministic parts as well as those describing the stochastic features in the system. Due to the purpose for design of reconfigurable control, these deviated system parameters need to be identified as precisely and quickly as possible. Meanwhile, it would further simplify the reconfigurable design task and possibly speed up the system recovery, if the system state information under the new operating circumstance can be available along with faulty parameter information. The joint parameter identification and state estimation using the combined...

  6. A data-driven fault-tolerant control design of linear multivariable systems with performance optimization.

    Science.gov (United States)

    Li, Zhe; Yang, Guang-Hong

    2017-09-01

    In this paper, an integrated data-driven fault-tolerant control (FTC) design scheme is proposed under the configuration of the Youla parameterization for multiple-input multiple-output (MIMO) systems. With unknown system model parameters, the canonical form identification technique is first applied to design the residual observer in fault-free case. In faulty case, with online tuning of the Youla parameters based on the system data via the gradient-based algorithm, the fault influence is attenuated with system performance optimization. In addition, to improve the robustness of the residual generator to a class of system deviations, a novel adaptive scheme is proposed for the residual generator to prevent its over-activation. Simulation results of a two-tank flow system demonstrate the optimized performance and effect of the proposed FTC scheme. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.

  7. Adaptive Fault-Tolerant Tracking Control of Nonaffine Nonlinear Systems with Actuator Failure

    Directory of Open Access Journals (Sweden)

    Hongcheng Zhou

    2014-01-01

    Full Text Available This paper proposes an adaptive fault-tolerant control scheme for nonaffine nonlinear systems. First, a model approximation method that bridges the gap between affine and nonaffine control systems is developed. A joint estimation approach based on the unscented Kalman filter is then used, in which both failure parameters and states are estimated simultaneously by means of an augmented state vector composed of the unknown faults and states. Then, stability analysis is given for the closed-loop system. Finally, the proposed approach is verified using a three-degree-of-freedom simulation of a typical fighter aircraft, and the significantly improved system response demonstrates the practical potential of the theoretical results obtained.

  8. Position, Attitude, and Fault-Tolerant Control of Tilting-Rotor Quadcopter

    Science.gov (United States)

    Kumar, Rumit

    The aim of this thesis is to present algorithms for autonomous control of a tilt-rotor quadcopter UAV. In particular, this research work describes position, attitude and fault-tolerant control of a tilt-rotor quadcopter. Quadcopters are among the most popular and reliable unmanned aerial systems because of their design simplicity, hovering capabilities and minimal operational cost. Numerous applications for quadcopters have been explored all over the world, but very little work has been done to explore design enhancements and address the fault-tolerant capabilities of quadcopters. The tilting-rotor quadcopter is a structural advancement of the traditional quadcopter: because the propeller motors are actuated for tilt, it provides additional actuated controls, which can be used to improve the efficiency of the aerial vehicle during flight. The tilting-rotor design is accomplished by using an additional servo motor for each rotor that enables the rotor to tilt about the axis of the quadcopter arm. The tilting-rotor quadcopter is a more agile version of the conventional quadcopter and is a fully actuated system. It is capable of following complex trajectories with ease. The control strategy in this work is to use the propeller tilts for position and orientation control during autonomous flight of the quadcopter. In conventional quadcopters, two propellers rotate clockwise and the other two rotate counterclockwise to cancel out the net yawing moment of the system, and the variation in rotational speeds of these four propellers is used for maneuvering. This work, by contrast, combines varying propeller rotational speeds with tilting of the propellers for maneuvering during flight. The rotational motion of the propellers works in sync with the propeller tilts to control the position and orientation of the UAV during flight. A PD flight controller is developed to achieve various modes of the

  9. Sensor-driven, fault-tolerant control of a maintenance robot

    International Nuclear Information System (INIS)

    Moy, M.M.; Davidson, W.M.

    1987-01-01

    A robot system has been designed to do routine maintenance tasks on the Sandia Pulsed Reactor (SPR). The use of this Remote Maintenance Robot (RMR) is expected to significantly reduce the occupational radiation exposure of the reactor operators. Reactor safety was a key issue in the design of the robot maintenance system. Using sensors to detect error conditions and intelligent control to recover from the errors, the RMR is capable of responding to error conditions without creating a hazard. This paper describes the design and implementation of a sensor-driven, fault-tolerant control for the RMR. Recovery from errors is not automatic; it does rely on operator assistance. However, a key feature of the error recovery procedure is that the operator is allowed to reenter the programmed operation after the error has been corrected. The recovery procedure guarantees that the moving components of the system will not collide with the reactor during recovery

  10. Fault Detection and Isolation and Fault Tolerant Control of Wind Turbines Using Set-Valued Observers

    DEFF Research Database (Denmark)

    Casau, Pedro; Rosa, Paulo Andre Nobre; Tabatabaeipour, Seyed Mojtaba

    2012-01-01

    Research on wind turbine Operations & Maintenance (O&M) procedures is critical to the expansion of Wind Energy Conversion systems (WEC). In order to reduce O&M costs and increase the lifespan of the turbine, we study the application of Set-Valued Observers (SVO) to the problem of Fault Detection and Isolation (FDI) and Fault Tolerant Control (FTC) of wind turbines, by taking advantage of the recent advances in SVO theory for model invalidation. A simple wind turbine model is presented along with possible faulty scenarios. The FDI algorithm is built on top of the described model, taking into account process disturbances, uncertainty and sensor noise. The FTC strategy takes advantage of the proposed FDI algorithm, enabling the controller reconfiguration shortly after fault events. Additionally, a robust controller is designed so as to increase the wind turbine's performance during low severity faults...

  11. Fault Tolerant Flight Control Using Sliding Modes and Subspace Identification-Based Predictive Control

    KAUST Repository

    Siddiqui, Bilal A.

    2016-07-26

    In this work, a cascade structure of a time-scale separated integral sliding mode and model predictive control is proposed as a viable alternative for fault-tolerant control. A multi-variable sliding mode control law is designed as the inner loop of the flight control system. Subspace identification is carried out on the aircraft in closed loop. The identified plant is then used for model predictive controllers in the outer loop. The overall control law demonstrates improved robustness to measurement noise, modeling uncertainties, multiple faults and severe wind turbulence and gusts. In addition, the flight control system employs filters and dead-zone nonlinear elements to reduce chattering and improve handling quality. Simulation results demonstrate the efficiency of the proposed controller using conventional fighter aircraft without control redundancy.

  12. Robust Fault Tolerant Control for a Class of Time-Delay Systems with Multiple Disturbances

    Directory of Open Access Journals (Sweden)

    Songyin Cao

    2013-01-01

    Full Text Available A robust fault tolerant control (FTC) approach is addressed for a class of nonlinear systems with time delay, actuator faults, and multiple disturbances. The first part of the multiple disturbances is supposed to be an uncertain modeled disturbance and the second one represents a norm-bounded variable. First, a composite observer is designed to estimate the uncertain modeled disturbance and actuator fault simultaneously. Then, an FTC strategy consisting of disturbance observer based control (DOBC), fault accommodation, and a mixed H2/H∞ controller is constructed to reconfigure the considered systems with disturbance rejection and attenuation performance. Finally, simulations for a flight control system are given to show the efficiency of the proposed approach.

  13. Adaptive robust fault tolerant control design for a class of nonlinear uncertain MIMO systems with quantization.

    Science.gov (United States)

    Ao, Wei; Song, Yongdong; Wen, Changyun

    2017-05-01

    In this paper, we investigate the adaptive control problem for a class of nonlinear uncertain MIMO systems with actuator faults and quantization effects. Under some mild conditions, an adaptive robust fault-tolerant control is developed to compensate for the effects of uncertainties, actuator failures and errors caused by quantization, and a range of the parameters for these quantizers is established. Furthermore, a Lyapunov-like approach is adopted to demonstrate that the controller guarantees an ultimately uniformly bounded output tracking error, and that the signals of the closed-loop system remain bounded, even when at most m-q actuators are stuck or suffer outage. Finally, numerical simulations are provided to verify and illustrate the effectiveness of the proposed adaptive schemes. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.

  14. Microfluidic very large-scale integration for biochips: Technology, testing and fault-tolerant design

    DEFF Research Database (Denmark)

    Araci, Ismail Emre; Pop, Paul; Chakrabarty, Krishnendu

    2015-01-01

    Microfluidic biochips are replacing the conventional biochemical analyzers by integrating all the necessary functions for biochemical analysis using microfluidics. Biochips are used in many application areas, such as in vitro diagnostics, drug discovery, biotech and ecology. The focus of this paper is on continuous-flow biochips, where the basic building block is a microvalve. By combining these microvalves, more complex units such as mixers, switches, multiplexers can be built, hence the name of the technology, “microfluidic Very Large-Scale Integration” (mVLSI). A roadblock... presents the state-of-the-art in the mVLSI platforms and emerging research challenges in the area of continuous-flow microfluidics, focusing on testing techniques and fault-tolerant design....

  15. Critical Gates Identification for Fault-Tolerant Design in Math Circuits

    Directory of Open Access Journals (Sweden)

    Tian Ban

    2017-01-01

    Full Text Available Hardware redundancy at different levels of design is a common fault mitigation technique, which is well known for its efficiency to the detriment of area overhead. In order to reduce this drawback, several fault-tolerant techniques have been proposed in literature to find a good trade-off. In this paper, critical constituent gates in math circuits are detected and graded based on the impact of an error in the output of a circuit. These critical gates should be hardened first under the area constraint of design criteria. Indeed, output bits considered crucial to a system receive higher priorities to be protected, reducing the occurrence of critical errors. The 74283 fast adder is used as an example to illustrate the feasibility and efficiency of the proposed approach.
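
    The idea of grading gates by the impact of their errors on the output can be sketched with exhaustive single-fault injection. The example below is a simplified illustration, not the paper's method: it assumes a 2-bit ripple-carry adder (the 74283 itself is a 4-bit adder with a different internal structure), a fault model that inverts one gate output, and the mean absolute numerical error as the criticality score. The gate names and the functions ripple_add, gate_names and rank_gates are hypothetical.

```python
from itertools import product

GATE_KINDS = ("xor_a", "xor_s", "and_a", "and_c", "or_c")


def ripple_add(a_bits, b_bits, flipped=None):
    """Gate-level ripple-carry adder (bits are LSB first).
    `flipped` names one gate whose output is inverted to model a fault."""
    def gate(name, value):
        return value ^ 1 if name == flipped else value

    carry, total = 0, 0
    for i, (a, b) in enumerate(zip(a_bits, b_bits)):
        x1 = gate(f"xor_a{i}", a ^ b)        # half-sum
        s = gate(f"xor_s{i}", x1 ^ carry)    # sum bit
        a1 = gate(f"and_a{i}", a & b)        # carry generate
        a2 = gate(f"and_c{i}", x1 & carry)   # carry propagate
        carry = gate(f"or_c{i}", a1 | a2)    # carry out of this bit
        total += s << i
    return total + (carry << len(a_bits))


def gate_names(width):
    return [f"{kind}{i}" for i in range(width) for kind in GATE_KINDS]


def rank_gates(width):
    """Rank gates by mean absolute output error over all input pairs."""
    inputs = list(product((0, 1), repeat=2 * width))
    scores = {}
    for name in gate_names(width):
        error = 0
        for bits in inputs:
            a, b = bits[:width], bits[width:]
            error += abs(ripple_add(a, b, name) - ripple_add(a, b))
        scores[name] = error / len(inputs)
    return sorted(scores.items(), key=lambda item: -item[1])


ranking = rank_gates(2)
most_critical = ranking[0]
```

    Under these assumptions the final carry gate comes out on top, since inverting it always shifts the result by the weight of the most significant output bit. With an area budget for hardening, one would protect gates in the order given by ranking until the budget is exhausted.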

  16. Neuroadaptive Fault-Tolerant Control of Nonlinear Systems Under Output Constraints and Actuation Faults.

    Science.gov (United States)

    Zhao, Kai; Song, Yongduan; Shen, Zhixi

    2018-02-01

    In this paper, a neuroadaptive fault-tolerant tracking control method is proposed for a class of time-delay pure-feedback systems in the presence of external disturbances and actuation faults. The proposed controller can achieve prescribed transient and steady-state performance, despite uncertain time delays and output constraints as well as actuation faults. By combining a tangent barrier Lyapunov-Krasovskii function with the dynamic surface control technique, the neural network unit in the developed control scheme is able to take its action from the very beginning and play its learning/approximating role safely during the entire system operational envelope, leading to enhanced control performance without the danger of violating compact set precondition. Furthermore, prescribed transient performance and output constraints are strictly ensured in the presence of nonaffine uncertainties, external disturbances, and undetectable actuation faults. The control strategy is also validated by numerical simulation.

  17. Data-based fault-tolerant control for affine nonlinear systems with actuator faults.

    Science.gov (United States)

    Xie, Chun-Hua; Yang, Guang-Hong

    2016-09-01

    This paper investigates the fault-tolerant control (FTC) problem for unknown nonlinear systems with actuator faults including stuck, outage, bias and loss of effectiveness. The upper bounds of stuck faults, bias faults and loss of effectiveness faults are unknown. A new data-based FTC scheme is proposed. It consists of the online estimations of the bounds and a state-dependent function. The estimations are adjusted online to compensate automatically the actuator faults. The state-dependent function solved by using real system data helps to stabilize the system. Furthermore, all signals in the resulting closed-loop system are uniformly bounded and the states converge asymptotically to zero. Compared with the existing results, the proposed approach is data-based. Finally, two simulation examples are provided to show the effectiveness of the proposed approach. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.

  18. Backstepping Design of Adaptive Neural Fault-Tolerant Control for MIMO Nonlinear Systems.

    Science.gov (United States)

    Gao, Hui; Song, Yongduan; Wen, Changyun

    In this paper, an adaptive controller is developed for a class of multi-input and multioutput nonlinear systems with neural networks (NNs) used as a modeling tool. It is shown that all the signals in the closed-loop system with the proposed adaptive neural controller are globally uniformly bounded for any external input in . In our control design, the upper bound of the NN modeling error and the gains of external disturbance are characterized by unknown upper bounds, which is more rational to establish the stability in the adaptive NN control. Filter-based modification terms are used in the update laws of unknown parameters to improve the transient performance. Finally, fault-tolerant control is developed to accommodate actuator failure. An illustrative example applying the adaptive controller to control a rigid robot arm shows the validation of the proposed controller.

  19. Reliable and Fault-Tolerant Software-Defined Network Operations Scheme for Remote 3D Printing

    Science.gov (United States)

    Kim, Dongkyun; Gil, Joon-Min

    2015-03-01

    The recent wide expansion of applicable three-dimensional (3D) printing and software-defined networking (SDN) technologies has led to a great deal of attention being focused on efficient remote control of manufacturing processes. SDN is a renowned paradigm for network softwarization, which has helped facilitate remote manufacturing in association with high network performance, since SDN is designed to control network paths and traffic flows, guaranteeing improved quality of services by obtaining network requests from end-applications on demand through the separated SDN controller or control plane. However, current SDN approaches are generally focused on the controls and automation of the networks, which indicates that there is a lack of management plane development designed for a reliable and fault-tolerant SDN environment. Therefore, in addition to the inherent advantage of SDN, this paper proposes a new software-defined network operations center (SD-NOC) architecture to strengthen the reliability and fault-tolerance of SDN in terms of network operations and management in particular. The cooperation and orchestration between SDN and SD-NOC are also introduced for the SDN failover processes based on four principal SDN breakdown scenarios derived from the failures of the controller, SDN nodes, and connected links. The abovementioned SDN troubles significantly reduce the network reachability to remote devices (e.g., 3D printers, super high-definition cameras, etc.) and the reliability of relevant control processes. Our performance consideration and analysis results show that the proposed scheme can shrink operations and management overheads of SDN, which leads to the enhancement of responsiveness and reliability of SDN for remote 3D printing and control processes.

  20. Trust Index Based Fault Tolerant Multiple Event Localization Algorithm for WSNs

    Directory of Open Access Journals (Sweden)

    Jian Wan

    2011-06-01

    This paper investigates the use of wireless sensor networks for multiple event source localization using binary information from the sensor nodes. The events could continually emit signals whose strength is attenuated in inverse proportion to the distance from the source. In this context, faults occur due to various reasons and are manifested when a node reports a wrong decision. In order to reduce the impact of node faults on the accuracy of multiple event localization, we introduce a trust index model to evaluate the fidelity of information which the nodes report and use in the event detection process, and propose the Trust Index based Subtract on Negative Add on Positive (TISNAP) localization algorithm, which reduces the impact of faulty nodes on the event localization by decreasing their trust index, to improve the accuracy of event localization and performance of fault tolerance for multiple event source localization. The algorithm includes three phases: first, the sink identifies the cluster nodes to determine the number of events that occurred in the entire region by analyzing the binary data reported by all nodes; then, it constructs the likelihood matrix related to the cluster nodes and estimates the location of all events according to the alarmed status and trust index of the nodes around the cluster nodes. Finally, the sink updates the trust index of all nodes according to the fidelity of their information in the previous reporting cycle. The algorithm improves the accuracy of localization and performance of fault tolerance in multiple event source localization. The experimental results show that when the probability of node fault is close to 50%, the algorithm can still accurately determine the number of events and achieves better localization accuracy than other algorithms.
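
    The trust-index idea in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual TISNAP update rule, so everything here is an assumption made for illustration: the function names, the `penalty` and `reward` step sizes, and the clamping of trust to [0, 1]. The sketch only shows the general pattern of lowering a node's weight after a wrong report and using that weight in an add-on-positive, subtract-on-negative score.

```python
def update_trust(trust, reported, expected, penalty=0.2, reward=0.05):
    """Return new trust indices after one reporting cycle.

    Hypothetical rule (not taken from the paper): a node whose binary
    report disagrees with the sink's final decision loses `penalty`;
    an agreeing node gains `reward`. Trust is clamped to [0, 1].
    """
    new_trust = {}
    for node, t in trust.items():
        if reported[node] == expected[node]:
            t = min(1.0, t + reward)
        else:
            t = max(0.0, t - penalty)
        new_trust[node] = t
    return new_trust


def weighted_vote(trust, reported):
    """Trust-weighted score for one candidate location:
    add the node's trust if it raised an alarm, subtract it otherwise."""
    return sum(t if reported[n] else -t for n, t in trust.items())
```

    With three nodes of equal trust where one node disagrees with the sink's decision, that node's trust drops, so its later reports count for less in `weighted_vote`.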

  1. Transient fault tolerant control for vehicle brake-by-wire systems

    International Nuclear Information System (INIS)

    Huang, Shuang; Zhou, Chunjie; Yang, Lili; Qin, Yuanqing; Huang, Xiongfeng; Hu, Bowen

    2016-01-01

    Brake-by-wire (BBW) systems that have no mechanical linkage between the brake pedal and the brake mechanism are expected to improve vehicle safety through better braking capability. However, transient faults in BBW systems can cause dangerous driving situations. Most existing research in this area focuses on the brake control mechanism, but very few studies try to solve the problem associated with transient fault propagation and evolution in the brake control system hierarchy. In this paper, a hierarchical transient fault tolerant scheme with embedded intelligence and resilient coordination for BBW system is proposed based on the analysis of transient fault propagation characteristics. In this scheme, most transient faults are tackled rapidly by a signature-based detection method at the node level, and the remaining transient faults, which cannot be detected directly at the node level and could degrade the system performance through fault propagation and evolution, are detected and recovered through function and structure models at the system level. To jointly accommodate these BBW transient faults at the system level, a sliding mode control algorithm and a task reallocation strategy are designed. A simulation platform based on Architecture Analysis and Design Language (AADL) is established to evaluate the task reallocation strategy, and a hardware-in-the-loop simulation is carried out to validate the proposed scheme systematically. Experimental results show the effectiveness of this new approach to BBW systems. - Highlights: • We propose a hierarchical transient fault tolerant scheme for BBW systems. • A sliding mode algorithm and a task strategy are designed to tackle transient fault. • The effectiveness of the scheme is verified in both simulation and HIL environments.

  2. Design of fault tolerant control system for individual blade control helicopters

    Science.gov (United States)

    Tamayo, Sergio

    This dissertation presents the development of a fault tolerant control scheme for helicopters fitted with individually controlled blades. This novel approach attempts to improve fault tolerant capabilities of helicopter control system by increasing control redundancy using additional actuators for individual blade input and software re-mixing to obtain nominal or close to nominal conditions under failure. An advanced interactive simulation environment has been developed including modeling of sensor failure, swashplate actuator failure, individual blade actuator failure, and blade delamination to support the design, testing, and evaluation of the control laws. This simulation environment is based on the blade element theory for the calculation of forces and moments generated by the main rotor. This discretized model allows for individual blade analysis, which in turn allows measuring the consequences of a stuck blade, or loss of the surface area of the blade itself, with respect to the dynamics of the whole helicopter. The control laws are based on non-linear dynamic inversion and artificial neural network augmentation, which is a mix of linear and nonlinear methods that compensates for model inaccuracies due to linearization or failure. A stability analysis based on the Lyapunov function approach has shown that bounded tracking error is guaranteed, and under specific circumstances, global stability is guaranteed as well. An analysis over the degrees of freedom of the mechanical system and its impact over the helicopter handling qualities is also performed to measure the degree of redundancy achieved with the addition of individual blade actuators as compared to a classic swashplate helicopter configuration. Mathematical analysis and numerical simulation, using reconfiguration of the individual blade control under failure have shown that this control architecture can potentially improve the survivability of the aircraft and reduce pilot workload under failure

  3. Distributed Computing in Universities and Colleges.

    Science.gov (United States)

    Sircar, Sumit

    1979-01-01

    Analyzes the implications of distributed computing in institutions of higher education. Discusses (1) the extent to which the quality of computing might be enhanced by adopting a distributed computing approach, (2) variations in distributed systems design and the cost of adoption, and (3) administration of distributed systems. (Author/CMV)

  4. Design Optimization of Time- and Cost-Constrained Fault-Tolerant Embedded Systems with Checkpointing and Replication

    DEFF Research Database (Denmark)

    Pop, Paul; Izosimov, Viacheslav; Eles, Petru

    2009-01-01

    We present an approach to the synthesis of fault-tolerant hard real-time systems for safety-critical applications. We use checkpointing with rollback recovery and active replication for tolerating transient faults. Processes and communications are statically scheduled. Our synthesis approach...

  5. An improved fault-tolerant control scheme for PWM inverter-fed induction motor-based EVs.

    Science.gov (United States)

    Tabbache, Bekheïra; Benbouzid, Mohamed; Kheloui, Abdelaziz; Bourgeot, Jean-Matthieu; Mamoune, Abdeslam

    2013-11-01

    This paper proposes an improved fault-tolerant control scheme for PWM inverter-fed induction motor-based electric vehicles. The proposed strategy deals with power switch (IGBTs) failures mitigation within a reconfigurable induction motor control. To increase the vehicle powertrain reliability regarding IGBT open-circuit failures, 4-wire and 4-leg PWM inverter topologies are investigated and their performances discussed in a vehicle context. The proposed fault-tolerant topologies require only minimum hardware modifications to the conventional off-the-shelf six-switch three-phase drive, mitigating the IGBTs failures by specific inverter control. Indeed, the two topologies exploit the induction motor neutral accessibility for fault-tolerant purposes. The 4-wire topology uses then classical hysteresis controllers to account for the IGBT failures. The 4-leg topology, meanwhile, uses a specific 3D space vector PWM to handle vehicle requirements in terms of size (DC bus capacitors) and cost (IGBTs number). Experiments on an induction motor drive and simulations on an electric vehicle are carried-out using a European urban driving cycle to show that the proposed fault-tolerant control approach is effective and provides a simple configuration with high performance in terms of speed and torque responses. Copyright © 2013 ISA. Published by Elsevier Ltd. All rights reserved.

  6. Flight Tests of Autopilot Integrated with Fault-Tolerant Control of a Small Fixed-Wing UAV

    Directory of Open Access Journals (Sweden)

    Shuo Wang

    2016-01-01

    A fault-tolerant control scheme for the autopilot of a small fixed-wing UAV is designed and tested in actual flight experiments. The small fixed-wing UAV, called Xiang Fei, was developed independently by Nanjing University of Aeronautics and Astronautics. The flight control system is designed based on an open-source autopilot (Pixhawk). Real-time kinematic (RTK) GPS is introduced due to its high accuracy. Some modifications to the longitudinal and lateral guidance laws are made to improve the flight control performance. Moreover, a data-fusion-based fault-tolerant control scheme is integrated into altitude control and speed control to handle altitude sensor failure and airspeed sensor failure, which are common problems for small fixed-wing UAVs. Finally, real flight experiments are carried out to test the fault-tolerant-control-based autopilot of the UAV. Real flight test results are given and analyzed in detail, and they show that, with the fault-tolerant-control-based autopilot, the fixed-wing UAV can track the desired altitude and speed commands during the whole flight process, including takeoff, climbing, cruising, gliding, landing, and wave-off.

  7. A fault-tolerant MEF peptide synthesizer using control and direct sensing electrodes employing current and impedance tests

    NARCIS (Netherlands)

    Zhang, X.; Kerkhoff, Hans G.; Mailly, F.; Nouet, P.; Liu, H.; Richardson, A.

    2007-01-01

    Research in the area of microelectronic fluidic (MEF) devices for biomedical applications is growing rapidly. As faults in these devices can have severe personal implications, a system is presented which includes fault tolerance with respect to the synthesized biomaterials (peptides). It can

  8. Physiological hemostasis based intelligent integrated cooperative controller for precise fault-tolerant control of redundant parallel manipulator

    Science.gov (United States)

    Hao, Kuangrong; Guo, Chongbin; Ding, Yongsheng

    2014-10-01

    This paper focuses on precise fault-tolerant control for actual redundant parallel manipulator. Based on kinematic redundancy, some unnoticed influences such as mechanical clearance have been considered to design a more precise and intelligent fault-tolerant plan for actual plants. According to regulation principles in human hemostasis system, a bio-inspired intelligent integrated cooperative controller (BIICC) is developed including system structure, algorithm and step in parameter tuning. The proposed BIICC optimises partial error signal and improves control performance in each sub-channel. Moreover, the new controller transfers and disposes cooperative control signals among different sub-channels to achieve an intelligent integrated fault-tolerant system. The proposed BIICC is applied to an actual 2-DOF (degrees of freedom) redundant parallel manipulator where the feasibility of the new controller is demonstrated. The BIICC is beneficial to control precision and fault-tolerant capability of redundant plant. The improvements are more obvious in cases where extra actuators of redundant manipulator are broken.

  9. Fault tolerant synchronization of chaotic heavy symmetric gyroscope systems versus external disturbances via Lyapunov rule-based fuzzy control.

    Science.gov (United States)

    Farivar, Faezeh; Shoorehdeli, Mahdi Aliyari

    2012-01-01

    In this paper, fault tolerant synchronization of chaotic gyroscope systems versus external disturbances via Lyapunov rule-based fuzzy control is investigated. Taking the general nature of faults in the slave system into account, a new synchronization scheme, namely, fault tolerant synchronization, is proposed, by which the synchronization can be achieved no matter whether the faults and disturbances occur or not. By making use of a slave observer and a Lyapunov rule-based fuzzy control, fault tolerant synchronization can be achieved. Two techniques are considered as control methods: classic Lyapunov-based control and Lyapunov rule-based fuzzy control. On the basis of Lyapunov stability theory and fuzzy rules, the nonlinear controller and some generic sufficient conditions for global asymptotic synchronization are obtained. The fuzzy rules are directly constructed subject to a common Lyapunov function such that the error dynamics of two identical chaotic motions of symmetric gyros satisfy stability in the Lyapunov sense. Two proposed methods are compared. The Lyapunov rule-based fuzzy control can compensate for the actuator faults and disturbances occurring in the slave system. Numerical simulation results demonstrate the validity and feasibility of the proposed method for fault tolerant synchronization. Copyright © 2011 ISA. Published by Elsevier Ltd. All rights reserved.

  10. Structural Optimization in a Distributed Computing Environment

    National Research Council Canada - National Science Library

    Voon, B. K; Austin, M. A

    1991-01-01

    ...) optimization algorithm customized to a Distributed Numerical Computing environment (DNC). DNC utilizes networking technology and an ensemble of loosely coupled processors to compute structural analyses concurrently...

  11. GATE Monte Carlo simulation of dose distribution using MapReduce in a cloud computing environment.

    Science.gov (United States)

    Liu, Yangchuan; Tang, Yuguo; Gao, Xin

    2017-12-01

    The GATE Monte Carlo simulation platform has good application prospects of treatment planning and quality assurance. However, accurate dose calculation using GATE is time consuming. The purpose of this study is to implement a novel cloud computing method for accurate GATE Monte Carlo simulation of dose distribution using MapReduce. An Amazon Machine Image installed with Hadoop and GATE is created to set up Hadoop clusters on Amazon Elastic Compute Cloud (EC2). Macros, the input files for GATE, are split into a number of self-contained sub-macros. Through Hadoop Streaming, the sub-macros are executed by GATE in Map tasks and the sub-results are aggregated into final outputs in Reduce tasks. As an evaluation, GATE simulations were performed in a cubical water phantom for X-ray photons of 6 and 18 MeV. The parallel simulation on the cloud computing platform is as accurate as the single-threaded simulation on a local server and the simulation correctness is not affected by the failure of some worker nodes. The cloud-based simulation time is approximately inversely proportional to the number of worker nodes. For the simulation of 10 million photons on a cluster with 64 worker nodes, time decreases of 41× and 32× were achieved compared to the single worker node case and the single-threaded case, respectively. The test of Hadoop's fault tolerance showed that the simulation correctness was not affected by the failure of some worker nodes. The results verify that the proposed method provides a feasible cloud computing solution for GATE.
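
    The split, map and reduce steps described in this abstract can be sketched in plain Python. This is not the authors' implementation and it calls neither Hadoop nor GATE: `map_task` is a hypothetical stand-in for one GATE run on a sub-macro, and the voxel-keyed dose dictionary is an assumed output format chosen only to make the aggregation visible.

```python
from collections import Counter


def split_primaries(total, n_tasks):
    """Divide `total` primaries across `n_tasks` self-contained sub-runs,
    keeping the per-task counts as even as possible."""
    base, extra = divmod(total, n_tasks)
    return [base + (1 if i < extra else 0) for i in range(n_tasks)]


def map_task(task_id, n_primaries):
    """Stand-in for one GATE run (hypothetical): returns {voxel: dose}.
    Here every primary deposits one unit in voxel `task_id % 4`."""
    return {task_id % 4: float(n_primaries)}


def reduce_doses(partials):
    """Sum the per-task dose maps voxel by voxel."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds values per key
    return dict(total)
```

    Because each sub-run is self-contained, a task lost to a failed worker can simply be rerun and its partial result summed in as usual, which matches the abstract's observation that correctness does not depend on every worker surviving. As a rough reading of the reported figures, a 41× time decrease on 64 worker nodes corresponds to a parallel efficiency of about 41/64 ≈ 0.64 relative to a single worker node.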

  12. Overlapping clusters for distributed computation.

    Energy Technology Data Exchange (ETDEWEB)

    Mirrokni, Vahab (Google Research, New York, NY); Andersen, Reid (Microsoft Corporation, Redmond, WA); Gleich, David F.

    2010-11-01

    Scalable, distributed algorithms must address communication problems. We investigate overlapping clusters, or vertex partitions that intersect, for graph computations. This setup stores more of the graph than required but then affords the ease of implementation of vertex partitioned algorithms. Our hope is that this technique allows us to reduce communication in a computation on a distributed graph. The motivation above draws on recent work in communication avoiding algorithms. Mohiyuddin et al. (SC09) design a matrix-powers kernel that gives rise to an overlapping partition. Fritzsche et al. (CSC2009) develop an overlapping clustering for a Schwarz method. Both techniques extend an initial partitioning with overlap. Our procedure generates overlap directly. Indeed, Schwarz methods are commonly used to capitalize on overlap. Elsewhere, overlapping communities (Ahn et al., Nature 2009; Mishra et al., WAW2007) are now a popular model of structure in social networks. These have long been studied in statistics (Cole and Wishart, CompJ 1970). We present two types of results: (i) an estimated swapping probability $\rho_\infty$; and (ii) the communication volume of a parallel PageRank solution (link-following $\alpha = 0.85$) using an additive Schwarz method. The volume ratio is the amount of extra storage for the overlap (2 means we store the graph twice). Below, as the ratio increases, the swapping probability and PageRank communication volume decrease.
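
    The volume ratio mentioned in this abstract can be made concrete with a small Python sketch. The abstract defines it only informally (a value of 2 means the graph is stored twice), so the vertex-count formula below is an assumption: total stored vertices across all clusters divided by the number of vertices in the graph. The function name and the coverage check are illustrative.

```python
def volume_ratio(clusters, n_vertices):
    """Stored vertices across all clusters divided by graph size.

    Assumed definition: 1.0 for a non-overlapping partition,
    2.0 when every vertex is stored in exactly two clusters.
    """
    covered = set().union(*clusters)
    if covered != set(range(n_vertices)):
        raise ValueError("clusters must cover every vertex 0..n_vertices-1")
    return sum(len(c) for c in clusters) / n_vertices
```

    For example, clusters {0, 1, 2} and {2, 3, 4} on a five-vertex graph share one vertex, giving a ratio of 6/5 = 1.2.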

  13. Towards distributed multiscale computing for the VPH

    NARCIS (Netherlands)

    Hoekstra, A.G.; Coveney, P.

    2010-01-01

    Multiscale modeling is fundamental to the Virtual Physiological Human (VPH) initiative. Most detailed three-dimensional multiscale models lead to prohibitive computational demands. As a possible solution we present MAPPER, a computational science infrastructure for Distributed Multiscale Computing

  14. How to Improve Fault Tolerance in Disaster Predictions: A Case Study about Flash Floods Using IoT, ML and Real Data

    Directory of Open Access Journals (Sweden)

    Gustavo Furquim

    2018-03-01

    The rise in the number and intensity of natural disasters is a serious problem that affects the whole world. The consequences of these disasters are significantly worse when they occur in urban districts because of the casualties and extent of the damage to goods and property that is caused. Until now feasible methods of dealing with this have included the use of wireless sensor networks (WSNs) for data collection and machine-learning (ML) techniques for forecasting natural disasters. However, there have recently been some promising new innovations in technology which have supplemented the task of monitoring the environment and carrying out the forecasting. One of these schemes involves adopting IP-based (Internet Protocol) sensor networks, by using emerging patterns for IoT. In light of this, in this study, an attempt has been made to set out and describe the results achieved by SENDI (System for dEtecting and forecasting Natural Disasters based on IoT). SENDI is a fault-tolerant system based on IoT, ML and WSN for the detection and forecasting of natural disasters and the issuing of alerts. The system was modeled by means of ns-3 and data collected by a real-world WSN installed in the town of São Carlos - Brazil, which carries out the data collection from rivers in the region. The fault-tolerance is embedded in the system by anticipating the risk of communication breakdowns and the destruction of the nodes during disasters. It operates by adding intelligence to the nodes to carry out the data distribution and forecasting, even in extreme situations. A case study is also included for flash flood forecasting and this makes use of the ns-3 SENDI model and data collected by WSN.

  15. How to Improve Fault Tolerance in Disaster Predictions: A Case Study about Flash Floods Using IoT, ML and Real Data.

    Science.gov (United States)

    Furquim, Gustavo; Filho, Geraldo P R; Jalali, Roozbeh; Pessin, Gustavo; Pazzi, Richard W; Ueyama, Jó

    2018-03-19

    The rise in the number and intensity of natural disasters is a serious problem that affects the whole world. The consequences of these disasters are significantly worse when they occur in urban districts because of the casualties and extent of the damage to goods and property that is caused. Until now feasible methods of dealing with this have included the use of wireless sensor networks (WSNs) for data collection and machine-learning (ML) techniques for forecasting natural disasters. However, there have recently been some promising new innovations in technology which have supplemented the task of monitoring the environment and carrying out the forecasting. One of these schemes involves adopting IP-based (Internet Protocol) sensor networks, by using emerging patterns for IoT. In light of this, in this study, an attempt has been made to set out and describe the results achieved by SENDI (System for dEtecting and forecasting Natural Disasters based on IoT). SENDI is a fault-tolerant system based on IoT, ML and WSN for the detection and forecasting of natural disasters and the issuing of alerts. The system was modeled by means of ns-3 and data collected by a real-world WSN installed in the town of São Carlos - Brazil, which carries out the data collection from rivers in the region. The fault-tolerance is embedded in the system by anticipating the risk of communication breakdowns and the destruction of the nodes during disasters. It operates by adding intelligence to the nodes to carry out the data distribution and forecasting, even in extreme situations. A case study is also included for flash flood forecasting and this makes use of the ns-3 SENDI model and data collected by WSN.

  16. Lightgrid-an agile distributed computing architecture for Geant4

    International Nuclear Information System (INIS)

    Young, Jason; Perry, John O.; Jevremovic, Tatjana

    2010-01-01

    A light weight grid based computing architecture has been developed to accelerate Geant4 computations on a variety of network architectures. This new software is called LightGrid. LightGrid has a variety of features designed to overcome current limitations on other grid based computing platforms, more specifically, smaller network architectures. By focusing on smaller, local grids, LightGrid is able to simplify the grid computing process with minimal changes to existing Geant4 code. LightGrid allows for integration between Geant4 and MySQL, which both increases flexibility in the grid as well as provides a faster, reliable, and more portable method for accessing results than traditional data storage systems. This unique method of data acquisition allows for more fault tolerant runs as well as instant results from simulations as they occur. The performance increases brought along by using LightGrid allow simulation times to be decreased linearly. LightGrid also allows for pseudo-parallelization with minimal Geant4 code changes.

  17. Cloud Computing for Rigorous Coupled-Wave Analysis

    Directory of Open Access Journals (Sweden)

    N. L. Kazanskiy

    2012-01-01

    Design and analysis of complex nanophotonic and nanoelectronic structures require significant computing resources. Cloud computing infrastructure allows distributed parallel applications to achieve greater scalability and fault tolerance. The problems of effective use of high-performance computing systems for modeling and simulation of subwavelength diffraction gratings are considered. Rigorous coupled-wave analysis (RCWA) is adapted to a cloud computing environment. In order to accomplish this, the data flow of the RCWA is analyzed and CPU-intensive operations are converted to data-intensive operations. The generated data sets are structured in accordance with the requirements of MapReduce technology.

  18. Cellular automaton decoders of topological quantum memories in the fault tolerant setting

    International Nuclear Information System (INIS)

    Herold, Michael; Eisert, Jens; Kastoryano, Michael J; Campbell, Earl T

    2017-01-01

    Active error decoding and correction of topological quantum codes—in particular the toric code—remains one of the most viable routes to large scale quantum information processing. In contrast, passive error correction relies on the natural physical dynamics of a system to protect encoded quantum information. However, the search is ongoing for a completely satisfactory passive scheme applicable to locally interacting two-dimensional systems. Here, we investigate dynamical decoders that provide passive error correction by embedding the decoding process into local dynamics. We propose a specific discrete time cellular-automaton decoder in the fault tolerant setting and provide numerical evidence showing that the logical qubit has a survival time extended by several orders of magnitude over that of a bare unencoded qubit. We stress that (asynchronous) dynamical decoding gives rise to a Markovian dissipative process. We hence equate cellular-automaton decoding to a fully dissipative topological quantum memory, which removes errors continuously. In this sense, uncontrolled and unwanted local noise can be corrected for by a controlled local dissipative process. We analyze the required resources, commenting on additional polylogarithmic factors beyond those incurred by an ideal constant resource dynamical decoder. (paper)

  19. Sensor fault-tolerant control for gear-shifting engaging process of automated manual transmission

    Science.gov (United States)

    Li, Liang; He, Kai; Wang, Xiangyu; Liu, Yahui

    2018-01-01

    The angular displacement sensor on the actuator of an automated manual transmission (AMT) is susceptible to faults, and a sensor fault disturbs normal control, which affects the entire gear-shifting process of the AMT and results in poor ride comfort. To solve this problem, this paper proposes a fault-tolerant control method for the AMT gear-shifting engaging process. Using the measured current of the actuator motor and the angular displacement of the actuator, a gear-shifting engaging load torque table is built and updated before the sensor fault occurs. Meanwhile, the residual between the estimated and measured angular displacements is used to detect the sensor fault: once the residual exceeds a predetermined fault threshold, the fault is detected. Switch control is then triggered, and the current observer and load torque table estimate the actual gear-shifting position, which replaces the measured one so that control of the gear-shifting process can continue. Numerical and experimental tests are carried out to evaluate the reliability and feasibility of the proposed methods, and the results show that the estimation and control performance is satisfactory.
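
    The residual test and switch-over described in this abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's controller: the function name, the scalar position signals, and the latching behaviour (once a fault is flagged, the estimate keeps being used) are all choices made here, and the observer that produces `estimated` is outside the sketch.

```python
def select_position(measured, estimated, threshold, fault_latched=False):
    """Choose which angular displacement to feed to the controller.

    Returns (position, fault_flag). A fault is declared when the
    residual |measured - estimated| exceeds `threshold`. Latching the
    flag is an assumption of this sketch, not stated in the abstract.
    """
    residual = abs(measured - estimated)
    fault = fault_latched or residual > threshold
    return (estimated if fault else measured), fault
```

    In a control loop, the returned flag would be passed back in as `fault_latched` on the next sample, so that a sensor reading that happens to drift back inside the threshold is not trusted again without an explicit reset.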

  20. Fault-tolerant nonlinear adaptive flight control using sliding mode online learning.

    Science.gov (United States)

    Krüger, Thomas; Schnetter, Philipp; Placzek, Robin; Vörsmann, Peter

    2012-08-01

    An expanded nonlinear model inversion flight control strategy using sliding mode online learning for neural networks is presented. The proposed control strategy is implemented for a small unmanned aircraft system (UAS). This class of aircraft is very susceptible towards nonlinearities like atmospheric turbulence, model uncertainties and of course system failures. Therefore, these systems mark a sensible testbed to evaluate fault-tolerant, adaptive flight control strategies. Within this work the concept of feedback linearization is combined with feed forward neural networks to compensate for inversion errors and other nonlinear effects. Backpropagation-based adaption laws of the network weights are used for online training. Within these adaption laws the standard gradient descent backpropagation algorithm is augmented with the concept of sliding mode control (SMC). Implemented as a learning algorithm, this nonlinear control strategy treats the neural network as a controlled system and allows a stable, dynamic calculation of the learning rates. While considering the system's stability, this robust online learning method therefore offers a higher speed of convergence, especially in the presence of external disturbances. The SMC-based flight controller is tested and compared with the standard gradient descent backpropagation algorithm in the presence of system failures. Copyright © 2012 Elsevier Ltd. All rights reserved.

  1. A review of fault tolerant control strategies applied to proton exchange membrane fuel cell systems

    Science.gov (United States)

    Dijoux, Etienne; Steiner, Nadia Yousfi; Benne, Michel; Péra, Marie-Cécile; Pérez, Brigitte Grondin

    2017-08-01

    Fuel cells are powerful systems for power generation. They have good efficiency and do not generate greenhouse gases. This technology involves many scientific fields, which leads to strongly inter-dependent parameters. This makes the system particularly hard to control and increases the frequency of faults. These two issues make it necessary to maintain the system performance at the expected level, even in faulty operating conditions. This is called "fault tolerant control" (FTC). The present paper aims to give the state of the art of FTC applied to the proton exchange membrane fuel cell (PEMFC). The FTC approach is composed of two parts. First, a diagnosis part allows the identification and the isolation of a fault; it requires good a priori knowledge of all the possible faults. Then, a control part allows an optimal control strategy to find the best operating point to recover from or mitigate the fault; it requires knowledge of the degradation phenomena and their mitigation strategies.

  2. An Autonomous Self-Aware and Adaptive Fault Tolerant Routing Technique for Wireless Sensor Networks

    Science.gov (United States)

    Abba, Sani; Lee, Jeong-A

    2015-01-01

    We propose an autonomous self-aware and adaptive fault-tolerant routing technique (ASAART) for wireless sensor networks. We address the limitations of self-healing routing (SHR) and self-selective routing (SSR) techniques for routing sensor data. We also examine the integration of autonomic self-aware and adaptive fault detection and resiliency techniques for route formation and route repair to provide resilience to errors and failures. We achieved this by using a combined continuous and slotted prioritized transmission back-off delay to obtain local and global network state information, as well as multiple random functions for attaining faster routing convergence and reliable route repair despite transient and permanent node failure rates and efficient adaptation to instantaneous network topology changes. The results of simulations based on a comparison of the ASAART with the SHR and SSR protocols for five different simulated scenarios in the presence of transient and permanent node failure rates exhibit a greater resiliency to errors and failure and better routing performance in terms of the number of successfully delivered network packets, end-to-end delay, delivered MAC layer packets, packet error rate, as well as efficient energy conservation in a highly congested, faulty, and scalable sensor network. PMID:26295236

  3. Cellular automaton decoders of topological quantum memories in the fault tolerant setting

    Science.gov (United States)

    Herold, Michael; Kastoryano, Michael J.; Campbell, Earl T.; Eisert, Jens

    2017-06-01

    Active error decoding and correction of topological quantum codes—in particular the toric code—remains one of the most viable routes to large scale quantum information processing. In contrast, passive error correction relies on the natural physical dynamics of a system to protect encoded quantum information. However, the search is ongoing for a completely satisfactory passive scheme applicable to locally interacting two-dimensional systems. Here, we investigate dynamical decoders that provide passive error correction by embedding the decoding process into local dynamics. We propose a specific discrete time cellular-automaton decoder in the fault tolerant setting and provide numerical evidence showing that the logical qubit has a survival time extended by several orders of magnitude over that of a bare unencoded qubit. We stress that (asynchronous) dynamical decoding gives rise to a Markovian dissipative process. We hence equate cellular-automaton decoding to a fully dissipative topological quantum memory, which removes errors continuously. In this sense, uncontrolled and unwanted local noise can be corrected for by a controlled local dissipative process. We analyze the required resources, commenting on additional polylogarithmic factors beyond those incurred by an ideal constant resource dynamical decoder.

  4. Fault Tolerant Analysis For Holonic Manufacturing Systems Based On Collaborative Petri Nets

    Directory of Open Access Journals (Sweden)

    Fu-Shiung Hsieh

    2003-04-01

    Uncertainties are significant characteristics of today's manufacturing systems. Holonic manufacturing systems are a new paradigm for handling uncertainties and changes in manufacturing environments. Among the many sources of uncertainty, failure-prone machines are one of the most important. This paper focuses on handling machine failures in holonic manufacturing systems. A machine failure reduces the number of available resources, so feasibility analysis needs to be conducted to check whether the work in process can be completed. To facilitate feasibility analysis, we characterize feasible conditions for systems with failure-prone machines. This paper combines the flexibility and robustness of multi-agent theory with the modeling and analytical power of Petri nets to adaptively synthesize Petri net agents to control holonic manufacturing systems. The main results include: (1) a collaborative Petri net (CPN) agent model for holonic manufacturing systems, (2) a feasible condition to test whether a certain type of machine failure is allowed based on collaborative Petri net agents and (3) fault tolerant analysis of the proposed method.

  5. Fault tolerant deterministic secure quantum communication using logical Bell states against collective noise

    Science.gov (United States)

    Wang, Chao; Liu, Jian-Wei; Chen, Xiu-Bo; Bi, Ya-Gang; Shang, Tao

    2015-04-01

    This study proposes two novel fault tolerant deterministic secure quantum communication (DSQC) schemes resistant to collective noise using logical Bell states. Either DSQC scheme is constructed based on a new coding function, which is designed by exploiting the property of the corresponding logical Bell states immune to collective-dephasing noise and collective-rotation noise, respectively. The secret message can be encoded by two simple unitary operations and decoded by merely performing Bell measurements, which can make the proposed scheme more convenient in practical applications. Moreover, the strategy of one-step quanta transmission, together with the technique of decoy logical qubits checking not only reduces the influence of other noise existing in a quantum channel, but also guarantees the security of the communication between two legitimate users. The final analysis shows that the proposed schemes are feasible and robust against various well-known attacks over the collective noise channel. Project supported by the National Natural Science Foundation of China (Grant Nos. 61272501, 61272514, 61170272, 61472048, 61402058, 61121061, and 61411146001), the Program for New Century Excellent Talents in University of China (Grant No. NCET-13-0681), the National Development Foundation for Cryptological Research (Grant No. MMJJ201401012), the Fok Ying Tong Education Foundation (Grant No. 131067), the Natural Science Foundation of Beijing (Grant Nos. 4132056 and 4152038), the Postdoctoral Science Foundation of China (Grant No. 2014M561826), and the National Key Basic Research Program, China (Grant No. 2012CB315905)

  6. Sensor and sensorless fault tolerant control for induction motors using a wavelet index.

    Science.gov (United States)

    Gaeid, Khalaf Salloum; Ping, Hew Wooi; Khalid, Mustafa; Masaoud, Ammar

    2012-01-01

    Fault Tolerant Control (FTC) systems are crucial in industry to ensure safe and reliable operation, especially of motor drives. This paper proposes the use of multiple controllers for an FTC system of an induction motor drive, selected based on a switching mechanism. The system switches between sensor vector control, sensorless vector control, closed-loop voltage by frequency (V/f) control and open-loop V/f control. Vector control offers high performance, while V/f is a simple, low-cost strategy with high speed and satisfactory performance. The faults dealt with are speed sensor failures, stator winding open circuits, shorts and minimum voltage faults. In the event of compound faults, a protection unit halts motor operation. The faults are detected using a wavelet index. For the sensorless vector control, a novel Boosted Model Reference Adaptive System (BMRAS) to estimate the motor speed is presented, which reduces tuning time. Both simulation results and experimental results with an induction motor drive show the scheme to be fast and effective for fault detection, while the control methods transition smoothly and ensure the effectiveness of the FTC system. The system is also shown to be flexible, reverting rapidly to the dominant controller if the motor returns to a healthy state.
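
    The switching mechanism described in this abstract can be pictured as a small selection function. The following is only a sketch under stated assumptions: the abstract names the four controllers, the fault types, and the halt on compound faults, but it does not say which fault selects which controller, so the mapping below, the `Controller` enum and the `select_controller` function are all hypothetical.

```python
from enum import Enum


class Controller(Enum):
    # The four control modes named in the abstract, plus the protection halt.
    SENSOR_VECTOR = 1      # dominant controller in the healthy state
    SENSORLESS_VECTOR = 2
    CLOSED_LOOP_VF = 3
    OPEN_LOOP_VF = 4
    HALT = 5               # protection unit stops the motor


def select_controller(speed_sensor_fault: bool,
                      stator_fault: bool,
                      min_voltage_fault: bool) -> Controller:
    """Pick a controller from fault flags.

    ASSUMPTION: the fault-to-controller mapping is illustrative only;
    the abstract does not specify it.
    """
    active_faults = sum([speed_sensor_fault, stator_fault, min_voltage_fault])
    if active_faults >= 2:
        # Compound faults: the abstract states that motor operation is halted.
        return Controller.HALT
    if speed_sensor_fault:
        return Controller.SENSORLESS_VECTOR
    if stator_fault:
        return Controller.CLOSED_LOOP_VF
    if min_voltage_fault:
        return Controller.OPEN_LOOP_VF
    # No fault flagged: revert to the dominant controller.
    return Controller.SENSOR_VECTOR
```

    Because the function is stateless, calling it again once every flag has cleared returns `Controller.SENSOR_VECTOR`, which mirrors the reversion to the dominant controller mentioned in the abstract.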

  7. An Autonomous Self-Aware and Adaptive Fault Tolerant Routing Technique for Wireless Sensor Networks.

    Science.gov (United States)

    Abba, Sani; Lee, Jeong-A

    2015-08-18

    We propose an autonomous self-aware and adaptive fault-tolerant routing technique (ASAART) for wireless sensor networks. We address the limitations of self-healing routing (SHR) and self-selective routing (SSR) techniques for routing sensor data. We also examine the integration of autonomic self-aware and adaptive fault detection and resiliency techniques for route formation and route repair to provide resilience to errors and failures. We achieved this by using a combined continuous and slotted prioritized transmission back-off delay to obtain local and global network state information, as well as multiple random functions for attaining faster routing convergence and reliable route repair despite transient and permanent node failure rates and efficient adaptation to instantaneous network topology changes. The results of simulations based on a comparison of the ASAART with the SHR and SSR protocols for five different simulated scenarios in the presence of transient and permanent node failure rates exhibit a greater resiliency to errors and failure and better routing performance in terms of the number of successfully delivered network packets, end-to-end delay, delivered MAC layer packets, packet error rate, as well as efficient energy conservation in a highly congested, faulty, and scalable sensor network.

  8. A Systematic Approach to Sensitivity Analysis of Fault Tolerant Systems in NMR Architecture

    Directory of Open Access Journals (Sweden)

    Kourosh Aslansefat

    2015-01-01

    A fault tree illustrates the ways in which a system can fail. It states the different ways in which combinations of faulty components result in an undesired event in the system. It is used in phases such as the design and operation of industrial systems, and it enables designers to evaluate dependability attributes such as reliability, MTTF and sensitivity. In addition, the fault tree is a systematic method for finding a system's bottlenecks and weak points. In spite of its extensive use in evaluating the reliability of systems, the fault tree is rarely used in calculating sensitivity. In the last decade, few studies have been conducted in this field, and the existing methods are not systematic and are not applicable to large-scale systems. This paper provides a systematic method for evaluating system sensitivity through the fault tree. It then introduces the sensitivity of the NMR architecture, one of the common fault-tolerance structures used for enhancing systems' reliability, safety and availability in industry. This article presents a comprehensive and parameterized formula for the sensitivity of the NMR structure. The presented method can help engineers who design and operate reliable systems to calculate sensitivity systematically and quickly by means of the fault tree.
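
    To make the NMR quantities concrete, here is a minimal sketch of the textbook case, not the parameterized formula from the article (which the abstract does not reproduce). It assumes N identical, independent modules with reliability r and a perfect majority voter, and it takes "sensitivity" to mean the derivative of system reliability with respect to r. The function names are hypothetical.

```python
from math import comb


def nmr_reliability(n: int, r: float) -> float:
    """Reliability of N-modular redundancy with a perfect majority voter.

    ASSUMPTION: n identical, independent modules, each with reliability r.
    The system works when at least a majority of modules work.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    majority = n // 2 + 1
    return sum(comb(n, k) * r**k * (1 - r)**(n - k)
               for k in range(majority, n + 1))


def nmr_sensitivity(n: int, r: float) -> float:
    """Derivative of nmr_reliability with respect to r.

    Uses the closed form for the derivative of a binomial tail:
    d/dr P(X >= m) = n * C(n-1, m-1) * r**(m-1) * (1-r)**(n-m).
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    majority = n // 2 + 1
    return (n * comb(n - 1, majority - 1)
            * r**(majority - 1) * (1 - r)**(n - majority))
```

    For triple modular redundancy (n = 3) these reduce to the familiar 3r^2 - 2r^3 and 6r(1 - r), so the sensitivity peaks at r = 0.5 and vanishes as r approaches 0 or 1.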

  9. Fault tolerant multi-sensor fusion based on the information gain

    Science.gov (United States)

    Hage, Joelle Al; El Najjar, Maan E.; Pomorski, Denis

    2017-01-01

    In the last decade, multi-robot systems have been used in several applications, for example the military, intervention in areas presenting danger to human life, the management of natural disasters, environmental monitoring, exploration and agriculture. The integrity of the robots' localization must be ensured so that they can achieve their mission in the best conditions. Robots are equipped with proprioceptive sensors (encoders, gyroscope) and exteroceptive sensors (Kinect). However, these sensors can be affected by various types of faults, which can be treated as erroneous measurements, bias, outliers, drifts, and so on. In the absence of a sensor fault diagnosis step, the integrity and the continuity of the localization are affected. In this work, we present a multi-sensor fusion approach with Fault Detection and Exclusion (FDE) based on information theory. In this context, we are interested in the information gain given by an observation, which may be relevant when dealing with the fault tolerance aspect. Moreover, threshold optimization based on the quantity of information given by a decision on the true hypothesis is highlighted.

  10. A Fault-Tolerant Multiple Sensor Fusion Approach Applied to UAV Attitude Estimation

    Directory of Open Access Journals (Sweden)

    Yu Gu

    2016-01-01

    A novel sensor fusion design framework is presented with the objective of improving the overall multisensor measurement system performance and achieving graceful degradation following individual sensor failures. The Unscented Information Filter (UIF) is used to provide a useful tool for combining information from multiple sources. A two-step off-line and on-line calibration procedure refines sensor error models and improves the measurement performance. A Fault Detection and Identification (FDI) scheme crosschecks sensor measurements and simultaneously monitors sensor biases. Low-quality or faulty sensor readings are then rejected from the final sensor fusion process. The attitude estimation problem is used as a case study for the multiple sensor fusion algorithm design, with information provided by a set of low-cost rate gyroscopes, accelerometers, magnetometers, and a single-frequency GPS receiver’s position and velocity solution. Flight data collected with an Unmanned Aerial Vehicle (UAV) research test bed verifies the sensor fusion, adaptation, and fault-tolerance capabilities of the designed sensor fusion algorithm.
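
    The idea of crosschecking redundant sensors and rejecting faulty readings before fusion can be illustrated with a much simpler stand-in than the Unscented Information Filter used in the paper. The sketch below fuses scalar readings of one quantity in information form (summing inverse variances) after a median-based consistency gate. The gate, its default threshold and the function name `fuse` are assumptions made for illustration and are not taken from the paper.

```python
from statistics import median


def fuse(measurements, variances, gate=3.0):
    """Fuse redundant scalar readings, excluding inconsistent sensors.

    ASSUMPTIONS: all sensors observe the same scalar quantity with known,
    independent noise variances; a sensor is rejected when its reading lies
    more than `gate` standard deviations from the median reading.

    Returns (estimate, variance, indices_of_sensors_used).
    """
    if not measurements or len(measurements) != len(variances):
        raise ValueError("need equally sized, non-empty inputs")
    reference = median(measurements)
    used = [i for i, (z, v) in enumerate(zip(measurements, variances))
            if abs(z - reference) <= gate * v ** 0.5]
    if not used:
        raise RuntimeError("all sensors were rejected")
    # Information form: information (inverse variance) adds across sensors.
    information = sum(1.0 / variances[i] for i in used)
    weighted_sum = sum(measurements[i] / variances[i] for i in used)
    return weighted_sum / information, 1.0 / information, used
```

    Calling `fuse([10.0, 10.2, 9.9, 25.0], [0.04] * 4)` drops the fourth reading and averages the remaining three, which is the graceful-degradation behavior the abstract describes in a far more general setting.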

  11. Fault-tolerant control with mixed aerodynamic surfaces and RCS jets for hypersonic reentry vehicles

    Directory of Open Access Journals (Sweden)

    Jingjing He

    2017-04-01

    This paper proposes a fault-tolerant strategy for hypersonic reentry vehicles with mixed aerodynamic surfaces and reaction control systems (RCS) under external disturbances and subject to actuator faults. Aerodynamic surfaces are treated as the primary actuator in normal situations, and they are driven by a continuous quadratic programming (QP) allocator to generate torque commanded by a nonlinear adaptive feedback control law. When aerodynamic surfaces encounter faults, they may not be able to provide sufficient torque as commanded, and RCS jets are activated to augment the aerodynamic surfaces to compensate for insufficient torque. Partial loss of effectiveness and stuck faults are considered in this paper, and observers are designed to detect and identify the faults. Based on the fault identification results, an RCS control allocator using integer linear programming (ILP) techniques is designed to determine the optimal combination of activated RCS jets. By treating the RCS control allocator as a quantization element, closed-loop stability with both continuous and quantized inputs is analyzed. Simulation results verify the effectiveness of the proposed method.
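
    Choosing which RCS jets to fire is a discrete on/off decision, which is why the paper formulates it as an integer program. As a rough illustration of that decision only, the sketch below replaces ILP with exhaustive enumeration over on/off combinations, which is practical only for a handful of jets. The squared-error cost, the tie-break toward fewer jets and the names `allocate_jets` and `healthy` are assumptions, not the paper's formulation.

```python
from itertools import product


def allocate_jets(jet_torques, residual, healthy=None):
    """Pick the on/off jet combination that best matches a residual torque.

    ASSUMPTIONS: each jet contributes a fixed 3-axis torque when on; the
    cost is the squared torque error, with ties broken in favor of fewer
    active jets; jets flagged unhealthy must stay off. Exhaustive search
    stands in for the ILP solver used in the paper.

    Returns a tuple of 0/1 flags, one per jet.
    """
    n = len(jet_torques)
    healthy = [True] * n if healthy is None else healthy
    best_combo, best_cost = None, None
    for combo in product((0, 1), repeat=n):
        if any(on and not ok for on, ok in zip(combo, healthy)):
            continue  # skip combinations that fire a faulty jet
        total = [sum(on * torque[axis] for on, torque in zip(combo, jet_torques))
                 for axis in range(3)]
        error = sum((t - r) ** 2 for t, r in zip(total, residual))
        cost = (error, sum(combo))
        if best_cost is None or cost < best_cost:
            best_combo, best_cost = combo, cost
    return best_combo
```

    With four hypothetical jets producing torques of +1 and -1 about the first two axes, a residual of (1, 1, 0) selects the two positive jets; if the first jet is marked unhealthy, the search falls back to the closest achievable torque.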

  12. Fault-tolerant embedded system design and optimization considering reliability estimation uncertainty

    International Nuclear Information System (INIS)

    Wattanapongskorn, Naruemon; Coit, David W.

    2007-01-01

    In this paper, we model embedded system design and optimization, considering component redundancy and uncertainty in the component reliability estimates. The systems being studied consist of software embedded in associated hardware components. Very often, component reliability values are not known exactly. Therefore, for reliability analysis studies and system optimization, it is meaningful to consider component reliability estimates as random variables with associated estimation uncertainty. In this new research, the system design process is formulated as a multiple-objective optimization problem to maximize an estimate of system reliability, and also, to minimize the variance of the reliability estimate. The two objectives are combined by penalizing the variance for prospective solutions. The two most common fault-tolerant embedded system architectures, N-Version Programming and Recovery Block, are considered as strategies to improve system reliability by providing system redundancy. Four distinct models are presented to demonstrate the proposed optimization techniques with or without redundancy. For many design problems, multiple functionally equivalent software versions have failure correlation even if they have been independently developed. The failure correlation may result from faults in the software specification, faults from a voting algorithm, and/or related faults from any two software versions. Our approach considers this correlation in formulating practical optimization models. Genetic algorithms with a dynamic penalty function are applied in solving this optimization problem, and reasonable and interesting results are obtained and discussed
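
    The way the two objectives are combined, by penalizing the variance of the reliability estimate, can be shown on the simplest possible case. The sketch below is not one of the four models in the paper: it assumes a series system without redundancy whose component reliability estimates are independent random variables with known means and variances, and it uses a linear penalty weight. The function names and the `penalty` parameter are hypothetical.

```python
from math import prod


def series_reliability_stats(means, variances):
    """Mean and variance of a product of independent reliability estimates.

    ASSUMPTION: series system, so system reliability is the product of the
    component reliabilities, and the estimates are mutually independent.
    For independent X_i: E[prod X_i] = prod E[X_i] and
    E[(prod X_i)^2] = prod (Var[X_i] + E[X_i]^2).
    """
    mean = prod(means)
    second_moment = prod(v + m * m for m, v in zip(means, variances))
    return mean, second_moment - mean * mean


def penalized_score(means, variances, penalty=1.0):
    """Single objective: estimated reliability minus a variance penalty.

    ASSUMPTION: a linear penalty; the paper's exact penalty form is not
    given in the abstract.
    """
    mean, variance = series_reliability_stats(means, variances)
    return mean - penalty * variance
```

    A candidate design with a slightly lower mean reliability can outrank one with a higher mean if its estimate is much less uncertain, which is the trade-off the penalized formulation is meant to expose.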

  13. Extreme temperature robust optical sensor designs and fault-tolerant signal processing

    Science.gov (United States)

    Riza, Nabeel Agha [Oviedo, FL; Perez, Frank [Tujunga, CA

    2012-01-17

    Silicon Carbide (SiC) probe designs for extreme temperature and pressure sensing use a single crystal SiC optical chip encased in a sintered SiC material probe. The SiC chip may be protected for high temperature only use or exposed for both temperature and pressure sensing. Hybrid signal processing techniques allow fault-tolerant extreme temperature sensing. Wavelength peak-to-peak (or null-to-null) collective spectrum spread measurement to detect wavelength peak/null shift measurement forms a coarse-fine temperature measurement using broadband spectrum monitoring. The SiC probe frontend acts as a stable emissivity black-body radiator, and monitoring the shift in radiation spectrum enables a pyrometer. This application combines all-SiC pyrometry with thick SiC etalon laser interferometry within a free-spectral range to form a coarse-fine temperature measurement sensor. RF notch filtering techniques improve the sensitivity of the temperature measurement where fine spectral shift or spectrum measurements are needed to deduce temperature.

  14. Fault-tolerance performance evaluation of fieldbus for NPCS network of KNGR

    International Nuclear Information System (INIS)

    Jung, Hyun Gi

    1999-02-01

    In contrast with conventional fieldbus research, which focuses merely on real-time performance, this study aims to evaluate the real-time performance of the communication system including fault-tolerant mechanisms. Maintaining performance in the presence of recoverable faults is very important because the communication network will be applied to next-generation NPPs (Nuclear Power Plants). In order to guarantee the performance of the NPP communication network, the time characteristics of the target system in the presence of recoverable faults should be investigated. If the time characteristics meet the requirements of the system, the faults will be recovered by fieldbus recovery mechanisms and the system will be safe. If the time characteristics cannot meet the requirements, the faults in the fieldbus can propagate to system failure. In this study, for the purpose of investigating the time characteristics of the fieldbus, the recoverable faults are classified, and then formulas that represent delays including recovery mechanisms and a simulation model are developed. In order to validate the proposed approach, the simulation model is applied to the Korea Next Generation Reactor (KNGR) NSSS Process Control System (NPCS). The results of the simulation provide reasonable delay characteristics of the fault cases with recovery mechanisms. Using the outcome of the simulation and the system requirements, we can also calculate the failure propagation probability from the fieldbus to the outer system.

  15. Open-Switch Fault Diagnosis and Fault Tolerant for Matrix Converter with Finite Control Set-Model Predictive Control

    DEFF Research Database (Denmark)

    Peng, Tao; Dan, Hanbing; Yang, Jian

    2016-01-01

    To improve the reliability of the matrix converter (MC), a fault diagnosis method to identify single open-switch fault is proposed in this paper. The introduced fault diagnosis method is based on finite control set-model predictive control (FCS-MPC), which employs a time-discrete model of the MC...... topology and a cost function to select the best switching state for the next sampling period. The proposed fault diagnosis method is realized by monitoring the load currents and judging the switching state to locate the faulty switch. Compared to the conventional modulation strategies such as carrier......-switch fault conditions without any redundant hardware, a fault tolerant strategy based on predictive control is also studied. The fault tolerant strategy is to select the most appropriate switching state, associated with the remaining normal switches of the MC. Experiment results are presented to show...

  16. Fault Diagnosis and Fault-tolerant Control of Modular Multi-level Converter High-voltage DC System

    DEFF Research Database (Denmark)

    Liu, Hui; Ma, Ke; Wang, Chao

    2016-01-01

    device fault, DC line faults as well as AC grid faults. Special attention is given to the comparison of the corresponding fault diagnosis and fault-tolerant control approaches. Further, focus is dedicated to control/protection strategies and topologies with fault ride-through capability for MMC...... of failures and lower the reliability of the MMC-HVDC system. Therefore, research on the fault diagnosis and fault-tolerant control of the MMC-HVDC system is of great significance in order to enhance the reliability of the system. This paper provides a comprehensive review of fault diagnosis and fault handling...... strategies of MMC-HVDC systems for the most common faults occurring in MMC-HVDC systems covering MMC faults, DC side faults as well as AC side faults. An important part of this paper is devoted to a discussion of the vulnerable spots as well as failure mechanism of the MMC-HVDC system covering switching...

  17. Scheduling and Voltage Scaling for Energy/Reliability Trade-offs in Fault-Tolerant Time-Triggered Embedded Systems

    DEFF Research Database (Denmark)

    Pop, Paul; Poulsen, Kåre Harbo; Izosimov, Viacheslav

    2007-01-01

    are satisfied and the energy is minimized. We present a constraint logic programming- based approach which is able to find reliable and schedulable implementations within limited energy and hardware resources. The developed algorithms have been evaluated using extensive experiments....... transient faults. Addressing simultaneously energy and reliability is especially challenging because lowering the voltage to reduce the energy consumption has been shown to exponentially increase the number of transient faults. In addition, time-redundancy based fault-tolerance techniques such as re...

  18. Byzantine-fault tolerant self-stabilizing protocol for distributed clock synchronization systems

    Science.gov (United States)

    Malekpour, Mahyar R. (Inventor)

    2010-01-01

    A rapid Byzantine self-stabilizing clock synchronization protocol that self-stabilizes from any state, tolerates bursts of transient failures, and deterministically converges within a linear convergence time with respect to the self-stabilization period. Upon self-stabilization, all good clocks proceed synchronously. The Byzantine self-stabilizing clock synchronization protocol does not rely on any assumptions about the initial state of the clocks. Furthermore, there is neither a central clock nor an externally generated pulse system. The protocol converges deterministically, is scalable, and self-stabilizes in a short amount of time. The convergence time is linear with respect to the self-stabilization period.

  19. Distributed Sensing with Fault-Tolerant Resource Reallocation for Disaster Area Assessment

    Science.gov (United States)

    2010-05-01

    mobile ad-hoc networks,” Ad Hoc Networks, vol. In Press, Corrected Proof. 31. A. Mishra, K. Nadkarni, and A. Patcha, “Intrusion detection in wireless ad...hoc networks,” IEEE Wireless Communications, vol. 11, no. 1, 2004, pp. 48-60. 32. K. Nadkarni, and A. Mishra, “Intrusion detection in MANETS - the

  20. Datacollection And Fault Tolerant Design Of Iot Devices Over A Distributed Network System

    Directory of Open Access Journals (Sweden)

    Bharadwaj Turlapati

    2017-10-01

    In a world where the need to connect and communicate with devices has never been greater, the Internet of Things demands a design strategy that ensures communication between these devices is reliable, maintainable and scalable. Given the many possible combinations of devices and solutions on offer, this paper presents a solution with a working use case to design the system; check its reliability, throughput, maintainability and scalability; and address the issues in the current system and how this design helps to overcome them.

  1. Energy/Reliability Trade-offs in Fault-Tolerant Event-Triggered Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Gan, Junhe; Gruian, Flavius; Pop, Paul

    2011-01-01

    and reliability simultaneously is especially challenging, since lowering the voltage to reduce the energy consumption has been shown to increase the transient fault rate. We presented a Tabu Search-based approach which uses an energy/reliability trade-off model to find reliable and schedulable implementations...... with limited energy and hardware resources. We evaluated the proposed algorithm using several synthetic and real-life benchmarks....... task, such that transient faults are tolerated, the timing constraints of the application are satisfied, and the energy consumed is minimized. Tasks are scheduled using fixed-priority preemptive scheduling, while replication is used for recovery from multiple transient faults. Addressing energy...

  2. Computer Sciences and Data Systems, volume 1

    Science.gov (United States)

    1987-01-01

    Topics addressed include: software engineering; university grants; institutes; concurrent processing; sparse distributed memory; distributed operating systems; intelligent data management processes; expert system for image analysis; fault tolerant software; and architecture research.

  3. Fault-tolerant computing with biased-noise superconducting qubits: a case study

    International Nuclear Information System (INIS)

    Aliferis, P; Brito, F; DiVincenzo, D P; Steffen, M; Terhal, B M; Preskill, J

    2009-01-01

    We present a universal scheme of pulsed operations suitable for the IBM oscillator-stabilized flux qubit comprising the controlled-σ^z (cphase) gate, single-qubit preparations and measurements. Based on numerical simulations, we argue that the error rates for these operations can be as low as about 0.5% and that noise is highly biased, with phase errors being stronger than all other types of errors by a factor of nearly 10^3. In contrast, the design of a controlled-σ^x (cnot) gate for this system with an error rate of less than about 1.2% seems extremely challenging. We propose a special encoding that exploits the noise bias allowing us to implement a logical cnot gate where phase errors and all other types of errors have nearly balanced rates of about 0.4%. Our results illustrate how the design of an encoding scheme can be adjusted and optimized according to the available physical operations and the particular noise characteristics of experimental devices.

  4. Decentralized Computing Technology for Fault-Tolerant, Survivable C3I systems. Volume 2

    Science.gov (United States)

    1990-06-01

    the operation invocation. [Figure residue from "Functional Description, Alpha Release I Programming Model", p. A-61: Object1 to Object4 placed on nodes i, j and k; panels (a) and (b) are both labeled NODE FAILURE, and one panel marks an orphan.]

  5. YF22 Model With On-Board On-Line Learning Microprocessors-Based Neural Algorithms for Autopilot and Fault-Tolerant Flight Control Systems

    National Research Council Canada - National Science Library

    Napolitano, Marcello

    2002-01-01

    This project focused on investigating the potential of on-line learning 'hardware-based' neural approximators and controllers to provide fault tolerance capabilities following sensor and actuator failures...

  6. A distributed computer system for digitising machines

    International Nuclear Information System (INIS)

    Bairstow, R.; Barlow, J.; Waters, M.; Watson, J.

    1977-07-01

    This paper describes a Distributed Computing System, based on micro computers, for the monitoring and control of digitising tables used by the Rutherford Laboratory Bubble Chamber Research Group in the measurement of bubble chamber photographs. (author)

  7. Intelligent on-line fault tolerant control for unanticipated catastrophic failures.

    Science.gov (United States)

    Yen, Gary G; Ho, Liang-Wei

    2004-10-01

    As dynamic systems become increasingly complex, experience rapidly changing environments, and encounter a greater variety of unexpected component failures, solving the control problems of such systems is a grand challenge for control engineers. Traditional control design techniques are not adequate to cope with these systems, which may suffer from unanticipated dynamic failures. In this research work, we investigate the on-line fault tolerant control problem and propose an intelligent on-line control strategy to handle the desired trajectories tracking problem for systems suffering from various unanticipated catastrophic faults. Through theoretical analysis, the sufficient condition of system stability has been derived and two different on-line control laws have been developed. The approach of the proposed intelligent control strategy is to continuously monitor the system performance and identify what the system's current state is by using a fault detection method based upon our best knowledge of the nominal system and nominal controller. Once a fault is detected, the proposed intelligent controller will adjust its control signal to compensate for the unknown system failure dynamics by using an artificial neural network as an on-line estimator to approximate the unexpected and unknown failure dynamics. The first control law is derived directly from the Lyapunov stability theory, while the second control law is derived based upon the discrete-time sliding mode control technique. Both control laws have been implemented in a variety of failure scenarios to validate the proposed intelligent control scheme. The simulation results, including a three-tank benchmark problem, comply with theoretical analysis and demonstrate a significant improvement in trajectory following performance based upon the proposed intelligent control strategy.

  8. Definition and trade-off study of reconfigurable airborne digital computer system organizations

    Science.gov (United States)

    Conn, R. B.

    1974-01-01

    A highly reliable, fault-tolerant reconfigurable computer system for aircraft applications was developed. The development and application of reliability and fault-tolerance assessment techniques are described. Particular emphasis is placed on the needs of an all-digital, fly-by-wire control system appropriate for a passenger-carrying airplane.

  9. Study, design and realization of a fault-tolerant and predictable synchronous communication protocol on off-the-shelf components; Etude, conception et mise en oeuvre d'un protocole de communication synchrone tolerant aux fautes et predictible sur des composants reseaux standards

    Energy Technology Data Exchange (ETDEWEB)

    Chabrol, D

    2006-06-15

    This PhD thesis contributes to the design and realization of safety-critical real-time systems on multiprocessor architectures with distributed memory. Such architectures are essential for computing systems that have to perform complex and critical functions. The thesis deals with the management of the communication media, which strongly conditions the capability of the system to fulfill the timeliness property and the dependability requirements. Our contribution includes: - The design of a predictable and fault-tolerant synchronous communication protocol; - The study and definition of an execution model for efficient and safe communication management; - The proposal of a method to automatically generate the communication schedule. Our approach is based on a communication model that allows the feasibility of a distributed safety-critical real-time system with timeliness and safety requirements to be analyzed before execution. This leads to the definition of an execution model based on time-triggered and parallel communication management. A system of linear constraints is generated automatically to compute the network schedule and the network load while fulfilling the timeliness requirements. The proposed communication interface is based on an advanced version of the TDMA protocol, which allows the use of proprietary components (TTP, FlexRay) as well as standard components (Ethernet). The concepts presented in this thesis led to the realization and evaluation of a prototype within the framework of the OASIS project at CEA/List. (author)

  10. Distributed computing and nuclear reactor analysis

    International Nuclear Information System (INIS)

    Brown, F.B.; Derstine, K.L.; Blomquist, R.N.

    1994-01-01

    Large-scale scientific and engineering calculations for nuclear reactor analysis can now be carried out effectively in a distributed computing environment, at costs far lower than for traditional mainframes. The distributed computing environment must include support for traditional system services, such as a queuing system for batch work, reliable filesystem backups, and parallel processing capabilities for large jobs. All ANL computer codes for reactor analysis have been adapted successfully to a distributed system based on workstations and X-terminals. Distributed parallel processing has been demonstrated to be effective for long-running Monte Carlo calculations.

  11. Bayesian optimization for computationally extensive probability distributions.

    Science.gov (United States)

    Tamura, Ryo; Hukushima, Koji

    2018-01-01

    An efficient method for finding a better maximizer of computationally extensive probability distributions is proposed on the basis of a Bayesian optimization technique. A key idea of the proposed method is to use extreme values of acquisition functions by Gaussian processes for the next training phase, which should be located near a local maximum or a global maximum of the probability distribution. Our Bayesian optimization technique is applied to the posterior distribution in the effective physical model estimation, which is a computationally extensive probability distribution. Even when the number of sampling points on the posterior distributions is fixed to be small, the Bayesian optimization provides a better maximizer of the posterior distributions in comparison to those by the random search method, the steepest descent method, or the Monte Carlo method. Furthermore, the Bayesian optimization improves the results efficiently by combining the steepest descent method and thus it is a powerful tool to search for a better maximizer of computationally extensive probability distributions.

  12. FEL simulations using distributed computing

    NARCIS (Netherlands)

    Einstein, J.; Biedron, S.G.; Freund, H.P.; Milton, S.V.; Van Der Slot, P. J M; Bernabeu, G.

    2016-01-01

    While simulation tools are available and have been used regularly for simulating light sources, including Free-Electron Lasers, the increasing availability and lower cost of accelerated computing opens up new opportunities. This paper highlights a method of how accelerating and parallelizing code

  13. Thermoelectric-Driven Sustainable Sensing and Actuation Systems for Fault-Tolerant Nuclear Incidents

    Energy Technology Data Exchange (ETDEWEB)

    Longtin, Jon [Stony Brook Univ., NY (United States)

    2016-02-08

    safety systems, etc. Such an approach is intrinsically fault tolerant: in the event that system temperatures increase, the amount of available energy will increase, which will make more power available for applications. The system can also be used during normal conditions to provide enhanced monitoring of key system components.

  14. Thermoelectric-Driven Sustainable Sensing and Actuation Systems for Fault-Tolerant Nuclear Incidents

    International Nuclear Information System (INIS)

    Longtin, Jon

    2015-09-01

    safety systems, etc. Such an approach is intrinsically fault tolerant: in the event that system temperatures increase, the amount of available energy will increase, which will make more power available for applications. The system can also be used during normal conditions to provide enhanced monitoring of key system components.

  15. A design of fault tolerant flight control systems for sensor and actuator failures using on-line learning neural networks

    Science.gov (United States)

    An, Younghwan

    The research in this document focuses on the performance of a neural network-based fault tolerant system within a flight control system. This fault tolerant flight control system integrates sensor and actuator failure detection, identification, and accommodation (SFDIA and AFDIA). The SFDIA task is achieved by incorporating a main neural network (MNN) and a set of n decentralized neural networks (DNNs) for a system with n sensors assumed to be without physical redundancy. Particularly, the purpose of the MNN is to detect a wide variety of sensor failures while the purpose of the DNNs is to identify the particular sensor that has failed and accommodate for the failure. The AFDIA scheme also implements a MNN with three neural network controllers (NNCs). The function of NNCs is to regain equilibrium and to compensate for the pitching, rolling, and yawing moments induced by the failure. The NNs are trained on-line using the Extended Back-Propagation Algorithm (EBPA). Because of the on-line learning, neural estimators and controllers have the capability of adapting to changes in the aircraft dynamics and/or modeling discrepancies between the actual aircraft and its mathematical model. This factor makes neural estimators and controllers an attractive option for fault tolerant flight control system. Particular emphasis is placed in this study toward improving the performance of the SFDIA scheme in the presence of ramp-type soft failures which are hard to detect as well as achieving an efficient integration between SFDIA and AFDIA without degradation of performance in terms of false alarm rates and incorrect failure identification.

  16. Study on the systematic approach of Markov modeling for dependability analysis of complex fault-tolerant features with voting logics

    International Nuclear Information System (INIS)

    Son, Kwang Seop; Kim, Dong Hoon; Kim, Chang Hwoi; Kang, Hyun Gook

    2016-01-01

    The Markov analysis is a technique for modeling system state transitions and calculating the probability of reaching various system states. While it is a proper tool for modeling complex system designs involving timing, sequencing, repair, redundancy, and fault tolerance, as the complexity or size of the system increases, so does the number of states of interest, leading to difficulty in constructing and solving the Markov model. This paper introduces a systematic approach of Markov modeling to analyze the dependability of a complex fault-tolerant system. This method is based on the decomposition of the system into independent subsystem sets, and the system-level failure rate and the unavailability rate for the decomposed subsystems. A Markov model for the target system is easily constructed using the system-level failure and unavailability rates for the subsystems, which can be treated separately. This approach can decrease the number of states to consider simultaneously in the target system by building Markov models of the independent subsystems stage by stage, and results in an exact solution for the Markov model of the whole target system. To apply this method we construct a Markov model for the reactor protection system found in nuclear power plants, a system configured with four identical channels and various fault-tolerant architectures. The results show that the proposed method in this study treats the complex architecture of the system in an efficient manner using the merits of the Markov model, such as a time dependent analysis and a sequential process analysis. - Highlights: • Systematic approach of Markov modeling for system dependability analysis is proposed based on the independent subsystem set, its failure rate and unavailability rate. • As an application example, we construct the Markov model for the digital reactor protection system configured with four identical and independent channels, and various fault-tolerant architectures. • The
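    As a small illustration of the kind of model the abstract refers to, the sketch below treats a single repairable channel as a two-state Markov model (up/down) and computes its unavailability over time. It is not the decomposition method of the paper; the constant failure rate `lam`, the constant repair rate `mu`, and both function names are assumptions introduced here for illustration only.

```python
import math


def unavailability(lam, mu, t):
    """Closed-form P(down at time t) for a two-state model that starts 'up'.

    lam: assumed constant failure rate (per hour)
    mu:  assumed constant repair rate (per hour)
    """
    total = lam + mu
    return lam / total * (1.0 - math.exp(-total * t))


def unavailability_euler(lam, mu, t, steps=100_000):
    """Same quantity by forward-Euler integration of the state equations."""
    p_up, p_down = 1.0, 0.0
    dt = t / steps
    for _ in range(steps):
        # Net probability flow from 'up' to 'down' during one time step.
        flow = (lam * p_up - mu * p_down) * dt
        p_up -= flow
        p_down += flow
    return p_down


if __name__ == "__main__":
    lam, mu = 1e-3, 1e-1
    print(unavailability(lam, mu, 10.0))
    print(unavailability_euler(lam, mu, 10.0))
    print(lam / (lam + mu))  # steady-state unavailability
```

    For large t both functions approach the steady-state value lam / (lam + mu); a subsystem summarized by such system-level rates could then, in principle, be treated as one block in a larger model.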

  17. Microcontroller-Based Fault Tolerant Data Acquisition System For Air Quality Monitoring And Control Of Environmental Pollution

    Directory of Open Access Journals (Sweden)

    Tochukwu Chiagunye

    2015-08-01

    The design applied passive fault tolerance to a microcontroller-based data acquisition system: redundant sensors and microcontrollers with associated circuitry were designed and implemented to enable measurement of pollutant concentration information from chimney vents in two industries. Microsoft Visual Basic was used to develop a data mining tool which implemented an underlying artificial neural network (ANN) model for forecasting pollutant concentrations for future time periods. The feed-forward back-propagation method was used to train the ANN model with a training data set, while a decision tree algorithm was used to select an optimal output result for the model from its two output neurons.

  18. Synthesis of Flexible Fault-Tolerant Schedules with Preemption for Mixed Soft and Hard Real-Time Systems

    DEFF Research Database (Denmark)

    Izosimov, Viacheslav; Pop, Paul; Eles, Petru

    2008-01-01

    In this paper we present an approach for scheduling with preemption for fault-tolerant embedded systems composed of soft and hard real-time processes. We aim to maximize the overall utility for average, most likely to happen, scenarios and to guarantee the deadlines for the hard...... as a method to generate flexible schedules that maximize the overall utility for the average case while guaranteeing timing constraints in the worst case. Our scheduling algorithms determine off-line when to preempt and when to resurrect processes. The experimental results show the superiority of our new...

  19. Robust adaptive fault-tolerant control for leader-follower flocking of uncertain multi-agent systems with actuator failure.

    Science.gov (United States)

    Yazdani, Sahar; Haeri, Mohammad

    2017-11-01

    In this work, we study the flocking problem of multi-agent systems with uncertain dynamics subject to actuator failure and external disturbances. Under some standard assumptions, we propose a robust adaptive fault-tolerant protocol that compensates for the actuator bias fault, the partial loss of actuator effectiveness fault, the model uncertainties, and external disturbances. Under the designed protocol, convergence of the agents' velocities to that of the virtual leader is guaranteed, while connectivity preservation of the network and collision avoidance among agents are ensured as well.

  20. Next generation distributed computing for cancer research.

    Science.gov (United States)

    Agarwal, Pankaj; Owzar, Kouros

    2014-01-01

    Advances in next generation sequencing (NGS) and mass spectrometry (MS) technologies have provided many new opportunities and angles for extending the scope of translational cancer research while creating tremendous challenges in data management and analysis. The resulting informatics challenge is invariably not amenable to the use of traditional computing models. Recent advances in scalable computing and associated infrastructure, particularly distributed computing for Big Data, can provide solutions for addressing these challenges. In this review, the next generation of distributed computing technologies that can address these informatics problems is described from the perspective of three key components of a computational platform, namely computing, data storage and management, and networking. A broad overview of scalable computing is provided to set the context for a detailed description of Hadoop, a technology that is being rapidly adopted for large-scale distributed computing. A proof-of-concept Hadoop cluster, set up for performance benchmarking of NGS read alignment, is described as an example of how to work with Hadoop. Finally, Hadoop is compared with a number of other current technologies for distributed computing.

  1. Efficient job handling in the GRID short deadline, interactivity, fault tolerance and parallelism

    CERN Document Server

    Moscicki, Jakub

    2006-01-01

    The major GRID infrastructures are designed mainly for batch-oriented computing with coarse-grained jobs and relatively high job turnaround time. However, many practical applications in natural and physical sciences may be easily parallelized and run as a set of smaller tasks which require little or no synchronization and which may be scheduled in a more efficient way. The Distributed Analysis Environment Framework (DIANE) is a Master-Worker execution skeleton for applications, which complements the GRID middleware stack. Automatic failure recovery and task dispatching policies enable easy customization of the behaviour of the framework in a dynamic and non-reliable computing environment. We demonstrate the experience of using the framework with several diverse real-life applications, including Monte Carlo Simulation, Physics Data Analysis and Biotechnology. The interfacing of existing sequential applications from the point of view of a non-expert user is made easy, also for legacy applications. We analyze th...
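    The Master-Worker pattern with automatic failure recovery mentioned above can be sketched generically as follows. This is not DIANE's API: `run_master`, the round-robin dispatch policy, and the `max_attempts` limit are hypothetical choices made only to show how a master might re-queue a task when a worker fails.

```python
from collections import deque


def run_master(tasks, workers, max_attempts=3):
    """Dispatch tasks round-robin; re-queue a task when its worker raises.

    tasks:   iterable of hashable task descriptions
    workers: list of callables, each standing in for a remote worker
    Returns (results, failed), where failed lists tasks that exhausted
    max_attempts.
    """
    queue = deque((task, 1) for task in tasks)
    results, failed = {}, []
    turn = 0
    while queue:
        task, attempt = queue.popleft()
        worker = workers[turn % len(workers)]  # hypothetical dispatch policy
        turn += 1
        try:
            results[task] = worker(task)
        except Exception:
            # Failure recovery: retry later, possibly on another worker.
            if attempt < max_attempts:
                queue.append((task, attempt + 1))
            else:
                failed.append(task)
    return results, failed


if __name__ == "__main__":
    def flaky(task):
        raise RuntimeError("worker lost")

    def square(task):
        return task * task

    print(run_master([1, 2, 3], [flaky, square]))
```

    A real framework would also have to detect silent worker loss (for example with timeouts), which this in-process sketch does not model.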

  2. Enhanced Bully Algorithm for Leader Node Election in Synchronous Distributed Systems

    Directory of Open Access Journals (Sweden)

    Md. Golam Murshed

    2012-06-01

    In distributed computing systems, if an elected leader node fails, the other nodes of the system need to elect another leader. The bully algorithm is a classical approach for electing a leader in a synchronous distributed computing system. This paper presents an enhancement of the bully algorithm, requiring less time complexity and minimum message passing. This significant gain has been achieved by introducing node sets and tie breaker time. The latter provides a possible solution to simultaneous elections initiated by different nodes. In comparison with the classical algorithm and its existing modifications, this proposal generates minimum messages, stops redundant elections, and maintains fault-tolerant behaviour of the system.
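    For context, the classical bully algorithm that the paper sets out to improve can be simulated in a few lines. The sketch below covers only the classical scheme, not the enhanced version with node sets and tie-breaker time; the function name, the integer node IDs, and the message-counting convention (ELECTION, OK, and COORDINATOR messages each count as one) are assumptions made for illustration.

```python
from collections import deque


def bully_election(all_ids, alive, initiator):
    """Simulate one classical bully election in a synchronous system.

    all_ids:   every node ID in the system (higher ID = higher priority)
    alive:     IDs of nodes that are currently up
    initiator: live node that noticed the leader failure
    Returns (leader, messages_sent).
    """
    alive = set(alive)
    if initiator not in alive:
        raise ValueError("initiator must be alive")
    messages = 0
    leader = None
    started = set()
    queue = deque([initiator])
    while queue:
        node = queue.popleft()
        if node in started:
            continue  # this node already ran its own election
        started.add(node)
        higher = [n for n in all_ids if n > node]
        messages += len(higher)      # ELECTION to every higher ID
        responders = [n for n in higher if n in alive]
        messages += len(responders)  # OK replies from live nodes
        if responders:
            queue.extend(responders)  # each responder starts an election
        else:
            leader = node             # nobody higher answered
            messages += sum(1 for n in all_ids if n < node)  # COORDINATOR
    return leader, messages


if __name__ == "__main__":
    # Node 5 (the old leader) has failed; node 2 starts the election.
    print(bully_election([1, 2, 3, 4, 5], {1, 2, 3, 4}, initiator=2))
```

    The repeated elections started by every responder are the redundancy that modifications of the algorithm typically try to eliminate.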

  3. Distributed computing by oblivious mobile robots

    CERN Document Server

    Flocchini, Paola; Santoro, Nicola

    2012-01-01

    The study of what can be computed by a team of autonomous mobile robots, originally started in robotics and AI, has become increasingly popular in theoretical computer science (especially in distributed computing), where it is now an integral part of the investigations on computability by mobile entities. The robots are identical computational entities located and able to move in a spatial universe; they operate without explicit communication and are usually unable to remember the past; they are extremely simple, with limited resources, and individually quite weak. However, collectively the ro

  4. Distributed computer systems theory and practice

    CERN Document Server

    Zedan, H S M

    2014-01-01

    Distributed Computer Systems: Theory and Practice is a collection of papers dealing with the design and implementation of operating systems, including distributed systems such as Amoeba, Argus, Andrew, and Grapevine. One paper discusses the concepts and notations for concurrent programming, particularly language notation used in computer programming and synchronization methods, and also compares three classes of languages. Another paper explains load balancing or load redistribution to improve system performance, namely, static balancing and adaptive load balancing. For program effici

  5. Impossibility results for distributed computing

    CERN Document Server

    Attiya, Hagit

    2014-01-01

    To understand the power of distributed systems, it is necessary to understand their inherent limitations: what problems cannot be solved in particular systems, or without sufficient resources (such as time or space). This book presents key techniques for proving such impossibility results and applies them to a variety of different problems in a variety of different system models. Insights gained from these results are highlighted, aspects of a problem that make it difficult are isolated, features of an architecture that make it inadequate for solving certain problems efficiently are identified

  6. ATLAS Distributed Computing in LHC Run2

    CERN Document Server

    Campana, Simone; The ATLAS collaboration

    2015-01-01

    The ATLAS Distributed Computing infrastructure has evolved after the first period of LHC data taking in order to cope with the challenges of the upcoming LHC Run2. An increased data rate and computing demands of the Monte-Carlo simulation, as well as new approaches to ATLAS analysis, dictated a more dynamic workload management system (ProdSys2) and data management system (Rucio), overcoming the boundaries imposed by the design of the old computing model. In particular, the commissioning of new central computing system components was the core part of the migration toward the flexible computing model. The flexible computing utilization exploring the opportunistic resources such as HPC, cloud, and volunteer computing is embedded in the new computing model, the data access mechanisms have been enhanced with the remote access, and the network topology and performance is deeply integrated into the core of the system. Moreover a new data management strategy, based on defined lifetime for each dataset, has been defin...

  7. LHCb: LHCb Distributed Computing Operations

    CERN Multimedia

    Stagni, F

    2011-01-01

    The proliferation of tools for monitoring both activities and infrastructure, together with the pressing need for prompt reaction in case of problems impacting data taking, data reconstruction, data reprocessing and user analysis, led to the need to better organize the huge amount of information available. The monitoring system for LHCb Grid Computing relies on many heterogeneous and independent sources of information offering different views for a better understanding of problems, while an operations team and defined procedures have been put in place to handle them. This work summarizes the state of the art of LHCb Grid operations, emphasizing the reasons that led to various choices and the tools currently in use to run our daily activities. We highlight the most common problems experienced across years of activities on the WLCG infrastructure, the services with their criticality, the procedures in place, the relevant metrics, the tools available and the ones still missing.

  8. CMS distributed computing workflow experience

    International Nuclear Information System (INIS)

    Adelman-McCarthy, Jennifer; Gutsche, Oliver; Haas, Jeffrey D; Prosper, Harrison B; Dutta, Valentina; Gomez-Ceballos, Guillelmo; Hahn, Kristian; Klute, Markus; Mohapatra, Ajit; Spinoso, Vincenzo; Kcira, Dorian; Caudron, Julien; Liao Junhui; Pin, Arnaud; Schul, Nicolas; Lentdecker, Gilles De; McCartin, Joseph; Vanelderen, Lukas; Janssen, Xavier; Tsyganov, Andrey

    2011-01-01

    The vast majority of the CMS Computing capacity, which is organized in a tiered hierarchy, is located away from CERN. The 7 Tier-1 sites archive the LHC proton-proton collision data that is initially processed at CERN. These sites provide access to all recorded and simulated data for the Tier-2 sites, via wide-area network (WAN) transfers. All central data processing workflows are executed at the Tier-1 level, which contain re-reconstruction and skimming workflows of collision data as well as reprocessing of simulated data to adapt to changing detector conditions. This paper describes the operation of the CMS processing infrastructure at the Tier-1 level. The Tier-1 workflows are described in detail. The operational optimization of resource usage is described. In particular, the variation of different workflows during the data taking period of 2010, their efficiencies and latencies as well as their impact on the delivery of physics results is discussed and lessons are drawn from this experience. The simulation of proton-proton collisions for the CMS experiment is primarily carried out at the second tier of the CMS computing infrastructure. Half of the Tier-2 sites of CMS are reserved for central Monte Carlo (MC) production while the other half is available for user analysis. This paper summarizes the large throughput of the MC production operation during the data taking period of 2010 and discusses the latencies and efficiencies of the various types of MC production workflows. We present the operational procedures to optimize the usage of available resources and the operational model of CMS for including opportunistic resources, such as the larger Tier-3 sites, into the central production operation.

  9. CMS distributed computing workflow experience

    Science.gov (United States)

    Adelman-McCarthy, Jennifer; Gutsche, Oliver; Haas, Jeffrey D.; Prosper, Harrison B.; Dutta, Valentina; Gomez-Ceballos, Guillelmo; Hahn, Kristian; Klute, Markus; Mohapatra, Ajit; Spinoso, Vincenzo; Kcira, Dorian; Caudron, Julien; Liao, Junhui; Pin, Arnaud; Schul, Nicolas; De Lentdecker, Gilles; McCartin, Joseph; Vanelderen, Lukas; Janssen, Xavier; Tsyganov, Andrey; Barge, Derek; Lahiff, Andrew

    2011-12-01

    The vast majority of the CMS Computing capacity, which is organized in a tiered hierarchy, is located away from CERN. The 7 Tier-1 sites archive the LHC proton-proton collision data that is initially processed at CERN. These sites provide access to all recorded and simulated data for the Tier-2 sites, via wide-area network (WAN) transfers. All central data processing workflows are executed at the Tier-1 level, which contain re-reconstruction and skimming workflows of collision data as well as reprocessing of simulated data to adapt to changing detector conditions. This paper describes the operation of the CMS processing infrastructure at the Tier-1 level. The Tier-1 workflows are described in detail. The operational optimization of resource usage is described. In particular, the variation of different workflows during the data taking period of 2010, their efficiencies and latencies as well as their impact on the delivery of physics results is discussed and lessons are drawn from this experience. The simulation of proton-proton collisions for the CMS experiment is primarily carried out at the second tier of the CMS computing infrastructure. Half of the Tier-2 sites of CMS are reserved for central Monte Carlo (MC) production while the other half is available for user analysis. This paper summarizes the large throughput of the MC production operation during the data taking period of 2010 and discusses the latencies and efficiencies of the various types of MC production workflows. We present the operational procedures to optimize the usage of available resources and the operational model of CMS for including opportunistic resources, such as the larger Tier-3 sites, into the central production operation.

  10. Mobile Agents in Networking and Distributed Computing

    CERN Document Server

    Cao, Jiannong

    2012-01-01

    The book focuses on mobile agents, which are computer programs that can autonomously migrate between network sites. This text introduces the concepts and principles of mobile agents, provides an overview of mobile agent technology, and focuses on applications in networking and distributed computing.

  11. A Software Rejuvenation Framework for Distributed Computing

    Science.gov (United States)

    Chau, Savio

    2009-01-01

    A performability-oriented conceptual framework for software rejuvenation has been constructed as a means of increasing levels of reliability and performance in distributed stateful computing. As used here, performability-oriented signifies that the construction of the framework is guided by the concept of analyzing the ability of a given computing system to deliver services with gracefully degradable performance. The framework is especially intended to support applications that involve stateful replicas of server computers.

  12. Fault-Tolerant Control for a Flexible Group Battery Energy Storage System Based on Cascaded Multilevel Converters

    Directory of Open Access Journals (Sweden)

    Junhong Song

    2018-01-01

    A flexible group battery energy storage system (FGBESS) based on cascaded multilevel converters is attractive for renewable power generation applications because of its high modularity and high power quality. However, reliability is one of the most important issues, and the system may suffer great financial loss after a fault occurs. In this paper, based on conventional fundamental phase shift compensation and third harmonic injection, a hybrid compensation fault-tolerant method is proposed to improve the post-fault performance of the FGBESS. By adjusting the initial phase offset and amplitude of the injected component, the optimal third harmonic injection is generated in an asymmetric system under each faulty operation. Meanwhile, the optimal redundancy solution under each fault condition is also elaborated comprehensively with a comparison of the three presented fault-tolerant strategies. This takes full advantage of battery utilization and minimizes the loss of energy capacity. Finally, the effectiveness and feasibility of the proposed methods are verified by results obtained from simulations and a 10 kW experimental platform.

  13. An Immune Cooperative Particle Swarm Optimization Algorithm for Fault-Tolerant Routing Optimization in Heterogeneous Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Yifan Hu

    2012-01-01

    The fault-tolerant routing problem is an important consideration in the design of heterogeneous wireless sensor network (H-WSN) applications, and has recently been attracting growing research interest. In order to maintain k disjoint communication paths from source sensors to the macronodes, we present a hybrid routing scheme and model, in which multiple paths are calculated and maintained in advance, and alternate paths are created once the previous routing is broken. Then, we propose an immune cooperative particle swarm optimization algorithm (ICPSOA) in the model to provide fast routing recovery and reconstruct the network topology after path failure in H-WSNs. In the ICPSOA, the mutation direction of the particle is determined by a multi-swarm evolution equation, and its diversity is improved by an immune mechanism, which can enhance the capacity of global search and improve the convergence rate of the algorithm. Then we validate this theoretical model with simulation results. The results indicate that the ICPSOA-based fault-tolerant routing protocol outperforms several other protocols due to its fast routing recovery mechanism, reliable communications, and ability to prolong the lifetime of WSNs.

  14. A Hybrid Fault-Tolerant Strategy for Severe Sensor Failure Scenarios in Late-Stage Offshore DFIG-WT

    Directory of Open Access Journals (Sweden)

    Wei Li

    2017-12-01

    As the phase current sensors and rotor speed/position sensor are prone to fail in the late stage of an offshore doubly-fed induction generator based wind turbine (DFIG-WT), this paper investigates a hybrid fault-tolerant strategy for a severe sensor failure scenario, in which the phase current sensors in the back-to-back (BTB) converter and the speed/position sensor are in faulty states simultaneously. Based on the 7th-order doubly-fed induction generator (DFIG) dynamic state space model, the extended Kalman filter (EKF) algorithm is applied for rotor speed and position estimation. In addition, good robustness of this sensorless control algorithm to system uncertainties and measurement disturbances is presented. Besides, a phase current reconstruction scheme based on a single DC-link current sensor is utilized for deriving the phase current information according to the switching states. A duty ratio adjustment strategy, which is simple to implement, is proposed to avoid missing the sampling points in a switching period. Furthermore, the additional active time of the targeted nonzero switching states is complemented so that the reference voltage vector remains in the same position as before duty ratio adjustment. The validity of the proposed hybrid fault-tolerant sensorless control strategy is demonstrated by simulation results in Matlab/Simulink 2017a, considering harsh operating environments.

  15. Characteristic Analysis and Fault-Tolerant Control of Circulating Current for Modular Multilevel Converters under Sub-Module Faults

    Directory of Open Access Journals (Sweden)

    Wen Wu

    2017-11-01

    A modular multilevel converter (MMC) is considered to be a promising topology for medium- or high-power applications. However, the significantly increased number of sub-modules (SMs) in each arm also increases the risk of failures. Focusing on the fault-tolerant operation issue for the MMC under SM faults, the operation characteristics of the MMC with different numbers of faulty SMs in the arms are analyzed and summarized in this paper. Based on these characteristics, a novel circulating current-suppressing (CCS) fault-tolerant control strategy comprising two parts, a basic control unit (BCU) and a virtual resistance compensation control unit (VRCCU), is proposed, which has three main features: (i) it can suppress the multiple different frequency components of the circulating current under different SM fault types simultaneously; (ii) it helps to quickly limit the transient fault current caused at the moment a faulty SM is bypassed; and (iii) it does not need extra communication systems to acquire the number of faulty SMs. Moreover, by analyzing the stability performance of the proposed controller using the Root-Locus criterion, the selection principle for the value of the virtual resistance is revealed. Finally, the efficiency of the control strategy is confirmed by simulation and experimental studies under different fault conditions.

  16. CMS Distributed Computing Workflow Experience

    CERN Document Server

    Haas, Jeffrey David

    2010-01-01

    The vast majority of the CMS Computing capacity, which is organized in a tiered hierarchy, is located away from CERN. The 7 Tier-1 sites archive the LHC proton-proton collision data that is initially processed at CERN. These sites provide access to all recorded and simulated data for the Tier-2 sites, via wide-area network (WAN) transfers. All central data processing workflows are executed at the Tier-1 level, which contain re-reconstruction and skimming workflows of collision data as well as reprocessing of simulated data to adapt to changing detector conditions. This paper describes the operation of the CMS processing infrastructure at the Tier-1 level. The Tier-1 workflows are described in detail. The operational optimization of resource usage is described. In particular, the variation of different workflows during the data taking period of 2010, their efficiencies and latencies as well as their impact on the delivery of physics results is discussed and lessons are drawn from this experience. The simul...

  17. Distributed System Contract Monitoring

    Directory of Open Access Journals (Sweden)

    Adrian Francalanza Ph.D

    2011-09-01

    The use of behavioural contracts, to specify, regulate and verify systems, is particularly relevant to runtime monitoring of distributed systems. System distribution poses major challenges to contract monitoring, from monitoring-induced information leaks to computation load balancing, communication overheads and fault-tolerance. We present mDPi, a location-aware process calculus, for reasoning about monitoring of distributed systems. We define a family of Labelled Transition Systems for this calculus, which allow formal reasoning about different monitoring strategies at different levels of abstractions. We also illustrate the expressivity of the calculus by showing how contracts in a simple contract language can be synthesised into different mDPi monitors.

  18. A research program in empirical computer science

    Science.gov (United States)

    Knight, J. C.

    1991-01-01

    During the grant reporting period our primary activities have been to begin preparation for the establishment of a research program in experimental computer science. The focus of research in this program will be safety-critical systems. Many questions that arise in the effort to improve software dependability can only be addressed empirically. For example, there is no way to predict the performance of the various proposed approaches to building fault-tolerant software. Performance models, though valuable, are parameterized and cannot be used to make quantitative predictions without experimental determination of underlying distributions. In the past, experimentation has been able to shed some light on the practical benefits and limitations of software fault tolerance. It is common, also, for experimentation to reveal new questions or new aspects of problems that were previously unknown. A good example is the Consistent Comparison Problem that was revealed by experimentation and subsequently studied in depth. The result was a clear understanding of a previously unknown problem with software fault tolerance. The purpose of a research program in empirical computer science is to perform controlled experiments in the area of real-time, embedded control systems. The goal of the various experiments will be to determine better approaches to the construction of the software for computing systems that have to be relied upon. As such it will validate research concepts from other sources, provide new research results, and facilitate the transition of research results from concepts to practical procedures that can be applied with low risk to NASA flight projects. The target of experimentation will be the production software development activities undertaken by any organization prepared to contribute to the research program. Experimental goals, procedures, data analysis and result reporting will be performed for the most part by the University of Virginia.

  19. A Fault-Tolerant Modulation Method to Counteract the Double Open-Switch Fault in Matrix Converter Drive Systems without Redundant Power Devices

    DEFF Research Database (Denmark)

    Chen, Der-Fa; Nguyen-Duy, Khiem; Liu, Tian-Hua

    2012-01-01

    This paper studies the double open-switch fault issue occurring within the conventional matrix converter driving a three-phase permanent-magnet synchronous motor system and proposes a fault-tolerant solution by introducing a revised modulation strategy. In this switching strategy, the rectifier-stage modulation is adjusted based on knowledge of the switching logic of the inverter stage and the operating input voltage sectors. However, the proposed fault-tolerant method does not rely on the assistance of any redundant power devices or any reconfiguration of the matrix converter circuit by means of using...

  20. Secure key storage and distribution

    Science.gov (United States)

    Agrawal, Punit

    2015-06-02

    This disclosure describes a distributed, fault-tolerant security system that enables the secure storage and distribution of private keys. In one implementation, the security system includes a plurality of computing resources that independently store private keys provided by publishers and encrypted using a single security system public key. To protect against malicious activity, the security system private key necessary to decrypt the publication private keys is not stored at any of the computing resources. Rather, portions, or shares, of the security system private key are stored at each of the computing resources within the security system, and multiple security systems must communicate and share partial decryptions in order to decrypt the stored private key.
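
    The record above does not say which cryptosystem or sharing scheme the disclosure uses. As a rough illustration only, the general idea of holding shares of a private key and combining partial decryptions can be sketched with textbook RSA and additive sharing of the private exponent. Everything in this sketch is an assumption made for illustration: the toy primes, the all-shares-required additive split, and the function names split_private_exponent, partial_decrypt and combine. The parameters are far too small to be secure.

```python
import secrets
from math import prod

# Toy textbook-RSA parameters (illustrative assumption; not secure).
P, Q = 61, 53
N = P * Q                      # public modulus
PHI = (P - 1) * (Q - 1)        # Euler totient of N
E = 17                         # public exponent
D = pow(E, -1, PHI)            # private exponent (Python 3.8+ modular inverse)


def split_private_exponent(d, num_shares, modulus=PHI):
    """Split d into additive shares that sum to d modulo `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(num_shares - 1)]
    shares.append((d - sum(shares)) % modulus)
    return shares


def partial_decrypt(ciphertext, share, n=N):
    """Each share holder raises the ciphertext to its own share."""
    return pow(ciphertext, share, n)


def combine(partials, n=N):
    """Multiply the partial decryptions to recover the plaintext."""
    return prod(partials) % n


# No single holder ever sees D; each one only contributes a partial result.
message = 1234
ciphertext = pow(message, E, N)
shares = split_private_exponent(D, 3)
partials = [partial_decrypt(ciphertext, s) for s in shares]
recovered = combine(partials)
```

    In this sketch every share is needed to decrypt, so the loss of one holder blocks decryption. A fault-tolerant deployment would more plausibly use a threshold scheme in which any sufficiently large subset of holders can cooperate, which this example does not attempt.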